* [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 16:36 ` Conor Dooley
2026-06-22 9:14 ` Krzysztof Kozlowski
2026-06-18 7:37 ` [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property Santhosh Kumar K
` (15 subsequent siblings)
16 siblings, 2 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add spi-max-post-config-frequency, a generic uint32 property for SPI
peripherals that support two distinct clock rates: a conservative rate
always reachable without controller configuration, and a higher rate
reachable only after controller-side configuration such as PHY tuning.
When both properties are present, spi-max-frequency gives the
conservative pre-configuration rate and spi-max-post-config-frequency
gives the higher post-configuration target.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
.../devicetree/bindings/spi/spi-peripheral-props.yaml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
index 880a9f624566..ece86f65930f 100644
--- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
+++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
@@ -45,6 +45,12 @@ properties:
description:
Maximum SPI clocking speed of the device in Hz.
+ spi-max-post-config-frequency:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ Maximum SPI clock frequency in Hz achievable post controller-side
+ configuration.
+
spi-cs-setup-delay-ns:
description:
Delay in nanoseconds to be introduced by the controller after CS is
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property
2026-06-18 7:37 ` [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property Santhosh Kumar K
@ 2026-06-18 16:36 ` Conor Dooley
2026-06-22 9:14 ` Krzysztof Kozlowski
1 sibling, 0 replies; 24+ messages in thread
From: Conor Dooley @ 2026-06-18 16:36 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano, linux-spi,
devicetree, linux-kernel, linux-mtd, praneeth, u-kumar1, a-dutta
[-- Attachment #1: Type: text/plain, Size: 680 bytes --]
On Thu, Jun 18, 2026 at 01:07:10PM +0530, Santhosh Kumar K wrote:
> Add spi-max-post-config-frequency, a generic uint32 property for SPI
> peripherals that support two distinct clock rates: a conservative rate
> always reachable without controller configuration, and a higher rate
> reachable only after controller-side configuration such as PHY tuning.
>
> When both properties are present, spi-max-frequency gives the
> conservative pre-configuration rate and spi-max-post-config-frequency
> gives the higher post-configuration target.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
pw-bot: not-applicable
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property
2026-06-18 7:37 ` [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property Santhosh Kumar K
2026-06-18 16:36 ` Conor Dooley
@ 2026-06-22 9:14 ` Krzysztof Kozlowski
2026-06-22 19:46 ` Conor Dooley
1 sibling, 1 reply; 24+ messages in thread
From: Krzysztof Kozlowski @ 2026-06-22 9:14 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano, linux-spi,
devicetree, linux-kernel, linux-mtd, praneeth, u-kumar1, a-dutta
On Thu, Jun 18, 2026 at 01:07:10PM +0530, Santhosh Kumar K wrote:
> Add spi-max-post-config-frequency, a generic uint32 property for SPI
> peripherals that support two distinct clock rates: a conservative rate
> always reachable without controller configuration, and a higher rate
> reachable only after controller-side configuration such as PHY tuning.
>
> When both properties are present, spi-max-frequency gives the
> conservative pre-configuration rate and spi-max-post-config-frequency
> gives the higher post-configuration target.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
> .../devicetree/bindings/spi/spi-peripheral-props.yaml | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> index 880a9f624566..ece86f65930f 100644
> --- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> +++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> @@ -45,6 +45,12 @@ properties:
> description:
> Maximum SPI clocking speed of the device in Hz.
>
> + spi-max-post-config-frequency:
-hz
https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/property-units.yaml
and you need maxItems: 1.
Now, please take time and think if this should not be an array instead
(maxItems: ...) to cover other possible cases, e.g. different tuning
levels? IOW, having single spi-max-frequency turned out to be
insufficient. You address that insufficiency with one more frequency,
but what if this is going to be insufficient next month as well?
I don't know, answer is rather for domain experts.
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description:
> + Maximum SPI clock frequency in Hz achievable post controller-side
> + configuration.
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property
2026-06-22 9:14 ` Krzysztof Kozlowski
@ 2026-06-22 19:46 ` Conor Dooley
0 siblings, 0 replies; 24+ messages in thread
From: Conor Dooley @ 2026-06-22 19:46 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Santhosh Kumar K, broonie, robh, krzk+dt, conor+dt, miquel.raynal,
richard, vigneshr, pratyush, mwalle, takahiro.kuwano, linux-spi,
devicetree, linux-kernel, linux-mtd, praneeth, u-kumar1, a-dutta
[-- Attachment #1: Type: text/plain, Size: 2232 bytes --]
On Mon, Jun 22, 2026 at 11:14:32AM +0200, Krzysztof Kozlowski wrote:
> On Thu, Jun 18, 2026 at 01:07:10PM +0530, Santhosh Kumar K wrote:
> > Add spi-max-post-config-frequency, a generic uint32 property for SPI
> > peripherals that support two distinct clock rates: a conservative rate
> > always reachable without controller configuration, and a higher rate
> > reachable only after controller-side configuration such as PHY tuning.
> >
> > When both properties are present, spi-max-frequency gives the
> > conservative pre-configuration rate and spi-max-post-config-frequency
> > gives the higher post-configuration target.
> >
> > Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> > ---
> > .../devicetree/bindings/spi/spi-peripheral-props.yaml | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> > index 880a9f624566..ece86f65930f 100644
> > --- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> > +++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> > @@ -45,6 +45,12 @@ properties:
> > description:
> > Maximum SPI clocking speed of the device in Hz.
> >
> > + spi-max-post-config-frequency:
>
> -hz
Ah d'oh. I was hung up on matching the existing property that I didn't
include -hz in my original suggestion, sorry!
> https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/property-units.yaml
>
> and you need maxItems: 1.
>
> Now, please take time and think if this should not be an array instead
> (maxItems: ...) to cover other possible cases, e.g. different tuning
> levels? IOW, having single spi-max-frequency turned out to be
> insufficient. You address that insufficiency with one more frequency,
> but what if this is going to be insufficient next month as well?
>
> I don't know, answer is rather for domain experts.
>
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + description:
> > + Maximum SPI clock frequency in Hz achievable post controller-side
> > + configuration.
>
> Best regards,
> Krzysztof
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-22 9:17 ` Krzysztof Kozlowski
2026-06-18 7:37 ` [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz Santhosh Kumar K
` (14 subsequent siblings)
16 siblings, 1 reply; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add spi-phy-pattern-partition, a per-device phandle property on the
flash sub-node that allows the DT author to directly reference the
partition holding the PHY tuning pattern. Used to locate the pattern
data during PHY tuning when the device cannot load the pattern
dynamically.
"Read PHY tuning" works by reading a known data pattern from the device
repeatedly while sweeping controller delay parameters until the
capture window is stable. For SPI NAND, the driver loads the pattern
into the page cache once using write-to-cache opcodes, then reads it
during the sweep. SPI NOR devices have no equivalent opcode, so the
pattern must be pre-programmed in a dedicated flash partition. One
partition per device is required to keep the procedure unambiguous
when multiple devices share a bus.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
.../bindings/spi/cdns,qspi-nor.yaml | 19 +++++++++++++++++++
.../bindings/spi/spi-peripheral-props.yaml | 7 +++++++
2 files changed, 26 insertions(+)
diff --git a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
index 891f578b5ac4..c6f1b1d1251d 100644
--- a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
+++ b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
@@ -204,10 +204,29 @@ examples:
flash@0 {
compatible = "jedec,spi-nor";
reg = <0x0>;
+ #address-cells = <1>;
+ #size-cells = <1>;
cdns,read-delay = <4>;
cdns,tshsl-ns = <60>;
cdns,tsd2d-ns = <60>;
cdns,tchsh-ns = <60>;
cdns,tslch-ns = <60>;
+ spi-phy-pattern-partition = <&phy_pattern>;
+
+ partitions {
+ compatible = "fixed-partitions";
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ partition@0 {
+ label = "data";
+ reg = <0x0 0x3fc0000>;
+ };
+
+ phy_pattern: partition@3fc0000 {
+ label = "phy-pattern";
+ reg = <0x3fc0000 0x40000>;
+ };
+ };
};
};
diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
index ece86f65930f..38708f8197f9 100644
--- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
+++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
@@ -123,6 +123,13 @@ properties:
description:
Delay, in microseconds, after a write transfer.
+ spi-phy-pattern-partition:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ Phandle to the flash partition holding the pre-programmed PHY tuning
+ pattern. Used when the device cannot load the pattern dynamically during
+ PHY tuning.
+
stacked-memories:
description: Several SPI memories can be wired in stacked mode.
This basically means that either a device features several chip
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property
2026-06-18 7:37 ` [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property Santhosh Kumar K
@ 2026-06-22 9:17 ` Krzysztof Kozlowski
2026-06-22 17:11 ` Miquel Raynal
0 siblings, 1 reply; 24+ messages in thread
From: Krzysztof Kozlowski @ 2026-06-22 9:17 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano, linux-spi,
devicetree, linux-kernel, linux-mtd, praneeth, u-kumar1, a-dutta
On Thu, Jun 18, 2026 at 01:07:11PM +0530, Santhosh Kumar K wrote:
> Add spi-phy-pattern-partition, a per-device phandle property on the
> flash sub-node that allows the DT author to directly reference the
> partition holding the PHY tuning pattern. Used to locate the pattern
> data during PHY tuning when the device cannot load the pattern
> dynamically.
>
> "Read PHY tuning" works by reading a known data pattern from the device
> repeatedly while sweeping controller delay parameters until the
> capture window is stable. For SPI NAND, the driver loads the pattern
> into the page cache once using write-to-cache opcodes, then reads it
> during the sweep. SPI NOR devices have no equivalent opcode, so the
> pattern must be pre-programmed in a dedicated flash partition. One
> partition per device is required to keep the procedure unambiguous
> when multiple devices share a bus.
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
> ---
> .../bindings/spi/cdns,qspi-nor.yaml | 19 +++++++++++++++++++
> .../bindings/spi/spi-peripheral-props.yaml | 7 +++++++
> 2 files changed, 26 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
> index 891f578b5ac4..c6f1b1d1251d 100644
> --- a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
> +++ b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
> @@ -204,10 +204,29 @@ examples:
> flash@0 {
> compatible = "jedec,spi-nor";
> reg = <0x0>;
> + #address-cells = <1>;
> + #size-cells = <1>;
This is neither needed, nor correct.
> cdns,read-delay = <4>;
> cdns,tshsl-ns = <60>;
> cdns,tsd2d-ns = <60>;
> cdns,tchsh-ns = <60>;
> cdns,tslch-ns = <60>;
> + spi-phy-pattern-partition = <&phy_pattern>;
> +
> + partitions {
> + compatible = "fixed-partitions";
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + partition@0 {
> + label = "data";
> + reg = <0x0 0x3fc0000>;
> + };
> +
> + phy_pattern: partition@3fc0000 {
> + label = "phy-pattern";
> + reg = <0x3fc0000 0x40000>;
> + };
> + };
> };
> };
> diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> index ece86f65930f..38708f8197f9 100644
> --- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> +++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
> @@ -123,6 +123,13 @@ properties:
> description:
> Delay, in microseconds, after a write transfer.
>
> + spi-phy-pattern-partition:
Is this specific to SPI-based MTD/NAND or rather broader - specific to
MTD/NAND memories, regardless of interface? Feels like the second, thus
maybe should be placed into the NAND bindings.
If the first, then in below description:
s/PHY/SPI PHY/ to be clear that this is about SPI, not the memory
itself.
> + $ref: /schemas/types.yaml#/definitions/phandle
> + description:
> + Phandle to the flash partition holding the pre-programmed PHY tuning
> + pattern. Used when the device cannot load the pattern dynamically during
> + PHY tuning.
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property
2026-06-22 9:17 ` Krzysztof Kozlowski
@ 2026-06-22 17:11 ` Miquel Raynal
0 siblings, 0 replies; 24+ messages in thread
From: Miquel Raynal @ 2026-06-22 17:11 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Santhosh Kumar K, broonie, robh, krzk+dt, conor+dt, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano, linux-spi,
devicetree, linux-kernel, linux-mtd, praneeth, u-kumar1, a-dutta
Hello,
>> + spi-phy-pattern-partition:
>
> Is this specific to SPI-based MTD/NAND or rather broader - specific to
> MTD/NAND memories, regardless of interface? Feels like the second, thus
> maybe should be placed into the NAND bindings.
>
> If the first, then in below description:
>
> s/PHY/SPI PHY/ to be clear that this is about SPI, not the memory
> itself.
As far as I know, there is no raw NAND controller with such
capability. In the raw/parallel NAND world, timings are well defined by
the ONFI specification, it covers both the bus timings and the minimal
requirements for the chips. There is a method to query what "timing mode"
the NAND chip supports, and then we tune the controller registers to fit
the highest supported timings (capped by possible controller limits).
In the SPI world it is different. No specific timing has ever been
globally defined, so every manufacturer has its own capabilities which
are not discoverable dynamically. The routing also weights a lot. I
would say that we can safely keep this property SPI related, because it
is about the SPI bus being used with optimized timings, rather than some
kind of memory specific feature.
The reason why we need a property in those memories for the feature to
work, is because we need to make data transfers with a known pattern,
thus requiring to read the pattern from the internal array somehow.
Therefore, we shall indeed go for the s/PHY/SPI PHY/ naming indeed.
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops Santhosh Kumar K
` (13 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add post_config_max_speed_hz to struct spi_device and parse it from
the spi-max-post-config-frequency DT property in of_spi_parse_dt().
This supports SPI devices that operate at two distinct clock rates: a
conservative rate always reachable without controller configuration,
and a higher rate achievable only after controller-side configuration
such as PHY tuning. With both properties set, spi-max-frequency gives
the conservative pre-configuration rate and post_config_max_speed_hz
carries the post-configuration target for the SPI-MEM layer.
Zero when not set, preserving existing behaviour.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi.c | 2 ++
include/linux/spi/spi.h | 3 +++
2 files changed, 5 insertions(+)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 76e3563c523f..36951ab47a2f 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2599,6 +2599,8 @@ static int of_spi_parse_dt(struct spi_controller *ctlr, struct spi_device *spi,
/* Device speed */
if (!of_property_read_u32(nc, "spi-max-frequency", &value))
spi->max_speed_hz = value;
+ if (!of_property_read_u32(nc, "spi-max-post-config-frequency", &value))
+ spi->post_config_max_speed_hz = value;
/* Device CS delays */
of_spi_parse_dt_cs_delay(nc, &spi->cs_setup, "spi-cs-setup-delay-ns");
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index f6ed93eff00b..2d90ec91450a 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -139,6 +139,8 @@ extern void spi_transfer_cs_change_delay_exec(struct spi_message *msg,
* @max_speed_hz: Maximum clock rate to be used with this chip
* (on this board); may be changed by the device's driver.
* The spi_transfer.speed_hz can override this for each transfer.
+ * @post_config_max_speed_hz: Maximum clock rate achievable after controller
+ * configuration (e.g. PHY tuning); zero when not assigned.
* @bits_per_word: Data transfers involve one or more words; word sizes
* like eight or 12 bits are common. In-memory wordsizes are
* powers of two bytes (e.g. 20 bit samples use 32 bits).
@@ -191,6 +193,7 @@ struct spi_device {
struct device dev;
struct spi_controller *controller;
u32 max_speed_hz;
+ u32 post_config_max_speed_hz;
u8 bits_per_word;
bool rt;
#define SPI_NO_TX BIT(31) /* No transmit wire */
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (2 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
` (12 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
When a device exposes both a conservative base speed (spi-max-frequency)
and a maximum post-configuration speed (spi-max-post-config-frequency),
operations validated after controller configuration must run at the
higher rate while all others are capped at the base rate.
Extend spi_mem_adjust_op_freq() with a bypass: if op->max_freq equals
post_config_max_speed_hz (the value written by execute_tuning on
success), return immediately leaving op->max_freq unchanged. All other
ops are capped to max_speed_hz, the always-reachable base rate. This
integrates the policy into the single existing frequency-adjustment
point so exec_op(), supports_op(), and calc_op_duration() all behave
consistently.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-mem.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index a88b9f038356..e20eca1b8245 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -591,9 +591,18 @@ EXPORT_SYMBOL_GPL(spi_mem_adjust_op_size);
* Some chips have per-op frequency limitations and must adapt the maximum
* speed. This function allows SPI mem drivers to set @op->max_freq to the
* maximum supported value.
+ *
+ * When @mem->spi->post_config_max_speed_hz is set, ops with @op->max_freq
+ * equal to that value are treated as post-configuration ops (e.g. PHY-tuned)
+ * and are allowed to run at the full post-config rate. All other ops are
+ * capped to @mem->spi->max_speed_hz, the always-reachable base rate.
*/
void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op)
{
+ if (mem->spi->post_config_max_speed_hz &&
+ op->max_freq == mem->spi->post_config_max_speed_hz)
+ return;
+
if (!op->max_freq || op->max_freq > mem->spi->max_speed_hz)
op->max_freq = mem->spi->max_speed_hz;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (3 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier Santhosh Kumar K
` (11 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
SPI memory controllers that support high-speed operating modes often
require a tuning procedure to calibrate internal timing before operating
at maximum frequency. There is currently no standard spi-mem interface
for drivers to trigger this procedure.
Add an execute_tuning callback to struct spi_controller_mem_ops. The
callback receives a mandatory read op template and an optional write op
template. On success the controller sets op->max_freq in each provided
template to the validated clock rate.
Add the corresponding spi_mem_execute_tuning() wrapper that validates
inputs and returns -EOPNOTSUPP when the controller has not implemented
the callback, allowing callers to handle controllers that do not support
tuning gracefully.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-mem.c | 31 +++++++++++++++++++++++++++++++
include/linux/spi/spi-mem.h | 14 ++++++++++++++
2 files changed, 45 insertions(+)
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index e20eca1b8245..571a7dd9c2a4 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -660,6 +660,37 @@ u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op)
}
EXPORT_SYMBOL_GPL(spi_mem_calc_op_duration);
+/**
+ * spi_mem_execute_tuning() - Execute controller tuning procedure
+ * @mem: the SPI memory device
+ * @read_op: read operation template (mandatory)
+ * @write_op: write operation template (optional, may be NULL)
+ *
+ * Requests the controller to perform tuning for high-speed operation
+ * using the provided op templates. On success the controller callback
+ * sets @read_op->max_freq (and @write_op->max_freq when non-NULL) to
+ * the validated clock rate.
+ *
+ * Return: 0 on success, -EINVAL if @mem or @read_op is NULL,
+ * -EOPNOTSUPP if controller doesn't support tuning,
+ * or a controller-specific error code on failure.
+ */
+int spi_mem_execute_tuning(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct spi_controller *ctlr;
+
+ if (!mem || !read_op)
+ return -EINVAL;
+
+ ctlr = mem->spi->controller;
+ if (!ctlr->mem_ops || !ctlr->mem_ops->execute_tuning)
+ return -EOPNOTSUPP;
+
+ return ctlr->mem_ops->execute_tuning(mem, read_op, write_op);
+}
+EXPORT_SYMBOL_GPL(spi_mem_execute_tuning);
+
static ssize_t spi_mem_no_dirmap_read(struct spi_mem_dirmap_desc *desc,
u64 offs, size_t len, void *buf)
{
diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h
index f660bb2e9f85..ef5a6d60bae9 100644
--- a/include/linux/spi/spi-mem.h
+++ b/include/linux/spi/spi-mem.h
@@ -346,6 +346,15 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)
* @poll_status: poll memory device status until (status & mask) == match or
* when the timeout has expired. It fills the data buffer with
* the last status value.
+ * @execute_tuning: run the controller tuning procedure using the provided
+ * read and optional write op templates. On success, set
+ * @read_op->max_freq (and @write_op->max_freq when non-NULL)
+ * to the validated clock rate. Return a negative errno on
+ * error. Return -EOPNOTSUPP if the controller has no tuning
+ * capability at all. Return 0 with @read_op->max_freq left at
+ * zero to signal that this specific op cannot be PHY-tuned
+ * (e.g. a hardware erratum blocks it) but another variant may
+ * succeed; the caller will iterate remaining op variants.
*
* This interface should be implemented by SPI controllers providing an
* high-level interface to execute SPI memory operation, which is usually the
@@ -376,6 +385,8 @@ struct spi_controller_mem_ops {
unsigned long initial_delay_us,
unsigned long polling_rate_us,
unsigned long timeout_ms);
+ int (*execute_tuning)(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
};
/**
@@ -465,6 +476,9 @@ int spi_mem_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op);
void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op);
u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op);
+int spi_mem_execute_tuning(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
+
bool spi_mem_supports_op(struct spi_mem *mem,
const struct spi_mem_op *op);
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (4 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 07/16] spi: cadence-quadspi: add DQS support to read data capture Santhosh Kumar K
` (10 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Move cqspi_readdata_capture() function earlier in the file. This is
preparatory refactoring for upcoming PHY tuning support for read and
write operations.
No functional changes.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 45 +++++++++++++++----------------
1 file changed, 22 insertions(+), 23 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index aaba1a3ad577..54fd7b591e06 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -453,6 +453,28 @@ static int cqspi_wait_idle(struct cqspi_st *cqspi)
}
}
+static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
+ const unsigned int delay)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_READCAPTURE);
+
+ if (bypass)
+ reg |= BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
+ else
+ reg &= ~BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
+
+ reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
+ << CQSPI_REG_READCAPTURE_DELAY_LSB);
+
+ reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
+ << CQSPI_REG_READCAPTURE_DELAY_LSB;
+
+ writel(reg, reg_base + CQSPI_REG_READCAPTURE);
+}
+
static int cqspi_exec_flash_cmd(struct cqspi_st *cqspi, unsigned int reg)
{
void __iomem *reg_base = cqspi->iobase;
@@ -1270,29 +1292,6 @@ static void cqspi_config_baudrate_div(struct cqspi_st *cqspi)
writel(reg, reg_base + CQSPI_REG_CONFIG);
}
-static void cqspi_readdata_capture(struct cqspi_st *cqspi,
- const bool bypass,
- const unsigned int delay)
-{
- void __iomem *reg_base = cqspi->iobase;
- unsigned int reg;
-
- reg = readl(reg_base + CQSPI_REG_READCAPTURE);
-
- if (bypass)
- reg |= BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
- else
- reg &= ~BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
-
- reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
- << CQSPI_REG_READCAPTURE_DELAY_LSB);
-
- reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
- << CQSPI_REG_READCAPTURE_DELAY_LSB;
-
- writel(reg, reg_base + CQSPI_REG_READCAPTURE);
-}
-
static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
unsigned long sclk)
{
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 07/16] spi: cadence-quadspi: add DQS support to read data capture
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (5 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (9 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add DQS (Data Strobe) parameter to cqspi_readdata_capture() to control
data capture timing. DQS mode uses a dedicated strobe signal for
improved timing margins in high-speed SPI modes.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 54fd7b591e06..201d69c64c49 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -192,6 +192,7 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_READCAPTURE_BYPASS_LSB 0
#define CQSPI_REG_READCAPTURE_DELAY_LSB 1
#define CQSPI_REG_READCAPTURE_DELAY_MASK 0xF
+#define CQSPI_REG_READCAPTURE_DQS_LSB 8
#define CQSPI_REG_SIZE 0x14
#define CQSPI_REG_SIZE_ADDRESS_LSB 0
@@ -454,7 +455,7 @@ static int cqspi_wait_idle(struct cqspi_st *cqspi)
}
static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
- const unsigned int delay)
+ const bool dqs, const unsigned int delay)
{
void __iomem *reg_base = cqspi->iobase;
unsigned int reg;
@@ -472,6 +473,11 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
<< CQSPI_REG_READCAPTURE_DELAY_LSB;
+ if (dqs)
+ reg |= BIT(CQSPI_REG_READCAPTURE_DQS_LSB);
+ else
+ reg &= ~BIT(CQSPI_REG_READCAPTURE_DQS_LSB);
+
writel(reg, reg_base + CQSPI_REG_READCAPTURE);
}
@@ -1313,7 +1319,7 @@ static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
cqspi->sclk = sclk;
cqspi_config_baudrate_div(cqspi);
cqspi_delay(f_pdata);
- cqspi_readdata_capture(cqspi, !cqspi->rclk_en,
+ cqspi_readdata_capture(cqspi, !cqspi->rclk_en, false,
f_pdata->read_delay);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (6 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 07/16] spi: cadence-quadspi: add DQS support to read data capture Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-19 17:33 ` Mark Brown
2026-06-18 7:37 ` [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383) Santhosh Kumar K
` (8 subsequent siblings)
16 siblings, 1 reply; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
The Cadence QSPI controller supports a delay-line PHY for high-speed
operation. Without calibration the PHY is unused and read capture relies
on a fixed delay, limiting throughput at frequencies above the base
operating speed.
Add an execute_tuning callback that performs delay-line calibration using
a known data pattern written to a dedicated flash region. The pattern is
either read from a NOR partition identified by the DT property
spi-phy-pattern-partition, or written to the NAND page cache before
each calibration read.
For DDR protocols (8D-8D-8D) a 2D sweep of (rx_delay, tx_delay) pairs
is performed to find the widest passing region in the combined RX/TX
space. Binary search locates the gap boundary between passing regions
when two separate windows exist; the final operating point is placed at
the centre of the larger region with a small temperature-dependent
offset.
For SDR protocols a 1D sweep of the RX delay is sufficient. Two windows
at adjacent read_delay values are measured; the wider one's midpoint is
selected.
The tuning infrastructure is platform-specific: only am654-based OSPI
controllers populate the execute_tuning hook. All other platform data
entries return -EOPNOTSUPP and are unaffected.
The calibration target is sourced from spi->post_config_max_speed_hz,
populated by the SPI core from the spi-max-post-config-frequency DT
property. Tuning is skipped when post_config_max_speed_hz is zero.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 1776 ++++++++++++++++++++++++++++-
1 file changed, 1763 insertions(+), 13 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 201d69c64c49..72768292a32b 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -65,15 +65,25 @@ enum {
struct cqspi_st;
+struct phy_setting {
+ u8 rx;
+ u8 tx;
+ u8 read_delay;
+};
+
struct cqspi_flash_pdata {
- struct cqspi_st *cqspi;
- u32 clk_rate;
- u32 read_delay;
- u32 tshsl_ns;
- u32 tsd2d_ns;
- u32 tchsh_ns;
- u32 tslch_ns;
- u8 cs;
+ struct cqspi_st *cqspi;
+ u32 read_delay;
+ u32 tshsl_ns;
+ u32 tsd2d_ns;
+ u32 tchsh_ns;
+ u32 tslch_ns;
+ bool use_dqs;
+ bool use_tuned_phy;
+ u8 cs;
+ struct phy_setting phy_setting;
+ struct spi_mem_op phy_read_op;
+ struct spi_mem_op phy_write_op;
};
static const struct clk_bulk_data cqspi_clks[CLK_QSPI_NUM] = {
@@ -129,12 +139,15 @@ struct cqspi_driver_platdata {
int (*indirect_read_dma)(struct cqspi_flash_pdata *f_pdata,
u_char *rxbuf, loff_t from_addr, size_t n_rx);
u32 (*get_dma_status)(struct cqspi_st *cqspi);
+ int (*execute_tuning)(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
};
/* Operation timeout value */
#define CQSPI_TIMEOUT_MS 500
#define CQSPI_READ_TIMEOUT_MS 10
#define CQSPI_BUSYWAIT_TIMEOUT_US 500
+#define CQSPI_DLL_TIMEOUT_US 300
/* Runtime_pm autosuspend delay */
#define CQSPI_AUTOSUSPEND_TIMEOUT 2000
@@ -148,12 +161,14 @@ struct cqspi_driver_platdata {
/* Register map */
#define CQSPI_REG_CONFIG 0x00
#define CQSPI_REG_CONFIG_ENABLE_MASK BIT(0)
+#define CQSPI_REG_CONFIG_PHY_EN BIT(3)
#define CQSPI_REG_CONFIG_ENB_DIR_ACC_CTRL BIT(7)
#define CQSPI_REG_CONFIG_DECODE_MASK BIT(9)
#define CQSPI_REG_CONFIG_CHIPSELECT_LSB 10
#define CQSPI_REG_CONFIG_DMA_MASK BIT(15)
#define CQSPI_REG_CONFIG_BAUD_LSB 19
#define CQSPI_REG_CONFIG_DTR_PROTO BIT(24)
+#define CQSPI_REG_CONFIG_PHY_PIPELINE BIT(25)
#define CQSPI_REG_CONFIG_DUAL_OPCODE BIT(30)
#define CQSPI_REG_CONFIG_IDLE_LSB 31
#define CQSPI_REG_CONFIG_CHIPSELECT_MASK 0xF
@@ -192,6 +207,7 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_READCAPTURE_BYPASS_LSB 0
#define CQSPI_REG_READCAPTURE_DELAY_LSB 1
#define CQSPI_REG_READCAPTURE_DELAY_MASK 0xF
+#define CQSPI_REG_READCAPTURE_EDGE_LSB 5
#define CQSPI_REG_READCAPTURE_DQS_LSB 8
#define CQSPI_REG_SIZE 0x14
@@ -273,6 +289,27 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_POLLING_STATUS 0xB0
#define CQSPI_REG_POLLING_STATUS_DUMMY_LSB 16
+#define CQSPI_REG_PHY_CONFIG 0xB4
+#define CQSPI_REG_PHY_CONFIG_RX_DEL_LSB 0
+#define CQSPI_REG_PHY_CONFIG_RX_DEL_MASK 0x7F
+#define CQSPI_REG_PHY_CONFIG_TX_DEL_LSB 16
+#define CQSPI_REG_PHY_CONFIG_TX_DEL_MASK 0x7F
+#define CQSPI_REG_PHY_CONFIG_DLL_RESET BIT(30)
+#define CQSPI_REG_PHY_CONFIG_RESYNC BIT(31)
+
+#define CQSPI_REG_PHY_DLL_MASTER 0xB8
+#define CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_LSB 0
+#define CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_VAL 16
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LEN 0x7
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB 20
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_3 0x2
+#define CQSPI_REG_PHY_DLL_MASTER_BYPASS BIT(23)
+#define CQSPI_REG_PHY_DLL_MASTER_CYCLE BIT(24)
+
+#define CQSPI_REG_DLL_OBS_LOW 0xBC
+#define CQSPI_REG_DLL_OBS_LOW_DLL_LOCK BIT(0)
+#define CQSPI_REG_DLL_OBS_LOW_LOOPBACK_LOCK BIT(15)
+
#define CQSPI_REG_OP_EXT_LOWER 0xE0
#define CQSPI_REG_OP_EXT_READ_LSB 24
#define CQSPI_REG_OP_EXT_WRITE_LSB 16
@@ -321,6 +358,50 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_VERSAL_DMA_VAL 0x602
+#define CQSPI_PHY_INIT_RD 1
+#define CQSPI_PHY_MAX_RD 4
+#define CQSPI_PHY_MAX_DELAY 127
+#define CQSPI_PHY_DDR_SEARCH_STEP 4
+#define CQSPI_PHY_TX_LOOKUP_LOW_START 28
+#define CQSPI_PHY_TX_LOOKUP_LOW_END 48
+#define CQSPI_PHY_TX_LOOKUP_HIGH_START 60
+#define CQSPI_PHY_TX_LOOKUP_HIGH_END 96
+#define CQSPI_PHY_RX_LOW_SEARCH_START 0
+#define CQSPI_PHY_RX_LOW_SEARCH_END 40
+#define CQSPI_PHY_RX_HIGH_SEARCH_START 24
+#define CQSPI_PHY_RX_HIGH_SEARCH_END 127
+#define CQSPI_PHY_TX_LOW_SEARCH_START 0
+#define CQSPI_PHY_TX_LOW_SEARCH_END 64
+#define CQSPI_PHY_TX_HIGH_SEARCH_START 78
+#define CQSPI_PHY_TX_HIGH_SEARCH_END 127
+#define CQSPI_PHY_SEARCH_OFFSET 8
+
+#define CQSPI_PHY_DEFAULT_TEMP 45
+#define CQSPI_PHY_MIN_TEMP -45
+#define CQSPI_PHY_MAX_TEMP 130
+#define CQSPI_PHY_MID_TEMP (CQSPI_PHY_MIN_TEMP + \
+ ((CQSPI_PHY_MAX_TEMP - \
+ CQSPI_PHY_MIN_TEMP) / 2))
+
+/*
+ * PHY tuning pattern for calibrating read data capture delay. This 128-byte
+ * pattern provides sufficient bit transitions across all byte lanes to
+ * reliably detect timing windows at high frequencies.
+ */
+static const u8 phy_tuning_pattern[] __aligned(64) = {
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0x00, 0x00, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0x00, 0x00, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0xFE, 0x00, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0xFE, 0x00, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0xFE, 0x00, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0x00, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0xFE, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0x00, 0xFE, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0x00, 0xFE, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0xFE, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0xFE, 0xFE, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0xFE, 0xFE, 0xFE, 0xFE, 0x01,
+};
+
static int cqspi_wait_for_bit(const struct cqspi_driver_platdata *ddata,
void __iomem *reg, const u32 mask, bool clr,
bool busywait)
@@ -1555,6 +1636,1678 @@ static bool cqspi_supports_mem_op(struct spi_mem *mem,
return spi_mem_default_supports_op(mem, op);
}
+static int cqspi_write_pattern_to_cache(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct spi_mem_op *write_op)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ write_op->data.nbytes = sizeof(phy_tuning_pattern);
+ write_op->data.buf.out = phy_tuning_pattern;
+
+ ret = spi_mem_exec_op(mem, write_op);
+ if (ret) {
+ dev_err(dev, "Failed to write PHY pattern to cache: %d\n", ret);
+ return ret;
+ }
+ dev_dbg(dev, "PHY pattern (%zu bytes) written to cache\n",
+ sizeof(phy_tuning_pattern));
+
+ return 0;
+}
+
+static int cqspi_get_phy_pattern_offset(struct device *dev, u32 *offset)
+{
+ struct device_node *np, *flash_np = NULL, *part_np;
+ const __be32 *reg;
+ int len;
+
+ if (!dev || !dev->of_node)
+ return -EINVAL;
+
+ for_each_child_of_node(dev->of_node, np) {
+ if (of_node_name_prefix(np, "flash")) {
+ flash_np = np;
+ break;
+ }
+ }
+
+ if (!flash_np)
+ return -ENODEV;
+
+ part_np = of_parse_phandle(flash_np, "spi-phy-pattern-partition", 0);
+ of_node_put(flash_np);
+ if (!part_np)
+ return -ENODEV;
+
+ reg = of_get_property(part_np, "reg", &len);
+ if (reg && len >= sizeof(__be32)) {
+ *offset = be32_to_cpu(reg[0]);
+ of_node_put(part_np);
+ return 0;
+ }
+
+ of_node_put(part_np);
+ return -ENOENT;
+}
+
+static int cqspi_phy_check_pattern(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct spi_mem_op op;
+ u8 *read_data;
+ int ret;
+
+ read_data = kmalloc_array(ARRAY_SIZE(phy_tuning_pattern),
+ sizeof(phy_tuning_pattern[0]), GFP_KERNEL);
+ if (!read_data)
+ return -ENOMEM;
+
+ op = f_pdata->phy_read_op;
+ op.data.buf.in = read_data;
+ op.data.nbytes = sizeof(phy_tuning_pattern);
+
+ ret = spi_mem_exec_op(mem, &op);
+ if (ret)
+ goto out;
+
+ if (memcmp(read_data, phy_tuning_pattern, sizeof(phy_tuning_pattern)))
+ ret = -EAGAIN;
+
+out:
+ kfree(read_data);
+ return ret;
+}
+
+static void cqspi_phy_set_dll_master(struct cqspi_st *cqspi)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_DLL_MASTER);
+ reg &= ~((CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LEN
+ << CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB) |
+ CQSPI_REG_PHY_DLL_MASTER_BYPASS |
+ CQSPI_REG_PHY_DLL_MASTER_CYCLE);
+ reg |= ((CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_3
+ << CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB) |
+ CQSPI_REG_PHY_DLL_MASTER_CYCLE);
+
+ writel(reg, reg_base + CQSPI_REG_PHY_DLL_MASTER);
+}
+
+static void cqspi_phy_pre_config(struct cqspi_st *cqspi,
+ struct cqspi_flash_pdata *f_pdata,
+ const bool bypass)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ u8 dummy;
+
+ cqspi_readdata_capture(cqspi, bypass, f_pdata->use_dqs,
+ f_pdata->phy_setting.read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE);
+ reg |= CQSPI_REG_CONFIG_PHY_EN;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy--;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+
+ cqspi_phy_set_dll_master(cqspi);
+}
+
+static void cqspi_phy_post_config(struct cqspi_st *cqspi,
+ const unsigned int delay)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ u8 dummy;
+
+ reg = readl(reg_base + CQSPI_REG_READCAPTURE);
+ reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
+ << CQSPI_REG_READCAPTURE_DELAY_LSB);
+
+ reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
+ << CQSPI_REG_READCAPTURE_DELAY_LSB;
+ writel(reg, reg_base + CQSPI_REG_READCAPTURE);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE);
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy++;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+}
+
+static void cqspi_set_dll(void __iomem *reg_base, u8 rx_dll, u8 tx_dll)
+{
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg &= ~((CQSPI_REG_PHY_CONFIG_RX_DEL_MASK
+ << CQSPI_REG_PHY_CONFIG_RX_DEL_LSB) |
+ (CQSPI_REG_PHY_CONFIG_TX_DEL_MASK
+ << CQSPI_REG_PHY_CONFIG_TX_DEL_LSB));
+ reg |= ((rx_dll & CQSPI_REG_PHY_CONFIG_RX_DEL_MASK)
+ << CQSPI_REG_PHY_CONFIG_RX_DEL_LSB) |
+ ((tx_dll & CQSPI_REG_PHY_CONFIG_TX_DEL_MASK)
+ << CQSPI_REG_PHY_CONFIG_TX_DEL_LSB) |
+ CQSPI_REG_PHY_CONFIG_RESYNC;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+}
+
+static int cqspi_resync_dll(struct cqspi_st *cqspi)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ int ret;
+
+ ret = cqspi_wait_idle(cqspi);
+ if (ret)
+ return ret;
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~CQSPI_REG_CONFIG_ENABLE_MASK;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg &= ~(CQSPI_REG_PHY_CONFIG_DLL_RESET | CQSPI_REG_PHY_CONFIG_RESYNC);
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_DLL_MASTER);
+ reg |= (CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_VAL
+ << CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_LSB);
+ writel(reg, reg_base + CQSPI_REG_PHY_DLL_MASTER);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg |= CQSPI_REG_PHY_CONFIG_DLL_RESET;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+ ret = readl_poll_timeout(reg_base + CQSPI_REG_DLL_OBS_LOW, reg,
+ (reg & CQSPI_REG_DLL_OBS_LOW_DLL_LOCK), 0,
+ CQSPI_DLL_TIMEOUT_US);
+ if (ret)
+ goto re_enable;
+
+ ret = readl_poll_timeout(reg_base + CQSPI_REG_DLL_OBS_LOW, reg,
+ (reg & CQSPI_REG_DLL_OBS_LOW_LOOPBACK_LOCK), 0,
+ CQSPI_DLL_TIMEOUT_US);
+ if (ret)
+ goto re_enable;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg |= CQSPI_REG_PHY_CONFIG_RESYNC;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+re_enable:
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg |= CQSPI_REG_CONFIG_ENABLE_MASK;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ return ret;
+}
+
+static int cqspi_phy_apply_setting(struct cqspi_flash_pdata *f_pdata,
+ struct phy_setting *phy)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ unsigned int reg;
+ int ret;
+
+ reg = readl(cqspi->iobase + CQSPI_REG_READCAPTURE);
+ reg |= BIT(CQSPI_REG_READCAPTURE_EDGE_LSB);
+ writel(reg, cqspi->iobase + CQSPI_REG_READCAPTURE);
+
+ cqspi_set_dll(cqspi->iobase, phy->rx, phy->tx);
+
+ ret = cqspi_resync_dll(cqspi);
+ if (ret)
+ return ret;
+
+ f_pdata->phy_setting.read_delay = phy->read_delay;
+ return 0;
+}
+
+static int cqspi_find_rx_low_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->rx = CQSPI_PHY_RX_LOW_SEARCH_START;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->rx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->rx <= CQSPI_PHY_RX_LOW_SEARCH_END);
+
+ phy->read_delay++;
+ } while (phy->read_delay <= CQSPI_PHY_MAX_RD);
+
+ dev_dbg(dev, "Unable to find RX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_low_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ phy->rx = 0;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+ phy->rx++;
+ } while (phy->rx <= CQSPI_PHY_MAX_DELAY);
+
+ dev_dbg(dev, "Unable to find RX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_high_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->rx = CQSPI_PHY_RX_HIGH_SEARCH_END;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->rx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->rx >= CQSPI_PHY_RX_HIGH_SEARCH_START);
+
+ phy->read_delay--;
+ } while (phy->read_delay >= CQSPI_PHY_INIT_RD);
+
+ dev_dbg(dev, "Unable to find RX high\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_high_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy,
+ u8 lowerbound)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ phy->rx = CQSPI_PHY_MAX_DELAY;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+ phy->rx--;
+ } while (phy->rx > lowerbound);
+
+ dev_dbg(dev, "Unable to find RX high\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_tx_low_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->tx = CQSPI_PHY_TX_LOW_SEARCH_START;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->tx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->tx <= CQSPI_PHY_TX_LOW_SEARCH_END);
+
+ phy->read_delay++;
+ } while (phy->read_delay <= CQSPI_PHY_MAX_RD);
+
+ dev_dbg(dev, "Unable to find TX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_tx_high_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->tx = CQSPI_PHY_TX_HIGH_SEARCH_END;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->tx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->tx >= CQSPI_PHY_TX_HIGH_SEARCH_START);
+
+ phy->read_delay--;
+ } while (phy->read_delay >= CQSPI_PHY_INIT_RD);
+
+ dev_dbg(dev, "Unable to find TX high\n");
+ return -ENOENT;
+}
+
+static void cqspi_phy_find_gaplow_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct phy_setting *bottomleft,
+ struct phy_setting *topright,
+ struct phy_setting *gaplow)
+{
+ struct phy_setting left, right, mid;
+ int ret;
+
+ left = *bottomleft;
+ right = *topright;
+
+ mid.tx = left.tx + ((right.tx - left.tx) / 2);
+ mid.rx = left.rx + ((right.rx - left.rx) / 2);
+ mid.read_delay = left.read_delay;
+
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, &mid);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ /* The pattern was not found. Go to the lower half. */
+ right.tx = mid.tx;
+ right.rx = mid.rx;
+
+ mid.tx = left.tx + ((mid.tx - left.tx) / 2);
+ mid.rx = left.rx + ((mid.rx - left.rx) / 2);
+ } else {
+ /* The pattern was found. Go to the upper half. */
+ left.tx = mid.tx;
+ left.rx = mid.rx;
+
+ mid.tx = mid.tx + ((right.tx - mid.tx) / 2);
+ mid.rx = mid.rx + ((right.rx - mid.rx) / 2);
+ }
+
+ /* Break the loop if the window has closed. */
+ } while ((right.tx - left.tx >= 2) && (right.rx - left.rx >= 2));
+
+ *gaplow = mid;
+}
+
+static void cqspi_phy_find_gaphigh_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct phy_setting *bottomleft,
+ struct phy_setting *topright,
+ struct phy_setting *gaphigh)
+{
+ struct phy_setting left, right, mid;
+ int ret;
+
+ left = *bottomleft;
+ right = *topright;
+
+ mid.tx = left.tx + ((right.tx - left.tx) / 2);
+ mid.rx = left.rx + ((right.rx - left.rx) / 2);
+ mid.read_delay = right.read_delay;
+
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, &mid);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ /* The pattern was not found. Go to the upper half. */
+ left.tx = mid.tx;
+ left.rx = mid.rx;
+
+ mid.tx = mid.tx + ((right.tx - mid.tx) / 2);
+ mid.rx = mid.rx + ((right.rx - mid.rx) / 2);
+ } else {
+ /* The pattern was found. Go to the lower half. */
+ right.tx = mid.tx;
+ right.rx = mid.rx;
+
+ mid.tx = left.tx + ((mid.tx - left.tx) / 2);
+ mid.rx = left.rx + ((mid.rx - left.rx) / 2);
+ }
+
+ /* Break the loop if the window has closed. */
+ } while ((right.tx - left.tx >= 2) && (right.rx - left.rx >= 2));
+
+ *gaphigh = mid;
+}
+
+static int cqspi_get_temp(int *temp)
+{
+ /* TODO: read SoC thermal sensor; caller falls back to room temperature */
+ return -EOPNOTSUPP;
+}
+
+static inline void cqspi_phy_reset_setting(struct phy_setting *phy)
+{
+ *phy = (struct phy_setting){ .rx = 0, .tx = 127, .read_delay = 0 };
+}
+
+static int cqspi_phy_tuning_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ struct device *dev = &cqspi->pdev->dev;
+ struct phy_setting rxlow, rxhigh, txlow, txhigh;
+ struct phy_setting srxlow, srxhigh;
+ struct phy_setting bottomleft, topright, searchpoint;
+ struct phy_setting gaplow, gaphigh;
+ struct phy_setting backuppoint, backupcornerpoint;
+ int ret, rx_window, temp;
+ bool primary = true, secondary = true;
+
+ /*
+ * DDR tuning: 2D search across RX and TX delays for optimal timing.
+ *
+ * Algorithm: Find RX boundaries (rxlow/rxhigh) using TX window search,
+ * find TX boundaries (txlow/txhigh) at fixed RX, define valid region,
+ * locate gaps via binary search, select final point with temperature
+ * compensation.
+ *
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * | *
+ * | bottomleft
+ * -----------------------------------------> tx
+ * 0 127
+ */
+
+ f_pdata->use_tuned_phy = true;
+
+ /* Golden rxlow search: Find lower RX boundary using TX window sweep */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | |xxxxx|xxxxx +++++++++++++
+ * | |xxxxx|xxxxxx ++++++++++++
+ * search | |xxxxx|xxxxxxx +++++++++++
+ * rxlow --------->|xxxxx|xxxxxxxx ++++++++++
+ * | |xxxxx|xxxxxxxxx +++++++++
+ * | |xxxxx|xxxxxxxxxx ++++++++
+ * | |xxxxx|xxxxxxxxxxx +++++++
+ * | | |
+ * --------|-----|----------------------------> tx
+ * 0 | | 127
+ * txlow txlow
+ * start end
+ *
+ * |----------------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Pass | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Pass | rx = min(primary.rx, secondary.rx) |
+ * | | | tx = primary.tx |
+ * | | | read_delay = |
+ * | | | min(primary.read_delay, |
+ * | | | secondary.read_delay) |
+ * |----------------------------------------------------------|
+ */
+
+ /* Primary rxlow: Sweep TX window to find valid RX lower bound */
+
+ rxlow.tx = CQSPI_PHY_TX_LOOKUP_LOW_START;
+ do {
+ dev_dbg(dev, "Searching for Golden Primary rxlow on TX = %d\n",
+ rxlow.tx);
+ rxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &rxlow);
+ rxlow.tx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (ret && rxlow.tx <= CQSPI_PHY_TX_LOOKUP_LOW_END);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden Primary rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Secondary rxlow: Verify at offset TX for robustness */
+
+ if (rxlow.tx <= (CQSPI_PHY_TX_LOOKUP_LOW_END - CQSPI_PHY_SEARCH_OFFSET))
+ srxlow.tx = rxlow.tx + CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxlow.tx = CQSPI_PHY_TX_LOOKUP_LOW_END;
+ dev_dbg(dev, "Searching for Golden Secondary rxlow on TX = %d\n",
+ srxlow.tx);
+ srxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &srxlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden Secondary rxlow: RX: %d TX: %d RD: %d\n",
+ srxlow.rx, srxlow.tx, srxlow.read_delay);
+
+ rxlow.rx = min(rxlow.rx, srxlow.rx);
+ rxlow.read_delay = min(rxlow.read_delay, srxlow.read_delay);
+ dev_dbg(dev, "Golden Final rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Golden rxhigh search: Find upper RX boundary at fixed TX */
+
+ /*
+ * rx
+ * 127 ^
+ * | |xxxx ++++++++++++++++++++
+ * | |xxxxx +++++++++++++++++++
+ * search | |xxxxxx ++++++++++++++++++
+ * rxhigh --------->|xxxxxxx +++++++++++++++++
+ * on fixed | |xxxxxxxx ++++++++++++++++
+ * tx | |xxxxxxxxx +++++++++++++++
+ * | |xxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * |
+ * -------------------------------------------> tx
+ * 0 127
+ *
+ * |----------------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Pass | Choose Secondary |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Fail | Choose Primary |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Pass | if (secondary.rx > primary.rx) |
+ * | | | Choose Secondary |
+ * | | | else |
+ * | | | Choose Primary |
+ * |----------------------------------------------------------|
+ */
+
+ /* Primary rxhigh: Search at rxlow's TX, decrement from max read_delay */
+
+ rxhigh.tx = rxlow.tx;
+ dev_dbg(dev, "Searching for Golden Primary rxhigh on TX = %d\n",
+ rxhigh.tx);
+ rxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &rxhigh);
+ if (ret)
+ primary = false;
+ dev_dbg(dev, "Golden Primary rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+
+ /* Secondary rxhigh: Verify at offset TX */
+
+ if (rxhigh.tx <=
+ (CQSPI_PHY_TX_LOOKUP_LOW_END - CQSPI_PHY_SEARCH_OFFSET))
+ srxhigh.tx = rxhigh.tx + CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxhigh.tx = CQSPI_PHY_TX_LOOKUP_LOW_END;
+ dev_dbg(dev, "Searching for Golden Secondary rxhigh on TX = %d\n",
+ srxhigh.tx);
+ srxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &srxhigh);
+ if (ret)
+ secondary = false;
+ dev_dbg(dev, "Golden Secondary rxhigh: RX: %d TX: %d RD: %d\n",
+ srxhigh.rx, srxhigh.tx, srxhigh.read_delay);
+
+ if (!primary && !secondary)
+ goto out;
+ else if (!primary)
+ rxhigh = srxhigh;
+ else if (secondary && srxhigh.rx > rxhigh.rx)
+ rxhigh = srxhigh;
+ dev_dbg(dev, "Golden Final rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+
+ primary = true;
+ secondary = true;
+
+ /* If rxlow/rxhigh at same read_delay, search backup at upper TX range */
+
+ if (rxlow.read_delay == rxhigh.read_delay) {
+ dev_dbg(dev, "rxlow and rxhigh at the same read delay.\n");
+
+ /* Backup rxlow: Search at high TX window */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++|++++|
+ * | xxxxxxxxxxxxx ++++++|++++|
+ * search | xxxxxxxxxxxxxx +++++|++++|
+ * rxlow --------------------------------->|++++|
+ * | xxxxxxxxxxxxxxxx +++|++++|
+ * | xxxxxxxxxxxxxxxxx ++|++++|
+ * | xxxxxxxxxxxxxxxxxx +|++++|
+ * | | |
+ * --------------------------------|----|-----> tx
+ * 0 | | 127
+ * txhigh txhigh
+ * start end
+ *
+ * |-----------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Pass | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Fail | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Pass | rx = |
+ * | | | min(primary.rx, secondary.rx)|
+ * | | | tx = primary.tx |
+ * | | | read_delay = |
+ * | | | min(primary.read_delay, |
+ * | | | secondary.read_delay) |
+ * |-----------------------------------------------------|
+ */
+
+ /* Primary backup: Decrement TX from high window end */
+
+ backuppoint.tx = CQSPI_PHY_TX_LOOKUP_HIGH_END;
+ do {
+ dev_dbg(dev,
+ "Searching for Backup Primary rxlow on TX = %d\n",
+ backuppoint.tx);
+ backuppoint.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &backuppoint);
+ backuppoint.tx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (ret &&
+ backuppoint.tx >= CQSPI_PHY_TX_LOOKUP_HIGH_START);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup Primary rxlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ /* Secondary backup: Verify at offset TX */
+
+ if (backuppoint.tx >=
+ (CQSPI_PHY_TX_LOOKUP_HIGH_START + CQSPI_PHY_SEARCH_OFFSET))
+ srxlow.tx = backuppoint.tx - CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxlow.tx = CQSPI_PHY_TX_LOOKUP_HIGH_START;
+ dev_dbg(dev,
+ "Searching for Backup Secondary rxlow on TX = %d\n",
+ srxlow.tx);
+ srxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &srxlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup Secondary rxlow: RX: %d TX: %d RD: %d\n",
+ srxlow.rx, srxlow.tx, srxlow.read_delay);
+
+ backuppoint.rx = min(backuppoint.rx, srxlow.rx);
+ backuppoint.read_delay =
+ min(backuppoint.read_delay, srxlow.read_delay);
+ dev_dbg(dev, "Backup Final rxlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.rx < rxlow.rx) {
+ rxlow = backuppoint;
+ dev_dbg(dev, "Updating rxlow to the one at TX = %d\n",
+ backuppoint.tx);
+ }
+ dev_dbg(dev, "Final rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Backup rxhigh: Search at fixed backup TX */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx +++++++++++++++++++|
+ * | xxxxxx ++++++++++++++++++|
+ * search | xxxxxxx +++++++++++++++++|
+ * rxhigh -------------------------------------->|
+ * on fixed | xxxxxxxxx +++++++++++++++|
+ * tx | xxxxxxxxxx ++++++++++++++|
+ * | xxxxxxxxxxx +++++++++++++|
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * |
+ * -------------------------------------------> tx
+ * 0 127
+ *
+ * |-----------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Pass | Choose Secondary |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Fail | Choose Primary |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Pass | if (secondary.rx > primary.rx)|
+ * | | | Choose Secondary |
+ * | | | else |
+ * | | | Choose Primary |
+ * |-----------------------------------------------------|
+ */
+
+ /* Primary backup rxhigh: Use backup TX, decrement from max read_delay */
+
+ dev_dbg(dev, "Searching for Backup Primary rxhigh on TX = %d\n",
+ backuppoint.tx);
+ backuppoint.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ primary = false;
+ dev_dbg(dev, "Backup Primary rxhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ /* Secondary backup rxhigh: Verify at offset TX */
+
+ if (backuppoint.tx >=
+ (CQSPI_PHY_TX_LOOKUP_HIGH_START + CQSPI_PHY_SEARCH_OFFSET))
+ srxhigh.tx = backuppoint.tx - CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxhigh.tx = CQSPI_PHY_TX_LOOKUP_HIGH_START;
+ dev_dbg(dev,
+ "Searching for Backup Secondary rxhigh on TX = %d\n",
+ srxhigh.tx);
+ srxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &srxhigh);
+ if (ret)
+ secondary = false;
+ dev_dbg(dev, "Backup Secondary rxhigh: RX: %d TX: %d RD: %d\n",
+ srxhigh.rx, srxhigh.tx, srxhigh.read_delay);
+
+ if (!primary && !secondary)
+ goto out;
+ else if (!primary)
+ backuppoint = srxhigh;
+ else if (secondary && srxhigh.rx > backuppoint.rx)
+ backuppoint = srxhigh;
+ dev_dbg(dev, "Backup Final rxhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.rx > rxhigh.rx) {
+ rxhigh = backuppoint;
+ dev_dbg(dev, "Updating rxhigh to the one at TX = %d\n",
+ backuppoint.tx);
+ }
+ dev_dbg(dev, "Final rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+ }
+
+ /* Golden txlow: Fix RX at 1/4 of RX window, search TX lower bound */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * fix rx | xxxxxxxxxxxxx ++++++++++++
+ * 1/4 b/w ---------><------->xxxxx +++++++++++
+ * rxlow and | xxxx|xxxxxxxxxx ++++++++++
+ * rxhigh | xxxx|xxxxxxxxxxx +++++++++
+ * | xxxx|xxxxxxxxxxxx ++++++++
+ * rxlow --------->xxxx|xxxxxxxxxxxxx +++++++
+ * | |
+ * ------------|------------------------------> tx
+ * 0 | 127
+ * search
+ * txlow
+ */
+
+ rx_window = rxhigh.rx - rxlow.rx;
+ txlow.rx = rxlow.rx + (rx_window / 4);
+ dev_dbg(dev, "Searching for Golden txlow on RX = %d\n", txlow.rx);
+ txlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_tx_low_ddr(f_pdata, mem, &txlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden txlow: RX: %d TX: %d RD: %d\n", txlow.rx, txlow.tx,
+ txlow.read_delay);
+
+ /* Golden txhigh: Same RX as txlow, decrement from max read_delay */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * fix rx | xxxxxxxxxxxxx ++++++++++++
+ * 1/4 b/w --------------------------------><----->
+ * rxlow and | xxxxxxxxxxxxxxx ++++++|+++
+ * rxhigh | xxxxxxxxxxxxxxxx +++++|+++
+ * | xxxxxxxxxxxxxxxxx ++++|+++
+ * rxlow --------->xxxxxxxxxxxxxxxxxx +++|+++
+ * | |
+ * ----------------------------------|--------> tx
+ * 0 | 127
+ * search
+ * txhigh
+ */
+
+ txhigh.rx = txlow.rx;
+ dev_dbg(dev, "Searching for Golden txhigh on RX = %d\n", txhigh.rx);
+ txhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_tx_high_ddr(f_pdata, mem, &txhigh);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden txhigh: RX: %d TX: %d RD: %d\n", txhigh.rx,
+ txhigh.tx, txhigh.read_delay);
+
+ /* If txlow/txhigh at same read_delay, search backup at 3/4 RX window */
+
+ if (txlow.read_delay == txhigh.read_delay) {
+ /* Backup txlow: Fix RX at 3/4 of RX window */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * fix rx | xxxxxxx ++++++++++++++++++
+ * 3/4 b/w ---------><----->x +++++++++++++++++
+ * rxlow and | xxxx|xxxx ++++++++++++++++
+ * rxhigh | xxxx|xxxxx +++++++++++++++
+ * | xxxx|xxxxxx ++++++++++++++
+ * | xxxx|xxxxxxx +++++++++++++
+ * | xxxx|xxxxxxxx ++++++++++++
+ * | xxxx|xxxxxxxxx +++++++++++
+ * | xxxx|xxxxxxxxxx ++++++++++
+ * | xxxx|xxxxxxxxxxx +++++++++
+ * | xxxx|xxxxxxxxxxxx ++++++++
+ * rxlow --------->xxxx|xxxxxxxxxxxxx +++++++
+ * | |
+ * ------------|------------------------------> tx
+ * 0 | 127
+ * search
+ * txlow
+ */
+
+ dev_dbg(dev, "txlow and txhigh at the same read delay.\n");
+ backuppoint.rx = rxlow.rx + ((rx_window * 3) / 4);
+ dev_dbg(dev, "Searching for Backup txlow on RX = %d\n",
+ backuppoint.rx);
+ backuppoint.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_tx_low_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup txlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.tx < txlow.tx) {
+ txlow = backuppoint;
+ dev_dbg(dev, "Updating txlow with the one at RX = %d\n",
+ backuppoint.rx);
+ }
+ dev_dbg(dev, "Final txlow: RX: %d TX: %d RD: %d\n", txlow.rx,
+ txlow.tx, txlow.read_delay);
+
+ /* Backup txhigh: Same RX as backup txlow, decrement from max */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * fix rx | xxxxxxx ++++++++++++++++++
+ * 3/4 b/w ------------------------------><------->
+ * rxlow and | xxxxxxxxx +++++++++++|++++
+ * rxhigh | xxxxxxxxxx ++++++++++|++++
+ * | xxxxxxxxxxx +++++++++|++++
+ * | xxxxxxxxxxxx ++++++++|++++
+ * | xxxxxxxxxxxxx +++++++|++++
+ * | xxxxxxxxxxxxxx ++++++|++++
+ * | xxxxxxxxxxxxxxx +++++|++++
+ * | xxxxxxxxxxxxxxxx ++++|++++
+ * | xxxxxxxxxxxxxxxxx +++|++++
+ * rxlow --------->xxxxxxxxxxxxxxxxxx ++|++++
+ * | |
+ * ---------------------------------|---------> tx
+ * 0 | 127
+ * search
+ * txhigh
+ */
+
+ dev_dbg(dev, "Searching for Backup txhigh on RX = %d\n",
+ backuppoint.rx);
+ backuppoint.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_tx_high_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup txhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.tx > txhigh.tx) {
+ txhigh = backuppoint;
+ dev_dbg(dev,
+ "Updating txhigh with the one at RX = %d\n",
+ backuppoint.rx);
+ }
+ dev_dbg(dev, "Final txhigh: RX: %d TX: %d RD: %d\n", txhigh.rx,
+ txhigh.tx, txhigh.read_delay);
+ }
+
+ /* Corner points: Define and verify bottomleft and topright boundaries */
+
+ /*
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * rxhigh -----------xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * rxlow -----------xxxxxxxxxxxxxxxxxx +++++++
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ *
+ * Verification: Test point 4 taps inside each corner, adjust
+ * read_delay ±1 if needed to ensure valid corners for gap search.
+ */
+
+ bottomleft.tx = txlow.tx;
+ bottomleft.rx = rxlow.rx;
+ if (txlow.read_delay <= rxlow.read_delay)
+ bottomleft.read_delay = txlow.read_delay;
+ else
+ bottomleft.read_delay = rxlow.read_delay;
+
+ /* Verify bottomleft: Test 4 taps inside, adjust read_delay if needed */
+ backupcornerpoint = bottomleft;
+ backupcornerpoint.tx += 4;
+ backupcornerpoint.rx += 4;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ backupcornerpoint.read_delay--;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ }
+
+ if (ret)
+ goto out;
+
+ bottomleft.read_delay = backupcornerpoint.read_delay;
+
+ topright.tx = txhigh.tx;
+ topright.rx = rxhigh.rx;
+ if (txhigh.read_delay >= rxhigh.read_delay)
+ topright.read_delay = txhigh.read_delay;
+ else
+ topright.read_delay = rxhigh.read_delay;
+
+ /* Verify topright: Test 4 taps inside, adjust read_delay if needed */
+ backupcornerpoint = topright;
+ backupcornerpoint.tx -= 4;
+ backupcornerpoint.rx -= 4;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ backupcornerpoint.read_delay++;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ }
+
+ if (ret)
+ goto out;
+
+ topright.read_delay = backupcornerpoint.read_delay;
+
+ dev_dbg(dev, "topright: RX: %d TX: %d RD: %d\n", topright.rx,
+ topright.tx, topright.read_delay);
+ dev_dbg(dev, "bottomleft: RX: %d TX: %d RD: %d\n", bottomleft.rx,
+ bottomleft.tx, bottomleft.read_delay);
+ cqspi_phy_find_gaplow_ddr(f_pdata, mem, &bottomleft, &topright, &gaplow);
+ dev_dbg(dev, "gaplow: RX: %d TX: %d RD: %d\n", gaplow.rx, gaplow.tx,
+ gaplow.read_delay);
+
+ /* Final point selection: Handle single vs dual passing regions */
+
+ if (bottomleft.read_delay == topright.read_delay) {
+ /*
+ * Single region: Use midpoint with temperature compensation.
+ * Gaplow approximates upper boundary of valid region.
+ *
+ * rx
+ * 127 ^
+ * | gaplow (approx. topright)
+ * | |
+ * rxhigh -----------xxxxxxx| failing
+ * | xxxxxxx| region
+ * | xxxxxxx| <--------------->
+ * | xxxxxxx| +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * rxlow -----------xxxxxxxxxxxxxxxxxx +++++++
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ * (same read_delay)
+ *
+ * Temperature compensation: Valid region shifts with temp.
+ * Offset = region_size / (330 / (temp - 42°C))
+ * Factor 330 is empirically determined for this hardware.
+ */
+
+ dev_dbg(dev,
+ "bottomleft and topright at the same read delay.\n");
+
+ topright = gaplow;
+ searchpoint.read_delay = bottomleft.read_delay;
+ searchpoint.tx =
+ bottomleft.tx + ((topright.tx - bottomleft.tx) / 2);
+ searchpoint.rx =
+ bottomleft.rx + ((topright.rx - bottomleft.rx) / 2);
+
+ ret = cqspi_get_temp(&temp);
+ if (ret) {
+ /* Assume room temperature if sensor unavailable */
+ dev_dbg(dev,
+ "Unable to get temperature. Assuming room temperature\n");
+ temp = CQSPI_PHY_DEFAULT_TEMP;
+ }
+
+ if (temp < CQSPI_PHY_MIN_TEMP || temp > CQSPI_PHY_MAX_TEMP) {
+ dev_err(dev,
+ "Temperature outside operating range: %dC\n",
+ temp);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (temp == CQSPI_PHY_MID_TEMP)
+ temp++; /* Avoid divide-by-zero */
+ dev_dbg(dev, "Temperature: %dC\n", temp);
+
+ /*
+ * Apply temperature offset: positive at high temp, negative at low.
+ * Compute the divisor once and apply to both TX and RX. Use int
+ * arithmetic throughout to avoid u8 wrapping on negative offsets.
+ */
+ temp = 330 / (temp - CQSPI_PHY_MID_TEMP);
+ searchpoint.tx = clamp((int)searchpoint.tx +
+ (topright.tx - bottomleft.tx) / temp,
+ 0, CQSPI_PHY_MAX_DELAY);
+ searchpoint.rx = clamp((int)searchpoint.rx +
+ (topright.rx - bottomleft.rx) / temp,
+ 0, CQSPI_PHY_MAX_DELAY);
+ } else {
+ /*
+ * Dual regions: Gap separates two valid regions, choose larger.
+ *
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * rxhigh -----------xxxxx +++++++++++++++++++|
+ * | xxxxxx <region 2> ++++++++|
+ * | xxxxxxx +++++++++++++++++|
+ * | xxxxxxxx ++++++++++++++++|
+ * | xxxxxxxxx +++++++++++++++|
+ * | xxxxxxxxxx ++++++++++++++|
+ * | failing |
+ * | region |
+ * | xxxxxxxxxxxxx +++++++++++|
+ * | xxxxxxxxxxxxxx ++++++++++|
+ * | xxxxxxxxxxxxxxx +++++++++|
+ * | xxxxxxxxxxxxxxxx ++++++++|
+ * | xxxxxxxxx <region 1> +++++++|
+ * rxlow -----------xxxxxxxxxxxxxxxxxx ++++++|
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ *
+ * Strategy: Compare Manhattan distances from gap boundaries to
+ * corners. Choose corner furthest from gap (larger region).
+ * Apply 16-tap margin inward, scale RX proportionally.
+ */
+
+ cqspi_phy_find_gaphigh_ddr(f_pdata, mem, &bottomleft,
+ &topright, &gaphigh);
+ dev_dbg(dev, "gaphigh: RX: %d TX: %d RD: %d\n", gaphigh.rx,
+ gaphigh.tx, gaphigh.read_delay);
+
+ if (topright.tx == bottomleft.tx) {
+ dev_err(dev, "zero TX span in dual-region: cannot compute search point\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* Compare Manhattan distances: choose corner furthest from gap */
+ if ((abs(gaplow.tx - bottomleft.tx) +
+ abs(gaplow.rx - bottomleft.rx)) <
+ (abs(gaphigh.tx - topright.tx) +
+ abs(gaphigh.rx - topright.rx))) {
+ /* Topright further: Use Region 2, 16 taps inward */
+ searchpoint = topright;
+ searchpoint.tx -= 16;
+ searchpoint.rx -= (16 * (topright.rx - bottomleft.rx)) /
+ (topright.tx - bottomleft.tx);
+ } else {
+ /* Bottomleft further: Use Region 1, 16 taps inward */
+ searchpoint = bottomleft;
+ searchpoint.tx += 16;
+ searchpoint.rx += (16 * (topright.rx - bottomleft.rx)) /
+ (topright.tx - bottomleft.tx);
+ }
+ }
+
+ /* Apply and verify final tuning point */
+ dev_dbg(dev, "Final tuning point: RX: %d TX: %d RD: %d\n",
+ searchpoint.rx, searchpoint.tx, searchpoint.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &searchpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ dev_err(dev,
+ "Failed to find pattern at final calibration point\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ f_pdata->phy_setting.read_delay = searchpoint.read_delay;
+ f_pdata->phy_setting.rx = searchpoint.rx;
+ f_pdata->phy_setting.tx = searchpoint.tx;
+out:
+ if (ret)
+ f_pdata->use_tuned_phy = false;
+
+ return ret;
+}
+
+static int cqspi_phy_tuning_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ struct device *dev = &cqspi->pdev->dev;
+ struct phy_setting rxlow, rxhigh, first, second, final;
+ u8 window1 = 0;
+ u8 window2 = 0;
+ int ret;
+
+ /*
+ * SDR tuning: 1D search for optimal RX delay (TX less critical).
+ * Find two consecutive windows, choose larger, use midpoint.
+ *
+ * rx
+ * 127 ^
+ * | |-----window at----------|
+ * | |-----read_delay = n+1---|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow(n+1) midpoint rxhigh(n+1)
+ * |
+ * | |---window at--------|
+ * | |---read_delay = n---|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | rxlow(n) midpoint rxhigh(n)
+ * |
+ * -----------------------------------------> tx
+ * 0 127
+ * read_delay=n read_delay=n+1
+ */
+
+ f_pdata->use_tuned_phy = true;
+ cqspi_phy_reset_setting(&rxlow);
+ cqspi_phy_reset_setting(&rxhigh);
+ cqspi_phy_reset_setting(&first);
+
+ /* First window: Find rxlow by incrementing read_delay from 0 */
+
+ /*
+ * rx
+ * 127 ^
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * search | |xxxxxxxxxxxxxxxxxxxx|
+ * rxlow | |xxxxxxxxxxxxxxxxxxxx|
+ * increasing | |xxxxxxxxxxxxxxxxxxxx|
+ * --------->|xxxxxxxxxxxxxxxxxxxx|
+ * read_delay | |xxxxxxxxxxxxxxxxxxx|
+ * until found | |xxxxxxxxxxxxxxxxxxx|
+ * | rxlow
+ * -----------------------------------------> tx
+ * 0 tx fixed at 127
+ */
+
+ do {
+ ret = cqspi_find_rx_low_sdr(f_pdata, mem, &rxlow);
+
+ if (ret)
+ rxlow.read_delay++;
+ } while (ret && rxlow.read_delay <= CQSPI_PHY_MAX_RD);
+
+ /* Find rxhigh: Decrement from RX=127 at same read_delay */
+
+ /*
+ * rx
+ * 127 ^ search rxhigh
+ * | (decrement from
+ * | 127 until found)
+ * | |
+ * | |
+ * | v
+ * | |------------------------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0 tx fixed at 127
+ */
+
+ rxhigh.read_delay = rxlow.read_delay;
+ ret = cqspi_find_rx_high_sdr(f_pdata, mem, &rxhigh, rxlow.rx);
+ if (ret)
+ goto out;
+
+ /* Calculate first window midpoint for max margin */
+
+ /*
+ * rx
+ * 127 ^
+ * | |--------window1---------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxx * xxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow ^ rxhigh
+ * ----------------------|------------------> tx
+ * 0 | tx fixed at 127
+ * window1/2
+ */
+
+ first.read_delay = rxlow.read_delay;
+ window1 = rxhigh.rx - rxlow.rx;
+ first.rx = rxlow.rx + (window1 / 2);
+
+ dev_dbg(dev, "First tuning point: RX: %d TX: %d RD: %d\n", first.rx,
+ first.tx, first.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &first);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret || first.read_delay > CQSPI_PHY_MAX_RD)
+ goto out;
+
+ /* Second window: Search at read_delay+1, may differ in size */
+
+ /*
+ * rx
+ * 127 ^
+ * | |-------|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0
+ * read_delay = n (smaller window)
+ *
+ * rx
+ * 127 ^
+ * | |-----------------|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0
+ * read_delay = n+1 (larger window - better)
+ */
+
+ cqspi_phy_reset_setting(&rxlow);
+ cqspi_phy_reset_setting(&rxhigh);
+ cqspi_phy_reset_setting(&second);
+
+ rxlow.read_delay = first.read_delay + 1;
+ if (rxlow.read_delay > CQSPI_PHY_MAX_RD)
+ goto compare;
+
+ ret = cqspi_find_rx_low_sdr(f_pdata, mem, &rxlow);
+ if (ret)
+ goto compare;
+
+ rxhigh.read_delay = rxlow.read_delay;
+ ret = cqspi_find_rx_high_sdr(f_pdata, mem, &rxhigh, rxlow.rx);
+ if (ret)
+ goto compare;
+
+ /* Calculate second window midpoint */
+
+ /*
+ * rx
+ * 127 ^
+ * | |--------window2---------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxx * xxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow ^ rxhigh
+ * ----------------------|------------------> tx
+ * 0 | tx fixed at 127
+ * window2/2
+ * read_delay = n+1
+ */
+
+ window2 = rxhigh.rx - rxlow.rx;
+ second.rx = rxlow.rx + (window2 / 2);
+ second.read_delay = rxlow.read_delay;
+
+ dev_dbg(dev, "Second tuning point: RX: %d TX: %d RD: %d\n", second.rx,
+ second.tx, second.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &second);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret || second.read_delay > CQSPI_PHY_MAX_RD)
+ window2 = 0;
+
+ /* Window comparison: Choose larger window for better margin */
+
+compare:
+ cqspi_phy_reset_setting(&final);
+ if (window2 > window1) {
+ final.rx = second.rx;
+ final.read_delay = second.read_delay;
+ } else {
+ final.rx = first.rx;
+ final.read_delay = first.read_delay;
+ }
+
+ /* Apply and verify final tuning point */
+
+ dev_dbg(dev, "Final tuning point: RX: %d TX: %d RD: %d\n", final.rx,
+ final.tx, final.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &final);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ f_pdata->phy_setting.read_delay = final.read_delay;
+ f_pdata->phy_setting.rx = final.rx;
+ f_pdata->phy_setting.tx = final.tx;
+
+out:
+ if (ret)
+ f_pdata->use_tuned_phy = false;
+
+ return ret;
+}
+
+static int cqspi_am654_ospi_execute_tuning(struct spi_mem *mem,
+ struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct cqspi_st *cqspi =
+ spi_controller_get_devdata(mem->spi->controller);
+ struct cqspi_flash_pdata *f_pdata;
+ struct device *dev = &cqspi->pdev->dev;
+ u32 phy_offset = 0;
+ int ret;
+
+ f_pdata = &cqspi->f_pdata[spi_get_chipselect(mem->spi, 0)];
+
+ /*
+ * spi-max-post-config-frequency must be present for PHY tuning.
+ * If absent, post_config_max_speed_hz is zero and there is no
+ * calibration target, so skip tuning gracefully.
+ */
+ if (!mem->spi->post_config_max_speed_hz) {
+ dev_dbg(dev,
+ "No post-config frequency configured, skipping tuning\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (write_op) {
+ /*
+ * For NAND: write the calibration pattern to the page cache.
+ * The write op runs at the conservative base speed (spi->max_speed_hz)
+ * so the write itself is reliable before PHY calibration.
+ */
+ ret = cqspi_write_pattern_to_cache(f_pdata, mem, write_op);
+ if (ret) {
+ dev_warn(dev,
+ "failed to write pattern to cache: %d, skipping tuning\n",
+ ret);
+ goto out;
+ }
+
+ f_pdata->phy_write_op = *write_op;
+ } else {
+ ret = cqspi_get_phy_pattern_offset(dev, &phy_offset);
+ if (ret) {
+ dev_warn(dev,
+ "pattern partition not found: %d, skipping tuning\n",
+ ret);
+ goto out;
+ }
+
+ read_op->addr.val = phy_offset;
+ }
+
+ /*
+ * Verify the calibration pattern exists using the conservative base
+ * speed. At high clock rates the DLL is not yet trained, so DTR
+ * data capture is unreliable and the read would return garbage.
+ * Setting max_freq to 0 causes spi_mem_adjust_op_freq() to cap the
+ * read to max_speed_hz (the base rate), well within reliable DTR
+ * margins. max_freq is set to post_config_max_speed_hz below after
+ * the pattern is confirmed, so the tuning-loop reads run at full rate.
+ */
+ f_pdata->phy_read_op = *read_op;
+ f_pdata->phy_read_op.max_freq = 0;
+
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (ret) {
+ dev_err(dev, "pattern not found: %d, skipping tuning\n", ret);
+ goto out;
+ }
+
+ /*
+ * Pattern confirmed. Set phy_read_op.max_freq to
+ * post_config_max_speed_hz so that tuning-loop reads bypass the base
+ * frequency cap in spi_mem_adjust_op_freq() and run at the full
+ * calibration rate.
+ */
+ f_pdata->phy_read_op.max_freq = mem->spi->post_config_max_speed_hz;
+
+ if (read_op->cmd.dtr || read_op->addr.dtr || read_op->dummy.dtr ||
+ read_op->data.dtr) {
+ f_pdata->use_dqs = true;
+ cqspi_phy_pre_config(cqspi, f_pdata, false);
+ ret = cqspi_phy_tuning_ddr(f_pdata, mem);
+ } else {
+ f_pdata->use_dqs = false;
+ cqspi_phy_pre_config(cqspi, f_pdata, true);
+ ret = cqspi_phy_tuning_sdr(f_pdata, mem);
+ }
+
+ if (ret)
+ dev_warn(dev, "tuning failed: %d\n", ret);
+
+ cqspi_phy_post_config(cqspi, f_pdata->read_delay);
+
+out:
+ /*
+ * On success, write back the validated maximum speed into the caller's
+ * op templates so that those specific ops bypass the cap in subsequent
+ * exec_op calls.
+ */
+ if (!ret) {
+ read_op->max_freq = mem->spi->post_config_max_speed_hz;
+ if (write_op)
+ write_op->max_freq = mem->spi->post_config_max_speed_hz;
+ }
+
+ return ret;
+}
+
+static int cqspi_mem_op_execute_tuning(struct spi_mem *mem,
+ struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct cqspi_st *cqspi =
+ spi_controller_get_devdata(mem->spi->controller);
+
+ if (!cqspi->ddata->execute_tuning)
+ return -EOPNOTSUPP;
+
+ return cqspi->ddata->execute_tuning(mem, read_op, write_op);
+}
+
static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
struct cqspi_flash_pdata *f_pdata,
struct device_node *np)
@@ -1584,11 +3337,6 @@ static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
return -ENXIO;
}
- if (of_property_read_u32(np, "spi-max-frequency", &f_pdata->clk_rate)) {
- dev_err(&pdev->dev, "couldn't determine spi-max-frequency\n");
- return -ENXIO;
- }
-
return 0;
}
@@ -1736,6 +3484,7 @@ static const struct spi_controller_mem_ops cqspi_mem_ops = {
.exec_op = cqspi_exec_mem_op,
.get_name = cqspi_get_name,
.supports_op = cqspi_supports_mem_op,
+ .execute_tuning = cqspi_mem_op_execute_tuning,
};
static const struct spi_controller_mem_caps cqspi_mem_caps = {
@@ -2104,6 +3853,7 @@ static const struct cqspi_driver_platdata k2g_qspi = {
static const struct cqspi_driver_platdata am654_ospi = {
.hwcaps_mask = CQSPI_SUPPORTS_OCTAL | CQSPI_SUPPORTS_QUAD,
.quirks = CQSPI_NEEDS_WR_DELAY,
+ .execute_tuning = cqspi_am654_ospi_execute_tuning,
};
static const struct cqspi_driver_platdata intel_lgm_qspi = {
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support
2026-06-18 7:37 ` [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-06-19 17:33 ` Mark Brown
0 siblings, 0 replies; 24+ messages in thread
From: Mark Brown @ 2026-06-19 17:33 UTC (permalink / raw)
To: Santhosh Kumar K
Cc: robh, krzk+dt, conor+dt, miquel.raynal, richard, vigneshr,
pratyush, mwalle, takahiro.kuwano, linux-spi, devicetree,
linux-kernel, linux-mtd, praneeth, u-kumar1, a-dutta
[-- Attachment #1: Type: text/plain, Size: 910 bytes --]
On Thu, Jun 18, 2026 at 01:07:17PM +0530, Santhosh Kumar K wrote:
> The Cadence QSPI controller supports a delay-line PHY for high-speed
> operation. Without calibration the PHY is unused and read capture relies
> on a fixed delay, limiting throughput at frequencies above the base
> operating speed.
> +static int cqspi_get_phy_pattern_offset(struct device *dev, u32 *offset)
> +{
> + struct device_node *np, *flash_np = NULL, *part_np;
> + const __be32 *reg;
> + int len;
> +
> + if (!dev || !dev->of_node)
> + return -EINVAL;
> +
> + for_each_child_of_node(dev->of_node, np) {
> + if (of_node_name_prefix(np, "flash")) {
> + flash_np = np;
> + break;
> + }
> + }
This isn't going to do the right thing if there's more than one flash,
that doesn't seem a super sensible hardware configuration but I'm not
sure I see anything stopping it being set up and system integrators do
enjoy differentiating.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383)
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (7 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support Santhosh Kumar K
` (7 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Erratum i2383 on AM654 locks the address phase in PHY DDR mode when a
2-byte column address is used. DDR PHY tuning must not be attempted for
such operations; non-PHY DDR usage is unaffected. [0]
Add CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk and check it in
cqspi_am654_ospi_execute_tuning(). When the erratum applies, return 0
with read_op->max_freq cleared.
[0] https://www.ti.com/lit/er/sprz544c/sprz544c.pdf
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 72768292a32b..22df5f3bdb96 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -49,6 +49,7 @@ static_assert(CQSPI_MAX_CHIPSELECT <= SPI_DEVICE_CS_CNT_MAX);
#define CQSPI_DISABLE_RUNTIME_PM BIT(10)
#define CQSPI_NO_INDIRECT_MODE BIT(11)
#define CQSPI_HAS_WR_PROTECT BIT(12)
+#define CQSPI_NO_2BYTE_ADDR_PHY_DDR BIT(13)
/* Capabilities */
#define CQSPI_SUPPORTS_OCTAL BIT(0)
@@ -3211,6 +3212,20 @@ static int cqspi_am654_ospi_execute_tuning(struct spi_mem *mem,
return -EOPNOTSUPP;
}
+ /*
+ * Erratum i2383: in PHY DDR mode, a 2-byte column address locks up
+ * the address phase. Skip DDR PHY tuning for such operations.
+ */
+ if ((cqspi->ddata->quirks & CQSPI_NO_2BYTE_ADDR_PHY_DDR) &&
+ read_op->addr.nbytes == 2 &&
+ (read_op->cmd.dtr || read_op->addr.dtr || read_op->dummy.dtr ||
+ read_op->data.dtr)) {
+ dev_dbg(dev,
+ "i2383: skipping DDR PHY tuning (2-byte address)\n");
+ read_op->max_freq = 0;
+ return 0;
+ }
+
if (write_op) {
/*
* For NAND: write the calibration pattern to the page cache.
@@ -3852,7 +3867,7 @@ static const struct cqspi_driver_platdata k2g_qspi = {
static const struct cqspi_driver_platdata am654_ospi = {
.hwcaps_mask = CQSPI_SUPPORTS_OCTAL | CQSPI_SUPPORTS_QUAD,
- .quirks = CQSPI_NEEDS_WR_DELAY,
+ .quirks = CQSPI_NEEDS_WR_DELAY | CQSPI_NO_2BYTE_ADDR_PHY_DDR,
.execute_tuning = cqspi_am654_ospi_execute_tuning,
};
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (8 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383) Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads Santhosh Kumar K
` (6 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Extract the DMA transfer code from cqspi_direct_read_execute() into a
new cqspi_direct_read_dma() helper. Add cqspi_memcpy_fromio() to handle
non-DMA transfers, with 2-byte-aligned I/O accesses for 8D-8D-8D mode.
Change cqspi_direct_read_execute() to take the full spi_mem_op and
max_speed_hz instead of separate buf/from/len parameters, matching the
interface needed by the PHY-aware version in the following patch.
Thread max_speed_hz from cqspi_mem_process() through cqspi_read().
Transfers shorter than CQSPI_PHY_MIN_DIRECT_READ_LEN bytes always use
the memcpy path; longer transfers use DMA when available.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 72 ++++++++++++++++++++++++++-----
1 file changed, 62 insertions(+), 10 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 22df5f3bdb96..e41aeca47885 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -149,6 +149,8 @@ struct cqspi_driver_platdata {
#define CQSPI_READ_TIMEOUT_MS 10
#define CQSPI_BUSYWAIT_TIMEOUT_US 500
#define CQSPI_DLL_TIMEOUT_US 300
+/* Minimum transfer length to use DMA for direct reads */
+#define CQSPI_PHY_MIN_DIRECT_READ_LEN 17
/* Runtime_pm autosuspend delay */
#define CQSPI_AUTOSUSPEND_TIMEOUT 2000
@@ -1447,8 +1449,8 @@ static void cqspi_rx_dma_callback(void *param)
complete(&cqspi->rx_dma_complete);
}
-static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
- u_char *buf, loff_t from, size_t len)
+static int cqspi_direct_read_dma(struct cqspi_flash_pdata *f_pdata, u_char *buf,
+ loff_t from, size_t len)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
struct device *dev = &cqspi->pdev->dev;
@@ -1460,11 +1462,6 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
dma_addr_t dma_dst;
struct device *ddev;
- if (!cqspi->rx_chan || !virt_addr_valid(buf)) {
- memcpy_fromio(buf, cqspi->ahb_base + from, len);
- return 0;
- }
-
ddev = cqspi->rx_chan->device->dev;
dma_dst = dma_map_single(ddev, buf, len, DMA_FROM_DEVICE);
if (dma_mapping_error(ddev, dma_dst)) {
@@ -1506,8 +1503,61 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
return ret;
}
+static void cqspi_memcpy_fromio(const struct spi_mem_op *op, void *to,
+ const void __iomem *from, size_t count)
+{
+ if (op->data.buswidth == 8 && op->data.dtr) {
+ unsigned long from_addr = (unsigned long)from;
+
+ /* Handle unaligned start with 2-byte read */
+ if (count && !IS_ALIGNED(from_addr, 4)) {
+ *(u16 *)to = __raw_readw(from);
+ from += 2;
+ to += 2;
+ count -= 2;
+ }
+
+ /* Use 4-byte reads for aligned bulk (no readq for 32-bit) */
+ if (count >= 4) {
+ size_t len = round_down(count, 4);
+
+ memcpy_fromio(to, from, len);
+ from += len;
+ to += len;
+ count -= len;
+ }
+
+ /* Handle remaining 2 bytes */
+ if (count)
+ *(u16 *)to = __raw_readw(from);
+
+ return;
+ }
+
+ memcpy_fromio(to, from, count);
+}
+
+static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ loff_t from = op->addr.val;
+ size_t len = op->data.nbytes;
+ u_char *buf = op->data.buf.in;
+
+ if (!cqspi->rx_chan || !virt_addr_valid(buf) ||
+ len < CQSPI_PHY_MIN_DIRECT_READ_LEN) {
+ cqspi_memcpy_fromio(op, buf, cqspi->ahb_base + from, len);
+ return 0;
+ }
+
+ return cqspi_direct_read_dma(f_pdata, buf, from, len);
+}
+
static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
- const struct spi_mem_op *op)
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
const struct cqspi_driver_platdata *ddata = cqspi->ddata;
@@ -1523,7 +1573,8 @@ static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
if ((cqspi->use_direct_mode && ((from + len) <= cqspi->ahb_size)) ||
(cqspi->ddata && cqspi->ddata->quirks & CQSPI_NO_INDIRECT_MODE))
- return cqspi_direct_read_execute(f_pdata, buf, from, len);
+ return cqspi_direct_read_execute(f_pdata, op,
+ post_config_max_speed_hz);
if (cqspi->use_dma_read && ddata && ddata->indirect_read_dma &&
virt_addr_valid(buf) && ((dma_align & CQSPI_DMA_UNALIGN) == 0))
@@ -1551,7 +1602,8 @@ static int cqspi_mem_process(struct spi_mem *mem, const struct spi_mem_op *op)
!cqspi->disable_stig_mode))
return cqspi_command_read(f_pdata, op);
- return cqspi_read(f_pdata, op);
+ return cqspi_read(f_pdata, op,
+ mem->spi->post_config_max_speed_hz);
}
if (!op->addr.nbytes || !op->data.buf.out)
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (9 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes Santhosh Kumar K
` (5 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add cqspi_tune_phy() to toggle PHY mode. Enabling sets the calibrated
read-capture delay, asserts PHY_EN and PHY_PIPELINE, and decrements the
dummy cycle count by one since the PHY pipeline absorbs that latency.
Disabling reverses all three. Disable is best-effort.
For direct reads, split the transfer into an unaligned head, a
16-byte-aligned middle section with PHY active, and an unaligned tail.
PHY is used when tuning completed successfully and the transfer is at
the calibrated frequency.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 103 +++++++++++++++++++++++++++++-
1 file changed, 102 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index e41aeca47885..16e3b843f0aa 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -565,6 +565,61 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
writel(reg, reg_base + CQSPI_REG_READCAPTURE);
}
+static int cqspi_tune_phy(struct cqspi_flash_pdata *f_pdata, bool enable)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ void __iomem *reg_base = cqspi->iobase;
+ u32 reg;
+ u8 dummy;
+
+ if (enable) {
+ cqspi_readdata_capture(cqspi, true, f_pdata->use_dqs,
+ f_pdata->phy_setting.read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg |= CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ /*
+ * The PHY data-capture pipeline absorbs one dummy cycle's
+ * worth of latency; reduce the count to avoid over-compensation.
+ */
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy--;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+ } else {
+ cqspi_readdata_capture(cqspi, !cqspi->rclk_en, false,
+ f_pdata->read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN |
+ CQSPI_REG_CONFIG_PHY_PIPELINE);
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy++;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+ }
+
+ return cqspi_wait_idle(cqspi);
+}
+
static int cqspi_exec_flash_cmd(struct cqspi_st *cqspi, unsigned int reg)
{
void __iomem *reg_base = cqspi->iobase;
@@ -1442,6 +1497,14 @@ static ssize_t cqspi_write(struct cqspi_flash_pdata *f_pdata,
return cqspi_indirect_write_execute(f_pdata, to, buf, len);
}
+static bool cqspi_use_tuned_phy(struct cqspi_flash_pdata *f_pdata,
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
+{
+ return f_pdata->use_tuned_phy &&
+ op->max_freq == post_config_max_speed_hz;
+}
+
static void cqspi_rx_dma_callback(void *param)
{
struct cqspi_st *cqspi = param;
@@ -1543,8 +1606,11 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
{
struct cqspi_st *cqspi = f_pdata->cqspi;
loff_t from = op->addr.val;
+ loff_t from_aligned, to_aligned;
size_t len = op->data.nbytes;
+ size_t len_aligned;
u_char *buf = op->data.buf.in;
+ int ret;
if (!cqspi->rx_chan || !virt_addr_valid(buf) ||
len < CQSPI_PHY_MIN_DIRECT_READ_LEN) {
@@ -1552,7 +1618,42 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
return 0;
}
- return cqspi_direct_read_dma(f_pdata, buf, from, len);
+ if (!cqspi_use_tuned_phy(f_pdata, op, post_config_max_speed_hz))
+ return cqspi_direct_read_dma(f_pdata, buf, from, len);
+
+ /* Split into unaligned head, aligned middle, unaligned tail */
+ from_aligned = ALIGN(from, 16);
+ to_aligned = ALIGN_DOWN(from + len, 16);
+ len_aligned = to_aligned - from_aligned;
+
+ if (from != from_aligned) {
+ ret = cqspi_direct_read_dma(f_pdata, buf, from,
+ from_aligned - from);
+ if (ret)
+ return ret;
+ buf += from_aligned - from;
+ }
+
+ if (len_aligned) {
+ ret = cqspi_tune_phy(f_pdata, true);
+ if (ret)
+ return ret;
+ ret = cqspi_direct_read_dma(f_pdata, buf, from_aligned,
+ len_aligned);
+ cqspi_tune_phy(f_pdata, false);
+ if (ret)
+ return ret;
+ buf += len_aligned;
+ }
+
+ if (to_aligned != (from + len)) {
+ ret = cqspi_direct_read_dma(f_pdata, buf, to_aligned,
+ (from + len) - to_aligned);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
}
static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (10 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 13/16] mtd: spinand: extract variant ranking logic into spinand_op_find_best() Santhosh Kumar K
` (4 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Enable PHY for indirect writes of at least CQSPI_PHY_MIN_INDIRECT_WRITE_LEN
bytes. PHY is activated only when tuning completed successfully and the
write op runs at the calibrated post-config frequency, matching the same
frequency guard used by the read path.
Thread max_speed_hz from cqspi_mem_process() through cqspi_write() into
cqspi_indirect_write_execute() for the frequency check.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 32 +++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 16e3b843f0aa..df7fcdf404a6 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -151,6 +151,8 @@ struct cqspi_driver_platdata {
#define CQSPI_DLL_TIMEOUT_US 300
/* Minimum transfer length to use DMA for direct reads */
#define CQSPI_PHY_MIN_DIRECT_READ_LEN 17
+/* Minimum indirect write length to amortize PHY enable/disable overhead */
+#define CQSPI_PHY_MIN_INDIRECT_WRITE_LEN SZ_1K
/* Runtime_pm autosuspend delay */
#define CQSPI_AUTOSUSPEND_TIMEOUT 2000
@@ -1240,13 +1242,15 @@ static int cqspi_write_setup(struct cqspi_flash_pdata *f_pdata,
static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
loff_t to_addr, const u8 *txbuf,
- const size_t n_tx)
+ const size_t n_tx,
+ u32 post_config_max_speed_hz)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
struct device *dev = &cqspi->pdev->dev;
void __iomem *reg_base = cqspi->iobase;
unsigned int remaining = n_tx;
unsigned int write_bytes;
+ bool use_tuned_phy_write;
int ret;
if (!refcount_read(&cqspi->refcount))
@@ -1282,6 +1286,18 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
if (cqspi->apb_ahb_hazard)
readl(reg_base + CQSPI_REG_INDIRECTWR);
+ /* Use PHY only for large writes at the calibrated rate */
+ use_tuned_phy_write = n_tx >= CQSPI_PHY_MIN_INDIRECT_WRITE_LEN &&
+ f_pdata->use_tuned_phy &&
+ f_pdata->phy_write_op.max_freq ==
+ post_config_max_speed_hz;
+
+ if (use_tuned_phy_write) {
+ ret = cqspi_tune_phy(f_pdata, true);
+ if (ret)
+ goto failwr;
+ }
+
while (remaining > 0) {
size_t write_words, mod_bytes;
@@ -1330,9 +1346,15 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
cqspi_wait_idle(cqspi);
+ if (use_tuned_phy_write)
+ cqspi_tune_phy(f_pdata, false);
+
return 0;
failwr:
+ if (use_tuned_phy_write)
+ cqspi_tune_phy(f_pdata, false);
+
/* Disable interrupt. */
writel(0, reg_base + CQSPI_REG_IRQMASK);
@@ -1467,7 +1489,8 @@ static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
}
static ssize_t cqspi_write(struct cqspi_flash_pdata *f_pdata,
- const struct spi_mem_op *op)
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
loff_t to = op->addr.val;
@@ -1494,7 +1517,8 @@ static ssize_t cqspi_write(struct cqspi_flash_pdata *f_pdata,
return cqspi_wait_idle(cqspi);
}
- return cqspi_indirect_write_execute(f_pdata, to, buf, len);
+ return cqspi_indirect_write_execute(f_pdata, to, buf, len,
+ post_config_max_speed_hz);
}
static bool cqspi_use_tuned_phy(struct cqspi_flash_pdata *f_pdata,
@@ -1710,7 +1734,7 @@ static int cqspi_mem_process(struct spi_mem *mem, const struct spi_mem_op *op)
if (!op->addr.nbytes || !op->data.buf.out)
return cqspi_command_write(f_pdata, op);
- return cqspi_write(f_pdata, op);
+ return cqspi_write(f_pdata, op, mem->spi->post_config_max_speed_hz);
}
static int cqspi_exec_mem_op(struct spi_mem *mem, const struct spi_mem_op *op)
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 13/16] mtd: spinand: extract variant ranking logic into spinand_op_find_best()
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (11 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation Santhosh Kumar K
` (3 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
spinand_select_op_variant() open-codes a loop that finds the fastest
eligible op variant by transfer duration. Extract this into a shared
helper spinand_op_find_best() that accepts a skip_mask bitmask of
already-tried variant indices, enabling callers to iterate variants in
ranked order while skipping previously attempted ones.
spinand_select_op_variant() becomes a one-liner. No functional change.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/nand/spi/core.c | 32 +++++++++++++++++++++++++++-----
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index f86786344d52..b678d0534297 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -1541,9 +1541,22 @@ static int spinand_init_odtr_instruction_set(struct spinand_device *spinand)
return 0;
}
+/*
+ * spinand_op_find_best() - Find the fastest eligible op variant.
+ * @spinand: SPI NAND device
+ * @variants: full variant list to search
+ * @odtr: true to consider ODTR ops, false for SSDR ops
+ * @skip_mask: bitmask of variant indices to skip (already tried)
+ *
+ * Iterates @variants, evaluates transfer duration for each eligible op, and
+ * returns a pointer to the fastest one not in @skip_mask. Returns NULL when
+ * no eligible variant remains. Used by both variant selection at init time
+ * (skip_mask == 0) and ranked PHY tuning iteration.
+ */
static const struct spi_mem_op *
-spinand_select_op_variant(struct spinand_device *spinand, enum spinand_bus_interface iface,
- const struct spinand_op_variants *variants)
+spinand_op_find_best(struct spinand_device *spinand,
+ const struct spinand_op_variants *variants, bool odtr,
+ u32 skip_mask)
{
struct nand_device *nand = spinand_to_nand(spinand);
const struct spi_mem_op *best_variant = NULL;
@@ -1551,15 +1564,16 @@ spinand_select_op_variant(struct spinand_device *spinand, enum spinand_bus_inter
unsigned int i;
for (i = 0; i < variants->nops; i++) {
- struct spi_mem_op op = variants->ops[i];
+ struct spi_mem_op op;
u64 op_duration_ns = 0;
unsigned int nbytes;
int ret;
- if ((iface == SSDR && spinand_op_is_odtr(&op)) ||
- (iface == ODTR && !spinand_op_is_odtr(&op)))
+ if ((skip_mask & BIT(i)) ||
+ spinand_op_is_odtr(&variants->ops[i]) != odtr)
continue;
+ op = variants->ops[i];
nbytes = nanddev_per_page_oobsize(nand) +
nanddev_page_size(nand);
@@ -1588,6 +1602,14 @@ spinand_select_op_variant(struct spinand_device *spinand, enum spinand_bus_inter
return best_variant;
}
+static const struct spi_mem_op *
+spinand_select_op_variant(struct spinand_device *spinand,
+ enum spinand_bus_interface iface,
+ const struct spinand_op_variants *variants)
+{
+ return spinand_op_find_best(spinand, variants, iface == ODTR, 0);
+}
+
/**
* spinand_match_and_init() - Try to find a match between a device ID and an
* entry in a spinand_info table
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (12 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 13/16] mtd: spinand: extract variant ranking logic into spinand_op_find_best() Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 15/16] mtd: spi-nor: extract read op template construction into helper Santhosh Kumar K
` (2 subsequent siblings)
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Dirmap descriptors encode the op template including the operating
frequency at creation time, so PHY tuning must complete before dirmaps
are created to ensure the validated frequency is embedded in the
descriptors from the start.
Move dirmap creation from spinand_init() to spinand_probe(), after a
new spinand_configure_phy() call that negotiates the best available PHY
operating point. spinand_configure_phy() tries the pre-selected variant
first. If the controller signals that PHY tuning is not applicable for
that op, spinand_try_phy_ranked() iterates remaining variants in
performance order — DTR variants first, then SDR variants after
switching the bus interface if needed. On full failure the device falls
back to the best available non-PHY mode.
Add spinand_reset_max_ops() to copy op templates with max_freq zeroed
before each execute_tuning call, enforcing the invariant that a non-zero
max_freq only results from a successful tuning.
PHY failure is non-fatal; the device operates at the conservative base
rate.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/nand/spi/core.c | 214 ++++++++++++++++++++++++++++++++++--
include/linux/mtd/spinand.h | 11 ++
2 files changed, 216 insertions(+), 9 deletions(-)
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index b678d0534297..5dcfaabaf2cc 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -1280,10 +1280,16 @@ static int spinand_create_dirmap(struct spinand_device *spinand,
/* The plane number is passed in MSB just above the column address */
info.offset = plane << fls(nand->memorg.pagesize);
+ /*
+ * Propagate the validated PHY frequency into the dirmap op templates
+ * at construction time.
+ */
+
/* Write descriptor */
info.length = nanddev_page_size(nand) + nanddev_per_page_oobsize(nand);
info.primary_op_tmpl = *spinand->op_templates->update_cache;
info.primary_op_tmpl.data.ecc = enable_ecc;
+ info.primary_op_tmpl.max_freq = spinand->max_write_op.max_freq;
desc = devm_spi_mem_dirmap_create(&spinand->spimem->spi->dev,
spinand->spimem, &info);
if (IS_ERR(desc))
@@ -1294,9 +1300,11 @@ static int spinand_create_dirmap(struct spinand_device *spinand,
/* Read descriptor */
info.primary_op_tmpl = *spinand->op_templates->read_cache;
info.primary_op_tmpl.data.ecc = enable_ecc;
+ info.primary_op_tmpl.max_freq = spinand->max_read_op.max_freq;
if (secondary_op) {
info.secondary_op_tmpl = *spinand->op_templates->cont_read_cache;
info.secondary_op_tmpl.data.ecc = enable_ecc;
+ info.secondary_op_tmpl.max_freq = spinand->max_read_op.max_freq;
}
desc = spinand_create_rdesc(spinand, &info);
if (IS_ERR(desc))
@@ -1744,6 +1752,17 @@ int spinand_match_and_init(struct spinand_device *spinand,
spinand->cont_read_possible = false;
}
+ /*
+ * Save the full read variant list (ODTR and SSDR ops) for PHY
+ * tuning iteration. Only saved when all ODTR templates are
+ * valid so spinand_configure_phy() knows ranked fallback is
+ * available.
+ */
+ if (spinand->odtr_op_templates.read_cache &&
+ spinand->odtr_op_templates.write_cache &&
+ spinand->odtr_op_templates.update_cache)
+ spinand->phy_read_variants = info->op_variants.read_cache;
+
return 0;
}
@@ -1922,7 +1941,6 @@ static int spinand_mtd_suspend(struct mtd_info *mtd)
static int spinand_init(struct spinand_device *spinand)
{
- struct device *dev = &spinand->spimem->spi->dev;
struct mtd_info *mtd = spinand_to_mtd(spinand);
struct nand_device *nand = mtd_to_nanddev(mtd);
int ret;
@@ -2014,14 +2032,6 @@ static int spinand_init(struct spinand_device *spinand)
mtd->ecc_step_size = nanddev_get_ecc_conf(nand)->step_size;
mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
- ret = spinand_create_dirmaps(spinand);
- if (ret) {
- dev_err(dev,
- "Failed to create direct mappings for read/write operations (err = %d)\n",
- ret);
- goto err_cleanup_ecc_engine;
- }
-
return 0;
err_cleanup_ecc_engine:
@@ -2050,6 +2060,178 @@ static void spinand_cleanup(struct spinand_device *spinand)
kfree(spinand->scratchbuf);
}
+/*
+ * spinand_try_phy_ranked() - Try PHY tuning on variants in performance order.
+ * @spinand: SPI NAND device
+ * @mem: SPI memory device
+ * @odtr: true to iterate ODTR variants, false for SSDR variants
+ * @tried_mask: bitmask of already-tried variant indices; updated on each try
+ *
+ * Iterates the full read variant list in descending performance order,
+ * skipping variants in @tried_mask, and calls execute_tuning on each until one
+ * succeeds. The fastest PHY-tunable variant is chosen regardless of which was
+ * pre-selected at init; ranked iteration finds the best available variant
+ * without re-trying already-attempted ones.
+ *
+ * On success, sets spinand->max_read_op and updates the matching
+ * odtr_op_templates.read_cache or ssdr_op_templates.read_cache.
+ */
+static bool spinand_try_phy_ranked(struct spinand_device *spinand,
+ struct spi_mem *mem, bool odtr,
+ u32 *tried_mask)
+{
+ const struct spinand_op_variants *variants = spinand->phy_read_variants;
+ const struct spi_mem_op *best;
+ int ret;
+
+ if (!variants)
+ return false;
+
+ while ((best = spinand_op_find_best(spinand, variants, odtr,
+ *tried_mask))) {
+ *tried_mask |= BIT(best - variants->ops);
+ spinand->max_read_op = *best;
+ spinand->max_read_op.max_freq = 0;
+ ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
+ &spinand->max_write_op);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(&mem->spi->dev, "%s PHY tuning failed: %d\n",
+ odtr ? "ODTR" : "SSDR", ret);
+ if (!ret && spinand->max_read_op.max_freq) {
+ if (odtr)
+ spinand->odtr_op_templates.read_cache = best;
+ else
+ spinand->ssdr_op_templates.read_cache = best;
+ return true;
+ }
+ }
+ return false;
+}
+
+/*
+ * spinand_reset_max_ops() - Copy op templates and zero max_freq on both.
+ * @spinand: SPI NAND device
+ * @templates: op template set to copy from
+ *
+ * Called before execute_tuning so max_freq starts at zero; execute_tuning sets
+ * it to the validated clock rate only on success. A non-zero max_freq means
+ * PHY-validated; zero means the base rate applies.
+ */
+static void spinand_reset_max_ops(struct spinand_device *spinand,
+ struct spinand_mem_ops *templates)
+{
+ spinand->max_read_op = *templates->read_cache;
+ spinand->max_read_op.max_freq = 0;
+ spinand->max_write_op = *templates->write_cache;
+ spinand->max_write_op.max_freq = 0;
+}
+
+/*
+ * spinand_configure_phy() - Negotiate the optimal PHY operating point.
+ * @spinand: SPI NAND device
+ * @mem: SPI memory device
+ *
+ * Tries the pre-selected variant first. If the controller signals that
+ * PHY tuning is not applicable for that specific op, iterates all remaining
+ * variants in performance order. For devices that support both DTR and SDR
+ * interfaces, DTR variants are tried first; if all fail the device is
+ * switched to SDR mode and SDR variants are tried. On full failure the
+ * device falls back to the best available non-PHY mode. Devices that
+ * support only SDR skip the DTR ranked pass entirely.
+ *
+ * PHY failure is never fatal.
+ *
+ * Note: tried_mask is u32, supporting up to 32 variants total across both
+ * ODTR and SSDR. Flash devices with more than 32 read variants are not
+ * supported.
+ */
+static void spinand_configure_phy(struct spinand_device *spinand,
+ struct spi_mem *mem)
+{
+ u32 tried_mask;
+ int ret;
+
+ spinand_reset_max_ops(spinand, spinand->op_templates);
+
+ ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
+ &spinand->max_write_op);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(&mem->spi->dev, "Failed to execute PHY tuning: %d\n",
+ ret);
+
+ /*
+ * Any non-zero return or a set max_freq means we are done (error,
+ * unsupported, or success). Fallback only for the op-specific "skip"
+ * signal: ret == 0 with max_freq still 0.
+ */
+ if (ret || spinand->max_read_op.max_freq)
+ return;
+
+ if (!mem->spi->post_config_max_speed_hz || spinand->bus_iface == SSDR ||
+ !spinand->phy_read_variants)
+ return;
+
+ if (WARN_ON(spinand->phy_read_variants->nops > 32))
+ return;
+
+ /* Mark the pre-selected ODTR variant as already tried */
+ tried_mask = BIT(spinand->odtr_op_templates.read_cache -
+ spinand->phy_read_variants->ops);
+
+ dev_dbg(&mem->spi->dev,
+ "PHY tuning skipped for current op; searching for best PHY variant\n");
+
+ /* Pass 1: try all remaining ODTR variants in performance order */
+ if (spinand_try_phy_ranked(spinand, mem, true, &tried_mask))
+ return;
+
+ /*
+ * Pass 2: switch to SSDR and try all SSDR variants in performance
+ * order.
+ *
+ * Only enter if we actually have SSDR support and a reconfigure
+ * callback. The hardware is still in ODTR mode here so no
+ * configure_chip call is needed to undo; just set up the ODTR non-PHY
+ * fallback and return.
+ */
+ if (!spinand->ssdr_op_templates.read_cache ||
+ !spinand->ssdr_op_templates.write_cache ||
+ !spinand->configure_chip)
+ goto use_odtr_non_phy;
+
+ if (spinand->configure_chip(spinand, SSDR))
+ goto use_odtr_non_phy;
+
+ spinand->op_templates = &spinand->ssdr_op_templates;
+ spinand->bus_iface = SSDR;
+ spinand->max_write_op = *spinand->ssdr_op_templates.write_cache;
+ spinand->max_write_op.max_freq = 0;
+
+ /*
+ * Only ODTR variants were candidates in Pass 1; SSDR bit positions
+ * are clear
+ */
+ if (spinand_try_phy_ranked(spinand, mem, false, &tried_mask))
+ return;
+
+ /*
+ * All PHY attempts exhausted. Revert to ODTR for non-PHY DTR
+ * operation. If revert fails, stay in SSDR — a mode mismatch
+ * (ODTR op templates on SSDR-mode device) would corrupt data.
+ */
+ if (spinand->configure_chip(spinand, ODTR)) {
+ dev_warn(&mem->spi->dev,
+ "Failed to revert to ODTR, staying in SSDR non-PHY\n");
+ spinand_reset_max_ops(spinand, &spinand->ssdr_op_templates);
+ return;
+ }
+
+use_odtr_non_phy:
+ spinand->op_templates = &spinand->odtr_op_templates;
+ spinand->bus_iface = ODTR;
+ spinand_reset_max_ops(spinand, &spinand->odtr_op_templates);
+}
+
static int spinand_probe(struct spi_mem *mem)
{
struct spinand_device *spinand;
@@ -2072,6 +2254,20 @@ static int spinand_probe(struct spi_mem *mem)
if (ret)
return ret;
+ /*
+ * Negotiate the best PHY operating point before creating dirmaps so
+ * the validated frequency is available at dirmap construction time.
+ */
+ spinand_configure_phy(spinand, mem);
+
+ ret = spinand_create_dirmaps(spinand);
+ if (ret) {
+ dev_err(&mem->spi->dev,
+ "Failed to create direct mappings for read/write operations (err = %d)\n",
+ ret);
+ goto err_spinand_cleanup;
+ }
+
ret = mtd_device_register(mtd, NULL, 0);
if (ret)
goto err_spinand_cleanup;
diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
index ec6efcfeef83..50a8319cf11e 100644
--- a/include/linux/mtd/spinand.h
+++ b/include/linux/mtd/spinand.h
@@ -791,8 +791,19 @@ struct spinand_device {
struct spinand_mem_ops *op_templates;
enum spinand_bus_interface bus_iface;
+ /*
+ * Full read variant list (ODTR and SSDR ops together), saved when ODTR
+ * templates are valid. Used by spinand_configure_phy() to iterate all
+ * candidates when the pre-selected variant cannot be PHY-tuned.
+ */
+ const struct spinand_op_variants *phy_read_variants;
+
struct spinand_dirmap *dirmaps;
+ /* Persistent op templates updated by execute_tuning with validated speed. */
+ struct spi_mem_op max_read_op;
+ struct spi_mem_op max_write_op;
+
int (*select_target)(struct spinand_device *spinand,
unsigned int target);
unsigned int cur_target;
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 15/16] mtd: spi-nor: extract read op template construction into helper
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (13 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency Santhosh Kumar K
2026-06-22 4:30 ` [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Mahapatra, Amit Kumar
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
From: Pratyush Yadav <pratyush@kernel.org>
spi_nor_spimem_read_data() and spi_nor_create_read_dirmap() both
open-code the same sequence: construct a SPI_MEM_OP, call
spi_nor_spimem_setup_op(), convert dummy cycles to bytes, and adjust
for DTR. The dirmap path also needed an explicit data buswidth fixup
because spi_nor_spimem_setup_op() skips buswidth when data.nbytes is
zero.
Extract this into spi_nor_spimem_get_read_op(). Using data.nbytes = 2
as a non-zero placeholder lets spi_nor_spimem_setup_op() configure the
data buswidth without a separate override; callers replace data.nbytes
with the actual transfer length before use. No functional change.
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/spi-nor/core.c | 66 +++++++++++++++++++++-----------------
1 file changed, 36 insertions(+), 30 deletions(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index ccf4396cdcd0..b683c077a233 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -188,6 +188,37 @@ static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
return nor->controller_ops->erase(nor, offs);
}
+/**
+ * spi_nor_spimem_get_read_op() - build a configured read op template
+ * @nor: the spi-nor device
+ *
+ * Returns a spi_mem_op with the command, address format, dummy cycles,
+ * and data buswidth configured for @nor. For direct reads, the caller
+ * must fill in addr.val, data.nbytes, and data.buf.in before use.
+ */
+static struct spi_mem_op spi_nor_spimem_get_read_op(struct spi_nor *nor)
+{
+ /*
+ * data.nbytes must be non-zero so spi_nor_spimem_setup_op()
+ * configures the data buswidth; callers replace it with the
+ * actual transfer length.
+ */
+ struct spi_mem_op op =
+ SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
+ SPI_MEM_OP_ADDR(nor->addr_nbytes, 0, 0),
+ SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
+ SPI_MEM_OP_DATA_IN(2, NULL, 0));
+
+ spi_nor_spimem_setup_op(nor, &op, nor->read_proto);
+
+ /* convert the dummy cycles to the number of bytes */
+ op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
+ if (spi_nor_protocol_is_dtr(nor->read_proto))
+ op.dummy.nbytes *= 2;
+
+ return op;
+}
+
/**
* spi_nor_spimem_read_data() - read data from flash's memory region via
* spi-mem
@@ -201,21 +232,14 @@ static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
static ssize_t spi_nor_spimem_read_data(struct spi_nor *nor, loff_t from,
size_t len, u8 *buf)
{
- struct spi_mem_op op =
- SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
- SPI_MEM_OP_ADDR(nor->addr_nbytes, from, 0),
- SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
- SPI_MEM_OP_DATA_IN(len, buf, 0));
+ struct spi_mem_op op = spi_nor_spimem_get_read_op(nor);
bool usebouncebuf;
ssize_t nbytes;
int error;
- spi_nor_spimem_setup_op(nor, &op, nor->read_proto);
-
- /* convert the dummy cycles to the number of bytes */
- op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
- if (spi_nor_protocol_is_dtr(nor->read_proto))
- op.dummy.nbytes *= 2;
+ op.addr.val = from;
+ op.data.nbytes = len;
+ op.data.buf.in = buf;
usebouncebuf = spi_nor_spimem_bounce(nor, &op);
@@ -3712,28 +3736,10 @@ static int spi_nor_create_read_dirmap(struct spi_nor *nor)
{
struct spi_mem_dirmap_info info = {
.op_tmpl = &info.primary_op_tmpl,
- .primary_op_tmpl = SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
- SPI_MEM_OP_ADDR(nor->addr_nbytes, 0, 0),
- SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
- SPI_MEM_OP_DATA_IN(0, NULL, 0)),
+ .primary_op_tmpl = spi_nor_spimem_get_read_op(nor),
.offset = 0,
.length = nor->params->size,
};
- struct spi_mem_op *op = info.op_tmpl;
-
- spi_nor_spimem_setup_op(nor, op, nor->read_proto);
-
- /* convert the dummy cycles to the number of bytes */
- op->dummy.nbytes = (nor->read_dummy * op->dummy.buswidth) / 8;
- if (spi_nor_protocol_is_dtr(nor->read_proto))
- op->dummy.nbytes *= 2;
-
- /*
- * Since spi_nor_spimem_setup_op() only sets buswidth when the number
- * of data bytes is non-zero, the data buswidth won't be set here. So,
- * do it explicitly.
- */
- op->data.buswidth = spi_nor_get_protocol_data_nbits(nor->read_proto);
nor->dirmap.rdesc = devm_spi_mem_dirmap_create(nor->dev, nor->spimem,
&info);
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (14 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 15/16] mtd: spi-nor: extract read op template construction into helper Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-22 4:30 ` [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Mahapatra, Amit Kumar
16 siblings, 0 replies; 24+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Run PHY tuning in spi_nor_probe() before creating dirmaps so the
validated frequency is available at dirmap construction time.
Store the configured read op template in nor->max_read_op and pass it
to spi_mem_execute_tuning(). On success the controller sets
max_read_op.max_freq to the calibrated rate. spi_nor_spimem_get_read_op()
propagates nor->max_read_op.max_freq into every op it returns, so the
validated frequency flows automatically into the dirmap template and into
regular read ops.
Tuning failure is non-fatal; the device operates at the conservative
base rate.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/spi-nor/core.c | 14 ++++++++++++++
include/linux/mtd/spi-nor.h | 3 +++
2 files changed, 17 insertions(+)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index b683c077a233..a2a2cf3d3603 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -216,6 +216,9 @@ static struct spi_mem_op spi_nor_spimem_get_read_op(struct spi_nor *nor)
if (spi_nor_protocol_is_dtr(nor->read_proto))
op.dummy.nbytes *= 2;
+ /* Propagate the validated frequency; zero before tuning. */
+ op.max_freq = nor->max_read_op.max_freq;
+
return op;
}
@@ -3843,6 +3846,17 @@ static int spi_nor_probe(struct spi_mem *spimem)
return -ENOMEM;
}
+ /*
+ * Populate the persistent template and run PHY tuning before dirmap
+ * creation so the validated frequency feeds into the dirmap op.
+ * Tuning failure is non-fatal; the device operates at base speed.
+ */
+ nor->max_read_op = spi_nor_spimem_get_read_op(nor);
+
+ ret = spi_mem_execute_tuning(spimem, &nor->max_read_op, NULL);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(dev, "Failed to execute PHY tuning: %d\n", ret);
+
ret = spi_nor_create_read_dirmap(nor);
if (ret)
return ret;
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index 4b92494827b1..ab498a50f15f 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -422,6 +422,9 @@ struct spi_nor {
struct spi_mem_dirmap_desc *wdesc;
} dirmap;
+ /* Persistent op template updated by execute_tuning with validated speed. */
+ struct spi_mem_op max_read_op;
+
void *priv;
};
--
2.34.1
^ permalink raw reply related [flat|nested] 24+ messages in thread* RE: [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (15 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency Santhosh Kumar K
@ 2026-06-22 4:30 ` Mahapatra, Amit Kumar
16 siblings, 0 replies; 24+ messages in thread
From: Mahapatra, Amit Kumar @ 2026-06-22 4:30 UTC (permalink / raw)
To: Santhosh Kumar K, broonie@kernel.org, robh@kernel.org,
krzk+dt@kernel.org, conor+dt@kernel.org,
miquel.raynal@bootlin.com, richard@nod.at, vigneshr@ti.com,
pratyush@kernel.org, mwalle@kernel.org,
takahiro.kuwano@infineon.com
Cc: linux-spi@vger.kernel.org, devicetree@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org,
praneeth@ti.com, u-kumar1@ti.com, a-dutta@ti.com,
git (AMD-Xilinx), Amit Mohapatra
AMD General
Hello Santosh,
> -----Original Message-----
> From: Santhosh Kumar K <s-k6@ti.com>
> Sent: Thursday, June 18, 2026 1:07 PM
> To: broonie@kernel.org; robh@kernel.org; krzk+dt@kernel.org;
> conor+dt@kernel.org; miquel.raynal@bootlin.com; richard@nod.at;
> vigneshr@ti.com; pratyush@kernel.org; mwalle@kernel.org;
> takahiro.kuwano@infineon.com
> Cc: linux-spi@vger.kernel.org; devicetree@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-mtd@lists.infradead.org; praneeth@ti.com; u-
> kumar1@ti.com; a-dutta@ti.com; s-k6@ti.com
> Subject: [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support
>
> This series implements PHY tuning support for the Cadence QSPI controller to
> enable reliable high-speed operations. Without PHY tuning, controllers use
> conservative timing that limits performance. PHY tuning calibrates RX/TX delay lines
> to find optimal data capture timing windows, enabling operation up to the controller's
> maximum frequency.
>
> Background:
> High-speed SPI memory controllers require precise timing calibration for reliable
> operation. At higher frequencies, board-to-board variations make fixed timing
> parameters inadequate. The Cadence QSPI controller includes a PHY interface with
> programmable delay lines (0-127 taps) for RX and TX paths, but these require
> runtime calibration to find the valid timing window.
>
> Approach:
> Add SDR/DDR PHY tuning algorithms for the Cadence controller:
>
> SDR Mode Tuning (1D search):
> - Searches for two consecutive valid RX delay windows
> - Selects the larger window and uses its midpoint for maximum margin
> - TX delay fixed at maximum (127) as it's less critical in SDR
>
> DDR Mode Tuning (2D search):
> - Finds RX boundaries (rxlow/rxhigh) using TX window sweeps
> - Finds TX boundaries (txlow/txhigh) at fixed RX positions
> - Defines valid region corners and detects gaps via binary search
> - Applies temperature compensation for optimal point selection
> - Handles single or dual passing regions with different strategies
Thank you for this series. I had a question regarding the Virtual Concat
driver patch series [1]. Now that it has been merged into the kernel and
enables support for multiple flash devices connected in stacked mode-where
each flash device is probed and configured independently-if both flash
parts are required to operate in DDR mode, each device would need to
perform tuning and store its tuning data separately.
Given this, should we consider this use case and adapt the tuning
architecture to support it?
I'd appreciate your thoughts on this.
[1] https://lore.kernel.org/all/20260204-mtd-virt-concat-v17-0-5e98239bb55b@bootlin.com/
Regards,
Amit
>
> Patch description:
> Infrastructure (1-5):
> - Patch 1: Add spi-max-post-config-frequency to describe maximum
> frequency achievable post controller configuration
> - Patch 2: Add spi-phy-pattern-partition phandle for
> NOR flash PHY tuning pattern location
> - Patch 3: Parse spi-max-post-config-frequency in spi.c; adds
> spi_device.post_config_max_speed_hz (0 when not set
> keeping all existing DT fully compatible)
> - Patch 4: Extend spi_mem_adjust_op_freq() with a bypass: if
> op->max_freq equals post_config_max_speed_hz, return
> immediately leaving op->max_freq unchanged. All other
> ops are capped to max_speed_hz
> - Patch 5: Add execute_tuning callback to spi_controller_mem_ops and
> spi_mem_execute_tuning() wrapper in SPI-MEM core
>
> Cadence QSPI Implementation (6-12):
> - Patch 6: Move cqspi_readdata_capture() earlier (preparatory)
> - Patch 7: Add DQS bit to cqspi_readdata_capture() (preparatory)
> - Patch 8: Add complete PHY tuning support: DLL management, pattern
> verification (NOR via spi-phy-pattern-partition phandle,
> NAND via write-to-cache), SDR 1D and DDR 2D search
> algorithms with temperature compensation, AM654-specific
> execute_tuning entry point;
> - Patch 9: Reject 2-byte-address DDR operations via a new
> CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk flag to work around
> AM654 OSPI erratum i2383
> - Patch 10: Refactor direct read path for PHY support (preparatory)
> - Patch 11: Enable PHY for direct reads, split the transfer into an
> unaligned head, a 16-byte-aligned middle section with PHY
> active, and an unaligned tail
> - Patch 12: Enable PHY for indirect writes of at least
> CQSPI_PHY_MIN_INDIRECT_WRITE_LEN bytes
>
> MTD core (13-16):
> - Patch 13: Extract spinand_select_op_variant() into a shared helper
> spinand_op_find_best() with a skip_mask
> - Patch 14: Negotiate optimal PHY operating point before dirmap
> creation
> - Patch 15: Extract spi_nor_spimem_get_read_op() helper (preparatory)
> - Patch 16: Execute PHY tuning in spi_nor_probe() before creating
> dirmaps
>
> Testing:
> This series was tested on TI's
> AM62Ax SK with OSPI NAND flash and
> AM62Px SK with OSPI NOR flash:
>
> Read throughput:
> |----------------------------------------|
> | | non-PHY | PHY |
> |----------------------------------------|
> | OSPI NOR (8D) | 37.5 MB/s | 216 MB/s |
> |----------------------------------------|
> | OSPI NAND (8S) | 9.2 MB/s | 35.1 MB/s |
> |----------------------------------------|
>
> Write throughput:
> |----------------------------------------|
> | | non-PHY | PHY |
> |----------------------------------------|
> | OSPI NAND (8S) | 6 MB/s | 9.2 MB/s |
> |----------------------------------------|
>
> Test log: https://gist.github.com/santhosh21/fe98754e52970287eb9011154100b62d
> Repo: https://github.com/santhosh21/linux/commits/phy_tuning_v4/
>
> Changes in v4:
> - Add spi-max-post-config-frequency instead of extending spi-max-frequency
> to accept an optional second value
> - Replace spi_mem_apply_base_freq_cap() with spi_mem_adjust_op_freq()
> extension
> - For SPI NOR/NAND, execute PHY tuning before the dirmap creation
> - For SPI NAND, execute PHY tuning across all operation variants available,
> perform duration comparison, and select the best resulting variant
> by taking controller-specific restrictions into account
> - Move i2383 check from cqspi_supports_mem_op() to
> cqspi_am654_ospi_execute_tuning()
> - Rename cdns,phy-pattern-partition to spi-phy-pattern-partition,
> cqspi_phy_enable to cqspi_tune_phy and f_pdata->use_phy to use_tuned_phy
> - Remove redundant spi-max-frequency parsing in driver cqspi_of_get_flash_pdata()
> - Extract DMA refactoring into a preparatory patch
> - Rebase on v7.1
> - Collect tags from Miquel
> - Link to v3: https://lore.kernel.org/linux-spi/20260527175527.2247679-1-s-
> k6@ti.com/
>
> Changes in v3:
> - Drop spi-has-dqs DT property; DQS is now enabled automatically when
> the selected read operation uses DDR signalling (dtr flags in the op)
> - Extend spi-max-frequency to accept an optional second value forming a
> [base-freq, max-freq] pair; the presence of two values signals PHY
> tuning intent and encodes both the conservative base speed and the
> calibration target in one property
> - Add base_speed_hz to struct spi_device (spi.c/spi.h) and parse the
> two-element array there; single-value DT is fully backward-compatible
> - Move frequency enforcement from the cadence driver to core: new
> spi_mem_apply_base_freq_cap() called from spi_mem_exec_op() replaces
> the per-driver cqspi_op_matches_tuned() and non_phy_clk_rate field
> - Propagate the tuned max_freq to dirmap op templates after
> execute_tuning() succeeds; store persistent op templates in
> spi_nor.max_read_op and spinand.{max_read,max_write}_op so the
> frequency writeback survives across the probe call
> - Replace NOR pattern partition lookup by name with a
> cdns,phy-pattern-partition DT phandle pointing directly to the
> partition node
> - Add CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk and reject 2-byte-address DDR
> ops in cqspi_supports_mem_op() to work around AM654 erratum i2383
> - Remove RFC tag
> - Rebase on v7.1-rc5
> - Collect tags from Miquel
> - Link to v2: https://lore.kernel.org/linux-spi/20260113141617.1905039-1-s-
> k6@ti.com/
>
> Changes in v2:
> - Restructure the .execute_tuning() call from spi-mem clients instead
> of mtdcore with best read_op and write_op (optional) passed
> - Add compatible-specific .execute_tuning() call which can be called by
> spi_mem_execute_tuning() if exists
> - Handle tuning requirement check by controller instead of spi-mem
> clients
> - Add support to write the phy_pattern to cache if relevant write_op
> is passed or get the partition offset which contains the phy_pattern
> - Add tuning algorithm for DDR mode
> - Add support for DQS
> - Restrict PHY frequency to tuned operations
> - Link to v1: https://lore.kernel.org/linux-spi/20250811193219.731851-1-s-
> k6@ti.com/
>
> Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
>
> Pratyush Yadav (1):
> mtd: spi-nor: extract read op template construction into helper
>
> Santhosh Kumar K (15):
> spi: dt-bindings: add spi-max-post-config-frequency property
> spi: dt-bindings: add spi-phy-pattern-partition property
> spi: parse spi-max-post-config-frequency into post_config_max_speed_hz
> spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops
> spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
> spi: cadence-quadspi: move cqspi_readdata_capture earlier
> spi: cadence-quadspi: add DQS support to read data capture
> spi: cadence-quadspi: add PHY tuning support
> spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops
> (i2383)
> spi: cadence-quadspi: refactor direct read path for PHY support
> spi: cadence-quadspi: enable PHY for direct reads
> spi: cadence-quadspi: enable PHY for indirect writes
> mtd: spinand: extract variant ranking logic into
> spinand_op_find_best()
> mtd: spinand: negotiate optimal PHY operating point before dirmap
> creation
> mtd: spi-nor: run PHY tuning after init and update dirmap frequency
>
> .../bindings/spi/cdns,qspi-nor.yaml | 19 +
> .../bindings/spi/spi-peripheral-props.yaml | 13 +
> drivers/mtd/nand/spi/core.c | 246 +-
> drivers/mtd/spi-nor/core.c | 80 +-
> drivers/spi/spi-cadence-quadspi.c | 2265 +++++++++++++++--
> drivers/spi/spi-mem.c | 40 +
> drivers/spi/spi.c | 2 +
> include/linux/mtd/spi-nor.h | 3 +
> include/linux/mtd/spinand.h | 11 +
> include/linux/spi/spi-mem.h | 14 +
> include/linux/spi/spi.h | 3 +
> 11 files changed, 2493 insertions(+), 203 deletions(-)
>
> --
> 2.34.1
^ permalink raw reply [flat|nested] 24+ messages in thread