* [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property Santhosh Kumar K
` (14 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add spi-max-post-config-frequency, a generic uint32 property for SPI
peripherals that support two distinct clock rates: a conservative rate
always reachable without controller configuration, and a higher rate
reachable only after controller-side configuration such as PHY tuning.
When both properties are present, spi-max-frequency gives the
conservative pre-configuration rate and spi-max-post-config-frequency
gives the higher post-configuration target.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
.../devicetree/bindings/spi/spi-peripheral-props.yaml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
index 880a9f624566..ece86f65930f 100644
--- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
+++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
@@ -45,6 +45,12 @@ properties:
description:
Maximum SPI clocking speed of the device in Hz.
+ spi-max-post-config-frequency:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ Maximum SPI clock frequency in Hz achievable post controller-side
+ configuration.
+
spi-cs-setup-delay-ns:
description:
Delay in nanoseconds to be introduced by the controller after CS is
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:50 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz Santhosh Kumar K
` (13 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add spi-phy-pattern-partition, a per-device phandle property on the
flash sub-node that allows the DT author to directly reference the
partition holding the PHY tuning pattern. Used to locate the pattern
data during PHY tuning when the device cannot load the pattern
dynamically.
"Read PHY tuning" works by reading a known data pattern from the device
repeatedly while sweeping controller delay parameters until the
capture window is stable. For SPI NAND, the driver loads the pattern
into the page cache once using write-to-cache opcodes, then reads it
during the sweep. SPI NOR devices have no equivalent opcode, so the
pattern must be pre-programmed in a dedicated flash partition. One
partition per device is required to keep the procedure unambiguous
when multiple devices share a bus.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
.../bindings/spi/cdns,qspi-nor.yaml | 19 +++++++++++++++++++
.../bindings/spi/spi-peripheral-props.yaml | 7 +++++++
2 files changed, 26 insertions(+)
diff --git a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
index 891f578b5ac4..c6f1b1d1251d 100644
--- a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
+++ b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
@@ -204,10 +204,29 @@ examples:
flash@0 {
compatible = "jedec,spi-nor";
reg = <0x0>;
+ #address-cells = <1>;
+ #size-cells = <1>;
cdns,read-delay = <4>;
cdns,tshsl-ns = <60>;
cdns,tsd2d-ns = <60>;
cdns,tchsh-ns = <60>;
cdns,tslch-ns = <60>;
+ spi-phy-pattern-partition = <&phy_pattern>;
+
+ partitions {
+ compatible = "fixed-partitions";
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ partition@0 {
+ label = "data";
+ reg = <0x0 0x3fc0000>;
+ };
+
+ phy_pattern: partition@3fc0000 {
+ label = "phy-pattern";
+ reg = <0x3fc0000 0x40000>;
+ };
+ };
};
};
diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
index ece86f65930f..38708f8197f9 100644
--- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
+++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
@@ -123,6 +123,13 @@ properties:
description:
Delay, in microseconds, after a write transfer.
+ spi-phy-pattern-partition:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ Phandle to the flash partition holding the pre-programmed PHY tuning
+ pattern. Used when the device cannot load the pattern dynamically during
+ PHY tuning.
+
stacked-memories:
description: Several SPI memories can be wired in stacked mode.
This basically means that either a device features several chip
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property
2026-06-18 7:37 ` [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property Santhosh Kumar K
@ 2026-06-18 7:50 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:50 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: devicetree, robh, conor+dt
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] Unnecessary `#address-cells` and `#size-cells` properties in the `flash@0` node will cause `dtc` warnings and violate the MTD schema.
--
commit 7871aada614c3c1ea1780c6f6248e24acac0f319
Author: Santhosh Kumar K <s-k6@ti.com>
spi: dt-bindings: add spi-phy-pattern-partition property
This commit introduces the spi-phy-pattern-partition property to the SPI flash
sub-node to allow referencing a specific partition containing a PHY tuning
pattern. This is used when devices like SPI NOR cannot load the pattern
dynamically.
> diff --git a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
> --- a/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
> +++ b/Documentation/devicetree/bindings/spi/cdns,qspi-nor.yaml
> @@ -204,10 +204,29 @@ examples:
> flash@0 {
> compatible = "jedec,spi-nor";
> reg = <0x0>;
> + #address-cells = <1>;
> + #size-cells = <1>;
[Severity: Low]
Will adding these cell properties to the flash@0 node trigger an
avoid_unnecessary_addr_size warning during dt_binding_check?
Since the newly added partitions child node does not use a unit address,
these properties do not appear to be necessary on the flash node itself.
Does this also conflict with the MTD schema, which deprecates these properties
at the MTD flash node level in favor of placing them strictly inside the
partitions container?
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=2
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 01/16] spi: dt-bindings: add spi-max-post-config-frequency property Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 02/16] spi: dt-bindings: add spi-phy-pattern-partition property Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:54 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops Santhosh Kumar K
` (12 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add post_config_max_speed_hz to struct spi_device and parse it from
the spi-max-post-config-frequency DT property in of_spi_parse_dt().
This supports SPI devices that operate at two distinct clock rates: a
conservative rate always reachable without controller configuration,
and a higher rate achievable only after controller-side configuration
such as PHY tuning. With both properties set, spi-max-frequency gives
the conservative pre-configuration rate and post_config_max_speed_hz
carries the post-configuration target for the SPI-MEM layer.
Zero when not set, preserving existing behaviour.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi.c | 2 ++
include/linux/spi/spi.h | 3 +++
2 files changed, 5 insertions(+)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 76e3563c523f..36951ab47a2f 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2599,6 +2599,8 @@ static int of_spi_parse_dt(struct spi_controller *ctlr, struct spi_device *spi,
/* Device speed */
if (!of_property_read_u32(nc, "spi-max-frequency", &value))
spi->max_speed_hz = value;
+ if (!of_property_read_u32(nc, "spi-max-post-config-frequency", &value))
+ spi->post_config_max_speed_hz = value;
/* Device CS delays */
of_spi_parse_dt_cs_delay(nc, &spi->cs_setup, "spi-cs-setup-delay-ns");
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index f6ed93eff00b..2d90ec91450a 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -139,6 +139,8 @@ extern void spi_transfer_cs_change_delay_exec(struct spi_message *msg,
* @max_speed_hz: Maximum clock rate to be used with this chip
* (on this board); may be changed by the device's driver.
* The spi_transfer.speed_hz can override this for each transfer.
+ * @post_config_max_speed_hz: Maximum clock rate achievable after controller
+ * configuration (e.g. PHY tuning); zero when not assigned.
* @bits_per_word: Data transfers involve one or more words; word sizes
* like eight or 12 bits are common. In-memory wordsizes are
* powers of two bytes (e.g. 20 bit samples use 32 bits).
@@ -191,6 +193,7 @@ struct spi_device {
struct device dev;
struct spi_controller *controller;
u32 max_speed_hz;
+ u32 post_config_max_speed_hz;
u8 bits_per_word;
bool rt;
#define SPI_NO_TX BIT(31) /* No transmit wire */
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz
2026-06-18 7:37 ` [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz Santhosh Kumar K
@ 2026-06-18 7:54 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:54 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: robh, conor+dt, devicetree
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] spi->post_config_max_speed_hz is not capped by the SPI controller's maximum speed in spi_setup().
--
commit 7b33747ab555fad6b7ce057a971c3935bb2ff311
Author: Santhosh Kumar K <s-k6@ti.com>
spi: parse spi-max-post-config-frequency into post_config_max_speed_hz
This patch adds post_config_max_speed_hz to struct spi_device and parses
it from the spi-max-post-config-frequency device tree property. It allows
SPI devices to operate at a conservative speed initially and a higher speed
after controller-side configuration like PHY tuning.
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 104279858f567..c423d5c9af202 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -2601,6 +2601,8 @@ static int of_spi_parse_dt(struct spi_controller *ctlr, struct spi_device *spi,
> /* Device speed */
> if (!of_property_read_u32(nc, "spi-max-frequency", &value))
> spi->max_speed_hz = value;
> + if (!of_property_read_u32(nc, "spi-max-post-config-frequency", &value))
> + spi->post_config_max_speed_hz = value;
[Severity: Medium]
Does post_config_max_speed_hz also need to be capped against the SPI
controller's maximum speed limits?
Currently in __spi_setup(), max_speed_hz is explicitly checked and capped
against spi->controller->max_speed_hz:
drivers/spi/spi.c:__spi_setup() {
...
if (spi->controller->max_speed_hz &&
(!spi->max_speed_hz ||
spi->max_speed_hz > spi->controller->max_speed_hz))
spi->max_speed_hz = spi->controller->max_speed_hz;
...
}
However, post_config_max_speed_hz skips this validation. This allows an
uncapped, out-of-bounds frequency from the device tree to be passed down
to controller drivers. For instance, an excessively high frequency could later
cause a divisor calculation underflow in drivers like cqspi_config_baudrate_div()
when the frequency bypasses the max_speed_hz cap.
>
> /* Device CS delays */
> of_spi_parse_dt_cs_delay(nc, &spi->cs_setup, "spi-cs-setup-delay-ns");
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=3
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (2 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 03/16] spi: parse spi-max-post-config-frequency into post_config_max_speed_hz Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 8:02 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
` (11 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
When a device exposes both a conservative base speed (spi-max-frequency)
and a maximum post-configuration speed (spi-max-post-config-frequency),
operations validated after controller configuration must run at the
higher rate while all others are capped at the base rate.
Extend spi_mem_adjust_op_freq() with a bypass: if op->max_freq equals
post_config_max_speed_hz (the value written by execute_tuning on
success), return immediately leaving op->max_freq unchanged. All other
ops are capped to max_speed_hz, the always-reachable base rate. This
integrates the policy into the single existing frequency-adjustment
point so exec_op(), supports_op(), and calc_op_duration() all behave
consistently.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-mem.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index a88b9f038356..e20eca1b8245 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -591,9 +591,18 @@ EXPORT_SYMBOL_GPL(spi_mem_adjust_op_size);
* Some chips have per-op frequency limitations and must adapt the maximum
* speed. This function allows SPI mem drivers to set @op->max_freq to the
* maximum supported value.
+ *
+ * When @mem->spi->post_config_max_speed_hz is set, ops with @op->max_freq
+ * equal to that value are treated as post-configuration ops (e.g. PHY-tuned)
+ * and are allowed to run at the full post-config rate. All other ops are
+ * capped to @mem->spi->max_speed_hz, the always-reachable base rate.
*/
void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op)
{
+ if (mem->spi->post_config_max_speed_hz &&
+ op->max_freq == mem->spi->post_config_max_speed_hz)
+ return;
+
if (!op->max_freq || op->max_freq > mem->spi->max_speed_hz)
op->max_freq = mem->spi->max_speed_hz;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops
2026-06-18 7:37 ` [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops Santhosh Kumar K
@ 2026-06-18 8:02 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 8:02 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: conor+dt, devicetree, robh
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Overloading `op->max_freq` as a magic exact-match token to identify post-configuration operations breaks frequency limit semantics, preventing valid operations from running at tuned speeds or erroneously allowing untuned operations to bypass safe limits.
--
commit c232673d530121d1e383d41f1db73523ed370bfb
Author: Santhosh Kumar K <s-k6@ti.com>
spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops
This commit extends spi_mem_adjust_op_freq() to allow post-configuration
operations to run at higher speeds. It bypasses the max_speed_hz cap if the
operation's max_freq exactly matches the post_config_max_speed_hz value.
> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
> index f64eda9bbd9f0..b0b111642a358 100644
> --- a/drivers/spi/spi-mem.c
> +++ b/drivers/spi/spi-mem.c
[ ... ]
> @@ -584,9 +584,18 @@ EXPORT_SYMBOL_GPL(spi_mem_adjust_op_size);
> * Some chips have per-op frequency limitations and must adapt the maximum
> * speed. This function allows SPI mem drivers to set @op->max_freq to the
> * maximum supported value.
> + *
> + * When @mem->spi->post_config_max_speed_hz is set, ops with @op->max_freq
> + * equal to that value are treated as post-configuration ops (e.g. PHY-tuned)
> + * and are allowed to run at the full post-config rate. All other ops are
> + * capped to @mem->spi->max_speed_hz, the always-reachable base rate.
> */
> void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op)
> {
> + if (mem->spi->post_config_max_speed_hz &&
> + op->max_freq == mem->spi->post_config_max_speed_hz)
> + return;
[Severity: High]
Does this overload op->max_freq as a magic exact-match token to identify
post-configuration operations?
If an untuned operation (e.g., a write or alternate read variant defined in a
flash driver with a static hardware limit of 104MHz) happens to exactly match
the configured post_config_max_speed_hz, won't it incorrectly bypass the safe
max_speed_hz cap?
Furthermore, controller drivers like Cadence QuadSPI appear to rely on this
exact same equality check to decide whether to route the incoming operation
through the hardware PHY:
drivers/spi/spi-cadence-quadspi.c:cqspi_use_tuned_phy() {
...
return f_pdata->use_tuned_phy &&
op->max_freq == post_config_max_speed_hz;
}
Could this cause uncalibrated operations to be erroneously executed via the
PHY at high speeds, leading to potential data corruption or hardware lockups?
Would it be safer to use an explicit boolean flag (e.g., op->is_tuned) rather
than hijacking the frequency limit field?
> +
> if (!op->max_freq || op->max_freq > mem->spi->max_speed_hz)
> op->max_freq = mem->spi->max_speed_hz;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=4
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (3 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 04/16] spi: spi-mem: teach spi_mem_adjust_op_freq() about post-config ops Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:57 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier Santhosh Kumar K
` (10 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
SPI memory controllers that support high-speed operating modes often
require a tuning procedure to calibrate internal timing before operating
at maximum frequency. There is currently no standard spi-mem interface
for drivers to trigger this procedure.
Add an execute_tuning callback to struct spi_controller_mem_ops. The
callback receives a mandatory read op template and an optional write op
template. On success the controller sets op->max_freq in each provided
template to the validated clock rate.
Add the corresponding spi_mem_execute_tuning() wrapper that validates
inputs and returns -EOPNOTSUPP when the controller has not implemented
the callback, allowing callers to handle controllers that do not support
tuning gracefully.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-mem.c | 31 +++++++++++++++++++++++++++++++
include/linux/spi/spi-mem.h | 14 ++++++++++++++
2 files changed, 45 insertions(+)
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index e20eca1b8245..571a7dd9c2a4 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -660,6 +660,37 @@ u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op)
}
EXPORT_SYMBOL_GPL(spi_mem_calc_op_duration);
+/**
+ * spi_mem_execute_tuning() - Execute controller tuning procedure
+ * @mem: the SPI memory device
+ * @read_op: read operation template (mandatory)
+ * @write_op: write operation template (optional, may be NULL)
+ *
+ * Requests the controller to perform tuning for high-speed operation
+ * using the provided op templates. On success the controller callback
+ * sets @read_op->max_freq (and @write_op->max_freq when non-NULL) to
+ * the validated clock rate.
+ *
+ * Return: 0 on success, -EINVAL if @mem or @read_op is NULL,
+ * -EOPNOTSUPP if controller doesn't support tuning,
+ * or a controller-specific error code on failure.
+ */
+int spi_mem_execute_tuning(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct spi_controller *ctlr;
+
+ if (!mem || !read_op)
+ return -EINVAL;
+
+ ctlr = mem->spi->controller;
+ if (!ctlr->mem_ops || !ctlr->mem_ops->execute_tuning)
+ return -EOPNOTSUPP;
+
+ return ctlr->mem_ops->execute_tuning(mem, read_op, write_op);
+}
+EXPORT_SYMBOL_GPL(spi_mem_execute_tuning);
+
static ssize_t spi_mem_no_dirmap_read(struct spi_mem_dirmap_desc *desc,
u64 offs, size_t len, void *buf)
{
diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h
index f660bb2e9f85..ef5a6d60bae9 100644
--- a/include/linux/spi/spi-mem.h
+++ b/include/linux/spi/spi-mem.h
@@ -346,6 +346,15 @@ static inline void *spi_mem_get_drvdata(struct spi_mem *mem)
* @poll_status: poll memory device status until (status & mask) == match or
* when the timeout has expired. It fills the data buffer with
* the last status value.
+ * @execute_tuning: run the controller tuning procedure using the provided
+ * read and optional write op templates. On success, set
+ * @read_op->max_freq (and @write_op->max_freq when non-NULL)
+ * to the validated clock rate. Return a negative errno on
+ * error. Return -EOPNOTSUPP if the controller has no tuning
+ * capability at all. Return 0 with @read_op->max_freq left at
+ * zero to signal that this specific op cannot be PHY-tuned
+ * (e.g. a hardware erratum blocks it) but another variant may
+ * succeed; the caller will iterate remaining op variants.
*
* This interface should be implemented by SPI controllers providing an
* high-level interface to execute SPI memory operation, which is usually the
@@ -376,6 +385,8 @@ struct spi_controller_mem_ops {
unsigned long initial_delay_us,
unsigned long polling_rate_us,
unsigned long timeout_ms);
+ int (*execute_tuning)(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
};
/**
@@ -465,6 +476,9 @@ int spi_mem_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op);
void spi_mem_adjust_op_freq(struct spi_mem *mem, struct spi_mem_op *op);
u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op);
+int spi_mem_execute_tuning(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
+
bool spi_mem_supports_op(struct spi_mem *mem,
const struct spi_mem_op *op);
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
2026-06-18 7:37 ` [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
@ 2026-06-18 7:57 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:57 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: conor+dt, devicetree, robh
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Missing bus lock in spi_mem_execute_tuning() leads to data corruption and hardware state race conditions.
- [Medium] spi_mem_execute_tuning() invokes hardware tuning without verifying if a GPIO chip-select is in use, causing spurious tuning failures.
--
commit 488642092bb43537c3028baead18a5f59abeb41c
Author: Santhosh Kumar K <s-k6@ti.com>
spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning()
This commit adds an execute_tuning callback to struct spi_controller_mem_ops
and a corresponding spi_mem_execute_tuning() wrapper. This provides a standard
interface for SPI memory drivers to trigger controller tuning procedures
required for high-speed operation modes.
> diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
> index b0b111642a358..ae9f55f4ef90d 100644
> --- a/drivers/spi/spi-mem.c
> +++ b/drivers/spi/spi-mem.c
[ ... ]
> @@ -653,6 +653,37 @@ u64 spi_mem_calc_op_duration(struct spi_mem *mem, struct spi_mem_op *op)
> }
> EXPORT_SYMBOL_GPL(spi_mem_calc_op_duration);
>
> +/**
> + * spi_mem_execute_tuning() - Execute controller tuning procedure
> + * @mem: the SPI memory device
> + * @read_op: read operation template (mandatory)
> + * @write_op: write operation template (optional, may be NULL)
> + *
> + * Requests the controller to perform tuning for high-speed operation
> + * using the provided op templates. On success the controller callback
> + * sets @read_op->max_freq (and @write_op->max_freq when non-NULL) to
> + * the validated clock rate.
> + *
> + * Return: 0 on success, -EINVAL if @mem or @read_op is NULL,
> + * -EOPNOTSUPP if controller doesn't support tuning,
> + * or a controller-specific error code on failure.
> + */
> +int spi_mem_execute_tuning(struct spi_mem *mem, struct spi_mem_op *read_op,
> + struct spi_mem_op *write_op)
> +{
> + struct spi_controller *ctlr;
> +
> + if (!mem || !read_op)
> + return -EINVAL;
> +
> + ctlr = mem->spi->controller;
> + if (!ctlr->mem_ops || !ctlr->mem_ops->execute_tuning)
> + return -EOPNOTSUPP;
[Severity: Medium]
Hardware tuning engines are usually hardwired to toggle the controller's
dedicated native CS line and cannot natively toggle a GPIO CS.
If we omit checking for a GPIO CS here, could the hardware tuning sequence
run on a GPIO CS configuration? The hardware sequencer might generate clock
and data without asserting the GPIO CS, causing the flash device to ignore
the tuning phase and resulting in a spurious failure.
Should this verify !spi_get_csgpiod(mem->spi, 0) before proceeding to invoke
the hardware-accelerated callback, similar to spi_mem_exec_op() and
spi_mem_poll_status()?
> +
> + return ctlr->mem_ops->execute_tuning(mem, read_op, write_op);
[Severity: High]
Does this execute_tuning() callback invocation need to be protected by the
SPI controller's bus_lock_mutex?
Hardware-specific tuning implementations often directly perform
read-modify-write operations on global controller registers to configure PHY
pipelines without necessarily relying entirely on spi_mem_exec_op() during
all stages.
If the bus lock is not held (via spi_mem_access_start() and
spi_mem_access_end()), could these unprotected register accesses violently
race with concurrent SPI transfers occurring on other chip-selects on the same
bus, potentially corrupting hardware states like PHY_EN or ENABLE bits?
> +}
> +EXPORT_SYMBOL_GPL(spi_mem_execute_tuning);
> +
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=5
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (4 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 05/16] spi: spi-mem: add execute_tuning callback and spi_mem_execute_tuning() Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:48 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 07/16] spi: cadence-quadspi: add DQS support to read data capture Santhosh Kumar K
` (9 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Move cqspi_readdata_capture() function earlier in the file. This is
preparatory refactoring for upcoming PHY tuning support for read and
write operations.
No functional changes.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 45 +++++++++++++++----------------
1 file changed, 22 insertions(+), 23 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index aaba1a3ad577..54fd7b591e06 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -453,6 +453,28 @@ static int cqspi_wait_idle(struct cqspi_st *cqspi)
}
}
+static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
+ const unsigned int delay)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_READCAPTURE);
+
+ if (bypass)
+ reg |= BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
+ else
+ reg &= ~BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
+
+ reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
+ << CQSPI_REG_READCAPTURE_DELAY_LSB);
+
+ reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
+ << CQSPI_REG_READCAPTURE_DELAY_LSB;
+
+ writel(reg, reg_base + CQSPI_REG_READCAPTURE);
+}
+
static int cqspi_exec_flash_cmd(struct cqspi_st *cqspi, unsigned int reg)
{
void __iomem *reg_base = cqspi->iobase;
@@ -1270,29 +1292,6 @@ static void cqspi_config_baudrate_div(struct cqspi_st *cqspi)
writel(reg, reg_base + CQSPI_REG_CONFIG);
}
-static void cqspi_readdata_capture(struct cqspi_st *cqspi,
- const bool bypass,
- const unsigned int delay)
-{
- void __iomem *reg_base = cqspi->iobase;
- unsigned int reg;
-
- reg = readl(reg_base + CQSPI_REG_READCAPTURE);
-
- if (bypass)
- reg |= BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
- else
- reg &= ~BIT(CQSPI_REG_READCAPTURE_BYPASS_LSB);
-
- reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
- << CQSPI_REG_READCAPTURE_DELAY_LSB);
-
- reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
- << CQSPI_REG_READCAPTURE_DELAY_LSB;
-
- writel(reg, reg_base + CQSPI_REG_READCAPTURE);
-}
-
static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
unsigned long sclk)
{
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier
2026-06-18 7:37 ` [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier Santhosh Kumar K
@ 2026-06-18 7:48 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:48 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: robh, devicetree, conor+dt
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Mixed accessor semantics for FIFO I/O causes endianness portability bug and data corruption on big-endian architectures.
--
commit 1ac225c2f73296224ec939152c97968ddfb4feaf
Author: Santhosh Kumar K <s-k6@ti.com>
spi: cadence-quadspi: move cqspi_readdata_capture earlier
This commit moves cqspi_readdata_capture() earlier in the file to prepare
for upcoming PHY tuning support for read and write operations. No functional
changes are introduced.
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 057381e56a7fd..0cc0d648d78db 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
[ ... ]
drivers/spi/spi-cadence-quadspi.c:cqspi_indirect_read_execute() {
...
/* Read 4 byte word chunks then single bytes */
if (bytes_to_read) {
ioread32_rep(ahb_base, rxbuf,
(bytes_to_read / 4));
} else if (!word_remain && mod_bytes) {
unsigned int temp = ioread32(ahb_base);
bytes_to_read = mod_bytes;
memcpy(rxbuf, &temp, min((unsigned int)
(rxbuf_end - rxbuf),
bytes_to_read));
}
...
}
[Severity: High]
This isn't a bug introduced by this patch, but does the remainder read
logic cause endianness issues on big-endian architectures?
The code uses ioread32_rep() for bulk transfers and ioread32() for the
remainder bytes. On big-endian systems, ioread32_rep() preserves byte order
during the stream transfer, while ioread32() performs a CPU-to-device byte
swap.
Could mixing these accessors on the same FIFO cause the remainder bytes
to be corrupted due to unwanted byte-swapping?
drivers/spi/spi-cadence-quadspi.c:cqspi_indirect_write_execute() {
...
/* Write 4 bytes at a time then single bytes. */
if (write_words) {
iowrite32_rep(cqspi->ahb_base, txbuf, write_words);
txbuf += (write_words * 4);
}
if (mod_bytes) {
unsigned int temp = 0xFFFFFFFF;
memcpy(&temp, txbuf, mod_bytes);
iowrite32(temp, cqspi->ahb_base);
txbuf += mod_bytes;
}
...
}
[Severity: High]
This is also a pre-existing issue, but does using iowrite32() for the
remainder bytes after using iowrite32_rep() for the bulk transfer
have the same endianness problem?
Since iowrite32_rep() preserves byte order and iowrite32() performs a
byte swap, it appears the remainder bytes could be corrupted on
big-endian architectures in this function as well.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=6
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 07/16] spi: cadence-quadspi: add DQS support to read data capture
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (5 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 06/16] spi: cadence-quadspi: move cqspi_readdata_capture earlier Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (8 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add DQS (Data Strobe) parameter to cqspi_readdata_capture() to control
data capture timing. DQS mode uses a dedicated strobe signal for
improved timing margins in high-speed SPI modes.
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 54fd7b591e06..201d69c64c49 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -192,6 +192,7 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_READCAPTURE_BYPASS_LSB 0
#define CQSPI_REG_READCAPTURE_DELAY_LSB 1
#define CQSPI_REG_READCAPTURE_DELAY_MASK 0xF
+#define CQSPI_REG_READCAPTURE_DQS_LSB 8
#define CQSPI_REG_SIZE 0x14
#define CQSPI_REG_SIZE_ADDRESS_LSB 0
@@ -454,7 +455,7 @@ static int cqspi_wait_idle(struct cqspi_st *cqspi)
}
static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
- const unsigned int delay)
+ const bool dqs, const unsigned int delay)
{
void __iomem *reg_base = cqspi->iobase;
unsigned int reg;
@@ -472,6 +473,11 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
<< CQSPI_REG_READCAPTURE_DELAY_LSB;
+ if (dqs)
+ reg |= BIT(CQSPI_REG_READCAPTURE_DQS_LSB);
+ else
+ reg &= ~BIT(CQSPI_REG_READCAPTURE_DQS_LSB);
+
writel(reg, reg_base + CQSPI_REG_READCAPTURE);
}
@@ -1313,7 +1319,7 @@ static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
cqspi->sclk = sclk;
cqspi_config_baudrate_div(cqspi);
cqspi_delay(f_pdata);
- cqspi_readdata_capture(cqspi, !cqspi->rclk_en,
+ cqspi_readdata_capture(cqspi, !cqspi->rclk_en, false,
f_pdata->read_delay);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (6 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 07/16] spi: cadence-quadspi: add DQS support to read data capture Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:59 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383) Santhosh Kumar K
` (7 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
The Cadence QSPI controller supports a delay-line PHY for high-speed
operation. Without calibration the PHY is unused and read capture relies
on a fixed delay, limiting throughput at frequencies above the base
operating speed.
Add an execute_tuning callback that performs delay-line calibration using
a known data pattern written to a dedicated flash region. The pattern is
either read from a NOR partition identified by the DT property
spi-phy-pattern-partition, or written to the NAND page cache before
each calibration read.
For DDR protocols (8D-8D-8D) a 2D sweep of (rx_delay, tx_delay) pairs
is performed to find the widest passing region in the combined RX/TX
space. Binary search locates the gap boundary between passing regions
when two separate windows exist; the final operating point is placed at
the centre of the larger region with a small temperature-dependent
offset.
For SDR protocols a 1D sweep of the RX delay is sufficient. Two windows
at adjacent read_delay values are measured; the wider one's midpoint is
selected.
The tuning infrastructure is platform-specific: only am654-based OSPI
controllers populate the execute_tuning hook. All other platform data
entries return -EOPNOTSUPP and are unaffected.
The calibration target is sourced from spi->post_config_max_speed_hz,
populated by the SPI core from the spi-max-post-config-frequency DT
property. Tuning is skipped when post_config_max_speed_hz is zero.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 1776 ++++++++++++++++++++++++++++-
1 file changed, 1763 insertions(+), 13 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 201d69c64c49..72768292a32b 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -65,15 +65,25 @@ enum {
struct cqspi_st;
+struct phy_setting {
+ u8 rx;
+ u8 tx;
+ u8 read_delay;
+};
+
struct cqspi_flash_pdata {
- struct cqspi_st *cqspi;
- u32 clk_rate;
- u32 read_delay;
- u32 tshsl_ns;
- u32 tsd2d_ns;
- u32 tchsh_ns;
- u32 tslch_ns;
- u8 cs;
+ struct cqspi_st *cqspi;
+ u32 read_delay;
+ u32 tshsl_ns;
+ u32 tsd2d_ns;
+ u32 tchsh_ns;
+ u32 tslch_ns;
+ bool use_dqs;
+ bool use_tuned_phy;
+ u8 cs;
+ struct phy_setting phy_setting;
+ struct spi_mem_op phy_read_op;
+ struct spi_mem_op phy_write_op;
};
static const struct clk_bulk_data cqspi_clks[CLK_QSPI_NUM] = {
@@ -129,12 +139,15 @@ struct cqspi_driver_platdata {
int (*indirect_read_dma)(struct cqspi_flash_pdata *f_pdata,
u_char *rxbuf, loff_t from_addr, size_t n_rx);
u32 (*get_dma_status)(struct cqspi_st *cqspi);
+ int (*execute_tuning)(struct spi_mem *mem, struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op);
};
/* Operation timeout value */
#define CQSPI_TIMEOUT_MS 500
#define CQSPI_READ_TIMEOUT_MS 10
#define CQSPI_BUSYWAIT_TIMEOUT_US 500
+#define CQSPI_DLL_TIMEOUT_US 300
/* Runtime_pm autosuspend delay */
#define CQSPI_AUTOSUSPEND_TIMEOUT 2000
@@ -148,12 +161,14 @@ struct cqspi_driver_platdata {
/* Register map */
#define CQSPI_REG_CONFIG 0x00
#define CQSPI_REG_CONFIG_ENABLE_MASK BIT(0)
+#define CQSPI_REG_CONFIG_PHY_EN BIT(3)
#define CQSPI_REG_CONFIG_ENB_DIR_ACC_CTRL BIT(7)
#define CQSPI_REG_CONFIG_DECODE_MASK BIT(9)
#define CQSPI_REG_CONFIG_CHIPSELECT_LSB 10
#define CQSPI_REG_CONFIG_DMA_MASK BIT(15)
#define CQSPI_REG_CONFIG_BAUD_LSB 19
#define CQSPI_REG_CONFIG_DTR_PROTO BIT(24)
+#define CQSPI_REG_CONFIG_PHY_PIPELINE BIT(25)
#define CQSPI_REG_CONFIG_DUAL_OPCODE BIT(30)
#define CQSPI_REG_CONFIG_IDLE_LSB 31
#define CQSPI_REG_CONFIG_CHIPSELECT_MASK 0xF
@@ -192,6 +207,7 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_READCAPTURE_BYPASS_LSB 0
#define CQSPI_REG_READCAPTURE_DELAY_LSB 1
#define CQSPI_REG_READCAPTURE_DELAY_MASK 0xF
+#define CQSPI_REG_READCAPTURE_EDGE_LSB 5
#define CQSPI_REG_READCAPTURE_DQS_LSB 8
#define CQSPI_REG_SIZE 0x14
@@ -273,6 +289,27 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_POLLING_STATUS 0xB0
#define CQSPI_REG_POLLING_STATUS_DUMMY_LSB 16
+#define CQSPI_REG_PHY_CONFIG 0xB4
+#define CQSPI_REG_PHY_CONFIG_RX_DEL_LSB 0
+#define CQSPI_REG_PHY_CONFIG_RX_DEL_MASK 0x7F
+#define CQSPI_REG_PHY_CONFIG_TX_DEL_LSB 16
+#define CQSPI_REG_PHY_CONFIG_TX_DEL_MASK 0x7F
+#define CQSPI_REG_PHY_CONFIG_DLL_RESET BIT(30)
+#define CQSPI_REG_PHY_CONFIG_RESYNC BIT(31)
+
+#define CQSPI_REG_PHY_DLL_MASTER 0xB8
+#define CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_LSB 0
+#define CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_VAL 16
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LEN 0x7
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB 20
+#define CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_3 0x2
+#define CQSPI_REG_PHY_DLL_MASTER_BYPASS BIT(23)
+#define CQSPI_REG_PHY_DLL_MASTER_CYCLE BIT(24)
+
+#define CQSPI_REG_DLL_OBS_LOW 0xBC
+#define CQSPI_REG_DLL_OBS_LOW_DLL_LOCK BIT(0)
+#define CQSPI_REG_DLL_OBS_LOW_LOOPBACK_LOCK BIT(15)
+
#define CQSPI_REG_OP_EXT_LOWER 0xE0
#define CQSPI_REG_OP_EXT_READ_LSB 24
#define CQSPI_REG_OP_EXT_WRITE_LSB 16
@@ -321,6 +358,50 @@ struct cqspi_driver_platdata {
#define CQSPI_REG_VERSAL_DMA_VAL 0x602
+#define CQSPI_PHY_INIT_RD 1
+#define CQSPI_PHY_MAX_RD 4
+#define CQSPI_PHY_MAX_DELAY 127
+#define CQSPI_PHY_DDR_SEARCH_STEP 4
+#define CQSPI_PHY_TX_LOOKUP_LOW_START 28
+#define CQSPI_PHY_TX_LOOKUP_LOW_END 48
+#define CQSPI_PHY_TX_LOOKUP_HIGH_START 60
+#define CQSPI_PHY_TX_LOOKUP_HIGH_END 96
+#define CQSPI_PHY_RX_LOW_SEARCH_START 0
+#define CQSPI_PHY_RX_LOW_SEARCH_END 40
+#define CQSPI_PHY_RX_HIGH_SEARCH_START 24
+#define CQSPI_PHY_RX_HIGH_SEARCH_END 127
+#define CQSPI_PHY_TX_LOW_SEARCH_START 0
+#define CQSPI_PHY_TX_LOW_SEARCH_END 64
+#define CQSPI_PHY_TX_HIGH_SEARCH_START 78
+#define CQSPI_PHY_TX_HIGH_SEARCH_END 127
+#define CQSPI_PHY_SEARCH_OFFSET 8
+
+#define CQSPI_PHY_DEFAULT_TEMP 45
+#define CQSPI_PHY_MIN_TEMP -45
+#define CQSPI_PHY_MAX_TEMP 130
+#define CQSPI_PHY_MID_TEMP (CQSPI_PHY_MIN_TEMP + \
+ ((CQSPI_PHY_MAX_TEMP - \
+ CQSPI_PHY_MIN_TEMP) / 2))
+
+/*
+ * PHY tuning pattern for calibrating read data capture delay. This 128-byte
+ * pattern provides sufficient bit transitions across all byte lanes to
+ * reliably detect timing windows at high frequencies.
+ */
+static const u8 phy_tuning_pattern[] __aligned(64) = {
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0x00, 0x00, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0x00, 0x00, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0xFE, 0x00, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0xFE, 0x00, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0xFE, 0x00, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0x00, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0xFE, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0x00, 0xFE, 0xFE, 0xFE, 0x01, 0xFF, 0xFF, 0xFF, 0xFF,
+ 0xFF, 0x00, 0xFE, 0xFE, 0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0xFE, 0xFE,
+ 0xFE, 0xFF, 0x01, 0x01, 0x01, 0x01, 0x01, 0xFE, 0xFE, 0xFE, 0xFE, 0x01,
+ 0x01, 0x01, 0x01, 0xFE, 0xFE, 0xFE, 0xFE, 0x01,
+};
+
static int cqspi_wait_for_bit(const struct cqspi_driver_platdata *ddata,
void __iomem *reg, const u32 mask, bool clr,
bool busywait)
@@ -1555,6 +1636,1678 @@ static bool cqspi_supports_mem_op(struct spi_mem *mem,
return spi_mem_default_supports_op(mem, op);
}
+static int cqspi_write_pattern_to_cache(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct spi_mem_op *write_op)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ write_op->data.nbytes = sizeof(phy_tuning_pattern);
+ write_op->data.buf.out = phy_tuning_pattern;
+
+ ret = spi_mem_exec_op(mem, write_op);
+ if (ret) {
+ dev_err(dev, "Failed to write PHY pattern to cache: %d\n", ret);
+ return ret;
+ }
+ dev_dbg(dev, "PHY pattern (%zu bytes) written to cache\n",
+ sizeof(phy_tuning_pattern));
+
+ return 0;
+}
+
+static int cqspi_get_phy_pattern_offset(struct device *dev, u32 *offset)
+{
+ struct device_node *np, *flash_np = NULL, *part_np;
+ const __be32 *reg;
+ int len;
+
+ if (!dev || !dev->of_node)
+ return -EINVAL;
+
+ for_each_child_of_node(dev->of_node, np) {
+ if (of_node_name_prefix(np, "flash")) {
+ flash_np = np;
+ break;
+ }
+ }
+
+ if (!flash_np)
+ return -ENODEV;
+
+ part_np = of_parse_phandle(flash_np, "spi-phy-pattern-partition", 0);
+ of_node_put(flash_np);
+ if (!part_np)
+ return -ENODEV;
+
+ reg = of_get_property(part_np, "reg", &len);
+ if (reg && len >= sizeof(__be32)) {
+ *offset = be32_to_cpu(reg[0]);
+ of_node_put(part_np);
+ return 0;
+ }
+
+ of_node_put(part_np);
+ return -ENOENT;
+}
+
+static int cqspi_phy_check_pattern(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct spi_mem_op op;
+ u8 *read_data;
+ int ret;
+
+ read_data = kmalloc_array(ARRAY_SIZE(phy_tuning_pattern),
+ sizeof(phy_tuning_pattern[0]), GFP_KERNEL);
+ if (!read_data)
+ return -ENOMEM;
+
+ op = f_pdata->phy_read_op;
+ op.data.buf.in = read_data;
+ op.data.nbytes = sizeof(phy_tuning_pattern);
+
+ ret = spi_mem_exec_op(mem, &op);
+ if (ret)
+ goto out;
+
+ if (memcmp(read_data, phy_tuning_pattern, sizeof(phy_tuning_pattern)))
+ ret = -EAGAIN;
+
+out:
+ kfree(read_data);
+ return ret;
+}
+
+static void cqspi_phy_set_dll_master(struct cqspi_st *cqspi)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_DLL_MASTER);
+ reg &= ~((CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LEN
+ << CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB) |
+ CQSPI_REG_PHY_DLL_MASTER_BYPASS |
+ CQSPI_REG_PHY_DLL_MASTER_CYCLE);
+ reg |= ((CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_3
+ << CQSPI_REG_PHY_DLL_MASTER_DLY_ELMTS_LSB) |
+ CQSPI_REG_PHY_DLL_MASTER_CYCLE);
+
+ writel(reg, reg_base + CQSPI_REG_PHY_DLL_MASTER);
+}
+
+static void cqspi_phy_pre_config(struct cqspi_st *cqspi,
+ struct cqspi_flash_pdata *f_pdata,
+ const bool bypass)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ u8 dummy;
+
+ cqspi_readdata_capture(cqspi, bypass, f_pdata->use_dqs,
+ f_pdata->phy_setting.read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE);
+ reg |= CQSPI_REG_CONFIG_PHY_EN;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy--;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+
+ cqspi_phy_set_dll_master(cqspi);
+}
+
+static void cqspi_phy_post_config(struct cqspi_st *cqspi,
+ const unsigned int delay)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ u8 dummy;
+
+ reg = readl(reg_base + CQSPI_REG_READCAPTURE);
+ reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
+ << CQSPI_REG_READCAPTURE_DELAY_LSB);
+
+ reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
+ << CQSPI_REG_READCAPTURE_DELAY_LSB;
+ writel(reg, reg_base + CQSPI_REG_READCAPTURE);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE);
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy++;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+}
+
+static void cqspi_set_dll(void __iomem *reg_base, u8 rx_dll, u8 tx_dll)
+{
+ unsigned int reg;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg &= ~((CQSPI_REG_PHY_CONFIG_RX_DEL_MASK
+ << CQSPI_REG_PHY_CONFIG_RX_DEL_LSB) |
+ (CQSPI_REG_PHY_CONFIG_TX_DEL_MASK
+ << CQSPI_REG_PHY_CONFIG_TX_DEL_LSB));
+ reg |= ((rx_dll & CQSPI_REG_PHY_CONFIG_RX_DEL_MASK)
+ << CQSPI_REG_PHY_CONFIG_RX_DEL_LSB) |
+ ((tx_dll & CQSPI_REG_PHY_CONFIG_TX_DEL_MASK)
+ << CQSPI_REG_PHY_CONFIG_TX_DEL_LSB) |
+ CQSPI_REG_PHY_CONFIG_RESYNC;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+}
+
+static int cqspi_resync_dll(struct cqspi_st *cqspi)
+{
+ void __iomem *reg_base = cqspi->iobase;
+ unsigned int reg;
+ int ret;
+
+ ret = cqspi_wait_idle(cqspi);
+ if (ret)
+ return ret;
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~CQSPI_REG_CONFIG_ENABLE_MASK;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg &= ~(CQSPI_REG_PHY_CONFIG_DLL_RESET | CQSPI_REG_PHY_CONFIG_RESYNC);
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_DLL_MASTER);
+ reg |= (CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_VAL
+ << CQSPI_REG_PHY_DLL_MASTER_INIT_DELAY_LSB);
+ writel(reg, reg_base + CQSPI_REG_PHY_DLL_MASTER);
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg |= CQSPI_REG_PHY_CONFIG_DLL_RESET;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+ ret = readl_poll_timeout(reg_base + CQSPI_REG_DLL_OBS_LOW, reg,
+ (reg & CQSPI_REG_DLL_OBS_LOW_DLL_LOCK), 0,
+ CQSPI_DLL_TIMEOUT_US);
+ if (ret)
+ goto re_enable;
+
+ ret = readl_poll_timeout(reg_base + CQSPI_REG_DLL_OBS_LOW, reg,
+ (reg & CQSPI_REG_DLL_OBS_LOW_LOOPBACK_LOCK), 0,
+ CQSPI_DLL_TIMEOUT_US);
+ if (ret)
+ goto re_enable;
+
+ reg = readl(reg_base + CQSPI_REG_PHY_CONFIG);
+ reg |= CQSPI_REG_PHY_CONFIG_RESYNC;
+ writel(reg, reg_base + CQSPI_REG_PHY_CONFIG);
+
+re_enable:
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg |= CQSPI_REG_CONFIG_ENABLE_MASK;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ return ret;
+}
+
+static int cqspi_phy_apply_setting(struct cqspi_flash_pdata *f_pdata,
+ struct phy_setting *phy)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ unsigned int reg;
+ int ret;
+
+ reg = readl(cqspi->iobase + CQSPI_REG_READCAPTURE);
+ reg |= BIT(CQSPI_REG_READCAPTURE_EDGE_LSB);
+ writel(reg, cqspi->iobase + CQSPI_REG_READCAPTURE);
+
+ cqspi_set_dll(cqspi->iobase, phy->rx, phy->tx);
+
+ ret = cqspi_resync_dll(cqspi);
+ if (ret)
+ return ret;
+
+ f_pdata->phy_setting.read_delay = phy->read_delay;
+ return 0;
+}
+
+static int cqspi_find_rx_low_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->rx = CQSPI_PHY_RX_LOW_SEARCH_START;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->rx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->rx <= CQSPI_PHY_RX_LOW_SEARCH_END);
+
+ phy->read_delay++;
+ } while (phy->read_delay <= CQSPI_PHY_MAX_RD);
+
+ dev_dbg(dev, "Unable to find RX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_low_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ phy->rx = 0;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+ phy->rx++;
+ } while (phy->rx <= CQSPI_PHY_MAX_DELAY);
+
+ dev_dbg(dev, "Unable to find RX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_high_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->rx = CQSPI_PHY_RX_HIGH_SEARCH_END;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->rx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->rx >= CQSPI_PHY_RX_HIGH_SEARCH_START);
+
+ phy->read_delay--;
+ } while (phy->read_delay >= CQSPI_PHY_INIT_RD);
+
+ dev_dbg(dev, "Unable to find RX high\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_rx_high_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy,
+ u8 lowerbound)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ phy->rx = CQSPI_PHY_MAX_DELAY;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+ phy->rx--;
+ } while (phy->rx > lowerbound);
+
+ dev_dbg(dev, "Unable to find RX high\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_tx_low_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->tx = CQSPI_PHY_TX_LOW_SEARCH_START;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->tx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->tx <= CQSPI_PHY_TX_LOW_SEARCH_END);
+
+ phy->read_delay++;
+ } while (phy->read_delay <= CQSPI_PHY_MAX_RD);
+
+ dev_dbg(dev, "Unable to find TX low\n");
+ return -ENOENT;
+}
+
+static int cqspi_find_tx_high_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem, struct phy_setting *phy)
+{
+ struct device *dev = &f_pdata->cqspi->pdev->dev;
+ int ret;
+
+ do {
+ phy->tx = CQSPI_PHY_TX_HIGH_SEARCH_END;
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, phy);
+ if (!ret) {
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (!ret)
+ return 0;
+ }
+
+ phy->tx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (phy->tx >= CQSPI_PHY_TX_HIGH_SEARCH_START);
+
+ phy->read_delay--;
+ } while (phy->read_delay >= CQSPI_PHY_INIT_RD);
+
+ dev_dbg(dev, "Unable to find TX high\n");
+ return -ENOENT;
+}
+
+static void cqspi_phy_find_gaplow_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct phy_setting *bottomleft,
+ struct phy_setting *topright,
+ struct phy_setting *gaplow)
+{
+ struct phy_setting left, right, mid;
+ int ret;
+
+ left = *bottomleft;
+ right = *topright;
+
+ mid.tx = left.tx + ((right.tx - left.tx) / 2);
+ mid.rx = left.rx + ((right.rx - left.rx) / 2);
+ mid.read_delay = left.read_delay;
+
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, &mid);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ /* The pattern was not found. Go to the lower half. */
+ right.tx = mid.tx;
+ right.rx = mid.rx;
+
+ mid.tx = left.tx + ((mid.tx - left.tx) / 2);
+ mid.rx = left.rx + ((mid.rx - left.rx) / 2);
+ } else {
+ /* The pattern was found. Go to the upper half. */
+ left.tx = mid.tx;
+ left.rx = mid.rx;
+
+ mid.tx = mid.tx + ((right.tx - mid.tx) / 2);
+ mid.rx = mid.rx + ((right.rx - mid.rx) / 2);
+ }
+
+ /* Break the loop if the window has closed. */
+ } while ((right.tx - left.tx >= 2) && (right.rx - left.rx >= 2));
+
+ *gaplow = mid;
+}
+
+static void cqspi_phy_find_gaphigh_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem,
+ struct phy_setting *bottomleft,
+ struct phy_setting *topright,
+ struct phy_setting *gaphigh)
+{
+ struct phy_setting left, right, mid;
+ int ret;
+
+ left = *bottomleft;
+ right = *topright;
+
+ mid.tx = left.tx + ((right.tx - left.tx) / 2);
+ mid.rx = left.rx + ((right.rx - left.rx) / 2);
+ mid.read_delay = right.read_delay;
+
+ do {
+ ret = cqspi_phy_apply_setting(f_pdata, &mid);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ /* The pattern was not found. Go to the upper half. */
+ left.tx = mid.tx;
+ left.rx = mid.rx;
+
+ mid.tx = mid.tx + ((right.tx - mid.tx) / 2);
+ mid.rx = mid.rx + ((right.rx - mid.rx) / 2);
+ } else {
+ /* The pattern was found. Go to the lower half. */
+ right.tx = mid.tx;
+ right.rx = mid.rx;
+
+ mid.tx = left.tx + ((mid.tx - left.tx) / 2);
+ mid.rx = left.rx + ((mid.rx - left.rx) / 2);
+ }
+
+ /* Break the loop if the window has closed. */
+ } while ((right.tx - left.tx >= 2) && (right.rx - left.rx >= 2));
+
+ *gaphigh = mid;
+}
+
+static int cqspi_get_temp(int *temp)
+{
+ /* TODO: read SoC thermal sensor; caller falls back to room temperature */
+ return -EOPNOTSUPP;
+}
+
+static inline void cqspi_phy_reset_setting(struct phy_setting *phy)
+{
+ *phy = (struct phy_setting){ .rx = 0, .tx = 127, .read_delay = 0 };
+}
+
+static int cqspi_phy_tuning_ddr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ struct device *dev = &cqspi->pdev->dev;
+ struct phy_setting rxlow, rxhigh, txlow, txhigh;
+ struct phy_setting srxlow, srxhigh;
+ struct phy_setting bottomleft, topright, searchpoint;
+ struct phy_setting gaplow, gaphigh;
+ struct phy_setting backuppoint, backupcornerpoint;
+ int ret, rx_window, temp;
+ bool primary = true, secondary = true;
+
+ /*
+ * DDR tuning: 2D search across RX and TX delays for optimal timing.
+ *
+ * Algorithm: Find RX boundaries (rxlow/rxhigh) using TX window search,
+ * find TX boundaries (txlow/txhigh) at fixed RX, define valid region,
+ * locate gaps via binary search, select final point with temperature
+ * compensation.
+ *
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * | *
+ * | bottomleft
+ * -----------------------------------------> tx
+ * 0 127
+ */
+
+ f_pdata->use_tuned_phy = true;
+
+ /* Golden rxlow search: Find lower RX boundary using TX window sweep */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | |xxxxx|xxxxx +++++++++++++
+ * | |xxxxx|xxxxxx ++++++++++++
+ * search | |xxxxx|xxxxxxx +++++++++++
+ * rxlow --------->|xxxxx|xxxxxxxx ++++++++++
+ * | |xxxxx|xxxxxxxxx +++++++++
+ * | |xxxxx|xxxxxxxxxx ++++++++
+ * | |xxxxx|xxxxxxxxxxx +++++++
+ * | | |
+ * --------|-----|----------------------------> tx
+ * 0 | | 127
+ * txlow txlow
+ * start end
+ *
+ * |----------------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Pass | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Pass | rx = min(primary.rx, secondary.rx) |
+ * | | | tx = primary.tx |
+ * | | | read_delay = |
+ * | | | min(primary.read_delay, |
+ * | | | secondary.read_delay) |
+ * |----------------------------------------------------------|
+ */
+
+ /* Primary rxlow: Sweep TX window to find valid RX lower bound */
+
+ rxlow.tx = CQSPI_PHY_TX_LOOKUP_LOW_START;
+ do {
+ dev_dbg(dev, "Searching for Golden Primary rxlow on TX = %d\n",
+ rxlow.tx);
+ rxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &rxlow);
+ rxlow.tx += CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (ret && rxlow.tx <= CQSPI_PHY_TX_LOOKUP_LOW_END);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden Primary rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Secondary rxlow: Verify at offset TX for robustness */
+
+ if (rxlow.tx <= (CQSPI_PHY_TX_LOOKUP_LOW_END - CQSPI_PHY_SEARCH_OFFSET))
+ srxlow.tx = rxlow.tx + CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxlow.tx = CQSPI_PHY_TX_LOOKUP_LOW_END;
+ dev_dbg(dev, "Searching for Golden Secondary rxlow on TX = %d\n",
+ srxlow.tx);
+ srxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &srxlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden Secondary rxlow: RX: %d TX: %d RD: %d\n",
+ srxlow.rx, srxlow.tx, srxlow.read_delay);
+
+ rxlow.rx = min(rxlow.rx, srxlow.rx);
+ rxlow.read_delay = min(rxlow.read_delay, srxlow.read_delay);
+ dev_dbg(dev, "Golden Final rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Golden rxhigh search: Find upper RX boundary at fixed TX */
+
+ /*
+ * rx
+ * 127 ^
+ * | |xxxx ++++++++++++++++++++
+ * | |xxxxx +++++++++++++++++++
+ * search | |xxxxxx ++++++++++++++++++
+ * rxhigh --------->|xxxxxxx +++++++++++++++++
+ * on fixed | |xxxxxxxx ++++++++++++++++
+ * tx | |xxxxxxxxx +++++++++++++++
+ * | |xxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * |
+ * -------------------------------------------> tx
+ * 0 127
+ *
+ * |----------------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|------------------------------------|
+ * | Fail | Pass | Choose Secondary |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Fail | Choose Primary |
+ * |---------|-----------|------------------------------------|
+ * | Pass | Pass | if (secondary.rx > primary.rx) |
+ * | | | Choose Secondary |
+ * | | | else |
+ * | | | Choose Primary |
+ * |----------------------------------------------------------|
+ */
+
+ /* Primary rxhigh: Search at rxlow's TX, decrement from max read_delay */
+
+ rxhigh.tx = rxlow.tx;
+ dev_dbg(dev, "Searching for Golden Primary rxhigh on TX = %d\n",
+ rxhigh.tx);
+ rxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &rxhigh);
+ if (ret)
+ primary = false;
+ dev_dbg(dev, "Golden Primary rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+
+ /* Secondary rxhigh: Verify at offset TX */
+
+ if (rxhigh.tx <=
+ (CQSPI_PHY_TX_LOOKUP_LOW_END - CQSPI_PHY_SEARCH_OFFSET))
+ srxhigh.tx = rxhigh.tx + CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxhigh.tx = CQSPI_PHY_TX_LOOKUP_LOW_END;
+ dev_dbg(dev, "Searching for Golden Secondary rxhigh on TX = %d\n",
+ srxhigh.tx);
+ srxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &srxhigh);
+ if (ret)
+ secondary = false;
+ dev_dbg(dev, "Golden Secondary rxhigh: RX: %d TX: %d RD: %d\n",
+ srxhigh.rx, srxhigh.tx, srxhigh.read_delay);
+
+ if (!primary && !secondary)
+ goto out;
+ else if (!primary)
+ rxhigh = srxhigh;
+ else if (secondary && srxhigh.rx > rxhigh.rx)
+ rxhigh = srxhigh;
+ dev_dbg(dev, "Golden Final rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+
+ primary = true;
+ secondary = true;
+
+ /* If rxlow/rxhigh at same read_delay, search backup at upper TX range */
+
+ if (rxlow.read_delay == rxhigh.read_delay) {
+ dev_dbg(dev, "rxlow and rxhigh at the same read delay.\n");
+
+ /* Backup rxlow: Search at high TX window */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++|++++|
+ * | xxxxxxxxxxxxx ++++++|++++|
+ * search | xxxxxxxxxxxxxx +++++|++++|
+ * rxlow --------------------------------->|++++|
+ * | xxxxxxxxxxxxxxxx +++|++++|
+ * | xxxxxxxxxxxxxxxxx ++|++++|
+ * | xxxxxxxxxxxxxxxxxx +|++++|
+ * | | |
+ * --------------------------------|----|-----> tx
+ * 0 | | 127
+ * txhigh txhigh
+ * start end
+ *
+ * |-----------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Pass | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Fail | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Pass | rx = |
+ * | | | min(primary.rx, secondary.rx)|
+ * | | | tx = primary.tx |
+ * | | | read_delay = |
+ * | | | min(primary.read_delay, |
+ * | | | secondary.read_delay) |
+ * |-----------------------------------------------------|
+ */
+
+ /* Primary backup: Decrement TX from high window end */
+
+ backuppoint.tx = CQSPI_PHY_TX_LOOKUP_HIGH_END;
+ do {
+ dev_dbg(dev,
+ "Searching for Backup Primary rxlow on TX = %d\n",
+ backuppoint.tx);
+ backuppoint.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &backuppoint);
+ backuppoint.tx -= CQSPI_PHY_DDR_SEARCH_STEP;
+ } while (ret &&
+ backuppoint.tx >= CQSPI_PHY_TX_LOOKUP_HIGH_START);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup Primary rxlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ /* Secondary backup: Verify at offset TX */
+
+ if (backuppoint.tx >=
+ (CQSPI_PHY_TX_LOOKUP_HIGH_START + CQSPI_PHY_SEARCH_OFFSET))
+ srxlow.tx = backuppoint.tx - CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxlow.tx = CQSPI_PHY_TX_LOOKUP_HIGH_START;
+ dev_dbg(dev,
+ "Searching for Backup Secondary rxlow on TX = %d\n",
+ srxlow.tx);
+ srxlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_rx_low_ddr(f_pdata, mem, &srxlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup Secondary rxlow: RX: %d TX: %d RD: %d\n",
+ srxlow.rx, srxlow.tx, srxlow.read_delay);
+
+ backuppoint.rx = min(backuppoint.rx, srxlow.rx);
+ backuppoint.read_delay =
+ min(backuppoint.read_delay, srxlow.read_delay);
+ dev_dbg(dev, "Backup Final rxlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.rx < rxlow.rx) {
+ rxlow = backuppoint;
+ dev_dbg(dev, "Updating rxlow to the one at TX = %d\n",
+ backuppoint.tx);
+ }
+ dev_dbg(dev, "Final rxlow: RX: %d TX: %d RD: %d\n", rxlow.rx,
+ rxlow.tx, rxlow.read_delay);
+
+ /* Backup rxhigh: Search at fixed backup TX */
+
+ /*
+ * rx
+ * 127 ^
+ * | xxxxx +++++++++++++++++++|
+ * | xxxxxx ++++++++++++++++++|
+ * search | xxxxxxx +++++++++++++++++|
+ * rxhigh -------------------------------------->|
+ * on fixed | xxxxxxxxx +++++++++++++++|
+ * tx | xxxxxxxxxx ++++++++++++++|
+ * | xxxxxxxxxxx +++++++++++++|
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * | xxxxxxxxxxxxxxxxxx +++++++
+ * |
+ * -------------------------------------------> tx
+ * 0 127
+ *
+ * |-----------------------------------------------------|
+ * | Primary | Secondary | Final |
+ * | Search | Search | Point |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Fail | Return Fail |
+ * |---------|-----------|-------------------------------|
+ * | Fail | Pass | Choose Secondary |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Fail | Choose Primary |
+ * |---------|-----------|-------------------------------|
+ * | Pass | Pass | if (secondary.rx > primary.rx)|
+ * | | | Choose Secondary |
+ * | | | else |
+ * | | | Choose Primary |
+ * |-----------------------------------------------------|
+ */
+
+ /* Primary backup rxhigh: Use backup TX, decrement from max read_delay */
+
+ dev_dbg(dev, "Searching for Backup Primary rxhigh on TX = %d\n",
+ backuppoint.tx);
+ backuppoint.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ primary = false;
+ dev_dbg(dev, "Backup Primary rxhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ /* Secondary backup rxhigh: Verify at offset TX */
+
+ if (backuppoint.tx >=
+ (CQSPI_PHY_TX_LOOKUP_HIGH_START + CQSPI_PHY_SEARCH_OFFSET))
+ srxhigh.tx = backuppoint.tx - CQSPI_PHY_SEARCH_OFFSET;
+ else
+ srxhigh.tx = CQSPI_PHY_TX_LOOKUP_HIGH_START;
+ dev_dbg(dev,
+ "Searching for Backup Secondary rxhigh on TX = %d\n",
+ srxhigh.tx);
+ srxhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_rx_high_ddr(f_pdata, mem, &srxhigh);
+ if (ret)
+ secondary = false;
+ dev_dbg(dev, "Backup Secondary rxhigh: RX: %d TX: %d RD: %d\n",
+ srxhigh.rx, srxhigh.tx, srxhigh.read_delay);
+
+ if (!primary && !secondary)
+ goto out;
+ else if (!primary)
+ backuppoint = srxhigh;
+ else if (secondary && srxhigh.rx > backuppoint.rx)
+ backuppoint = srxhigh;
+ dev_dbg(dev, "Backup Final rxhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.rx > rxhigh.rx) {
+ rxhigh = backuppoint;
+ dev_dbg(dev, "Updating rxhigh to the one at TX = %d\n",
+ backuppoint.tx);
+ }
+ dev_dbg(dev, "Final rxhigh: RX: %d TX: %d RD: %d\n", rxhigh.rx,
+ rxhigh.tx, rxhigh.read_delay);
+ }
+
+ /* Golden txlow: Fix RX at 1/4 of RX window, search TX lower bound */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * fix rx | xxxxxxxxxxxxx ++++++++++++
+ * 1/4 b/w ---------><------->xxxxx +++++++++++
+ * rxlow and | xxxx|xxxxxxxxxx ++++++++++
+ * rxhigh | xxxx|xxxxxxxxxxx +++++++++
+ * | xxxx|xxxxxxxxxxxx ++++++++
+ * rxlow --------->xxxx|xxxxxxxxxxxxx +++++++
+ * | |
+ * ------------|------------------------------> tx
+ * 0 | 127
+ * search
+ * txlow
+ */
+
+ rx_window = rxhigh.rx - rxlow.rx;
+ txlow.rx = rxlow.rx + (rx_window / 4);
+ dev_dbg(dev, "Searching for Golden txlow on RX = %d\n", txlow.rx);
+ txlow.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_tx_low_ddr(f_pdata, mem, &txlow);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden txlow: RX: %d TX: %d RD: %d\n", txlow.rx, txlow.tx,
+ txlow.read_delay);
+
+ /* Golden txhigh: Same RX as txlow, decrement from max read_delay */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * fix rx | xxxxxxxxxxxxx ++++++++++++
+ * 1/4 b/w --------------------------------><----->
+ * rxlow and | xxxxxxxxxxxxxxx ++++++|+++
+ * rxhigh | xxxxxxxxxxxxxxxx +++++|+++
+ * | xxxxxxxxxxxxxxxxx ++++|+++
+ * rxlow --------->xxxxxxxxxxxxxxxxxx +++|+++
+ * | |
+ * ----------------------------------|--------> tx
+ * 0 | 127
+ * search
+ * txhigh
+ */
+
+ txhigh.rx = txlow.rx;
+ dev_dbg(dev, "Searching for Golden txhigh on RX = %d\n", txhigh.rx);
+ txhigh.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_tx_high_ddr(f_pdata, mem, &txhigh);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Golden txhigh: RX: %d TX: %d RD: %d\n", txhigh.rx,
+ txhigh.tx, txhigh.read_delay);
+
+ /* If txlow/txhigh at same read_delay, search backup at 3/4 RX window */
+
+ if (txlow.read_delay == txhigh.read_delay) {
+ /* Backup txlow: Fix RX at 3/4 of RX window */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * fix rx | xxxxxxx ++++++++++++++++++
+ * 3/4 b/w ---------><----->x +++++++++++++++++
+ * rxlow and | xxxx|xxxx ++++++++++++++++
+ * rxhigh | xxxx|xxxxx +++++++++++++++
+ * | xxxx|xxxxxx ++++++++++++++
+ * | xxxx|xxxxxxx +++++++++++++
+ * | xxxx|xxxxxxxx ++++++++++++
+ * | xxxx|xxxxxxxxx +++++++++++
+ * | xxxx|xxxxxxxxxx ++++++++++
+ * | xxxx|xxxxxxxxxxx +++++++++
+ * | xxxx|xxxxxxxxxxxx ++++++++
+ * rxlow --------->xxxx|xxxxxxxxxxxxx +++++++
+ * | |
+ * ------------|------------------------------> tx
+ * 0 | 127
+ * search
+ * txlow
+ */
+
+ dev_dbg(dev, "txlow and txhigh at the same read delay.\n");
+ backuppoint.rx = rxlow.rx + ((rx_window * 3) / 4);
+ dev_dbg(dev, "Searching for Backup txlow on RX = %d\n",
+ backuppoint.rx);
+ backuppoint.read_delay = CQSPI_PHY_INIT_RD;
+ ret = cqspi_find_tx_low_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup txlow: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.tx < txlow.tx) {
+ txlow = backuppoint;
+ dev_dbg(dev, "Updating txlow with the one at RX = %d\n",
+ backuppoint.rx);
+ }
+ dev_dbg(dev, "Final txlow: RX: %d TX: %d RD: %d\n", txlow.rx,
+ txlow.tx, txlow.read_delay);
+
+ /* Backup txhigh: Same RX as backup txlow, decrement from max */
+
+ /*
+ * rx
+ * 127 ^
+ * |
+ * rxhigh --------->xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * fix rx | xxxxxxx ++++++++++++++++++
+ * 3/4 b/w ------------------------------><------->
+ * rxlow and | xxxxxxxxx +++++++++++|++++
+ * rxhigh | xxxxxxxxxx ++++++++++|++++
+ * | xxxxxxxxxxx +++++++++|++++
+ * | xxxxxxxxxxxx ++++++++|++++
+ * | xxxxxxxxxxxxx +++++++|++++
+ * | xxxxxxxxxxxxxx ++++++|++++
+ * | xxxxxxxxxxxxxxx +++++|++++
+ * | xxxxxxxxxxxxxxxx ++++|++++
+ * | xxxxxxxxxxxxxxxxx +++|++++
+ * rxlow --------->xxxxxxxxxxxxxxxxxx ++|++++
+ * | |
+ * ---------------------------------|---------> tx
+ * 0 | 127
+ * search
+ * txhigh
+ */
+
+ dev_dbg(dev, "Searching for Backup txhigh on RX = %d\n",
+ backuppoint.rx);
+ backuppoint.read_delay = CQSPI_PHY_MAX_RD;
+ ret = cqspi_find_tx_high_ddr(f_pdata, mem, &backuppoint);
+ if (ret)
+ goto out;
+ dev_dbg(dev, "Backup txhigh: RX: %d TX: %d RD: %d\n",
+ backuppoint.rx, backuppoint.tx, backuppoint.read_delay);
+
+ if (backuppoint.tx > txhigh.tx) {
+ txhigh = backuppoint;
+ dev_dbg(dev,
+ "Updating txhigh with the one at RX = %d\n",
+ backuppoint.rx);
+ }
+ dev_dbg(dev, "Final txhigh: RX: %d TX: %d RD: %d\n", txhigh.rx,
+ txhigh.tx, txhigh.read_delay);
+ }
+
+ /* Corner points: Define and verify bottomleft and topright boundaries */
+
+ /*
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * rxhigh -----------xxxxx ++++++++++++++++++++
+ * | xxxxxx +++++++++++++++++++
+ * | xxxxxxx ++++++++++++++++++
+ * | xxxxxxxx +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * rxlow -----------xxxxxxxxxxxxxxxxxx +++++++
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ *
+ * Verification: Test point 4 taps inside each corner, adjust
+ * read_delay ±1 if needed to ensure valid corners for gap search.
+ */
+
+ bottomleft.tx = txlow.tx;
+ bottomleft.rx = rxlow.rx;
+ if (txlow.read_delay <= rxlow.read_delay)
+ bottomleft.read_delay = txlow.read_delay;
+ else
+ bottomleft.read_delay = rxlow.read_delay;
+
+ /* Verify bottomleft: Test 4 taps inside, adjust read_delay if needed */
+ backupcornerpoint = bottomleft;
+ backupcornerpoint.tx += 4;
+ backupcornerpoint.rx += 4;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ backupcornerpoint.read_delay--;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ }
+
+ if (ret)
+ goto out;
+
+ bottomleft.read_delay = backupcornerpoint.read_delay;
+
+ topright.tx = txhigh.tx;
+ topright.rx = rxhigh.rx;
+ if (txhigh.read_delay >= rxhigh.read_delay)
+ topright.read_delay = txhigh.read_delay;
+ else
+ topright.read_delay = rxhigh.read_delay;
+
+ /* Verify topright: Test 4 taps inside, adjust read_delay if needed */
+ backupcornerpoint = topright;
+ backupcornerpoint.tx -= 4;
+ backupcornerpoint.rx -= 4;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ backupcornerpoint.read_delay++;
+ ret = cqspi_phy_apply_setting(f_pdata, &backupcornerpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ }
+
+ if (ret)
+ goto out;
+
+ topright.read_delay = backupcornerpoint.read_delay;
+
+ dev_dbg(dev, "topright: RX: %d TX: %d RD: %d\n", topright.rx,
+ topright.tx, topright.read_delay);
+ dev_dbg(dev, "bottomleft: RX: %d TX: %d RD: %d\n", bottomleft.rx,
+ bottomleft.tx, bottomleft.read_delay);
+ cqspi_phy_find_gaplow_ddr(f_pdata, mem, &bottomleft, &topright, &gaplow);
+ dev_dbg(dev, "gaplow: RX: %d TX: %d RD: %d\n", gaplow.rx, gaplow.tx,
+ gaplow.read_delay);
+
+ /* Final point selection: Handle single vs dual passing regions */
+
+ if (bottomleft.read_delay == topright.read_delay) {
+ /*
+ * Single region: Use midpoint with temperature compensation.
+ * Gaplow approximates upper boundary of valid region.
+ *
+ * rx
+ * 127 ^
+ * | gaplow (approx. topright)
+ * | |
+ * rxhigh -----------xxxxxxx| failing
+ * | xxxxxxx| region
+ * | xxxxxxx| <--------------->
+ * | xxxxxxx| +++++++++++++++++
+ * | xxxxxxxxx ++++++++++++++++
+ * | xxxxxxxxxx +++++++++++++++
+ * | xxxxxxxxxxx ++++++++++++++
+ * | xxxxxxxxxxxx +++++++++++++
+ * | xxxxxxxxxxxxx ++++++++++++
+ * | xxxxxxxxxxxxxx +++++++++++
+ * | xxxxxxxxxxxxxxx ++++++++++
+ * | xxxxxxxxxxxxxxxx +++++++++
+ * | xxxxxxxxxxxxxxxxx ++++++++
+ * rxlow -----------xxxxxxxxxxxxxxxxxx +++++++
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ * (same read_delay)
+ *
+ * Temperature compensation: Valid region shifts with temp.
+ * Offset = region_size / (330 / (temp - 42°C))
+ * Factor 330 is empirically determined for this hardware.
+ */
+
+ dev_dbg(dev,
+ "bottomleft and topright at the same read delay.\n");
+
+ topright = gaplow;
+ searchpoint.read_delay = bottomleft.read_delay;
+ searchpoint.tx =
+ bottomleft.tx + ((topright.tx - bottomleft.tx) / 2);
+ searchpoint.rx =
+ bottomleft.rx + ((topright.rx - bottomleft.rx) / 2);
+
+ ret = cqspi_get_temp(&temp);
+ if (ret) {
+ /* Assume room temperature if sensor unavailable */
+ dev_dbg(dev,
+ "Unable to get temperature. Assuming room temperature\n");
+ temp = CQSPI_PHY_DEFAULT_TEMP;
+ }
+
+ if (temp < CQSPI_PHY_MIN_TEMP || temp > CQSPI_PHY_MAX_TEMP) {
+ dev_err(dev,
+ "Temperature outside operating range: %dC\n",
+ temp);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (temp == CQSPI_PHY_MID_TEMP)
+ temp++; /* Avoid divide-by-zero */
+ dev_dbg(dev, "Temperature: %dC\n", temp);
+
+ /*
+ * Apply temperature offset: positive at high temp, negative at low.
+ * Compute the divisor once and apply to both TX and RX. Use int
+ * arithmetic throughout to avoid u8 wrapping on negative offsets.
+ */
+ temp = 330 / (temp - CQSPI_PHY_MID_TEMP);
+ searchpoint.tx = clamp((int)searchpoint.tx +
+ (topright.tx - bottomleft.tx) / temp,
+ 0, CQSPI_PHY_MAX_DELAY);
+ searchpoint.rx = clamp((int)searchpoint.rx +
+ (topright.rx - bottomleft.rx) / temp,
+ 0, CQSPI_PHY_MAX_DELAY);
+ } else {
+ /*
+ * Dual regions: Gap separates two valid regions, choose larger.
+ *
+ * rx
+ * 127 ^
+ * | topright
+ * | *
+ * rxhigh -----------xxxxx +++++++++++++++++++|
+ * | xxxxxx <region 2> ++++++++|
+ * | xxxxxxx +++++++++++++++++|
+ * | xxxxxxxx ++++++++++++++++|
+ * | xxxxxxxxx +++++++++++++++|
+ * | xxxxxxxxxx ++++++++++++++|
+ * | failing |
+ * | region |
+ * | xxxxxxxxxxxxx +++++++++++|
+ * | xxxxxxxxxxxxxx ++++++++++|
+ * | xxxxxxxxxxxxxxx +++++++++|
+ * | xxxxxxxxxxxxxxxx ++++++++|
+ * | xxxxxxxxx <region 1> +++++++|
+ * rxlow -----------xxxxxxxxxxxxxxxxxx ++++++|
+ * | * |
+ * | bottom|left |
+ * --------|----------------------------|---> tx
+ * 0 | | 127
+ * | |
+ * txlow txhigh
+ *
+ * Strategy: Compare Manhattan distances from gap boundaries to
+ * corners. Choose corner furthest from gap (larger region).
+ * Apply 16-tap margin inward, scale RX proportionally.
+ */
+
+ cqspi_phy_find_gaphigh_ddr(f_pdata, mem, &bottomleft,
+ &topright, &gaphigh);
+ dev_dbg(dev, "gaphigh: RX: %d TX: %d RD: %d\n", gaphigh.rx,
+ gaphigh.tx, gaphigh.read_delay);
+
+ if (topright.tx == bottomleft.tx) {
+ dev_err(dev, "zero TX span in dual-region: cannot compute search point\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* Compare Manhattan distances: choose corner furthest from gap */
+ if ((abs(gaplow.tx - bottomleft.tx) +
+ abs(gaplow.rx - bottomleft.rx)) <
+ (abs(gaphigh.tx - topright.tx) +
+ abs(gaphigh.rx - topright.rx))) {
+ /* Topright further: Use Region 2, 16 taps inward */
+ searchpoint = topright;
+ searchpoint.tx -= 16;
+ searchpoint.rx -= (16 * (topright.rx - bottomleft.rx)) /
+ (topright.tx - bottomleft.tx);
+ } else {
+ /* Bottomleft further: Use Region 1, 16 taps inward */
+ searchpoint = bottomleft;
+ searchpoint.tx += 16;
+ searchpoint.rx += (16 * (topright.rx - bottomleft.rx)) /
+ (topright.tx - bottomleft.tx);
+ }
+ }
+
+ /* Apply and verify final tuning point */
+ dev_dbg(dev, "Final tuning point: RX: %d TX: %d RD: %d\n",
+ searchpoint.rx, searchpoint.tx, searchpoint.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &searchpoint);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ dev_err(dev,
+ "Failed to find pattern at final calibration point\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ f_pdata->phy_setting.read_delay = searchpoint.read_delay;
+ f_pdata->phy_setting.rx = searchpoint.rx;
+ f_pdata->phy_setting.tx = searchpoint.tx;
+out:
+ if (ret)
+ f_pdata->use_tuned_phy = false;
+
+ return ret;
+}
+
+static int cqspi_phy_tuning_sdr(struct cqspi_flash_pdata *f_pdata,
+ struct spi_mem *mem)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ struct device *dev = &cqspi->pdev->dev;
+ struct phy_setting rxlow, rxhigh, first, second, final;
+ u8 window1 = 0;
+ u8 window2 = 0;
+ int ret;
+
+ /*
+ * SDR tuning: 1D search for optimal RX delay (TX less critical).
+ * Find two consecutive windows, choose larger, use midpoint.
+ *
+ * rx
+ * 127 ^
+ * | |-----window at----------|
+ * | |-----read_delay = n+1---|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow(n+1) midpoint rxhigh(n+1)
+ * |
+ * | |---window at--------|
+ * | |---read_delay = n---|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * | rxlow(n) midpoint rxhigh(n)
+ * |
+ * -----------------------------------------> tx
+ * 0 127
+ * read_delay=n read_delay=n+1
+ */
+
+ f_pdata->use_tuned_phy = true;
+ cqspi_phy_reset_setting(&rxlow);
+ cqspi_phy_reset_setting(&rxhigh);
+ cqspi_phy_reset_setting(&first);
+
+ /* First window: Find rxlow by incrementing read_delay from 0 */
+
+ /*
+ * rx
+ * 127 ^
+ * | |xxxxxxxxxxxxxxxxxxxx|
+ * search | |xxxxxxxxxxxxxxxxxxxx|
+ * rxlow | |xxxxxxxxxxxxxxxxxxxx|
+ * increasing | |xxxxxxxxxxxxxxxxxxxx|
+ * --------->|xxxxxxxxxxxxxxxxxxxx|
+ * read_delay | |xxxxxxxxxxxxxxxxxxx|
+ * until found | |xxxxxxxxxxxxxxxxxxx|
+ * | rxlow
+ * -----------------------------------------> tx
+ * 0 tx fixed at 127
+ */
+
+ do {
+ ret = cqspi_find_rx_low_sdr(f_pdata, mem, &rxlow);
+
+ if (ret)
+ rxlow.read_delay++;
+ } while (ret && rxlow.read_delay <= CQSPI_PHY_MAX_RD);
+
+ /* Find rxhigh: Decrement from RX=127 at same read_delay */
+
+ /*
+ * rx
+ * 127 ^ search rxhigh
+ * | (decrement from
+ * | 127 until found)
+ * | |
+ * | |
+ * | v
+ * | |------------------------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0 tx fixed at 127
+ */
+
+ rxhigh.read_delay = rxlow.read_delay;
+ ret = cqspi_find_rx_high_sdr(f_pdata, mem, &rxhigh, rxlow.rx);
+ if (ret)
+ goto out;
+
+ /* Calculate first window midpoint for max margin */
+
+ /*
+ * rx
+ * 127 ^
+ * | |--------window1---------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxx * xxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow ^ rxhigh
+ * ----------------------|------------------> tx
+ * 0 | tx fixed at 127
+ * window1/2
+ */
+
+ first.read_delay = rxlow.read_delay;
+ window1 = rxhigh.rx - rxlow.rx;
+ first.rx = rxlow.rx + (window1 / 2);
+
+ dev_dbg(dev, "First tuning point: RX: %d TX: %d RD: %d\n", first.rx,
+ first.tx, first.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &first);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret || first.read_delay > CQSPI_PHY_MAX_RD)
+ goto out;
+
+ /* Second window: Search at read_delay+1, may differ in size */
+
+ /*
+ * rx
+ * 127 ^
+ * | |-------|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | |xxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0
+ * read_delay = n (smaller window)
+ *
+ * rx
+ * 127 ^
+ * | |-----------------|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxx|
+ * | rxlow rxhigh
+ * -----------------------------------------> tx
+ * 0
+ * read_delay = n+1 (larger window - better)
+ */
+
+ cqspi_phy_reset_setting(&rxlow);
+ cqspi_phy_reset_setting(&rxhigh);
+ cqspi_phy_reset_setting(&second);
+
+ rxlow.read_delay = first.read_delay + 1;
+ if (rxlow.read_delay > CQSPI_PHY_MAX_RD)
+ goto compare;
+
+ ret = cqspi_find_rx_low_sdr(f_pdata, mem, &rxlow);
+ if (ret)
+ goto compare;
+
+ rxhigh.read_delay = rxlow.read_delay;
+ ret = cqspi_find_rx_high_sdr(f_pdata, mem, &rxhigh, rxlow.rx);
+ if (ret)
+ goto compare;
+
+ /* Calculate second window midpoint */
+
+ /*
+ * rx
+ * 127 ^
+ * | |--------window2---------|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxx * xxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | |xxxxxxxxxxxxxxxxxxxxxxxx|
+ * | rxlow ^ rxhigh
+ * ----------------------|------------------> tx
+ * 0 | tx fixed at 127
+ * window2/2
+ * read_delay = n+1
+ */
+
+ window2 = rxhigh.rx - rxlow.rx;
+ second.rx = rxlow.rx + (window2 / 2);
+ second.read_delay = rxlow.read_delay;
+
+ dev_dbg(dev, "Second tuning point: RX: %d TX: %d RD: %d\n", second.rx,
+ second.tx, second.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &second);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret || second.read_delay > CQSPI_PHY_MAX_RD)
+ window2 = 0;
+
+ /* Window comparison: Choose larger window for better margin */
+
+compare:
+ cqspi_phy_reset_setting(&final);
+ if (window2 > window1) {
+ final.rx = second.rx;
+ final.read_delay = second.read_delay;
+ } else {
+ final.rx = first.rx;
+ final.read_delay = first.read_delay;
+ }
+
+ /* Apply and verify final tuning point */
+
+ dev_dbg(dev, "Final tuning point: RX: %d TX: %d RD: %d\n", final.rx,
+ final.tx, final.read_delay);
+ ret = cqspi_phy_apply_setting(f_pdata, &final);
+ if (!ret)
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+
+ if (ret) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ f_pdata->phy_setting.read_delay = final.read_delay;
+ f_pdata->phy_setting.rx = final.rx;
+ f_pdata->phy_setting.tx = final.tx;
+
+out:
+ if (ret)
+ f_pdata->use_tuned_phy = false;
+
+ return ret;
+}
+
+static int cqspi_am654_ospi_execute_tuning(struct spi_mem *mem,
+ struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct cqspi_st *cqspi =
+ spi_controller_get_devdata(mem->spi->controller);
+ struct cqspi_flash_pdata *f_pdata;
+ struct device *dev = &cqspi->pdev->dev;
+ u32 phy_offset = 0;
+ int ret;
+
+ f_pdata = &cqspi->f_pdata[spi_get_chipselect(mem->spi, 0)];
+
+ /*
+ * spi-max-post-config-frequency must be present for PHY tuning.
+ * If absent, post_config_max_speed_hz is zero and there is no
+ * calibration target, so skip tuning gracefully.
+ */
+ if (!mem->spi->post_config_max_speed_hz) {
+ dev_dbg(dev,
+ "No post-config frequency configured, skipping tuning\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (write_op) {
+ /*
+ * For NAND: write the calibration pattern to the page cache.
+ * The write op runs at the conservative base speed (spi->max_speed_hz)
+ * so the write itself is reliable before PHY calibration.
+ */
+ ret = cqspi_write_pattern_to_cache(f_pdata, mem, write_op);
+ if (ret) {
+ dev_warn(dev,
+ "failed to write pattern to cache: %d, skipping tuning\n",
+ ret);
+ goto out;
+ }
+
+ f_pdata->phy_write_op = *write_op;
+ } else {
+ ret = cqspi_get_phy_pattern_offset(dev, &phy_offset);
+ if (ret) {
+ dev_warn(dev,
+ "pattern partition not found: %d, skipping tuning\n",
+ ret);
+ goto out;
+ }
+
+ read_op->addr.val = phy_offset;
+ }
+
+ /*
+ * Verify the calibration pattern exists using the conservative base
+ * speed. At high clock rates the DLL is not yet trained, so DTR
+ * data capture is unreliable and the read would return garbage.
+ * Setting max_freq to 0 causes spi_mem_adjust_op_freq() to cap the
+ * read to max_speed_hz (the base rate), well within reliable DTR
+ * margins. max_freq is set to post_config_max_speed_hz below after
+ * the pattern is confirmed, so the tuning-loop reads run at full rate.
+ */
+ f_pdata->phy_read_op = *read_op;
+ f_pdata->phy_read_op.max_freq = 0;
+
+ ret = cqspi_phy_check_pattern(f_pdata, mem);
+ if (ret) {
+ dev_err(dev, "pattern not found: %d, skipping tuning\n", ret);
+ goto out;
+ }
+
+ /*
+ * Pattern confirmed. Set phy_read_op.max_freq to
+ * post_config_max_speed_hz so that tuning-loop reads bypass the base
+ * frequency cap in spi_mem_adjust_op_freq() and run at the full
+ * calibration rate.
+ */
+ f_pdata->phy_read_op.max_freq = mem->spi->post_config_max_speed_hz;
+
+ if (read_op->cmd.dtr || read_op->addr.dtr || read_op->dummy.dtr ||
+ read_op->data.dtr) {
+ f_pdata->use_dqs = true;
+ cqspi_phy_pre_config(cqspi, f_pdata, false);
+ ret = cqspi_phy_tuning_ddr(f_pdata, mem);
+ } else {
+ f_pdata->use_dqs = false;
+ cqspi_phy_pre_config(cqspi, f_pdata, true);
+ ret = cqspi_phy_tuning_sdr(f_pdata, mem);
+ }
+
+ if (ret)
+ dev_warn(dev, "tuning failed: %d\n", ret);
+
+ cqspi_phy_post_config(cqspi, f_pdata->read_delay);
+
+out:
+ /*
+ * On success, write back the validated maximum speed into the caller's
+ * op templates so that those specific ops bypass the cap in subsequent
+ * exec_op calls.
+ */
+ if (!ret) {
+ read_op->max_freq = mem->spi->post_config_max_speed_hz;
+ if (write_op)
+ write_op->max_freq = mem->spi->post_config_max_speed_hz;
+ }
+
+ return ret;
+}
+
+static int cqspi_mem_op_execute_tuning(struct spi_mem *mem,
+ struct spi_mem_op *read_op,
+ struct spi_mem_op *write_op)
+{
+ struct cqspi_st *cqspi =
+ spi_controller_get_devdata(mem->spi->controller);
+
+ if (!cqspi->ddata->execute_tuning)
+ return -EOPNOTSUPP;
+
+ return cqspi->ddata->execute_tuning(mem, read_op, write_op);
+}
+
static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
struct cqspi_flash_pdata *f_pdata,
struct device_node *np)
@@ -1584,11 +3337,6 @@ static int cqspi_of_get_flash_pdata(struct platform_device *pdev,
return -ENXIO;
}
- if (of_property_read_u32(np, "spi-max-frequency", &f_pdata->clk_rate)) {
- dev_err(&pdev->dev, "couldn't determine spi-max-frequency\n");
- return -ENXIO;
- }
-
return 0;
}
@@ -1736,6 +3484,7 @@ static const struct spi_controller_mem_ops cqspi_mem_ops = {
.exec_op = cqspi_exec_mem_op,
.get_name = cqspi_get_name,
.supports_op = cqspi_supports_mem_op,
+ .execute_tuning = cqspi_mem_op_execute_tuning,
};
static const struct spi_controller_mem_caps cqspi_mem_caps = {
@@ -2104,6 +3853,7 @@ static const struct cqspi_driver_platdata k2g_qspi = {
static const struct cqspi_driver_platdata am654_ospi = {
.hwcaps_mask = CQSPI_SUPPORTS_OCTAL | CQSPI_SUPPORTS_QUAD,
.quirks = CQSPI_NEEDS_WR_DELAY,
+ .execute_tuning = cqspi_am654_ospi_execute_tuning,
};
static const struct cqspi_driver_platdata intel_lgm_qspi = {
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support
2026-06-18 7:37 ` [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-06-18 7:59 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:59 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: devicetree, conor+dt, robh
Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:
- [High] Unchecked dereference of `cqspi->ddata` in `cqspi_mem_op_execute_tuning` when it can be NULL.
- [High] Passing `.rodata` buffer to `spi_mem_exec_op()` violates SPI API DMA constraints.
- [High] Dual-region PHY tuning arithmetic lacks bounds clamping, causing `u8` underflows.
- [High] The tuning loops sweep `read_delay`, but `cqspi_phy_apply_setting` fails to update the hardware register with the new delay.
- [High] PHY tuning modifies shared SPI controller registers without holding the SPI bus lock, creating a race condition with concurrent transfers.
- [High] The PHY tuning sequence performs direct MMIO to controller registers without acquiring a PM runtime reference.
- [Medium] Missing error check in SDR tuning causes invalid read_delay configuration.
- [Medium] The PHY tuning search loops ignore critical hardware timeouts and memory allocation errors, causing the system to needlessly stall.
--
commit a0e116638901b220157d7f4a14b288dc44e8d0fd
Author: Santhosh Kumar K <s-k6@ti.com>
Subject: spi: cadence-quadspi: add PHY tuning support
This commit adds an execute_tuning callback for delay-line calibration on
Cadence QSPI controllers. It implements tuning sweeps for both DDR and SDR
modes. The calibration relies on writing and reading a predefined pattern
to and from flash memory.
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index f4efb92c9446a..05e1b7a2f3433 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
[ ... ]
> @@ -1559,6 +1640,1678 @@ static bool cqspi_supports_mem_op(struct spi_mem *mem,
> +static int cqspi_write_pattern_to_cache(struct cqspi_flash_pdata *f_pdata,
> + struct spi_mem *mem,
> + struct spi_mem_op *write_op)
> +{
> + struct device *dev = &f_pdata->cqspi->pdev->dev;
> + int ret;
> +
> + write_op->data.nbytes = sizeof(phy_tuning_pattern);
> + write_op->data.buf.out = phy_tuning_pattern;
> +
> + ret = spi_mem_exec_op(mem, write_op);
[Severity: High]
Does passing phy_tuning_pattern directly to spi_mem_exec_op violate SPI DMA
constraints? The phy_tuning_pattern array is defined as static const u8 and
resides in the rodata section.
The SPI framework mandates that buffers passed to the SPI core must be
DMA-capable, as fallback paths will attempt to call dma_map_single on them.
Mapping a rodata buffer violates the DMA API and will trigger kernel panics
or warnings when CONFIG_DMA_API_DEBUG is enabled. Could this be copied to a
dynamically allocated bounce buffer first?
[ ... ]
> +static int cqspi_phy_apply_setting(struct cqspi_flash_pdata *f_pdata,
> + struct phy_setting *phy)
> +{
> + struct cqspi_st *cqspi = f_pdata->cqspi;
> + unsigned int reg;
> + int ret;
> +
> + reg = readl(cqspi->iobase + CQSPI_REG_READCAPTURE);
> + reg |= BIT(CQSPI_REG_READCAPTURE_EDGE_LSB);
> + writel(reg, cqspi->iobase + CQSPI_REG_READCAPTURE);
> +
> + cqspi_set_dll(cqspi->iobase, phy->rx, phy->tx);
> +
> + ret = cqspi_resync_dll(cqspi);
> + if (ret)
> + return ret;
> +
> + f_pdata->phy_setting.read_delay = phy->read_delay;
> + return 0;
> +}
[Severity: High]
Is the read_delay value actually being programmed into the hardware during
the tuning sweeps?
While cqspi_phy_apply_setting updates the DLL and edge registers, it appears
it never writes the phy->read_delay into the DELAY_LSB field of the
CQSPI_REG_READCAPTURE register.
Because the internal configuration updates are bypassed when the SPI clock
frequency has not changed, doesn't this mean the hardware read_delay remains
fixed at its initial value throughout the entire tuning sequence?
[ ... ]
> +static int cqspi_find_rx_low_ddr(struct cqspi_flash_pdata *f_pdata,
> + struct spi_mem *mem, struct phy_setting *phy)
> +{
> + struct device *dev = &f_pdata->cqspi->pdev->dev;
> + int ret;
> +
> + do {
> + phy->rx = CQSPI_PHY_RX_LOW_SEARCH_START;
> + do {
> + ret = cqspi_phy_apply_setting(f_pdata, phy);
> + if (!ret) {
> + ret = cqspi_phy_check_pattern(f_pdata, mem);
> + if (!ret)
> + return 0;
> + }
> +
> + phy->rx += CQSPI_PHY_DDR_SEARCH_STEP;
> + } while (phy->rx <= CQSPI_PHY_RX_LOW_SEARCH_END);
[Severity: Medium]
Will this loop stall the boot process needlessly if a fatal error occurs?
If cqspi_phy_apply_setting returns -ETIMEDOUT because the DLL failed to
lock, or cqspi_phy_check_pattern returns -ENOMEM due to an allocation error,
the loop checks if (!ret) and then blindly continues to the next coordinate.
Since these operations wait up to 600us on each iteration, ignoring these
environmental errors instead of failing fast seems to cause a significant
stall. Should we check for specific fatal error codes and abort?
[ ... ]
> + /* Compare Manhattan distances: choose corner furthest from gap */
> + if ((abs(gaplow.tx - bottomleft.tx) +
> + abs(gaplow.rx - bottomleft.rx)) <
> + (abs(gaphigh.tx - topright.tx) +
> + abs(gaphigh.rx - topright.rx))) {
> + /* Topright further: Use Region 2, 16 taps inward */
> + searchpoint = topright;
> + searchpoint.tx -= 16;
> + searchpoint.rx -= (16 * (topright.rx - bottomleft.rx)) /
> + (topright.tx - bottomleft.tx);
[Severity: High]
Can this proportional adjustment underflow the u8 searchpoint.rx variable?
When resolving the dual-region tuning target, if the calculated adjustment
exceeds topright.rx (e.g. if topright.rx is 24 and the adjustment evaluates
to 27), the u8 variable will severely underflow and wrap to 253.
This wrapped value is then passed into cqspi_set_dll where the 0x7F mask
would truncate it to an arbitrary out-of-bounds point, corrupting the
calibration. Should there be a clamp applied here?
[ ... ]
> +static int cqspi_phy_tuning_sdr(struct cqspi_flash_pdata *f_pdata,
> + struct spi_mem *mem)
> +{
[ ... ]
> + do {
> + ret = cqspi_find_rx_low_sdr(f_pdata, mem, &rxlow);
> +
> + if (ret)
> + rxlow.read_delay++;
> + } while (ret && rxlow.read_delay <= CQSPI_PHY_MAX_RD);
> +
> + /* Find rxhigh: Decrement from RX=127 at same read_delay */
> +
[ ... ]
> +
> + rxhigh.read_delay = rxlow.read_delay;
> + ret = cqspi_find_rx_high_sdr(f_pdata, mem, &rxhigh, rxlow.rx);
[Severity: Medium]
Is there a missing error check after the first search loop?
Unlike the DDR implementation, if the loop exceeds CQSPI_PHY_MAX_RD without
finding rxlow, it exits with a non-zero ret. However, the code proceeds
directly to searching rxhigh with an out-of-bounds rxlow.read_delay (5) and
an invalid rxlow.rx (128). This causes out-of-bounds values to be applied to
the hardware configuration. Could we add an if (ret) goto out; check here?
[ ... ]
> +static int cqspi_am654_ospi_execute_tuning(struct spi_mem *mem,
> + struct spi_mem_op *read_op,
> + struct spi_mem_op *write_op)
> +{
[ ... ]
> + if (read_op->cmd.dtr || read_op->addr.dtr || read_op->dummy.dtr ||
> + read_op->data.dtr) {
> + f_pdata->use_dqs = true;
> + cqspi_phy_pre_config(cqspi, f_pdata, false);
> + ret = cqspi_phy_tuning_ddr(f_pdata, mem);
> + } else {
> + f_pdata->use_dqs = false;
> + cqspi_phy_pre_config(cqspi, f_pdata, true);
> + ret = cqspi_phy_tuning_sdr(f_pdata, mem);
> + }
[Severity: High]
Are we adequately protecting concurrent transfers and power management state
during these tuning calls?
The spi_mem_execute_tuning framework API does not acquire the SPI
controller's bus_lock_mutex. Consequently, this callback runs lock-free. It
performs raw read-modify-write operations on shared controller registers
like CQSPI_REG_CONFIG and unconditionally disables the controller via
cqspi_resync_dll(). If another SPI device on the same bus initiates a
concurrent transfer, its transfer could be aborted or corrupted by these
unguarded register writes.
[Severity: High]
Additionally, the tuning sequence accesses the controller's MMIO registers
directly but does not acquire an active PM runtime reference with
pm_runtime_resume_and_get(). If the SPI controller is suspended by
auto-suspend immediately after an inner spi_mem_exec_op returns, subsequent
manual MMIO accesses might hit unclocked hardware and trigger synchronous
external aborts.
[ ... ]
> +static int cqspi_mem_op_execute_tuning(struct spi_mem *mem,
> + struct spi_mem_op *read_op,
> + struct spi_mem_op *write_op)
> +{
> + struct cqspi_st *cqspi =
> + spi_controller_get_devdata(mem->spi->controller);
> +
> + if (!cqspi->ddata->execute_tuning)
> + return -EOPNOTSUPP;
> +
> + return cqspi->ddata->execute_tuning(mem, read_op, write_op);
> +}
[Severity: High]
Can cqspi->ddata be NULL here?
The execute_tuning callback is unconditionally added to cqspi_mem_ops. In
this driver, cqspi->ddata is populated via of_device_get_match_data(), which
can return NULL for generic fallbacks or manual device instantiation. If
ddata is NULL, dereferencing cqspi->ddata->execute_tuning will trigger a
kernel panic when the SPI core attempts to execute tuning.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=8
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383)
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (7 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 08/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 8:04 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support Santhosh Kumar K
` (6 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Erratum i2383 on AM654 locks the address phase in PHY DDR mode when a
2-byte column address is used. DDR PHY tuning must not be attempted for
such operations; non-PHY DDR usage is unaffected. [0]
Add CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk and check it in
cqspi_am654_ospi_execute_tuning(). When the erratum applies, return 0
with read_op->max_freq cleared.
[0] https://www.ti.com/lit/er/sprz544c/sprz544c.pdf
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 72768292a32b..22df5f3bdb96 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -49,6 +49,7 @@ static_assert(CQSPI_MAX_CHIPSELECT <= SPI_DEVICE_CS_CNT_MAX);
#define CQSPI_DISABLE_RUNTIME_PM BIT(10)
#define CQSPI_NO_INDIRECT_MODE BIT(11)
#define CQSPI_HAS_WR_PROTECT BIT(12)
+#define CQSPI_NO_2BYTE_ADDR_PHY_DDR BIT(13)
/* Capabilities */
#define CQSPI_SUPPORTS_OCTAL BIT(0)
@@ -3211,6 +3212,20 @@ static int cqspi_am654_ospi_execute_tuning(struct spi_mem *mem,
return -EOPNOTSUPP;
}
+ /*
+ * Erratum i2383: in PHY DDR mode, a 2-byte column address locks up
+ * the address phase. Skip DDR PHY tuning for such operations.
+ */
+ if ((cqspi->ddata->quirks & CQSPI_NO_2BYTE_ADDR_PHY_DDR) &&
+ read_op->addr.nbytes == 2 &&
+ (read_op->cmd.dtr || read_op->addr.dtr || read_op->dummy.dtr ||
+ read_op->data.dtr)) {
+ dev_dbg(dev,
+ "i2383: skipping DDR PHY tuning (2-byte address)\n");
+ read_op->max_freq = 0;
+ return 0;
+ }
+
if (write_op) {
/*
* For NAND: write the calibration pattern to the page cache.
@@ -3852,7 +3867,7 @@ static const struct cqspi_driver_platdata k2g_qspi = {
static const struct cqspi_driver_platdata am654_ospi = {
.hwcaps_mask = CQSPI_SUPPORTS_OCTAL | CQSPI_SUPPORTS_QUAD,
- .quirks = CQSPI_NEEDS_WR_DELAY,
+ .quirks = CQSPI_NEEDS_WR_DELAY | CQSPI_NO_2BYTE_ADDR_PHY_DDR,
.execute_tuning = cqspi_am654_ospi_execute_tuning,
};
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383)
2026-06-18 7:37 ` [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383) Santhosh Kumar K
@ 2026-06-18 8:04 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 8:04 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: conor+dt, devicetree, robh
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Unchecked dereference of `cqspi->ddata` in `cqspi_mem_op_execute_tuning()` leads to a potential kernel oops.
--
commit 40592f2a22efbd29a6272035f538dcdb01d9f776
Author: Santhosh Kumar K <s-k6@ti.com>
spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383)
This commit introduces a CQSPI_NO_2BYTE_ADDR_PHY_DDR quirk to skip DDR PHY
tuning for 2-byte column addresses. This is done to address erratum i2383 on
AM654, which locks the address phase in PHY DDR mode for such operations.
[0] https://www.ti.com/lit/er/sprz544c/sprz544c.pdf
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 05e1b7a2f3433..c94bb8ad5d1dd 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
[ ... ]
> @@ -3864,7 +3879,7 @@ static const struct cqspi_driver_platdata k2g_qspi = {
>
> static const struct cqspi_driver_platdata am654_ospi = {
> .hwcaps_mask = CQSPI_SUPPORTS_OCTAL | CQSPI_SUPPORTS_QUAD,
> - .quirks = CQSPI_NEEDS_WR_DELAY,
> + .quirks = CQSPI_NEEDS_WR_DELAY | CQSPI_NO_2BYTE_ADDR_PHY_DDR,
> .execute_tuning = cqspi_am654_ospi_execute_tuning,
> };
>
[Severity: Medium]
Looking at the tuning execution path that uses these callbacks, is there a
potential NULL pointer dereference in cqspi_mem_op_execute_tuning()?
drivers/spi/spi-cadence-quadspi.c:cqspi_mem_op_execute_tuning() {
...
struct cqspi_st *cqspi =
spi_controller_get_devdata(mem->spi->controller);
if (!cqspi->ddata->execute_tuning)
return -EOPNOTSUPP;
return cqspi->ddata->execute_tuning(mem, read_op, write_op);
}
The cqspi->ddata pointer is initialized via of_device_get_match_data(), which
can return NULL if the driver is bound via sysfs to a device node without a
matching compatible string.
Other uses of ddata in the driver (like in cqspi_probe or cqspi_setup_flash)
include a NULL check before dereferencing it, proving that ddata == NULL is a
supported and handled state.
Should this tuning wrapper also include a defensive check for ddata == NULL?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=9
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (8 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 09/16] spi: cadence-quadspi: skip DDR PHY tuning for 2-byte-address ops (i2383) Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:57 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads Santhosh Kumar K
` (5 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Extract the DMA transfer code from cqspi_direct_read_execute() into a
new cqspi_direct_read_dma() helper. Add cqspi_memcpy_fromio() to handle
non-DMA transfers, with 2-byte-aligned I/O accesses for 8D-8D-8D mode.
Change cqspi_direct_read_execute() to take the full spi_mem_op and
max_speed_hz instead of separate buf/from/len parameters, matching the
interface needed by the PHY-aware version in the following patch.
Thread max_speed_hz from cqspi_mem_process() through cqspi_read().
Transfers shorter than CQSPI_PHY_MIN_DIRECT_READ_LEN bytes always use
the memcpy path; longer transfers use DMA when available.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 72 ++++++++++++++++++++++++++-----
1 file changed, 62 insertions(+), 10 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 22df5f3bdb96..e41aeca47885 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -149,6 +149,8 @@ struct cqspi_driver_platdata {
#define CQSPI_READ_TIMEOUT_MS 10
#define CQSPI_BUSYWAIT_TIMEOUT_US 500
#define CQSPI_DLL_TIMEOUT_US 300
+/* Minimum transfer length to use DMA for direct reads */
+#define CQSPI_PHY_MIN_DIRECT_READ_LEN 17
/* Runtime_pm autosuspend delay */
#define CQSPI_AUTOSUSPEND_TIMEOUT 2000
@@ -1447,8 +1449,8 @@ static void cqspi_rx_dma_callback(void *param)
complete(&cqspi->rx_dma_complete);
}
-static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
- u_char *buf, loff_t from, size_t len)
+static int cqspi_direct_read_dma(struct cqspi_flash_pdata *f_pdata, u_char *buf,
+ loff_t from, size_t len)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
struct device *dev = &cqspi->pdev->dev;
@@ -1460,11 +1462,6 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
dma_addr_t dma_dst;
struct device *ddev;
- if (!cqspi->rx_chan || !virt_addr_valid(buf)) {
- memcpy_fromio(buf, cqspi->ahb_base + from, len);
- return 0;
- }
-
ddev = cqspi->rx_chan->device->dev;
dma_dst = dma_map_single(ddev, buf, len, DMA_FROM_DEVICE);
if (dma_mapping_error(ddev, dma_dst)) {
@@ -1506,8 +1503,61 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
return ret;
}
+static void cqspi_memcpy_fromio(const struct spi_mem_op *op, void *to,
+ const void __iomem *from, size_t count)
+{
+ if (op->data.buswidth == 8 && op->data.dtr) {
+ unsigned long from_addr = (unsigned long)from;
+
+ /* Handle unaligned start with 2-byte read */
+ if (count && !IS_ALIGNED(from_addr, 4)) {
+ *(u16 *)to = __raw_readw(from);
+ from += 2;
+ to += 2;
+ count -= 2;
+ }
+
+ /* Use 4-byte reads for aligned bulk (no readq for 32-bit) */
+ if (count >= 4) {
+ size_t len = round_down(count, 4);
+
+ memcpy_fromio(to, from, len);
+ from += len;
+ to += len;
+ count -= len;
+ }
+
+ /* Handle remaining 2 bytes */
+ if (count)
+ *(u16 *)to = __raw_readw(from);
+
+ return;
+ }
+
+ memcpy_fromio(to, from, count);
+}
+
+static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ loff_t from = op->addr.val;
+ size_t len = op->data.nbytes;
+ u_char *buf = op->data.buf.in;
+
+ if (!cqspi->rx_chan || !virt_addr_valid(buf) ||
+ len < CQSPI_PHY_MIN_DIRECT_READ_LEN) {
+ cqspi_memcpy_fromio(op, buf, cqspi->ahb_base + from, len);
+ return 0;
+ }
+
+ return cqspi_direct_read_dma(f_pdata, buf, from, len);
+}
+
static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
- const struct spi_mem_op *op)
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
const struct cqspi_driver_platdata *ddata = cqspi->ddata;
@@ -1523,7 +1573,8 @@ static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
if ((cqspi->use_direct_mode && ((from + len) <= cqspi->ahb_size)) ||
(cqspi->ddata && cqspi->ddata->quirks & CQSPI_NO_INDIRECT_MODE))
- return cqspi_direct_read_execute(f_pdata, buf, from, len);
+ return cqspi_direct_read_execute(f_pdata, op,
+ post_config_max_speed_hz);
if (cqspi->use_dma_read && ddata && ddata->indirect_read_dma &&
virt_addr_valid(buf) && ((dma_align & CQSPI_DMA_UNALIGN) == 0))
@@ -1551,7 +1602,8 @@ static int cqspi_mem_process(struct spi_mem *mem, const struct spi_mem_op *op)
!cqspi->disable_stig_mode))
return cqspi_command_read(f_pdata, op);
- return cqspi_read(f_pdata, op);
+ return cqspi_read(f_pdata, op,
+ mem->spi->post_config_max_speed_hz);
}
if (!op->addr.nbytes || !op->data.buf.out)
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support
2026-06-18 7:37 ` [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support Santhosh Kumar K
@ 2026-06-18 7:57 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:57 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: conor+dt, devicetree, robh
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Casting an arbitrary `void *` buffer pointer to `u16 *` and dereferencing it causes unaligned access faults on strict-alignment architectures.
- [High] Performing 16-bit unaligned reads from memory-mapped I/O if the flash offset is odd causes a Data Abort.
- [High] Silent degradation to 1-byte I/O reads by memcpy_fromio() violates 8D-8D-8D access size requirements.
--
commit d928e9450946bfcaed1dc0c5349081ffad136425
Author: Santhosh Kumar K <s-k6@ti.com>
spi: cadence-quadspi: refactor direct read path for PHY support
Extract the DMA transfer code from cqspi_direct_read_execute() into a
new cqspi_direct_read_dma() helper. Add cqspi_memcpy_fromio() to handle
non-DMA transfers, with 2-byte-aligned I/O accesses for 8D-8D-8D mode.
Transfers shorter than CQSPI_PHY_MIN_DIRECT_READ_LEN bytes always use
the memcpy path; longer transfers use DMA when available.
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index c94bb8ad5d1dd..5f992fee790a4 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
[ ... ]
> @@ -1506,8 +1503,61 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
> return ret;
> }
>
> +static void cqspi_memcpy_fromio(const struct spi_mem_op *op, void *to,
> + const void __iomem *from, size_t count)
> +{
> + if (op->data.buswidth == 8 && op->data.dtr) {
> + unsigned long from_addr = (unsigned long)from;
> +
> + /* Handle unaligned start with 2-byte read */
> + if (count && !IS_ALIGNED(from_addr, 4)) {
> + *(u16 *)to = __raw_readw(from);
[Severity: High]
Will this __raw_readw() cause a Data Abort on ARM architectures if the
flash offset is odd?
The code assumes from_addr is 2-byte aligned, but userspace MTD reads
can specify arbitrary odd offsets. If from_addr is odd,
!IS_ALIGNED(from_addr, 4) evaluates to true and we execute __raw_readw()
on an odd I/O memory address.
Unaligned accesses to device memory strictly cause Alignment Faults on
ARM, even if the underlying hardware ignores the LSB.
Also, can casting the destination buffer to a u16 pointer and dereferencing
it cause unaligned access faults?
The destination buffer, derived from op->data.buf.in, might be unaligned.
Casting an unaligned void pointer to a u16 pointer and dereferencing it
causes an unaligned store exception on strict-alignment architectures.
Does this need to use put_unaligned() instead?
> + from += 2;
> + to += 2;
> + count -= 2;
> + }
> +
> + /* Use 4-byte reads for aligned bulk (no readq for 32-bit) */
> + if (count >= 4) {
> + size_t len = round_down(count, 4);
> +
> + memcpy_fromio(to, from, len);
[Severity: High]
Does memcpy_fromio() guarantee the 2-byte minimum I/O access size required
by the 8D-8D-8D hardware?
If the generic memcpy_fromio() fallback is used by the architecture, or
if the destination buffer is unaligned on architectures like ARM64,
memcpy_fromio() will silently degrade to byte-wise reads.
Issuing 1-byte read requests on the AHB bus violates the hardware's 2-byte
minimum access requirement in 8D-8D-8D mode. Should this use an explicit
loop with ioread32_rep() or __raw_readl() to guarantee the minimum access
size?
> + from += len;
> + to += len;
> + count -= len;
> + }
> +
> + /* Handle remaining 2 bytes */
> + if (count)
> + *(u16 *)to = __raw_readw(from);
[Severity: High]
As with the unaligned start handling above, does this unaligned void
pointer cast risk unaligned access faults on strict-alignment architectures?
> +
> + return;
> + }
> +
> + memcpy_fromio(to, from, count);
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=10
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (9 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 10/16] spi: cadence-quadspi: refactor direct read path for PHY support Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:53 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes Santhosh Kumar K
` (4 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Add cqspi_tune_phy() to toggle PHY mode. Enabling sets the calibrated
read-capture delay, asserts PHY_EN and PHY_PIPELINE, and decrements the
dummy cycle count by one since the PHY pipeline absorbs that latency.
Disabling reverses all three. Disable is best-effort.
For direct reads, split the transfer into an unaligned head, a
16-byte-aligned middle section with PHY active, and an unaligned tail.
PHY is used when tuning completed successfully and the transfer is at
the calibrated frequency.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 103 +++++++++++++++++++++++++++++-
1 file changed, 102 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index e41aeca47885..16e3b843f0aa 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -565,6 +565,61 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
writel(reg, reg_base + CQSPI_REG_READCAPTURE);
}
+static int cqspi_tune_phy(struct cqspi_flash_pdata *f_pdata, bool enable)
+{
+ struct cqspi_st *cqspi = f_pdata->cqspi;
+ void __iomem *reg_base = cqspi->iobase;
+ u32 reg;
+ u8 dummy;
+
+ if (enable) {
+ cqspi_readdata_capture(cqspi, true, f_pdata->use_dqs,
+ f_pdata->phy_setting.read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg |= CQSPI_REG_CONFIG_PHY_EN | CQSPI_REG_CONFIG_PHY_PIPELINE;
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ /*
+ * The PHY data-capture pipeline absorbs one dummy cycle's
+ * worth of latency; reduce the count to avoid over-compensation.
+ */
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy--;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+ } else {
+ cqspi_readdata_capture(cqspi, !cqspi->rclk_en, false,
+ f_pdata->read_delay);
+
+ reg = readl(reg_base + CQSPI_REG_CONFIG);
+ reg &= ~(CQSPI_REG_CONFIG_PHY_EN |
+ CQSPI_REG_CONFIG_PHY_PIPELINE);
+ writel(reg, reg_base + CQSPI_REG_CONFIG);
+
+ reg = readl(reg_base + CQSPI_REG_RD_INSTR);
+ dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ reg);
+ dummy++;
+ reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB);
+ reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
+ << CQSPI_REG_RD_INSTR_DUMMY_LSB,
+ dummy);
+ writel(reg, reg_base + CQSPI_REG_RD_INSTR);
+ }
+
+ return cqspi_wait_idle(cqspi);
+}
+
static int cqspi_exec_flash_cmd(struct cqspi_st *cqspi, unsigned int reg)
{
void __iomem *reg_base = cqspi->iobase;
@@ -1442,6 +1497,14 @@ static ssize_t cqspi_write(struct cqspi_flash_pdata *f_pdata,
return cqspi_indirect_write_execute(f_pdata, to, buf, len);
}
+static bool cqspi_use_tuned_phy(struct cqspi_flash_pdata *f_pdata,
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
+{
+ return f_pdata->use_tuned_phy &&
+ op->max_freq == post_config_max_speed_hz;
+}
+
static void cqspi_rx_dma_callback(void *param)
{
struct cqspi_st *cqspi = param;
@@ -1543,8 +1606,11 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
{
struct cqspi_st *cqspi = f_pdata->cqspi;
loff_t from = op->addr.val;
+ loff_t from_aligned, to_aligned;
size_t len = op->data.nbytes;
+ size_t len_aligned;
u_char *buf = op->data.buf.in;
+ int ret;
if (!cqspi->rx_chan || !virt_addr_valid(buf) ||
len < CQSPI_PHY_MIN_DIRECT_READ_LEN) {
@@ -1552,7 +1618,42 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
return 0;
}
- return cqspi_direct_read_dma(f_pdata, buf, from, len);
+ if (!cqspi_use_tuned_phy(f_pdata, op, post_config_max_speed_hz))
+ return cqspi_direct_read_dma(f_pdata, buf, from, len);
+
+ /* Split into unaligned head, aligned middle, unaligned tail */
+ from_aligned = ALIGN(from, 16);
+ to_aligned = ALIGN_DOWN(from + len, 16);
+ len_aligned = to_aligned - from_aligned;
+
+ if (from != from_aligned) {
+ ret = cqspi_direct_read_dma(f_pdata, buf, from,
+ from_aligned - from);
+ if (ret)
+ return ret;
+ buf += from_aligned - from;
+ }
+
+ if (len_aligned) {
+ ret = cqspi_tune_phy(f_pdata, true);
+ if (ret)
+ return ret;
+ ret = cqspi_direct_read_dma(f_pdata, buf, from_aligned,
+ len_aligned);
+ cqspi_tune_phy(f_pdata, false);
+ if (ret)
+ return ret;
+ buf += len_aligned;
+ }
+
+ if (to_aligned != (from + len)) {
+ ret = cqspi_direct_read_dma(f_pdata, buf, to_aligned,
+ (from + len) - to_aligned);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
}
static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads
2026-06-18 7:37 ` [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads Santhosh Kumar K
@ 2026-06-18 7:53 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:53 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: robh, conor+dt, devicetree
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Critical] Unaligned and small direct reads bypass the PHY but execute at maximum frequency, risking silent data corruption.
- [High] Failure to enable the PHY leaves the SPI controller permanently corrupted and hardware state inconsistent.
- [High] Blindly decrementing the dummy cycle count in `cqspi_tune_phy()` causes integer underflow for operations requiring 0 dummy cycles.
--
commit a2e81b0959397afcae6ba4e607025256559b31dd
Author: Santhosh Kumar K <s-k6@ti.com>
spi: cadence-quadspi: enable PHY for direct reads
Add cqspi_tune_phy() to toggle PHY mode. Enabling sets the calibrated
read-capture delay, asserts PHY_EN and PHY_PIPELINE, and decrements the
dummy cycle count by one since the PHY pipeline absorbs that latency.
Disabling reverses all three. Disable is best-effort.
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 5f992fee790a4..eb6baa797c78e 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
[ ... ]
> @@ -565,6 +565,61 @@ static void cqspi_readdata_capture(struct cqspi_st *cqspi, const bool bypass,
> writel(reg, reg_base + CQSPI_REG_READCAPTURE);
> }
>
> +static int cqspi_tune_phy(struct cqspi_flash_pdata *f_pdata, bool enable)
> +{
> + struct cqspi_st *cqspi = f_pdata->cqspi;
> + void __iomem *reg_base = cqspi->iobase;
> + u32 reg;
> + u8 dummy;
> +
> + if (enable) {
[ ... ]
> + /*
> + * The PHY data-capture pipeline absorbs one dummy cycle's
> + * worth of latency; reduce the count to avoid over-compensation.
> + */
> + reg = readl(reg_base + CQSPI_REG_RD_INSTR);
> + dummy = FIELD_GET(CQSPI_REG_RD_INSTR_DUMMY_MASK
> + << CQSPI_REG_RD_INSTR_DUMMY_LSB,
> + reg);
> + dummy--;
[Severity: High]
If the current SPI operation legitimately requires 0 dummy cycles
(e.g., standard 0x03 read commands, or during indirect writes), does dummy
underflow to 255 here?
If it does underflow, FIELD_PREP would mask this to 31 dummy cycles, which
could cause the controller to wait for 31 dummy cycles instead of 0,
resulting in shifted data.
Also, if the operation has 0 dummy cycles, does the PHY pipeline absorb
the first cycle of valid data instead of a dummy cycle?
> + reg &= ~(CQSPI_REG_RD_INSTR_DUMMY_MASK
> + << CQSPI_REG_RD_INSTR_DUMMY_LSB);
> + reg |= FIELD_PREP(CQSPI_REG_RD_INSTR_DUMMY_MASK
> + << CQSPI_REG_RD_INSTR_DUMMY_LSB,
> + dummy);
> + writel(reg, reg_base + CQSPI_REG_RD_INSTR);
> + } else {
[ ... ]
> + }
> +
> + return cqspi_wait_idle(cqspi);
> +}
[ ... ]
> @@ -1442,6 +1497,14 @@ static ssize_t cqspi_write(struct cqspi_flash_pdata *f_pdata,
> return cqspi_indirect_write_execute(f_pdata, to, buf, len);
> }
>
> +static bool cqspi_use_tuned_phy(struct cqspi_flash_pdata *f_pdata,
> + const struct spi_mem_op *op,
> + u32 post_config_max_speed_hz)
> +{
> + return f_pdata->use_tuned_phy &&
> + op->max_freq == post_config_max_speed_hz;
> +}
[Severity: High]
Could we enforce that op->dummy.nbytes > 0 here before authorizing the
use of the tuned PHY, to prevent the integer underflow when extracting
the dummy count in cqspi_tune_phy()?
> +
[ ... ]
> @@ -1543,8 +1606,11 @@ static int cqspi_direct_read_execute(struct cqspi_flash_pdata *f_pdata,
[ ... ]
> if (!cqspi->rx_chan || !virt_addr_valid(buf) ||
> len < CQSPI_PHY_MIN_DIRECT_READ_LEN) {
> cqspi_memcpy_fromio(op, buf, cqspi->ahb_base + from, len);
> return 0;
> }
>
> - return cqspi_direct_read_dma(f_pdata, buf, from, len);
> + if (!cqspi_use_tuned_phy(f_pdata, op, post_config_max_speed_hz))
> + return cqspi_direct_read_dma(f_pdata, buf, from, len);
> +
> + /* Split into unaligned head, aligned middle, unaligned tail */
> + from_aligned = ALIGN(from, 16);
> + to_aligned = ALIGN_DOWN(from + len, 16);
> + len_aligned = to_aligned - from_aligned;
> +
> + if (from != from_aligned) {
> + ret = cqspi_direct_read_dma(f_pdata, buf, from,
> + from_aligned - from);
[Severity: Critical]
If PHY tuning is required to reliably capture data at the maximum frequency
(post_config_max_speed_hz), does bypassing the calibrated capture delays for
unaligned head/tail bytes (or reads < 17 bytes) force the controller to use
untuned static capture delays?
Capturing data at high frequencies without the necessary PHY calibration might
result in shifted data.
To safely support unaligned direct reads with the PHY, should the driver
perform aligned, PHY-enabled reads into a bounce buffer, or strictly lower
the SPI clock frequency prior to reading without the PHY?
> + if (ret)
> + return ret;
> + buf += from_aligned - from;
> + }
> +
> + if (len_aligned) {
> + ret = cqspi_tune_phy(f_pdata, true);
> + if (ret)
> + return ret;
[Severity: High]
If cqspi_wait_idle() fails inside cqspi_tune_phy() and returns an error,
this error is propagated immediately without calling
cqspi_tune_phy(..., false).
Could this leave the hardware registers (like the decremented dummy cycle
count, PHY_EN, and PHY_PIPELINE) in an inconsistent state for subsequent SPI
operations?
Also, will subsequent retries to enable the PHY decrement the dummy count a
second time, cumulatively altering the hardware state?
> + ret = cqspi_direct_read_dma(f_pdata, buf, from_aligned,
> + len_aligned);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=11
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (10 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 11/16] spi: cadence-quadspi: enable PHY for direct reads Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:53 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 13/16] mtd: spinand: extract variant ranking logic into spinand_op_find_best() Santhosh Kumar K
` (3 subsequent siblings)
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Enable PHY for indirect writes of at least CQSPI_PHY_MIN_INDIRECT_WRITE_LEN
bytes. PHY is activated only when tuning completed successfully and the
write op runs at the calibrated post-config frequency, matching the same
frequency guard used by the read path.
Thread max_speed_hz from cqspi_mem_process() through cqspi_write() into
cqspi_indirect_write_execute() for the frequency check.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 32 +++++++++++++++++++++++++++----
1 file changed, 28 insertions(+), 4 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 16e3b843f0aa..df7fcdf404a6 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -151,6 +151,8 @@ struct cqspi_driver_platdata {
#define CQSPI_DLL_TIMEOUT_US 300
/* Minimum transfer length to use DMA for direct reads */
#define CQSPI_PHY_MIN_DIRECT_READ_LEN 17
+/* Minimum indirect write length to amortize PHY enable/disable overhead */
+#define CQSPI_PHY_MIN_INDIRECT_WRITE_LEN SZ_1K
/* Runtime_pm autosuspend delay */
#define CQSPI_AUTOSUSPEND_TIMEOUT 2000
@@ -1240,13 +1242,15 @@ static int cqspi_write_setup(struct cqspi_flash_pdata *f_pdata,
static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
loff_t to_addr, const u8 *txbuf,
- const size_t n_tx)
+ const size_t n_tx,
+ u32 post_config_max_speed_hz)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
struct device *dev = &cqspi->pdev->dev;
void __iomem *reg_base = cqspi->iobase;
unsigned int remaining = n_tx;
unsigned int write_bytes;
+ bool use_tuned_phy_write;
int ret;
if (!refcount_read(&cqspi->refcount))
@@ -1282,6 +1286,18 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
if (cqspi->apb_ahb_hazard)
readl(reg_base + CQSPI_REG_INDIRECTWR);
+ /* Use PHY only for large writes at the calibrated rate */
+ use_tuned_phy_write = n_tx >= CQSPI_PHY_MIN_INDIRECT_WRITE_LEN &&
+ f_pdata->use_tuned_phy &&
+ f_pdata->phy_write_op.max_freq ==
+ post_config_max_speed_hz;
+
+ if (use_tuned_phy_write) {
+ ret = cqspi_tune_phy(f_pdata, true);
+ if (ret)
+ goto failwr;
+ }
+
while (remaining > 0) {
size_t write_words, mod_bytes;
@@ -1330,9 +1346,15 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
cqspi_wait_idle(cqspi);
+ if (use_tuned_phy_write)
+ cqspi_tune_phy(f_pdata, false);
+
return 0;
failwr:
+ if (use_tuned_phy_write)
+ cqspi_tune_phy(f_pdata, false);
+
/* Disable interrupt. */
writel(0, reg_base + CQSPI_REG_IRQMASK);
@@ -1467,7 +1489,8 @@ static void cqspi_configure(struct cqspi_flash_pdata *f_pdata,
}
static ssize_t cqspi_write(struct cqspi_flash_pdata *f_pdata,
- const struct spi_mem_op *op)
+ const struct spi_mem_op *op,
+ u32 post_config_max_speed_hz)
{
struct cqspi_st *cqspi = f_pdata->cqspi;
loff_t to = op->addr.val;
@@ -1494,7 +1517,8 @@ static ssize_t cqspi_write(struct cqspi_flash_pdata *f_pdata,
return cqspi_wait_idle(cqspi);
}
- return cqspi_indirect_write_execute(f_pdata, to, buf, len);
+ return cqspi_indirect_write_execute(f_pdata, to, buf, len,
+ post_config_max_speed_hz);
}
static bool cqspi_use_tuned_phy(struct cqspi_flash_pdata *f_pdata,
@@ -1710,7 +1734,7 @@ static int cqspi_mem_process(struct spi_mem *mem, const struct spi_mem_op *op)
if (!op->addr.nbytes || !op->data.buf.out)
return cqspi_command_write(f_pdata, op);
- return cqspi_write(f_pdata, op);
+ return cqspi_write(f_pdata, op, mem->spi->post_config_max_speed_hz);
}
static int cqspi_exec_mem_op(struct spi_mem *mem, const struct spi_mem_op *op)
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes
2026-06-18 7:37 ` [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes Santhosh Kumar K
@ 2026-06-18 7:53 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 7:53 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: conor+dt, devicetree, robh
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [High] Enabling PHY after starting the indirect write operation guarantees a deadlock and hardware timeout, while the error path compounds the issue with a secondary timeout.
Pre-existing issues:
- [High] Missing PHY disable cleanup when `cqspi_tune_phy(true)` fails during direct read.
--
commit 58e38b1c2aa32f1b8542b6e43dd5365e97bd7dce
Author: Santhosh Kumar K <s-k6@ti.com>
Subject: spi: cadence-quadspi: enable PHY for indirect writes
Enable PHY for indirect writes of at least CQSPI_PHY_MIN_INDIRECT_WRITE_LEN
bytes. PHY is activated only when tuning completed successfully and the
write op runs at the calibrated post-config frequency, matching the same
frequency guard used by the read path.
Thread max_speed_hz from cqspi_mem_process() through cqspi_write() into
cqspi_indirect_write_execute() for the frequency check.
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 057381e56a7fd..f4cf97a909fc5 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
[ ... ]
> @@ -1073,13 +1242,15 @@ static int cqspi_write_setup(struct cqspi_flash_pdata *f_pdata,
>
> static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
> loff_t to_addr, const u8 *txbuf,
> - const size_t n_tx)
> + const size_t n_tx,
> + u32 post_config_max_speed_hz)
> {
> struct cqspi_st *cqspi = f_pdata->cqspi;
> struct device *dev = &cqspi->pdev->dev;
> void __iomem *reg_base = cqspi->iobase;
> unsigned int remaining = n_tx;
> unsigned int write_bytes;
> + bool use_tuned_phy_write;
> int ret;
>
> if (!refcount_read(&cqspi->refcount))
[ ... ]
> @@ -1115,6 +1286,18 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
> if (cqspi->apb_ahb_hazard)
> readl(reg_base + CQSPI_REG_INDIRECTWR);
>
> + /* Use PHY only for large writes at the calibrated rate */
> + use_tuned_phy_write = n_tx >= CQSPI_PHY_MIN_INDIRECT_WRITE_LEN &&
> + f_pdata->use_tuned_phy &&
> + f_pdata->phy_write_op.max_freq ==
> + post_config_max_speed_hz;
> +
> + if (use_tuned_phy_write) {
> + ret = cqspi_tune_phy(f_pdata, true);
> + if (ret)
> + goto failwr;
> + }
[Severity: High]
Does calling cqspi_tune_phy() here guarantee a timeout?
Since cqspi_tune_phy() internally calls cqspi_wait_idle(), and this is
executed after writing CQSPI_REG_INDIRECTWR_START_MASK to start the indirect
write controller, the controller is already busy waiting for data on the AHB
interface.
Will this cause cqspi_wait_idle() to deterministically poll for 500ms and
return -ETIMEDOUT?
> while (remaining > 0) {
> size_t write_words, mod_bytes;
[ ... ]
> @@ -1163,9 +1346,15 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
>
> cqspi_wait_idle(cqspi);
>
> + if (use_tuned_phy_write)
> + cqspi_tune_phy(f_pdata, false);
> +
> return 0;
>
> failwr:
> + if (use_tuned_phy_write)
> + cqspi_tune_phy(f_pdata, false);
> +
[Severity: High]
Could this error path cause a secondary timeout?
Since the indirect write operation is not yet canceled when cqspi_tune_phy()
is called, the controller is still busy.
When cqspi_tune_phy() calls cqspi_wait_idle(), wouldn't it time out again
for another 500ms before we write CQSPI_REG_INDIRECTWR_CANCEL_MASK below?
> /* Disable interrupt. */
> writel(0, reg_base + CQSPI_REG_IRQMASK);
This is a pre-existing issue, but I noticed a similar cleanup problem in
cqspi_direct_read_execute():
drivers/spi/spi-cadence-quadspi.c:cqspi_direct_read_execute() {
...
if (len_aligned) {
ret = cqspi_tune_phy(f_pdata, true);
if (ret)
return ret;
ret = cqspi_direct_read_dma(f_pdata, buf, from_aligned,
len_aligned);
cqspi_tune_phy(f_pdata, false);
...
}
[Severity: High]
If cqspi_tune_phy(f_pdata, true) fails, it returns directly without calling
cqspi_tune_phy(f_pdata, false).
Since cqspi_tune_phy(true) modifies the hardware state by enabling the PHY
in CQSPI_REG_CONFIG and decrementing the read dummy cycles in
CQSPI_REG_RD_INSTR before checking for idle, doesn't returning here leave
the QSPI controller with the PHY partially enabled and an incorrect number
of dummy cycles?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=12
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 13/16] mtd: spinand: extract variant ranking logic into spinand_op_find_best()
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (11 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 12/16] spi: cadence-quadspi: enable PHY for indirect writes Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation Santhosh Kumar K
` (2 subsequent siblings)
15 siblings, 0 replies; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
spinand_select_op_variant() open-codes a loop that finds the fastest
eligible op variant by transfer duration. Extract this into a shared
helper spinand_op_find_best() that accepts a skip_mask bitmask of
already-tried variant indices, enabling callers to iterate variants in
ranked order while skipping previously attempted ones.
spinand_select_op_variant() becomes a one-liner. No functional change.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/nand/spi/core.c | 32 +++++++++++++++++++++++++++-----
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index f86786344d52..b678d0534297 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -1541,9 +1541,22 @@ static int spinand_init_odtr_instruction_set(struct spinand_device *spinand)
return 0;
}
+/*
+ * spinand_op_find_best() - Find the fastest eligible op variant.
+ * @spinand: SPI NAND device
+ * @variants: full variant list to search
+ * @odtr: true to consider ODTR ops, false for SSDR ops
+ * @skip_mask: bitmask of variant indices to skip (already tried)
+ *
+ * Iterates @variants, evaluates transfer duration for each eligible op, and
+ * returns a pointer to the fastest one not in @skip_mask. Returns NULL when
+ * no eligible variant remains. Used by both variant selection at init time
+ * (skip_mask == 0) and ranked PHY tuning iteration.
+ */
static const struct spi_mem_op *
-spinand_select_op_variant(struct spinand_device *spinand, enum spinand_bus_interface iface,
- const struct spinand_op_variants *variants)
+spinand_op_find_best(struct spinand_device *spinand,
+ const struct spinand_op_variants *variants, bool odtr,
+ u32 skip_mask)
{
struct nand_device *nand = spinand_to_nand(spinand);
const struct spi_mem_op *best_variant = NULL;
@@ -1551,15 +1564,16 @@ spinand_select_op_variant(struct spinand_device *spinand, enum spinand_bus_inter
unsigned int i;
for (i = 0; i < variants->nops; i++) {
- struct spi_mem_op op = variants->ops[i];
+ struct spi_mem_op op;
u64 op_duration_ns = 0;
unsigned int nbytes;
int ret;
- if ((iface == SSDR && spinand_op_is_odtr(&op)) ||
- (iface == ODTR && !spinand_op_is_odtr(&op)))
+ if ((skip_mask & BIT(i)) ||
+ spinand_op_is_odtr(&variants->ops[i]) != odtr)
continue;
+ op = variants->ops[i];
nbytes = nanddev_per_page_oobsize(nand) +
nanddev_page_size(nand);
@@ -1588,6 +1602,14 @@ spinand_select_op_variant(struct spinand_device *spinand, enum spinand_bus_inter
return best_variant;
}
+static const struct spi_mem_op *
+spinand_select_op_variant(struct spinand_device *spinand,
+ enum spinand_bus_interface iface,
+ const struct spinand_op_variants *variants)
+{
+ return spinand_op_find_best(spinand, variants, iface == ODTR, 0);
+}
+
/**
* spinand_match_and_init() - Try to find a match between a device ID and an
* entry in a spinand_info table
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (12 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 13/16] mtd: spinand: extract variant ranking logic into spinand_op_find_best() Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 8:02 ` sashiko-bot
2026-06-18 7:37 ` [PATCH v4 15/16] mtd: spi-nor: extract read op template construction into helper Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency Santhosh Kumar K
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Dirmap descriptors encode the op template including the operating
frequency at creation time, so PHY tuning must complete before dirmaps
are created to ensure the validated frequency is embedded in the
descriptors from the start.
Move dirmap creation from spinand_init() to spinand_probe(), after a
new spinand_configure_phy() call that negotiates the best available PHY
operating point. spinand_configure_phy() tries the pre-selected variant
first. If the controller signals that PHY tuning is not applicable for
that op, spinand_try_phy_ranked() iterates remaining variants in
performance order — DTR variants first, then SDR variants after
switching the bus interface if needed. On full failure the device falls
back to the best available non-PHY mode.
Add spinand_reset_max_ops() to copy op templates with max_freq zeroed
before each execute_tuning call, enforcing the invariant that a non-zero
max_freq only results from a successful tuning.
PHY failure is non-fatal; the device operates at the conservative base
rate.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/nand/spi/core.c | 214 ++++++++++++++++++++++++++++++++++--
include/linux/mtd/spinand.h | 11 ++
2 files changed, 216 insertions(+), 9 deletions(-)
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index b678d0534297..5dcfaabaf2cc 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -1280,10 +1280,16 @@ static int spinand_create_dirmap(struct spinand_device *spinand,
/* The plane number is passed in MSB just above the column address */
info.offset = plane << fls(nand->memorg.pagesize);
+ /*
+ * Propagate the validated PHY frequency into the dirmap op templates
+ * at construction time.
+ */
+
/* Write descriptor */
info.length = nanddev_page_size(nand) + nanddev_per_page_oobsize(nand);
info.primary_op_tmpl = *spinand->op_templates->update_cache;
info.primary_op_tmpl.data.ecc = enable_ecc;
+ info.primary_op_tmpl.max_freq = spinand->max_write_op.max_freq;
desc = devm_spi_mem_dirmap_create(&spinand->spimem->spi->dev,
spinand->spimem, &info);
if (IS_ERR(desc))
@@ -1294,9 +1300,11 @@ static int spinand_create_dirmap(struct spinand_device *spinand,
/* Read descriptor */
info.primary_op_tmpl = *spinand->op_templates->read_cache;
info.primary_op_tmpl.data.ecc = enable_ecc;
+ info.primary_op_tmpl.max_freq = spinand->max_read_op.max_freq;
if (secondary_op) {
info.secondary_op_tmpl = *spinand->op_templates->cont_read_cache;
info.secondary_op_tmpl.data.ecc = enable_ecc;
+ info.secondary_op_tmpl.max_freq = spinand->max_read_op.max_freq;
}
desc = spinand_create_rdesc(spinand, &info);
if (IS_ERR(desc))
@@ -1744,6 +1752,17 @@ int spinand_match_and_init(struct spinand_device *spinand,
spinand->cont_read_possible = false;
}
+ /*
+ * Save the full read variant list (ODTR and SSDR ops) for PHY
+ * tuning iteration. Only saved when all ODTR templates are
+ * valid so spinand_configure_phy() knows ranked fallback is
+ * available.
+ */
+ if (spinand->odtr_op_templates.read_cache &&
+ spinand->odtr_op_templates.write_cache &&
+ spinand->odtr_op_templates.update_cache)
+ spinand->phy_read_variants = info->op_variants.read_cache;
+
return 0;
}
@@ -1922,7 +1941,6 @@ static int spinand_mtd_suspend(struct mtd_info *mtd)
static int spinand_init(struct spinand_device *spinand)
{
- struct device *dev = &spinand->spimem->spi->dev;
struct mtd_info *mtd = spinand_to_mtd(spinand);
struct nand_device *nand = mtd_to_nanddev(mtd);
int ret;
@@ -2014,14 +2032,6 @@ static int spinand_init(struct spinand_device *spinand)
mtd->ecc_step_size = nanddev_get_ecc_conf(nand)->step_size;
mtd->bitflip_threshold = DIV_ROUND_UP(mtd->ecc_strength * 3, 4);
- ret = spinand_create_dirmaps(spinand);
- if (ret) {
- dev_err(dev,
- "Failed to create direct mappings for read/write operations (err = %d)\n",
- ret);
- goto err_cleanup_ecc_engine;
- }
-
return 0;
err_cleanup_ecc_engine:
@@ -2050,6 +2060,178 @@ static void spinand_cleanup(struct spinand_device *spinand)
kfree(spinand->scratchbuf);
}
+/*
+ * spinand_try_phy_ranked() - Try PHY tuning on variants in performance order.
+ * @spinand: SPI NAND device
+ * @mem: SPI memory device
+ * @odtr: true to iterate ODTR variants, false for SSDR variants
+ * @tried_mask: bitmask of already-tried variant indices; updated on each try
+ *
+ * Iterates the full read variant list in descending performance order,
+ * skipping variants in @tried_mask, and calls execute_tuning on each until one
+ * succeeds. The fastest PHY-tunable variant is chosen regardless of which was
+ * pre-selected at init; ranked iteration finds the best available variant
+ * without re-trying already-attempted ones.
+ *
+ * On success, sets spinand->max_read_op and updates the matching
+ * odtr_op_templates.read_cache or ssdr_op_templates.read_cache.
+ */
+static bool spinand_try_phy_ranked(struct spinand_device *spinand,
+ struct spi_mem *mem, bool odtr,
+ u32 *tried_mask)
+{
+ const struct spinand_op_variants *variants = spinand->phy_read_variants;
+ const struct spi_mem_op *best;
+ int ret;
+
+ if (!variants)
+ return false;
+
+ while ((best = spinand_op_find_best(spinand, variants, odtr,
+ *tried_mask))) {
+ *tried_mask |= BIT(best - variants->ops);
+ spinand->max_read_op = *best;
+ spinand->max_read_op.max_freq = 0;
+ ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
+ &spinand->max_write_op);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(&mem->spi->dev, "%s PHY tuning failed: %d\n",
+ odtr ? "ODTR" : "SSDR", ret);
+ if (!ret && spinand->max_read_op.max_freq) {
+ if (odtr)
+ spinand->odtr_op_templates.read_cache = best;
+ else
+ spinand->ssdr_op_templates.read_cache = best;
+ return true;
+ }
+ }
+ return false;
+}
+
+/*
+ * spinand_reset_max_ops() - Copy op templates and zero max_freq on both.
+ * @spinand: SPI NAND device
+ * @templates: op template set to copy from
+ *
+ * Called before execute_tuning so max_freq starts at zero; execute_tuning sets
+ * it to the validated clock rate only on success. A non-zero max_freq means
+ * PHY-validated; zero means the base rate applies.
+ */
+static void spinand_reset_max_ops(struct spinand_device *spinand,
+ struct spinand_mem_ops *templates)
+{
+ spinand->max_read_op = *templates->read_cache;
+ spinand->max_read_op.max_freq = 0;
+ spinand->max_write_op = *templates->write_cache;
+ spinand->max_write_op.max_freq = 0;
+}
+
+/*
+ * spinand_configure_phy() - Negotiate the optimal PHY operating point.
+ * @spinand: SPI NAND device
+ * @mem: SPI memory device
+ *
+ * Tries the pre-selected variant first. If the controller signals that
+ * PHY tuning is not applicable for that specific op, iterates all remaining
+ * variants in performance order. For devices that support both DTR and SDR
+ * interfaces, DTR variants are tried first; if all fail the device is
+ * switched to SDR mode and SDR variants are tried. On full failure the
+ * device falls back to the best available non-PHY mode. Devices that
+ * support only SDR skip the DTR ranked pass entirely.
+ *
+ * PHY failure is never fatal.
+ *
+ * Note: tried_mask is u32, supporting up to 32 variants total across both
+ * ODTR and SSDR. Flash devices with more than 32 read variants are not
+ * supported.
+ */
+static void spinand_configure_phy(struct spinand_device *spinand,
+ struct spi_mem *mem)
+{
+ u32 tried_mask;
+ int ret;
+
+ spinand_reset_max_ops(spinand, spinand->op_templates);
+
+ ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
+ &spinand->max_write_op);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(&mem->spi->dev, "Failed to execute PHY tuning: %d\n",
+ ret);
+
+ /*
+ * Any non-zero return or a set max_freq means we are done (error,
+ * unsupported, or success). Fallback only for the op-specific "skip"
+ * signal: ret == 0 with max_freq still 0.
+ */
+ if (ret || spinand->max_read_op.max_freq)
+ return;
+
+ if (!mem->spi->post_config_max_speed_hz || spinand->bus_iface == SSDR ||
+ !spinand->phy_read_variants)
+ return;
+
+ if (WARN_ON(spinand->phy_read_variants->nops > 32))
+ return;
+
+ /* Mark the pre-selected ODTR variant as already tried */
+ tried_mask = BIT(spinand->odtr_op_templates.read_cache -
+ spinand->phy_read_variants->ops);
+
+ dev_dbg(&mem->spi->dev,
+ "PHY tuning skipped for current op; searching for best PHY variant\n");
+
+ /* Pass 1: try all remaining ODTR variants in performance order */
+ if (spinand_try_phy_ranked(spinand, mem, true, &tried_mask))
+ return;
+
+ /*
+ * Pass 2: switch to SSDR and try all SSDR variants in performance
+ * order.
+ *
+ * Only enter if we actually have SSDR support and a reconfigure
+ * callback. The hardware is still in ODTR mode here so no
+ * configure_chip call is needed to undo; just set up the ODTR non-PHY
+ * fallback and return.
+ */
+ if (!spinand->ssdr_op_templates.read_cache ||
+ !spinand->ssdr_op_templates.write_cache ||
+ !spinand->configure_chip)
+ goto use_odtr_non_phy;
+
+ if (spinand->configure_chip(spinand, SSDR))
+ goto use_odtr_non_phy;
+
+ spinand->op_templates = &spinand->ssdr_op_templates;
+ spinand->bus_iface = SSDR;
+ spinand->max_write_op = *spinand->ssdr_op_templates.write_cache;
+ spinand->max_write_op.max_freq = 0;
+
+ /*
+ * Only ODTR variants were candidates in Pass 1; SSDR bit positions
+ * are clear
+ */
+ if (spinand_try_phy_ranked(spinand, mem, false, &tried_mask))
+ return;
+
+ /*
+ * All PHY attempts exhausted. Revert to ODTR for non-PHY DTR
+ * operation. If revert fails, stay in SSDR — a mode mismatch
+ * (ODTR op templates on SSDR-mode device) would corrupt data.
+ */
+ if (spinand->configure_chip(spinand, ODTR)) {
+ dev_warn(&mem->spi->dev,
+ "Failed to revert to ODTR, staying in SSDR non-PHY\n");
+ spinand_reset_max_ops(spinand, &spinand->ssdr_op_templates);
+ return;
+ }
+
+use_odtr_non_phy:
+ spinand->op_templates = &spinand->odtr_op_templates;
+ spinand->bus_iface = ODTR;
+ spinand_reset_max_ops(spinand, &spinand->odtr_op_templates);
+}
+
static int spinand_probe(struct spi_mem *mem)
{
struct spinand_device *spinand;
@@ -2072,6 +2254,20 @@ static int spinand_probe(struct spi_mem *mem)
if (ret)
return ret;
+ /*
+ * Negotiate the best PHY operating point before creating dirmaps so
+ * the validated frequency is available at dirmap construction time.
+ */
+ spinand_configure_phy(spinand, mem);
+
+ ret = spinand_create_dirmaps(spinand);
+ if (ret) {
+ dev_err(&mem->spi->dev,
+ "Failed to create direct mappings for read/write operations (err = %d)\n",
+ ret);
+ goto err_spinand_cleanup;
+ }
+
ret = mtd_device_register(mtd, NULL, 0);
if (ret)
goto err_spinand_cleanup;
diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
index ec6efcfeef83..50a8319cf11e 100644
--- a/include/linux/mtd/spinand.h
+++ b/include/linux/mtd/spinand.h
@@ -791,8 +791,19 @@ struct spinand_device {
struct spinand_mem_ops *op_templates;
enum spinand_bus_interface bus_iface;
+ /*
+ * Full read variant list (ODTR and SSDR ops together), saved when ODTR
+ * templates are valid. Used by spinand_configure_phy() to iterate all
+ * candidates when the pre-selected variant cannot be PHY-tuned.
+ */
+ const struct spinand_op_variants *phy_read_variants;
+
struct spinand_dirmap *dirmaps;
+ /* Persistent op templates updated by execute_tuning with validated speed. */
+ struct spi_mem_op max_read_op;
+ struct spi_mem_op max_write_op;
+
int (*select_target)(struct spinand_device *spinand,
unsigned int target);
unsigned int cur_target;
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation
2026-06-18 7:37 ` [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation Santhosh Kumar K
@ 2026-06-18 8:02 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 8:02 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: robh, conor+dt, devicetree
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Critical] Fallback to SSDR during PHY tuning does not invalidate ODTR templates, causing a mode mismatch and persistent flash failure upon system resume.
- [High] Fallback PHY tuning downgrades only the primary read operation, leaving write, update, and continuous read operations at their original unvalidated high-speed variants.
- [High] Missing reset of `max_freq` on PHY tuning error paths and across tuning loops violates the zero-frequency invariant, leading to operations at unvalidated frequencies.
--
commit 0f63572c37a35621b6ca824014ba943d4c511e93
Author: Santhosh Kumar K <s-k6@ti.com>
mtd: spinand: negotiate optimal PHY operating point before dirmap creation
This commit moves dirmap creation to spinand_probe() and introduces PHY
tuning negotiation before dirmaps are created. It iterates through DTR
and SDR variants in descending performance order to find the best PHY
operating point, and falls back to non-PHY modes on complete failure.
> diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> index b678d05342979..5dcfaabaf2cc5 100644
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -1294,9 +1300,11 @@ static int spinand_create_dirmap(struct spinand_device *spinand,
> /* Read descriptor */
> info.primary_op_tmpl = *spinand->op_templates->read_cache;
> info.primary_op_tmpl.data.ecc = enable_ecc;
> + info.primary_op_tmpl.max_freq = spinand->max_read_op.max_freq;
> if (secondary_op) {
> info.secondary_op_tmpl = *spinand->op_templates->cont_read_cache;
> info.secondary_op_tmpl.data.ecc = enable_ecc;
> + info.secondary_op_tmpl.max_freq = spinand->max_read_op.max_freq;
[Severity: High]
When the fallback tuning negotiates a slower read variant, the read_cache
is updated but the cont_read_cache, update_cache, and write_cache are not
downgraded.
Does this result in the uncalibrated high-speed cont_read_cache and
update_cache variants being embedded into the dirmaps using the max_freq
of the downgraded read variant?
Could this lead to I/O failures or data corruption when the controller
executes these uncalibrated high-speed variants?
> }
> desc = spinand_create_rdesc(spinand, &info);
[ ... ]
> +static bool spinand_try_phy_ranked(struct spinand_device *spinand,
> + struct spi_mem *mem, bool odtr,
> + u32 *tried_mask)
> +{
> + const struct spinand_op_variants *variants = spinand->phy_read_variants;
> + const struct spi_mem_op *best;
> + int ret;
> +
> + if (!variants)
> + return false;
> +
> + while ((best = spinand_op_find_best(spinand, variants, odtr,
> + *tried_mask))) {
> + *tried_mask |= BIT(best - variants->ops);
> + spinand->max_read_op = *best;
> + spinand->max_read_op.max_freq = 0;
> + ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
> + &spinand->max_write_op);
[Severity: High]
The loop zeroes out spinand->max_read_op.max_freq before each iteration,
but it does not reset spinand->max_write_op.max_freq.
If spi_mem_execute_tuning() mutates max_write_op.max_freq before failing,
does this leaked non-zero frequency propagate to subsequent iterations?
> + if (ret && ret != -EOPNOTSUPP)
> + dev_warn(&mem->spi->dev, "%s PHY tuning failed: %d\n",
> + odtr ? "ODTR" : "SSDR", ret);
> + if (!ret && spinand->max_read_op.max_freq) {
> + if (odtr)
> + spinand->odtr_op_templates.read_cache = best;
> + else
> + spinand->ssdr_op_templates.read_cache = best;
> + return true;
> + }
> + }
> + return false;
> +}
[ ... ]
> +static void spinand_configure_phy(struct spinand_device *spinand,
> + struct spi_mem *mem)
> +{
> + u32 tried_mask;
> + int ret;
> +
> + spinand_reset_max_ops(spinand, spinand->op_templates);
> +
> + ret = spi_mem_execute_tuning(mem, &spinand->max_read_op,
> + &spinand->max_write_op);
> + if (ret && ret != -EOPNOTSUPP)
> + dev_warn(&mem->spi->dev, "Failed to execute PHY tuning: %d\n",
> + ret);
> +
> + /*
> + * Any non-zero return or a set max_freq means we are done (error,
> + * unsupported, or success). Fallback only for the op-specific "skip"
> + * signal: ret == 0 with max_freq still 0.
> + */
> + if (ret || spinand->max_read_op.max_freq)
> + return;
[Severity: High]
If the initial tuning call fails and returns early here, does it leave the
potentially mutated max_write_op.max_freq without resetting it?
Could this leaked frequency be embedded into dirmap descriptors and cause
operations to run at unvalidated frequencies?
> +
> + if (!mem->spi->post_config_max_speed_hz || spinand->bus_iface == SSDR ||
> + !spinand->phy_read_variants)
> + return;
[ ... ]
> + /*
> + * Pass 2: switch to SSDR and try all SSDR variants in performance
> + * order.
> + *
> + * Only enter if we actually have SSDR support and a reconfigure
> + * callback. The hardware is still in ODTR mode here so no
> + * configure_chip call is needed to undo; just set up the ODTR non-PHY
> + * fallback and return.
> + */
> + if (!spinand->ssdr_op_templates.read_cache ||
> + !spinand->ssdr_op_templates.write_cache ||
> + !spinand->configure_chip)
> + goto use_odtr_non_phy;
> +
> + if (spinand->configure_chip(spinand, SSDR))
> + goto use_odtr_non_phy;
> +
> + spinand->op_templates = &spinand->ssdr_op_templates;
> + spinand->bus_iface = SSDR;
> + spinand->max_write_op = *spinand->ssdr_op_templates.write_cache;
> + spinand->max_write_op.max_freq = 0;
[Severity: Critical]
When dynamically falling back to SSDR mode, spinand->odtr_op_templates is
not cleared.
When the system resumes from suspend, spinand_configure_chip() uses the
uncleared ODTR templates to determine the mode:
if (spinand->odtr_op_templates.read_cache &&
spinand->odtr_op_templates.write_cache &&
spinand->odtr_op_templates.update_cache)
odtr = true;
Will this cause the flash chip to be incorrectly reconfigured back to ODTR
mode upon resume, while the statically created dirmaps still contain SSDR
commands?
Can this command desynchronization cause I/O failures after the device
resumes?
> +
> + /*
> + * Only ODTR variants were candidates in Pass 1; SSDR bit positions
> + * are clear
> + */
> + if (spinand_try_phy_ranked(spinand, mem, false, &tried_mask))
> + return;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=14
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 15/16] mtd: spi-nor: extract read op template construction into helper
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (13 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 14/16] mtd: spinand: negotiate optimal PHY operating point before dirmap creation Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 7:37 ` [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency Santhosh Kumar K
15 siblings, 0 replies; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
From: Pratyush Yadav <pratyush@kernel.org>
spi_nor_spimem_read_data() and spi_nor_create_read_dirmap() both
open-code the same sequence: construct a SPI_MEM_OP, call
spi_nor_spimem_setup_op(), convert dummy cycles to bytes, and adjust
for DTR. The dirmap path also needed an explicit data buswidth fixup
because spi_nor_spimem_setup_op() skips buswidth when data.nbytes is
zero.
Extract this into spi_nor_spimem_get_read_op(). Using data.nbytes = 2
as a non-zero placeholder lets spi_nor_spimem_setup_op() configure the
data buswidth without a separate override; callers replace data.nbytes
with the actual transfer length before use. No functional change.
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/spi-nor/core.c | 66 +++++++++++++++++++++-----------------
1 file changed, 36 insertions(+), 30 deletions(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index ccf4396cdcd0..b683c077a233 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -188,6 +188,37 @@ static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
return nor->controller_ops->erase(nor, offs);
}
+/**
+ * spi_nor_spimem_get_read_op() - build a configured read op template
+ * @nor: the spi-nor device
+ *
+ * Returns a spi_mem_op with the command, address format, dummy cycles,
+ * and data buswidth configured for @nor. For direct reads, the caller
+ * must fill in addr.val, data.nbytes, and data.buf.in before use.
+ */
+static struct spi_mem_op spi_nor_spimem_get_read_op(struct spi_nor *nor)
+{
+ /*
+ * data.nbytes must be non-zero so spi_nor_spimem_setup_op()
+ * configures the data buswidth; callers replace it with the
+ * actual transfer length.
+ */
+ struct spi_mem_op op =
+ SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
+ SPI_MEM_OP_ADDR(nor->addr_nbytes, 0, 0),
+ SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
+ SPI_MEM_OP_DATA_IN(2, NULL, 0));
+
+ spi_nor_spimem_setup_op(nor, &op, nor->read_proto);
+
+ /* convert the dummy cycles to the number of bytes */
+ op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
+ if (spi_nor_protocol_is_dtr(nor->read_proto))
+ op.dummy.nbytes *= 2;
+
+ return op;
+}
+
/**
* spi_nor_spimem_read_data() - read data from flash's memory region via
* spi-mem
@@ -201,21 +232,14 @@ static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
static ssize_t spi_nor_spimem_read_data(struct spi_nor *nor, loff_t from,
size_t len, u8 *buf)
{
- struct spi_mem_op op =
- SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
- SPI_MEM_OP_ADDR(nor->addr_nbytes, from, 0),
- SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
- SPI_MEM_OP_DATA_IN(len, buf, 0));
+ struct spi_mem_op op = spi_nor_spimem_get_read_op(nor);
bool usebouncebuf;
ssize_t nbytes;
int error;
- spi_nor_spimem_setup_op(nor, &op, nor->read_proto);
-
- /* convert the dummy cycles to the number of bytes */
- op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8;
- if (spi_nor_protocol_is_dtr(nor->read_proto))
- op.dummy.nbytes *= 2;
+ op.addr.val = from;
+ op.data.nbytes = len;
+ op.data.buf.in = buf;
usebouncebuf = spi_nor_spimem_bounce(nor, &op);
@@ -3712,28 +3736,10 @@ static int spi_nor_create_read_dirmap(struct spi_nor *nor)
{
struct spi_mem_dirmap_info info = {
.op_tmpl = &info.primary_op_tmpl,
- .primary_op_tmpl = SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 0),
- SPI_MEM_OP_ADDR(nor->addr_nbytes, 0, 0),
- SPI_MEM_OP_DUMMY(nor->read_dummy, 0),
- SPI_MEM_OP_DATA_IN(0, NULL, 0)),
+ .primary_op_tmpl = spi_nor_spimem_get_read_op(nor),
.offset = 0,
.length = nor->params->size,
};
- struct spi_mem_op *op = info.op_tmpl;
-
- spi_nor_spimem_setup_op(nor, op, nor->read_proto);
-
- /* convert the dummy cycles to the number of bytes */
- op->dummy.nbytes = (nor->read_dummy * op->dummy.buswidth) / 8;
- if (spi_nor_protocol_is_dtr(nor->read_proto))
- op->dummy.nbytes *= 2;
-
- /*
- * Since spi_nor_spimem_setup_op() only sets buswidth when the number
- * of data bytes is non-zero, the data buswidth won't be set here. So,
- * do it explicitly.
- */
- op->data.buswidth = spi_nor_get_protocol_data_nbits(nor->read_proto);
nor->dirmap.rdesc = devm_spi_mem_dirmap_create(nor->dev, nor->spimem,
&info);
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency
2026-06-18 7:37 [PATCH v4 00/16] spi: cadence-quadspi: add PHY tuning support Santhosh Kumar K
` (14 preceding siblings ...)
2026-06-18 7:37 ` [PATCH v4 15/16] mtd: spi-nor: extract read op template construction into helper Santhosh Kumar K
@ 2026-06-18 7:37 ` Santhosh Kumar K
2026-06-18 8:01 ` sashiko-bot
15 siblings, 1 reply; 29+ messages in thread
From: Santhosh Kumar K @ 2026-06-18 7:37 UTC (permalink / raw)
To: broonie, robh, krzk+dt, conor+dt, miquel.raynal, richard,
vigneshr, pratyush, mwalle, takahiro.kuwano
Cc: linux-spi, devicetree, linux-kernel, linux-mtd, praneeth,
u-kumar1, a-dutta, s-k6
Run PHY tuning in spi_nor_probe() before creating dirmaps so the
validated frequency is available at dirmap construction time.
Store the configured read op template in nor->max_read_op and pass it
to spi_mem_execute_tuning(). On success the controller sets
max_read_op.max_freq to the calibrated rate. spi_nor_spimem_get_read_op()
propagates nor->max_read_op.max_freq into every op it returns, so the
validated frequency flows automatically into the dirmap template and into
regular read ops.
Tuning failure is non-fatal; the device operates at the conservative
base rate.
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
---
drivers/mtd/spi-nor/core.c | 14 ++++++++++++++
include/linux/mtd/spi-nor.h | 3 +++
2 files changed, 17 insertions(+)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index b683c077a233..a2a2cf3d3603 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -216,6 +216,9 @@ static struct spi_mem_op spi_nor_spimem_get_read_op(struct spi_nor *nor)
if (spi_nor_protocol_is_dtr(nor->read_proto))
op.dummy.nbytes *= 2;
+ /* Propagate the validated frequency; zero before tuning. */
+ op.max_freq = nor->max_read_op.max_freq;
+
return op;
}
@@ -3843,6 +3846,17 @@ static int spi_nor_probe(struct spi_mem *spimem)
return -ENOMEM;
}
+ /*
+ * Populate the persistent template and run PHY tuning before dirmap
+ * creation so the validated frequency feeds into the dirmap op.
+ * Tuning failure is non-fatal; the device operates at base speed.
+ */
+ nor->max_read_op = spi_nor_spimem_get_read_op(nor);
+
+ ret = spi_mem_execute_tuning(spimem, &nor->max_read_op, NULL);
+ if (ret && ret != -EOPNOTSUPP)
+ dev_warn(dev, "Failed to execute PHY tuning: %d\n", ret);
+
ret = spi_nor_create_read_dirmap(nor);
if (ret)
return ret;
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index 4b92494827b1..ab498a50f15f 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -422,6 +422,9 @@ struct spi_nor {
struct spi_mem_dirmap_desc *wdesc;
} dirmap;
+ /* Persistent op template updated by execute_tuning with validated speed. */
+ struct spi_mem_op max_read_op;
+
void *priv;
};
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency
2026-06-18 7:37 ` [PATCH v4 16/16] mtd: spi-nor: run PHY tuning after init and update dirmap frequency Santhosh Kumar K
@ 2026-06-18 8:01 ` sashiko-bot
0 siblings, 0 replies; 29+ messages in thread
From: sashiko-bot @ 2026-06-18 8:01 UTC (permalink / raw)
To: Santhosh Kumar K; +Cc: devicetree, conor+dt, robh
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] If probe steps fail after device initialization, `spi_nor_probe()` returns without restoring hardware state, leaving the device in an inconsistent configuration across soft reboots.
--
commit 7ac575a84edff91451a5b41913132dbcedb404cd
Author: Santhosh Kumar K <s-k6@ti.com>
mtd: spi-nor: run PHY tuning after init and update dirmap frequency
Run PHY tuning in spi_nor_probe() before creating dirmaps so the validated
frequency is available at dirmap construction time, falling back to base
rate on failure.
> diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
> index 2c9859fb07949..e9e66d28fa52a 100644
> --- a/drivers/mtd/spi-nor/core.c
> +++ b/drivers/mtd/spi-nor/core.c
[ ... ]
> @@ -3773,6 +3776,17 @@ static int spi_nor_probe(struct spi_mem *spimem)
> return -ENOMEM;
> }
>
> + /*
> + * Populate the persistent template and run PHY tuning before dirmap
> + * creation so the validated frequency feeds into the dirmap op.
> + * Tuning failure is non-fatal; the device operates at base speed.
> + */
> + nor->max_read_op = spi_nor_spimem_get_read_op(nor);
> +
> + ret = spi_mem_execute_tuning(spimem, &nor->max_read_op, NULL);
> + if (ret && ret != -EOPNOTSUPP)
> + dev_warn(dev, "Failed to execute PHY tuning: %d\n", ret);
> +
> ret = spi_nor_create_read_dirmap(nor);
> if (ret)
> return ret;
[Severity: High]
This isn't a bug introduced by this patch, but does spi_nor_probe() fail to
restore the hardware state on error paths after device initialization?
Looking at the error returns here for spi_nor_create_read_dirmap() and
subsequent calls like spi_nor_create_write_dirmap() or mtd_device_register(),
the function returns the error code directly without calling spi_nor_restore().
Because spi_nor_scan() executes spi_nor_init() earlier in the probe to
modify the physical flash hardware state (such as enabling 4-byte address
mode or Octal DTR mode), returning without unwinding these changes leaves
the flash in a non-default mode.
Unlike spi_nor_remove() and spi_nor_shutdown() which explicitly call
spi_nor_restore() to revert the flash to its default 3-byte/1-1-1 state,
this error path lacks teardown, and spi_nor_restore() is not registered as
a devm action.
If probe fails here, could this prevent the system's ROM bootloader from
reading the flash after a soft reboot, effectively bricking the device
until a hard power cycle?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260618073725.84733-1-s-k6@ti.com?part=16
^ permalink raw reply [flat|nested] 29+ messages in thread