* [PATCH v2 0/4] Improve error handling for qnap-mcu transfers
@ 2025-11-05 23:47 Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 1/4] mfd: qnap-mcu: Calculate the checksum on the actual number of bytes received Heiko Stuebner
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Heiko Stuebner @ 2025-11-05 23:47 UTC (permalink / raw)
To: lee; +Cc: linux-kernel, heiko
Digging deeper into how that MCU behaves, I found out it can return
more status codes than the "@0" for "ok".
The additional codes can report a failed checksum verification and
some "general" error for the command execution.
This also explains sporadic command timeout messages, I have seen
over time, when the controller sends an error code while we expect
a longer reply from it.
So while I'm not sure yet why it reports an error, with these changes
we at least handle the error return code in a meaningful way.
changes in v2:
- rebase on top of 6.18-rc1
- include improvement suggestions from Lee
Heiko Stuebner (4):
mfd: qnap-mcu: Calculate the checksum on the actual number of bytes
received
mfd: qnap-mcu: Use EPROTO in stead of EIO on checksum errors
mfd: qnap-mcu: Move checksum verification to its own function
mfd: qnap-mcu: Add proper error handling for command errors
drivers/mfd/qnap-mcu.c | 78 +++++++++++++++++++++++++++++++++++++++---
1 file changed, 73 insertions(+), 5 deletions(-)
--
2.47.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/4] mfd: qnap-mcu: Calculate the checksum on the actual number of bytes received
2025-11-05 23:47 [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Heiko Stuebner
@ 2025-11-05 23:47 ` Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 2/4] mfd: qnap-mcu: Use EPROTO in stead of EIO on checksum errors Heiko Stuebner
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Heiko Stuebner @ 2025-11-05 23:47 UTC (permalink / raw)
To: lee; +Cc: linux-kernel, heiko
In the case of an error message, the number of received bytes can be
less than originally expected but still contain a valid message.
If the transfer itself ended in an error we would exit earlier already.
So calculate the checksum on the number of received bytes and not the
number of expected bytes.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---
drivers/mfd/qnap-mcu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/mfd/qnap-mcu.c b/drivers/mfd/qnap-mcu.c
index 2be429a50611..e3210541ec56 100644
--- a/drivers/mfd/qnap-mcu.c
+++ b/drivers/mfd/qnap-mcu.c
@@ -175,8 +175,8 @@ int qnap_mcu_exec(struct qnap_mcu *mcu,
return -ETIMEDOUT;
}
- crc = qnap_mcu_csum(rx, reply_data_size);
- if (crc != rx[reply_data_size]) {
+ crc = qnap_mcu_csum(rx, reply->received - QNAP_MCU_CHECKSUM_SIZE);
+ if (crc != rx[reply->received - QNAP_MCU_CHECKSUM_SIZE]) {
dev_err(&mcu->serdev->dev, "Invalid Checksum received\n");
return -EIO;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 2/4] mfd: qnap-mcu: Use EPROTO in stead of EIO on checksum errors
2025-11-05 23:47 [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 1/4] mfd: qnap-mcu: Calculate the checksum on the actual number of bytes received Heiko Stuebner
@ 2025-11-05 23:47 ` Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 3/4] mfd: qnap-mcu: Move checksum verification to its own function Heiko Stuebner
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Heiko Stuebner @ 2025-11-05 23:47 UTC (permalink / raw)
To: lee; +Cc: linux-kernel, heiko
EPROTO stands for protocol error and a lot of driver already use it
to designate errors in the sent or received data from a peripheral.
So use it in the qnap-mcu as well for checksum errors.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---
drivers/mfd/qnap-mcu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mfd/qnap-mcu.c b/drivers/mfd/qnap-mcu.c
index e3210541ec56..b4b630f7d413 100644
--- a/drivers/mfd/qnap-mcu.c
+++ b/drivers/mfd/qnap-mcu.c
@@ -178,7 +178,7 @@ int qnap_mcu_exec(struct qnap_mcu *mcu,
crc = qnap_mcu_csum(rx, reply->received - QNAP_MCU_CHECKSUM_SIZE);
if (crc != rx[reply->received - QNAP_MCU_CHECKSUM_SIZE]) {
dev_err(&mcu->serdev->dev, "Invalid Checksum received\n");
- return -EIO;
+ return -EPROTO;
}
memcpy(reply_data, rx, reply_data_size);
--
2.47.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 3/4] mfd: qnap-mcu: Move checksum verification to its own function
2025-11-05 23:47 [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 1/4] mfd: qnap-mcu: Calculate the checksum on the actual number of bytes received Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 2/4] mfd: qnap-mcu: Use EPROTO in stead of EIO on checksum errors Heiko Stuebner
@ 2025-11-05 23:47 ` Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 4/4] mfd: qnap-mcu: Add proper error handling for command errors Heiko Stuebner
2025-11-20 10:15 ` [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Lee Jones
4 siblings, 0 replies; 7+ messages in thread
From: Heiko Stuebner @ 2025-11-05 23:47 UTC (permalink / raw)
To: lee; +Cc: linux-kernel, heiko
We'll need the checksum check in a second place in the future, so
move the verification code to a separate function.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---
drivers/mfd/qnap-mcu.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/mfd/qnap-mcu.c b/drivers/mfd/qnap-mcu.c
index b4b630f7d413..cd836bdd44a8 100644
--- a/drivers/mfd/qnap-mcu.c
+++ b/drivers/mfd/qnap-mcu.c
@@ -78,6 +78,13 @@ static u8 qnap_mcu_csum(const u8 *buf, size_t size)
return csum;
}
+static bool qnap_mcu_verify_checksum(const u8 *buf, size_t size)
+{
+ u8 crc = qnap_mcu_csum(buf, size - QNAP_MCU_CHECKSUM_SIZE);
+
+ return crc == buf[size - QNAP_MCU_CHECKSUM_SIZE];
+}
+
static int qnap_mcu_write(struct qnap_mcu *mcu, const u8 *data, u8 data_size)
{
unsigned char tx[QNAP_MCU_TX_BUFFER_SIZE];
@@ -150,7 +157,6 @@ int qnap_mcu_exec(struct qnap_mcu *mcu,
size_t length = reply_data_size + QNAP_MCU_CHECKSUM_SIZE;
struct qnap_mcu_reply *reply = &mcu->reply;
int ret = 0;
- u8 crc;
if (length > sizeof(rx)) {
dev_err(&mcu->serdev->dev, "expected data too big for receive buffer");
@@ -175,8 +181,7 @@ int qnap_mcu_exec(struct qnap_mcu *mcu,
return -ETIMEDOUT;
}
- crc = qnap_mcu_csum(rx, reply->received - QNAP_MCU_CHECKSUM_SIZE);
- if (crc != rx[reply->received - QNAP_MCU_CHECKSUM_SIZE]) {
+ if (!qnap_mcu_verify_checksum(rx, reply->received)) {
dev_err(&mcu->serdev->dev, "Invalid Checksum received\n");
return -EPROTO;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 4/4] mfd: qnap-mcu: Add proper error handling for command errors
2025-11-05 23:47 [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Heiko Stuebner
` (2 preceding siblings ...)
2025-11-05 23:47 ` [PATCH v2 3/4] mfd: qnap-mcu: Move checksum verification to its own function Heiko Stuebner
@ 2025-11-05 23:47 ` Heiko Stuebner
2025-11-13 16:03 ` Lee Jones
2025-11-20 10:15 ` [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Lee Jones
4 siblings, 1 reply; 7+ messages in thread
From: Heiko Stuebner @ 2025-11-05 23:47 UTC (permalink / raw)
To: lee; +Cc: linux-kernel, heiko
Further investigation revealed that the MCU in QNAP devices may return
two error states. One "@8" for a checksum error in the submitted command
and one "@9" for any generic (and sadly unspecified) error.
These error codes with 2 data character can of course also be shorter
then the expected reply length for the submitted command, so we'll
need to check the received data for error codes and exit the receive
portion early in that case.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---
drivers/mfd/qnap-mcu.c | 65 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 64 insertions(+), 1 deletion(-)
diff --git a/drivers/mfd/qnap-mcu.c b/drivers/mfd/qnap-mcu.c
index cd836bdd44a8..8c5eb4a72829 100644
--- a/drivers/mfd/qnap-mcu.c
+++ b/drivers/mfd/qnap-mcu.c
@@ -19,6 +19,7 @@
/* The longest command found so far is 5 bytes long */
#define QNAP_MCU_MAX_CMD_SIZE 5
#define QNAP_MCU_MAX_DATA_SIZE 36
+#define QNAP_MCU_ERROR_SIZE 2
#define QNAP_MCU_CHECKSUM_SIZE 1
#define QNAP_MCU_RX_BUFFER_SIZE \
@@ -103,6 +104,47 @@ static int qnap_mcu_write(struct qnap_mcu *mcu, const u8 *data, u8 data_size)
return serdev_device_write(mcu->serdev, tx, length, HZ);
}
+static bool qnap_mcu_is_error_msg(size_t size) {
+ return (size == QNAP_MCU_ERROR_SIZE + QNAP_MCU_CHECKSUM_SIZE);
+}
+
+static bool qnap_mcu_reply_is_generic_error(unsigned char *buf, size_t size)
+{
+ if (!qnap_mcu_is_error_msg(size))
+ return false;
+
+ if (buf[0] == '@' && buf[1] == '9')
+ return true;
+
+ return false;
+}
+
+static bool qnap_mcu_reply_is_checksum_error(unsigned char *buf, size_t size)
+{
+ if (!qnap_mcu_is_error_msg(size))
+ return false;
+
+ if (buf[0] == '@' && buf[1] == '8')
+ return true;
+
+ return false;
+}
+
+static bool qnap_mcu_reply_is_any_error(struct qnap_mcu *mcu, unsigned char *buf, size_t size)
+{
+ if (qnap_mcu_reply_is_generic_error(buf, size)) {
+ dev_err(&mcu->serdev->dev, "Controller sent generic error response\n");
+ return true;
+ }
+
+ if (qnap_mcu_reply_is_checksum_error(buf, size)) {
+ dev_err(&mcu->serdev->dev, "Controller received invalid checksum for the command\n");
+ return true;
+ }
+
+ return false;
+}
+
static size_t qnap_mcu_receive_buf(struct serdev_device *serdev, const u8 *buf, size_t size)
{
struct device *dev = &serdev->dev;
@@ -136,6 +178,24 @@ static size_t qnap_mcu_receive_buf(struct serdev_device *serdev, const u8 *buf,
}
}
+ /*
+ * We received everything the uart had to offer for now.
+ * This could mean that either the uart will send more in a 2nd
+ * receive run, or that the MCU cut the reply short because it
+ * sent an error code instead of the expected reply.
+ *
+ * So check if the received data has the correct size for an error
+ * reply and if it matches, is an actual error code.
+ */
+ if (qnap_mcu_is_error_msg(reply->received) &&
+ qnap_mcu_verify_checksum(reply->data, reply->received) &&
+ qnap_mcu_reply_is_any_error(mcu, reply->data, reply->received)) {
+ /* The reply was an error code, we're done */
+ reply->length = 0;
+
+ complete(&reply->done);
+ }
+
/*
* The only way to get out of the above loop and end up here
* is through consuming all of the supplied data, so here we
@@ -182,10 +242,13 @@ int qnap_mcu_exec(struct qnap_mcu *mcu,
}
if (!qnap_mcu_verify_checksum(rx, reply->received)) {
- dev_err(&mcu->serdev->dev, "Invalid Checksum received\n");
+ dev_err(&mcu->serdev->dev, "Invalid Checksum received from controller\n");
return -EPROTO;
}
+ if (qnap_mcu_reply_is_any_error(mcu, rx, reply->received))
+ return -EPROTO;
+
memcpy(reply_data, rx, reply_data_size);
return 0;
--
2.47.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 4/4] mfd: qnap-mcu: Add proper error handling for command errors
2025-11-05 23:47 ` [PATCH v2 4/4] mfd: qnap-mcu: Add proper error handling for command errors Heiko Stuebner
@ 2025-11-13 16:03 ` Lee Jones
0 siblings, 0 replies; 7+ messages in thread
From: Lee Jones @ 2025-11-13 16:03 UTC (permalink / raw)
To: Heiko Stuebner; +Cc: linux-kernel
On Thu, 06 Nov 2025, Heiko Stuebner wrote:
> Further investigation revealed that the MCU in QNAP devices may return
> two error states. One "@8" for a checksum error in the submitted command
> and one "@9" for any generic (and sadly unspecified) error.
>
> These error codes with 2 data character can of course also be shorter
> then the expected reply length for the submitted command, so we'll
> need to check the received data for error codes and exit the receive
> portion early in that case.
>
> Signed-off-by: Heiko Stuebner <heiko@sntech.de>
> ---
> drivers/mfd/qnap-mcu.c | 65 +++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 64 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mfd/qnap-mcu.c b/drivers/mfd/qnap-mcu.c
> index cd836bdd44a8..8c5eb4a72829 100644
> --- a/drivers/mfd/qnap-mcu.c
> +++ b/drivers/mfd/qnap-mcu.c
> @@ -19,6 +19,7 @@
> /* The longest command found so far is 5 bytes long */
> #define QNAP_MCU_MAX_CMD_SIZE 5
> #define QNAP_MCU_MAX_DATA_SIZE 36
> +#define QNAP_MCU_ERROR_SIZE 2
> #define QNAP_MCU_CHECKSUM_SIZE 1
>
> #define QNAP_MCU_RX_BUFFER_SIZE \
> @@ -103,6 +104,47 @@ static int qnap_mcu_write(struct qnap_mcu *mcu, const u8 *data, u8 data_size)
> return serdev_device_write(mcu->serdev, tx, length, HZ);
> }
>
> +static bool qnap_mcu_is_error_msg(size_t size) {
Looks like you forgot to run checkpatch.pl.
> + return (size == QNAP_MCU_ERROR_SIZE + QNAP_MCU_CHECKSUM_SIZE);
> +}
[...]
--
Lee Jones [李琼斯]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 0/4] Improve error handling for qnap-mcu transfers
2025-11-05 23:47 [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Heiko Stuebner
` (3 preceding siblings ...)
2025-11-05 23:47 ` [PATCH v2 4/4] mfd: qnap-mcu: Add proper error handling for command errors Heiko Stuebner
@ 2025-11-20 10:15 ` Lee Jones
4 siblings, 0 replies; 7+ messages in thread
From: Lee Jones @ 2025-11-20 10:15 UTC (permalink / raw)
To: lee, Heiko Stuebner; +Cc: linux-kernel
On Thu, 06 Nov 2025 00:47:00 +0100, Heiko Stuebner wrote:
> Digging deeper into how that MCU behaves, I found out it can return
> more status codes than the "@0" for "ok".
>
> The additional codes can report a failed checksum verification and
> some "general" error for the command execution.
>
> This also explains sporadic command timeout messages, I have seen
> over time, when the controller sends an error code while we expect
> a longer reply from it.
>
> [...]
Applied, thanks!
[1/4] mfd: qnap-mcu: Calculate the checksum on the actual number of bytes received
commit: b4881070a02b017aea84592c424d5a980ed261c4
[2/4] mfd: qnap-mcu: Use EPROTO in stead of EIO on checksum errors
commit: c94fce30e190555d74e2769b5fe4a932d0ad432e
[3/4] mfd: qnap-mcu: Move checksum verification to its own function
commit: c3223f562586307b1bcb014475d0b71913972145
[4/4] mfd: qnap-mcu: Add proper error handling for command errors
commit: 56c1245d51faab70bf68cc3a5cd3925768e6375b
--
Lee Jones [李琼斯]
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-11-20 10:15 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-05 23:47 [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 1/4] mfd: qnap-mcu: Calculate the checksum on the actual number of bytes received Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 2/4] mfd: qnap-mcu: Use EPROTO in stead of EIO on checksum errors Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 3/4] mfd: qnap-mcu: Move checksum verification to its own function Heiko Stuebner
2025-11-05 23:47 ` [PATCH v2 4/4] mfd: qnap-mcu: Add proper error handling for command errors Heiko Stuebner
2025-11-13 16:03 ` Lee Jones
2025-11-20 10:15 ` [PATCH v2 0/4] Improve error handling for qnap-mcu transfers Lee Jones
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox