public inbox for linux-kernel@vger.kernel.org
* [PATCH net v2] mctp i3c: fix MCTP I3C driver multi-thread issue
@ 2025-01-07  3:15 Leo Yang
  2025-01-09 10:59 ` Paolo Abeni
  2025-01-09 11:10 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Leo Yang @ 2025-01-07  3:15 UTC (permalink / raw)
  To: jk, matt, andrew+netdev, davem, edumazet, kuba, pabeni, horms,
	netdev, linux-kernel
  Cc: Leo Yang

We found a timeout problem with the PLDM command on our system.  The
root cause is a race condition in the MCTP-I3C driver: when
multiple-packet messages are received from multiple threads, packets
can be delivered out of order.

We identified this problem by adding a debug message to the
mctp_i3c_read function.

According to the MCTP spec, a multiple-packet message must be composed
in sequence; if a packet arrives out of sequence, the whole message is
discarded and the receiver waits for the next SOM.
For example: SOM → Pkt Seq #2 → Pkt Seq #1 → Pkt Seq #3 → EOM.

Therefore, we solve this problem by adding a mutex to the
mctp_i3c_read function.  Before the modification, when a command
requesting a multiple-packet message response was sent repeatedly, an
error usually occurred within 100 loops.  With the mutex, it runs
through 40000 loops without any error.

But I'm a little worried about the performance of the mutex under high
load (the spec seems to allow different endpoints to respond at the
same time).  Do you think this is a feasible solution?

Fixes: c8755b29b58e ("mctp i3c: MCTP I3C driver")
Signed-off-by: Leo Yang <Leo-Yang@quantatw.com>

---
Changes in v2:
    1. Add Fixes tag.
    2. Add mutex comment.
    - Link to v1: https://lore.kernel.org/netdev/20241226025319.1724209-1-Leo-Yang@quantatw.com/
---

 drivers/net/mctp/mctp-i3c.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/mctp/mctp-i3c.c b/drivers/net/mctp/mctp-i3c.c
index 9adad59b8676..d247fe483c58 100644
--- a/drivers/net/mctp/mctp-i3c.c
+++ b/drivers/net/mctp/mctp-i3c.c
@@ -125,6 +125,8 @@ static int mctp_i3c_read(struct mctp_i3c_device *mi)
 
 	xfer.data.in = skb_put(skb, mi->mrl);
 
+	/* Make sure netif_rx() is read in the same order as i3c. */
+	mutex_lock(&mi->lock);
 	rc = i3c_device_do_priv_xfers(mi->i3c, &xfer, 1);
 	if (rc < 0)
 		goto err;
@@ -166,8 +168,10 @@ static int mctp_i3c_read(struct mctp_i3c_device *mi)
 		stats->rx_dropped++;
 	}
 
+	mutex_unlock(&mi->lock);
 	return 0;
 err:
+	mutex_unlock(&mi->lock);
 	kfree_skb(skb);
 	return rc;
 }
-- 
2.39.2



* Re: [PATCH net v2] mctp i3c: fix MCTP I3C driver multi-thread issue
  2025-01-07  3:15 [PATCH net v2] mctp i3c: fix MCTP I3C driver multi-thread issue Leo Yang
@ 2025-01-09 10:59 ` Paolo Abeni
  2025-01-09 11:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Paolo Abeni @ 2025-01-09 10:59 UTC (permalink / raw)
  To: Leo Yang, jk, matt, andrew+netdev, davem, edumazet, kuba, horms,
	netdev, linux-kernel
  Cc: Leo Yang

On 1/7/25 4:15 AM, Leo Yang wrote:
> We found a timeout problem with the PLDM command on our system.  The
> root cause is a race condition in the MCTP-I3C driver: when
> multiple-packet messages are received from multiple threads, packets
> can be delivered out of order.
> 
> We identified this problem by adding a debug message to the
> mctp_i3c_read function.
> 
> According to the MCTP spec, a multiple-packet message must be composed
> in sequence; if a packet arrives out of sequence, the whole message is
> discarded and the receiver waits for the next SOM.
> For example: SOM → Pkt Seq #2 → Pkt Seq #1 → Pkt Seq #3 → EOM.
> 
> Therefore, we solve this problem by adding a mutex to the
> mctp_i3c_read function.  Before the modification, when a command
> requesting a multiple-packet message response was sent repeatedly, an
> error usually occurred within 100 loops.  With the mutex, it runs
> through 40000 loops without any error.
> 
> But I'm a little worried about the performance of the mutex under high
> load (the spec seems to allow different endpoints to respond at the
> same time).  Do you think this is a feasible solution?

For the record, I'm taking the liberty of dropping the above paragraph
from the changelog: the question IMHO should have been placed after
the --- separator, it has already been answered, and a repost just for
this change would consume more time from everyone.

Cheers,

Paolo



* Re: [PATCH net v2] mctp i3c: fix MCTP I3C driver multi-thread issue
  2025-01-07  3:15 [PATCH net v2] mctp i3c: fix MCTP I3C driver multi-thread issue Leo Yang
  2025-01-09 10:59 ` Paolo Abeni
@ 2025-01-09 11:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-09 11:10 UTC (permalink / raw)
  To: Leo Yang
  Cc: jk, matt, andrew+netdev, davem, edumazet, kuba, pabeni, horms,
	netdev, linux-kernel, Leo-Yang

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue,  7 Jan 2025 11:15:30 +0800 you wrote:
> We found a timeout problem with the PLDM command on our system.  The
> root cause is a race condition in the MCTP-I3C driver: when
> multiple-packet messages are received from multiple threads, packets
> can be delivered out of order.
> 
> We identified this problem by adding a debug message to the
> mctp_i3c_read function.
> 
> [...]

Here is the summary with links:
  - [net,v2] mctp i3c: fix MCTP I3C driver multi-thread issue
    https://git.kernel.org/netdev/net/c/2d2d4f60ed26

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



