Linux wireless drivers development
 help / color / mirror / Atom feed
* [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues.
@ 2026-05-31  3:40 Jason Xing
  2026-05-31  3:40 ` [PATCH 1/3] relayfs: introduce relay_subbuf_avail() Jason Xing
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Jason Xing @ 2026-05-31  3:40 UTC (permalink / raw)
  To: nbd, lorenzo, ryder.lee, shayne.chen, sean.wang, matthias.bgg,
	angelogioacchino.delregno, akpm, axboe
  Cc: linux-wireless, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

__relay_reserve() that is even though efficient has a disadvantage:
it's not easy to prevent the writer and reader race conditions since
readers can read incorrect data just after the offset is advanced at
which point where no data or only partial data has been written into
relayfs.

---
The series is only complied. I came across this caller and spotted
the issue when developing relayfs.


Jason Xing (3):
  relayfs: introduce relay_subbuf_avail()
  wifi: mt76: mt7915: use relay_subbuf_avail() to fix stale fwlog reads
  wifi: mt76: mt7996: use relay_subbuf_avail() to fix stale fwlog reads

 .../wireless/mediatek/mt76/mt7915/debugfs.c   | 27 +++++++++----------
 .../wireless/mediatek/mt76/mt7996/debugfs.c   | 25 +++++++++--------
 include/linux/relay.h                         | 24 +++++++++++++++++
 3 files changed, 49 insertions(+), 27 deletions(-)

-- 
2.43.7


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] relayfs: introduce relay_subbuf_avail()
  2026-05-31  3:40 [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing
@ 2026-05-31  3:40 ` Jason Xing
  2026-05-31  3:40 ` [PATCH 2/3] wifi: mt76: mt7915: use relay_subbuf_avail() to fix stale fwlog reads Jason Xing
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Jason Xing @ 2026-05-31  3:40 UTC (permalink / raw)
  To: nbd, lorenzo, ryder.lee, shayne.chen, sean.wang, matthias.bgg,
	angelogioacchino.delregno, akpm, axboe
  Cc: linux-wireless, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

Add relay_subbuf_avail() which ensures the sub-buffer has room and
returns the buffer pointer without advancing the offset.

Together with __relay_write(), it lets callers emit multi-part records
where the offset only moves after each chunk is copied, closing the
stale-read window.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 include/linux/relay.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/include/linux/relay.h b/include/linux/relay.h
index 6772a7075840..270a028586bd 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -268,6 +268,30 @@ static inline void *relay_reserve(struct rchan *chan, size_t length)
 	return reserved;
 }
 
+/**
+ *	relay_subbuf_avail - ensure @length bytes fit in current sub-buffer
+ *	@chan: relay channel
+ *	@length: total number of bytes to check
+ *
+ *	Returns the per-cpu buffer if @length bytes fit, NULL otherwise.
+ *	Switches to the next sub-buffer if necessary but does NOT advance
+ *	the write offset.  Use __relay_write() to write into the returned
+ *	buffer.
+ *
+ *	Caller must prevent preemption and serialise against other writers.
+ */
+static inline struct rchan_buf *
+relay_subbuf_avail(struct rchan *chan, size_t length)
+{
+	struct rchan_buf *buf = *this_cpu_ptr(chan->buf);
+
+	if (unlikely(buf->offset + length > buf->chan->subbuf_size)) {
+		if (!relay_switch_subbuf(buf, length))
+			return NULL;
+	}
+	return buf;
+}
+
 /**
  *	subbuf_start_reserve - reserve bytes at the start of a sub-buffer
  *	@buf: relay channel buffer
-- 
2.43.7


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] wifi: mt76: mt7915: use relay_subbuf_avail() to fix stale fwlog reads
  2026-05-31  3:40 [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing
  2026-05-31  3:40 ` [PATCH 1/3] relayfs: introduce relay_subbuf_avail() Jason Xing
@ 2026-05-31  3:40 ` Jason Xing
  2026-05-31  3:40 ` [PATCH 3/3] wifi: mt76: mt7996: " Jason Xing
  2026-05-31  5:58 ` [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing
  3 siblings, 0 replies; 5+ messages in thread
From: Jason Xing @ 2026-05-31  3:40 UTC (permalink / raw)
  To: nbd, lorenzo, ryder.lee, shayne.chen, sean.wang, matthias.bgg,
	angelogioacchino.delregno, akpm, axboe
  Cc: linux-wireless, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

relay_reserve() advances buf->offset before the caller writes data.
Since relay_file_read() uses buf->offset as the readable upper bound
of the active sub-buffer, a concurrent reader can observe the
reserved-but-not-yet-written region, resulting in stale data.

  WRITER                          READER (active sub-buf)
  ------                          ------
  relay_reserve(4+L)
    buf->offset += 4+L  ---+
  *(u32*)dest = L           |  (offset already exposes the slot)
                            +--> read(&len, 4)  => L (VALID!)
      <<preempted>>              read(buf, L)   => STALE data X
  memcpy(dest+4, ..., L)
    [payload written - too late]

The userspace reader [1] uses poll() + read(&len, 4) + read(buf, len),
which is racy against the relay_reserve() window described above.

Switch to relay_subbuf_avail() + __relay_write() so the offset only
advances after each chunk is copied.

[1] https://github.com/openwrt/mt76/blob/master/tools/fwlog.c

Fixes: 988845c9361a ("mt76: mt7915: add support for passing chip/firmware debug data to user space")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 .../wireless/mediatek/mt76/mt7915/debugfs.c   | 27 +++++++++----------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c b/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c
index 26ed3745af43..d564b265dc27 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c
@@ -1346,22 +1346,21 @@ mt7915_debugfs_write_fwlog(struct mt7915_dev *dev, const void *hdr, int hdrlen,
 {
 	static DEFINE_SPINLOCK(lock);
 	unsigned long flags;
-	void *dest;
+	u32 rec_len = len;
+
+	if (hdr)
+		rec_len += hdrlen;
 
 	spin_lock_irqsave(&lock, flags);
-	dest = relay_reserve(dev->relay_fwlog, hdrlen + len + 4);
-	if (dest) {
-		*(u32 *)dest = hdrlen + len;
-		dest += 4;
-
-		if (hdrlen) {
-			memcpy(dest, hdr, hdrlen);
-			dest += hdrlen;
-		}
-
-		memcpy(dest, data, len);
-		relay_flush(dev->relay_fwlog);
-	}
+	if (!relay_subbuf_avail(dev->relay_fwlog, sizeof(rec_len) + rec_len))
+		goto out;
+
+	__relay_write(dev->relay_fwlog, &rec_len, sizeof(rec_len));
+	if (hdr)
+		__relay_write(dev->relay_fwlog, hdr, hdrlen);
+	__relay_write(dev->relay_fwlog, data, len);
+	relay_flush(dev->relay_fwlog);
+out:
 	spin_unlock_irqrestore(&lock, flags);
 }
 
-- 
2.43.7


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] wifi: mt76: mt7996: use relay_subbuf_avail() to fix stale fwlog reads
  2026-05-31  3:40 [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing
  2026-05-31  3:40 ` [PATCH 1/3] relayfs: introduce relay_subbuf_avail() Jason Xing
  2026-05-31  3:40 ` [PATCH 2/3] wifi: mt76: mt7915: use relay_subbuf_avail() to fix stale fwlog reads Jason Xing
@ 2026-05-31  3:40 ` Jason Xing
  2026-05-31  5:58 ` [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing
  3 siblings, 0 replies; 5+ messages in thread
From: Jason Xing @ 2026-05-31  3:40 UTC (permalink / raw)
  To: nbd, lorenzo, ryder.lee, shayne.chen, sean.wang, matthias.bgg,
	angelogioacchino.delregno, akpm, axboe
  Cc: linux-wireless, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

relay_reserve() advances buf->offset before the caller writes data.
Since relay_file_read() uses buf->offset as the readable upper bound
of the active sub-buffer, a concurrent reader can observe the
reserved-but-not-yet-written region, resulting in stale data.

  WRITER                          READER (active sub-buf)
  ------                          ------
  relay_reserve(4+L)
    buf->offset += 4+L  ---+
  *(u32*)dest = L           |  (offset already exposes the slot)
                            +--> read(&len, 4)  => L (VALID!)
      <<preempted>>              read(buf, L)   => STALE data X
  memcpy(dest+4, ..., L)
    [payload written - too late]

The userspace reader [1] uses poll() + read(&len, 4) + read(buf, len),
which is racy against the relay_reserve() window described above.

Switch to relay_subbuf_avail() + __relay_write() so the offset only
advances after each chunk is copied.

[1] https://github.com/openwrt/mt76/blob/master/tools/fwlog.c

Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 .../wireless/mediatek/mt76/mt7996/debugfs.c   | 25 +++++++++----------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/debugfs.c b/drivers/net/wireless/mediatek/mt76/mt7996/debugfs.c
index 34af800964d1..82f59a7eb508 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/debugfs.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/debugfs.c
@@ -914,22 +914,21 @@ mt7996_debugfs_write_fwlog(struct mt7996_dev *dev, const void *hdr, int hdrlen,
 {
 	static DEFINE_SPINLOCK(lock);
 	unsigned long flags;
-	void *dest;
+	u32 rec_len = len;
+
+	if (hdr)
+		rec_len += hdrlen;
 
 	spin_lock_irqsave(&lock, flags);
-	dest = relay_reserve(dev->relay_fwlog, hdrlen + len + 4);
-	if (dest) {
-		*(u32 *)dest = hdrlen + len;
-		dest += 4;
-
-		if (hdrlen) {
-			memcpy(dest, hdr, hdrlen);
-			dest += hdrlen;
-		}
+	if (!relay_subbuf_avail(dev->relay_fwlog, sizeof(rec_len) + rec_len))
+		goto out;
 
-		memcpy(dest, data, len);
-		relay_flush(dev->relay_fwlog);
-	}
+	__relay_write(dev->relay_fwlog, &rec_len, sizeof(rec_len));
+	if (hdr)
+		__relay_write(dev->relay_fwlog, hdr, hdrlen);
+	__relay_write(dev->relay_fwlog, data, len);
+	relay_flush(dev->relay_fwlog);
+out:
 	spin_unlock_irqrestore(&lock, flags);
 }
 
-- 
2.43.7


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues.
  2026-05-31  3:40 [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing
                   ` (2 preceding siblings ...)
  2026-05-31  3:40 ` [PATCH 3/3] wifi: mt76: mt7996: " Jason Xing
@ 2026-05-31  5:58 ` Jason Xing
  3 siblings, 0 replies; 5+ messages in thread
From: Jason Xing @ 2026-05-31  5:58 UTC (permalink / raw)
  To: nbd, lorenzo, ryder.lee, shayne.chen, sean.wang, matthias.bgg,
	angelogioacchino.delregno, akpm, axboe
  Cc: linux-wireless, Jason Xing

On Sun, May 31, 2026 at 11:40 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
>
> __relay_reserve() that is even though efficient has a disadvantage:
> it's not easy to prevent the writer and reader race conditions since
> readers can read incorrect data just after the offset is advanced at
> which point where no data or only partial data has been written into
> relayfs.
>
> ---
> The series is only complied. I came across this caller and spotted
> the issue when developing relayfs.

I think I need to rephrase a bit since it's technically not an
__issue__. I should also remove the "Fixes:" tag in patch 2/3. The
reader in mt76 tools doesn't conflict with the per flush mode in the
kernel because everytime the kernel writes a bulk of data, it will
switch to a new subbuf which is a signal that makes sure the reader
only reads one subbuf at one time.

What I'm trying to do is to avoid relay_reserve causing potential
problems in the long run from a broader perspective. It heavily relies
on how the application handles the logic. To be honest, the current
approach to reading is not efficient due to the arch of relay.

In this case, replacing reserve with write is more secure/robust
without introducing any bad side effects.

Thanks,
Jason

>
>
> Jason Xing (3):
>   relayfs: introduce relay_subbuf_avail()
>   wifi: mt76: mt7915: use relay_subbuf_avail() to fix stale fwlog reads
>   wifi: mt76: mt7996: use relay_subbuf_avail() to fix stale fwlog reads
>
>  .../wireless/mediatek/mt76/mt7915/debugfs.c   | 27 +++++++++----------
>  .../wireless/mediatek/mt76/mt7996/debugfs.c   | 25 +++++++++--------
>  include/linux/relay.h                         | 24 +++++++++++++++++
>  3 files changed, 49 insertions(+), 27 deletions(-)
>
> --
> 2.43.7
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-05-31  5:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-31  3:40 [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing
2026-05-31  3:40 ` [PATCH 1/3] relayfs: introduce relay_subbuf_avail() Jason Xing
2026-05-31  3:40 ` [PATCH 2/3] wifi: mt76: mt7915: use relay_subbuf_avail() to fix stale fwlog reads Jason Xing
2026-05-31  3:40 ` [PATCH 3/3] wifi: mt76: mt7996: " Jason Xing
2026-05-31  5:58 ` [PATCH 0/3] wifi: mt76: use __relay_write to avoid race issues Jason Xing

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox