public inbox for linux-wireless@vger.kernel.org
From: Johan Hovold <johan@kernel.org>
To: Miaoqing Pan <quic_miaoqing@quicinc.com>
Cc: quic_jjohnson@quicinc.com, ath11k@lists.infradead.org,
	linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org,
	johan+linaro@kernel.org
Subject: Re: [PATCH v2 ath-next 2/2] wifi: ath11k: fix HTC rx insufficient length
Date: Mon, 10 Mar 2025 11:09:52 +0100
Message-ID: <Z866cCj8SWyZjCoP@hovoldconsulting.com>
In-Reply-To: <20250310010217.3845141-3-quic_miaoqing@quicinc.com>

On Mon, Mar 10, 2025 at 09:02:17AM +0800, Miaoqing Pan wrote:
> A relatively unusual race condition occurs between host software
> and hardware, where the host sees the updated destination ring head
> pointer before the hardware updates the corresponding descriptor.
> When this situation occurs, the length of the descriptor returns 0.

I still think this description is too vague and it doesn't explain how
this race is even possible. It sounds like there's a bug somewhere in
the driver or firmware, but if this really is an indication the hardware
is broken as your reply here seems to suggest:

	https://lore.kernel.org/lkml/bc187777-588c-4fa0-ba8c-847e91c78d43@quicinc.com/

then that too should be highlighted in the commit message (e.g. by
describing this as "working around broken hardware").

> The current error handling method is to increment descriptor tail
> pointer by 1, but 'sw_index' is not updated, causing descriptor and
> skb to not correspond one-to-one, resulting in the following error:
> 
> ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1488, expected 1492
> ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
> 
> To address this problem, temporarily skip processing the current
> descriptor and handle it again next time. However, to prevent this
> descriptor from continuously returning 0, use skb cb to set a flag.
> If the length returns 0 again, this descriptor will be discarded.

The ath12k ring-buffer handling looks very similar. Do you need a
corresponding workaround in ath12k_ce_completed_recv_next()? Or are you
sure that this (hardware) bug only affects ath11k devices?
 
>  	*nbytes = ath11k_hal_ce_dst_status_get_length(desc);
> -	if (*nbytes == 0) {
> -		ret = -EIO;
> -		goto err;
> +	if (unlikely(*nbytes == 0)) {
> +		struct ath11k_skb_rxcb *rxcb =
> +			ATH11K_SKB_RXCB(pipe->dest_ring->skb[sw_index]);
> +
> +		/* A relatively unusual race condition occurs between host
> +		 * software and hardware, where the host sees the updated
> +		 * destination ring head pointer before the hardware updates
> +		 * the corresponding descriptor.
> +		 *
> +		 * Temporarily skip processing the current descriptor and handle
> +		 * it again next time. However, to prevent this descriptor from
> +		 * continuously returning 0, set 'is_desc_len0' flag. If the
> +		 * length returns 0 again, this descriptor will be discarded.
> +		 */
> +		if (!rxcb->is_desc_len0) {
> +			rxcb->is_desc_len0 = true;
> +			ret = -EIO;
> +			goto err;
> +		}
>  	}

I'm still waiting for feedback from one user that can reproduce the
ring-buffer corruption very easily, but another user mentioned seeing
multiple zero-length descriptor warnings over the weekend when running
with this patch:

	ath11k_pci 0006:01:00.0: rxed invalid length (nbytes 0, max 2048)

Are there ever any valid reasons for seeing a zero-length descriptor
(i.e. unrelated to the race at hand)? IIUC the warning would only be
printed when processing such descriptors a second time (i.e. when
is_desc_len0 is set).
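
To make the question concrete, this is how I read the two-pass behaviour
of the hunk above, written as a stand-alone sketch rather than driver
code. The names demo_desc, demo_check_desc() and the DEMO_* values are
made up for illustration; 'seen_len0' stands in for
rxcb->is_desc_len0, and I'm assuming the caller is what prints the
"rxed invalid length" warning once the zero length is let through:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a destination ring entry plus its skb cb;
 * these are not the ath11k structures.
 */
struct demo_desc {
	unsigned int len;	/* length read from the descriptor */
	bool seen_len0;		/* plays the role of rxcb->is_desc_len0 */
};

enum demo_action {
	DEMO_PROCESS,	/* non-zero length: hand the buffer up */
	DEMO_RETRY,	/* first zero length: leave in place, retry later */
	DEMO_DISCARD,	/* second zero length: caller warns and drops it */
};

static enum demo_action demo_check_desc(struct demo_desc *d)
{
	assert(d != NULL);

	if (d->len != 0)
		return DEMO_PROCESS;

	/* First time we see length 0: remember it and do not advance. */
	if (!d->seen_len0) {
		d->seen_len0 = true;
		return DEMO_RETRY;
	}

	/* Length is still 0 on the second look: stop retrying. */
	return DEMO_DISCARD;
}
```

If that reading is right, a descriptor can only reach the discard path
(and hence the warning) after it has already been skipped once.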

Johan
