patches.lists.linux.dev archive mirror
From: Sasha Levin <sashal@kernel.org>
To: "Michał Pecio" <michal.pecio@gmail.com>
Cc: patches@lists.linux.dev, stable@vger.kernel.org,
	Jonathan Bell <jonathan@raspberrypi.org>,
	Oliver Neukum <oneukum@suse.com>,
	Mathias Nyman <mathias.nyman@linux.intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	mathias.nyman@intel.com, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH AUTOSEL 6.14 08/15] usb: xhci: Don't trust the EP Context cycle bit when moving HW dequeue
Date: Tue, 20 May 2025 10:04:49 -0400
Message-ID: <aCyMAdNzTPgS0urL@lappy>
In-Reply-To: <20250512231628.7f91f435@foxbook>

On Mon, May 12, 2025 at 11:16:28PM +0200, Michał Pecio wrote:
>On Mon, 12 May 2025 14:03:43 -0400, Sasha Levin wrote:
>> From: Michal Pecio <michal.pecio@gmail.com>
>>
>> [ Upstream commit 6328bdc988d23201c700e1e7e04eb05a1149ac1e ]
>>
>> VIA VL805 doesn't bother updating the EP Context cycle bit when the
>> endpoint halts. This is seen by patching xhci_move_dequeue_past_td()
>> to print the cycle bits of the EP Context and the TRB at hw_dequeue
>> and then disconnecting a flash drive while reading it. Actual cycle
>> state is random as expected, but the EP Context bit is always 1.
>>
>> This means that the cycle state produced by this function is wrong
>> half the time, and then the endpoint stops working.
>>
>> Work around it by looking at the cycle bit of TD's end_trb instead
>> of believing the Endpoint or Stream Context. Specifically:
>>
>> - rename cycle_found to hw_dequeue_found to avoid confusion
>> - initialize new_cycle from td->end_trb instead of hw_dequeue
>> - switch new_cycle toggling to happen after end_trb is found
>>
>> Now a workload which regularly stalls the device works normally for
>> a few hours and clearly demonstrates the HW bug - the EP Context bit
>> is not updated in a new cycle until Set TR Dequeue overwrites it:
>>
>> [  +0,000298] sd 10:0:0:0: [sdc] Attached SCSI disk
>> [  +0,011758] cycle bits: TRB 1 EP Ctx 1
>> [  +5,947138] cycle bits: TRB 1 EP Ctx 1
>> [  +0,065731] cycle bits: TRB 0 EP Ctx 1
>> [  +0,064022] cycle bits: TRB 0 EP Ctx 0
>> [  +0,063297] cycle bits: TRB 0 EP Ctx 0
>> [  +0,069823] cycle bits: TRB 0 EP Ctx 0
>> [  +0,063390] cycle bits: TRB 1 EP Ctx 0
>> [  +0,063064] cycle bits: TRB 1 EP Ctx 1
>> [  +0,062293] cycle bits: TRB 1 EP Ctx 1
>> [  +0,066087] cycle bits: TRB 0 EP Ctx 1
>> [  +0,063636] cycle bits: TRB 0 EP Ctx 0
>> [  +0,066360] cycle bits: TRB 0 EP Ctx 0
>>
>> Also tested on the buggy ASM1042 which moves EP Context dequeue to
>> the next TRB after errors, one problem case addressed by the rework
>> that implemented this loop. In this case hw_dequeue can be enqueue,
>> so simply picking the cycle bit of TRB at hw_dequeue wouldn't work.
>>
>> Commit 5255660b208a ("xhci: add quirk for host controllers that
>> don't update endpoint DCS") tried to solve the stale cycle problem,
>> but it was more complex and got reverted due to a reported issue.
>>
>> Cc: Jonathan Bell <jonathan@raspberrypi.org>
>> Cc: Oliver Neukum <oneukum@suse.com>
>> Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
>> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
>> Link: https://lore.kernel.org/r/20250505125630.561699-2-mathias.nyman@linux.intel.com
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
>Hi,
>
>This wasn't tagged for stable because the function may potentially
>still be affected by some unforeseen HW bugs, and a previous attempt
>at fixing the issue ran into trouble and nobody truly knows why.
>
>The problem is very old and not critically severe, so I think this
>can wait till 6.15. People don't like minor release regressions.

I'll drop it, thanks!

-- 
Thanks,
Sasha
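
For readers following the thread, the approach described in the quoted commit message can be sketched roughly as below. This is a simplified user-space model, not the kernel code: the ring is assumed to be a single segment addressed by array index, and `struct trb`, `RING_SIZE` and the function signature are hypothetical. Only the ideas named in the commit message are taken from the source: new_cycle starts from the cycle bit of the TD's end_trb, hw_dequeue_found tracks whether hw_dequeue was seen, and cycle toggling only happens once end_trb has been found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8	/* hypothetical: one segment, normally ending in a link TRB */

/* Hypothetical, stripped-down TRB: only the fields this sketch needs. */
struct trb {
	bool cycle;		/* cycle bit as written by the driver */
	bool is_link;		/* link TRB (wraps back to the ring start) */
	bool toggles_cycle;	/* link TRB with the toggle-cycle flag set */
};

/*
 * Walk from the software dequeue position until both end_trb and
 * hw_dequeue have been seen. The cycle state is loaded from end_trb,
 * not from the EP Context, and is toggled only for toggling link TRBs
 * at or after end_trb.
 * Returns 0 on success, -1 if the search wrapped around.
 */
static int move_dequeue_past_td(const struct trb *ring, size_t sw_dequeue,
				size_t hw_dequeue, size_t end_trb,
				size_t *new_deq_out, bool *new_cycle_out)
{
	size_t new_deq = sw_dequeue;
	bool new_cycle;
	bool hw_dequeue_found = false;
	bool end_trb_found = false;

	assert(sw_dequeue < RING_SIZE && end_trb < RING_SIZE);
	new_cycle = ring[end_trb].cycle;

	do {
		if (!hw_dequeue_found && new_deq == hw_dequeue) {
			hw_dequeue_found = true;
			/* HW already moved past the TD: stop here. */
			if (end_trb_found)
				break;
		}
		if (new_deq == end_trb)
			end_trb_found = true;

		if (end_trb_found && ring[new_deq].is_link &&
		    ring[new_deq].toggles_cycle)
			new_cycle = !new_cycle;

		new_deq = (new_deq + 1) % RING_SIZE;

		/* Search wrapped around without success, bail out. */
		if (new_deq == sw_dequeue)
			return -1;
	} while (!hw_dequeue_found || !end_trb_found);

	*new_deq_out = new_deq;
	*new_cycle_out = new_cycle;
	return 0;
}
```

In this model the stale EP Context bit never enters the calculation, which is the point of the workaround. The case where hw_dequeue has already advanced beyond end_trb (as reported for ASM1042) is handled by continuing the walk until hw_dequeue is reached.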
