Linux USB


* Re: [BUG] KASAN: slab-use-after-free in dev_driver_string from chaoskey_release
From: Shuangpeng @ 2026-06-07 19:37 UTC
  To: Alan Stern; +Cc: keithp, gregkh, linux-usb, linux-kernel
In-Reply-To: <257eb882-44dc-4e25-82f9-9cf9b455936d@rowland.harvard.edu>



> On Jun 6, 2026, at 22:29, Alan Stern <stern@rowland.harvard.edu> wrote:
> 
> On Sat, Jun 06, 2026 at 09:31:30PM -0400, Shuangpeng wrote:
>> Hi Kernel Maintainers,
>> 
>> I hit the following KASAN report while testing current upstream kernel:
>> 
>> KASAN: slab-use-after-free in dev_driver_string from chaoskey_release
>> 
>> on commit: e8c2f9fdadee7cbc75134dc463c1e0d856d6e5c7 (May 25 2026)
>> 
>> The reproducer and .config files are here.
>> https://gist.github.com/shuangpengbai/167620d391d9634107bfe4d784fcf52b
>> 
>> I’m happy to test debug patches or provide additional information.
>> 
>> Reported-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com>
>> 
>> 
>> [ 2019.816807][T10106] ==================================================================
>> [ 2019.819081][T10106] BUG: KASAN: slab-use-after-free in dev_driver_string (drivers/base/core.c:2406)
>> [ 2019.820996][T10106] Read of size 8 at addr ffff888168e8a0b8 by task chaoskey_raw_re/10106
>> [ 2019.822432][T10106]
>> [ 2019.822899][T10106] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> [ 2019.822904][T10106] Call Trace:
>> [ 2019.822910][T10106]  <TASK>
>> [ 2019.822915][T10106]  dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
>> [ 2019.822932][T10106]  print_report (mm/kasan/report.c:378 mm/kasan/report.c:482)
>> [ 2019.822984][T10106]  kasan_report (mm/kasan/report.c:595)
>> [ 2019.823015][T10106]  dev_driver_string (drivers/base/core.c:2406)
>> [ 2019.823021][T10106]  __dynamic_dev_dbg (lib/dynamic_debug.c:906)
>> [ 2019.823282][T10106]  chaoskey_release (drivers/usb/misc/chaoskey.c:323)
> 
> The simple explanation is that the chaoskey_release() routine contains 
> debugging statements that reference an interface for the USB device even 
> after that data structure may have been deallocated.  Since they are 
> merely debugging statements, the simplest solution to the problem is to 
> get rid of them.
> 
> That's what the patch below does.  You can try it out and see if it 
> works.

I tried this patch and the bug is no longer triggered on my side.

Thanks for your fix!

> 
> Alan Stern
> 
> 
> 
> Index: usb-devel/drivers/usb/misc/chaoskey.c
> ===================================================================
> --- usb-devel.orig/drivers/usb/misc/chaoskey.c
> +++ usb-devel/drivers/usb/misc/chaoskey.c
> @@ -294,15 +294,10 @@ static int chaoskey_release(struct inode
> 
>          interface = dev->interface;
> 
> -       usb_dbg(interface, "release");
> -
>          mutex_lock(&chaoskey_list_lock);
>          mutex_lock(&dev->lock);
> 
> -       usb_dbg(interface, "open count at release is %d", dev->open);
> -
>          if (dev->open <= 0) {
> -               usb_dbg(interface, "invalid open count (%d)", dev->open);
>                  rv = -ENODEV;
>                  goto bail;
>          }
> @@ -320,7 +315,6 @@ bail:
>          mutex_unlock(&dev->lock);
>  destruction:
>          mutex_unlock(&chaoskey_list_lock);
> -       usb_dbg(interface, "release success");
>          return rv;
>  }
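
The lifetime problem described above can be illustrated with a small userspace model. This is only a sketch, not the chaoskey driver: every name here (model_intf, model_dev, model_put_intf, model_release) is hypothetical, and the assumption that the last close after an unplug drops the final reference is mine, made for illustration. A flag stands in for the actual free so the model itself never touches freed memory.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted interface object. */
struct model_intf {
	int refs;	/* reference count */
	bool freed;	/* set instead of calling free(), so the model stays well defined */
};

/* Hypothetical stand-in for the driver's per-device state. */
struct model_dev {
	struct model_intf *intf;
	int open;	/* open count */
	bool present;	/* false once the device has been unplugged */
};

/* Drop one reference; mark the object dead when the last one goes away. */
static void model_put_intf(struct model_intf *intf)
{
	if (--intf->refs == 0)
		intf->freed = true;
}

/* True if a debug print that dereferences intf would still be safe. */
static bool model_intf_usable(const struct model_intf *intf)
{
	return !intf->freed;
}

/*
 * Model of a release path. Assumption: the last close after an unplug
 * drops the final reference. Any use of dev->intf after that point is
 * the kind of access KASAN reports as a use-after-free.
 */
static int model_release(struct model_dev *dev)
{
	if (dev->open <= 0)
		return -1;

	dev->open--;
	if (!dev->present && dev->open == 0)
		model_put_intf(dev->intf);

	/* A debug print using dev->intf here would be unsafe in the last-close case. */
	return 0;
}
```

In this model, model_intf_usable() returns false once model_release() has handled the last close of an unplugged device, which is why a trailing debug statement that takes the interface as an argument has to be dropped or moved before the reference is released.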




* Re: [PATCH] HID: usbhid: skip interrupt IN polling for devices with no input reports
From: Antheas Kapenekakis @ 2026-06-07 17:11 UTC
  To: Yaseen
  Cc: Denis Benato, Jiri Kosina, Benjamin Tissoires, Ilpo Järvinen,
	Kerim Kabirov, GameBurrow, linux-usb, linux-input, linux-kernel
In-Reply-To: <CAGwozwHHF5URNDxus5_WgPNiPc7VD_Gc4NJhV3eDDGdX6e53OA@mail.gmail.com>

On Sun, 7 Jun 2026 at 19:03, Antheas Kapenekakis <lkml@antheas.dev> wrote:
>
> On Sun, 7 Jun 2026 at 18:51, Yaseen <yaseen@ghoul.dev> wrote:
> >
> > On 06/06/2026 18:13, Antheas Kapenekakis wrote:
> > > On Sat, 6 Jun 2026 at 14:42, Denis Benato <benato.denis96@gmail.com> wrote:
> > >>
> > >>
> > >> On 6/5/26 14:02, Antheas Kapenekakis wrote:
> > >>> On Fri, 5 Jun 2026 at 13:40, Ahmed Yaseen <yaseen@ghoul.dev> wrote:
> > >>>> usbhid starts polling a device's interrupt IN endpoint on open
> > >>>> (usbhid_open() -> hid_start_in()). If the report descriptor declares no
> > >>>> input reports there is nothing to read there, so the poll is useless,
> > >>>> and on some composite devices it is also harmful.
> > >>> If it did have input reports, would starting the polling still cause
> > >>> issues? Because if it would, the issue is in the polling itself.
> > >> So far we haven't found an asus device that has more than one interface
> > >> that supports reading data out of it.
> > >>> Given the creativity of manufacturers when implementing hid protocols,
> > >>> I find it certain that they do use the in endpoint even without input
> > >>> reports. E.g., for feature reports. This could cause regressions.
> >
> > The ASUS ROG N-Key Device does have feature reports. They are used for
> > RGB control on the keyboard. I have confirmed this with a test by not
> > registering the hidraw node for this interface at all and noted that RGB
> > stops working after. So hiding or ejecting this interface is not an
> > option. Therefore, after this patch, I myself, together with Kerim and
> > GameBurrow have paid attention explicitly to ensure there are no
> > regressions to the LED controls, while fixing the keyboard issue.
> >
> > Also worth noting that feature reports travel over EP0 via
> > usbhid_{get,set}_raw_report() in both directions. The interrupt IN
> > endpoint is only ever used to receive input reports: hid_irq_in() passes
> > everything it gets to hid_safe_input_report(HID_INPUT_REPORT, ...).
> > There is no code in usbhid that reads feature reports from the interrupt
> > IN endpoint at all, so skipping that poll cannot break any feature
> > reports on any device. This is also mentioned in my patch description:
> >
> >         "Feature reports and hidraw output keep working over the control and
> > OUT endpoints, so the interface is otherwise unaffected."
> >
> > Regarding a manufacturer using in endpoint without an input report, even
> > today, the HID core would drop that data before it reaches hidraw:
> > __hid_input_report() bails when hid_get_report() finds no matching
> > report. That bail is also before the driver's ->raw_event() callback, so
> > no driver or hidraw reader can currently be relying on such traffic.
>
> Interesting, so it should noop. Muting the in endpoint would not
> affect feature reports that get sent over the in endpoint? I do not
> think this patch will cause regressions for Asus devices. I'm more
> concerned with other ones. E.g., the Legion Go S has a malformed
> report, and I do not recall which endpoints it uses. Then, the Win 5
> also does a mix. Those are two devices I'd be concerned with, but
> there are a myriad of other hid devices this could affect.
>
> If possible, I'd rather the fix addressed the underlying issue that
> blocks processing of input from other devices. That way the problem
> also goes away for devices with an actual but infrequent input report,
> even where the blocking would not have been perceptible.
>
> I can reproduce on my Z13 in the following days.
>
> Best,
> Antheas
>
> > >> While I mostly agree with this it is also true that the general direction
> > >> for the kernel (especially lately) has been to not do out-of-spec things
> > >> at least by default.
> > >>
> > >> If things really regress it's expected to do so only on very few specific
> > >> devices with a buggy firmware, and we can think of something different
> > >> for those (hopefully very few ones).
> > >>
> > >> Perhaps someone concerned with security might be interested in what
> > >> we have because it doesn't look very normal.
> > >>
> > >> Note that below I have written a few ideas that maybe are worth
> > >
> > > The degradation would be silent.
> > >
> > >> looking into.
> > >>>> The ASUS ROG N-Key keyboards expose a second, input-less interface used
> > >>>> only for RGB control via feature reports. Opening its hidraw node (any
> > >>>> hidraw reader does, including SDL/Steam Input or a plain cat) starts the
> > >>> cating a hidraw causing issues would be expected, so let's focus on the former.
> > >
> > > Try to add spaces before and after your responses
> > >
> > >> Simply opening a hidraw node should not trigger a delayed disconnect of that
> > >> device. I don't know why you would expect this to happen nor why you would
> > >> consider it acceptable. It's a bug.
> > >>
> > >> Focusing on userspace software exposing the bug is not a realistic option
> > >> because over time we have found a good chunk of software doing that:
> > >> - logitech control software (forgot the name)
> > >> - open razer software
> > >> - sdl
> > >> - asusctl (obviously it opens the device albeit in the future I will change this)
> > >>
> > >> and likely more given the fact not all software was identified.
> > >>> Asusctl has a bug where if you add the quirk that separates the event
> > >>> nodes per hid, this bug is reproduced as well. I chalked it up to
> > >>> complicated threading getting out of control. It is the reason we
> > >>> skipped that patch that was in my series.
> > >> I found and solved the bug already. Regardless the issue remains:
> > >> Even with no asusctl at all, if a user has one logitech mouse
> > >> (and its control software) and a razer keyboard (and its control software)
> > >> the asus N-Key device will start an endless disconnect-reconnect loop.
> > >>
> > >> Any combination of two or more of those tools will trigger the issue
> > >> on some devices (weirdly enough not every model is affected):
> > >>
> > >> this is not good.
> > >>> Now, you say SDL/Steam do a spurious read as well, can you identify
> > >>> the codepath so we can look into it? What devices are affected? The
> > >>> early return fixes a warning on the Z13, but it also feeds through the
> > >>> universal lamp interface on the new Xbox Allies. Is this a bug on
> > >>> those devices or keyboards? If yes, it could be caused by userspace
> > >>> hanging on that node
> >
> > Affected devices include the ROG STRIX 2025 lineup: Scar 16/18
> > (G635L/G835L) and G16/G18 (G615L/G815L). My patch has been tested on
> > both Scar 18 and G18. Additionally a user with a Scar 18 2024 model
> > (G634JZR) has reported the issue as well; they were unable to
> > participate in testing but reproduced the issue with the same cat command
> > (reproduction command provided below). It is likely the G16/G18 of 2024
> > will also be affected. Models prior to 2024 appear unaffected so far.
> >
> > A user with an Xbox Ally X has also tested this for me as of writing
> > this email. So we are able to confirm that this device is unaffected and
> > no regressions are noticed on that device from my patch, including the
> > lamp/RGB controls.
> >
> > I do not have access to a Z13 at the moment. If you have one, it would
> > be very helpful for me if you could test for any regressions on that
> > device and if the device is affected by the bug, and whether or not this
> > patch fixes the issue.
> >
> > I would also like to take this opportunity to mention that the 3 testers
> > and I are all daily driving a kernel with this patch applied, and over
> > the last few days, have noticed no issues with any devices.
> >
> > >> Sure, and I agree with you that fixing all userspace tools is desirable,
> > >> but it's also infeasible to fix them all. Even if we managed to do that,
> > >> it would be years before everyone received a fixed version of every
> > >> affected program, and even then a core issue would remain:
> > >> Linux tries to poll something it can never get any data out of.
> > >>
> > >> I am much more oriented on the fact that kernel shouldn't
> > >> be doing weird things (at least not by default) so this has to
> > >> somehow be stopped regardless of how well userspace behaves.
> > >
> > > The kernel is not doing weird things and I also did not ask you to fix
> > > all userspace software. I asked for a reproduction scenario, as it is
> > > not covered in the patch description. Relooking at the patch today, I
> > > also do not understand what it does fully.
> >
> > The reproduction scenario is in the patch description:
> >
> >         "Opening its hidraw node (any hidraw reader does, including SDL/Steam
> > Input or a plain cat) starts the pointless IN poll and keypress reports
> > on the keyboard interface get dropped for as long as the node stays
> > open: a lost key-down drops a letter, a lost key-up leaves the key stuck."
> >
> > i.e. run "sudo timeout 15 cat /dev/hidrawX" against the N-Key RGB
> > interface, then type on the internal keyboard.
> >
> > >
> > > It skips enabling input interrupts (but not only that) for devices
> > > that have no input reports. So the kernel behavior will depend on the
> > > feature descriptor moving forward.
> >
> > What the patch does is the last paragraph of the description:
> >
> >         "Skip the poll in usbhid_open() when the device has no input reports."
> >
> > Interrupt IN endpoint on a device with 0 input reports isn't doing
> > anything anyway. The other things the early return skips only matter
> > when input is possible.
> >
> > >
> > > And that fixes a hang on the affected devices because enabling
> > > interrupts on an endpoint without periodic input reports blocks a
> > > parallel endpoint that does have input reports?
> > >
> > > I would like this fix to target the actual cause that causes the block
> > > but it is not clear to me what that is or what is affected.
> >
> > As per my investigations with usbmon, I can see that the keyboard
> > interface's input reports never reach the URB layer while the RGB
> > interface is being polled. From the patch description:
> >
> >         "usbmon shows the dropped reports never reach the URB layer"
> >
> > So the blocking likely happens inside the device's firmware, and not in
> > the kernel, so the kernel cannot fix that part. What the kernel can do
> > is to stop arming the IN URB on an endpoint that as per its own
> > descriptor, can never produce data.
> >
> > >
> > > Antheas
> > >
> > >> If you have better ideas on how to fix the kernel we would
> > >> like to hear those as well.
> > >>
> > >> Best regards,
> > >> Denis
> > >>> Antheas
> > >>>
> > >>>> pointless IN poll and keypress reports on the keyboard interface get
> > >>>> dropped for as long as the node stays open: a lost key-down drops a
> > >>>> letter, a lost key-up leaves the key stuck. usbmon shows the dropped
> > >>>> reports never reach the URB layer.
> > >>>>
> > >>>> The useless poll itself is long-standing; commit 4ac74ea68f64 ("HID:
> > >>>> asus: early return for ROG devices") is what exposes it on these
> > >>>> devices by keeping the input-less interface alive instead of ejecting
> > >>>> it, so its hidraw node can be opened and the poll started.
> > >>>>
> > >>>> Skip the poll in usbhid_open() when the device has no input reports.
> > >>>> Feature reports and hidraw output keep working over the control and OUT
> > >>>> endpoints, so the interface is otherwise unaffected.
> > >> I will write my review here to avoid forking the discussion:
> > >>
> > >> I agree with the general idea but perhaps we can avoid
> > >> some hid devices to ever get HID_QUIRK_ALWAYS_POLL
> > >> and that might be enough to skip the problematic code?
> > >>
> > >> Maybe there is value in doing this with a quirk flag in hid-asus.c
> > >> affecting the least amount of devices?
> > >>
> > >> Or maybe just prevent devices with no data possibly coming out
> > >> to ever get HID_QUIRK_ALWAYS_POLL?
> >
> > Thank you for the review!
> >
> > I would like to also highlight one thing here; the HID_QUIRK_ALWAYS_POLL
> > is not given to this specific device. It was already in the if
> > condition, for the devices that do use it; my change only ORs a second
> > independent condition into it. So keeping devices away from that quirk
> > would not change anything here.
> >
> > Adding a quirk flag for this specific device is something I too have
> > considered and will be happy to change it like so if Jiri or Benjamin
> > feel it is more appropriate. My reasoning for taking the current route
> > is that it would prevent any hidden issues that might arise similarly,
> > and fix the whole class of this issue rather than for one vendor when
> > the likelihood of a regression is very low from skipping interrupt IN
> > polling if a device doesn't have input reports in the first place.
> >

I missed this part of the response. Here I'd add that
HID_QUIRK_ALWAYS_POLL is checked in six if statements, so your patch is
not equivalent to setting HID_QUIRK_ALWAYS_POLL. If it should be, perhaps
an alternative for fixing just the Asus devices would be to OR that quirk
in for all devices when hid-asus initializes. I am not sure it is a
firmware issue. If it is a kernel issue that can be mitigated without
quirks or without skipping enabling the IN endpoint, I'd prefer that.
Failing that, a quirk would perhaps limit the set of affected devices.

Antheas

> > Best Regards,
> > Yaseen
> >
> > >>
> > >> For how to best do this we will need to hear what Jiri and
> > >> Benjamin have to say but if they think the proposed solution
> > >> is the correct solution:
> > >>
> > >> Reviewed-by: Denis Benato <denis.benato@linux.dev>
> > >>>> Fixes: 4ac74ea68f64 ("HID: asus: early return for ROG devices")
> > >>>> Tested-by: Kerim Kabirov <the.privat33r+linux@pm.me>
> > >>>> Tested-by: GameBurrow <gameburrow@pm.me>
> > >>>> Signed-off-by: Ahmed Yaseen <yaseen@ghoul.dev>
> > >>>> ---
> > >>>>   drivers/hid/usbhid/hid-core.c | 3 ++-
> > >>>>   1 file changed, 2 insertions(+), 1 deletion(-)
> > >>>>
> > >>>> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
> > >>>> index 96b0181cf819..90a8b34d9305 100644
> > >>>> --- a/drivers/hid/usbhid/hid-core.c
> > >>>> +++ b/drivers/hid/usbhid/hid-core.c
> > >>>> @@ -688,7 +688,8 @@ static int usbhid_open(struct hid_device *hid)
> > >>>>
> > >>>>          set_bit(HID_OPENED, &usbhid->iofl);
> > >>>>
> > >>>> -       if (hid->quirks & HID_QUIRK_ALWAYS_POLL) {
> > >>>> +       if ((hid->quirks & HID_QUIRK_ALWAYS_POLL) ||
> > >>>> +           list_empty(&hid->report_enum[HID_INPUT_REPORT].report_list)) {
> > >>>>                  res = 0;
> > >>>>                  goto Done;
> > >>>>          }
> > >>>> --
> > >>>> 2.54.0
> > >>>>
> > >>>>
> > >>>>
> >
> >
> >
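
The condition added by the quoted patch can be read as a small predicate. The following is a userspace sketch only, not kernel code: MODEL_QUIRK_ALWAYS_POLL, struct model_hid and model_open_skips_in_poll() are hypothetical names, the bit value is arbitrary, and a plain counter stands in for the list_empty() test on the input report list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; the real quirk value is not modeled here. */
#define MODEL_QUIRK_ALWAYS_POLL (1u << 0)

/* Hypothetical, reduced view of the state the patched check looks at. */
struct model_hid {
	unsigned int quirks;		/* quirk flags */
	size_t n_input_reports;		/* input reports declared by the descriptor */
};

/*
 * Returns true when the open path would take the early "res = 0; goto Done"
 * exit and therefore not start the interrupt IN poll at open time:
 *   - the always-poll quirk is set (the pre-existing condition), or
 *   - the descriptor declares no input reports (the condition the patch adds).
 */
static bool model_open_skips_in_poll(const struct model_hid *hid)
{
	return (hid->quirks & MODEL_QUIRK_ALWAYS_POLL) ||
	       hid->n_input_reports == 0;
}
```

The sketch only shows that the two conditions are independent inputs to one early exit in the open path. It does not model any other place where the quirk is checked, which is the point raised in the reply above.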



* Re: [PATCH] HID: usbhid: skip interrupt IN polling for devices with no input reports
From: Antheas Kapenekakis @ 2026-06-07 17:03 UTC
  To: Yaseen
  Cc: Denis Benato, Jiri Kosina, Benjamin Tissoires, Ilpo Järvinen,
	Kerim Kabirov, GameBurrow, linux-usb, linux-input, linux-kernel
In-Reply-To: <20a9d77f-c60a-44ba-ac39-15107fc81256@ghoul.dev>

On Sun, 7 Jun 2026 at 18:51, Yaseen <yaseen@ghoul.dev> wrote:
>
> On 06/06/2026 18:13, Antheas Kapenekakis wrote:
> > On Sat, 6 Jun 2026 at 14:42, Denis Benato <benato.denis96@gmail.com> wrote:
> >>
> >>
> >> On 6/5/26 14:02, Antheas Kapenekakis wrote:
> >>> On Fri, 5 Jun 2026 at 13:40, Ahmed Yaseen <yaseen@ghoul.dev> wrote:
> >>>> usbhid starts polling a device's interrupt IN endpoint on open
> >>>> (usbhid_open() -> hid_start_in()). If the report descriptor declares no
> >>>> input reports there is nothing to read there, so the poll is useless,
> >>>> and on some composite devices it is also harmful.
> >>> If it did have input reports, would starting the polling still cause
> >>> issues? Because if it would, the issue is in the polling itself.
> >> So far we haven't found an asus device that has more than one interface
> >> that supports reading data out of it.
> >>> Given the creativity of manufacturers when implementing hid protocols,
> >>> I find it certain that they do use the in endpoint even without input
> >>> reports. E.g., for feature reports. This could cause regressions.
>
> The ASUS ROG N-Key Device does have feature reports. They are used for
> RGB control on the keyboard. I have confirmed this with a test by not
> registering the hidraw node for this interface at all and noted that RGB
> stops working after. So hiding or ejecting this interface is not an
> option. Therefore, after this patch, I myself, together with Kerim and
> GameBurrow have paid attention explicitly to ensure there are no
> regressions to the LED controls, while fixing the keyboard issue.
>
> Also worth noting that feature reports travel over EP0 via
> usbhid_{get,set}_raw_report() in both directions. The interrupt IN
> endpoint is only ever used to receive input reports: hid_irq_in() passes
> everything it gets to hid_safe_input_report(HID_INPUT_REPORT, ...).
> There is no code in usbhid that reads feature reports from the interrupt
> IN endpoint at all, so skipping that poll cannot break any feature
> reports on any device. This is also mentioned in my patch description:
>
>         "Feature reports and hidraw output keep working over the control and
> OUT endpoints, so the interface is otherwise unaffected."
>
> Regarding a manufacturer using in endpoint without an input report, even
> today, the HID core would drop that data before it reaches hidraw:
> __hid_input_report() bails when hid_get_report() finds no matching
> report. That bail is also before the driver's ->raw_event() callback, so
> no driver or hidraw reader can currently be relying on such traffic.

Interesting, so it should noop. Muting the in endpoint would not
affect feature reports that get sent over the in endpoint? I do not
think this patch will cause regressions for Asus devices. I'm more
concerned with other ones. E.g., the Legion Go S has a malformed
report, and I do not recall which endpoints it uses. Then, the Win 5
also does a mix. Those are two devices I'd be concerned with, but
there are a myriad of other hid devices this could affect.

If possible, I'd rather the fix addressed the underlying issue that
blocks processing of input from other devices. That way the problem
also goes away for devices with an actual but infrequent input report,
even where the blocking would not have been perceptible.

I can reproduce on my Z13 in the following days.

Best,
Antheas

> >> While I mostly agree with this it is also true that the general direction
> >> for the kernel (especially lately) has been to not do out-of-spec things
> >> at least by default.
> >>
> >> If things really regress it's expected to do so only on very few specific
> >> devices with a buggy firmware, and we can think of something different
> >> for those (hopefully very few ones).
> >>
> >> Perhaps someone concerned with security might be interested in what
> >> we have because it doesn't look very normal.
> >>
> >> Note that below I have written a few ideas that maybe are worth
> >
> > The degradation would be silent.
> >
> >> looking into.
> >>>> The ASUS ROG N-Key keyboards expose a second, input-less interface used
> >>>> only for RGB control via feature reports. Opening its hidraw node (any
> >>>> hidraw reader does, including SDL/Steam Input or a plain cat) starts the
> >>> cating a hidraw causing issues would be expected, so let's focus on the former.
> >
> > Try to add spaces before and after your responses
> >
> >> Simply opening a hidraw node should not trigger a delayed disconnect of that
> >> device. I don't know why you would expect this to happen nor why you would
> >> consider it acceptable. It's a bug.
> >>
> >> Focusing on userspace software exposing the bug is not a realistic option
> >> because over time we have found a good chunk of software doing that:
> >> - logitech control software (forgot the name)
> >> - open razer software
> >> - sdl
> >> - asusctl (obviously it opens the device albeit in the future I will change this)
> >>
> >> and likely more given the fact not all software was identified.
> >>> Asusctl has a bug where if you add the quirk that separates the event
> >>> nodes per hid, this bug is reproduced as well. I chalked it up to
> >>> complicated threading getting out of control. It is the reason we
> >>> skipped that patch that was in my series.
> >> I found and solved the bug already. Regardless the issue remains:
> >> Even with no asusctl at all, if a user has one logitech mouse
> >> (and its control software) and a razer keyboard (and its control software)
> >> the asus N-Key device will start an endless disconnect-reconnect loop.
> >>
> >> Any combination of two or more of those tools will trigger the issue
> >> on some devices (weirdly enough not every model is affected):
> >>
> >> this is not good.
> >>> Now, you say SDL/Steam do a spurious read as well, can you identify
> >>> the codepath so we can look into it? What devices are affected? The
> >>> early return fixes a warning on the Z13, but it also feeds through the
> >>> universal lamp interface on the new Xbox Allies. Is this a bug on
> >>> those devices or keyboards? If yes, it could be caused by userspace
> >>> hanging on that node
>
> Affected devices include the ROG STRIX 2025 lineup: Scar 16/18
> (G635L/G835L) and G16/G18 (G615L/G815L). My patch has been tested on
> both Scar 18 and G18. Additionally a user with a Scar 18 2024 model
> (G634JZR) has reported the issue as well; they were unable to
> participate in testing but reproduced the issue with the same cat command
> (reproduction command provided below). It is likely the G16/G18 of 2024
> will also be affected. Models prior to 2024 appear unaffected so far.
>
> A user with an Xbox Ally X has also tested this for me as of writing
> this email. So we are able to confirm that this device is unaffected and
> no regressions are noticed on that device from my patch, including the
> lamp/RGB controls.
>
> I do not have access to a Z13 at the moment. If you have one, it would
> be very helpful for me if you could test for any regressions on that
> device and if the device is affected by the bug, and whether or not this
> patch fixes the issue.
>
> I would also like to take this opportunity to mention that the 3 testers
> and I are all daily driving a kernel with this patch applied, and over
> the last few days, have noticed no issues with any devices.
>
> >> Sure, and I agree with you that fixing all userspace tools is desirable,
> >> but it's also infeasible to fix them all. Even if we managed to do that,
> >> it would be years before everyone received a fixed version of every
> >> affected program, and even then a core issue would remain:
> >> Linux tries to poll something it can never get any data out of.
> >>
> >> I am much more oriented on the fact that kernel shouldn't
> >> be doing weird things (at least not by default) so this has to
> >> somehow be stopped regardless of how well userspace behaves.
> >
> > The kernel is not doing weird things and I also did not ask you to fix
> > all userspace software. I asked for a reproduction scenario, as it is
> > not covered in the patch description. Relooking at the patch today, I
> > also do not understand what it does fully.
>
> The reproduction scenario is in the patch description:
>
>         "Opening its hidraw node (any hidraw reader does, including SDL/Steam
> Input or a plain cat) starts the pointless IN poll and keypress reports
> on the keyboard interface get dropped for as long as the node stays
> open: a lost key-down drops a letter, a lost key-up leaves the key stuck."
>
> i.e. run "sudo timeout 15 cat /dev/hidrawX" against the N-Key RGB
> interface, then type on the internal keyboard.
>
> >
> > It skips enabling input interrupts (but not only that) for devices
> > that have no input reports. So the kernel behavior will depend on the
> > feature descriptor moving forward.
>
> What the patch does is the last paragraph of the description:
>
>         "Skip the poll in usbhid_open() when the device has no input reports."
>
> Interrupt IN endpoint on a device with 0 input reports isn't doing
> anything anyway. The other things the early return skips only matter
> when input is possible.
>
> >
> > And that fixes a hang on the affected devices because enabling
> > interrupts on an endpoint without periodic input reports blocks a
> > parallel endpoint that does have input reports?
> >
> > I would like this fix to target the actual cause that causes the block
> > but it is not clear to me what that is or what is affected.
>
> As per my investigations with usbmon, I can see that the keyboard
> interface's input reports never reach the URB layer while the RGB
> interface is being polled. From the patch description:
>
>         "usbmon shows the dropped reports never reach the URB layer"
>
> So the blocking likely happens inside the device's firmware, and not in
> the kernel, so the kernel cannot fix that part. What the kernel can do
> is to stop arming the IN URB on an endpoint that as per its own
> descriptor, can never produce data.
>
> >
> > Antheas
> >
> >> If you have better ideas on how to fix the kernel we would
> >> like to hear those as well.
> >>
> >> Best regards,
> >> Denis
> >>> Antheas
> >>>
> >>>> pointless IN poll and keypress reports on the keyboard interface get
> >>>> dropped for as long as the node stays open: a lost key-down drops a
> >>>> letter, a lost key-up leaves the key stuck. usbmon shows the dropped
> >>>> reports never reach the URB layer.
> >>>>
> >>>> The useless poll itself is long-standing; commit 4ac74ea68f64 ("HID:
> >>>> asus: early return for ROG devices") is what exposes it on these
> >>>> devices by keeping the input-less interface alive instead of ejecting
> >>>> it, so its hidraw node can be opened and the poll started.
> >>>>
> >>>> Skip the poll in usbhid_open() when the device has no input reports.
> >>>> Feature reports and hidraw output keep working over the control and OUT
> >>>> endpoints, so the interface is otherwise unaffected.
> >> I will write my review here to avoid forking the discussion:
> >>
> >> I agree with the general idea but perhaps we can avoid
> >> some hid devices to ever get HID_QUIRK_ALWAYS_POLL
> >> and that might be enough to skip the problematic code?
> >>
> >> Maybe there is value in doing this with a quirk flag in hid-asus.c
> >> affecting the least amount of devices?
> >>
> >> Or maybe just prevent devices with no data possibly coming out
> >> to ever get HID_QUIRK_ALWAYS_POLL?
>
> Thank you for the review!
>
> I would like to also highlight one thing here; the HID_QUIRK_ALWAYS_POLL
> is not given to this specific device. It was already in the if
> condition, for the devices that do use it; my change only ORs a second
> independent condition into it. So keeping devices away from that quirk
> would not change anything here.
>
> Adding a quirk flag for this specific device is something I too have
> considered and will be happy to change it like so if Jiri or Benjamin
> feel it is more appropriate. My reasoning for taking the current route
> is that it would prevent any hidden issues that might arise similarly,
> and fix the whole class of this issue rather than for one vendor when
> the likelihood of a regression is very low from skipping interrupt IN
> polling if a device doesn't have input reports in the first place.
>
> Best Regards,
> Yaseen
>
> >>
> >> For how to best do this we will need to hear what Jiri and
> >> Benjamin have to say but if they think the proposed solution
> >> is the correct solution:
> >>
> >> Reviewed-by: Denis Benato <denis.benato@linux.dev>
> >>>> Fixes: 4ac74ea68f64 ("HID: asus: early return for ROG devices")
> >>>> Tested-by: Kerim Kabirov <the.privat33r+linux@pm.me>
> >>>> Tested-by: GameBurrow <gameburrow@pm.me>
> >>>> Signed-off-by: Ahmed Yaseen <yaseen@ghoul.dev>
> >>>> ---
> >>>>   drivers/hid/usbhid/hid-core.c | 3 ++-
> >>>>   1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
> >>>> index 96b0181cf819..90a8b34d9305 100644
> >>>> --- a/drivers/hid/usbhid/hid-core.c
> >>>> +++ b/drivers/hid/usbhid/hid-core.c
> >>>> @@ -688,7 +688,8 @@ static int usbhid_open(struct hid_device *hid)
> >>>>
> >>>>          set_bit(HID_OPENED, &usbhid->iofl);
> >>>>
> >>>> -       if (hid->quirks & HID_QUIRK_ALWAYS_POLL) {
> >>>> +       if ((hid->quirks & HID_QUIRK_ALWAYS_POLL) ||
> >>>> +           list_empty(&hid->report_enum[HID_INPUT_REPORT].report_list)) {
> >>>>                  res = 0;
> >>>>                  goto Done;
> >>>>          }
> >>>> --
> >>>> 2.54.0
> >>>>
> >>>>
> >>>>
>
>
>



* Re: [PATCH] HID: usbhid: skip interrupt IN polling for devices with no input reports
From: Yaseen @ 2026-06-07 16:51 UTC (permalink / raw)
  To: Antheas Kapenekakis, Denis Benato
  Cc: Jiri Kosina, Benjamin Tissoires, Ilpo Järvinen,
	Kerim Kabirov, GameBurrow, linux-usb, linux-input, linux-kernel
In-Reply-To: <CAGwozwFMm2o2G-fOixvu+QVXGsDC+E81+1Nsk4kQ7xbpnvhVPg@mail.gmail.com>

On 06/06/2026 18:13, Antheas Kapenekakis wrote:
> On Sat, 6 Jun 2026 at 14:42, Denis Benato <benato.denis96@gmail.com> wrote:
>>
>>
>> On 6/5/26 14:02, Antheas Kapenekakis wrote:
>>> On Fri, 5 Jun 2026 at 13:40, Ahmed Yaseen <yaseen@ghoul.dev> wrote:
>>>> usbhid starts polling a device's interrupt IN endpoint on open
>>>> (usbhid_open() -> hid_start_in()). If the report descriptor declares no
>>>> input reports there is nothing to read there, so the poll is useless,
>>>> and on some composite devices it is also harmful.
>>> If it did have input reports, would starting the polling still cause
>>> issues? Because if it would, the issue is in the polling itself.
>> So far we haven't found an asus device that has more than one interface
>> that supports reading data out of it.
>>> Given the creativity of manufacturers when implementing hid protocols,
>>> I find it certain that they do use the in endpoint even without input
>>> reports. E.g., for feature reports. This could cause regressions.

The ASUS ROG N-Key device does have feature reports. They are used for
RGB control on the keyboard. I confirmed this by not registering the
hidraw node for this interface at all: RGB control then stops working,
so hiding or ejecting this interface is not an option. With this patch
applied, Kerim, GameBurrow, and I have therefore explicitly checked
that the LED controls do not regress while the keyboard issue is fixed.

It is also worth noting that feature reports travel over EP0 via
usbhid_{get,set}_raw_report() in both directions. The interrupt IN
endpoint is only ever used to receive input reports: hid_irq_in() passes
everything it gets to hid_safe_input_report(HID_INPUT_REPORT, ...).
There is no code in usbhid that reads feature reports from the interrupt
IN endpoint, so skipping that poll cannot break feature reports on any
device. This is also mentioned in my patch description:

	"Feature reports and hidraw output keep working over the control and 
OUT endpoints, so the interface is otherwise unaffected."

Regarding a manufacturer using the IN endpoint without an input report:
even today, the HID core would drop that data before it reaches hidraw.
__hid_input_report() bails when hid_get_report() finds no matching
report. That bail also comes before the driver's ->raw_event() callback,
so no driver or hidraw reader can currently be relying on such traffic.
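
To illustrate the ordering described above, here is a minimal userspace
sketch. It is not kernel code: struct sketch_dev and the sketch_*
functions are hypothetical names, and it assumes numbered reports where
the first byte of the data is the report ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device: only the IDs of its declared input reports. */
struct sketch_dev {
	const uint8_t *input_ids;
	size_t num_input_ids;
	unsigned int raw_events;	/* how many times the "driver callback" ran */
};

static bool sketch_has_input_report(const struct sketch_dev *dev, uint8_t id)
{
	for (size_t i = 0; i < dev->num_input_ids; i++)
		if (dev->input_ids[i] == id)
			return true;
	return false;
}

/*
 * Models the ordering only: the report lookup comes first, and data with
 * no matching input report is dropped before any callback runs.
 * Returns 0 if delivered, -1 if dropped.
 */
static int sketch_input_report(struct sketch_dev *dev, const uint8_t *data,
			       size_t len)
{
	assert(dev != NULL);
	if (len == 0 || !sketch_has_input_report(dev, data[0]))
		return -1;
	dev->raw_events++;	/* stand-in for ->raw_event() and hidraw delivery */
	return 0;
}
```

With an empty input_ids list, which is how this sketch models the RGB
interface, every packet takes the -1 path and raw_events stays at zero.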

>> While I mostly agree with this it is also true that the general direction
>> for the kernel (especially lately) has been to not do out-of-spec things
>> at least by default.
>>
>> If things really regress it's expected to do so only on very few specific
>> devices with a buggy firmware, and we can think of something different
>> for those (hopefully very few ones).
>>
>> Perhaps someone concerned with security might be interested in what
>> we have because it doesn't look very normal.
>>
>> Note that below I have written a few ideas that maybe are worth
> 
> The degradation would be silent.
> 
>> looking into.
>>>> The ASUS ROG N-Key keyboards expose a second, input-less interface used
>>>> only for RGB control via feature reports. Opening its hidraw node (any
>>>> hidraw reader does, including SDL/Steam Input or a plain cat) starts the
>>> cating a hidraw causing issues would be expected, so let's focus on the former.
> 
> Try to add spaces before and after your responses
> 
>> Simply opening an hidraw should not trigger a delayed disconnect of that device,
>> I don't know why you would expect this to happen nor why you would
>> consider it acceptable. It's a bug.
>>
>> Focusing on userspace software exposing the bug is not a realistic option
>> because over the time we found a good chunk of software doing that:
>> - logitech control software (forgot the name)
>> - open razer software
>> - sdl
>> - asusctl (obviously it opens the device albeit in the future I will change this)
>>
>> and likely more given the fact not all software was identified.
>>> Asusctl has a bug where if you add the quirk that separates the event
>>> nodes per hid, this bug is reproduced as well. I chucked it to
>>> complicated threading getting out of control. It is the reason we
>>> skipped that patch that was in my series.
>> I found and solved the bug already. Regardless the issue remains:
>> Even with no asusctl at all, if a user has one logitech mouse
>> (and its control software) and a razer keyboard (and its control software)
>> the asus N-Key device will start an endless disconnect-reconnect loop.
>>
>> Any combination of two or more of those tools will trigger the issue
>> on some devices (weirdly enough not every model is affected):
>>
>> this is not good.
>>> Now, you say SDL/Steam do a spurious read as well, can you identify
>>> the codepath so we can look into it? What devices are affected? The
>>> early return fixes a warning on the Z13, but it also feeds through the
>>> universal lamp interface on the new Xbox Allies. Is this a bug on
>>> those devices or keyboards? If yes, it could be caused by userspace
>>> hanging on that node

Affected devices include the ROG STRIX 2025 lineup: Scar 16/18
(G635L/G835L) and G16/G18 (G615L/G815L). My patch has been tested on
both the Scar 18 and the G18. Additionally, a user with a Scar 18 2024
model (G634JZR) has reported the issue as well; they were unable to
participate in testing but reproduced the issue with the same cat
command (reproduction command provided below). It is likely that the
2024 G16/G18 are also affected. Models prior to 2024 appear unaffected
so far.

As of writing this email, a user with an Xbox Ally X has tested this
for me as well. We can therefore confirm that this device is unaffected
by the bug and that no regressions were noticed on it with my patch,
including the lamp/RGB controls.

I do not have access to a Z13 at the moment. If you have one, it would
be very helpful if you could check whether the device is affected by the
bug, whether this patch fixes the issue, and whether it causes any
regressions on that device.

I would also like to take this opportunity to mention that the 3 testers 
and I are all daily driving a kernel with this patch applied, and over 
the last few days, have noticed no issues with any devices.

>> Sure, and I agree with you that fixing all userspace tools is desirable
>> but it's also unfeasible to fix them all, if we managed to do that
>> there will be years before everyone receives a fixed version of every
>> affected software and even then a core issue would remain:
>> linux tries to poll something it can't have anything out from.
>>
>> I am much more oriented on the fact that kernel shouldn't
>> be doing weird things (at least not by default) so this has to
>> somehow be stopped regardless of how well userspace behaves.
> 
> The kernel is not doing weird things and I also did not ask you to fix
> all userspace software. I asked for a reproduction scenario, as it is
> not covered in the patch description. Relooking at the patch today, I
> also do not understand what it does fully.

The reproduction scenario is in the patch description:

	"Opening its hidraw node (any hidraw reader does, including SDL/Steam 
Input or a plain cat) starts the pointless IN poll and keypress reports 
on the keyboard interface get dropped for as long as the node stays 
open: a lost key-down drops a letter, a lost key-up leaves the key stuck."

i.e. run "sudo timeout 15 cat /dev/hidrawX" against the N-Key RGB 
interface, then type on the internal keyboard.

> 
> It skips enabling input interrupts (but not only that) for devices
> that have no input reports. So the kernel behavior will depend on the
> feature descriptor moving forward.

What the patch does is the last paragraph of the description:

	"Skip the poll in usbhid_open() when the device has no input reports."

The interrupt IN endpoint on a device with no input reports isn't doing
anything anyway. The other things the early return skips only matter
when input is possible.
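
As a rough userspace model of that decision (a sketch under
assumptions, not the kernel code): struct sketch_hid, its fields, and
the flag value are hypothetical, and a plain counter stands in for the
list_empty() test on the HID_INPUT_REPORT report_list used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag value for this sketch; not taken from the kernel headers. */
#define SKETCH_QUIRK_ALWAYS_POLL (1u << 0)

/* Hypothetical, simplified stand-in for the state usbhid_open() looks at. */
struct sketch_hid {
	unsigned int quirks;
	size_t num_input_reports;	/* models the input report list */
};

/*
 * Returns true when open() should go on to arm the interrupt IN URB.
 * Both early-return cases from the patched condition yield false: the
 * always-poll quirk is set, or the device declares no input reports.
 */
static bool sketch_open_starts_in_poll(const struct sketch_hid *hid)
{
	assert(hid != NULL);
	if ((hid->quirks & SKETCH_QUIRK_ALWAYS_POLL) ||
	    hid->num_input_reports == 0)
		return false;
	return true;
}
```

In this model a keyboard interface with input reports still starts the
poll on open, while an input-less RGB interface never does.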

> 
> And that fixes a hang on the affected devices because enabling
> interrupts on an endpoint without periodic input reports blocks a
> parallel endpoint that does have input reports?
> 
> I would like this fix to target the actual cause that causes the block
> but it is not clear to me what that is or what is affected.

As per my investigation with usbmon, the keyboard interface's input
reports never reach the URB layer while the RGB interface is being
polled. From the patch description:

	"usbmon shows the dropped reports never reach the URB layer"

The blocking therefore likely happens inside the device's firmware, not
in the kernel, so the kernel cannot fix that part. What the kernel can
do is stop arming the IN URB on an endpoint that, as per its own
descriptor, can never produce data.

> 
> Antheas
> 
>> If you have better ideas on how to fix the kernel we would
>> like to hear those as well.
>>
>> Best regards,
>> Denis
>>> Antheas
>>>
>>>> pointless IN poll and keypress reports on the keyboard interface get
>>>> dropped for as long as the node stays open: a lost key-down drops a
>>>> letter, a lost key-up leaves the key stuck. usbmon shows the dropped
>>>> reports never reach the URB layer.
>>>>
>>>> The useless poll itself is long-standing; commit 4ac74ea68f64 ("HID:
>>>> asus: early return for ROG devices") is what exposes it on these
>>>> devices by keeping the input-less interface alive instead of ejecting
>>>> it, so its hidraw node can be opened and the poll started.
>>>>
>>>> Skip the poll in usbhid_open() when the device has no input reports.
>>>> Feature reports and hidraw output keep working over the control and OUT
>>>> endpoints, so the interface is otherwise unaffected.
>> I will write my review here to avoid forking the discussion:
>>
>> I agree with the general idea but perhaps we can avoid
>> some hid devices to ever get HID_QUIRK_ALWAYS_POLL
>> and that might be enough to skip the problematic code?
>>
>> Maybe there is value in doing this with a quirk flag in hid-asus.c
>> affecting the least amount of devices?
>>
>> Or maybe just prevent devices with no data possibly coming out
>> to ever get HID_QUIRK_ALWAYS_POLL?

Thank you for the review!

I would also like to highlight one thing here: HID_QUIRK_ALWAYS_POLL
is not set for this specific device. It was already in the if
condition for the devices that do use it; my change only ORs a second,
independent condition into it. So keeping devices away from that quirk
would not change anything here.

Adding a quirk flag for this specific device is something I have
considered too, and I will be happy to change the patch that way if Jiri
or Benjamin feel it is more appropriate. My reasoning for taking the
current route is that it prevents similar hidden issues on other devices
and fixes the whole class of problem rather than one vendor's device,
while the likelihood of a regression from skipping interrupt IN polling
on a device that has no input reports in the first place is very low.

Best Regards,
Yaseen

>>
>> For how to best do this we will need to hear what Jiri and
>> Benjamin have to say but if they think the proposed solution
>> is the correct solution:
>>
>> Reviewed-by: Denis Benato <denis.benato@linux.dev>
>>>> Fixes: 4ac74ea68f64 ("HID: asus: early return for ROG devices")
>>>> Tested-by: Kerim Kabirov <the.privat33r+linux@pm.me>
>>>> Tested-by: GameBurrow <gameburrow@pm.me>
>>>> Signed-off-by: Ahmed Yaseen <yaseen@ghoul.dev>
>>>> ---
>>>>   drivers/hid/usbhid/hid-core.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
>>>> index 96b0181cf819..90a8b34d9305 100644
>>>> --- a/drivers/hid/usbhid/hid-core.c
>>>> +++ b/drivers/hid/usbhid/hid-core.c
>>>> @@ -688,7 +688,8 @@ static int usbhid_open(struct hid_device *hid)
>>>>
>>>>          set_bit(HID_OPENED, &usbhid->iofl);
>>>>
>>>> -       if (hid->quirks & HID_QUIRK_ALWAYS_POLL) {
>>>> +       if ((hid->quirks & HID_QUIRK_ALWAYS_POLL) ||
>>>> +           list_empty(&hid->report_enum[HID_INPUT_REPORT].report_list)) {
>>>>                  res = 0;
>>>>                  goto Done;
>>>>          }
>>>> --
>>>> 2.54.0
>>>>
>>>>
>>>>


* [PATCH] usb: ucsi: huawei_gaokun: support mode switching
From: Pengyu Luo @ 2026-06-07 10:18 UTC (permalink / raw)
  To: Pengyu Luo, Heikki Krogerus, Greg Kroah-Hartman; +Cc: linux-usb, linux-kernel

The USB PHY (QMP Combo PHY) is always initialized in USB3+DP mode.
Previously the PHY had no MUX, and none was needed, since MSM only
supported 2-lane DP. Now that MST and 4-lane DP support have been added
to MSM and a MUX has been added to the PHY, get the MUX and set it to
support 4-lane DP and mode switching on gaokun.

Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
---
 drivers/usb/typec/ucsi/ucsi_huawei_gaokun.c | 55 +++++++++++++++------
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/drivers/usb/typec/ucsi/ucsi_huawei_gaokun.c b/drivers/usb/typec/ucsi/ucsi_huawei_gaokun.c
index c5965656baba..95b7b77b726d 100644
--- a/drivers/usb/typec/ucsi/ucsi_huawei_gaokun.c
+++ b/drivers/usb/typec/ucsi/ucsi_huawei_gaokun.c
@@ -18,6 +18,7 @@
 #include <linux/usb/pd_vdo.h>
 #include <linux/usb/typec_altmode.h>
 #include <linux/usb/typec_dp.h>
+#include <linux/usb/typec_mux.h>
 #include <linux/workqueue_types.h>
 
 #include "ucsi.h"
@@ -82,6 +83,8 @@ struct gaokun_ucsi_port {
 	struct gaokun_ucsi *ucsi;
 	struct auxiliary_device *bridge;
 
+	struct typec_mux *typec_mux;
+
 	int idx;
 	enum gaokun_ucsi_ccx ccx;
 	enum gaokun_ucsi_mux mux;
@@ -226,19 +229,18 @@ static void gaokun_ucsi_port_update(struct gaokun_ucsi_port *port,
 	port->hpd_state = FIELD_GET(GAOKUN_HPD_STATE_MASK, ddi);
 	port->hpd_irq = FIELD_GET(GAOKUN_HPD_IRQ_MASK, ddi);
 
-	/* Mode and SVID are unused; keeping them to make things clearer */
 	switch (port->mode) {
 	case USBC_DPAM_PAN_C:
 	case USBC_DPAM_PAN_C_REVERSE:
-		port->mode = DP_PIN_ASSIGN_C; /* correct it for usb later */
+		port->mode = TYPEC_DP_STATE_C; /* correct it for usb later */
 		break;
 	case USBC_DPAM_PAN_D:
 	case USBC_DPAM_PAN_D_REVERSE:
-		port->mode = DP_PIN_ASSIGN_D;
+		port->mode = TYPEC_DP_STATE_D;
 		break;
 	case USBC_DPAM_PAN_E:
 	case USBC_DPAM_PAN_E_REVERSE:
-		port->mode = DP_PIN_ASSIGN_E;
+		port->mode = TYPEC_DP_STATE_E;
 		break;
 	case USBC_DPAM_PAN_NONE:
 		port->mode = TYPEC_STATE_SAFE;
@@ -287,18 +289,32 @@ static int gaokun_ucsi_refresh(struct gaokun_ucsi *uec)
 	return idx;
 }
 
-static void gaokun_ucsi_handle_altmode(struct gaokun_ucsi_port *port)
+static void gaokun_ucsi_handle_usb_mode(struct gaokun_ucsi_port *port)
 {
 	struct gaokun_ucsi *uec = port->ucsi;
-	int idx = port->idx;
-
-	if (idx >= uec->ucsi->cap.num_connectors) {
+	struct typec_mux_state state = {};
+	struct typec_altmode dp_alt = {};
+	int idx = port->idx, ret;
+
+	/*
+	 * For every typec port on this platform, the only mode-switch is
+	 * controlled by its qmp combo phy which consumes svid and mode only.
+	 */
+	dp_alt.svid = port->svid;
+	state.mode = port->mode;
+	state.alt = &dp_alt;
+
+	if (idx >= uec->num_ports) {
 		dev_warn(uec->dev, "altmode port out of range: %d\n", idx);
 		return;
 	}
 
+	ret = typec_mux_set(port->typec_mux, &state);
+	if (ret)
+		dev_err(uec->dev, "failed to set mux %d\n", ret);
+
 	/* UCSI callback .connector_status() have set orientation */
-	if (port->bridge)
+	if (port->bridge && port->svid == USB_TYPEC_DP_SID)
 		drm_aux_hpd_bridge_notify(&port->bridge->dev,
 					  port->hpd_state ?
 					  connector_status_connected :
@@ -307,7 +323,7 @@ static void gaokun_ucsi_handle_altmode(struct gaokun_ucsi_port *port)
 	gaokun_ec_ucsi_pan_ack(uec->ec, port->idx);
 }
 
-static void gaokun_ucsi_altmode_notify_ind(struct gaokun_ucsi *uec)
+static void gaokun_ucsi_usb_notify_ind(struct gaokun_ucsi *uec)
 {
 	int idx;
 
@@ -320,7 +336,7 @@ static void gaokun_ucsi_altmode_notify_ind(struct gaokun_ucsi *uec)
 	if (idx == GAOKUN_UCSI_NO_PORT_UPDATE)
 		gaokun_ec_ucsi_pan_ack(uec->ec, idx); /* ack directly if no update */
 	else
-		gaokun_ucsi_handle_altmode(&uec->ports[idx]);
+		gaokun_ucsi_handle_usb_mode(&uec->ports[idx]);
 }
 
 /*
@@ -352,7 +368,7 @@ static void gaokun_ucsi_handle_no_usb_event(struct gaokun_ucsi *uec, int idx)
 	port = &uec->ports[idx];
 	if (!wait_for_completion_timeout(&port->usb_ack, 2 * HZ)) {
 		dev_warn(uec->dev, "No USB EVENT, triggered by UCSI EVENT");
-		gaokun_ucsi_altmode_notify_ind(uec);
+		gaokun_ucsi_usb_notify_ind(uec);
 	}
 }
 
@@ -366,7 +382,7 @@ static int gaokun_ucsi_notify(struct notifier_block *nb,
 	switch (action) {
 	case EC_EVENT_USB:
 		gaokun_ucsi_complete_usb_ack(uec);
-		gaokun_ucsi_altmode_notify_ind(uec);
+		gaokun_ucsi_usb_notify_ind(uec);
 		return NOTIFY_OK;
 
 	case EC_EVENT_UCSI:
@@ -429,8 +445,15 @@ static int gaokun_ucsi_ports_init(struct gaokun_ucsi *uec)
 			fwnode_handle_put(fwnode);
 			return PTR_ERR(ucsi_port->bridge);
 		}
-	}
 
+		ucsi_port->typec_mux = fwnode_typec_mux_get(fwnode);
+		if (IS_ERR(ucsi_port->typec_mux)) {
+			fwnode_handle_put(fwnode);
+			return dev_err_probe(dev, PTR_ERR(ucsi_port->typec_mux),
+					     "failed to acquire mode-switch for port: %d\n",
+					     port);
+		}
+	}
 	for (i = 0; i < num_ports; i++) {
 		if (!uec->ports[i].bridge)
 			continue;
@@ -502,10 +525,14 @@ static int gaokun_ucsi_probe(struct auxiliary_device *adev,
 static void gaokun_ucsi_remove(struct auxiliary_device *adev)
 {
 	struct gaokun_ucsi *uec = auxiliary_get_drvdata(adev);
+	int i;
 
 	disable_delayed_work_sync(&uec->work);
 	gaokun_ec_unregister_notify(uec->ec, &uec->nb);
 	ucsi_unregister(uec->ucsi);
+	for (i = 0; i < uec->num_ports; ++i)
+		typec_mux_put(uec->ports[i].typec_mux);
+
 	ucsi_destroy(uec->ucsi);
 }
 
-- 
2.54.0



* [PATCH] USB: serial: kl5kusb105: fix bulk-out buffer overflow
From: HyeongJun An @ 2026-06-07  9:51 UTC (permalink / raw)
  To: Johan Hovold, Greg Kroah-Hartman
  Cc: linux-usb, linux-kernel, stable, HyeongJun An

klsi_105_prepare_write_buffer() is called by the generic write path
with the bulk-out buffer and its size (bulk_out_size, 64 bytes). It
stores a two-byte length header at the start of the buffer and copies
the payload from the write fifo starting at buf + KLSI_HDR_LEN, but
passes the full buffer size as the number of bytes to copy:

  count = kfifo_out_locked(&port->write_fifo, buf + KLSI_HDR_LEN,
                           size, &port->lock);

When the fifo holds at least size bytes, size bytes are copied starting
two bytes into the size-byte buffer, writing KLSI_HDR_LEN bytes past its
end. Copy at most size - KLSI_HDR_LEN bytes instead, leaving room for
the header as safe_serial already does.
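
The arithmetic can be sketched in plain userspace C. This is an
illustration under assumptions, not the driver code:
sketch_prepare_write_buffer() and SKETCH_HDR_LEN are hypothetical names,
and a flat source array stands in for the write fifo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_HDR_LEN 2	/* models the two-byte length header */

/*
 * Fill dest (capacity: size bytes) with a little-endian 16-bit length
 * header followed by at most size - SKETCH_HDR_LEN payload bytes from src.
 * Returns the number of bytes written to dest, which never exceeds size.
 */
static size_t sketch_prepare_write_buffer(unsigned char *dest, size_t size,
					  const unsigned char *src, size_t avail)
{
	size_t room, count;

	assert(size >= SKETCH_HDR_LEN);
	room = size - SKETCH_HDR_LEN;		/* space left after the header */
	count = avail < room ? avail : room;	/* never copy more than fits */

	memcpy(dest + SKETCH_HDR_LEN, src, count);
	dest[0] = count & 0xff;
	dest[1] = (count >> 8) & 0xff;

	return count + SKETCH_HDR_LEN;
}
```

With size = 64 and 100 bytes pending, the sketch copies 62 payload bytes
and returns 64. Passing size itself as the copy limit would instead
touch offsets 64 and 65, two bytes past the end of the buffer.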

Writing bulk_out_size or more bytes to the tty triggers a slab
out-of-bounds write, observed with KASAN by emulating the device with
dummy_hcd and raw-gadget:

  BUG: KASAN: slab-out-of-bounds in kfifo_copy_out+0x83/0xc0
  Write of size 64 at addr ffff888112c62202 by task python3
   kfifo_copy_out
   klsi_105_prepare_write_buffer [kl5kusb105]
   usb_serial_generic_write_start [usbserial]
  Allocated by task 139:
   usb_serial_probe [usbserial]
  The buggy address is located 2 bytes inside of allocated 64-byte region

The out-of-bounds write no longer occurs with this change applied.

Fixes: 60b3013cdaf3 ("USB: kl5usb105: reimplement using generic framework")
Cc: stable@vger.kernel.org
Signed-off-by: HyeongJun An <sammiee5311@gmail.com>
---
 drivers/usb/serial/kl5kusb105.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/serial/kl5kusb105.c b/drivers/usb/serial/kl5kusb105.c
index ed8531a64768..e72a0b45a707 100644
--- a/drivers/usb/serial/kl5kusb105.c
+++ b/drivers/usb/serial/kl5kusb105.c
@@ -330,8 +330,8 @@ static int klsi_105_prepare_write_buffer(struct usb_serial_port *port,
 	unsigned char *buf = dest;
 	int count;
 
-	count = kfifo_out_locked(&port->write_fifo, buf + KLSI_HDR_LEN, size,
-								&port->lock);
+	count = kfifo_out_locked(&port->write_fifo, buf + KLSI_HDR_LEN,
+				 size - KLSI_HDR_LEN, &port->lock);
 	put_unaligned_le16(count, buf);
 
 	return count + KLSI_HDR_LEN;
-- 
2.43.0



* Re: [BUG] KASAN: slab-use-after-free in dev_driver_string from chaoskey_release
From: Alan Stern @ 2026-06-07  2:29 UTC (permalink / raw)
  To: Shuangpeng; +Cc: keithp, gregkh, linux-usb, linux-kernel
In-Reply-To: <20EC9664-054E-438B-B411-2145D347F97B@gmail.com>

On Sat, Jun 06, 2026 at 09:31:30PM -0400, Shuangpeng wrote:
> Hi Kernel Maintainers,
> 
> I hit the following KASAN report while testing current upstream kernel:
> 
> KASAN: slab-use-after-free in dev_driver_string from chaoskey_release
> 
> on commit: e8c2f9fdadee7cbc75134dc463c1e0d856d6e5c7 (May 25 2026)
> 
> The reproducer and .config files are here.
> https://gist.github.com/shuangpengbai/167620d391d9634107bfe4d784fcf52b
> 
> I’m happy to test debug patches or provide additional information.
> 
> Reported-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com>
> 
> 
> [ 2019.816807][T10106] ==================================================================
> [ 2019.819081][T10106] BUG: KASAN: slab-use-after-free in dev_driver_string (drivers/base/core.c:2406)
> [ 2019.820996][T10106] Read of size 8 at addr ffff888168e8a0b8 by task chaoskey_raw_re/10106
> [ 2019.822432][T10106]
> [ 2019.822899][T10106] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 2019.822904][T10106] Call Trace:
> [ 2019.822910][T10106]  <TASK>
> [ 2019.822915][T10106]  dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
> [ 2019.822932][T10106]  print_report (mm/kasan/report.c:378 mm/kasan/report.c:482)
> [ 2019.822984][T10106]  kasan_report (mm/kasan/report.c:595)
> [ 2019.823015][T10106]  dev_driver_string (drivers/base/core.c:2406)
> [ 2019.823021][T10106]  __dynamic_dev_dbg (lib/dynamic_debug.c:906)
> [ 2019.823282][T10106]  chaoskey_release (drivers/usb/misc/chaoskey.c:323)

The simple explanation is that the chaoskey_release() routine contains 
debugging statements that reference an interface for the USB device even 
after that data structure may have been deallocated.  Since they are 
merely debugging statements, the simplest solution to the problem is to 
get rid of them.

That's what the patch below does.  You can try it out and see if it 
works.
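
For readers unfamiliar with this class of bug, here is a minimal
userspace sketch of the ordering rule involved. It is not the chaoskey
code and not what the patch does (the patch simply removes the debug
statements): struct sketch_intf, sketch_put(), and sketch_release() are
hypothetical names for a refcounted object and its release path.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as an interface. */
struct sketch_intf {
	int refs;
	char name[16];
};

static int sketch_frees;	/* counts frees so the behaviour can be observed */

/* Drop one reference; the last put frees the object. */
static void sketch_put(struct sketch_intf *intf)
{
	assert(intf->refs > 0);
	if (--intf->refs == 0) {
		free(intf);
		sketch_frees++;
	}
}

/*
 * Release path: anything that needs intf (here, a log line) runs before
 * the final put. After sketch_put() the pointer is never dereferenced.
 */
static int sketch_release(struct sketch_intf *intf, char *log, size_t len)
{
	snprintf(log, len, "release %s", intf->name);	/* intf still valid */
	sketch_put(intf);				/* may free intf */
	return 0;					/* no use of intf below */
}
```

A log call placed after sketch_put() would read intf->name from freed
memory, which is the same shape as the KASAN report above.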

Alan Stern



Index: usb-devel/drivers/usb/misc/chaoskey.c
===================================================================
--- usb-devel.orig/drivers/usb/misc/chaoskey.c
+++ usb-devel/drivers/usb/misc/chaoskey.c
@@ -294,15 +294,10 @@ static int chaoskey_release(struct inode
 
 	interface = dev->interface;
 
-	usb_dbg(interface, "release");
-
 	mutex_lock(&chaoskey_list_lock);
 	mutex_lock(&dev->lock);
 
-	usb_dbg(interface, "open count at release is %d", dev->open);
-
 	if (dev->open <= 0) {
-		usb_dbg(interface, "invalid open count (%d)", dev->open);
 		rv = -ENODEV;
 		goto bail;
 	}
@@ -320,7 +315,6 @@ bail:
 	mutex_unlock(&dev->lock);
 destruction:
 	mutex_unlock(&chaoskey_list_lock);
-	usb_dbg(interface, "release success");
 	return rv;
 }
 



* Re: USB: Request for guidance investigating configuration descriptor enumeration failure
From: Alan Stern @ 2026-06-07  2:17 UTC (permalink / raw)
  To: Michal Pecio
  Cc: Nikhil Solanke, linux-usb, gregkh, mathias.nyman, sakari.ailus,
	katieeliu, johannes.bruederl, kees, dengjie03, limiao, wse, dev,
	vahnenko2003, cs, lijiayi, oneukum, bence98, eeodqql09
In-Reply-To: <20260604125323.1bcb40d7.michal.pecio@gmail.com>

On Thu, Jun 04, 2026 at 12:53:23PM +0200, Michal Pecio wrote:
> On Wed, 3 Jun 2026 22:02:44 -0400, Alan Stern wrote:
> > I used a bus analyzer to capture what happens when Windows 11 
> > initializes and enumerates a USB-2 flash drive.  The short answer is 
> > that yes, the initial Get-Configuration-Descriptor request is for 255 
> > bytes.
> 
> Could you also try a few BIOSes, UEFIs and such?
> Or anything from Apple?

Here's the information.  It may not clear things up as much as you would 
like.

Samsung UEFI BIOS:
	Get Device Desc		8
	Set Address
	Get Device Desc		18
	Get Config Desc		255
	Set Config

Lenovo UEFI BIOS:
	Set Address
	Get Device Desc		8
	Get Device Desc		8
	Get Device Desc		18
	Get Config Desc		8
	Get Config Desc		32
	Get String Desc 0	2
	Get String Desc 0	4
	Set Config

iPadOS 26.5:
	Set Address
	Get Device Desc		18
	Get String Desc 2	2
	Get String Desc 2	34
	Get String Desc 1	2
	Get String Desc 1	18
	Get String Desc 3	2
	Get String Desc 3	50
	Get Config Desc		9
	Get Config Desc		32
	Set Config

OSX 10.5:
	Get Device Desc		18
	Set Address
	Get Device Desc		18
	Get String Desc 2	2
	Get String Desc 2	34
	Get String Desc 1	2
	Get String Desc 1	18
	Get String Desc 3	2
	Get String Desc 3	50
	Get Config Desc		8
	Get Config Desc		32
	Set Config

The OSX recording was made from a very old MacBook Air, and it wouldn't 
be surprising if OSX has changed in the meantime.  The iPad was fairly 
recent, however, as was the Lenovo.  The Samsung was perhaps 10 years 
old.

BTW, here's the lsusb output for the flash drive I used as a test 
device.  It explains why some of the systems asked for string 
descriptors 1, 2, and 3:

Bus 003 Device 005: ID 0930:6544 Toshiba Corp. TransMemory-Mini / Kingston DataTraveler 2.0 Stick
Couldn't open device, some information will be missing
Negotiated speed: High Speed (480Mbps)
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 [unknown]
  bDeviceSubClass         0 [unknown]
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x0930 Toshiba Corp.
  idProduct          0x6544 TransMemory-Mini / Kingston DataTraveler 2.0 Stick
  bcdDevice            1.00
  iManufacturer           1 Kingston
  iProduct                2 DataTraveler G3 
  iSerial                 3 0013729B6EB8C1107559071E
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0020
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval             255
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval             255
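
The two-step reads in the traces above (8 or 9 bytes, then 32) work
because wTotalLength sits at offsets 2..3 of the configuration
descriptor, so a short first read is enough to size the second one. A
minimal sketch of that parsing step, with hypothetical names
(sketch_config_total_length() and the SKETCH_* constants are not from
any real host stack):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DT_CONFIG	2	/* bDescriptorType of a configuration descriptor */
#define SKETCH_CONFIG_HDR_LEN	9	/* bLength of the configuration descriptor itself */

/*
 * Given the first bytes of a configuration descriptor, return
 * wTotalLength (little-endian, offsets 2..3), or 0 if too few bytes
 * were read or the header does not look like a configuration descriptor.
 */
static uint16_t sketch_config_total_length(const uint8_t *buf, size_t len)
{
	assert(buf != NULL);
	if (len < 4 || buf[0] < SKETCH_CONFIG_HDR_LEN ||
	    buf[1] != SKETCH_DT_CONFIG)
		return 0;
	return (uint16_t)(buf[2] | (buf[3] << 8));
}
```

Fed the nine header bytes from the lsusb output above (9, 2, 0x20, 0x00,
1, 1, 0, 0x80, 50), it yields 32 whether 8 or 9 bytes were read, which
matches the size of the second request in the Lenovo, iPadOS, and OSX
traces. A host that asks for 255 bytes up front skips this step and
relies on the device returning a short transfer.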

I suspect that non-storage devices (such as an Xbox clone) might have 
more stringent requirements, because they don't need to be visible to a 
BIOS.  In general, imitating Windows is almost certainly the best 
approach -- except perhaps for a few devices which are meant to be used 
exclusively with Macs.

Alan Stern


* [BUG] KASAN: slab-use-after-free in dev_driver_string from chaoskey_release
From: Shuangpeng @ 2026-06-07  1:31 UTC (permalink / raw)
  To: keithp, gregkh; +Cc: linux-usb, linux-kernel

Hi Kernel Maintainers,

I hit the following KASAN report while testing current upstream kernel:

KASAN: slab-use-after-free in dev_driver_string from chaoskey_release

on commit: e8c2f9fdadee7cbc75134dc463c1e0d856d6e5c7 (May 25 2026)

The reproducer and .config files are here.
https://gist.github.com/shuangpengbai/167620d391d9634107bfe4d784fcf52b

I’m happy to test debug patches or provide additional information.

Reported-by: Shuangpeng Bai <shuangpeng.kernel@gmail.com>


[ 2019.816807][T10106] ==================================================================
[ 2019.819081][T10106] BUG: KASAN: slab-use-after-free in dev_driver_string (drivers/base/core.c:2406)
[ 2019.820996][T10106] Read of size 8 at addr ffff888168e8a0b8 by task chaoskey_raw_re/10106
[ 2019.822432][T10106]
[ 2019.822899][T10106] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 2019.822904][T10106] Call Trace:
[ 2019.822910][T10106]  <TASK>
[ 2019.822915][T10106]  dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
[ 2019.822932][T10106]  print_report (mm/kasan/report.c:378 mm/kasan/report.c:482)
[ 2019.822984][T10106]  kasan_report (mm/kasan/report.c:595)
[ 2019.823015][T10106]  dev_driver_string (drivers/base/core.c:2406)
[ 2019.823021][T10106]  __dynamic_dev_dbg (lib/dynamic_debug.c:906)
[ 2019.823282][T10106]  chaoskey_release (drivers/usb/misc/chaoskey.c:323)
[ 2019.823290][T10106]  __fput (fs/file_table.c:510)
[ 2019.823298][T10106]  fput_close_sync (fs/file_table.c:615)
[ 2019.823320][T10106]  __x64_sys_close (fs/open.c:1507 fs/open.c:1492 fs/open.c:1492)
[ 2019.823327][T10106]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[ 2019.823337][T10106]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[ 2019.823344][T10106] RIP: 0033:0x7f52411ffc03
[ 2019.823352][T10106] Code: e9 37 ff ff ff e8 2d f9 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 45 c3 0f 1f 40 00 48 83 ec 18 89 7c 24 0c e8
All code
========
   0:	e9 37 ff ff ff       	jmp    0xffffffffffffff3c
   5:	e8 2d f9 01 00       	call   0x1f937
   a:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  11:	00 00 00 
  14:	0f 1f 00             	nopl   (%rax)
  17:	64 8b 04 25 18 00 00 	mov    %fs:0x18,%eax
  1e:	00 
  1f:	85 c0                	test   %eax,%eax
  21:	75 14                	jne    0x37
  23:	b8 03 00 00 00       	mov    $0x3,%eax
  28:	0f 05                	syscall
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 45                	ja     0x77
  32:	c3                   	ret
  33:	0f 1f 40 00          	nopl   0x0(%rax)
  37:	48 83 ec 18          	sub    $0x18,%rsp
  3b:	89 7c 24 0c          	mov    %edi,0xc(%rsp)
  3f:	e8                   	.byte 0xe8

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 45                	ja     0x4d
   8:	c3                   	ret
   9:	0f 1f 40 00          	nopl   0x0(%rax)
   d:	48 83 ec 18          	sub    $0x18,%rsp
  11:	89 7c 24 0c          	mov    %edi,0xc(%rsp)
  15:	e8                   	.byte 0xe8
[ 2019.823358][T10106] RSP: 002b:00007ffd4b423688 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 2019.823382][T10106] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f52411ffc03
[ 2019.823388][T10106] RDX: 0000000000000000 RSI: 00007ffd4b421570 RDI: 0000000000000003
[ 2019.823400][T10106] RBP: 00007ffd4b423af0 R08: 00007f52412a3040 R09: 00007f52412a30c0
[ 2019.823404][T10106] R10: fffffffffffff639 R11: 0000000000000246 R12: 00007ffd4b4238f0
[ 2019.823408][T10106] R13: 0000000000000003 R14: 0000000000000000 R15: 000000000000277c
[ 2019.823417][T10106]  </TASK>
[ 2019.823420][T10106]
[ 2019.842033][T10106] Freed by task 10106 on cpu 0 at 2019.816700s:
[ 2019.842461][T10106]  kasan_save_track (mm/kasan/common.c:57 mm/kasan/common.c:78)
[ 2019.842793][T10106]  kasan_save_free_info (mm/kasan/generic.c:584)
[ 2019.843137][T10106]  __kasan_slab_free (mm/kasan/common.c:253 mm/kasan/common.c:285)
[ 2019.843463][T10106]  kfree (./include/linux/kasan.h:235 mm/slub.c:2689 mm/slub.c:6251 mm/slub.c:6566)
[ 2019.843736][T10106]  device_release (drivers/base/core.c:2562)
[ 2019.844053][T10106]  kobject_put (lib/kobject.c:689 lib/kobject.c:720 ./include/linux/kref.h:65 lib/kobject.c:737)
[ 2019.844368][T10106]  chaoskey_free (drivers/usb/misc/chaoskey.c:103)
[ 2019.844696][T10106]  chaoskey_release (drivers/usb/misc/chaoskey.c:315)
[ 2019.845038][T10106]  __fput (fs/file_table.c:510)
[ 2019.845322][T10106]  fput_close_sync (fs/file_table.c:615)
[ 2019.845651][T10106]  __x64_sys_close (fs/open.c:1507 fs/open.c:1492 fs/open.c:1492)
[ 2019.845981][T10106]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
[ 2019.846302][T10106]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
[ 2019.846708][T10106]
[ 2019.846874][T10106] The buggy address belongs to the object at ffff888168e8a000
[ 2019.846874][T10106]  which belongs to the cache kmalloc-1k of size 1024
[ 2019.847827][T10106] The buggy address is located 184 bytes inside of
[ 2019.847827][T10106]  freed 1024-byte region [ffff888168e8a000, ffff888168e8a400)
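
Reading the two stacks together: chaoskey_release() drops what appears to be
the last reference at chaoskey.c:315 (chaoskey_free -> kobject_put ->
device_release -> kfree), and the debug print at chaoskey.c:323 then reads
the freed device. Below is a minimal user-space sketch of that ordering
problem and one possible way around it, assuming the debug message is the
only access after the free. The names fake_dev, fake_put and
fake_release_fixed are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a refcounted device; not the real struct. */
struct fake_dev {
	int refcount;		/* models the reference held on the device */
	char name[16];
};

static int freed_count;	/* how many objects were released so far */

static struct fake_dev *fake_alloc(const char *name)
{
	struct fake_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;
	snprintf(dev->name, sizeof(dev->name), "%s", name);
	return dev;
}

/* Drop one reference; frees the object when the count reaches zero. */
static void fake_put(struct fake_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		freed_count++;
		free(dev);
	}
}

/*
 * Buggy ordering (kept as a comment because running it is undefined
 * behaviour):
 *
 *	fake_put(dev);				// last reference, dev freed
 *	printf("release: %s\n", dev->name);	// reads freed memory
 *
 * Safe ordering: format the message first, then drop the reference.
 */
static int fake_release_fixed(struct fake_dev *dev, char *log, size_t len)
{
	int n = snprintf(log, len, "release success: %s", dev->name);

	fake_put(dev);	/* may free dev; it must not be touched afterwards */
	return n;
}
```

In the sketch the message is built before the put, so nothing touches the
object once it may have been freed. In the driver the equivalent choice
would be to emit the debug message before calling chaoskey_free(), or to
drop it.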


Best,
Shuangpeng


^ permalink raw reply

* [westeri-thunderbolt:next] BUILD SUCCESS 503c5ae1e72aa9ed91925dafa3d82ee2e992747f
From: kernel test robot @ 2026-06-07  0:31 UTC (permalink / raw)
  To: Mika Westerberg; +Cc: linux-usb

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt.git next
branch HEAD: 503c5ae1e72aa9ed91925dafa3d82ee2e992747f  thunderbolt: debugfs: Fix margining error counter buffer leak

elapsed time: 4031m

configs tested: 191
configs skipped: 4

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha                             allnoconfig    gcc-15.2.0
alpha                            allyesconfig    gcc-15.2.0
alpha                               defconfig    gcc-16.1.0
arc                              allmodconfig    clang-17
arc                               allnoconfig    gcc-15.2.0
arc                                 defconfig    gcc-16.1.0
arc                   randconfig-001-20260606    clang-23
arc                   randconfig-002-20260606    clang-23
arm                               allnoconfig    gcc-15.2.0
arm                              allyesconfig    clang-17
arm                                 defconfig    gcc-16.1.0
arm                   randconfig-001-20260606    clang-23
arm                   randconfig-002-20260606    clang-23
arm                   randconfig-003-20260606    clang-23
arm                   randconfig-004-20260606    clang-23
arm64                             allnoconfig    gcc-15.2.0
arm64                               defconfig    gcc-16.1.0
arm64                          randconfig-001    clang-23
arm64                 randconfig-001-20260606    clang-23
arm64                          randconfig-002    clang-23
arm64                 randconfig-002-20260606    clang-23
arm64                          randconfig-003    clang-23
arm64                 randconfig-003-20260606    clang-23
arm64                          randconfig-004    clang-23
arm64                 randconfig-004-20260606    clang-23
csky                             allmodconfig    gcc-15.2.0
csky                              allnoconfig    gcc-15.2.0
csky                                defconfig    gcc-16.1.0
csky                           randconfig-001    clang-23
csky                  randconfig-001-20260606    clang-23
csky                           randconfig-002    clang-23
csky                  randconfig-002-20260606    clang-23
hexagon                          allmodconfig    gcc-15.2.0
hexagon                           allnoconfig    gcc-15.2.0
hexagon                             defconfig    gcc-16.1.0
hexagon                        randconfig-001    gcc-11.5.0
hexagon               randconfig-001-20260606    gcc-11.5.0
hexagon                        randconfig-002    gcc-11.5.0
hexagon               randconfig-002-20260606    gcc-11.5.0
i386                             allmodconfig    clang-20
i386                              allnoconfig    gcc-15.2.0
i386                             allyesconfig    clang-20
i386        buildonly-randconfig-001-20260606    gcc-13
i386        buildonly-randconfig-002-20260606    gcc-13
i386        buildonly-randconfig-003-20260606    gcc-13
i386        buildonly-randconfig-004-20260606    gcc-13
i386        buildonly-randconfig-005-20260606    gcc-13
i386        buildonly-randconfig-006-20260606    gcc-13
i386                                defconfig    gcc-16.1.0
i386                  randconfig-001-20260606    clang-20
i386                  randconfig-002-20260606    clang-20
i386                  randconfig-003-20260606    clang-20
i386                  randconfig-004-20260606    clang-20
i386                  randconfig-005-20260606    clang-20
i386                  randconfig-006-20260606    clang-20
i386                  randconfig-007-20260606    clang-20
i386                           randconfig-011    clang-20
i386                  randconfig-011-20260606    clang-20
i386                           randconfig-012    clang-20
i386                  randconfig-012-20260606    clang-20
i386                           randconfig-013    clang-20
i386                  randconfig-013-20260606    clang-20
i386                           randconfig-014    clang-20
i386                  randconfig-014-20260606    clang-20
i386                           randconfig-015    clang-20
i386                  randconfig-015-20260606    clang-20
i386                           randconfig-016    clang-20
i386                  randconfig-016-20260606    clang-20
i386                           randconfig-017    clang-20
i386                  randconfig-017-20260606    clang-20
loongarch                        allmodconfig    clang-23
loongarch                         allnoconfig    gcc-15.2.0
loongarch                           defconfig    clang-23
loongarch                      randconfig-001    gcc-11.5.0
loongarch             randconfig-001-20260606    gcc-11.5.0
loongarch                      randconfig-002    gcc-11.5.0
loongarch             randconfig-002-20260606    gcc-11.5.0
m68k                             allmodconfig    gcc-15.2.0
m68k                              allnoconfig    gcc-15.2.0
m68k                             allyesconfig    clang-17
m68k                                defconfig    clang-23
microblaze                        allnoconfig    gcc-15.2.0
microblaze                       allyesconfig    gcc-15.2.0
microblaze                          defconfig    clang-23
mips                             allmodconfig    gcc-15.2.0
mips                              allnoconfig    gcc-15.2.0
mips                             allyesconfig    gcc-15.2.0
mips                      malta_kvm_defconfig    gcc-16.1.0
nios2                            allmodconfig    clang-23
nios2                             allnoconfig    clang-17
nios2                               defconfig    clang-23
nios2                          randconfig-001    gcc-11.5.0
nios2                 randconfig-001-20260606    gcc-11.5.0
nios2                          randconfig-002    gcc-11.5.0
nios2                 randconfig-002-20260606    gcc-11.5.0
openrisc                         allmodconfig    clang-23
openrisc                          allnoconfig    clang-17
openrisc                            defconfig    gcc-16.1.0
parisc                           allmodconfig    gcc-15.2.0
parisc                            allnoconfig    clang-17
parisc                           allyesconfig    clang-19
parisc                              defconfig    gcc-16.1.0
parisc                randconfig-001-20260606    gcc-8.5.0
parisc                randconfig-002-20260606    gcc-8.5.0
parisc64                            defconfig    clang-23
powerpc                          allmodconfig    gcc-15.2.0
powerpc                           allnoconfig    clang-17
powerpc               randconfig-001-20260606    gcc-8.5.0
powerpc               randconfig-002-20260606    gcc-8.5.0
powerpc64             randconfig-001-20260606    gcc-8.5.0
powerpc64             randconfig-002-20260606    gcc-8.5.0
riscv                            allmodconfig    clang-23
riscv                             allnoconfig    clang-17
riscv                            allyesconfig    clang-17
riscv                               defconfig    gcc-16.1.0
riscv                          randconfig-001    gcc-8.5.0
riscv                 randconfig-001-20260606    gcc-8.5.0
riscv                          randconfig-002    gcc-8.5.0
riscv                 randconfig-002-20260606    gcc-8.5.0
s390                             allmodconfig    clang-19
s390                              allnoconfig    clang-17
s390                             allyesconfig    gcc-15.2.0
s390                                defconfig    gcc-16.1.0
s390                           randconfig-001    gcc-8.5.0
s390                  randconfig-001-20260606    gcc-8.5.0
s390                           randconfig-002    gcc-8.5.0
s390                  randconfig-002-20260606    gcc-8.5.0
sh                               allmodconfig    gcc-15.2.0
sh                                allnoconfig    clang-17
sh                               allyesconfig    clang-19
sh                                  defconfig    gcc-14
sh                             randconfig-001    gcc-8.5.0
sh                    randconfig-001-20260606    gcc-8.5.0
sh                             randconfig-002    gcc-8.5.0
sh                    randconfig-002-20260606    gcc-8.5.0
sparc                             allnoconfig    clang-17
sparc                               defconfig    gcc-16.1.0
sparc                 randconfig-001-20260606    gcc-11.5.0
sparc                 randconfig-002-20260606    gcc-11.5.0
sparc64                          allmodconfig    clang-23
sparc64                             defconfig    gcc-14
sparc64               randconfig-001-20260606    gcc-11.5.0
sparc64               randconfig-002-20260606    gcc-11.5.0
um                               allmodconfig    clang-19
um                                allnoconfig    clang-17
um                               allyesconfig    gcc-15.2.0
um                                  defconfig    gcc-14
um                             i386_defconfig    gcc-14
um                    randconfig-001-20260606    gcc-11.5.0
um                    randconfig-002-20260606    gcc-11.5.0
um                           x86_64_defconfig    gcc-14
x86_64                           allmodconfig    clang-20
x86_64                            allnoconfig    clang-17
x86_64                           allyesconfig    clang-20
x86_64      buildonly-randconfig-001-20260606    gcc-14
x86_64      buildonly-randconfig-002-20260606    gcc-14
x86_64      buildonly-randconfig-003-20260606    gcc-14
x86_64      buildonly-randconfig-004-20260606    gcc-14
x86_64      buildonly-randconfig-005-20260606    gcc-14
x86_64      buildonly-randconfig-006-20260606    gcc-14
x86_64                              defconfig    gcc-14
x86_64                                  kexec    clang-22
x86_64                randconfig-001-20260606    gcc-14
x86_64                randconfig-002-20260606    gcc-14
x86_64                randconfig-003-20260606    gcc-14
x86_64                randconfig-004-20260606    gcc-14
x86_64                randconfig-005-20260606    gcc-14
x86_64                randconfig-006-20260606    gcc-14
x86_64                randconfig-011-20260606    gcc-14
x86_64                randconfig-012-20260606    gcc-14
x86_64                randconfig-013-20260606    gcc-14
x86_64                randconfig-014-20260606    gcc-14
x86_64                randconfig-015-20260606    gcc-14
x86_64                randconfig-016-20260606    gcc-14
x86_64                randconfig-071-20260607    clang-22
x86_64                randconfig-072-20260607    clang-22
x86_64                randconfig-073-20260607    clang-22
x86_64                randconfig-074-20260607    clang-22
x86_64                randconfig-075-20260607    clang-22
x86_64                randconfig-076-20260607    clang-22
x86_64                               rhel-9.4    clang-22
x86_64                           rhel-9.4-bpf    gcc-14
x86_64                          rhel-9.4-func    clang-22
x86_64                    rhel-9.4-kselftests    clang-22
x86_64                         rhel-9.4-kunit    gcc-14
x86_64                           rhel-9.4-ltp    gcc-14
x86_64                          rhel-9.4-rust    clang-20
xtensa                            allnoconfig    clang-17
xtensa                           allyesconfig    clang-23
xtensa                randconfig-001-20260606    gcc-11.5.0
xtensa                randconfig-002-20260606    gcc-11.5.0

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* Re: [PATCH 1/3] usb: typec: tcpm: add low power mode support
From: Badhri Jagan Sridharan @ 2026-06-06 16:37 UTC (permalink / raw)
  To: cy_huang
  Cc: linux-usb, heikki.krogerus, gregkh, lucas_tsai, ren_chen,
	kevin_hung
In-Reply-To: <20260518091513.3277975-3-cy_huang@richtek.com>

On Mon, May 18, 2026 at 2:18 AM <cy_huang@richtek.com> wrote:
>
> From: Lucas Tsai <lucas_tsai@richtek.com>
>
> Add low power mode support,
> enter low power mode at detach,
> exit low power mode at SRC_ATTACH_WAIT, SNK_ATTACH_WAIT and init.
>
> Signed-off-by: Lucas Tsai <lucas_tsai@richtek.com>
> ---
>  drivers/usb/typec/tcpm/tcpm.c | 10 ++++++++++
>  include/linux/usb/tcpm.h      |  4 ++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
> index 55fee96d3342..a4bde4c292e4 100644
> --- a/drivers/usb/typec/tcpm/tcpm.c
> +++ b/drivers/usb/typec/tcpm/tcpm.c
> @@ -4939,6 +4939,9 @@ static void tcpm_reset_port(struct tcpm_port *port)
>
>  static void tcpm_detach(struct tcpm_port *port)
>  {
> +       if (port->tcpc->set_low_power_mode)
> +               port->tcpc->set_low_power_mode(port->tcpc, true);
> +
>         if (tcpm_port_is_disconnected(port))
>                 port->hard_reset_count = 0;
>
> @@ -5181,6 +5184,8 @@ static void run_state_machine(struct tcpm_port *port)
>                         tcpm_set_state(port, SNK_UNATTACHED, PD_T_DRP_SNK);
>                 break;
>         case SRC_ATTACH_WAIT:
> +               if (port->tcpc->set_low_power_mode)
> +                       port->tcpc->set_low_power_mode(port->tcpc, false);

If the callback ends up being absolutely needed, then,
Instead of calling set_low_power_mode() here and in SNK_ATTACH_WAIT,
you can invoke it in __tcpm_cc_change() in the TOGGLING switch case if
the state machine is going to enter SRC_ATTACH_WAIT or
SNK_ATTACH_WAIT.
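
A rough user-space sketch of that idea, not the tcpm.c code: every model_*
name below is hypothetical, and the CC handling is heavily simplified (Rd
seen on CC leads to the source path, Rp to the sink path). It only shows
the optional callback being invoked once, at the point where the TOGGLING
case decides to move to an attach-wait state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { TOGGLING, SRC_ATTACH_WAIT, SNK_ATTACH_WAIT };
enum model_cc { CC_OPEN, CC_RD, CC_RP };

/* Hypothetical, trimmed-down stand-in for struct tcpc_dev. */
struct model_tcpc {
	/* Optional callback, mirroring the one proposed in the patch. */
	void (*set_low_power_mode)(struct model_tcpc *tcpc, bool enable);
	bool low_power;
};

struct model_port {
	struct model_tcpc *tcpc;
	enum model_state state;
};

/* Leave low power mode if the port controller provides the callback. */
static void model_exit_low_power(struct model_port *port)
{
	if (port->tcpc->set_low_power_mode)
		port->tcpc->set_low_power_mode(port->tcpc, false);
}

/* Simplified model of a CC-change handler. */
static void model_cc_change(struct model_port *port, enum model_cc cc)
{
	assert(port != NULL && port->tcpc != NULL);

	switch (port->state) {
	case TOGGLING:
		if (cc == CC_RD) {		/* partner presents Rd */
			model_exit_low_power(port);
			port->state = SRC_ATTACH_WAIT;
		} else if (cc == CC_RP) {	/* partner presents Rp */
			model_exit_low_power(port);
			port->state = SNK_ATTACH_WAIT;
		}
		break;
	default:
		break;
	}
}

/* Example callback that just records the requested mode. */
static void model_set_lpm(struct model_tcpc *tcpc, bool enable)
{
	tcpc->low_power = enable;
}
```

With this shape the two attach-wait states would not need their own calls,
and a port that stays in TOGGLING with CC open keeps low power mode enabled.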

>                 if (tcpm_port_is_debug_source(port))
>                         tcpm_set_state(port, DEBUG_ACC_ATTACHED,
>                                        port->timings.cc_debounce_time);
> @@ -5439,6 +5444,8 @@ static void run_state_machine(struct tcpm_port *port)
>                         tcpm_set_state(port, SRC_UNATTACHED, PD_T_DRP_SRC);
>                 break;
>         case SNK_ATTACH_WAIT:
> +               if (port->tcpc->set_low_power_mode)
> +                       port->tcpc->set_low_power_mode(port->tcpc, false);
>                 if (tcpm_port_is_debug_sink(port))
>                         tcpm_set_state(port, DEBUG_ACC_ATTACHED,
>                                        PD_T_CC_DEBOUNCE);
> @@ -7489,6 +7496,9 @@ static void tcpm_init(struct tcpm_port *port)
>  {
>         enum typec_cc_status cc1, cc2;
>
> +       if (port->tcpc->set_low_power_mode)
> +               port->tcpc->set_low_power_mode(port->tcpc, false);
> +
>         port->tcpc->init(port->tcpc);
>
>         tcpm_reset_port(port);
> diff --git a/include/linux/usb/tcpm.h b/include/linux/usb/tcpm.h
> index 93079450bba0..475c5d478c0e 100644
> --- a/include/linux/usb/tcpm.h
> +++ b/include/linux/usb/tcpm.h
> @@ -82,6 +82,9 @@ enum tcpm_transmit_type {
>   *             Optional; if supported by hardware, called to start dual-role
>   *             toggling or single-role connection detection. Toggling stops
>   *             automatically if a connection is established.
> + * @set_low_power_mode:
> + *             Optional; if supported by hardware, called to enter or exit
> + *             low power mode.
>   * @try_role:  Optional; called to set a preferred role
>   * @pd_transmit:Called to transmit PD message
>   * @set_bist_data: Turn on/off bist data mode for compliance testing
> @@ -155,6 +158,7 @@ struct tcpc_dev {
>         int (*start_toggling)(struct tcpc_dev *dev,
>                               enum typec_port_type port_type,
>                               enum typec_cc_status cc);
> +       void (*set_low_power_mode)(struct tcpc_dev *dev, bool enable);
>         int (*try_role)(struct tcpc_dev *dev, int role);
>         int (*pd_transmit)(struct tcpc_dev *dev, enum tcpm_transmit_type type,
>                            const struct pd_message *msg, unsigned int negotiated_rev);
> --
> 2.43.0
>

^ permalink raw reply

* Re: [PATCH 3/3] usb: typec: tcpci_rt1711h: add low power mode support
From: Badhri Jagan Sridharan @ 2026-06-06 16:20 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: lucas_tsai, cy_huang, gregkh, kevin_hung, linux-usb, ren_chen
In-Reply-To: <aiLJKjtKMmjxdyyI@kuha>

On Fri, Jun 5, 2026 at 6:03 AM Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:
>
> On Wed, Jun 03, 2026 at 11:18:27AM +0800, lucas_tsai@richtek.com wrote:
> > On Mon, Jun 01, 2026 at 04:36:02PM +0300, Heikki Krogerus wrote:
> > > Hi,
> > >
> > > I'm sorry to keep you waiting.
> > >
> > > On Mon, May 18, 2026 at 05:15:14PM +0800, cy_huang@richtek.com wrote:
> > > > From: Lucas Tsai <lucas_tsai@richtek.com>
> > > >
> > > > Add low power mode support,
> > > > add the op to enter and exit low power mode,
> > > > this mode reduce RT1711H/RT1715 VDD Iq to 1 of 10,
> > >
> > > What is VDD Iq?
> > >
> >
> > VDD pin is RT1715's power input,
> > and Iq is the quiescent current,
> > so the low power mode reduces the baseline power used by RT1715.
>
> Got it, thanks. Please add that explanation to the commit message.
>
> > > > while disabling VBUS detection and PD BMC
> > > > but keeping CC detection and not affecting DRP toggling.
> > > >
> > > > Signed-off-by: Lucas Tsai <lucas_tsai@richtek.com>
> > > > ---
> > > >  drivers/usb/typec/tcpm/tcpci_rt1711h.c | 14 ++++++++++++++
> > > >  1 file changed, 14 insertions(+)
> > > >
> > > > diff --git a/drivers/usb/typec/tcpm/tcpci_rt1711h.c b/drivers/usb/typec/tcpm/tcpci_rt1711h.c
> > > > index 4b3e4e22a82e..48d6a6823ab9 100644
> > > > --- a/drivers/usb/typec/tcpm/tcpci_rt1711h.c
> > > > +++ b/drivers/usb/typec/tcpm/tcpci_rt1711h.c
> > > > @@ -20,6 +20,7 @@
> > > >
> > > >  #define RT1711H_PHYCTRL1 0x80
> > > >  #define RT1711H_PHYCTRL2 0x81
> > > > +#define RT1711H_BMCCTRL          0x90
> > > >
> > > >  #define RT1711H_RTCTRL4          0x93
> > > >  /* rx threshold of rd/rp: 1b0 for level 0.4V/0.7V, 1b1 for 0.35V/0.75V */
> > > > @@ -254,6 +255,18 @@ static int rt1711h_start_drp_toggling(struct tcpci *tcpci,
> > > >   return 0;
> > > >  }
> > > >
> > > > +static void rt1711h_set_low_power_mode(struct tcpci *tcpci,
> > > > +                                struct tcpci_data *tdata, bool enable)
> > > > +{
> > > > + int ret;
> > > > + struct rt1711h_chip *chip = tdata_to_rt1711h(tdata);
> > > > +
> > > > + ret = rt1711h_write8(chip, RT1711H_BMCCTRL, enable ? 0x08 : 0x07);
> > > > + if (ret < 0)
> > > > +         dev_err(chip->dev, "%s lpm fail(%d)\n",
> > > > +                 enable ? "enter" : "exit", ret);
> > > > +}

Is there any reason why rt1711h_set_low_power_mode() can't be called from
rt1711h_init_cc_params() and rt1711h_start_drp_toggling()?
rt1711h_start_drp_toggling() can enable low power mode.
rt1711h_init_cc_params(), which is already aware of CC status, can
disable low power mode when a source or sink is detected.
This approach would avoid adding a new callback in tcpci.c.

If the above works, you could optionally rely on runtime pm callbacks
to call rt1711h_set_low_power_mode() and increment/decrement the usage
count in rt1711h_init_cc_params() and rt1711h_start_drp_toggling().

> > >
> > > Why couldn't this just be done in the PM suspend and resume
> > > callbacks for this driver?
> >
> > Entering low power mode or not relates directly to the status of USB-C port:
> > enter low power mode when USB-C port attached nothing (unattached),
> > and exit low power mode when attached with cable or device.
> >
> > In other words, it is possible to have the system suspended and USB-C
> > port attached at the same time, so it is not suitable to bind the flow
> > of low power mode to the PM suspend and resume.
>
> You don't have to do that in the system suspend callback, but you
> still should do this in the runtime suspend callback, no?
>
> This is not a major thing, but mctp.c is a bit bloated, I want to make

I believe Heikki was referring to tcpci.c here.

> sure that adding this kind of callbacks is really necessary.
>
> Badhri, do you have time to check this series?
>
> thanks,
>
> --
> heikki

^ permalink raw reply

* Re: [PATCH] HID: usbhid: skip interrupt IN polling for devices with no input reports
From: Antheas Kapenekakis @ 2026-06-06 13:13 UTC (permalink / raw)
  To: Denis Benato
  Cc: Ahmed Yaseen, Jiri Kosina, Benjamin Tissoires, Ilpo Järvinen,
	Kerim Kabirov, GameBurrow, linux-usb, linux-input, linux-kernel
In-Reply-To: <ceb5ed4b-1654-463d-9cff-4a1b91d52e92@gmail.com>

On Sat, 6 Jun 2026 at 14:42, Denis Benato <benato.denis96@gmail.com> wrote:
>
>
> On 6/5/26 14:02, Antheas Kapenekakis wrote:
> > On Fri, 5 Jun 2026 at 13:40, Ahmed Yaseen <yaseen@ghoul.dev> wrote:
> >> usbhid starts polling a device's interrupt IN endpoint on open
> >> (usbhid_open() -> hid_start_in()). If the report descriptor declares no
> >> input reports there is nothing to read there, so the poll is useless,
> >> and on some composite devices it is also harmful.
> > If it did have input reports, would starting the polling still cause
> > issues? Because if it would, the issue is in the polling itself.
> So far we haven't found an asus device that has more than one interface
> that supports reading data out of it.
> > Given the creativity of manufacturers when implementing hid protocols,
> > I find it certain that they do use the in endpoint even without input
> > reports. E.g., for feature reports. This could cause regressions.
> While I mostly agree with this it is also true that the general direction
> for the kernel (especially lately) has been to not do out-of-spec things
> at least by default.
>
> If things really regress, that is expected to happen only on a very few specific
> devices with buggy firmware, and we can think of something different
> for those (hopefully very few ones).
>
> Perhaps someone concerned with security might be interested in what
> we have because it doesn't look very normal.
>
> Note that below I have written a few ideas that maybe are worth

The degradation would be silent.

> looking into.
> >> The ASUS ROG N-Key keyboards expose a second, input-less interface used
> >> only for RGB control via feature reports. Opening its hidraw node (any
> >> hidraw reader does, including SDL/Steam Input or a plain cat) starts the
> > cating a hidraw causing issues would be expected, so let's focus on the former.

Try to add spaces before and after your responses

> Simply opening an hidraw should not trigger a delayed disconnect of that device,
> I don't know why you would expect this to happen nor why you would
> consider it acceptable. It's a bug.
>
> Focusing on userspace software exposing the bug is not a realistic option
> because over the time we found a good chunk of software doing that:
> - logitech control software (forgot the name)
> - open razer software
> - sdl
> - asusctl (obviously it opens the device albeit in the future I will change this)
>
> and likely more given the fact not all software was identified.
> > Asusctl has a bug where if you add the quirk that separates the event
> > nodes per hid, this bug is reproduced as well. I chucked it to
> > complicated threading getting out of control. It is the reason we
> > skipped that patch that was in my series.
> I found and solved the bug already. Regardless the issue remains:
> Even with no asusctl at all, if a user has one logitech mouse
> (and its control software) and a razer keyboard (and its control software)
> the asus N-Key device will start an endless disconnect-reconnect loop.
>
> Any combination of two or more of those tools will trigger the issue
> on some devices (weirdly enough not every model is affected):
>
> this is not good.
> > Now, you say SDL/Steam do a spurious read as well, can you identify
> > the codepath so we can look into it? What devices are affected? The
> > early return fixes a warning on the Z13, but it also feeds through the
> > universal lamp interface on the new Xbox Allies. Is this a bug on
> > those devices or keyboards? If yes, it could be caused by userspace
> > hanging on that node
> Sure, and I agree with you that fixing all userspace tools is desirable
> but it's also unfeasible to fix them all, if we managed to do that
> there will be years before everyone receives a fixed version of every
> affected software and even then a core issue would remain:
> linux tries to poll something it can't have anything out from.
>
> I am much more oriented on the fact that kernel shouldn't
> be doing weird things (at least not by default) so this has to
> somehow be stopped regardless of how well userspace behaves.

The kernel is not doing weird things, and I did not ask you to fix
all userspace software. I asked for a reproduction scenario, as it is
not covered in the patch description. Looking at the patch again today,
I also do not fully understand what it does.

It skips enabling input interrupts (but not only that) for devices
that have no input reports. So the kernel behavior will depend on the
feature descriptor moving forward.

And that fixes a hang on the affected devices because enabling
interrupts on an endpoint without periodic input reports blocks a
parallel endpoint that does have input reports?

I would like this fix to target the actual cause of the block,
but it is not clear to me what that is or what is affected.

Antheas

> If you have better ideas on how to fix the kernel we would
> like to hear those as well.
>
> Best regards,
> Denis
> > Antheas
> >
> >> pointless IN poll and keypress reports on the keyboard interface get
> >> dropped for as long as the node stays open: a lost key-down drops a
> >> letter, a lost key-up leaves the key stuck. usbmon shows the dropped
> >> reports never reach the URB layer.
> >>
> >> The useless poll itself is long-standing; commit 4ac74ea68f64 ("HID:
> >> asus: early return for ROG devices") is what exposes it on these
> >> devices by keeping the input-less interface alive instead of ejecting
> >> it, so its hidraw node can be opened and the poll started.
> >>
> >> Skip the poll in usbhid_open() when the device has no input reports.
> >> Feature reports and hidraw output keep working over the control and OUT
> >> endpoints, so the interface is otherwise unaffected.
> I will write my review here to avoid forking the discussion:
>
> I agree with the general idea but perhaps we can avoid
> some hid devices to ever get HID_QUIRK_ALWAYS_POLL
> and that might be enough to skip the problematic code?
>
> Maybe there is value in doing this with a quirk flag in hid-asus.c
> affecting the least amount of devices?
>
> Or maybe just prevent devices with no data possibly coming out
> to ever get HID_QUIRK_ALWAYS_POLL?
>
> For how to best do this we will need to hear what Jiri and
> Benjamin have to say but if they think the proposed solution
> is the correct solution:
>
> Reviewed-by: Denis Benato <denis.benato@linux.dev>
> >> Fixes: 4ac74ea68f64 ("HID: asus: early return for ROG devices")
> >> Tested-by: Kerim Kabirov <the.privat33r+linux@pm.me>
> >> Tested-by: GameBurrow <gameburrow@pm.me>
> >> Signed-off-by: Ahmed Yaseen <yaseen@ghoul.dev>
> >> ---
> >>  drivers/hid/usbhid/hid-core.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
> >> index 96b0181cf819..90a8b34d9305 100644
> >> --- a/drivers/hid/usbhid/hid-core.c
> >> +++ b/drivers/hid/usbhid/hid-core.c
> >> @@ -688,7 +688,8 @@ static int usbhid_open(struct hid_device *hid)
> >>
> >>         set_bit(HID_OPENED, &usbhid->iofl);
> >>
> >> -       if (hid->quirks & HID_QUIRK_ALWAYS_POLL) {
> >> +       if ((hid->quirks & HID_QUIRK_ALWAYS_POLL) ||
> >> +           list_empty(&hid->report_enum[HID_INPUT_REPORT].report_list)) {
> >>                 res = 0;
> >>                 goto Done;
> >>         }
> >> --
> >> 2.54.0
> >>
> >>
> >>
>


^ permalink raw reply

* Re: xhci_hcd: AMD Raphael/Granite Ridge USB 2.0 xHCI [1022:15b8] dies on resume from suspend
From: Martin Alderson @ 2026-06-06 13:12 UTC (permalink / raw)
  To: Michal Pecio; +Cc: Mathias Nyman, linux-usb
In-Reply-To: <20260530005742.25893efa.michal.pecio@gmail.com>

Hi Michal,

Thanks again for your continued help!

I had Claude investigate this in a bit more detail and it came up with this:

The receiver's delayedwork_callback (drivers/hid/hid-logitech-dj.c)
reacts to connect/disconnect/unknown notifications by calling
logi_dj_recv_query_paired_devices(), which sends a SET_REPORT to the
receiver via a synchronous usb_control_msg() on ep0. That work is
scheduled straight from logi_dj_raw_event() and is never stopped on
suspend — the driver has no .suspend callback (only .reset_resume),
and cancel_work_sync() only runs in .remove.

So the ordering that kills the controller is:

1. dj work issues a control SET_REPORT on ep0; the URB lands on the ring
2. usb_suspend_both() → usb_suspend_device() drives the port to U3
3. only afterwards does usb_suspend_both() set udev->can_submit = 0
and call usb_hcd_flush_endpoint() (drivers/usb/core/driver.c) — and
that flush unlinks the still-pending ep0 URB
4. xhci issues Stop Endpoint to an endpoint on a U3 port → 5s timeout → HC died

That matches the trace exactly: the "Cancel URB ... ep 0x0" appears
after "Set port 7-1 link state ... U3", and the debugfs command ring
shows the single stuck Stop Endpoint TRB (slot 1, ep 1).

I had Claude patch the driver and this seems to fix it:

--- /tmp/hid-logitech-dj.orig.c 2026-06-06 14:08:26.580516662 +0100
+++ hid-logitech-dj.c 2026-06-06 13:42:15.702948099 +0100
@@ -150,6 +150,7 @@
  unsigned long last_query; /* in jiffies */
  bool ready;
  bool dj_mode;
+ bool suspended;
  enum recvr_type type;
  unsigned int unnumbered_application;
  spinlock_t lock;
@@ -908,6 +909,17 @@
  return;
  }

+ /*
+ * Don't issue control reports while the receiver is suspended; leave
+ * the notifications queued and reschedule from resume.  A SET_REPORT
+ * submitted as the USB device enters U3 leaves a Stop Endpoint command
+ * pending on a suspended port, which times out and kills the xHC.
+ */
+ if (djrcv_dev->suspended) {
+ spin_unlock_irqrestore(&djrcv_dev->lock, flags);
+ return;
+ }
+
  count = kfifo_out(&djrcv_dev->notif_fifo, &workitem, sizeof(workitem));

  if (count != sizeof(workitem)) {
@@ -1983,11 +1995,63 @@
  return retval;
 }

+static int logi_dj_suspend(struct hid_device *hdev, pm_message_t message)
+{
+ struct dj_receiver_dev *djrcv_dev = hid_get_drvdata(hdev);
+ unsigned long flags;
+
+ if (!djrcv_dev)
+ return 0;
+
+ /*
+ * Stop the notification work from issuing control reports while the
+ * receiver suspends.  Setting ->suspended makes any requeued work a
+ * no-op (see delayedwork_callback); cancel_work_sync() then waits for
+ * an instance already running.  Without this, a SET_REPORT submitted
+ * as the device enters U3 leaves a Stop Endpoint command pending on a
+ * suspended port, which times out and kills the xHCI host.
+ */
+ spin_lock_irqsave(&djrcv_dev->lock, flags);
+ djrcv_dev->suspended = true;
+ spin_unlock_irqrestore(&djrcv_dev->lock, flags);
+
+ cancel_work_sync(&djrcv_dev->work);
+ return 0;
+}
+
+static void logi_dj_resume_common(struct dj_receiver_dev *djrcv_dev)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&djrcv_dev->lock, flags);
+ djrcv_dev->suspended = false;
+ /* Drain notifications that arrived (and were deferred) during suspend. */
+ if (!kfifo_is_empty(&djrcv_dev->notif_fifo))
+ schedule_work(&djrcv_dev->work);
+ spin_unlock_irqrestore(&djrcv_dev->lock, flags);
+}
+
+static int logi_dj_resume(struct hid_device *hdev)
+{
+ struct dj_receiver_dev *djrcv_dev = hid_get_drvdata(hdev);
+
+ if (!djrcv_dev)
+ return 0;
+
+ logi_dj_resume_common(djrcv_dev);
+ return 0;
+}
+
 static int logi_dj_reset_resume(struct hid_device *hdev)
 {
  struct dj_receiver_dev *djrcv_dev = hid_get_drvdata(hdev);

- if (!djrcv_dev || djrcv_dev->hidpp != hdev)
+ if (!djrcv_dev)
+ return 0;
+
+ logi_dj_resume_common(djrcv_dev);
+
+ if (djrcv_dev->hidpp != hdev)
  return 0;

  logi_dj_recv_switch_to_dj_mode(djrcv_dev, 0);
@@ -2148,6 +2212,8 @@
  .probe = logi_dj_probe,
  .remove = logi_dj_remove,
  .raw_event = logi_dj_raw_event,
+ .suspend = pm_ptr(logi_dj_suspend),
+ .resume = pm_ptr(logi_dj_resume),
  .reset_resume = pm_ptr(logi_dj_reset_resume),
 };
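
The gating idea in the patch above can be modelled in userspace C. This
is only a minimal sketch under stated assumptions: struct receiver,
notify(), work_fn(), do_suspend() and do_resume() are hypothetical
stand-ins, a fixed-size array replaces the kfifo, plain function calls
replace the workqueue, and there is no locking. Unlike the real
callback, work_fn() here drains the whole queue in one call. It is not
the driver code.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIFO_CAP 8

/* Hypothetical stand-in for struct dj_receiver_dev. */
struct receiver {
	int fifo[FIFO_CAP];	/* queued notifications (models notif_fifo) */
	size_t head, tail;
	bool suspended;		/* models the new ->suspended flag */
	bool work_pending;	/* models a scheduled work item */
	int reports_sent;	/* counts simulated SET_REPORT transfers */
};

bool fifo_empty(const struct receiver *r)
{
	return r->head == r->tail;
}

/* Models logi_dj_raw_event(): queue a notification, schedule the work. */
bool notify(struct receiver *r, int what)
{
	if (r->tail - r->head == FIFO_CAP)
		return false;	/* queue full, notification dropped */
	r->fifo[r->tail++ % FIFO_CAP] = what;
	r->work_pending = true;
	return true;
}

/* Models delayedwork_callback(): no I/O while suspended. */
void work_fn(struct receiver *r)
{
	r->work_pending = false;
	if (r->suspended)
		return;		/* leave notifications queued for resume */
	while (!fifo_empty(r)) {
		r->head++;
		r->reports_sent++;	/* would be a control transfer */
	}
}

void do_suspend(struct receiver *r)
{
	r->suspended = true;
	r->work_pending = false;	/* models cancel_work_sync() */
}

void do_resume(struct receiver *r)
{
	r->suspended = false;
	if (!fifo_empty(r))
		r->work_pending = true;	/* models schedule_work() */
}
```

In this model a notification that arrives between do_suspend() and
do_resume() stays queued and produces no transfer until the work runs
again after resume, which is the behaviour the patch aims for.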

Not sure if this is helpful for you? I can submit this to the
linux-input list once I've run it for a bit longer (it's survived ~15
suspend cycles so far with no issues I can detect) - maybe they could
take this as a basis to fix the driver (I am not a kernel expert
whatsoever!).

Thanks
Martin

On Fri, May 29, 2026 at 11:57 PM Michal Pecio <michal.pecio@gmail.com> wrote:
>
> On Fri, 29 May 2026 13:04:49 +0100, Martin Alderson wrote:
> > Bus 007 Device 002: ID 046d:c52b Logitech, Inc. Unifying Receiver
>
> So this is the problem device. Until we have any better idea what to
> try, please add usbcore and usbhid dynamic debug and keep collecting
> logs - maybe something useful will show up.
>
> Looking at usbhid, it seems it may fail to suspend if some operations
> are ongoing. And then usbcore may apparently suspend the device anyway
> and usbhid will presumably try to continue its thing on the suspended
> device. AFAIK any URB submissions should fail then, but there might be
> a bug. I haven't yet looked closely at how it all works.
>
> BTW, are you able to test patched kernels in case dynamic debug proves
> insufficient to figure out what's going on?
>
> Regards,
> Michal

^ permalink raw reply

* Re: [PATCH] HID: usbhid: skip interrupt IN polling for devices with no input reports
From: Denis Benato @ 2026-06-06 12:42 UTC (permalink / raw)
  To: Antheas Kapenekakis, Ahmed Yaseen
  Cc: Jiri Kosina, Benjamin Tissoires, Ilpo Järvinen,
	Kerim Kabirov, GameBurrow, linux-usb, linux-input, linux-kernel
In-Reply-To: <CAGwozwFWbW=v2B7ruS4OUGuPhjSayw4Qxj3K+bCzwmgQu158Ng@mail.gmail.com>


On 6/5/26 14:02, Antheas Kapenekakis wrote:
> On Fri, 5 Jun 2026 at 13:40, Ahmed Yaseen <yaseen@ghoul.dev> wrote:
>> usbhid starts polling a device's interrupt IN endpoint on open
>> (usbhid_open() -> hid_start_in()). If the report descriptor declares no
>> input reports there is nothing to read there, so the poll is useless,
>> and on some composite devices it is also harmful.
> If it did have input reports, would starting the polling still cause
> issues? Because if it would, the issue is in the polling itself.
So far we haven't found an ASUS device that has more than one interface
that supports reading data out of it.
> Given the creativity of manufacturers when implementing hid protocols,
> I find it certain that they do use the in endpoint even without input
> reports. E.g., for feature reports. This could cause regressions.
While I mostly agree with this it is also true that the general direction
for the kernel (especially lately) has been to not do out-of-spec things
at least by default.

If things do regress, we expect that to happen only on a very few specific
devices with buggy firmware, and we can think of something different
for those (hopefully very few) devices.

Perhaps someone concerned with security might be interested in what
we have because it doesn't look very normal.

Note that below I have written a few ideas that maybe are worth
looking into.
>> The ASUS ROG N-Key keyboards expose a second, input-less interface used
>> only for RGB control via feature reports. Opening its hidraw node (any
>> hidraw reader does, including SDL/Steam Input or a plain cat) starts the
> cating a hidraw causing issues would be expected, so let's focus on the former.
Simply opening a hidraw node should not trigger a delayed disconnect of that
device. I don't know why you would expect this to happen, nor why you would
consider it acceptable. It's a bug.

Focusing on the userspace software that exposes the bug is not a realistic
option, because over time we have found a good chunk of software doing that:
- logitech control software (forgot the name)
- open razer software
- sdl
- asusctl (obviously it opens the device, although I will change this in the future)

and likely more, given that not all affected software has been identified.
> Asusctl has a bug where if you add the quirk that separates the event
> nodes per hid, this bug is reproduced as well. I chucked it to
> complicated threading getting out of control. It is the reason we
> skipped that patch that was in my series.
I found and solved the bug already. Regardless the issue remains:
Even with no asusctl at all, if a user has one logitech mouse
(and its control software) and a razer keyboard (and its control software)
the asus N-Key device will start an endless disconnect-reconnect loop.

Any combination of two or more of those tools will trigger the issue
on some devices (weirdly enough not every model is affected):
this is not good.
> Now, you say SDL/Steam do a spurious read as well, can you identify
> the codepath so we can look into it? What devices are affected? The
> early return fixes a warning on the Z13, but it also feeds through the
> universal lamp interface on the new Xbox Allies. Is this a bug on
> those devices or keyboards? If yes, it could be caused by userspace
> hanging on that node
Sure, and I agree with you that fixing all userspace tools is desirable,
but it's also unfeasible to fix them all. Even if we managed to do that,
it would be years before everyone received a fixed version of every
affected program, and even then a core issue would remain:
Linux tries to poll something it can't get any data out of.

My main concern is that the kernel shouldn't
be doing weird things (at least not by default), so this has to
be stopped somehow, regardless of how well userspace behaves.

If you have better ideas on how to fix the kernel we would
like to hear those as well.

Best regards,
Denis
> Antheas
>
>> pointless IN poll and keypress reports on the keyboard interface get
>> dropped for as long as the node stays open: a lost key-down drops a
>> letter, a lost key-up leaves the key stuck. usbmon shows the dropped
>> reports never reach the URB layer.
>>
>> The useless poll itself is long-standing; commit 4ac74ea68f64 ("HID:
>> asus: early return for ROG devices") is what exposes it on these
>> devices by keeping the input-less interface alive instead of ejecting
>> it, so its hidraw node can be opened and the poll started.
>>
>> Skip the poll in usbhid_open() when the device has no input reports.
>> Feature reports and hidraw output keep working over the control and OUT
>> endpoints, so the interface is otherwise unaffected.
I will write my review here to avoid forking the discussion:

I agree with the general idea, but perhaps we can prevent
some HID devices from ever getting HID_QUIRK_ALWAYS_POLL,
and that might be enough to skip the problematic code?

Maybe there is value in doing this with a quirk flag in hid-asus.c,
affecting the fewest devices?

Or maybe just prevent devices that cannot produce any input data
from ever getting HID_QUIRK_ALWAYS_POLL?

For how to best do this we will need to hear what Jiri and
Benjamin have to say but if they think the proposed solution
is the correct solution:

Reviewed-by: Denis Benato <denis.benato@linux.dev>
>> Fixes: 4ac74ea68f64 ("HID: asus: early return for ROG devices")
>> Tested-by: Kerim Kabirov <the.privat33r+linux@pm.me>
>> Tested-by: GameBurrow <gameburrow@pm.me>
>> Signed-off-by: Ahmed Yaseen <yaseen@ghoul.dev>
>> ---
>>  drivers/hid/usbhid/hid-core.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
>> index 96b0181cf819..90a8b34d9305 100644
>> --- a/drivers/hid/usbhid/hid-core.c
>> +++ b/drivers/hid/usbhid/hid-core.c
>> @@ -688,7 +688,8 @@ static int usbhid_open(struct hid_device *hid)
>>
>>         set_bit(HID_OPENED, &usbhid->iofl);
>>
>> -       if (hid->quirks & HID_QUIRK_ALWAYS_POLL) {
>> +       if ((hid->quirks & HID_QUIRK_ALWAYS_POLL) ||
>> +           list_empty(&hid->report_enum[HID_INPUT_REPORT].report_list)) {
>>                 res = 0;
>>                 goto Done;
>>         }
>> --
>> 2.54.0
>>
>>
>>
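
The condition added by the quoted hunk can be restated as a small
standalone predicate. A minimal sketch, assuming a simplified model:
struct hid_iface, QUIRK_ALWAYS_POLL and open_should_start_poll() are
hypothetical names, and a plain counter replaces the kernel's report
list. The comment about ALWAYS_POLL devices reflects my reading of
usbhid and should be treated as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUIRK_ALWAYS_POLL 0x1u	/* hypothetical stand-in for HID_QUIRK_ALWAYS_POLL */

/* Hypothetical, simplified view of one HID interface. */
struct hid_iface {
	unsigned int quirks;
	size_t n_input_reports;	/* input reports declared by the descriptor */
};

/*
 * Returns true when open() should start the interrupt IN poll.
 * Assumption: ALWAYS_POLL devices are polled from start-up already, so
 * open() has nothing to start; input-less interfaces have nothing to read.
 */
bool open_should_start_poll(const struct hid_iface *hid)
{
	if (hid->quirks & QUIRK_ALWAYS_POLL)
		return false;
	if (hid->n_input_reports == 0)
		return false;
	return true;
}
```

With this predicate, a keyboard interface that declares input reports
is still polled on open, while an RGB-only interface with feature
reports alone is left untouched.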

^ permalink raw reply

* Re: [PATCH 3/3] arm64: dts: qcom: x1e80100-microsoft-romulus: add phy-reinit-on-resume
From: Krzysztof Kozlowski @ 2026-06-06 11:30 UTC (permalink / raw)
  To: Oliver White, Greg Kroah-Hartman, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Thinh Nguyen, Bjorn Andersson,
	Konrad Dybcio
  Cc: Felipe Balbi, linux-usb, devicetree, linux-arm-msm, linux-kernel
In-Reply-To: <20260601231236.20402-4-oliverjwhite07@gmail.com>

On 02/06/2026 01:12, Oliver White wrote:
> The Surface Laptop 7 gates the USB2 PHY power domain during deep sleep, causing the PHY register state to be lost. When the DWC3 multi-port controller resumes via the fast path (device_may_wakeup), the PHY is not re-initialized and USB2 devices (such as the wired keyboard on the USB-A port) may exhibit corrupted signalling, e.g. stuck modifier key reports.
> 
> Enable the 'snps,reinit-phy-on-resume' quirk to force a full PHY re-initialization cycle on resume.

Please run scripts/checkpatch.pl on the patches and fix reported
warnings. After that, run also 'scripts/checkpatch.pl --strict' on the
patches and (probably) fix more warnings. Some warnings can be ignored,
especially from --strict run, but the code here looks like it needs a
fix. Feel free to get in touch if the warning is not clear.

Please wrap commit message according to Linux coding style / submission
process (neither too early nor over the limit):
https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/submitting-patches.rst#L597

Please read submitting patches and DCO. DCO chain is missing here.

Best regards,
Krzysztof

^ permalink raw reply

* Re: [PATCH 1/3] dt-bindings: usb: dwc3: document snps,reinit-phy-on-resume
From: Dmitry Baryshkov @ 2026-06-06 11:21 UTC (permalink / raw)
  To: Rob Herring
  Cc: Oliver White, Greg Kroah-Hartman, Krzysztof Kozlowski,
	Conor Dooley, Thinh Nguyen, Bjorn Andersson, Konrad Dybcio,
	Felipe Balbi, linux-usb, devicetree, linux-arm-msm, linux-kernel
In-Reply-To: <20260605190638.GA4188454-robh@kernel.org>

On Fri, Jun 05, 2026 at 02:06:38PM -0500, Rob Herring wrote:
> On Tue, Jun 02, 2026 at 11:12:34AM +1200, Oliver White wrote:
> > Add the documentation for the 'snps,reinit-phy-on-resume' boolean
> > property. When set, the DWC3 core will perform a full phy_exit() +
> > phy_init() cycle on each USB2 PHY during the host-mode fast resume
> > path. This is needed on platforms where the USB2 PHY power domain
> > is gated during deep sleep even when device_may_wakeup is true.
> > 
> > Signed-off-by: Oliver White <oliverjwhite07@gmail.com>
> > ---
> >  .../devicetree/bindings/usb/snps,dwc3-common.yaml      | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml b/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml
> > index 6c0b8b653824..d12f6ae81ab8 100644
> > --- a/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml
> > +++ b/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml
> > @@ -212,6 +212,16 @@ properties:
> >        When set, run the SOF/ITP counter based on ref_clk.
> >      type: boolean
> >  
> > +  snps,reinit-phy-on-resume:
> > +    description:
> > +      When set, the DWC3 will re-initialize the USB2 PHYs during the
> > +      host-mode fast resume path (device_may_wakeup). Some platforms
> > +      cut PHY power during deep sleep even when USB wake is enabled,
> > +      and the standard PHY runtime PM resume is insufficient to restore
> > +      the PHY register state. This quirk forces a full phy_exit() +
> > +      phy_init() cycle on each USB2 PHY.
> > +    type: boolean
> 
> This should be implied from a platform specific compatible string.

Platform as in the "root node compatible"?

-- 
With best wishes
Dmitry

^ permalink raw reply

* [PATCH] usb: gadget: tegra-xudc: drain EP pipeline before dma_unmap
From: Vishal Kumar @ 2026-06-06  2:40 UTC (permalink / raw)
  To: linux-usb
  Cc: gregkh, thierry.reding, jonathanh, linux-tegra, linux-kernel,
	Vishal Kumar, stable

On Tegra186/194/234 the XUDC posts a transfer-completion event when the
DMA write is dispatched to the AXI interconnect, before the store is
committed to memory.  Under SMMU strict mode dma_unmap() synchronously
invalidates the IOVA TLB entry.  An in-flight AXI write to the
just-unmapped IOVA triggers a translation fault (fsr=0x402) that
permanently wedges the bulk-OUT endpoint.

Observed on Tegra234 (Jetson Orin Nano) at ~170 MB/s USB-NCM transfers:

  arm-smmu 8000000.iommu: Unhandled context fault: fsr=0x402,
    iova=0xfffb5000, cbfrsynra=0x100f, cb=3
  tegra-mc 2c00000.memory-controller: EMEM address decode error

cbfrsynra=0x100f identifies XUDC (StreamID 0x0f per DT), cb=3 is iommu
group 4 (3550000.usb).  fsr=0x402 is a translation fault on a DMA write.

Fix: poll EP_THREAD_ACTIVE before calling usb_gadget_unmap_request() for
non-control endpoints.  EP_THREAD_ACTIVE clearing is the hardware's
guarantee that the endpoint sequencer is idle and all AXI transactions
have completed, so the subsequent TLB invalidation cannot race an
in-flight write.

Also change ep_wait_for_inactive() to return the readl_poll_timeout()
status so callers can detect a timeout.  On timeout in the completion
path, skip dma_unmap() to avoid the translation fault and force
req->usb_req.status = -EIO so the gadget driver does not treat the
transfer as successful or requeue the still-mapped buffer.  On timeout
in the dequeue path, emit a warning.

Fixes: 49d6f3dd4abe ("usb: gadget: add tegra xusb device mode driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Vishal Kumar <vishalmimani008@gmail.com>
---
 drivers/usb/gadget/udc/tegra-xudc.c | 47 ++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/gadget/udc/tegra-xudc.c b/drivers/usb/gadget/udc/tegra-xudc.c
index 0b63b8c0a..3f18beddf 100644
--- a/drivers/usb/gadget/udc/tegra-xudc.c
+++ b/drivers/usb/gadget/udc/tegra-xudc.c
@@ -1023,9 +1023,9 @@ static void ep_wait_for_stopped(struct tegra_xudc *xudc, unsigned int ep)
 	xudc_writel(xudc, BIT(ep), EP_STOPPED);
 }
 
-static void ep_wait_for_inactive(struct tegra_xudc *xudc, unsigned int ep)
+static int ep_wait_for_inactive(struct tegra_xudc *xudc, unsigned int ep)
 {
-	xudc_readl_poll(xudc, EP_THREAD_ACTIVE, BIT(ep), 0);
+	return xudc_readl_poll(xudc, EP_THREAD_ACTIVE, BIT(ep), 0);
 }
 
 static void tegra_xudc_req_done(struct tegra_xudc_ep *ep,
@@ -1046,8 +1046,39 @@ static void tegra_xudc_req_done(struct tegra_xudc_ep *ep,
 					 (xudc->setup_state ==
 					  DATA_STAGE_XFER));
 	} else {
-		usb_gadget_unmap_request(&xudc->gadget, &req->usb_req,
-					 usb_endpoint_dir_in(ep->desc));
+		/*
+		 * Drain the endpoint DMA pipeline before unmapping.
+		 *
+		 * Under SMMU strict mode dma_unmap() synchronously
+		 * invalidates the IOVA TLB entry.  On Tegra186/194/234 the
+		 * XUDC appears to post the completion event when the DMA
+		 * write is dispatched to the AXI interconnect, before the
+		 * store is committed to memory.  A subsequent dma_unmap()
+		 * can remove the IOVA translation while the write is still
+		 * in-flight, triggering a translation fault (fsr=0x402) that
+		 * permanently wedges the bulk endpoint.
+		 *
+		 * Wait for EP_THREAD_ACTIVE to clear (endpoint sequencer
+		 * idle).  On timeout skip the unmap to avoid the SMMU fault;
+		 * the DMA mapping leaks but the hardware is already in an
+		 * unrecoverable state.
+		 */
+		if (!WARN_ONCE(ep_wait_for_inactive(xudc, ep->index),
+			       "ep%u: DMA drain timed out; skipping dma_unmap\n",
+			       ep->index)) {
+			/* Read-back completes the poll barrier; EP_THREAD_ACTIVE=0 guarantees DMA is idle. */
+			xudc_readl(xudc, EP_THREAD_ACTIVE);
+			usb_gadget_unmap_request(&xudc->gadget, &req->usb_req,
+						 usb_endpoint_dir_in(ep->desc));
+		} else {
+			/*
+			 * Timeout: mapping is intentionally leaked to avoid the
+			 * SMMU fault.  Force -EIO so the gadget driver does not
+			 * treat this as a successful transfer and reuse the
+			 * still-mapped buffer.
+			 */
+			req->usb_req.status = -EIO;
+		}
 	}
 
 	spin_unlock(&xudc->lock);
@@ -1443,10 +1474,12 @@ __tegra_xudc_ep_dequeue(struct tegra_xudc_ep *ep,
 		return 0;
 	}
 
-	/* Halt DMA for this endpiont. */
+	/* Halt DMA for this endpoint. */
 	if (ep_ctx_read_state(ep->context) == EP_STATE_RUNNING) {
 		ep_pause(xudc, ep->index);
-		ep_wait_for_inactive(xudc, ep->index);
+		if (ep_wait_for_inactive(xudc, ep->index))
+			dev_warn(xudc->dev, "ep%u: DMA drain timed out during dequeue\n",
+				 ep->index);
 	}
 
 	deq_trb = trb_phys_to_virt(ep, ep_ctx_read_deq_ptr(ep->context));

2.39.0
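
The drain-before-unmap ordering in this patch can be illustrated with a
userspace model. A minimal sketch, assuming a fake endpoint whose busy
bit clears after a configurable number of reads: struct fake_ep,
read_thread_active(), wait_for_inactive() and req_done() are all
hypothetical, and a bounded loop stands in for readl_poll_timeout().
It is not the tegra-xudc code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fake device: the register reads busy for a few polls. */
struct fake_ep {
	uint32_t thread_active;	/* models one bit of EP_THREAD_ACTIVE */
	int polls_until_idle;	/* < 0 means the bit never clears */
	bool mapped;		/* models the request's DMA mapping */
	int status;		/* models req->usb_req.status */
};

uint32_t read_thread_active(struct fake_ep *ep)
{
	if (ep->polls_until_idle == 0)
		ep->thread_active = 0;
	else if (ep->polls_until_idle > 0)
		ep->polls_until_idle--;
	return ep->thread_active;
}

/* Poll until the bit clears; 0 on success, -ETIMEDOUT otherwise. */
int wait_for_inactive(struct fake_ep *ep, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (read_thread_active(ep) == 0)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Completion path: unmap only once the endpoint is known to be idle. */
void req_done(struct fake_ep *ep, int max_polls)
{
	if (wait_for_inactive(ep, max_polls) == 0) {
		ep->mapped = false;	/* idle, so the unmap cannot race a write */
	} else {
		ep->status = -EIO;	/* mapping deliberately left in place */
	}
}
```

The two branches mirror the commit message: an endpoint that goes idle
is unmapped as before, and one that never goes idle keeps its mapping
and reports -EIO instead of success.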

^ permalink raw reply related

* Re: [PATCH v3] usb: dwc3: avoid probe deferral when USB power supply is not available
From: Thinh Nguyen @ 2026-06-06  1:15 UTC (permalink / raw)
  To: Elson Serrao
  Cc: Thinh Nguyen, Greg Kroah-Hartman, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org, jack.pham@oss.qualcomm.com,
	wesley.cheng@oss.qualcomm.com
In-Reply-To: <20260605181142.1925832-1-elson.serrao@oss.qualcomm.com>

On Fri, Jun 05, 2026, Elson Serrao wrote:
> The dwc3 driver currently defers probe if the USB power supply is not yet
> registered. On some platforms, even though charging and power supply
> functionality is available during normal operation, there may exist
> minimal booting modes (such as recovery or diagnostic environments) where
> the relevant USB power supply device is not registered. In such cases,
> probe deferral prevents USB gadget operation entirely.
> 
> USB data functionality for basic operation does not inherently depend on
> the power supply framework, which is only required for enforcing VBUS
> current control. The configured VBUS current limit is typically enforced
> through the charger or PMIC power path. When charging functionality is
> unavailable, applying a current limit has no practical effect, reducing
> the benefit of strict probe-time enforcement in these environments.
> 
> Instead of deferring probe, register a power supply notifier when the
> USB power supply is not yet available. Cache the requested VBUS current
> limit and apply it once the matching power supply becomes available, as
> notified through the registered callback.
> 
> Signed-off-by: Elson Serrao <elson.serrao@oss.qualcomm.com>
> ---
> Changes in v3:
>  - Introduced DWC3_CURRENT_UNSPECIFIED macro to replace UINT_MAX for
>    improved code readability.
>  - Enhanced dwc3_gadget_vbus_draw() to return success when power supply
>    is expected but not ready yet, and only return -EOPNOTSUPP when truly
>    not supported.
>  - Link to v2: https://lore.kernel.org/all/20260526183016.3501307-1-elson.serrao@oss.qualcomm.com/
> 
> Changes in v2:
>  - Removed notifier unregistration from the vbus_draw work function to
>    avoid a race with remove callback.
>  - Added an early psy registration check in the notifier callback.
>  - Moved power supply registration check after notifier registration
>    in dwc3_get_usb_power_supply() to address the race identified in v1.
>  - Link to v1: https://lore.kernel.org/all/20260407232410.4101455-1-elson.serrao@oss.qualcomm.com/
> ---
>  drivers/usb/dwc3/core.c   | 99 +++++++++++++++++++++++++++++++++------
>  drivers/usb/dwc3/core.h   |  6 +++
>  drivers/usb/dwc3/gadget.c | 15 +++++-
>  3 files changed, 104 insertions(+), 16 deletions(-)
> 

[...]

> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index e0dee9d28740..d722d3f1402a 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -677,6 +677,8 @@
>  /* Force Gen1 speed on Gen2 link */
>  #define DWC3_LLUCTL_FORCE_GEN1		BIT(10)
>  
> +#define DWC3_CURRENT_UNSPECIFIED	UINT_MAX

Would be nice if we can place this macro right under the current_limit
field of the dwc3 struct.

But I know it's minor and not a blocker for this patch, so feel free to
ignore this comment if you think it's not worth the change.

Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>

Thanks,
Thinh
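
The "cache the limit, apply it when the supply appears" behaviour
described in the quoted commit message can be modelled in userspace C.
A minimal sketch under stated assumptions: struct supply, struct ctrl
and the ctrl_*() helpers are hypothetical, a direct function call
stands in for the power supply notifier, and there is no locking or
reference counting. It is not the dwc3 code.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CURRENT_UNSPECIFIED UINT_MAX	/* models DWC3_CURRENT_UNSPECIFIED */

/* Hypothetical stand-in for a registered power supply. */
struct supply {
	const char *name;
	unsigned int limit_ma;	/* last limit applied to the supply */
};

/* Hypothetical, reduced controller state. */
struct ctrl {
	const char *wanted_name;	/* models the "usb-psy-name" property */
	struct supply *psy;		/* NULL until the supply shows up */
	unsigned int current_limit;	/* cached request, or the sentinel */
};

void ctrl_init(struct ctrl *c, const char *wanted_name)
{
	c->wanted_name = wanted_name;
	c->psy = NULL;
	c->current_limit = CURRENT_UNSPECIFIED;
}

/* Models vbus_draw: always cache, apply only if the supply exists. */
int ctrl_vbus_draw(struct ctrl *c, unsigned int ma)
{
	c->current_limit = ma;
	if (c->psy)
		c->psy->limit_ma = ma;
	return 0;	/* not an error when the supply is merely late */
}

/* Models the notifier callback: true if the supply was adopted. */
bool ctrl_supply_registered(struct ctrl *c, struct supply *s)
{
	if (c->psy || strcmp(s->name, c->wanted_name) != 0)
		return false;
	c->psy = s;
	if (c->current_limit != CURRENT_UNSPECIFIED)
		s->limit_ma = c->current_limit;	/* replay the cached request */
	return true;
}
```

The sentinel keeps "no limit requested yet" distinct from any real
value, so a supply that registers before the first request does not
have a bogus limit applied to it.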

^ permalink raw reply

* Re: [PATCH 1/3] dt-bindings: usb: dwc3: document snps,reinit-phy-on-resume
From: Rob Herring @ 2026-06-05 19:06 UTC (permalink / raw)
  To: Oliver White
  Cc: Greg Kroah-Hartman, Krzysztof Kozlowski, Conor Dooley,
	Thinh Nguyen, Bjorn Andersson, Konrad Dybcio, Felipe Balbi,
	linux-usb, devicetree, linux-arm-msm, linux-kernel
In-Reply-To: <20260601231236.20402-2-oliverjwhite07@gmail.com>

On Tue, Jun 02, 2026 at 11:12:34AM +1200, Oliver White wrote:
> Add the documentation for the 'snps,reinit-phy-on-resume' boolean
> property. When set, the DWC3 core will perform a full phy_exit() +
> phy_init() cycle on each USB2 PHY during the host-mode fast resume
> path. This is needed on platforms where the USB2 PHY power domain
> is gated during deep sleep even when device_may_wakeup is true.
> 
> Signed-off-by: Oliver White <oliverjwhite07@gmail.com>
> ---
>  .../devicetree/bindings/usb/snps,dwc3-common.yaml      | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml b/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml
> index 6c0b8b653824..d12f6ae81ab8 100644
> --- a/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml
> +++ b/Documentation/devicetree/bindings/usb/snps,dwc3-common.yaml
> @@ -212,6 +212,16 @@ properties:
>        When set, run the SOF/ITP counter based on ref_clk.
>      type: boolean
>  
> +  snps,reinit-phy-on-resume:
> +    description:
> +      When set, the DWC3 will re-initialize the USB2 PHYs during the
> +      host-mode fast resume path (device_may_wakeup). Some platforms
> +      cut PHY power during deep sleep even when USB wake is enabled,
> +      and the standard PHY runtime PM resume is insufficient to restore
> +      the PHY register state. This quirk forces a full phy_exit() +
> +      phy_init() cycle on each USB2 PHY.
> +    type: boolean

This should be implied from a platform specific compatible string.

Rob

^ permalink raw reply

* Re: [RFC PATCH v1 2/3] early: usb: xhci-dbc: Handle out of bounds xhci-xdbc capability
From: Mathias Nyman @ 2026-06-05 18:16 UTC (permalink / raw)
  To: Umang Jain, Greg Kroah-Hartman, Lucas De Marchi
  Cc: linux-usb, linux-kernel, kernel-dev
In-Reply-To: <20260604144122.962236-3-uajain@igalia.com>

Hi

On 6/4/26 17:41, Umang Jain wrote:
> Currently, the early xhci-dbc assumes that the extended capability
> can be mapped within the fixed boot time mappings dictated by
> NR_FIX_BTMAPS.
> 
> This patch iterates over the PCI BAR address size to find and map
> xhci-xdbc capability which could be out-of-bounds otherwise,
> in xdbc_map_pci_mmio(). The iterations map the maximum allowed
> boot time mappings (fixmap size) at a time and search for xhci-xdbc
> capability offset, till the end of the bar address size.
> 

Patch 1/3 can probably be merged into this one.

> Signed-off-by: Umang Jain <uajain@igalia.com>
> ---
>   drivers/usb/early/xhci-dbc.c | 47 +++++++++++++++++++++++++++++++++---
>   1 file changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/early/xhci-dbc.c b/drivers/usb/early/xhci-dbc.c
> index 8ce362a90910..1f6a129d4b5d 100644
> --- a/drivers/usb/early/xhci-dbc.c
> +++ b/drivers/usb/early/xhci-dbc.c
> @@ -35,10 +35,13 @@ static bool early_console_keep;
>   static inline void xdbc_trace(const char *fmt, ...) { }
>   #endif /* XDBC_TRACE */
>   
> +#define XDBC_MAPPING_SIZE 56
> +

I know the spec says 56 bytes, but looking at the Debug Capability structure
in xHCI section 7.6.8, it looks like 64 bytes.

>   static void __iomem * __init xdbc_map_pci_mmio(u32 bus, u32 dev, u32 func)
>   {
> -	u64 val64, sz64, mask64;
> +	u64 val64, sz64, mask64, fixmap_size, mapped_size;
>   	void __iomem *base;
> +	int offset;
>   	u32 val, sz;
>   	u8 byte;
>   
> @@ -85,8 +88,46 @@ static void __iomem * __init xdbc_map_pci_mmio(u32 bus, u32 dev, u32 func)
>   
>   	xdbc.xhci_start = val64;
>   	xdbc.xhci_length = sz64;
> -	base = early_ioremap(val64, sz64);
> -	xdbc.xhci_base_length = sz64;
> +
> +	fixmap_size = NR_FIX_BTMAPS << PAGE_SHIFT;
> +	if (sz64 < fixmap_size) {
> +		xdbc.xhci_base_length = sz64;
> +		return early_ioremap(val64, sz64);
> +	}
> +
> +	/*
> +	 * Base address size is greater than fixed size boot mappings,
> +	 * hence iterate over the region one fixmap_size at a time.
> +	 */
> +	base = early_ioremap(val64, fixmap_size);
> +	offset = xhci_find_next_ext_cap(base, 0, 0);
> +	mapped_size = fixmap_size;
> +
> +	while (mapped_size <= sz64) {
> +		val = readl(base + offset);
> +		if (XHCI_EXT_CAPS_ID(val) == XHCI_EXT_CAPS_DEBUG) {
> +			if (offset + XDBC_MAPPING_SIZE > fixmap_size) {
> +				early_iounmap(base, fixmap_size);
> +				base = early_ioremap(val64 + offset, XDBC_MAPPING_SIZE);

I took a closer look, and it turns out we do sometimes need to touch registers in other
extended capabilities, mainly BIOS handoff in XHCI_EXT_CAPS_LEGACY and port reset in
XHCI_EXT_CAPS_PROTOCOL.

In the case where xHC size is larger than early_ioremap() allows, I would just
early_ioremap() the maximum allowed size once, starting from xdbc.xhci_start.
Then walk the extended capabilities list, ensuring DbC and the other needed capabilities
are inside this maximum allowed size.
If they are not, early_iounmap() and fail.
   
This way we can also access the normal xHC host registers in case we need to reset the
controller, or ensure the 'controller not ready' bit is clear.

Thanks
Mathias
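
The "map once, then check that the needed capabilities fall inside the
mapping" approach suggested above can be sketched in userspace C. This
is a minimal model, assuming an in-memory array in place of the mapped
MMIO window: find_cap_in_window() and the CAP_* macros are hypothetical
names, the caller passes the byte offset of the first capability
directly, and the size each capability needs is left to the caller
rather than fixed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_ID(v)	((v) & 0xffu)		/* bits 7:0: capability ID */
#define CAP_NEXT(v)	(((v) >> 8) & 0xffu)	/* bits 15:8: next pointer, in dwords */
#define CAP_ID_DEBUG	10u			/* Debug Capability ID */

/*
 * Walk the extended capability list inside a window of `mapped` bytes.
 * `regs` is the start of the window viewed as 32-bit registers, `first`
 * is the byte offset of the first capability and `need` is how many
 * bytes of the wanted capability must fit in the window.
 * Returns the capability's byte offset, or 0 if it is absent or does
 * not fit entirely inside the window.
 */
size_t find_cap_in_window(const uint32_t *regs, size_t mapped,
			  size_t first, uint32_t id, size_t need)
{
	size_t off = first;

	while (off != 0 && off + sizeof(uint32_t) <= mapped) {
		uint32_t val = regs[off / sizeof(uint32_t)];

		if (CAP_ID(val) == id)
			return (off + need <= mapped) ? off : 0;
		if (CAP_NEXT(val) == 0)
			return 0;	/* end of list */
		off += (size_t)CAP_NEXT(val) * sizeof(uint32_t);
	}
	return 0;	/* list walked out of the mapped window */
}
```

A caller would run this once per capability it depends on (debug,
legacy, protocol) and give up on the early console if any lookup
returns 0, rather than remapping a second window.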

^ permalink raw reply

* [PATCH v3] usb: dwc3: avoid probe deferral when USB power supply is not available
From: Elson Serrao @ 2026-06-05 18:11 UTC (permalink / raw)
  To: Thinh Nguyen, Greg Kroah-Hartman
  Cc: linux-usb, linux-kernel, jack.pham, wesley.cheng, Elson Serrao

The dwc3 driver currently defers probe if the USB power supply is not yet
registered. On some platforms, even though charging and power supply
functionality is available during normal operation, there may exist
minimal booting modes (such as recovery or diagnostic environments) where
the relevant USB power supply device is not registered. In such cases,
probe deferral prevents USB gadget operation entirely.

USB data functionality for basic operation does not inherently depend on
the power supply framework, which is only required for enforcing VBUS
current control. The configured VBUS current limit is typically enforced
through the charger or PMIC power path. When charging functionality is
unavailable, applying a current limit has no practical effect, reducing
the benefit of strict probe-time enforcement in these environments.

Instead of deferring probe, register a power supply notifier when the
USB power supply is not yet available. Cache the requested VBUS current
limit and apply it once the matching power supply becomes available, as
notified through the registered callback.

Signed-off-by: Elson Serrao <elson.serrao@oss.qualcomm.com>
---
Changes in v3:
 - Introduced DWC3_CURRENT_UNSPECIFIED macro to replace UINT_MAX for
   improved code readability.
 - Enhanced dwc3_gadget_vbus_draw() to return success when power supply
   is expected but not ready yet, and only return -EOPNOTSUPP when truly
   not supported.
 - Link to v2: https://lore.kernel.org/all/20260526183016.3501307-1-elson.serrao@oss.qualcomm.com/

Changes in v2:
 - Removed notifier unregistration from the vbus_draw work function to
   avoid a race with remove callback.
 - Added an early psy registration check in the notifier callback.
 - Moved power supply registration check after notifier registration
   in dwc3_get_usb_power_supply() to address the race identified in v1.
 - Link to v1: https://lore.kernel.org/all/20260407232410.4101455-1-elson.serrao@oss.qualcomm.com/
---
 drivers/usb/dwc3/core.c   | 99 +++++++++++++++++++++++++++++++++------
 drivers/usb/dwc3/core.h   |  6 +++
 drivers/usb/dwc3/gadget.c | 15 +++++-
 3 files changed, 104 insertions(+), 16 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 58899b1fa96d..8558bd3f38ea 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2192,22 +2192,89 @@ static void dwc3_vbus_draw_work(struct work_struct *work)
 			ret, dwc->current_limit);
 }
 
-static struct power_supply *dwc3_get_usb_power_supply(struct dwc3 *dwc)
+static int dwc3_psy_notifier(struct notifier_block *nb,
+			     unsigned long event, void *data)
 {
-	struct power_supply *usb_psy;
-	const char *usb_psy_name;
+	struct dwc3 *dwc = container_of(nb, struct dwc3, psy_nb);
+	struct power_supply *psy = data;
+	unsigned long flags;
+
+	if (dwc->usb_psy)
+		return NOTIFY_DONE;
+
+	if (strcmp(psy->desc->name, dwc->usb_psy_name) != 0)
+		return NOTIFY_DONE;
+
+	/* Explicitly get the reference for this psy */
+	psy = power_supply_get_by_name(dwc->usb_psy_name);
+	if (!psy)
+		return NOTIFY_DONE;
+
+	spin_lock_irqsave(&dwc->lock, flags);
+	/*
+	 * The USB power_supply may already be set. This can happen if notifier
+	 * callbacks for the USB power_supply race, or if a previous notifier
+	 * callback has already successfully fetched and associated the instance.
+	 * In such cases, release the newly acquired reference and ignore
+	 * subsequent notifications until the notifier is unregistered.
+	 */
+	if (dwc->usb_psy) {
+		spin_unlock_irqrestore(&dwc->lock, flags);
+		power_supply_put(psy);
+		return NOTIFY_DONE;
+	}
+
+	dwc->usb_psy = psy;
+	if (dwc->current_limit != DWC3_CURRENT_UNSPECIFIED)
+		schedule_work(&dwc->vbus_draw_work);
+	spin_unlock_irqrestore(&dwc->lock, flags);
+
+	return NOTIFY_OK;
+}
+
+static void dwc3_get_usb_power_supply(struct dwc3 *dwc)
+{
+	struct power_supply *psy;
+	unsigned long flags;
 	int ret;
 
-	ret = device_property_read_string(dwc->dev, "usb-psy-name", &usb_psy_name);
+	ret = device_property_read_string(dwc->dev, "usb-psy-name", &dwc->usb_psy_name);
 	if (ret < 0)
-		return NULL;
-
-	usb_psy = power_supply_get_by_name(usb_psy_name);
-	if (!usb_psy)
-		return ERR_PTR(-EPROBE_DEFER);
+		return;
 
 	INIT_WORK(&dwc->vbus_draw_work, dwc3_vbus_draw_work);
-	return usb_psy;
+
+	dwc->current_limit = DWC3_CURRENT_UNSPECIFIED;
+	dwc->psy_nb.notifier_call = dwc3_psy_notifier;
+	ret = power_supply_reg_notifier(&dwc->psy_nb);
+	if (ret) {
+		dev_err(dwc->dev, "Failed to register power supply notifier: %d\n", ret);
+		dwc->psy_nb.notifier_call = NULL;
+		return;
+	}
+
+	psy = power_supply_get_by_name(dwc->usb_psy_name);
+	if (!psy)
+		return;
+
+	/* Unregister the notifier now that we have the power supply */
+	power_supply_unreg_notifier(&dwc->psy_nb);
+	dwc->psy_nb.notifier_call = NULL;
+
+	spin_lock_irqsave(&dwc->lock, flags);
+	/*
+	 * It is possible that the notifier callback ran before we reached here
+	 * and successfully fetched the power supply. In that case we need to
+	 * release the above reference.
+	 */
+	if (dwc->usb_psy) {
+		spin_unlock_irqrestore(&dwc->lock, flags);
+		power_supply_put(psy);
+		return;
+	}
+
+	dwc->usb_psy = psy;
+	spin_unlock_irqrestore(&dwc->lock, flags);
 }
 
 int dwc3_core_probe(const struct dwc3_probe_data *data)
@@ -2255,9 +2322,9 @@ int dwc3_core_probe(const struct dwc3_probe_data *data)
 
 	dwc3_get_software_properties(dwc, &data->properties);
 
-	dwc->usb_psy = dwc3_get_usb_power_supply(dwc);
-	if (IS_ERR(dwc->usb_psy))
-		return dev_err_probe(dev, PTR_ERR(dwc->usb_psy), "couldn't get usb power supply\n");
+	spin_lock_init(&dwc->lock);
+
+	dwc3_get_usb_power_supply(dwc);
 
 	if (!data->ignore_clocks_and_resets) {
 		dwc->reset = devm_reset_control_array_get_optional_shared(dev);
@@ -2309,7 +2376,6 @@ int dwc3_core_probe(const struct dwc3_probe_data *data)
 		dwc->num_usb3_ports = 1;
 	}
 
-	spin_lock_init(&dwc->lock);
 	mutex_init(&dwc->mutex);
 
 	pm_runtime_get_noresume(dev);
@@ -2377,6 +2443,8 @@ int dwc3_core_probe(const struct dwc3_probe_data *data)
 err_assert_reset:
 	reset_control_assert(dwc->reset);
 err_put_psy:
+	if (dwc->psy_nb.notifier_call)
+		power_supply_unreg_notifier(&dwc->psy_nb);
 	if (dwc->usb_psy)
 		power_supply_put(dwc->usb_psy);
 
@@ -2433,6 +2501,9 @@ void dwc3_core_remove(struct dwc3 *dwc)
 
 	dwc3_free_event_buffers(dwc);
 
+	if (dwc->psy_nb.notifier_call)
+		power_supply_unreg_notifier(&dwc->psy_nb);
+
 	if (dwc->usb_psy) {
 		cancel_work_sync(&dwc->vbus_draw_work);
 		power_supply_put(dwc->usb_psy);
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index e0dee9d28740..d722d3f1402a 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -677,6 +677,8 @@
 /* Force Gen1 speed on Gen2 link */
 #define DWC3_LLUCTL_FORCE_GEN1		BIT(10)
 
+#define DWC3_CURRENT_UNSPECIFIED	UINT_MAX
+
 /* Structures */
 
 struct dwc3_trb;
@@ -1059,6 +1061,8 @@ struct dwc3_glue_ops {
  * @role_switch_default_mode: default operation mode of controller while
  *			usb role is USB_ROLE_NONE.
  * @usb_psy: pointer to power supply interface.
+ * @usb_psy_name: name of the USB power supply
+ * @psy_nb: power supply notifier block
  * @vbus_draw_work: Work to set the vbus drawing limit
  * @current_limit: How much current to draw from vbus, in milliAmperes.
  * @usb2_phy: pointer to USB2 PHY
@@ -1251,6 +1255,8 @@ struct dwc3 {
 	enum usb_dr_mode	role_switch_default_mode;
 
 	struct power_supply	*usb_psy;
+	const char		*usb_psy_name;
+	struct notifier_block	psy_nb;
 	struct work_struct	vbus_draw_work;
 	unsigned int		current_limit;
 
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 3d4ca68e584c..c36d2a949231 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -3124,15 +3124,26 @@ static void dwc3_gadget_set_ssp_rate(struct usb_gadget *g,
 static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned int mA)
 {
 	struct dwc3		*dwc = gadget_to_dwc(g);
+	unsigned long		flags;
 
 	if (dwc->usb2_phy)
 		return usb_phy_set_power(dwc->usb2_phy, mA);
 
-	if (!dwc->usb_psy)
-		return -EOPNOTSUPP;
+	spin_lock_irqsave(&dwc->lock, flags);
+	if (!dwc->usb_psy) {
+		if (!dwc->psy_nb.notifier_call) {
+			spin_unlock_irqrestore(&dwc->lock, flags);
+			return -EOPNOTSUPP;
+		}
+		dwc->current_limit = mA;
+		spin_unlock_irqrestore(&dwc->lock, flags);
+		dev_dbg(dwc->dev, "Stored VBUS draw: %u mA (power supply not ready)\n", mA);
+		return 0;
+	}
 
 	dwc->current_limit = mA;
 	schedule_work(&dwc->vbus_draw_work);
+	spin_unlock_irqrestore(&dwc->lock, flags);
 
 	return 0;
 }
-- 
2.34.1
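
The two races the patch handles (the notifier and the probe path both looking up the supply, and a VBUS draw request arriving before the supply exists) can be modelled outside the kernel. What follows is a minimal userspace sketch, not the driver code: `struct ctrl`, `struct supply`, `ctrl_attach_supply()` and `ctrl_set_limit()` are hypothetical names, a pthread mutex stands in for `dwc->lock`, a plain counter stands in for the power supply reference, and the limit is applied directly rather than from a work item.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted power supply handle. */
struct supply {
	int refs;
};

#define LIMIT_UNSPECIFIED UINT_MAX

/* Hypothetical stand-in for the parts of struct dwc3 that matter here. */
struct ctrl {
	pthread_mutex_t lock;
	struct supply *psy;		/* NULL until a supply is attached */
	unsigned int limit_ma;		/* last requested limit, if any */
	unsigned int applied_ma;	/* limit that has been applied so far */
};

static void ctrl_init(struct ctrl *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->psy = NULL;
	c->limit_ma = LIMIT_UNSPECIFIED;
	c->applied_ma = LIMIT_UNSPECIFIED;
}

/*
 * Attach a supply for which the caller already holds a reference.
 * Only the first caller wins; a later caller drops its reference.
 * Returns true if this call installed the supply.
 */
static bool ctrl_attach_supply(struct ctrl *c, struct supply *psy)
{
	bool installed = false;

	assert(psy && psy->refs > 0);

	pthread_mutex_lock(&c->lock);
	if (!c->psy) {
		c->psy = psy;
		installed = true;
		/* Replay a request that arrived before the supply existed. */
		if (c->limit_ma != LIMIT_UNSPECIFIED)
			c->applied_ma = c->limit_ma;
	}
	pthread_mutex_unlock(&c->lock);

	if (!installed)
		psy->refs--;	/* lost the race: drop the extra reference */

	return installed;
}

/* Record a requested limit; apply it now only if a supply is attached. */
static void ctrl_set_limit(struct ctrl *c, unsigned int ma)
{
	pthread_mutex_lock(&c->lock);
	c->limit_ma = ma;
	if (c->psy)
		c->applied_ma = ma;
	pthread_mutex_unlock(&c->lock);
}
```

In this model, a limit set before any supply is attached is only stored, and the first successful `ctrl_attach_supply()` call applies it. A second attach attempt leaves the installed supply alone and gives back its own reference.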



* Re: [PATCH] usb: hub: Make usb_hub_wq type depend on isolcpus/nohz_full setting
From: Waiman Long @ 2026-06-05 18:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Greg Kroah-Hartman, Mathias Nyman, Alan Stern, Kuen-Han Tsai,
	linux-usb, linux-kernel, Vratislav Bendel
In-Reply-To: <ahb97cx_4dqPayDW@localhost.localdomain>

On 5/27/26 10:21 AM, Frederic Weisbecker wrote:
> Le Thu, May 21, 2026 at 01:06:59PM -0400, Waiman Long a écrit :
>> A Red Hat customer reports a kernel stability problem where hung tasks
>> are reported with occasional kernel panics. Analysis of the core dump
>> indicates that USB work items are running on isolcpus+nohz_full cores
>> competing with RT-class tasks running on those cores while holding
>> usb_hub device mutex transitively blocking other kworkers waiting for
>> the same mutex leading to hung_task reports.
>>
>> As the usb_hub_wq uses the WQ_PERCPU flag, it will run the work items on
>> the same CPU that queues them. For many use cases, it is a more efficient
>> setup leading to higher throughput as it reduces cacheline bouncing.
>>
>> It is a different story if the system needs to run latency sensitive RT
>> workload on dedicated isolated CPUs. Having the kworkers processing work
>> items on the same set of isolated CPUs will likely break the low latency
>> requirements of the RT tasks. As the RT tasks have higher priority,
>> not much CPU time will be left running the kworkers to process work
>> items which, in turn, will block other tasks that have dependency on
>> the completion of those work items. In this case, using a WQ_UNBOUND
>> workqueue to avoid running on isolated CPUs will be more beneficial.
>>
>> One solution to get the best of both worlds is to make the workqueue
>> type depend on whether the "isolcpus" or "nohz_full" boot command
>> line options have been specified. If at least one of those options is
>> present, usb_hub_wq will be created as an unbound workqueue. Otherwise,
>> it will remain a percpu workqueue.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   drivers/usb/core/hub.c | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
>> index 24960ba9caa9..f79e5edd627a 100644
>> --- a/drivers/usb/core/hub.c
>> +++ b/drivers/usb/core/hub.c
>> @@ -33,6 +33,7 @@
>>   #include <linux/random.h>
>>   #include <linux/pm_qos.h>
>>   #include <linux/kobject.h>
>> +#include <linux/sched/isolation.h>
>>   
>>   #include <linux/bitfield.h>
>>   #include <linux/uaccess.h>
>> @@ -6066,6 +6067,8 @@ static struct usb_driver hub_driver = {
>>   
>>   int usb_hub_init(void)
>>   {
>> +	unsigned int wq_flags;
>> +
>>   	if (usb_register(&hub_driver) < 0) {
>>   		printk(KERN_ERR "%s: can't register hub driver\n",
>>   			usbcore_name);
>> @@ -6077,8 +6080,17 @@ int usb_hub_init(void)
>>   	 * USB-PERSIST port handover. Otherwise it might see that a full-speed
>>   	 * device was gone before the EHCI controller had handed its port
>>   	 * over to the companion full-speed controller.
>> +	 *
>> +	 * Create WQ_UNBOUND workqueue instead of WQ_PERCPU if either isolcpus
>> +	 * or nohz_full boot option is specified.
>>   	 */
>> -	hub_wq = alloc_workqueue("usb_hub_wq", WQ_FREEZABLE | WQ_PERCPU, 0);
>> +	if (housekeeping_enabled(HK_TYPE_DOMAIN) ||
>> +	    housekeeping_enabled(HK_TYPE_KERNEL_NOISE))
> HK_TYPE_DOMAIN is supposed to be a subset of HK_TYPE_KERNEL_NOISE anyway so
> the first should be enough.
Yes, that is the ideal case. However, that is not currently what some 
users are doing as they haven't changed their setup yet.
>
>> +		wq_flags = WQ_UNBOUND;
>> +	else
>> +		wq_flags = WQ_PERCPU;
>> +
>> +	hub_wq = alloc_workqueue("usb_hub_wq", WQ_FREEZABLE | wq_flags, 0);
> But then what happens if no isolcpus= is passed but later cpuset creates
> an isolated partition?
>
> Tejun and Marco thought about introducing a WQ_PREFER_PERCPU flag that would
> do what you want above. And the workqueue code should also handle dynamic
> isolation, that is switch from per-cpu workqueues to unbound ones or vice-versa
> dynamically.
>
> Marco is working on it.

I was thinking that support for dynamically isolated CPUs would come in a
later patch. Good to hear that Tejun and Marco are working on a new WQ
flag. I will switch to the new flag once it is available. However, the
current patch will still be useful, especially for backporting to older
distro kernels.
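
For reference, the boot-time decision in the patch can be sketched in plain userspace C. This is only an illustration: the `SKETCH_WQ_*` values and `hub_wq_flags()` are hypothetical, and the two booleans stand in for the results of `housekeeping_enabled(HK_TYPE_DOMAIN)` and `housekeeping_enabled(HK_TYPE_KERNEL_NOISE)`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; the kernel defines its own. */
enum {
	SKETCH_WQ_FREEZABLE = 1 << 0,
	SKETCH_WQ_UNBOUND   = 1 << 1,
	SKETCH_WQ_PERCPU    = 1 << 2,
};

/*
 * domain_isolated models housekeeping_enabled(HK_TYPE_DOMAIN) and
 * noise_isolated models housekeeping_enabled(HK_TYPE_KERNEL_NOISE).
 * Either one being set selects an unbound workqueue.
 */
static unsigned int hub_wq_flags(bool domain_isolated, bool noise_isolated)
{
	unsigned int flags = SKETCH_WQ_FREEZABLE;

	if (domain_isolated || noise_isolated)
		flags |= SKETCH_WQ_UNBOUND;
	else
		flags |= SKETCH_WQ_PERCPU;

	/* The two placement flags are mutually exclusive. */
	assert(!((flags & SKETCH_WQ_UNBOUND) && (flags & SKETCH_WQ_PERCPU)));

	return flags;
}
```

Because the choice is made once at init time, it cannot follow a cpuset partition created later, which is the limitation raised above.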

Cheers,
Longman

> Thanks.
>
>>   	if (hub_wq)
>>   		return 0;
>>   
>> -- 
>> 2.54.0
>>
>>



* Re: [REGRESSION] Sino Wealth 258a:002a keyboard enters stuck shift state on USB disconnect
From: Orlando Ulises Aguilar Rojas @ 2026-06-05 17:27 UTC (permalink / raw)
  To: Michal Pecio
  Cc: Benjamin Tissoires, linux-input, linux-usb, benjamin.tissoires
In-Reply-To: <20260605110003.435619d2.michal.pecio@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1636 bytes --]

Hi Benjamin and Michal,

Thank you both for your time and insights.

@Benjamin:
I downgraded and tested on kernel v6.18.3 as requested.

As you strongly suspected, the exact same stuck-shift behavior occurs
here when the mouse is disconnected. The 7.0.7 behavior was indeed a
temporary regression that accidentally "fixed" my device's broken
firmware.

I completely agree that a HID-BPF program is the most elegant solution
here rather than adding a generic quirk for a badly coded device. I
will read the documentation you linked and start working on a BPF
program to discard these short reports from the spurious mouse node.
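
The filtering decision itself is small. Below is a rough userspace sketch of it only, not a HID-BPF program: `classify_report()` and `EXPECTED_REPORT_LEN` are hypothetical, and the length of 8 bytes is an assumption for illustration rather than a value read from this device's report descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: placeholder length, not taken from the device's descriptor. */
#define EXPECTED_REPORT_LEN 8

enum verdict { REPORT_KEEP, REPORT_DROP };

/* Drop any report shorter than the expected length; keep the rest. */
static enum verdict classify_report(const unsigned char *data, size_t len)
{
	assert(data != NULL || len == 0);

	return len < EXPECTED_REPORT_LEN ? REPORT_DROP : REPORT_KEEP;
}
```

In a real program the same check would sit inside the device event hook described in the HID-BPF documentation, with the expected length taken from the actual descriptor.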

@Michal:
Your theory about bus perturbation makes a lot of sense. The spurious
mouse interface emitting a corrupted short report due to a hub
topology change aligns perfectly with the fact that the zero-fill path
now processes that garbage.

Regarding your question about unplugging the keyboard:
When I physically disconnect the Machenike keyboard after the bug
triggers, the OS input subsystem remains completely stuck holding the
Shift modifier. I have tested all possible plug/unplug permutations
using a completely different, secondary keyboard, and the result is
exactly the same: it continues to type shifted characters ("!@#"). The
kernel module retains this corrupted state system-wide, and absolutely
no physical replug combination resolves it; a system reboot is
strictly required.

I have captured a packet trace using tshark during the exact
unbind/remove event of the mouse. Since the mailing list drops binary
attachments, I have compressed and uploaded the PCAPNG file

Best regards,
Orlando

[-- Attachment #2: usb_crash_trace.pcapng.gz --]
[-- Type: application/gzip, Size: 195467 bytes --]


* Re: [PATCH] thunderbolt: Assert downstream port reset on shutdown
From: Basavaraj Natikar @ 2026-06-05 16:27 UTC (permalink / raw)
  To: Mika Westerberg, Basavaraj Natikar
  Cc: andreas.noever, westeri, YehezkelShB, linux-usb, Sanath S,
	Mario Limonciello
In-Reply-To: <20260604050326.GH2990@black.igk.intel.com>

Hi,


On 6/4/2026 12:03 AM, Mika Westerberg wrote:
> Hi,
>
> On Wed, Jun 03, 2026 at 11:31:46PM +0530, Basavaraj Natikar wrote:
>> On shutdown the connection manager tears down the switch tree without
> router tree
>
>> signalling connected devices. Thunderbolt 3 devices directly connected
>> to a USB4 host never receive a disconnect indication and during shutdown
>> this can cause polling the dead link for up to 60 seconds. On some
>> platforms this behavior leads to a warm reset instead of a shutdown due
>> to this timeout.
>>
>> Fix this by asserting PORT_CS_19.DPR on each connected downstream port
>> before tearing down the switch tree. This drives SBTX low unconditionally
> router tree
>
>> (USB4 spec section 6.9), causing the device to detect SBRX low and
>> transition to Uninitialized Unplugged state immediately.
>>
>> Co-developed-by: Sanath S <Sanath.S@amd.com>
>> Signed-off-by: Sanath S <Sanath.S@amd.com>
>> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> ---
>>   drivers/thunderbolt/switch.c |  2 +-
>>   drivers/thunderbolt/tb.c     | 11 +++++++++++
>>   drivers/thunderbolt/tb.h     |  1 +
>>   3 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c
>> index c2ad58b19e7b..52812908818b 100644
>> --- a/drivers/thunderbolt/switch.c
>> +++ b/drivers/thunderbolt/switch.c
>> @@ -704,7 +704,7 @@ int tb_port_disable(struct tb_port *port)
>>   	return __tb_port_enable(port, false);
>>   }
>>   
>> -static int tb_port_reset(struct tb_port *port)
>> +int tb_port_reset(struct tb_port *port)
>>   {
>>   	if (tb_switch_is_usb4(port->sw))
>>   		return port->cap_usb4 ? usb4_port_reset(port) : 0;
>> diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
>> index c69c323e6952..ca57b5181422 100644
>> --- a/drivers/thunderbolt/tb.c
>> +++ b/drivers/thunderbolt/tb.c
>> @@ -2935,6 +2935,7 @@ static void tb_stop(struct tb *tb)
>>   	struct tb_cm *tcm = tb_priv(tb);
>>   	struct tb_tunnel *tunnel;
>>   	struct tb_tunnel *n;
>> +	struct tb_port *port;
>>   
>>   	cancel_delayed_work(&tcm->remove_work);
>>   	/* tunnels are only present after everything has been initialized */
>> @@ -2948,6 +2949,16 @@ static void tb_stop(struct tb *tb)
>>   			tb_tunnel_deactivate(tunnel);
>>   		tb_tunnel_put(tunnel);
>>   	}
>> +	/*
>> +	 * Assert DPR to drive SBTX low, signalling disconnect and avoiding
>> +	 * ~60 s of link polling before warm reset on shutdown.
>> +	 */
>> +	tb_switch_for_each_port(tb->root_switch, port) {
>> +		if (!tb_port_is_null(port) || !tb_port_has_remote(port))
>> +			continue;
>> +		if (tb_port_reset(port))
>> +			tb_port_dbg(port, "DPR on shutdown failed, continuing\n");
>> +	}
> But now this tears down the topology also when the driver is unloaded? If
> you want to do that in shutdown there is ->shutdown hook for that.

Asserting DPR on unload is intentional and it is harmless. After a reload
the driver re-probes and the router tree comes back up cleanly.

It is also needed. Once the driver is unloaded it is unbound, so a later
system shutdown has no driver left to assert DPR and the ~60s link polling
hang comes back. Doing it in tb_stop() covers both plain shutdown and the
unload then shutdown case, so I kept it here instead of the ->shutdown hook.
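
The shape of the loop added to tb_stop() is a best-effort pass: skip ports that are not connected downstream ports, reset the rest, and keep going on failure. A minimal userspace model of that shape, assuming hypothetical `struct port`, `port_reset()` and `reset_connected_ports()` in place of the driver's types and helpers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a port; fields mirror the checks in the patch. */
struct port {
	bool is_null_adapter;	/* models tb_port_is_null() */
	bool has_remote;	/* models tb_port_has_remote() */
	bool reset_fails;	/* test knob: force the reset to fail */
	bool was_reset;
};

/* Hypothetical stand-in for tb_port_reset(); returns 0 on success. */
static int port_reset(struct port *p)
{
	if (p->reset_fails)
		return -1;
	p->was_reset = true;
	return 0;
}

/* Reset every eligible port; a failure is logged and does not stop the pass. */
static size_t reset_connected_ports(struct port *ports, size_t n)
{
	size_t done = 0;

	for (size_t i = 0; i < n; i++) {
		struct port *p = &ports[i];

		if (!p->is_null_adapter || !p->has_remote)
			continue;
		if (port_reset(p)) {
			fprintf(stderr, "port %zu: reset failed, continuing\n", i);
			continue;
		}
		done++;
	}

	return done;
}
```

The return value counts successful resets; the caller in this model ignores failures, matching the debug-only message in the patch.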

I will change "switch tree" to "router tree".

Thanks,
--
Basavaraj

>
>>   	tb_switch_remove(tb->root_switch);
>>   	tcm->hotplug_active = false; /* signal tb_handle_hotplug to quit */
>>   }
>> diff --git a/drivers/thunderbolt/tb.h b/drivers/thunderbolt/tb.h
>> index 217c3114bec8..875eb538eacf 100644
>> --- a/drivers/thunderbolt/tb.h
>> +++ b/drivers/thunderbolt/tb.h
>> @@ -1102,6 +1102,7 @@ int tb_port_clear_counter(struct tb_port *port, int counter);
>>   int tb_port_unlock(struct tb_port *port);
>>   int tb_port_enable(struct tb_port *port);
>>   int tb_port_disable(struct tb_port *port);
>> +int tb_port_reset(struct tb_port *port);
>>   int tb_port_alloc_in_hopid(struct tb_port *port, int hopid, int max_hopid);
>>   void tb_port_release_in_hopid(struct tb_port *port, int hopid);
>>   int tb_port_alloc_out_hopid(struct tb_port *port, int hopid, int max_hopid);
>> -- 
>> 2.34.1


