* [PATCH] HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove()
@ 2026-05-15 15:36 Manish Khadka
2026-05-15 16:18 ` sashiko-bot
0 siblings, 1 reply; 4+ messages in thread
From: Manish Khadka @ 2026-05-15 15:36 UTC (permalink / raw)
To: linux-input
Cc: Derek J . Clark, Mark Pearson, Jiri Kosina, Benjamin Tissoires,
linux-kernel
hid_go_cfg_probe() initialises drvdata.go_cfg_setup and schedules it
to run 2 ms later:
INIT_DELAYED_WORK(&drvdata.go_cfg_setup, &cfg_setup);
schedule_delayed_work(&drvdata.go_cfg_setup, msecs_to_jiffies(2));
cfg_setup() dereferences drvdata.hdev to issue MCU command requests.
hid_go_cfg_remove() tears down sysfs and stops the HID device, ending
with hid_set_drvdata(hdev, NULL), but never drains the delayed work.
If the device is unbound within the 2 ms scheduling delay (a probe
failure rolling back via remove, or a fast rmmod after probe), the
work fires after hid_set_drvdata(NULL) has cleared the back pointer,
leaving cfg_setup() with a NULL or stale drvdata.hdev.
Mirror the sibling driver hid-lenovo-go-s.c, whose hid_gos_cfg_remove()
already calls cancel_delayed_work_sync() on its analogous work, and
drain go_cfg_setup at the top of hid_go_cfg_remove(). The cancel
must come before guard(mutex)(&drvdata.cfg_mutex) because cfg_setup()
acquires that mutex; reversing the order would deadlock.
Fixes: d69ccfcbc955 ("HID: hid-lenovo-go: Add Lenovo Legion Go Series HID Driver")
Cc: stable@vger.kernel.org
Signed-off-by: Manish Khadka <maskmemanish@gmail.com>
---
drivers/hid/hid-lenovo-go.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/hid/hid-lenovo-go.c b/drivers/hid/hid-lenovo-go.c
index d4d26c783356..ef69869f0a00 100644
--- a/drivers/hid/hid-lenovo-go.c
+++ b/drivers/hid/hid-lenovo-go.c
@@ -2408,6 +2408,13 @@ static int hid_go_cfg_probe(struct hid_device *hdev,
static void hid_go_cfg_remove(struct hid_device *hdev)
{
+ /*
+ * cfg_setup is scheduled from hid_go_cfg_probe() with a 2 ms delay;
+ * drain it here before tearing down so the workqueue cannot run
+ * after hid_set_drvdata(NULL) and dereference a stale drvdata.hdev.
+ */
+ cancel_delayed_work_sync(&drvdata.go_cfg_setup);
+
guard(mutex)(&drvdata.cfg_mutex);
sysfs_remove_groups(&hdev->dev.kobj, top_level_attr_groups);
hid_hw_close(hdev);
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove() 2026-05-15 15:36 [PATCH] HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove() Manish Khadka @ 2026-05-15 16:18 ` sashiko-bot 2026-05-15 17:45 ` [PATCH v2] " Manish Khadka 0 siblings, 1 reply; 4+ messages in thread From: sashiko-bot @ 2026-05-15 16:18 UTC (permalink / raw) To: Manish Khadka; +Cc: linux-input, dmitry.torokhov Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider: - [High] AB-BA deadlock between device removal and sysfs operations. - [High] Global state corruption due to static per-device structs. - [High] Use-After-Free window due to mixing devm registration with explicit teardown. - [High] Resource leaks on probe error paths. - [Medium] Unconditional success return in `mcu_property_out()` masks timeout errors. - [Medium] Probe arbitrarily fails due to incorrect `schedule_delayed_work()` boolean logic. - [Low] Commit message mechanism explanation is factually incorrect. -- commit 8001481b81b1932a98238ba0e2de3079e8807d91 Author: Manish Khadka <maskmemanish@gmail.com> HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove() This commit drains the delayed go_cfg_setup workqueue at the start of hid_go_cfg_remove() to prevent it from dereferencing a stale device pointer after teardown. It places the cancellation before the mutex acquisition to avoid deadlocks. Cc: stable@vger.kernel.org > If the device is unbound within the 2 ms scheduling delay (a probe > failure rolling back via remove, or a fast rmmod after probe), the > work fires after hid_set_drvdata(NULL) has cleared the back pointer, > leaving cfg_setup() with a NULL or stale drvdata.hdev. This isn't a bug, but the explanation of the mechanism seems slightly inaccurate. Does hid_set_drvdata(hdev, NULL) only clear hdev->dev.driver_data rather than the global drvdata.hdev variable? > diff --git a/drivers/hid/hid-lenovo-go.c b/drivers/hid/hid-lenovo-go.c > index d4d26c7833563..ef69869f0a009 100644 > --- a/drivers/hid/hid-lenovo-go.c > +++ b/drivers/hid/hid-lenovo-go.c > @@ -2408,6 +2408,13 @@ static int hid_go_cfg_probe(struct hid_device *hdev, > > static void hid_go_cfg_remove(struct hid_device *hdev) > { > + /* > + * cfg_setup is scheduled from hid_go_cfg_probe() with a 2 ms delay; > + * drain it here before tearing down so the workqueue cannot run > + * after hid_set_drvdata(NULL) and dereference a stale drvdata.hdev. > + */ > + cancel_delayed_work_sync(&drvdata.go_cfg_setup); The driver uses static global structures for device state (drvdata) and LED attributes. If multiple Legion Go controllers are connected, or a device is rapidly re-bound, does re-initializing an active mutex or delayed workqueue, and re-registering an already-linked LED struct corrupt internal kernel linked lists and lock wait queues? > + > guard(mutex)(&drvdata.cfg_mutex); > sysfs_remove_groups(&hdev->dev.kobj, top_level_attr_groups); In hid_go_cfg_remove(), guard(mutex)(&drvdata.cfg_mutex) acquires the mutex before calling sysfs_remove_groups(). sysfs_remove_groups() drains kernfs active references, blocking until any currently executing sysfs operations finish. If the driver's sysfs callbacks acquire drvdata.cfg_mutex via mcu_property_out(), can this cause an AB-BA deadlock if a sysfs operation runs concurrently with device removal? > hid_hw_close(hdev); > hid_hw_stop(hdev); The RGB LED class device and RGB sysfs attributes appear to be registered using devm_* APIs, which clean up resources after the driver's .remove() callback finishes. If hid_go_cfg_remove() explicitly stops and closes the hardware here, could userspace accessing the LED brightness or RGB sysfs files in the window between hardware teardown and devres cleanup cause a use-after-free when the sysfs callbacks call hid_hw_output_report()? > hid_set_drvdata(hdev, NULL); > } Looking at hid_go_cfg_probe(), if a step after sysfs_create_groups() fails, the function returns an error without calling sysfs_remove_groups(). Does this leave dangling sysfs files that could be accessed later? Also, if hid_go_cfg_probe() returns an error, does hid_go_probe() leak the hardware open references by failing to call hid_hw_stop() and hid_hw_close()? In hid_go_cfg_probe(), the driver checks ret = schedule_delayed_work(...) and fails the probe if !ret. Since schedule_delayed_work() returns false (0) when the work is already pending, and the driver uses a single global workqueue structure, will a second device fail to probe if it connects before the first device's work fires? In mcu_property_out(), wait_for_completion_interruptible_timeout() is used to wait for hardware responses. If a timeout occurs, ret is set to -EBUSY. Because the function ends with an unconditional return 0, does this drop the error code and mask communication failures? -- Sashiko AI review · https://sashiko.dev/#/patchset/20260515153607.76175-1-maskmemanish@gmail.com?part=1 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2] HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove() 2026-05-15 16:18 ` sashiko-bot @ 2026-05-15 17:45 ` Manish Khadka 2026-05-15 18:23 ` sashiko-bot 0 siblings, 1 reply; 4+ messages in thread From: Manish Khadka @ 2026-05-15 17:45 UTC (permalink / raw) To: linux-input Cc: Derek J . Clark, Mark Pearson, Jiri Kosina, Benjamin Tissoires, linux-kernel hid_go_cfg_probe() initialises drvdata.go_cfg_setup and schedules it to run 2 ms later: INIT_DELAYED_WORK(&drvdata.go_cfg_setup, &cfg_setup); schedule_delayed_work(&drvdata.go_cfg_setup, msecs_to_jiffies(2)); cfg_setup() dereferences drvdata.hdev to issue MCU command requests. hid_go_cfg_remove() tears down sysfs and stops the HID device, but never drains the delayed work. If the device is unbound within the 2 ms scheduling delay (a probe failure rolling back via remove, or a fast rmmod after probe), the work fires after hid_destroy_device() has dropped its reference and released the underlying hdev struct, leaving cfg_setup() with a stale drvdata.hdev pointer. Mirror the sibling driver hid-lenovo-go-s.c, whose hid_gos_cfg_remove() already calls cancel_delayed_work_sync() on its analogous work, and drain go_cfg_setup at the top of hid_go_cfg_remove(). The cancel must come before guard(mutex)(&drvdata.cfg_mutex) because cfg_setup() acquires that mutex; reversing the order would deadlock. Fixes: d69ccfcbc955 ("HID: hid-lenovo-go: Add Lenovo Legion Go Series HID Driver") Cc: stable@vger.kernel.org Signed-off-by: Manish Khadka <maskmemanish@gmail.com> --- v1 -> v2: - Address Sashiko AI review feedback: * [Low] Correct the inaccurate description of how drvdata.hdev becomes stale. hid_set_drvdata(hdev, NULL) only clears the per-hdev driver_data, not the global drvdata.hdev; the actual stale-pointer mechanism is hid_destroy_device()'s put_device() releasing the underlying hdev struct. Commit message and the inline comment in hid_go_cfg_remove() are corrected accordingly. - The other six review points (ABBA deadlock with sysfs_remove_groups, global static drvdata corruption on multi-device/rebind, devm vs. explicit teardown race, probe error-path leaks, mcu_property_out unconditional return 0, schedule_delayed_work boolean misuse) are all valid pre-existing bugs in the driver but predate and are independent of this patch. Each deserves its own fix; will be addressed in separate patches. Thanks to Sashiko AI review for all seven points. drivers/hid/hid-lenovo-go.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/hid/hid-lenovo-go.c b/drivers/hid/hid-lenovo-go.c index d4d26c783356..b9d8cde53136 100644 --- a/drivers/hid/hid-lenovo-go.c +++ b/drivers/hid/hid-lenovo-go.c @@ -2408,6 +2408,15 @@ static int hid_go_cfg_probe(struct hid_device *hdev, static void hid_go_cfg_remove(struct hid_device *hdev) { + /* + * cfg_setup is scheduled from hid_go_cfg_probe() with a 2 ms delay + * and dereferences drvdata.hdev. Drain it here before tearing + * down so the workqueue cannot run after hid_destroy_device()'s + * put_device() has released the underlying hdev and dereference + * a stale drvdata.hdev pointer. + */ + cancel_delayed_work_sync(&drvdata.go_cfg_setup); + guard(mutex)(&drvdata.cfg_mutex); sysfs_remove_groups(&hdev->dev.kobj, top_level_attr_groups); hid_hw_close(hdev); -- 2.43.0 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v2] HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove() 2026-05-15 17:45 ` [PATCH v2] " Manish Khadka @ 2026-05-15 18:23 ` sashiko-bot 0 siblings, 0 replies; 4+ messages in thread From: sashiko-bot @ 2026-05-15 18:23 UTC (permalink / raw) To: Manish Khadka; +Cc: dmitry.torokhov, linux-input Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider: - [Critical] Hardware state corruption and crashes due to use of a static global struct for device data. - [High] The patch provides an incomplete fix for the use-after-free and introduces a resource leak because it assumes the driver core calls remove() on probe failure. - [High] ABBA Deadlock in `hid_go_cfg_remove()` between `cfg_mutex` and sysfs kernfs removal. - [High] 'Scheduling while atomic' panic in the LED subsystem callback. - [High] Hardware event synchronization race condition permanently corrupts the command state machine. - [High] Resource leak on unbind if the device is operating as a generic HID device. - [High] Undefined behavior in `mcu_property_out()` due to passing a NULL pointer to `memcpy()`. -- commit 304d13b89ea5c6814d905d7149a191f0e8fad93d Author: Manish Khadka <maskmemanish@gmail.com> HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove() This commit cancels the go_cfg_setup delayed work in hid_go_cfg_remove() to prevent a use-after-free if the device is unbound before the work executes. > If the device is unbound within the > 2 ms scheduling delay (a probe failure rolling back via remove, or a > fast rmmod after probe), the work fires after hid_destroy_device() Is the statement about a probe failure rolling back via remove accurate? In the Linux driver model, the core never calls remove() if probe() fails. If hid_go_cfg_probe() fails after scheduling the delayed work, the work remains pending and will dereference the freed hdev when it fires, causing the exact use-after-free crash this patch attempts to fix. Additionally, if hid_go_cfg_probe() fails, does hid_go_probe() return the error code without calling hid_hw_close() and hid_hw_stop(), permanently leaking the hardware session? Should explicit cleanup be added to the probe error paths? > diff --git a/drivers/hid/hid-lenovo-go.c b/drivers/hid/hid-lenovo-go.c > index d4d26c7833563..b9d8cde531365 100644 > --- a/drivers/hid/hid-lenovo-go.c > +++ b/drivers/hid/hid-lenovo-go.c > @@ -2408,6 +2408,15 @@ static int hid_go_cfg_probe(struct hid_device *hdev, > > static void hid_go_cfg_remove(struct hid_device *hdev) > { > + /* > + * cfg_setup is scheduled from hid_go_cfg_probe() with a 2 ms delay > + * and dereferences drvdata.hdev. Drain it here before tearing > + * down so the workqueue cannot run after hid_destroy_device()'s > + * put_device() has released the underlying hdev and dereference > + * a stale drvdata.hdev pointer. > + */ > + cancel_delayed_work_sync(&drvdata.go_cfg_setup); This isn't a bug introduced by this patch, but does the driver use a global static variable (static struct hid_go_cfg drvdata) to store device state? A second device probe would unconditionally overwrite drvdata.hdev and re-initialize both drvdata.cfg_mutex and drvdata.go_cfg_setup. Re-initializing an active mutex destroys its wait lists, and re-initializing an active delayed work overwrites its linked list pointers, leading to immediate memory corruption and kernel panics. Should the device state be dynamically allocated per-instance instead? > + > guard(mutex)(&drvdata.cfg_mutex); > sysfs_remove_groups(&hdev->dev.kobj, top_level_attr_groups); This is a pre-existing issue, but does this sequence cause an ABBA deadlock? sysfs_remove_groups() drains all active kernfs references, blocking until any currently executing sysfs operations complete. However, the driver's sysfs callbacks internally call mcu_property_out(), which acquires drvdata.cfg_mutex. If a sysfs read or write races with driver removal, the sysfs thread holds the active kernfs reference and waits for cfg_mutex, while the removal thread holds cfg_mutex and waits for the kernfs reference. Could this result in an unrecoverable deadlock? [ ... ] This isn't a bug introduced by this patch, but does the driver assign hid_go_brightness_set to the led_cdev.brightness_set callback? The LED subsystem requires this callback to be non-blocking because it can be invoked from atomic contexts. However, hid_go_brightness_set calls mcu_property_out(), which allocates memory with GFP_KERNEL, locks a mutex, and blocks via wait_for_completion_interruptible_timeout(). Will this violate the LED API contract and cause scheduling while atomic panics? Should the callback be assigned to .brightness_set_blocking instead? This is another pre-existing issue, but does mcu_property_out() call reinit_completion() after wait_for_completion_interruptible_timeout() returns? If a command times out and the hardware response arrives shortly afterward, the interrupt handler signals the completion. The next call to mcu_property_out() will send a new command but fail to re-initialize the completion first. Its wait_for_completion will immediately return by consuming the stale signal, without waiting for the new response. Does this permanently desynchronize all subsequent hardware communications? Should reinit_completion() be called prior to sending the output report? This is also a pre-existing issue, but does hid_go_remove() bypass hardware cleanup if the device is operating as a generic HID device? In hid_go_probe(), if get_endpoint_address() fails (e.g., returns -ENODEV), the driver returns 0 to allow generic HID operation, leaving the hardware started and open. During unbind, hid_go_remove() calls get_endpoint_address() again. If it returns <= 0, the function returns immediately instead of falling through to the default case. Does this mean hid_hw_close() and hid_hw_stop() are bypassed entirely, permanently leaking the HID hardware session and its resources? Finally, this is a pre-existing issue, but does mcu_property_out() unconditionally execute memcpy(dmabuf + header_size, data, len)? Several hardware commands, such as GET_VERSION_DATA, invoke this function with data == NULL and len == 0. Under the C standard, passing a NULL pointer to memcpy() is undefined behavior regardless of the length. This triggers UBSAN kernel panics if enabled and allows aggressive compiler optimizations to discard the operation. Should the copy be conditionally guarded by if (len > 0) or if (data)? -- Sashiko AI review · https://sashiko.dev/#/patchset/20260515174511.78486-1-maskmemanish@gmail.com?part=1 ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-15 18:23 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-15 15:36 [PATCH] HID: hid-lenovo-go: cancel cfg_setup work in hid_go_cfg_remove() Manish Khadka 2026-05-15 16:18 ` sashiko-bot 2026-05-15 17:45 ` [PATCH v2] " Manish Khadka 2026-05-15 18:23 ` sashiko-bot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox