* [PATCH v3 0/2] Fix two kernel warnings in glink driver
@ 2021-11-02 23:51 Sujit Kautkar
2021-11-02 23:51 ` [PATCH v3 1/2] rpmsg: glink: Fix use-after-free in qcom_glink_rpdev_release() Sujit Kautkar
2021-11-02 23:51 ` [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device() Sujit Kautkar
0 siblings, 2 replies; 8+ messages in thread
From: Sujit Kautkar @ 2021-11-02 23:51 UTC (permalink / raw)
To: Andy Gross, Ohad Ben-Cohen
Cc: Bjorn Andersson, Sibi Sankar, Matthias Kaehlcke, Stephen Boyd,
Sujit Kautkar, linux-arm-msm, linux-kernel, linux-remoteproc
These changes address kernel warnings that show up after enabling a
debug kernel. The first fixes a use-after-free warning and the second
fixes a warning by updating the cdev APIs.
Changes in v3:
- Clear ept pointers in patch 1
- Remove error check in patch 2
Changes in v2:
- Fix typo in commit message
Sujit Kautkar (2):
rpmsg: glink: Fix use-after-free in qcom_glink_rpdev_release()
rpmsg: glink: Update cdev add/del API in
rpmsg_ctrldev_release_device()
drivers/rpmsg/qcom_glink_native.c | 12 ++++++++++--
drivers/rpmsg/rpmsg_char.c | 10 ++--------
2 files changed, 12 insertions(+), 10 deletions(-)
--
2.31.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v3 1/2] rpmsg: glink: Fix use-after-free in qcom_glink_rpdev_release()
2021-11-02 23:51 [PATCH v3 0/2] Fix two kernel warnings in glink driver Sujit Kautkar
@ 2021-11-02 23:51 ` Sujit Kautkar
2021-11-03 16:34 ` Matthias Kaehlcke
2021-11-02 23:51 ` [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device() Sujit Kautkar
1 sibling, 1 reply; 8+ messages in thread
From: Sujit Kautkar @ 2021-11-02 23:51 UTC (permalink / raw)
To: Andy Gross, Ohad Ben-Cohen
Cc: Bjorn Andersson, Sibi Sankar, Matthias Kaehlcke, Stephen Boyd,
Sujit Kautkar, linux-arm-msm, linux-kernel, linux-remoteproc

qcom_glink_rpdev_release() sets channel->rpdev to NULL. However, with
debug enabled kernel, qcom_glink_rpdev_release() gets delayed due to
delayed kobject release and channel gets released by that time and
triggers below kernel warning. To avoid this use-after-free, clear ept
pointers during ept destroy and channel release and add a new condition
in qcom_glink_rpdev_release() to access channel

| BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
| Write of size 8 at addr ffffffaba438e8d0 by task kworker/6:1/54
|
| CPU: 6 PID: 54 Comm: kworker/6:1 Not tainted 5.4.109-lockdep #16
| Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
| Workqueue: events kobject_delayed_cleanup
| Call trace:
| dump_backtrace+0x0/0x284
| show_stack+0x20/0x2c
| dump_stack+0xd4/0x170
| print_address_description+0x3c/0x4a8
| __kasan_report+0x144/0x168
| kasan_report+0x10/0x18
| __asan_report_store8_noabort+0x1c/0x24
| qcom_glink_rpdev_release+0x54/0x70
| device_release+0x68/0x14c
| kobject_delayed_cleanup+0x158/0x2cc
| process_one_work+0x7cc/0x10a4
| worker_thread+0x80c/0xcec
| kthread+0x2a8/0x314
| ret_from_fork+0x10/0x18

Signed-off-by: Sujit Kautkar <sujitka@chromium.org>
---
Changes in v3:
- Clear ept pointers and add extra conditions

Changes in v2:
- Fix typo in commit message

drivers/rpmsg/qcom_glink_native.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/rpmsg/qcom_glink_native.c b/drivers/rpmsg/qcom_glink_native.c
index e1444fefdd1c0..0c64a6f7a4f09 100644
--- a/drivers/rpmsg/qcom_glink_native.c
+++ b/drivers/rpmsg/qcom_glink_native.c
@@ -269,6 +269,9 @@ static void qcom_glink_channel_release(struct kref *ref)
 	idr_destroy(&channel->riids);
 	spin_unlock_irqrestore(&channel->intent_lock, flags);
 
+	if (channel->rpdev)
+		channel->rpdev->ept = NULL;
+
 	kfree(channel->name);
 	kfree(channel);
 }
@@ -1214,6 +1217,8 @@ static void qcom_glink_destroy_ept(struct rpmsg_endpoint *ept)
 	channel->ept.cb = NULL;
 	spin_unlock_irqrestore(&channel->recv_lock, flags);
 
+	channel->rpdev->ept = NULL;
+
 	/* Decouple the potential rpdev from the channel */
 	channel->rpdev = NULL;
 
@@ -1371,9 +1376,12 @@ static const struct rpmsg_endpoint_ops glink_endpoint_ops = {
 static void qcom_glink_rpdev_release(struct device *dev)
 {
 	struct rpmsg_device *rpdev = to_rpmsg_device(dev);
-	struct glink_channel *channel = to_glink_channel(rpdev->ept);
+	struct glink_channel *channel = NULL;
 
-	channel->rpdev = NULL;
+	if (rpdev->ept) {
+		channel = to_glink_channel(rpdev->ept);
+		channel->rpdev = NULL;
+	}
 	kfree(rpdev);
 }
--
2.31.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v3 1/2] rpmsg: glink: Fix use-after-free in qcom_glink_rpdev_release()
2021-11-02 23:51 ` [PATCH v3 1/2] rpmsg: glink: Fix use-after-free in qcom_glink_rpdev_release() Sujit Kautkar
@ 2021-11-03 16:34 ` Matthias Kaehlcke
0 siblings, 0 replies; 8+ messages in thread
From: Matthias Kaehlcke @ 2021-11-03 16:34 UTC (permalink / raw)
To: Sujit Kautkar
Cc: Andy Gross, Ohad Ben-Cohen, Bjorn Andersson, Sibi Sankar,
Stephen Boyd, linux-arm-msm, linux-kernel, linux-remoteproc

Hi Sujit,

On Tue, Nov 02, 2021 at 04:51:49PM -0700, Sujit Kautkar wrote:
> qcom_glink_rpdev_release() sets channel->rpdev to NULL. However, with
> debug enabled kernel, qcom_glink_rpdev_release() gets delayed due to
> delayed kobject release and channel gets released by that time and
> triggers below kernel warning. To avoid this use-after-free, clear ept
> pointers during ept destroy and channel release and add a new condition
> in qcom_glink_rpdev_release() to access channel
>
> | BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
> | Write of size 8 at addr ffffffaba438e8d0 by task kworker/6:1/54
> |
> | CPU: 6 PID: 54 Comm: kworker/6:1 Not tainted 5.4.109-lockdep #16
> | Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
> | Workqueue: events kobject_delayed_cleanup
> | Call trace:
> | dump_backtrace+0x0/0x284
> | show_stack+0x20/0x2c
> | dump_stack+0xd4/0x170
> | print_address_description+0x3c/0x4a8
> | __kasan_report+0x144/0x168
> | kasan_report+0x10/0x18
> | __asan_report_store8_noabort+0x1c/0x24
> | qcom_glink_rpdev_release+0x54/0x70
> | device_release+0x68/0x14c
> | kobject_delayed_cleanup+0x158/0x2cc
> | process_one_work+0x7cc/0x10a4
> | worker_thread+0x80c/0xcec
> | kthread+0x2a8/0x314
> | ret_from_fork+0x10/0x18
>
> Signed-off-by: Sujit Kautkar <sujitka@chromium.org>
> ---
> Changes in v3:
> - Clear ept pointers and add extra conditions
>
> Changes in v2:
> - Fix typo in commit message
>
> drivers/rpmsg/qcom_glink_native.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/rpmsg/qcom_glink_native.c b/drivers/rpmsg/qcom_glink_native.c
> index e1444fefdd1c0..0c64a6f7a4f09 100644
> --- a/drivers/rpmsg/qcom_glink_native.c
> +++ b/drivers/rpmsg/qcom_glink_native.c
> @@ -269,6 +269,9 @@ static void qcom_glink_channel_release(struct kref *ref)
>  	idr_destroy(&channel->riids);
>  	spin_unlock_irqrestore(&channel->intent_lock, flags);
>
> +	if (channel->rpdev)
> +		channel->rpdev->ept = NULL;
> +
>  	kfree(channel->name);
>  	kfree(channel);
>  }
> @@ -1214,6 +1217,8 @@ static void qcom_glink_destroy_ept(struct rpmsg_endpoint *ept)
>  	channel->ept.cb = NULL;
>  	spin_unlock_irqrestore(&channel->recv_lock, flags);
>
> +	channel->rpdev->ept = NULL;
> +
>  	/* Decouple the potential rpdev from the channel */
>  	channel->rpdev = NULL;
>
> @@ -1371,9 +1376,12 @@ static const struct rpmsg_endpoint_ops glink_endpoint_ops = {
>  static void qcom_glink_rpdev_release(struct device *dev)
>  {
>  	struct rpmsg_device *rpdev = to_rpmsg_device(dev);
> -	struct glink_channel *channel = to_glink_channel(rpdev->ept);
> +	struct glink_channel *channel = NULL;

no need to initialize the pointer, it is assigned in the path that uses it.

>
> -	channel->rpdev = NULL;
> +	if (rpdev->ept) {
> +		channel = to_glink_channel(rpdev->ept);
> +		channel->rpdev = NULL;
> +	}
>  	kfree(rpdev);
>  }

Looks like this is already fixed in -next by:

commit 343ba27b6f9d473ec3e602cc648300eb03a7fa05
Author: Chris Lew <clew@codeaurora.org>
Date: Thu Jul 30 10:48:15 2020 +0530

    rpmsg: glink: Remove channel decouple from rpdev release

    If a channel is being rapidly restarting and the kobj release worker is
    busy, there is a chance the rpdev_release function will run after the
    channel struct itself has been released.

    There should not be a need to decouple the channel from rpdev in the
    rpdev release since that should only happen from the close commands.

    Signed-off-by: Chris Lew <clew@codeaurora.org>
    Signed-off-by: Deepak Kumar Singh <deesin@codeaurora.org>
    Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
    Link: https://lore.kernel.org/r/1596086296-28529-6-git-send-email-deesin@codeaurora.org

^ permalink raw reply [flat|nested] 8+ messages in thread
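For readers following the patch 1 discussion, the ordering problem can be modelled in plain userspace C. This is only a sketch under stated assumptions: the `model_rpdev` and `model_channel` types and the `model_*` functions are hypothetical stand-ins, not the glink structures, and the "delayed release" is simulated simply by calling the release functions in the problematic order. It mirrors the v3 approach of clearing the back-pointer when the channel goes away and guarding the access in the release path; note that the upstream commit quoted above takes a different route and drops the decouple from the release function altogether.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_channel;

/* Hypothetical stand-in for the rpmsg device; ept models rpdev->ept. */
struct model_rpdev {
	struct model_channel *ept;
};

/* Hypothetical stand-in for the glink channel with its back-pointer. */
struct model_channel {
	struct model_rpdev *rpdev;
};

/*
 * Models the channel release in the v3 patch: sever the device's pointer
 * to the channel before the channel memory is freed.
 */
static void model_channel_release(struct model_channel *channel)
{
	assert(channel != NULL);
	if (channel->rpdev)
		channel->rpdev->ept = NULL;
	free(channel);
}

/*
 * Models the guarded device release in the v3 patch. Returns true when the
 * channel was still reachable and got decoupled, false when it was skipped.
 */
static bool model_rpdev_release(struct model_rpdev *rpdev)
{
	bool decoupled = false;

	assert(rpdev != NULL);
	if (rpdev->ept) {
		rpdev->ept->rpdev = NULL;
		decoupled = true;
	}
	free(rpdev);
	return decoupled;
}
```

If the channel is released first, as happens when the kobject release is delayed, the later device release finds `ept` cleared and never dereferences freed memory. Without the guard, that second call would write into the freed channel, which is the write KASAN reports in the trace.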
* [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
2021-11-02 23:51 [PATCH v3 0/2] Fix two kernel warnings in glink driver Sujit Kautkar
2021-11-02 23:51 ` [PATCH v3 1/2] rpmsg: glink: Fix use-after-free in qcom_glink_rpdev_release() Sujit Kautkar
@ 2021-11-02 23:51 ` Sujit Kautkar
2021-11-03 17:16 ` Matthias Kaehlcke
` (2 more replies)
1 sibling, 3 replies; 8+ messages in thread
From: Sujit Kautkar @ 2021-11-02 23:51 UTC (permalink / raw)
To: Andy Gross, Ohad Ben-Cohen
Cc: Bjorn Andersson, Sibi Sankar, Matthias Kaehlcke, Stephen Boyd,
Sujit Kautkar, linux-kernel, linux-remoteproc

Replace cdev add/del APIs with cdev_device_add/cdev_device_del to avoid
below kernel warning. This correctly takes a reference to the parent
device so the parent will not get released until all references to the
cdev are released.

| ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
| WARNING: CPU: 7 PID: 19892 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
| CPU: 7 PID: 19892 Comm: kworker/7:4 Tainted: G W 5.4.147-lockdep #1
| ==================================================================
| Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
| Workqueue: events kobject_delayed_cleanup
| pstate: 60c00009 (nZCv daif +PAN +UAO)
| pc : debug_print_object+0x13c/0x1b0
| lr : debug_print_object+0x13c/0x1b0
| sp : ffffff83b2ec7970
| x29: ffffff83b2ec7970 x28: dfffffd000000000
| x27: ffffff83d674f000 x26: dfffffd000000000
| x25: ffffffd06b8fa660 x24: dfffffd000000000
| x23: 0000000000000000 x22: ffffffd06b7c5108
| x21: ffffffd06d597860 x20: ffffffd06e2c21c0
| x19: ffffffd06d5974c0 x18: 000000000001dad8
| x17: 0000000000000000 x16: dfffffd000000000
| BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
| x15: ffffffffffffffff x14: 79616c6564203a74
| x13: 0000000000000000 x12: 0000000000000080
| Write of size 8 at addr ffffff83d95768d0 by task kworker/3:1/150
| x11: 0000000000000001 x10: 0000000000000000
| x9 : fc9e8edec0ad0300 x8 : fc9e8edec0ad0300
|
| x7 : 0000000000000000 x6 : 0000000000000000
| x5 : 0000000000000080 x4 : 0000000000000000
| CPU: 3 PID: 150 Comm: kworker/3:1 Tainted: G W 5.4.147-lockdep #1
| x3 : ffffffd06c149574 x2 : ffffff83f77f7498
| x1 : ffffffd06d596f60 x0 : 0000000000000061
| Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
| Call trace:
| debug_print_object+0x13c/0x1b0
| Workqueue: events kobject_delayed_cleanup
| __debug_check_no_obj_freed+0x25c/0x3c0
| debug_check_no_obj_freed+0x18/0x20
| Call trace:
| slab_free_freelist_hook+0xb4/0x1bc
| kfree+0xe8/0x2d8
| dump_backtrace+0x0/0x27c
| rpmsg_ctrldev_release_device+0x78/0xb8
| device_release+0x68/0x14c
| show_stack+0x20/0x2c
| kobject_cleanup+0x12c/0x298
| kobject_delayed_cleanup+0x10/0x18
| dump_stack+0xe0/0x19c
| process_one_work+0x578/0x92c
| worker_thread+0x804/0xcf8
| print_address_description+0x3c/0x4a8
| kthread+0x2a8/0x314
| ret_from_fork+0x10/0x18
| __kasan_report+0x100/0x124

Signed-off-by: Sujit Kautkar <sujitka@chromium.org>
---
Changes in v3:
- Remove unnecessary error check as per Matthias's comment

Changes in v2:
- Fix typo in commit message

drivers/rpmsg/rpmsg_char.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
index 876ce43df732b..a6a33155ca859 100644
--- a/drivers/rpmsg/rpmsg_char.c
+++ b/drivers/rpmsg/rpmsg_char.c
@@ -458,7 +458,7 @@ static void rpmsg_ctrldev_release_device(struct device *dev)
 
 	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
 	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
-	cdev_del(&ctrldev->cdev);
+	cdev_device_del(&ctrldev->cdev, &ctrldev->dev);
 	kfree(ctrldev);
 }
 
@@ -493,19 +493,13 @@ static int rpmsg_chrdev_probe(struct rpmsg_device *rpdev)
 	dev->id = ret;
 	dev_set_name(&ctrldev->dev, "rpmsg_ctrl%d", ret);
 
-	ret = cdev_add(&ctrldev->cdev, dev->devt, 1);
+	ret = cdev_device_add(&ctrldev->cdev, &ctrldev->dev);
 	if (ret)
 		goto free_ctrl_ida;
 
 	/* We can now rely on the release function for cleanup */
 	dev->release = rpmsg_ctrldev_release_device;
 
-	ret = device_add(dev);
-	if (ret) {
-		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
-		put_device(dev);
-	}
-
 	dev_set_drvdata(&rpdev->dev, ctrldev);
 
 	return ret;
--
2.31.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
2021-11-02 23:51 ` [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device() Sujit Kautkar
@ 2021-11-03 17:16 ` Matthias Kaehlcke
2021-11-17 18:59 ` Stephen Boyd
2021-11-17 23:29 ` Bjorn Andersson
2 siblings, 0 replies; 8+ messages in thread
From: Matthias Kaehlcke @ 2021-11-03 17:16 UTC (permalink / raw)
To: Sujit Kautkar
Cc: Andy Gross, Ohad Ben-Cohen, Bjorn Andersson, Sibi Sankar,
Stephen Boyd, linux-kernel, linux-remoteproc

On Tue, Nov 02, 2021 at 04:51:51PM -0700, Sujit Kautkar wrote:
> Replace cdev add/del APIs with cdev_device_add/cdev_device_del to avoid
> below kernel warning. This correctly takes a reference to the parent
> device so the parent will not get released until all references to the
> cdev are released.
>
> | ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
> | WARNING: CPU: 7 PID: 19892 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
> | CPU: 7 PID: 19892 Comm: kworker/7:4 Tainted: G W 5.4.147-lockdep #1
> | ==================================================================
> | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> | Workqueue: events kobject_delayed_cleanup
> | pstate: 60c00009 (nZCv daif +PAN +UAO)
> | pc : debug_print_object+0x13c/0x1b0
> | lr : debug_print_object+0x13c/0x1b0
> | sp : ffffff83b2ec7970
> | x29: ffffff83b2ec7970 x28: dfffffd000000000
> | x27: ffffff83d674f000 x26: dfffffd000000000
> | x25: ffffffd06b8fa660 x24: dfffffd000000000
> | x23: 0000000000000000 x22: ffffffd06b7c5108
> | x21: ffffffd06d597860 x20: ffffffd06e2c21c0
> | x19: ffffffd06d5974c0 x18: 000000000001dad8
> | x17: 0000000000000000 x16: dfffffd000000000
> | BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
> | x15: ffffffffffffffff x14: 79616c6564203a74
> | x13: 0000000000000000 x12: 0000000000000080
> | Write of size 8 at addr ffffff83d95768d0 by task kworker/3:1/150
> | x11: 0000000000000001 x10: 0000000000000000
> | x9 : fc9e8edec0ad0300 x8 : fc9e8edec0ad0300
> |
> | x7 : 0000000000000000 x6 : 0000000000000000
> | x5 : 0000000000000080 x4 : 0000000000000000
> | CPU: 3 PID: 150 Comm: kworker/3:1 Tainted: G W 5.4.147-lockdep #1
> | x3 : ffffffd06c149574 x2 : ffffff83f77f7498
> | x1 : ffffffd06d596f60 x0 : 0000000000000061
> | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> | Call trace:
> | debug_print_object+0x13c/0x1b0
> | Workqueue: events kobject_delayed_cleanup
> | __debug_check_no_obj_freed+0x25c/0x3c0
> | debug_check_no_obj_freed+0x18/0x20
> | Call trace:
> | slab_free_freelist_hook+0xb4/0x1bc
> | kfree+0xe8/0x2d8
> | dump_backtrace+0x0/0x27c
> | rpmsg_ctrldev_release_device+0x78/0xb8
> | device_release+0x68/0x14c
> | show_stack+0x20/0x2c
> | kobject_cleanup+0x12c/0x298
> | kobject_delayed_cleanup+0x10/0x18
> | dump_stack+0xe0/0x19c
> | process_one_work+0x578/0x92c
> | worker_thread+0x804/0xcf8
> | print_address_description+0x3c/0x4a8
> | kthread+0x2a8/0x314
> | ret_from_fork+0x10/0x18
> | __kasan_report+0x100/0x124
>
> Signed-off-by: Sujit Kautkar <sujitka@chromium.org>

Reviewed-by: Matthias Kaehlcke <mka@chromium.org>

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
2021-11-02 23:51 ` [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device() Sujit Kautkar
2021-11-03 17:16 ` Matthias Kaehlcke
@ 2021-11-17 18:59 ` Stephen Boyd
2021-11-17 23:29 ` Bjorn Andersson
2 siblings, 0 replies; 8+ messages in thread
From: Stephen Boyd @ 2021-11-17 18:59 UTC (permalink / raw)
To: Andy Gross, Ohad Ben-Cohen, Sujit Kautkar
Cc: Bjorn Andersson, Sibi Sankar, Matthias Kaehlcke, linux-kernel,
linux-remoteproc

The subject is a little confusing. Maybe it should be "Use
cdev_device_{add,del}() instead of open coding".

Quoting Sujit Kautkar (2021-11-02 16:51:51)
> Replace cdev add/del APIs with cdev_device_add/cdev_device_del to avoid
> below kernel warning. This correctly takes a reference to the parent
> device so the parent will not get released until all references to the
> cdev are released.
>
> | ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
> | WARNING: CPU: 7 PID: 19892 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
> | CPU: 7 PID: 19892 Comm: kworker/7:4 Tainted: G W 5.4.147-lockdep #1
> | ==================================================================
> | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> | Workqueue: events kobject_delayed_cleanup
> | pstate: 60c00009 (nZCv daif +PAN +UAO)
> | pc : debug_print_object+0x13c/0x1b0
> | lr : debug_print_object+0x13c/0x1b0
> | sp : ffffff83b2ec7970
> | x29: ffffff83b2ec7970 x28: dfffffd000000000
> | x27: ffffff83d674f000 x26: dfffffd000000000
> | x25: ffffffd06b8fa660 x24: dfffffd000000000
> | x23: 0000000000000000 x22: ffffffd06b7c5108
> | x21: ffffffd06d597860 x20: ffffffd06e2c21c0
> | x19: ffffffd06d5974c0 x18: 000000000001dad8
> | x17: 0000000000000000 x16: dfffffd000000000
> | BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
> | x15: ffffffffffffffff x14: 79616c6564203a74
> | x13: 0000000000000000 x12: 0000000000000080
> | Write of size 8 at addr ffffff83d95768d0 by task kworker/3:1/150
> | x11: 0000000000000001 x10: 0000000000000000
> | x9 : fc9e8edec0ad0300 x8 : fc9e8edec0ad0300
> |
> | x7 : 0000000000000000 x6 : 0000000000000000
> | x5 : 0000000000000080 x4 : 0000000000000000
> | CPU: 3 PID: 150 Comm: kworker/3:1 Tainted: G W 5.4.147-lockdep #1
> | x3 : ffffffd06c149574 x2 : ffffff83f77f7498
> | x1 : ffffffd06d596f60 x0 : 0000000000000061
> | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> | Call trace:
> | debug_print_object+0x13c/0x1b0
> | Workqueue: events kobject_delayed_cleanup
> | __debug_check_no_obj_freed+0x25c/0x3c0
> | debug_check_no_obj_freed+0x18/0x20
> | Call trace:
> | slab_free_freelist_hook+0xb4/0x1bc
> | kfree+0xe8/0x2d8
> | dump_backtrace+0x0/0x27c
> | rpmsg_ctrldev_release_device+0x78/0xb8
> | device_release+0x68/0x14c
> | show_stack+0x20/0x2c
> | kobject_cleanup+0x12c/0x298
> | kobject_delayed_cleanup+0x10/0x18
> | dump_stack+0xe0/0x19c
> | process_one_work+0x578/0x92c
> | worker_thread+0x804/0xcf8
> | print_address_description+0x3c/0x4a8
> | kthread+0x2a8/0x314
> | ret_from_fork+0x10/0x18
> | __kasan_report+0x100/0x124
>
> Signed-off-by: Sujit Kautkar <sujitka@chromium.org>
> ---

Reviewed-by: Stephen Boyd <swboyd@chromium.org>

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
2021-11-02 23:51 ` [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device() Sujit Kautkar
2021-11-03 17:16 ` Matthias Kaehlcke
2021-11-17 18:59 ` Stephen Boyd
@ 2021-11-17 23:29 ` Bjorn Andersson
2021-12-07 0:15 ` Matthias Kaehlcke
2 siblings, 1 reply; 8+ messages in thread
From: Bjorn Andersson @ 2021-11-17 23:29 UTC (permalink / raw)
To: Sujit Kautkar
Cc: Andy Gross, Ohad Ben-Cohen, Sibi Sankar, Matthias Kaehlcke,
Stephen Boyd, linux-kernel, linux-remoteproc

On Tue 02 Nov 18:51 CDT 2021, Sujit Kautkar wrote:

I like Stephen's suggestion about modifying the $subject.
Also note that the change isn't in the glink driver, so prefix should
reflect that:

$ git log --oneline --no-decorate -- drivers/rpmsg/rpmsg_char.c
f998d48f9b3c rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
bc774a3887cb rpmsg: char: Remove useless include
964e8bedd5a1 rpmsg: char: Return an error if device already open
...

> Replace cdev add/del APIs with cdev_device_add/cdev_device_del to avoid
> below kernel warning. This correctly takes a reference to the parent
> device so the parent will not get released until all references to the
> cdev are released.
>
> | ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
> | WARNING: CPU: 7 PID: 19892 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
> | CPU: 7 PID: 19892 Comm: kworker/7:4 Tainted: G W 5.4.147-lockdep #1
> | ==================================================================
> | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> | Workqueue: events kobject_delayed_cleanup
> | pstate: 60c00009 (nZCv daif +PAN +UAO)
> | pc : debug_print_object+0x13c/0x1b0
> | lr : debug_print_object+0x13c/0x1b0
> | sp : ffffff83b2ec7970
> | x29: ffffff83b2ec7970 x28: dfffffd000000000
> | x27: ffffff83d674f000 x26: dfffffd000000000
> | x25: ffffffd06b8fa660 x24: dfffffd000000000
> | x23: 0000000000000000 x22: ffffffd06b7c5108
> | x21: ffffffd06d597860 x20: ffffffd06e2c21c0
> | x19: ffffffd06d5974c0 x18: 000000000001dad8
> | x17: 0000000000000000 x16: dfffffd000000000
> | BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
> | x15: ffffffffffffffff x14: 79616c6564203a74
> | x13: 0000000000000000 x12: 0000000000000080
> | Write of size 8 at addr ffffff83d95768d0 by task kworker/3:1/150
> | x11: 0000000000000001 x10: 0000000000000000
> | x9 : fc9e8edec0ad0300 x8 : fc9e8edec0ad0300
> |
> | x7 : 0000000000000000 x6 : 0000000000000000
> | x5 : 0000000000000080 x4 : 0000000000000000
> | CPU: 3 PID: 150 Comm: kworker/3:1 Tainted: G W 5.4.147-lockdep #1
> | x3 : ffffffd06c149574 x2 : ffffff83f77f7498
> | x1 : ffffffd06d596f60 x0 : 0000000000000061
> | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> | Call trace:
> | debug_print_object+0x13c/0x1b0
> | Workqueue: events kobject_delayed_cleanup
> | __debug_check_no_obj_freed+0x25c/0x3c0
> | debug_check_no_obj_freed+0x18/0x20
> | Call trace:
> | slab_free_freelist_hook+0xb4/0x1bc
> | kfree+0xe8/0x2d8
> | dump_backtrace+0x0/0x27c

Why is dump_backtrace in the callstack here in between
rpmsg_ctrldev_release_device() and kfree()? Isn't the error that we're
calling kfree() on a chunk of memory that contains a live object?

> | rpmsg_ctrldev_release_device+0x78/0xb8
> | device_release+0x68/0x14c
> | show_stack+0x20/0x2c
> | kobject_cleanup+0x12c/0x298
> | kobject_delayed_cleanup+0x10/0x18
> | dump_stack+0xe0/0x19c
> | process_one_work+0x578/0x92c
> | worker_thread+0x804/0xcf8
> | print_address_description+0x3c/0x4a8
> | kthread+0x2a8/0x314
> | ret_from_fork+0x10/0x18
> | __kasan_report+0x100/0x124
>
> Signed-off-by: Sujit Kautkar <sujitka@chromium.org>
> ---
> Changes in v3:
> - Remove unnecessary error check as per Matthias's comment
>
> Changes in v2:
> - Fix typo in commit message
>
> drivers/rpmsg/rpmsg_char.c | 10 ++--------
> 1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> index 876ce43df732b..a6a33155ca859 100644
> --- a/drivers/rpmsg/rpmsg_char.c
> +++ b/drivers/rpmsg/rpmsg_char.c
> @@ -458,7 +458,7 @@ static void rpmsg_ctrldev_release_device(struct device *dev)
>
>  	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
>  	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> -	cdev_del(&ctrldev->cdev);
> +	cdev_device_del(&ctrldev->cdev, &ctrldev->dev);

I am not able to find any other instance where cdev_device_del() is
called from the device's release function itself, which tells me that
this probably is not the right thing to do. Instead the appropriate way
seems to be to put the cdev_device_del() in rpmsg_chrdev_remove().

That said, we already do device_del() in rpmsg_chrdev_remove() so if the
warning is trying to tell us that ctrldev->dev is not deleted I think we
have an unbalanced put_device()?

Regards,
Bjorn

>  	kfree(ctrldev);
>  }
>
> @@ -493,19 +493,13 @@ static int rpmsg_chrdev_probe(struct rpmsg_device *rpdev)
>  	dev->id = ret;
>  	dev_set_name(&ctrldev->dev, "rpmsg_ctrl%d", ret);
>
> -	ret = cdev_add(&ctrldev->cdev, dev->devt, 1);
> +	ret = cdev_device_add(&ctrldev->cdev, &ctrldev->dev);
>  	if (ret)
>  		goto free_ctrl_ida;
>
>  	/* We can now rely on the release function for cleanup */
>  	dev->release = rpmsg_ctrldev_release_device;
>
> -	ret = device_add(dev);
> -	if (ret) {
> -		dev_err(&rpdev->dev, "device_add failed: %d\n", ret);
> -		put_device(dev);
> -	}
> -
>  	dev_set_drvdata(&rpdev->dev, ctrldev);
>
>  	return ret;
> --
> 2.31.0
>

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
2021-11-17 23:29 ` Bjorn Andersson
@ 2021-12-07 0:15 ` Matthias Kaehlcke
0 siblings, 0 replies; 8+ messages in thread
From: Matthias Kaehlcke @ 2021-12-07 0:15 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Sujit Kautkar, Andy Gross, Ohad Ben-Cohen, Sibi Sankar,
Stephen Boyd, linux-kernel, linux-remoteproc

On Wed, Nov 17, 2021 at 05:29:07PM -0600, Bjorn Andersson wrote:
> On Tue 02 Nov 18:51 CDT 2021, Sujit Kautkar wrote:
>
> I like Stephen's suggestion about modifying the $subject.
> Also note that the change isn't in the glink driver, so prefix should
> reflect that:
>
> $ git log --oneline --no-decorate -- drivers/rpmsg/rpmsg_char.c
> f998d48f9b3c rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
> bc774a3887cb rpmsg: char: Remove useless include
> 964e8bedd5a1 rpmsg: char: Return an error if device already open
> ...
>
> > Replace cdev add/del APIs with cdev_device_add/cdev_device_del to avoid
> > below kernel warning. This correctly takes a reference to the parent
> > device so the parent will not get released until all references to the
> > cdev are released.
> >
> > | ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
> > | WARNING: CPU: 7 PID: 19892 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
> > | CPU: 7 PID: 19892 Comm: kworker/7:4 Tainted: G W 5.4.147-lockdep #1
> > | ==================================================================
> > | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> > | Workqueue: events kobject_delayed_cleanup
> > | pstate: 60c00009 (nZCv daif +PAN +UAO)
> > | pc : debug_print_object+0x13c/0x1b0
> > | lr : debug_print_object+0x13c/0x1b0
> > | sp : ffffff83b2ec7970
> > | x29: ffffff83b2ec7970 x28: dfffffd000000000
> > | x27: ffffff83d674f000 x26: dfffffd000000000
> > | x25: ffffffd06b8fa660 x24: dfffffd000000000
> > | x23: 0000000000000000 x22: ffffffd06b7c5108
> > | x21: ffffffd06d597860 x20: ffffffd06e2c21c0
> > | x19: ffffffd06d5974c0 x18: 000000000001dad8
> > | x17: 0000000000000000 x16: dfffffd000000000
> > | BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
> > | x15: ffffffffffffffff x14: 79616c6564203a74
> > | x13: 0000000000000000 x12: 0000000000000080
> > | Write of size 8 at addr ffffff83d95768d0 by task kworker/3:1/150
> > | x11: 0000000000000001 x10: 0000000000000000
> > | x9 : fc9e8edec0ad0300 x8 : fc9e8edec0ad0300
> > |
> > | x7 : 0000000000000000 x6 : 0000000000000000
> > | x5 : 0000000000000080 x4 : 0000000000000000
> > | CPU: 3 PID: 150 Comm: kworker/3:1 Tainted: G W 5.4.147-lockdep #1
> > | x3 : ffffffd06c149574 x2 : ffffff83f77f7498
> > | x1 : ffffffd06d596f60 x0 : 0000000000000061
> > | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> > | Call trace:
> > | debug_print_object+0x13c/0x1b0
> > | Workqueue: events kobject_delayed_cleanup
> > | __debug_check_no_obj_freed+0x25c/0x3c0
> > | debug_check_no_obj_freed+0x18/0x20
> > | Call trace:
> > | slab_free_freelist_hook+0xb4/0x1bc
> > | kfree+0xe8/0x2d8
> > | dump_backtrace+0x0/0x27c
>
> Why is dump_backtrace in the callstack here in between
> rpmsg_ctrldev_release_device() and kfree()? Isn't the error that we're
> calling kfree() on a chunk of memory that contains a live object?

When I tried to repro there was no dump_backtrace():

Call trace:
debug_print_object+0x13c/0x1b0
__debug_check_no_obj_freed+0x25c/0x3c0
debug_check_no_obj_freed+0x18/0x20
slab_free_freelist_hook+0xbc/0x1e4
kfree+0xfc/0x2f4
rpmsg_ctrldev_release_device+0x78/0xb8
device_release+0x84/0x168
kobject_cleanup+0x12c/0x298
kobject_delayed_cleanup+0x10/0x18
process_one_work+0x578/0x92c
worker_thread+0x804/0xcf8
kthread+0x2a8/0x314
ret_from_fork+0x10/0x18

My guess is that Sujit added a dump_backtrace() for debugging and it
was still there when the backtrace of the commit message was generated.
That would also explain the two 'Call trace:' entries in the log.

> > | rpmsg_ctrldev_release_device+0x78/0xb8
> > | device_release+0x68/0x14c
> > | show_stack+0x20/0x2c
> > | kobject_cleanup+0x12c/0x298
> > | kobject_delayed_cleanup+0x10/0x18
> > | dump_stack+0xe0/0x19c
> > | process_one_work+0x578/0x92c
> > | worker_thread+0x804/0xcf8
> > | print_address_description+0x3c/0x4a8
> > | kthread+0x2a8/0x314
> > | ret_from_fork+0x10/0x18
> > | __kasan_report+0x100/0x124
> >
> > Signed-off-by: Sujit Kautkar <sujitka@chromium.org>
> > ---
> > Changes in v3:
> > - Remove unnecessary error check as per Matthias's comment
> >
> > Changes in v2:
> > - Fix typo in commit message
> >
> > drivers/rpmsg/rpmsg_char.c | 10 ++--------
> > 1 file changed, 2 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> > index 876ce43df732b..a6a33155ca859 100644
> > --- a/drivers/rpmsg/rpmsg_char.c
> > +++ b/drivers/rpmsg/rpmsg_char.c
> > @@ -458,7 +458,7 @@ static void rpmsg_ctrldev_release_device(struct device *dev)
> >
> >  	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> >  	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> > -	cdev_del(&ctrldev->cdev);
> > +	cdev_device_del(&ctrldev->cdev, &ctrldev->dev);
>
> I am not able to find any other instance where cdev_device_del() is
> called from the device's release function itself, which tells me that
> this probably is not the right thing to do. Instead the appropriate way
> seems to be to put the cdev_device_del() in rpmsg_chrdev_remove().

Yes, it sounds reasonable to me to delete the char device when the
control device is removed.

> That said, we already do device_del() in rpmsg_chrdev_remove() so if the
> warning is trying to tell us that ctrldev->dev is not deleted I think we
> have an unbalanced put_device()?

My understanding is that the situation is analogous to this one:

commit 1413ef638abae4ab5621901cf4d8ef08a4a48ba6
Author: Kevin Hao <haokexin@gmail.com>
Date: Fri Oct 11 23:00:14 2019 +0800

    i2c: dev: Fix the race between the release of i2c_dev and cdev

    The struct cdev is embedded in the struct i2c_dev. In the current code,
    we would free the i2c_dev struct directly in put_i2c_dev(), but the
    cdev is managed by a kobject, and the release of it is not predictable.
    So it is very possible that the i2c_dev is freed before the cdev is
    entirely released. We can easily get the following call trace with
    CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS_TIMERS enabled.

    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38
    WARNING: CPU: 19 PID: 1 at lib/debugobjects.c:325 debug_print_object+0xb0/0xf0
    ...

    This is a common issue when using cdev embedded in a struct.
    Fortunately, we already have a mechanism to solve this kind of issue.
    Please see commit 233ed09d7fda ("chardev: add helper function to
    register char devs with a struct device") for more detail.

    In this patch, we choose to embed the struct device into the i2c_dev,
    and use the API provided by the commit 233ed09d7fda to make sure that
    the release of i2c_dev and cdev are in sequence.

^ permalink raw reply [flat|nested] 8+ messages in thread
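The lifetime rule described in the quoted i2c commit can be illustrated with a small userspace model. This is a hedged sketch, not kernel code: every `model_*` name is hypothetical, plain integers stand in for kobject reference counts, and the only behaviour modelled is that the embedded character device pins its parent device, so the containing allocation is freed only after the last cdev reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount plus a release hook. */
struct model_dev {
	int refs;
	void (*release)(struct model_dev *dev);
};

/* Hypothetical stand-in for struct cdev: own refcount and a pinned parent. */
struct model_cdev {
	int refs;
	struct model_dev *parent;
};

/* Both objects live in one allocation, as in the drivers discussed here. */
struct model_ctrldev {
	struct model_dev dev;
	struct model_cdev cdev;
	bool *freed;	/* set by the release hook so callers can observe it */
};

static void model_put_dev(struct model_dev *dev)
{
	assert(dev->refs > 0);
	if (--dev->refs == 0)
		dev->release(dev);
}

/* Models the parent link: the cdev takes a reference on the device. */
static void model_cdev_add(struct model_cdev *cdev, struct model_dev *dev)
{
	cdev->refs = 1;
	cdev->parent = dev;
	dev->refs++;
}

/* Models an extra cdev reference, for example one held by an open file. */
static void model_cdev_get(struct model_cdev *cdev)
{
	cdev->refs++;
}

/* Dropping the last cdev reference releases the pin on the parent. */
static void model_cdev_put(struct model_cdev *cdev)
{
	assert(cdev->refs > 0);
	if (--cdev->refs == 0) {
		struct model_dev *parent = cdev->parent;

		cdev->parent = NULL;
		model_put_dev(parent);	/* may free the memory holding cdev */
	}
}

/* Release hook: recover the container from the embedded device and free it. */
static void model_ctrldev_release(struct model_dev *dev)
{
	struct model_ctrldev *ctrldev = (struct model_ctrldev *)
		((char *)dev - offsetof(struct model_ctrldev, dev));

	*ctrldev->freed = true;
	free(ctrldev);
}

static struct model_ctrldev *model_ctrldev_create(bool *freed)
{
	struct model_ctrldev *ctrldev = calloc(1, sizeof(*ctrldev));

	if (!ctrldev)
		return NULL;
	ctrldev->dev.refs = 1;
	ctrldev->dev.release = model_ctrldev_release;
	ctrldev->freed = freed;
	model_cdev_add(&ctrldev->cdev, &ctrldev->dev);
	return ctrldev;
}
```

In this model, dropping the driver's own device reference while a cdev reference is still outstanding does not free the container; the free happens only when the final `model_cdev_put()` runs. That is the ordering the commit above describes as keeping the release of the container and the cdev "in sequence".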