All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthias Kaehlcke <mka@chromium.org>
To: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Sujit Kautkar <sujitka@chromium.org>,
	Andy Gross <agross@kernel.org>, Ohad Ben-Cohen <ohad@wizery.com>,
	Sibi Sankar <sibis@codeaurora.org>,
	Stephen Boyd <swboyd@chromium.org>,
	linux-kernel@vger.kernel.org, linux-remoteproc@vger.kernel.org
Subject: Re: [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
Date: Mon, 6 Dec 2021 16:15:14 -0800	[thread overview]
Message-ID: <Ya6nks3V4/G7aA9Q@google.com> (raw)
In-Reply-To: <YZWQQ6XPIdMLtZEy@builder.lan>

On Wed, Nov 17, 2021 at 05:29:07PM -0600, Bjorn Andersson wrote:
> On Tue 02 Nov 18:51 CDT 2021, Sujit Kautkar wrote:
> 
> I like Stephen's suggestion about modifying the $subject.
> Also note that the change isn't in the glink driver, so prefix should
> reflect that:
> 
> $ git log --oneline --no-decorate -- drivers/rpmsg/rpmsg_char.c
> f998d48f9b3c rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device()
> bc774a3887cb rpmsg: char: Remove useless include
> 964e8bedd5a1 rpmsg: char: Return an error if device already open
> ...
> 
> > Replace cdev add/del APIs with cdev_device_add/cdev_device_del to avoid
> > below kernel warning. This correctly takes a reference to the parent
> > device so the parent will not get released until all references to the
> > cdev are released.
> > 
> > | ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
> > | WARNING: CPU: 7 PID: 19892 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
> > | CPU: 7 PID: 19892 Comm: kworker/7:4 Tainted: G        W         5.4.147-lockdep #1
> > | ==================================================================
> > | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> > | Workqueue: events kobject_delayed_cleanup
> > | pstate: 60c00009 (nZCv daif +PAN +UAO)
> > | pc : debug_print_object+0x13c/0x1b0
> > | lr : debug_print_object+0x13c/0x1b0
> > | sp : ffffff83b2ec7970
> > | x29: ffffff83b2ec7970 x28: dfffffd000000000
> > | x27: ffffff83d674f000 x26: dfffffd000000000
> > | x25: ffffffd06b8fa660 x24: dfffffd000000000
> > | x23: 0000000000000000 x22: ffffffd06b7c5108
> > | x21: ffffffd06d597860 x20: ffffffd06e2c21c0
> > | x19: ffffffd06d5974c0 x18: 000000000001dad8
> > | x17: 0000000000000000 x16: dfffffd000000000
> > | BUG: KASAN: use-after-free in qcom_glink_rpdev_release+0x54/0x70
> > | x15: ffffffffffffffff x14: 79616c6564203a74
> > | x13: 0000000000000000 x12: 0000000000000080
> > | Write of size 8 at addr ffffff83d95768d0 by task kworker/3:1/150
> > | x11: 0000000000000001 x10: 0000000000000000
> > | x9 : fc9e8edec0ad0300 x8 : fc9e8edec0ad0300
> > |
> > | x7 : 0000000000000000 x6 : 0000000000000000
> > | x5 : 0000000000000080 x4 : 0000000000000000
> > | CPU: 3 PID: 150 Comm: kworker/3:1 Tainted: G        W         5.4.147-lockdep #1
> > | x3 : ffffffd06c149574 x2 : ffffff83f77f7498
> > | x1 : ffffffd06d596f60 x0 : 0000000000000061
> > | Hardware name: Google CoachZ (rev1 - 2) with LTE (DT)
> > | Call trace:
> > |  debug_print_object+0x13c/0x1b0
> > | Workqueue: events kobject_delayed_cleanup
> > |  __debug_check_no_obj_freed+0x25c/0x3c0
> > |  debug_check_no_obj_freed+0x18/0x20
> > | Call trace:
> > |  slab_free_freelist_hook+0xb4/0x1bc
> > |  kfree+0xe8/0x2d8
> > |  dump_backtrace+0x0/0x27c
> 
> Why is dump_backtrace in the callstack here inbetween
> rpmsg_ctrldev_release_device() and kfree()? Isn't the error that we're
> calling kfree() on an chunk of memory that contains a live object?

When I tried to repro there was no dump_backtrace():

  Call trace:
   debug_print_object+0x13c/0x1b0
   __debug_check_no_obj_freed+0x25c/0x3c0
   debug_check_no_obj_freed+0x18/0x20
   slab_free_freelist_hook+0xbc/0x1e4
   kfree+0xfc/0x2f4
   rpmsg_ctrldev_release_device+0x78/0xb8
   device_release+0x84/0x168
   kobject_cleanup+0x12c/0x298
   kobject_delayed_cleanup+0x10/0x18
   process_one_work+0x578/0x92c
   worker_thread+0x804/0xcf8
   kthread+0x2a8/0x314
   ret_from_fork+0x10/0x18

My guess is that Sujit added a dump_backtrace() for debugging and it was
still there when the backtrace of the commit message was generated. That
would also explain the two 'Call trace:' entries in the log.

> > |  rpmsg_ctrldev_release_device+0x78/0xb8
> > |  device_release+0x68/0x14c
> > |  show_stack+0x20/0x2c
> > |  kobject_cleanup+0x12c/0x298
> > |  kobject_delayed_cleanup+0x10/0x18
> > |  dump_stack+0xe0/0x19c
> > |  process_one_work+0x578/0x92c
> > |  worker_thread+0x804/0xcf8
> > |  print_address_description+0x3c/0x4a8
> > |  kthread+0x2a8/0x314
> > |  ret_from_fork+0x10/0x18
> > |  __kasan_report+0x100/0x124
> > 
> > Signed-off-by: Sujit Kautkar <sujitka@chromium.org>
> > ---
> > Changes in v3:
> > - Remove unecessary error check as per Matthias's comment
> > 
> > Changes in v2:
> > - Fix typo in commit message
> > 
> >  drivers/rpmsg/rpmsg_char.c | 10 ++--------
> >  1 file changed, 2 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> > index 876ce43df732b..a6a33155ca859 100644
> > --- a/drivers/rpmsg/rpmsg_char.c
> > +++ b/drivers/rpmsg/rpmsg_char.c
> > @@ -458,7 +458,7 @@ static void rpmsg_ctrldev_release_device(struct device *dev)
> >  
> >  	ida_simple_remove(&rpmsg_ctrl_ida, dev->id);
> >  	ida_simple_remove(&rpmsg_minor_ida, MINOR(dev->devt));
> > -	cdev_del(&ctrldev->cdev);
> > +	cdev_device_del(&ctrldev->cdev, &ctrldev->dev);
> 
> I am not able to find any other instance where cdev_device_del() is
> called from the device's release function itself, which tells me that
> this probably is not the right thing to do. Instead the appropriate way
> seem to put the cdev_device_del() in rpmsg_chrdev_remove().

Yes, it sounds reasonable to me to delete the char device when the control
device is removed.

> That said, we already do device_del() in rpmsg_chrdev_remove() so if the
> warning is trying to tell us that ctrldev->dev is not deleted I think we
> have an unbalanced put_device()?

My understanding is that the situation is analogous to this one:

commit 1413ef638abae4ab5621901cf4d8ef08a4a48ba6
Author: Kevin Hao <haokexin@gmail.com>
Date:   Fri Oct 11 23:00:14 2019 +0800

    i2c: dev: Fix the race between the release of i2c_dev and cdev

    The struct cdev is embedded in the struct i2c_dev. In the current code,
    we would free the i2c_dev struct directly in put_i2c_dev(), but the
    cdev is manged by a kobject, and the release of it is not predictable.
    So it is very possible that the i2c_dev is freed before the cdev is
    entirely released. We can easily get the following call trace with
    CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS_TIMERS enabled.
      ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38
      WARNING: CPU: 19 PID: 1 at lib/debugobjects.c:325 debug_print_object+0xb0/0xf0

    ...

    This is a common issue when using cdev embedded in a struct.
    Fortunately, we already have a mechanism to solve this kind of issue.
    Please see commit 233ed09d7fda ("chardev: add helper function to
    egister char devs with a struct device") for more detail.

    In this patch, we choose to embed the struct device into the i2c_dev,
    and use the API provided by the commit 233ed09d7fda to make sure that
    the release of i2c_dev and cdev are in sequence.


      reply	other threads:[~2021-12-07  0:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-02 23:51 [PATCH v3 0/2] Fix two kernel warnings in glink driver Sujit Kautkar
2021-11-02 23:51 ` [PATCH v3 1/2] rpmsg: glink: Fix use-after-free in qcom_glink_rpdev_release() Sujit Kautkar
2021-11-03 16:34   ` Matthias Kaehlcke
2021-11-02 23:51 ` [PATCH v3 2/2] rpmsg: glink: Update cdev add/del API in rpmsg_ctrldev_release_device() Sujit Kautkar
2021-11-03 17:16   ` Matthias Kaehlcke
2021-11-17 18:59   ` Stephen Boyd
2021-11-17 23:29   ` Bjorn Andersson
2021-12-07  0:15     ` Matthias Kaehlcke [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ya6nks3V4/G7aA9Q@google.com \
    --to=mka@chromium.org \
    --cc=agross@kernel.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-remoteproc@vger.kernel.org \
    --cc=ohad@wizery.com \
    --cc=sibis@codeaurora.org \
    --cc=sujitka@chromium.org \
    --cc=swboyd@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.