All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jacek Anaszewski <jacek.anaszewski@gmail.com>
To: anish kumar <yesanishhere@gmail.com>
Cc: Mukesh Ojha <quic_mojha@quicinc.com>, Pavel Machek <pavel@ucw.cz>,
	Lee Jones <lee@kernel.org>,
	linux-leds@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex
Date: Thu, 17 Oct 2024 19:58:59 +0200	[thread overview]
Message-ID: <c3965459-c295-c577-8b49-834ea4c19d55@gmail.com> (raw)
In-Reply-To: <CABCoZhBq4oFHOoPoWu2g=1Szry7bV9rBRgq_4zDxuS=17jfNhw@mail.gmail.com>



On 10/17/24 18:41, anish kumar wrote:
> On Thu, Oct 17, 2024 at 5:12 AM Jacek Anaszewski
> <jacek.anaszewski@gmail.com> wrote:
>>
>> Hi Anish and Mukesh,
>>
>> On 10/16/24 18:37, anish kumar wrote:
>>> On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
>>>>
>>>> On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote:
>>>>> On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
>>>>>>
>>>>>> On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote:
>>>>>>> On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
>>>>>>>>
>>>>>>>> There is NULL pointer issue observed if from Process A where hid device
>>>>>>>> being added which results in adding a led_cdev addition and later a
>>>>>>>> another call to access of led_cdev attribute from Process B can result
>>>>>>>> in NULL pointer issue.
>>>>>>>
>>>>>>> Which pointer is NULL? Call stack shows that dualshock4_led_get_brightness
>>>>>>> function could be culprit?
>>>>>>
>>>>>> in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2]
>>>>>> is not yet completed.
>>>>>>
>>>>>> [1]
>>>>>>    struct hid_device *hdev = to_hid_device(led->dev->parent);
>>>>>>
>>>>>> [2]
>>>>>> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0,
>>>>>>                     led_cdev, led_cdev->groups, "%s", final_name);
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Use mutex led_cdev->led_access to protect access to led->cdev and its
>>>>>>>> attribute inside brightness_show().
>>>>>>>
>>>>>>> I don't think it is needed here because it is just calling the led driver
>>>>>>> callback and updating the brightness. So, why would we need to serialize
>>>>>>> that using mutex? Maybe the callback needs some debugging.
>>>>>>> I'm curious if it is ready by the time the callback is invoked.
>>>>>>
>>>>>> Because, we should not be allowed to access led_cdev->dev as it is not
>>>>>> completed and since, brightness_store() has this lock brightness_show()
>>>>>> should also have this as we are seeing the issue without it.
>>>>>>
>>>>>> I hope, above might have answered your question.
>>>>>>
>>>>>> -Mukesh
>>>>>>>
>>>>>>>>
>>>>>>>>           Process A                               Process B
>>>>>>>>
>>>>>>>>    kthread+0x114
>>>>>>>>    worker_thread+0x244
>>>>>>>>    process_scheduled_works+0x248
>>>>>>>>    uhid_device_add_worker+0x24
>>>>>>>>    hid_add_device+0x120
>>>>>>>>    device_add+0x268
>>>>>>>>    bus_probe_device+0x94
>>>>>>>>    device_initial_probe+0x14
>>>>>>>>    __device_attach+0xfc
>>>>>>>>    bus_for_each_drv+0x10c
>>>>>>>>    __device_attach_driver+0x14c
>>>>>>>>    driver_probe_device+0x3c
>>>>>>>>    __driver_probe_device+0xa0
>>>>>>>>    really_probe+0x190
>>>>>>>>    hid_device_probe+0x130
>>>>>>>>    ps_probe+0x990
>>>>>>>>    ps_led_register+0x94
>>>>>>>>    devm_led_classdev_register_ext+0x58
>>>>>>>>    led_classdev_register_ext+0x1f8
>>>>>>>>    device_create_with_groups+0x48
>>>>>>>>    device_create_groups_vargs+0xc8
>>>>>>>>    device_add+0x244
>>>>>>>>    kobject_uevent+0x14
>>>>>>>>    kobject_uevent_env[jt]+0x224
>>>>>>>>    mutex_unlock[jt]+0xc4
>>>>>>>>    __mutex_unlock_slowpath+0xd4
>>>>>>>>    wake_up_q+0x70
>>>>>>>>    try_to_wake_up[jt]+0x48c
>>>>>>>>    preempt_schedule_common+0x28
>>>>>>>>    __schedule+0x628
>>>>>>>>    __switch_to+0x174
>>>>>>>>                                                   el0t_64_sync+0x1a8/0x1ac
>>>>>>>>                                                   el0t_64_sync_handler+0x68/0xbc
>>>>>>>>                                                   el0_svc+0x38/0x68
>>>>>>>>                                                   do_el0_svc+0x1c/0x28
>>>>>>>>                                                   el0_svc_common+0x80/0xe0
>>>>>>>>                                                   invoke_syscall+0x58/0x114
>>>>>>>>                                                   __arm64_sys_read+0x1c/0x2c
>>>>>>>>                                                   ksys_read+0x78/0xe8
>>>>>>>>                                                   vfs_read+0x1e0/0x2c8
>>>>>>>>                                                   kernfs_fop_read_iter+0x68/0x1b4
>>>>>>>>                                                   seq_read_iter+0x158/0x4ec
>>>>>>>>                                                   kernfs_seq_show+0x44/0x54
>>>>>>>>                                                   sysfs_kf_seq_show+0xb4/0x130
>>>>>>>>                                                   dev_attr_show+0x38/0x74
>>>>>>>>                                                   brightness_show+0x20/0x4c
>>>>>>>>                                                   dualshock4_led_get_brightness+0xc/0x74
>>>>>>>>
>>>>>>>> [ 3313.874295][ T4013] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
>>>>>>>> [ 3313.874301][ T4013] Mem abort info:
>>>>>>>> [ 3313.874303][ T4013]   ESR = 0x0000000096000006
>>>>>>>> [ 3313.874305][ T4013]   EC = 0x25: DABT (current EL), IL = 32 bits
>>>>>>>> [ 3313.874307][ T4013]   SET = 0, FnV = 0
>>>>>>>> [ 3313.874309][ T4013]   EA = 0, S1PTW = 0
>>>>>>>> [ 3313.874311][ T4013]   FSC = 0x06: level 2 translation fault
>>>>>>>> [ 3313.874313][ T4013] Data abort info:
>>>>>>>> [ 3313.874314][ T4013]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
>>>>>>>> [ 3313.874316][ T4013]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>>>>>>> [ 3313.874318][ T4013]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>>>>>>> [ 3313.874320][ T4013] user pgtable: 4k pages, 39-bit VAs, pgdp=00000008f2b0a000
>>>>>>>> ..
>>>>>>>>
>>>>>>>> [ 3313.874332][ T4013] Dumping ftrace buffer:
>>>>>>>> [ 3313.874334][ T4013]    (ftrace buffer empty)
>>>>>>>> ..
>>>>>>>> ..
>>>>>>>> [ dd3313.874639][ T4013] CPU: 6 PID: 4013 Comm: InputReader
>>>>>>>> [ 3313.874648][ T4013] pc : dualshock4_led_get_brightness+0xc/0x74
>>>>>>>> [ 3313.874653][ T4013] lr : led_update_brightness+0x38/0x60
>>>>>>>> [ 3313.874656][ T4013] sp : ffffffc0b910bbd0
>>>>>>>> ..
>>>>>>>> ..
>>>>>>>> [ 3313.874685][ T4013] Call trace:
>>>>>>>> [ 3313.874687][ T4013]  dualshock4_led_get_brightness+0xc/0x74
>>>>>>>> [ 3313.874690][ T4013]  brightness_show+0x20/0x4c
>>>>>>>> [ 3313.874692][ T4013]  dev_attr_show+0x38/0x74
>>>>>>>> [ 3313.874696][ T4013]  sysfs_kf_seq_show+0xb4/0x130
>>>>>>>> [ 3313.874700][ T4013]  kernfs_seq_show+0x44/0x54
>>>>>>>> [ 3313.874703][ T4013]  seq_read_iter+0x158/0x4ec
>>>>>>>> [ 3313.874705][ T4013]  kernfs_fop_read_iter+0x68/0x1b4
>>>>>>>> [ 3313.874708][ T4013]  vfs_read+0x1e0/0x2c8
>>>>>>>> [ 3313.874711][ T4013]  ksys_read+0x78/0xe8
>>>>>>>> [ 3313.874714][ T4013]  __arm64_sys_read+0x1c/0x2c
>>>>>>>> [ 3313.874718][ T4013]  invoke_syscall+0x58/0x114
>>>>>>>> [ 3313.874721][ T4013]  el0_svc_common+0x80/0xe0
>>>>>>>> [ 3313.874724][ T4013]  do_el0_svc+0x1c/0x28
>>>>>>>> [ 3313.874727][ T4013]  el0_svc+0x38/0x68
>>>>>>>> [ 3313.874730][ T4013]  el0t_64_sync_handler+0x68/0xbc
>>>>>>>> [ 3313.874732][ T4013]  el0t_64_sync+0x1a8/0x1ac
>>>>>>>>
>>>>>>>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
>>>>>>>> ---
>>>>>>>>    drivers/leds/led-class.c | 3 ++-
>>>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c
>>>>>>>> index 06b97fd49ad9..e3cb93f19c06 100644
>>>>>>>> --- a/drivers/leds/led-class.c
>>>>>>>> +++ b/drivers/leds/led-class.c
>>>>>>>> @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev,
>>>>>>>>    {
>>>>>>>>           struct led_classdev *led_cdev = dev_get_drvdata(dev);
>>>>>>>>
>>>>>>>> -       /* no lock needed for this */
>>>>>
>>>>> just get rid of the above comment then.
>>>>
>>>> If you notice, it is already removed (-) .
>>>>
>>>>>
>>>>> Also, the comment below in file leds.h
>>>>> needs an update as originally the idea for this mutex lock was to
>>>>> provide quick feedback to userspace based on this commit
>>>>> https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb
>>>>>
>>>>> Basically a comment somewhere so that when a new attribute
>>>>> gets added, it doesn't make the same mistake of not using the mutex
>>>>> and run into the same issue.
>>>>>
>>>>> /* Ensures consistent access to the LED Flash Class device */
>>>>> struct mutex led_access;
>>>>
>>>> Thanks for accepting that it is an issue.
>>>> I think, comment is very obvious actually the patch you mentioned should
>>>> be in fixes tag as it introduced the lock but did not protect the show
>>>> while it does it for store.
>>>
>>> Yes, but that patch was added for supporting flash class
>>> device and wasn't explicitly to take care of the scenario that you
>>> are trying to handle and the above comment in leds.h states the same.
>>
>> Correct. led_access mutex was introduced to add support for preventing
>> any LED class device state changes originating from sysfs while
>> v4l2_flash wrapper owns the device.
>>
>> Since the inception of LED subsystem all the locking was deemed to be
>> the responsibility of every single LED class driver and initially sysfs
>> attr callbacks didn't have any locking. After some time when LED core
>> started to grow it turned out that it was required to lock the LED class
>> initialization sequence, so as not to give the userspace an opportunity
>> to set LED brightness on not fully initialized device, which was
>> introduced in [0]. led_access mutex was already in place so it was used.
>> However as you noticed, it is not used consistently across all LED class
>> sysfs attrs callbacks.
>>
>> Since brightness_show() does not acquire led_access mutex it is still
>> possible to call brightness_get op when LED class initialization
>> sequence is not yet finished.
>>
>> Still, I'd propose to first narrow down the issue and figure out what
>> actually causes NULL pointer dereference, as it apparently
>> originates from dualshock4_led_get_brightness and not from LED core.
> 
> Mukesh already explained the issue in earlier emails but here is the gist
> anyway.
> 
> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0,
>    led_cdev, led_cdev->groups, "%s", final_name);
> 
> If you look at the above code, device_create_with_groups function
> can create all the sysfs and before it returns and assigns led_cdev->dev
> pointer, those sysfs callback can get triggered and if the callback
> accesses led_cdev->dev this variable, it will crash as it is not yet
> assigned.

Your trace ends in dualshock4_led_get_brightness(). Did you confirm that
NULL pointer dereference is caused by accessing led_cdev->dev there?

-- 
Best regards,
Jacek Anaszewski

  reply	other threads:[~2024-10-17 17:59 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-15 16:25 [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex Mukesh Ojha
2024-10-15 17:59 ` anish kumar
2024-10-15 19:27   ` Mukesh Ojha
2024-10-15 22:28     ` anish kumar
2024-10-16  5:45       ` Mukesh Ojha
2024-10-16 16:37         ` anish kumar
2024-10-17 12:12           ` Jacek Anaszewski
2024-10-17 16:41             ` anish kumar
2024-10-17 17:58               ` Jacek Anaszewski [this message]
2024-10-17 20:30                 ` anish kumar
2024-10-18 20:10                   ` Jacek Anaszewski
2024-10-25 10:01                     ` Mukesh Ojha

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c3965459-c295-c577-8b49-834ea4c19d55@gmail.com \
    --to=jacek.anaszewski@gmail.com \
    --cc=lee@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-leds@vger.kernel.org \
    --cc=pavel@ucw.cz \
    --cc=quic_mojha@quicinc.com \
    --cc=yesanishhere@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.