* [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex
From: Mukesh Ojha @ 2024-10-15 16:25 UTC
To: Pavel Machek, Lee Jones; +Cc: linux-leds, linux-kernel, Mukesh Ojha

A NULL pointer dereference is observed when Process A adds a HID device,
which registers a led_cdev, and Process B then reads a led_cdev attribute
before that registration has completed.

Use the led_cdev->led_access mutex to protect access to led_cdev and its
attributes inside brightness_show().

	Process A				Process B

 kthread+0x114
 worker_thread+0x244
 process_scheduled_works+0x248
 uhid_device_add_worker+0x24
 hid_add_device+0x120
 device_add+0x268
 bus_probe_device+0x94
 device_initial_probe+0x14
 __device_attach+0xfc
 bus_for_each_drv+0x10c
 __device_attach_driver+0x14c
 driver_probe_device+0x3c
 __driver_probe_device+0xa0
 really_probe+0x190
 hid_device_probe+0x130
 ps_probe+0x990
 ps_led_register+0x94
 devm_led_classdev_register_ext+0x58
 led_classdev_register_ext+0x1f8
 device_create_with_groups+0x48
 device_create_groups_vargs+0xc8
 device_add+0x244
 kobject_uevent+0x14
 kobject_uevent_env[jt]+0x224
 mutex_unlock[jt]+0xc4
 __mutex_unlock_slowpath+0xd4
 wake_up_q+0x70
 try_to_wake_up[jt]+0x48c
 preempt_schedule_common+0x28
 __schedule+0x628
 __switch_to+0x174
						el0t_64_sync+0x1a8/0x1ac
						el0t_64_sync_handler+0x68/0xbc
						el0_svc+0x38/0x68
						do_el0_svc+0x1c/0x28
						el0_svc_common+0x80/0xe0
						invoke_syscall+0x58/0x114
						__arm64_sys_read+0x1c/0x2c
						ksys_read+0x78/0xe8
						vfs_read+0x1e0/0x2c8
						kernfs_fop_read_iter+0x68/0x1b4
						seq_read_iter+0x158/0x4ec
						kernfs_seq_show+0x44/0x54
						sysfs_kf_seq_show+0xb4/0x130
						dev_attr_show+0x38/0x74
						brightness_show+0x20/0x4c
						dualshock4_led_get_brightness+0xc/0x74

[ 3313.874295][ T4013] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
[ 3313.874301][ T4013] Mem abort info:
[ 3313.874303][ T4013]   ESR = 0x0000000096000006
[ 3313.874305][ T4013]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 3313.874307][ T4013]   SET = 0, FnV = 0
[ 3313.874309][ T4013]   EA = 0, S1PTW = 0
[ 3313.874311][ T4013]   FSC = 0x06: level 2 translation fault
[ 3313.874313][ T4013] Data abort info:
[ 3313.874314][ T4013]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 3313.874316][ T4013]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 3313.874318][ T4013]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 3313.874320][ T4013] user pgtable: 4k pages, 39-bit VAs, pgdp=00000008f2b0a000
..

[ 3313.874332][ T4013] Dumping ftrace buffer:
[ 3313.874334][ T4013]    (ftrace buffer empty)
..
..
[ 3313.874639][ T4013] CPU: 6 PID: 4013 Comm: InputReader
[ 3313.874648][ T4013] pc : dualshock4_led_get_brightness+0xc/0x74
[ 3313.874653][ T4013] lr : led_update_brightness+0x38/0x60
[ 3313.874656][ T4013] sp : ffffffc0b910bbd0
..
..
[ 3313.874685][ T4013] Call trace:
[ 3313.874687][ T4013]  dualshock4_led_get_brightness+0xc/0x74
[ 3313.874690][ T4013]  brightness_show+0x20/0x4c
[ 3313.874692][ T4013]  dev_attr_show+0x38/0x74
[ 3313.874696][ T4013]  sysfs_kf_seq_show+0xb4/0x130
[ 3313.874700][ T4013]  kernfs_seq_show+0x44/0x54
[ 3313.874703][ T4013]  seq_read_iter+0x158/0x4ec
[ 3313.874705][ T4013]  kernfs_fop_read_iter+0x68/0x1b4
[ 3313.874708][ T4013]  vfs_read+0x1e0/0x2c8
[ 3313.874711][ T4013]  ksys_read+0x78/0xe8
[ 3313.874714][ T4013]  __arm64_sys_read+0x1c/0x2c
[ 3313.874718][ T4013]  invoke_syscall+0x58/0x114
[ 3313.874721][ T4013]  el0_svc_common+0x80/0xe0
[ 3313.874724][ T4013]  do_el0_svc+0x1c/0x28
[ 3313.874727][ T4013]  el0_svc+0x38/0x68
[ 3313.874730][ T4013]  el0t_64_sync_handler+0x68/0xbc
[ 3313.874732][ T4013]  el0t_64_sync+0x1a8/0x1ac

Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
 drivers/leds/led-class.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c
index 06b97fd49ad9..e3cb93f19c06 100644
--- a/drivers/leds/led-class.c
+++ b/drivers/leds/led-class.c
@@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev,
 {
 	struct led_classdev *led_cdev = dev_get_drvdata(dev);
 
-	/* no lock needed for this */
+	mutex_lock(&led_cdev->led_access);
 	led_update_brightness(led_cdev);
+	mutex_unlock(&led_cdev->led_access);
 
 	return sprintf(buf, "%u\n", led_cdev->brightness);
 }
-- 
2.34.1

* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex
From: anish kumar @ 2024-10-15 17:59 UTC
To: Mukesh Ojha; +Cc: Pavel Machek, Lee Jones, linux-leds, linux-kernel

On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
>
> A NULL pointer dereference is observed when Process A adds a HID device,
> which registers a led_cdev, and Process B then reads a led_cdev attribute
> before that registration has completed.

Which pointer is NULL? The call stack suggests that
dualshock4_led_get_brightness() could be the culprit?

>
> Use the led_cdev->led_access mutex to protect access to led_cdev and its
> attributes inside brightness_show().

I don't think it is needed here, because it just calls the LED driver
callback and updates the brightness. So why would we need to serialize
that with a mutex? Maybe the callback needs some debugging.
I'm curious whether it is ready by the time the callback is invoked.

[...]

> --- a/drivers/leds/led-class.c
> +++ b/drivers/leds/led-class.c
> @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev,
>  {
>  	struct led_classdev *led_cdev = dev_get_drvdata(dev);
>
> -	/* no lock needed for this */

>> also you missed this.

> +	mutex_lock(&led_cdev->led_access);
>  	led_update_brightness(led_cdev);
> +	mutex_unlock(&led_cdev->led_access);
>
>  	return sprintf(buf, "%u\n", led_cdev->brightness);
>  }

* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex
From: Mukesh Ojha @ 2024-10-15 19:27 UTC
To: anish kumar; +Cc: Pavel Machek, Lee Jones, linux-leds, linux-kernel

On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote:
> On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
> >
> > A NULL pointer dereference is observed when Process A adds a HID device,
> > which registers a led_cdev, and Process B then reads a led_cdev attribute
> > before that registration has completed.
>
> Which pointer is NULL? The call stack suggests that
> dualshock4_led_get_brightness() could be the culprit?

In dualshock4_led_get_brightness() [1], led->dev is NULL, because [2]
has not yet completed.

[1]
struct hid_device *hdev = to_hid_device(led->dev->parent);

[2]
led_cdev->dev = device_create_with_groups(&leds_class, parent, 0,
				led_cdev, led_cdev->groups, "%s", final_name);

> >
> > Use the led_cdev->led_access mutex to protect access to led_cdev and its
> > attributes inside brightness_show().
>
> I don't think it is needed here, because it just calls the LED driver
> callback and updates the brightness. So why would we need to serialize
> that with a mutex? Maybe the callback needs some debugging.
> I'm curious whether it is ready by the time the callback is invoked.

Because we should not be allowed to access led_cdev->dev before its
setup has completed. Since brightness_store() already takes this lock,
brightness_show() should take it too, as we see the issue without it.

I hope the above answers your question.

-Mukesh

[...]

* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex
From: anish kumar @ 2024-10-15 22:28 UTC
To: Mukesh Ojha; +Cc: Pavel Machek, Lee Jones, linux-leds, linux-kernel

On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote:

[...]

> > > @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev,
> > >  {
> > >  	struct led_classdev *led_cdev = dev_get_drvdata(dev);
> > >
> > > -	/* no lock needed for this */

Just get rid of the above comment then.

Also, the comment below in leds.h needs an update, as the original idea
for this mutex was to provide quick feedback to userspace, based on
this commit:
https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb

Basically, add a comment somewhere so that when a new attribute gets
added, it does not repeat the mistake of not taking the mutex and run
into the same issue.

/* Ensures consistent access to the LED Flash Class device */
struct mutex led_access;

[...]

* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex
From: Mukesh Ojha @ 2024-10-16 5:45 UTC
To: anish kumar; +Cc: Pavel Machek, Lee Jones, linux-leds, linux-kernel

On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote:

[...]

> > > > -	/* no lock needed for this */
>
> Just get rid of the above comment then.

If you look, it is already removed (-).

>
> Also, the comment below in leds.h needs an update, as the original idea
> for this mutex was to provide quick feedback to userspace, based on
> this commit:
> https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb
>
> Basically, add a comment somewhere so that when a new attribute gets
> added, it does not repeat the mistake of not taking the mutex and run
> into the same issue.
>
> /* Ensures consistent access to the LED Flash Class device */
> struct mutex led_access;

Thanks for accepting that it is an issue.

I think the comment is obvious enough as it is. Actually, the patch you
mentioned should go in a Fixes tag, as it introduced the lock but did
not protect show, while it does protect store.

Fixes: acd899e4f306 ("leds: implement sysfs interface locking mechanism")

-Mukesh

[...]

* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex
From: anish kumar @ 2024-10-16 16:37 UTC
To: Mukesh Ojha; +Cc: Pavel Machek, Lee Jones, linux-leds, linux-kernel

On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
>
> On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote:
> > On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
> > >
> > > On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote:
> > > > On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote:
> > > > >
> > > > > A NULL pointer dereference is observed when Process A adds a HID device,
> > > > > which registers a led_cdev, and Process B then reads a led_cdev attribute
> > > > > before that registration has completed.
> > > >
> > > > Which pointer is NULL? The call stack suggests that
> > > > dualshock4_led_get_brightness() could be the culprit?
> > >
> > > In dualshock4_led_get_brightness() [1], led->dev is NULL, because [2]
> > > has not yet completed.
> > >
> > > [1]
> > > struct hid_device *hdev = to_hid_device(led->dev->parent);
> > >
> > > [2]
> > > led_cdev->dev = device_create_with_groups(&leds_class, parent, 0,
> > >				led_cdev, led_cdev->groups, "%s", final_name);
> > >
> > > > >
> > > > > Use the led_cdev->led_access mutex to protect access to led_cdev and its
> > > > > attributes inside brightness_show().
> > > >
> > > > I don't think it is needed here, because it just calls the LED driver
> > > > callback and updates the brightness. So why would we need to serialize
> > > > that with a mutex? Maybe the callback needs some debugging.
> > > > I'm curious whether it is ready by the time the callback is invoked.
> > > > > > Because, we should not be allowed to access led_cdev->dev as it is not > > > completed and since, brightness_store() has this lock brightness_show() > > > should also have this as we are seeing the issue without it. > > > > > > I hope, above might have answered your question. > > > > > > -Mukesh > > > > > > > > > > > > > > Process A Process B > > > > > > > > > > kthread+0x114 > > > > > worker_thread+0x244 > > > > > process_scheduled_works+0x248 > > > > > uhid_device_add_worker+0x24 > > > > > hid_add_device+0x120 > > > > > device_add+0x268 > > > > > bus_probe_device+0x94 > > > > > device_initial_probe+0x14 > > > > > __device_attach+0xfc > > > > > bus_for_each_drv+0x10c > > > > > __device_attach_driver+0x14c > > > > > driver_probe_device+0x3c > > > > > __driver_probe_device+0xa0 > > > > > really_probe+0x190 > > > > > hid_device_probe+0x130 > > > > > ps_probe+0x990 > > > > > ps_led_register+0x94 > > > > > devm_led_classdev_register_ext+0x58 > > > > > led_classdev_register_ext+0x1f8 > > > > > device_create_with_groups+0x48 > > > > > device_create_groups_vargs+0xc8 > > > > > device_add+0x244 > > > > > kobject_uevent+0x14 > > > > > kobject_uevent_env[jt]+0x224 > > > > > mutex_unlock[jt]+0xc4 > > > > > __mutex_unlock_slowpath+0xd4 > > > > > wake_up_q+0x70 > > > > > try_to_wake_up[jt]+0x48c > > > > > preempt_schedule_common+0x28 > > > > > __schedule+0x628 > > > > > __switch_to+0x174 > > > > > el0t_64_sync+0x1a8/0x1ac > > > > > el0t_64_sync_handler+0x68/0xbc > > > > > el0_svc+0x38/0x68 > > > > > do_el0_svc+0x1c/0x28 > > > > > el0_svc_common+0x80/0xe0 > > > > > invoke_syscall+0x58/0x114 > > > > > __arm64_sys_read+0x1c/0x2c > > > > > ksys_read+0x78/0xe8 > > > > > vfs_read+0x1e0/0x2c8 > > > > > kernfs_fop_read_iter+0x68/0x1b4 > > > > > seq_read_iter+0x158/0x4ec > > > > > kernfs_seq_show+0x44/0x54 > > > > > sysfs_kf_seq_show+0xb4/0x130 > > > > > dev_attr_show+0x38/0x74 > > > > > brightness_show+0x20/0x4c > > > > > dualshock4_led_get_brightness+0xc/0x74 > > > > > > > 
> > > [ 3313.874295][ T4013] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060 > > > > > [ 3313.874301][ T4013] Mem abort info: > > > > > [ 3313.874303][ T4013] ESR = 0x0000000096000006 > > > > > [ 3313.874305][ T4013] EC = 0x25: DABT (current EL), IL = 32 bits > > > > > [ 3313.874307][ T4013] SET = 0, FnV = 0 > > > > > [ 3313.874309][ T4013] EA = 0, S1PTW = 0 > > > > > [ 3313.874311][ T4013] FSC = 0x06: level 2 translation fault > > > > > [ 3313.874313][ T4013] Data abort info: > > > > > [ 3313.874314][ T4013] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 > > > > > [ 3313.874316][ T4013] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > > > > > [ 3313.874318][ T4013] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > > > > > [ 3313.874320][ T4013] user pgtable: 4k pages, 39-bit VAs, pgdp=00000008f2b0a000 > > > > > .. > > > > > > > > > > [ 3313.874332][ T4013] Dumping ftrace buffer: > > > > > [ 3313.874334][ T4013] (ftrace buffer empty) > > > > > .. > > > > > .. > > > > > [ dd3313.874639][ T4013] CPU: 6 PID: 4013 Comm: InputReader > > > > > [ 3313.874648][ T4013] pc : dualshock4_led_get_brightness+0xc/0x74 > > > > > [ 3313.874653][ T4013] lr : led_update_brightness+0x38/0x60 > > > > > [ 3313.874656][ T4013] sp : ffffffc0b910bbd0 > > > > > .. > > > > > .. 
> > > > > [ 3313.874685][ T4013] Call trace: > > > > > [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 > > > > > [ 3313.874690][ T4013] brightness_show+0x20/0x4c > > > > > [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 > > > > > [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 > > > > > [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 > > > > > [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec > > > > > [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 > > > > > [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 > > > > > [ 3313.874711][ T4013] ksys_read+0x78/0xe8 > > > > > [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c > > > > > [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 > > > > > [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 > > > > > [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 > > > > > [ 3313.874727][ T4013] el0_svc+0x38/0x68 > > > > > [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc > > > > > [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac > > > > > > > > > > Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> > > > > > --- > > > > > drivers/leds/led-class.c | 3 ++- > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c > > > > > index 06b97fd49ad9..e3cb93f19c06 100644 > > > > > --- a/drivers/leds/led-class.c > > > > > +++ b/drivers/leds/led-class.c > > > > > @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev, > > > > > { > > > > > struct led_classdev *led_cdev = dev_get_drvdata(dev); > > > > > > > > > > - /* no lock needed for this */ > > > > just get rid of the above comment then. > > If you notice, it is already removed (-) . 
> > > > > Also, the comment below in file leds.h > > needs an update as originally the idea for this mutex lock was to > > provide quick feedback to userspace based on this commit > > https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb > > > > Basically a comment somewhere so that when a new attribute > > gets added, it doesn't make the same mistake of not using the mutex > > and run into the same issue. > > > > /* Ensures consistent access to the LED Flash Class device */ > > struct mutex led_access; > > Thanks for accepting that it is an issue. > I think, comment is very obvious actually the patch you mentioned should > be in fixes tag as it introduced the lock but did not protect the show > while it does it for store. Yes, but that patch was added to support the flash class device and was not explicitly meant to handle the scenario you are addressing, and the above comment in leds.h says as much. I think we should modify that comment and state clearly that the aforementioned mutex also protects access to led_cdev->dev, either here in this .h or where the attributes are defined, so that newly added attributes don't suffer from the same bug. led_trigger_set() also suffers from the same bug, so you need to handle it the same way. > > Fixes: acd899e4f306 ("leds: implement sysfs interface locking mechanism") > > -Mukesh > > > > > > > > > > > > >> also you missed this. > > > > > > > > > + mutex_lock(&led_cdev->led_access); > > > > > led_update_brightness(led_cdev); > > > > > + mutex_unlock(&led_cdev->led_access); > > > > > > > > > > return sprintf(buf, "%u\n", led_cdev->brightness); > > > > > } > > > > > -- > > > > > 2.34.1 > > > > > > > > > > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex 2024-10-16 16:37 ` anish kumar @ 2024-10-17 12:12 ` Jacek Anaszewski 2024-10-17 16:41 ` anish kumar 0 siblings, 1 reply; 12+ messages in thread From: Jacek Anaszewski @ 2024-10-17 12:12 UTC (permalink / raw) To: anish kumar, Mukesh Ojha Cc: Pavel Machek, Lee Jones, linux-leds, linux-kernel Hi Anish and Mukesh, On 10/16/24 18:37, anish kumar wrote: > On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >> >> On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote: >>> On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>> >>>> On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote: >>>>> On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>>>> >>>>>> There is NULL pointer issue observed if from Process A where hid device >>>>>> being added which results in adding a led_cdev addition and later a >>>>>> another call to access of led_cdev attribute from Process B can result >>>>>> in NULL pointer issue. >>>>> >>>>> Which pointer is NULL? Call stack shows that dualshock4_led_get_brightness >>>>> function could be culprit? >>>> >>>> in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] >>>> is not yet completed. >>>> >>>> [1] >>>> struct hid_device *hdev = to_hid_device(led->dev->parent); >>>> >>>> [2] >>>> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, >>>> led_cdev, led_cdev->groups, "%s", final_name); >>>> >>>>> >>>>>> >>>>>> Use mutex led_cdev->led_access to protect access to led->cdev and its >>>>>> attribute inside brightness_show(). >>>>> >>>>> I don't think it is needed here because it is just calling the led driver >>>>> callback and updating the brightness. So, why would we need to serialize >>>>> that using mutex? Maybe the callback needs some debugging. >>>>> I'm curious if it is ready by the time the callback is invoked. 
>>>> >>>> Because, we should not be allowed to access led_cdev->dev as it is not >>>> completed and since, brightness_store() has this lock brightness_show() >>>> should also have this as we are seeing the issue without it. >>>> >>>> I hope, above might have answered your question. >>>> >>>> -Mukesh >>>>> >>>>>> >>>>>> Process A Process B >>>>>> >>>>>> kthread+0x114 >>>>>> worker_thread+0x244 >>>>>> process_scheduled_works+0x248 >>>>>> uhid_device_add_worker+0x24 >>>>>> hid_add_device+0x120 >>>>>> device_add+0x268 >>>>>> bus_probe_device+0x94 >>>>>> device_initial_probe+0x14 >>>>>> __device_attach+0xfc >>>>>> bus_for_each_drv+0x10c >>>>>> __device_attach_driver+0x14c >>>>>> driver_probe_device+0x3c >>>>>> __driver_probe_device+0xa0 >>>>>> really_probe+0x190 >>>>>> hid_device_probe+0x130 >>>>>> ps_probe+0x990 >>>>>> ps_led_register+0x94 >>>>>> devm_led_classdev_register_ext+0x58 >>>>>> led_classdev_register_ext+0x1f8 >>>>>> device_create_with_groups+0x48 >>>>>> device_create_groups_vargs+0xc8 >>>>>> device_add+0x244 >>>>>> kobject_uevent+0x14 >>>>>> kobject_uevent_env[jt]+0x224 >>>>>> mutex_unlock[jt]+0xc4 >>>>>> __mutex_unlock_slowpath+0xd4 >>>>>> wake_up_q+0x70 >>>>>> try_to_wake_up[jt]+0x48c >>>>>> preempt_schedule_common+0x28 >>>>>> __schedule+0x628 >>>>>> __switch_to+0x174 >>>>>> el0t_64_sync+0x1a8/0x1ac >>>>>> el0t_64_sync_handler+0x68/0xbc >>>>>> el0_svc+0x38/0x68 >>>>>> do_el0_svc+0x1c/0x28 >>>>>> el0_svc_common+0x80/0xe0 >>>>>> invoke_syscall+0x58/0x114 >>>>>> __arm64_sys_read+0x1c/0x2c >>>>>> ksys_read+0x78/0xe8 >>>>>> vfs_read+0x1e0/0x2c8 >>>>>> kernfs_fop_read_iter+0x68/0x1b4 >>>>>> seq_read_iter+0x158/0x4ec >>>>>> kernfs_seq_show+0x44/0x54 >>>>>> sysfs_kf_seq_show+0xb4/0x130 >>>>>> dev_attr_show+0x38/0x74 >>>>>> brightness_show+0x20/0x4c >>>>>> dualshock4_led_get_brightness+0xc/0x74 >>>>>> >>>>>> [ 3313.874295][ T4013] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060 >>>>>> [ 3313.874301][ T4013] Mem abort info: 
>>>>>> [ 3313.874303][ T4013] ESR = 0x0000000096000006 >>>>>> [ 3313.874305][ T4013] EC = 0x25: DABT (current EL), IL = 32 bits >>>>>> [ 3313.874307][ T4013] SET = 0, FnV = 0 >>>>>> [ 3313.874309][ T4013] EA = 0, S1PTW = 0 >>>>>> [ 3313.874311][ T4013] FSC = 0x06: level 2 translation fault >>>>>> [ 3313.874313][ T4013] Data abort info: >>>>>> [ 3313.874314][ T4013] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 >>>>>> [ 3313.874316][ T4013] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 >>>>>> [ 3313.874318][ T4013] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 >>>>>> [ 3313.874320][ T4013] user pgtable: 4k pages, 39-bit VAs, pgdp=00000008f2b0a000 >>>>>> .. >>>>>> >>>>>> [ 3313.874332][ T4013] Dumping ftrace buffer: >>>>>> [ 3313.874334][ T4013] (ftrace buffer empty) >>>>>> .. >>>>>> .. >>>>>> [ dd3313.874639][ T4013] CPU: 6 PID: 4013 Comm: InputReader >>>>>> [ 3313.874648][ T4013] pc : dualshock4_led_get_brightness+0xc/0x74 >>>>>> [ 3313.874653][ T4013] lr : led_update_brightness+0x38/0x60 >>>>>> [ 3313.874656][ T4013] sp : ffffffc0b910bbd0 >>>>>> .. >>>>>> .. 
>>>>>> [ 3313.874685][ T4013] Call trace: >>>>>> [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 >>>>>> [ 3313.874690][ T4013] brightness_show+0x20/0x4c >>>>>> [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 >>>>>> [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 >>>>>> [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 >>>>>> [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec >>>>>> [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 >>>>>> [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 >>>>>> [ 3313.874711][ T4013] ksys_read+0x78/0xe8 >>>>>> [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c >>>>>> [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 >>>>>> [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 >>>>>> [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 >>>>>> [ 3313.874727][ T4013] el0_svc+0x38/0x68 >>>>>> [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc >>>>>> [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac >>>>>> >>>>>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> >>>>>> --- >>>>>> drivers/leds/led-class.c | 3 ++- >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c >>>>>> index 06b97fd49ad9..e3cb93f19c06 100644 >>>>>> --- a/drivers/leds/led-class.c >>>>>> +++ b/drivers/leds/led-class.c >>>>>> @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev, >>>>>> { >>>>>> struct led_classdev *led_cdev = dev_get_drvdata(dev); >>>>>> >>>>>> - /* no lock needed for this */ >>> >>> just get rid of the above comment then. >> >> If you notice, it is already removed (-) . 
>> >>> >>> Also, the comment below in file leds.h >>> needs an update as originally the idea for this mutex lock was to >>> provide quick feedback to userspace based on this commit >>> https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb >>> >>> Basically a comment somewhere so that when a new attribute >>> gets added, it doesn't make the same mistake of not using the mutex >>> and run into the same issue. >>> >>> /* Ensures consistent access to the LED Flash Class device */ >>> struct mutex led_access; >> >> Thanks for accepting that it is an issue. >> I think, comment is very obvious actually the patch you mentioned should >> be in fixes tag as it introduced the lock but did not protect the show >> while it does it for store. > > Yes, but that patch was added for supporting flash class > device and wasn't explicitly to take care of the scenario that you > are trying to handle and the above comment in leds.h states the same. Correct. led_access mutex was introduced to add support for preventing any LED class device state changes originating from sysfs while v4l2_flash wrapper owns the device. Since the inception of LED subsystem all the locking was deemed to be the responsibility of every single LED class driver and initially sysfs attr callbacks didn't have any locking. After some time when LED core started to grow it turned out that it was required to lock the LED class initialization sequence, so as not to give the userspace an opportunity to set LED brightness on not fully initialized device, which was introduced in [0]. led_access mutex was already in place so it was used. However as you noticed, it is not used consistently across all LED class sysfs attrs callbacks. Since brightness_show() does not acquire led_access mutex it is still possible to call brightness_get op when LED class initialization sequence is not yet finished. 
Still, I'd propose to first narrow down the issue and figure out what actually causes NULL pointer dereference, as it apparently originates from dualshock4_led_get_brightness and not from LED core. I bet that the driver is not fully initialized up to the point when devm_led_classdev_register_ext() is called in it. > > I think we should modify that comment and state clearly that > the aforementioned mutex is also to handle access to led_cdev->dev. > Either here in this .h or where attributes are defined, so that new attributes > that get added doesn't suffer from the same bug. > > led_trigger_set also this function also suffers from the same bug so you > need to handle it the same way. led_trigger_set() is already called with led_access mutex held in led_trigger_write(), i.e. from "trigger" sysfs attr. >> >> Fixes: acd899e4f306 ("leds: implement sysfs interface locking mechanism") >> >> -Mukesh >>> >>> >>>>> >>>>>>> also you missed this. >>>>> >>>>>> + mutex_lock(&led_cdev->led_access); >>>>>> led_update_brightness(led_cdev); >>>>>> + mutex_unlock(&led_cdev->led_access); >>>>>> >>>>>> return sprintf(buf, "%u\n", led_cdev->brightness); >>>>>> } >>>>>> -- >>>>>> 2.34.1 >>>>>> >>>>>> > [0] https://lore.kernel.org/linux-leds/20180523222221.27621-1-lhenriques@suse.com/ -- Best regards, Jacek Anaszewski ^ permalink raw reply [flat|nested] 12+ messages in thread
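The two roles of led_access described in the message above (rejecting sysfs writes while another owner such as the v4l2_flash wrapper holds the device, and serializing attribute callbacks) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: a pthread mutex stands in for the kernel mutex, and `struct model_led`, `model_sysfs_set_disabled()`, `model_brightness_store()` and `model_brightness_show()` are hypothetical names, not the LED core API.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct led_classdev. */
struct model_led {
	pthread_mutex_t led_access;	/* models led_cdev->led_access */
	bool sysfs_disabled;		/* models "another owner holds the device" */
	unsigned int brightness;
};

static void model_led_init(struct model_led *led)
{
	assert(led != NULL);
	pthread_mutex_init(&led->led_access, NULL);
	led->sysfs_disabled = false;
	led->brightness = 0;
}

/* Models an in-kernel owner taking over or releasing the LED. */
static void model_sysfs_set_disabled(struct model_led *led, bool disabled)
{
	assert(led != NULL);
	pthread_mutex_lock(&led->led_access);
	led->sysfs_disabled = disabled;
	pthread_mutex_unlock(&led->led_access);
}

/* Store path: takes the lock and refuses the write while sysfs is disabled. */
static int model_brightness_store(struct model_led *led, unsigned int value)
{
	int ret = 0;

	assert(led != NULL);
	pthread_mutex_lock(&led->led_access);
	if (led->sysfs_disabled)
		ret = -EBUSY;
	else
		led->brightness = value;
	pthread_mutex_unlock(&led->led_access);
	return ret;
}

/* Show path: takes the same lock, which is the change the patch proposes. */
static unsigned int model_brightness_show(struct model_led *led)
{
	unsigned int value;

	assert(led != NULL);
	pthread_mutex_lock(&led->led_access);
	value = led->brightness;
	pthread_mutex_unlock(&led->led_access);
	return value;
}
```

In this model a store issued while the device is owned elsewhere returns -EBUSY and leaves the value untouched, and show and store can no longer interleave with any other holder of the same lock.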
* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex 2024-10-17 12:12 ` Jacek Anaszewski @ 2024-10-17 16:41 ` anish kumar 2024-10-17 17:58 ` Jacek Anaszewski 0 siblings, 1 reply; 12+ messages in thread From: anish kumar @ 2024-10-17 16:41 UTC (permalink / raw) To: Jacek Anaszewski Cc: Mukesh Ojha, Pavel Machek, Lee Jones, linux-leds, linux-kernel On Thu, Oct 17, 2024 at 5:12 AM Jacek Anaszewski <jacek.anaszewski@gmail.com> wrote: > > Hi Anish and Mukesh, > > On 10/16/24 18:37, anish kumar wrote: > > On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > >> > >> On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote: > >>> On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > >>>> > >>>> On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote: > >>>>> On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > >>>>>> > >>>>>> There is NULL pointer issue observed if from Process A where hid device > >>>>>> being added which results in adding a led_cdev addition and later a > >>>>>> another call to access of led_cdev attribute from Process B can result > >>>>>> in NULL pointer issue. > >>>>> > >>>>> Which pointer is NULL? Call stack shows that dualshock4_led_get_brightness > >>>>> function could be culprit? > >>>> > >>>> in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] > >>>> is not yet completed. > >>>> > >>>> [1] > >>>> struct hid_device *hdev = to_hid_device(led->dev->parent); > >>>> > >>>> [2] > >>>> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, > >>>> led_cdev, led_cdev->groups, "%s", final_name); > >>>> > >>>>> > >>>>>> > >>>>>> Use mutex led_cdev->led_access to protect access to led->cdev and its > >>>>>> attribute inside brightness_show(). > >>>>> > >>>>> I don't think it is needed here because it is just calling the led driver > >>>>> callback and updating the brightness. 
So, why would we need to serialize > >>>>> that using mutex? Maybe the callback needs some debugging. > >>>>> I'm curious if it is ready by the time the callback is invoked. > >>>> > >>>> Because, we should not be allowed to access led_cdev->dev as it is not > >>>> completed and since, brightness_store() has this lock brightness_show() > >>>> should also have this as we are seeing the issue without it. > >>>> > >>>> I hope, above might have answered your question. > >>>> > >>>> -Mukesh > >>>>> > >>>>>> > >>>>>> Process A Process B > >>>>>> > >>>>>> kthread+0x114 > >>>>>> worker_thread+0x244 > >>>>>> process_scheduled_works+0x248 > >>>>>> uhid_device_add_worker+0x24 > >>>>>> hid_add_device+0x120 > >>>>>> device_add+0x268 > >>>>>> bus_probe_device+0x94 > >>>>>> device_initial_probe+0x14 > >>>>>> __device_attach+0xfc > >>>>>> bus_for_each_drv+0x10c > >>>>>> __device_attach_driver+0x14c > >>>>>> driver_probe_device+0x3c > >>>>>> __driver_probe_device+0xa0 > >>>>>> really_probe+0x190 > >>>>>> hid_device_probe+0x130 > >>>>>> ps_probe+0x990 > >>>>>> ps_led_register+0x94 > >>>>>> devm_led_classdev_register_ext+0x58 > >>>>>> led_classdev_register_ext+0x1f8 > >>>>>> device_create_with_groups+0x48 > >>>>>> device_create_groups_vargs+0xc8 > >>>>>> device_add+0x244 > >>>>>> kobject_uevent+0x14 > >>>>>> kobject_uevent_env[jt]+0x224 > >>>>>> mutex_unlock[jt]+0xc4 > >>>>>> __mutex_unlock_slowpath+0xd4 > >>>>>> wake_up_q+0x70 > >>>>>> try_to_wake_up[jt]+0x48c > >>>>>> preempt_schedule_common+0x28 > >>>>>> __schedule+0x628 > >>>>>> __switch_to+0x174 > >>>>>> el0t_64_sync+0x1a8/0x1ac > >>>>>> el0t_64_sync_handler+0x68/0xbc > >>>>>> el0_svc+0x38/0x68 > >>>>>> do_el0_svc+0x1c/0x28 > >>>>>> el0_svc_common+0x80/0xe0 > >>>>>> invoke_syscall+0x58/0x114 > >>>>>> __arm64_sys_read+0x1c/0x2c > >>>>>> ksys_read+0x78/0xe8 > >>>>>> vfs_read+0x1e0/0x2c8 > >>>>>> kernfs_fop_read_iter+0x68/0x1b4 > >>>>>> seq_read_iter+0x158/0x4ec > >>>>>> kernfs_seq_show+0x44/0x54 > >>>>>> 
sysfs_kf_seq_show+0xb4/0x130 > >>>>>> dev_attr_show+0x38/0x74 > >>>>>> brightness_show+0x20/0x4c > >>>>>> dualshock4_led_get_brightness+0xc/0x74 > >>>>>> > >>>>>> [ 3313.874295][ T4013] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060 > >>>>>> [ 3313.874301][ T4013] Mem abort info: > >>>>>> [ 3313.874303][ T4013] ESR = 0x0000000096000006 > >>>>>> [ 3313.874305][ T4013] EC = 0x25: DABT (current EL), IL = 32 bits > >>>>>> [ 3313.874307][ T4013] SET = 0, FnV = 0 > >>>>>> [ 3313.874309][ T4013] EA = 0, S1PTW = 0 > >>>>>> [ 3313.874311][ T4013] FSC = 0x06: level 2 translation fault > >>>>>> [ 3313.874313][ T4013] Data abort info: > >>>>>> [ 3313.874314][ T4013] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 > >>>>>> [ 3313.874316][ T4013] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > >>>>>> [ 3313.874318][ T4013] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > >>>>>> [ 3313.874320][ T4013] user pgtable: 4k pages, 39-bit VAs, pgdp=00000008f2b0a000 > >>>>>> .. > >>>>>> > >>>>>> [ 3313.874332][ T4013] Dumping ftrace buffer: > >>>>>> [ 3313.874334][ T4013] (ftrace buffer empty) > >>>>>> .. > >>>>>> .. > >>>>>> [ dd3313.874639][ T4013] CPU: 6 PID: 4013 Comm: InputReader > >>>>>> [ 3313.874648][ T4013] pc : dualshock4_led_get_brightness+0xc/0x74 > >>>>>> [ 3313.874653][ T4013] lr : led_update_brightness+0x38/0x60 > >>>>>> [ 3313.874656][ T4013] sp : ffffffc0b910bbd0 > >>>>>> .. > >>>>>> .. 
> >>>>>> [ 3313.874685][ T4013] Call trace: > >>>>>> [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 > >>>>>> [ 3313.874690][ T4013] brightness_show+0x20/0x4c > >>>>>> [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 > >>>>>> [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 > >>>>>> [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 > >>>>>> [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec > >>>>>> [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 > >>>>>> [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 > >>>>>> [ 3313.874711][ T4013] ksys_read+0x78/0xe8 > >>>>>> [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c > >>>>>> [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 > >>>>>> [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 > >>>>>> [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 > >>>>>> [ 3313.874727][ T4013] el0_svc+0x38/0x68 > >>>>>> [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc > >>>>>> [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac > >>>>>> > >>>>>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> > >>>>>> --- > >>>>>> drivers/leds/led-class.c | 3 ++- > >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>>>>> > >>>>>> diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c > >>>>>> index 06b97fd49ad9..e3cb93f19c06 100644 > >>>>>> --- a/drivers/leds/led-class.c > >>>>>> +++ b/drivers/leds/led-class.c > >>>>>> @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev, > >>>>>> { > >>>>>> struct led_classdev *led_cdev = dev_get_drvdata(dev); > >>>>>> > >>>>>> - /* no lock needed for this */ > >>> > >>> just get rid of the above comment then. > >> > >> If you notice, it is already removed (-) . 
> >> > >>> > >>> Also, the comment below in file leds.h > >>> needs an update as originally the idea for this mutex lock was to > >>> provide quick feedback to userspace based on this commit > >>> https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb > >>> > >>> Basically a comment somewhere so that when a new attribute > >>> gets added, it doesn't make the same mistake of not using the mutex > >>> and run into the same issue. > >>> > >>> /* Ensures consistent access to the LED Flash Class device */ > >>> struct mutex led_access; > >> > >> Thanks for accepting that it is an issue. > >> I think, comment is very obvious actually the patch you mentioned should > >> be in fixes tag as it introduced the lock but did not protect the show > >> while it does it for store. > > > > Yes, but that patch was added for supporting flash class > > device and wasn't explicitly to take care of the scenario that you > > are trying to handle and the above comment in leds.h states the same. > > Correct. led_access mutex was introduced to add support for preventing > any LED class device state changes originating from sysfs while > v4l2_flash wrapper owns the device. > > Since the inception of LED subsystem all the locking was deemed to be > the responsibility of every single LED class driver and initially sysfs > attr callbacks didn't have any locking. After some time when LED core > started to grow it turned out that it was required to lock the LED class > initialization sequence, so as not to give the userspace an opportunity > to set LED brightness on not fully initialized device, which was > introduced in [0]. led_access mutex was already in place so it was used. > However as you noticed, it is not used consistently across all LED class > sysfs attrs callbacks. > > Since brightness_show() does not acquire led_access mutex it is still > possible to call brightness_get op when LED class initialization > sequence is not yet finished. 
> > Still, I'd propose to first narrow down the issue and figure out what > actually causes NULL pointer dereference, as it apparently > originates from dualshock4_led_get_brightness and not from LED core. Mukesh already explained the issue in earlier emails, but here is the gist anyway. led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, led_cdev, led_cdev->groups, "%s", final_name); In the above code, device_create_with_groups() creates all the sysfs attributes before it returns, so those sysfs callbacks can be triggered before led_cdev->dev has been assigned. If a callback dereferences led_cdev->dev at that point, it crashes because the pointer is not yet assigned. In my opinion, we just have to put a proper comment where the attributes are created, so that any newly added attribute uses the lock. > > I bet that the driver is not fully initialized up to the point when > devm_led_classdev_register_ext() is called in it. > > > > > I think we should modify that comment and state clearly that > > the aforementioned mutex is also to handle access to led_cdev->dev. > > Either here in this .h or where attributes are defined, so that new attributes > > that get added doesn't suffer from the same bug. > > > > led_trigger_set also this function also suffers from the same bug so you > > need to handle it the same way. > > led_trigger_set() is already called with led_access mutex held in > led_trigger_write(), i.e. from "trigger" sysfs attr. Makes sense. > > >> > >> Fixes: acd899e4f306 ("leds: implement sysfs interface locking mechanism") > >> > >> -Mukesh > >>> > >>> > >>>>> > >>>>>>> also you missed this.
> >>>>> > >>>>>> + mutex_lock(&led_cdev->led_access); > >>>>>> led_update_brightness(led_cdev); > >>>>>> + mutex_unlock(&led_cdev->led_access); > >>>>>> > >>>>>> return sprintf(buf, "%u\n", led_cdev->brightness); > >>>>>> } > >>>>>> -- > >>>>>> 2.34.1 > >>>>>> > >>>>>> > > > > [0] > https://lore.kernel.org/linux-leds/20180523222221.27621-1-lhenriques@suse.com/ > > -- > Best regards, > Jacek Anaszewski ^ permalink raw reply [flat|nested] 12+ messages in thread
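The window described in the message above (attributes become readable before led_cdev->dev is assigned) can be modelled in userspace to show why taking led_access in the show path closes it. This is a minimal sketch, assuming that registration holds the mutex across the assignment as discussed in this thread. All names here (`struct sim_led`, `sim_register_begin()`, `sim_register_finish()`, `sim_show()`, `sim_reader_thread()`) are hypothetical, and pthreads stand in for kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct device or led_classdev. */
struct sim_device {
	int parent_id;
};

struct sim_led {
	pthread_mutex_t led_access;	/* models led_cdev->led_access */
	struct sim_device *dev;		/* NULL until registration completes */
	struct sim_device storage;
	int last_read;			/* result recorded by the reader thread */
};

static void sim_led_init(struct sim_led *led)
{
	assert(led != NULL);
	pthread_mutex_init(&led->led_access, NULL);
	led->dev = NULL;
	led->last_read = -1;
}

/* Start of registration: from here on, readers may already call sim_show(). */
static void sim_register_begin(struct sim_led *led)
{
	assert(led != NULL);
	pthread_mutex_lock(&led->led_access);
}

/* End of registration: the dev pointer is assigned, then the lock is dropped. */
static void sim_register_finish(struct sim_led *led, int parent_id)
{
	assert(led != NULL);
	led->storage.parent_id = parent_id;
	led->dev = &led->storage;
	pthread_mutex_unlock(&led->led_access);
}

/* Locked show path: cannot observe dev while registration is in flight. */
static int sim_show(struct sim_led *led)
{
	int ret;

	assert(led != NULL);
	pthread_mutex_lock(&led->led_access);
	ret = led->dev ? led->dev->parent_id : -1;
	pthread_mutex_unlock(&led->led_access);
	return ret;
}

/* Reader thread entry point, modelling a userspace read of the attribute. */
static void *sim_reader_thread(void *arg)
{
	struct sim_led *led = arg;

	led->last_read = sim_show(led);
	return NULL;
}
```

A reader started between `sim_register_begin()` and `sim_register_finish()` blocks on the mutex and only sees the assigned pointer. Without the lock in `sim_show()`, the same reader could run inside that window and dereference a NULL `dev`, which is the crash reported in this thread.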
* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex 2024-10-17 16:41 ` anish kumar @ 2024-10-17 17:58 ` Jacek Anaszewski 2024-10-17 20:30 ` anish kumar 0 siblings, 1 reply; 12+ messages in thread From: Jacek Anaszewski @ 2024-10-17 17:58 UTC (permalink / raw) To: anish kumar Cc: Mukesh Ojha, Pavel Machek, Lee Jones, linux-leds, linux-kernel On 10/17/24 18:41, anish kumar wrote: > On Thu, Oct 17, 2024 at 5:12 AM Jacek Anaszewski > <jacek.anaszewski@gmail.com> wrote: >> >> Hi Anish and Mukesh, >> >> On 10/16/24 18:37, anish kumar wrote: >>> On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>> >>>> On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote: >>>>> On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>>>> >>>>>> On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote: >>>>>>> On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>>>>>> >>>>>>>> There is NULL pointer issue observed if from Process A where hid device >>>>>>>> being added which results in adding a led_cdev addition and later a >>>>>>>> another call to access of led_cdev attribute from Process B can result >>>>>>>> in NULL pointer issue. >>>>>>> >>>>>>> Which pointer is NULL? Call stack shows that dualshock4_led_get_brightness >>>>>>> function could be culprit? >>>>>> >>>>>> in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] >>>>>> is not yet completed. >>>>>> >>>>>> [1] >>>>>> struct hid_device *hdev = to_hid_device(led->dev->parent); >>>>>> >>>>>> [2] >>>>>> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, >>>>>> led_cdev, led_cdev->groups, "%s", final_name); >>>>>> >>>>>>> >>>>>>>> >>>>>>>> Use mutex led_cdev->led_access to protect access to led->cdev and its >>>>>>>> attribute inside brightness_show(). >>>>>>> >>>>>>> I don't think it is needed here because it is just calling the led driver >>>>>>> callback and updating the brightness. 
So, why would we need to serialize >>>>>>> that using mutex? Maybe the callback needs some debugging. >>>>>>> I'm curious if it is ready by the time the callback is invoked. >>>>>> >>>>>> Because, we should not be allowed to access led_cdev->dev as it is not >>>>>> completed and since, brightness_store() has this lock brightness_show() >>>>>> should also have this as we are seeing the issue without it. >>>>>> >>>>>> I hope, above might have answered your question. >>>>>> >>>>>> -Mukesh >>>>>>> >>>>>>>> >>>>>>>> Process A Process B >>>>>>>> >>>>>>>> kthread+0x114 >>>>>>>> worker_thread+0x244 >>>>>>>> process_scheduled_works+0x248 >>>>>>>> uhid_device_add_worker+0x24 >>>>>>>> hid_add_device+0x120 >>>>>>>> device_add+0x268 >>>>>>>> bus_probe_device+0x94 >>>>>>>> device_initial_probe+0x14 >>>>>>>> __device_attach+0xfc >>>>>>>> bus_for_each_drv+0x10c >>>>>>>> __device_attach_driver+0x14c >>>>>>>> driver_probe_device+0x3c >>>>>>>> __driver_probe_device+0xa0 >>>>>>>> really_probe+0x190 >>>>>>>> hid_device_probe+0x130 >>>>>>>> ps_probe+0x990 >>>>>>>> ps_led_register+0x94 >>>>>>>> devm_led_classdev_register_ext+0x58 >>>>>>>> led_classdev_register_ext+0x1f8 >>>>>>>> device_create_with_groups+0x48 >>>>>>>> device_create_groups_vargs+0xc8 >>>>>>>> device_add+0x244 >>>>>>>> kobject_uevent+0x14 >>>>>>>> kobject_uevent_env[jt]+0x224 >>>>>>>> mutex_unlock[jt]+0xc4 >>>>>>>> __mutex_unlock_slowpath+0xd4 >>>>>>>> wake_up_q+0x70 >>>>>>>> try_to_wake_up[jt]+0x48c >>>>>>>> preempt_schedule_common+0x28 >>>>>>>> __schedule+0x628 >>>>>>>> __switch_to+0x174 >>>>>>>> el0t_64_sync+0x1a8/0x1ac >>>>>>>> el0t_64_sync_handler+0x68/0xbc >>>>>>>> el0_svc+0x38/0x68 >>>>>>>> do_el0_svc+0x1c/0x28 >>>>>>>> el0_svc_common+0x80/0xe0 >>>>>>>> invoke_syscall+0x58/0x114 >>>>>>>> __arm64_sys_read+0x1c/0x2c >>>>>>>> ksys_read+0x78/0xe8 >>>>>>>> vfs_read+0x1e0/0x2c8 >>>>>>>> kernfs_fop_read_iter+0x68/0x1b4 >>>>>>>> seq_read_iter+0x158/0x4ec >>>>>>>> kernfs_seq_show+0x44/0x54 >>>>>>>> 
sysfs_kf_seq_show+0xb4/0x130 >>>>>>>> dev_attr_show+0x38/0x74 >>>>>>>> brightness_show+0x20/0x4c >>>>>>>> dualshock4_led_get_brightness+0xc/0x74 >>>>>>>> >>>>>>>> [ 3313.874295][ T4013] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060 >>>>>>>> [ 3313.874301][ T4013] Mem abort info: >>>>>>>> [ 3313.874303][ T4013] ESR = 0x0000000096000006 >>>>>>>> [ 3313.874305][ T4013] EC = 0x25: DABT (current EL), IL = 32 bits >>>>>>>> [ 3313.874307][ T4013] SET = 0, FnV = 0 >>>>>>>> [ 3313.874309][ T4013] EA = 0, S1PTW = 0 >>>>>>>> [ 3313.874311][ T4013] FSC = 0x06: level 2 translation fault >>>>>>>> [ 3313.874313][ T4013] Data abort info: >>>>>>>> [ 3313.874314][ T4013] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 >>>>>>>> [ 3313.874316][ T4013] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 >>>>>>>> [ 3313.874318][ T4013] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 >>>>>>>> [ 3313.874320][ T4013] user pgtable: 4k pages, 39-bit VAs, pgdp=00000008f2b0a000 >>>>>>>> .. >>>>>>>> >>>>>>>> [ 3313.874332][ T4013] Dumping ftrace buffer: >>>>>>>> [ 3313.874334][ T4013] (ftrace buffer empty) >>>>>>>> .. >>>>>>>> .. >>>>>>>> [ dd3313.874639][ T4013] CPU: 6 PID: 4013 Comm: InputReader >>>>>>>> [ 3313.874648][ T4013] pc : dualshock4_led_get_brightness+0xc/0x74 >>>>>>>> [ 3313.874653][ T4013] lr : led_update_brightness+0x38/0x60 >>>>>>>> [ 3313.874656][ T4013] sp : ffffffc0b910bbd0 >>>>>>>> .. >>>>>>>> .. 
>>>>>>>> [ 3313.874685][ T4013] Call trace: >>>>>>>> [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 >>>>>>>> [ 3313.874690][ T4013] brightness_show+0x20/0x4c >>>>>>>> [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 >>>>>>>> [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 >>>>>>>> [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 >>>>>>>> [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec >>>>>>>> [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 >>>>>>>> [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 >>>>>>>> [ 3313.874711][ T4013] ksys_read+0x78/0xe8 >>>>>>>> [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c >>>>>>>> [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 >>>>>>>> [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 >>>>>>>> [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 >>>>>>>> [ 3313.874727][ T4013] el0_svc+0x38/0x68 >>>>>>>> [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc >>>>>>>> [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac >>>>>>>> >>>>>>>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> >>>>>>>> --- >>>>>>>> drivers/leds/led-class.c | 3 ++- >>>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>>>> >>>>>>>> diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c >>>>>>>> index 06b97fd49ad9..e3cb93f19c06 100644 >>>>>>>> --- a/drivers/leds/led-class.c >>>>>>>> +++ b/drivers/leds/led-class.c >>>>>>>> @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev, >>>>>>>> { >>>>>>>> struct led_classdev *led_cdev = dev_get_drvdata(dev); >>>>>>>> >>>>>>>> - /* no lock needed for this */ >>>>> >>>>> just get rid of the above comment then. >>>> >>>> If you notice, it is already removed (-) . 
>>>> >>>>> >>>>> Also, the comment below in file leds.h >>>>> needs an update as originally the idea for this mutex lock was to >>>>> provide quick feedback to userspace based on this commit >>>>> https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb >>>>> >>>>> Basically a comment somewhere so that when a new attribute >>>>> gets added, it doesn't make the same mistake of not using the mutex >>>>> and run into the same issue. >>>>> >>>>> /* Ensures consistent access to the LED Flash Class device */ >>>>> struct mutex led_access; >>>> >>>> Thanks for accepting that it is an issue. >>>> I think, comment is very obvious actually the patch you mentioned should >>>> be in fixes tag as it introduced the lock but did not protect the show >>>> while it does it for store. >>> >>> Yes, but that patch was added for supporting flash class >>> device and wasn't explicitly to take care of the scenario that you >>> are trying to handle and the above comment in leds.h states the same. >> >> Correct. led_access mutex was introduced to add support for preventing >> any LED class device state changes originating from sysfs while >> v4l2_flash wrapper owns the device. >> >> Since the inception of LED subsystem all the locking was deemed to be >> the responsibility of every single LED class driver and initially sysfs >> attr callbacks didn't have any locking. After some time when LED core >> started to grow it turned out that it was required to lock the LED class >> initialization sequence, so as not to give the userspace an opportunity >> to set LED brightness on not fully initialized device, which was >> introduced in [0]. led_access mutex was already in place so it was used. >> However as you noticed, it is not used consistently across all LED class >> sysfs attrs callbacks. >> >> Since brightness_show() does not acquire led_access mutex it is still >> possible to call brightness_get op when LED class initialization >> sequence is not yet finished. 
>> >> Still, I'd propose to first narrow down the issue and figure out what >> actually causes NULL pointer dereference, as it apparently >> originates from dualshock4_led_get_brightness and not from LED core. > > Mukesh already explained the issue in earlier emails but here is the gist > anyway. > > led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, > led_cdev, led_cdev->groups, "%s", final_name); > > If you look at the above code, device_create_with_groups function > can create all the sysfs and before it returns and assigns led_cdev->dev > pointer, those sysfs callback can get triggered and if the callback > accesses led_cdev->dev this variable, it will crash as it is not yet > assigned. Your trace ends in dualshock4_led_get_brightness(). Did you confirm that NULL pointer dereference is caused by accessing led_cdev->dev there? -- Best regards, Jacek Anaszewski ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex 2024-10-17 17:58 ` Jacek Anaszewski @ 2024-10-17 20:30 ` anish kumar 2024-10-18 20:10 ` Jacek Anaszewski 0 siblings, 1 reply; 12+ messages in thread From: anish kumar @ 2024-10-17 20:30 UTC (permalink / raw) To: Jacek Anaszewski Cc: Mukesh Ojha, Pavel Machek, Lee Jones, linux-leds, linux-kernel On Thu, Oct 17, 2024 at 10:59 AM Jacek Anaszewski <jacek.anaszewski@gmail.com> wrote: > > > > On 10/17/24 18:41, anish kumar wrote: > > On Thu, Oct 17, 2024 at 5:12 AM Jacek Anaszewski > > <jacek.anaszewski@gmail.com> wrote: > >> > >> Hi Anish and Mukesh, > >> > >> On 10/16/24 18:37, anish kumar wrote: > >>> On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > >>>> > >>>> On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote: > >>>>> On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > >>>>>> > >>>>>> On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote: > >>>>>>> On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > >>>>>>>> > >>>>>>>> There is NULL pointer issue observed if from Process A where hid device > >>>>>>>> being added which results in adding a led_cdev addition and later a > >>>>>>>> another call to access of led_cdev attribute from Process B can result > >>>>>>>> in NULL pointer issue. > >>>>>>> > >>>>>>> Which pointer is NULL? Call stack shows that dualshock4_led_get_brightness > >>>>>>> function could be culprit? > >>>>>> > >>>>>> in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] > >>>>>> is not yet completed. 
> >>>>>> > >>>>>> [1] > >>>>>> struct hid_device *hdev = to_hid_device(led->dev->parent); > >>>>>> > >>>>>> [2] > >>>>>> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, > >>>>>> led_cdev, led_cdev->groups, "%s", final_name); > >>>>>> > >>>>>>> > >>>>>>>> > >>>>>>>> Use mutex led_cdev->led_access to protect access to led->cdev and its > >>>>>>>> attribute inside brightness_show(). > >>>>>>> > >>>>>>> I don't think it is needed here because it is just calling the led driver > >>>>>>> callback and updating the brightness. So, why would we need to serialize > >>>>>>> that using mutex? Maybe the callback needs some debugging. > >>>>>>> I'm curious if it is ready by the time the callback is invoked. > >>>>>> > >>>>>> Because, we should not be allowed to access led_cdev->dev as it is not > >>>>>> completed and since, brightness_store() has this lock brightness_show() > >>>>>> should also have this as we are seeing the issue without it. > >>>>>> > >>>>>> I hope, above might have answered your question. 
> >>>>>>>> [ 3313.874685][ T4013] Call trace: > >>>>>>>> [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 > >>>>>>>> [ 3313.874690][ T4013] brightness_show+0x20/0x4c > >>>>>>>> [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 > >>>>>>>> [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 > >>>>>>>> [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 > >>>>>>>> [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec > >>>>>>>> [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 > >>>>>>>> [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 > >>>>>>>> [ 3313.874711][ T4013] ksys_read+0x78/0xe8 > >>>>>>>> [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c > >>>>>>>> [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 > >>>>>>>> [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 > >>>>>>>> [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 > >>>>>>>> [ 3313.874727][ T4013] el0_svc+0x38/0x68 > >>>>>>>> [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc > >>>>>>>> [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac > >>>>>>>> > >>>>>>>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> > >>>>>>>> --- > >>>>>>>> drivers/leds/led-class.c | 3 ++- > >>>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c > >>>>>>>> index 06b97fd49ad9..e3cb93f19c06 100644 > >>>>>>>> --- a/drivers/leds/led-class.c > >>>>>>>> +++ b/drivers/leds/led-class.c > >>>>>>>> @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev, > >>>>>>>> { > >>>>>>>> struct led_classdev *led_cdev = dev_get_drvdata(dev); > >>>>>>>> > >>>>>>>> - /* no lock needed for this */ > >>>>> > >>>>> just get rid of the above comment then. > >>>> > >>>> If you notice, it is already removed (-) . 
> >>>> > >>>>> > >>>>> Also, the comment below in file leds.h > >>>>> needs an update as originally the idea for this mutex lock was to > >>>>> provide quick feedback to userspace based on this commit > >>>>> https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb > >>>>> > >>>>> Basically a comment somewhere so that when a new attribute > >>>>> gets added, it doesn't make the same mistake of not using the mutex > >>>>> and run into the same issue. > >>>>> > >>>>> /* Ensures consistent access to the LED Flash Class device */ > >>>>> struct mutex led_access; > >>>> > >>>> Thanks for accepting that it is an issue. > >>>> I think, comment is very obvious actually the patch you mentioned should > >>>> be in fixes tag as it introduced the lock but did not protect the show > >>>> while it does it for store. > >>> > >>> Yes, but that patch was added for supporting flash class > >>> device and wasn't explicitly to take care of the scenario that you > >>> are trying to handle and the above comment in leds.h states the same. > >> > >> Correct. led_access mutex was introduced to add support for preventing > >> any LED class device state changes originating from sysfs while > >> v4l2_flash wrapper owns the device. > >> > >> Since the inception of LED subsystem all the locking was deemed to be > >> the responsibility of every single LED class driver and initially sysfs > >> attr callbacks didn't have any locking. After some time when LED core > >> started to grow it turned out that it was required to lock the LED class > >> initialization sequence, so as not to give the userspace an opportunity > >> to set LED brightness on not fully initialized device, which was > >> introduced in [0]. led_access mutex was already in place so it was used. > >> However as you noticed, it is not used consistently across all LED class > >> sysfs attrs callbacks. 
> >> > >> Since brightness_show() does not acquire led_access mutex it is still > >> possible to call brightness_get op when LED class initialization > >> sequence is not yet finished. > >> > >> Still, I'd propose to first narrow down the issue and figure out what > >> actually causes NULL pointer dereference, as it apparently > >> originates from dualshock4_led_get_brightness and not from LED core. > > > > Mukesh already explained the issue in earlier emails but here is the gist > > anyway. > > > > led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, > > led_cdev, led_cdev->groups, "%s", final_name); > > > > If you look at the above code, device_create_with_groups function > > can create all the sysfs and before it returns and assigns led_cdev->dev > > pointer, those sysfs callback can get triggered and if the callback > > accesses led_cdev->dev this variable, it will crash as it is not yet > > assigned. > > Your trace ends in dualshock4_led_get_brightness(). Did you confirm that > NULL pointer dereference is caused by accessing led_cdev->dev there? Based on the comment from mukesh, he confirmed that. Relevant comment from him: " in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] is not yet completed. [1] struct hid_device *hdev = to_hid_device(led->dev->parent); [2] led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, led_cdev, led_cdev->groups, "%s", final_name); " > > -- > Best regards, > Jacek Anaszewski ^ permalink raw reply [flat|nested] 12+ messages in thread
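Note for readers of the archive: the window described in the message above (sysfs attributes become readable before the return value of device_create_with_groups() is stored in led_cdev->dev) can be modelled in plain userspace C. This is only a sketch under stated assumptions. fake_dev, fake_led, fake_device_create(), fake_register(), early_reader() and demo_race() are hypothetical names invented for illustration, not kernel APIs, and the "read" is simulated by a direct call instead of a second process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct led_classdev. */
struct fake_dev { int parent_id; };
struct fake_led { struct fake_dev *dev; };

struct fake_dev the_dev = { .parent_id = 42 };
int reader_saw_null;   /* set when a read arrives before led->dev is assigned */

/* Models a sysfs read that lands while the device is still being created. */
static void early_reader(const struct fake_led *led)
{
	if (led->dev == NULL)
		reader_saw_null = 1;   /* a real callback would dereference and oops */
}

/* Models device creation: attributes are readable before the function returns. */
static struct fake_dev *fake_device_create(const struct fake_led *led)
{
	early_reader(led);   /* the window: led->dev is still NULL here */
	return &the_dev;
}

/* Models the registration step that assigns led->dev from the return value. */
void fake_register(struct fake_led *led)
{
	led->dev = fake_device_create(led);
}

/* Runs one registration and reports whether the early read saw NULL. */
int demo_race(void)
{
	struct fake_led led = { NULL };

	fake_register(&led);
	assert(led.dev == &the_dev);   /* valid only after registration returns */
	return reader_saw_null;
}
```

In this model demo_race() reports that the early read observed a NULL dev pointer, which corresponds to the led->dev->parent dereference quoted from dualshock4_led_get_brightness().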
* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex 2024-10-17 20:30 ` anish kumar @ 2024-10-18 20:10 ` Jacek Anaszewski 2024-10-25 10:01 ` Mukesh Ojha 0 siblings, 1 reply; 12+ messages in thread From: Jacek Anaszewski @ 2024-10-18 20:10 UTC (permalink / raw) To: anish kumar Cc: Mukesh Ojha, Pavel Machek, Lee Jones, linux-leds, linux-kernel On 10/17/24 22:30, anish kumar wrote: > On Thu, Oct 17, 2024 at 10:59 AM Jacek Anaszewski > <jacek.anaszewski@gmail.com> wrote: >> >> >> >> On 10/17/24 18:41, anish kumar wrote: >>> On Thu, Oct 17, 2024 at 5:12 AM Jacek Anaszewski >>> <jacek.anaszewski@gmail.com> wrote: >>>> >>>> Hi Anish and Mukesh, >>>> >>>> On 10/16/24 18:37, anish kumar wrote: >>>>> On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>>>> >>>>>> On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote: >>>>>>> On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>>>>>> >>>>>>>> On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote: >>>>>>>>> On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote: >>>>>>>>>> >>>>>>>>>> There is NULL pointer issue observed if from Process A where hid device >>>>>>>>>> being added which results in adding a led_cdev addition and later a >>>>>>>>>> another call to access of led_cdev attribute from Process B can result >>>>>>>>>> in NULL pointer issue. >>>>>>>>> >>>>>>>>> Which pointer is NULL? Call stack shows that dualshock4_led_get_brightness >>>>>>>>> function could be culprit? >>>>>>>> >>>>>>>> in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] >>>>>>>> is not yet completed. 
>>>>>>>> >>>>>>>> [1] >>>>>>>> struct hid_device *hdev = to_hid_device(led->dev->parent); >>>>>>>> >>>>>>>> [2] >>>>>>>> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, >>>>>>>> led_cdev, led_cdev->groups, "%s", final_name); >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Use mutex led_cdev->led_access to protect access to led->cdev and its >>>>>>>>>> attribute inside brightness_show(). >>>>>>>>> >>>>>>>>> I don't think it is needed here because it is just calling the led driver >>>>>>>>> callback and updating the brightness. So, why would we need to serialize >>>>>>>>> that using mutex? Maybe the callback needs some debugging. >>>>>>>>> I'm curious if it is ready by the time the callback is invoked. >>>>>>>> >>>>>>>> Because, we should not be allowed to access led_cdev->dev as it is not >>>>>>>> completed and since, brightness_store() has this lock brightness_show() >>>>>>>> should also have this as we are seeing the issue without it. >>>>>>>> >>>>>>>> I hope, above might have answered your question. 
>>>>>>>>>> [ 3313.874685][ T4013] Call trace: >>>>>>>>>> [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 >>>>>>>>>> [ 3313.874690][ T4013] brightness_show+0x20/0x4c >>>>>>>>>> [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 >>>>>>>>>> [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 >>>>>>>>>> [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 >>>>>>>>>> [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec >>>>>>>>>> [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 >>>>>>>>>> [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 >>>>>>>>>> [ 3313.874711][ T4013] ksys_read+0x78/0xe8 >>>>>>>>>> [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c >>>>>>>>>> [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 >>>>>>>>>> [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 >>>>>>>>>> [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 >>>>>>>>>> [ 3313.874727][ T4013] el0_svc+0x38/0x68 >>>>>>>>>> [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc >>>>>>>>>> [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac >>>>>>>>>> >>>>>>>>>> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> >>>>>>>>>> --- >>>>>>>>>> drivers/leds/led-class.c | 3 ++- >>>>>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c >>>>>>>>>> index 06b97fd49ad9..e3cb93f19c06 100644 >>>>>>>>>> --- a/drivers/leds/led-class.c >>>>>>>>>> +++ b/drivers/leds/led-class.c >>>>>>>>>> @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev, >>>>>>>>>> { >>>>>>>>>> struct led_classdev *led_cdev = dev_get_drvdata(dev); >>>>>>>>>> >>>>>>>>>> - /* no lock needed for this */ >>>>>>> >>>>>>> just get rid of the above comment then. >>>>>> >>>>>> If you notice, it is already removed (-) . 
>>>>>> >>>>>>> >>>>>>> Also, the comment below in file leds.h >>>>>>> needs an update as originally the idea for this mutex lock was to >>>>>>> provide quick feedback to userspace based on this commit >>>>>>> https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb >>>>>>> >>>>>>> Basically a comment somewhere so that when a new attribute >>>>>>> gets added, it doesn't make the same mistake of not using the mutex >>>>>>> and run into the same issue. >>>>>>> >>>>>>> /* Ensures consistent access to the LED Flash Class device */ >>>>>>> struct mutex led_access; >>>>>> >>>>>> Thanks for accepting that it is an issue. >>>>>> I think, comment is very obvious actually the patch you mentioned should >>>>>> be in fixes tag as it introduced the lock but did not protect the show >>>>>> while it does it for store. >>>>> >>>>> Yes, but that patch was added for supporting flash class >>>>> device and wasn't explicitly to take care of the scenario that you >>>>> are trying to handle and the above comment in leds.h states the same. >>>> >>>> Correct. led_access mutex was introduced to add support for preventing >>>> any LED class device state changes originating from sysfs while >>>> v4l2_flash wrapper owns the device. >>>> >>>> Since the inception of LED subsystem all the locking was deemed to be >>>> the responsibility of every single LED class driver and initially sysfs >>>> attr callbacks didn't have any locking. After some time when LED core >>>> started to grow it turned out that it was required to lock the LED class >>>> initialization sequence, so as not to give the userspace an opportunity >>>> to set LED brightness on not fully initialized device, which was >>>> introduced in [0]. led_access mutex was already in place so it was used. >>>> However as you noticed, it is not used consistently across all LED class >>>> sysfs attrs callbacks. 
>>>> >>>> Since brightness_show() does not acquire led_access mutex it is still >>>> possible to call brightness_get op when LED class initialization >>>> sequence is not yet finished. >>>> >>>> Still, I'd propose to first narrow down the issue and figure out what >>>> actually causes NULL pointer dereference, as it apparently >>>> originates from dualshock4_led_get_brightness and not from LED core. >>> >>> Mukesh already explained the issue in earlier emails but here is the gist >>> anyway. >>> >>> led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, >>> led_cdev, led_cdev->groups, "%s", final_name); >>> >>> If you look at the above code, device_create_with_groups function >>> can create all the sysfs and before it returns and assigns led_cdev->dev >>> pointer, those sysfs callback can get triggered and if the callback >>> accesses led_cdev->dev this variable, it will crash as it is not yet >>> assigned. >> >> Your trace ends in dualshock4_led_get_brightness(). Did you confirm that >> NULL pointer dereference is caused by accessing led_cdev->dev there? > > Based on the comment from mukesh, he confirmed that. > > Relevant comment from him: > " > in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] > is not yet completed. > > [1] > struct hid_device *hdev = to_hid_device(led->dev->parent); > > [2] > led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, > led_cdev, led_cdev->groups, "%s", final_name); > " Ah, right, I missed that. So yes, this change in general is justified, but as you mentioned we need to adjust the comment next to the led_access mutex definition. Probably just change s/LED Flash Class/LED class/. And also add the locking to max_brightness_show() to be consistent. -- Best regards, Jacek Anaszewski ^ permalink raw reply [flat|nested] 12+ messages in thread
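Note for readers of the archive: the conclusion reached above is that the reader side should take the same led_access mutex that already covers the LED class initialization sequence. A minimal userspace model of that idea, using pthreads, might look like the sketch below. It is an assumption-laden illustration, not the patch itself: model_dev, model_led, model_register(), model_brightness_show() and run_model() are hypothetical names, and pthread_mutex_t merely plays the role of the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_dev { int brightness; };

struct model_led {
	pthread_mutex_t led_access;  /* plays the role of led_cdev->led_access */
	struct model_dev *dev;       /* plays the role of led_cdev->dev */
	atomic_int attrs_visible;    /* nonzero once the attribute can be read */
	int seen;                    /* value the reader observed */
};

static struct model_dev backing_dev = { .brightness = 7 };

/* Registration: the lock is held from "attribute visible" to "pointer assigned". */
static void *model_register(void *arg)
{
	struct model_led *led = arg;

	pthread_mutex_lock(&led->led_access);
	atomic_store(&led->attrs_visible, 1);  /* a reader may show up from here on */
	sched_yield();                         /* give the reader a chance to race */
	led->dev = &backing_dev;
	pthread_mutex_unlock(&led->led_access);
	return NULL;
}

/* Reader: waits for the attribute, then takes the same lock before using dev. */
static void *model_brightness_show(void *arg)
{
	struct model_led *led = arg;

	while (!atomic_load(&led->attrs_visible))
		sched_yield();

	pthread_mutex_lock(&led->led_access);
	led->seen = led->dev ? led->dev->brightness : -1;  /* -1 means NULL was seen */
	pthread_mutex_unlock(&led->led_access);
	return NULL;
}

/* Runs one registration against one concurrent read; returns what was read. */
int run_model(void)
{
	struct model_led led;
	pthread_t reg, rd;
	int rc;

	pthread_mutex_init(&led.led_access, NULL);
	led.dev = NULL;
	atomic_init(&led.attrs_visible, 0);
	led.seen = 0;

	rc = pthread_create(&rd, NULL, model_brightness_show, &led);
	assert(rc == 0);
	rc = pthread_create(&reg, NULL, model_register, &led);
	assert(rc == 0);

	pthread_join(reg, NULL);
	pthread_join(rd, NULL);
	pthread_mutex_destroy(&led.led_access);
	return led.seen;
}
```

Because the reader cannot acquire the lock until the registering thread has stored the pointer, the model never observes a NULL dev. The same reasoning is why the thread also suggests taking the lock in max_brightness_show() for consistency.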
* Re: [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex 2024-10-18 20:10 ` Jacek Anaszewski @ 2024-10-25 10:01 ` Mukesh Ojha 0 siblings, 0 replies; 12+ messages in thread From: Mukesh Ojha @ 2024-10-25 10:01 UTC (permalink / raw) To: Jacek Anaszewski Cc: anish kumar, Pavel Machek, Lee Jones, linux-leds, linux-kernel On Fri, Oct 18, 2024 at 10:10:39PM +0200, Jacek Anaszewski wrote: > On 10/17/24 22:30, anish kumar wrote: > > On Thu, Oct 17, 2024 at 10:59 AM Jacek Anaszewski > > <jacek.anaszewski@gmail.com> wrote: > > > > > > > > > > > > On 10/17/24 18:41, anish kumar wrote: > > > > On Thu, Oct 17, 2024 at 5:12 AM Jacek Anaszewski > > > > <jacek.anaszewski@gmail.com> wrote: > > > > > > > > > > Hi Anish and Mukesh, > > > > > > > > > > On 10/16/24 18:37, anish kumar wrote: > > > > > > On Tue, Oct 15, 2024 at 10:45 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > > > > > > > > > > > > > > On Tue, Oct 15, 2024 at 03:28:08PM -0700, anish kumar wrote: > > > > > > > > On Tue, Oct 15, 2024 at 12:28 PM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, Oct 15, 2024 at 10:59:12AM -0700, anish kumar wrote: > > > > > > > > > > On Tue, Oct 15, 2024 at 9:26 AM Mukesh Ojha <quic_mojha@quicinc.com> wrote: > > > > > > > > > > > > > > > > > > > > > > There is NULL pointer issue observed if from Process A where hid device > > > > > > > > > > > being added which results in adding a led_cdev addition and later a > > > > > > > > > > > another call to access of led_cdev attribute from Process B can result > > > > > > > > > > > in NULL pointer issue. > > > > > > > > > > > > > > > > > > > > Which pointer is NULL? Call stack shows that dualshock4_led_get_brightness > > > > > > > > > > function could be culprit? > > > > > > > > > > > > > > > > > > in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2] > > > > > > > > > is not yet completed. 
> > > > > > > > > > > > > > > > > > [1] > > > > > > > > > struct hid_device *hdev = to_hid_device(led->dev->parent); > > > > > > > > > > > > > > > > > > [2] > > > > > > > > > led_cdev->dev = device_create_with_groups(&leds_class, parent, 0, > > > > > > > > > led_cdev, led_cdev->groups, "%s", final_name); > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Use mutex led_cdev->led_access to protect access to led->cdev and its > > > > > > > > > > > attribute inside brightness_show(). > > > > > > > > > > > > > > > > > > > > I don't think it is needed here because it is just calling the led driver > > > > > > > > > > callback and updating the brightness. So, why would we need to serialize > > > > > > > > > > that using mutex? Maybe the callback needs some debugging. > > > > > > > > > > I'm curious if it is ready by the time the callback is invoked. > > > > > > > > > > > > > > > > > > Because, we should not be allowed to access led_cdev->dev as it is not > > > > > > > > > completed and since, brightness_store() has this lock brightness_show() > > > > > > > > > should also have this as we are seeing the issue without it. > > > > > > > > > > > > > > > > > > I hope, above might have answered your question. 
> > > > > > > > > > > [ 3313.874685][ T4013] Call trace: > > > > > > > > > > > [ 3313.874687][ T4013] dualshock4_led_get_brightness+0xc/0x74 > > > > > > > > > > > [ 3313.874690][ T4013] brightness_show+0x20/0x4c > > > > > > > > > > > [ 3313.874692][ T4013] dev_attr_show+0x38/0x74 > > > > > > > > > > > [ 3313.874696][ T4013] sysfs_kf_seq_show+0xb4/0x130 > > > > > > > > > > > [ 3313.874700][ T4013] kernfs_seq_show+0x44/0x54 > > > > > > > > > > > [ 3313.874703][ T4013] seq_read_iter+0x158/0x4ec > > > > > > > > > > > [ 3313.874705][ T4013] kernfs_fop_read_iter+0x68/0x1b4 > > > > > > > > > > > [ 3313.874708][ T4013] vfs_read+0x1e0/0x2c8 > > > > > > > > > > > [ 3313.874711][ T4013] ksys_read+0x78/0xe8 > > > > > > > > > > > [ 3313.874714][ T4013] __arm64_sys_read+0x1c/0x2c > > > > > > > > > > > [ 3313.874718][ T4013] invoke_syscall+0x58/0x114 > > > > > > > > > > > [ 3313.874721][ T4013] el0_svc_common+0x80/0xe0 > > > > > > > > > > > [ 3313.874724][ T4013] do_el0_svc+0x1c/0x28 > > > > > > > > > > > [ 3313.874727][ T4013] el0_svc+0x38/0x68 > > > > > > > > > > > [ 3313.874730][ T4013] el0t_64_sync_handler+0x68/0xbc > > > > > > > > > > > [ 3313.874732][ T4013] el0t_64_sync+0x1a8/0x1ac > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> > > > > > > > > > > > --- > > > > > > > > > > > drivers/leds/led-class.c | 3 ++- > > > > > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c > > > > > > > > > > > index 06b97fd49ad9..e3cb93f19c06 100644 > > > > > > > > > > > --- a/drivers/leds/led-class.c > > > > > > > > > > > +++ b/drivers/leds/led-class.c > > > > > > > > > > > @@ -30,8 +30,9 @@ static ssize_t brightness_show(struct device *dev, > > > > > > > > > > > { > > > > > > > > > > > struct led_classdev *led_cdev = dev_get_drvdata(dev); > > > > > > > > > > > > > > > > > > > > > > - /* no lock needed for this */ > > > > > > > > 
> > > > > > > > just get rid of the above comment then.
> > > > > > > >
> > > > > > > If you notice, it is already removed (-) .
> > > > > > >
> > > > > > > >
> > > > > > > > Also, the comment below in file leds.h
> > > > > > > > needs an update as originally the idea for this mutex lock was to
> > > > > > > > provide quick feedback to userspace based on this commit
> > > > > > > > https://github.com/torvalds/linux/commit/acd899e4f3066b6662f6047da5b795cc762093cb
> > > > > > > >
> > > > > > > > Basically a comment somewhere so that when a new attribute
> > > > > > > > gets added, it doesn't make the same mistake of not using the mutex
> > > > > > > > and run into the same issue.
> > > > > > > >
> > > > > > > > /* Ensures consistent access to the LED Flash Class device */
> > > > > > > > struct mutex led_access;
> > > > > > >
> > > > > > > Thanks for accepting that it is an issue.
> > > > > > > I think, comment is very obvious actually the patch you mentioned should
> > > > > > > be in fixes tag as it introduced the lock but did not protect the show
> > > > > > > while it does it for store.
> > > > > >
> > > > > > Yes, but that patch was added for supporting flash class
> > > > > > device and wasn't explicitly to take care of the scenario that you
> > > > > > are trying to handle and the above comment in leds.h states the same.
> > > > >
> > > > > Correct. led_access mutex was introduced to add support for preventing
> > > > > any LED class device state changes originating from sysfs while
> > > > > v4l2_flash wrapper owns the device.
> > > > >
> > > > > Since the inception of LED subsystem all the locking was deemed to be
> > > > > the responsibility of every single LED class driver and initially sysfs
> > > > > attr callbacks didn't have any locking. After some time when LED core
> > > > > started to grow it turned out that it was required to lock the LED class
> > > > > initialization sequence, so as not to give the userspace an opportunity
> > > > > to set LED brightness on not fully initialized device, which was
> > > > > introduced in [0]. led_access mutex was already in place so it was used.
> > > > > However as you noticed, it is not used consistently across all LED class
> > > > > sysfs attrs callbacks.
> > > > >
> > > > > Since brightness_show() does not acquire led_access mutex it is still
> > > > > possible to call brightness_get op when LED class initialization
> > > > > sequence is not yet finished.
> > > > >
> > > > > Still, I'd propose to first narrow down the issue and figure out what
> > > > > actually causes NULL pointer dereference, as it apparently
> > > > > originates from dualshock4_led_get_brightness and not from LED core.
> > > >
> > > > Mukesh already explained the issue in earlier emails but here is the gist
> > > > anyway.
> > > >
> > > > led_cdev->dev = device_create_with_groups(&leds_class, parent, 0,
> > > >                 led_cdev, led_cdev->groups, "%s", final_name);
> > > >
> > > > If you look at the above code, device_create_with_groups function
> > > > can create all the sysfs and before it returns and assigns led_cdev->dev
> > > > pointer, those sysfs callback can get triggered and if the callback
> > > > accesses led_cdev->dev this variable, it will crash as it is not yet
> > > > assigned.
> > >
> > > Your trace ends in dualshock4_led_get_brightness(). Did you confirm that
> > > NULL pointer dereference is caused by accessing led_cdev->dev there?
> >
> > Based on the comment from mukesh, he confirmed that.
> >
> > Relevant comment from him:
> > "
> > in dualshock4_led_get_brightness()[1], led->dev is NULL here, as [2]
> > is not yet completed.
> >
> > [1]
> > struct hid_device *hdev = to_hid_device(led->dev->parent);
> >
> > [2]
> > led_cdev->dev = device_create_with_groups(&leds_class, parent, 0,
> >                 led_cdev, led_cdev->groups, "%s", final_name);
> > "
>
> Ah, right, I missed that. So yes, this change in general is justified,
> but as you mentioned we need to adjust the comment next to the
> led_access mutex definition.
>
> Probably just change s/LED Flash Class/LED class/.
>
> And also add the locking to max_brightness_show() to be consistent.

Yes, max_brightness_show() would also need one.
Will send v2.

-Mukesh

>
> --
> Best regards,
> Jacek Anaszewski
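For readers following the thread, the locking pattern discussed above can be
sketched in userspace C. This is only an illustration and not the kernel code:
fake_device, fake_led_cdev, fake_brightness_show(), reader_thread() and
fake_register() are hypothetical stand-ins, and a pthread mutex plays the role
of led_cdev->led_access. It assumes, as described in the thread, that the
registration path holds the mutex while the device (and with it the sysfs
attributes) is created, so a show callback that also takes the mutex cannot
observe the not-yet-assigned dev pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct fake_device {
	int parent_id;
};

struct fake_led_cdev {
	pthread_mutex_t led_access;	/* plays the role of led_cdev->led_access */
	struct fake_device *dev;	/* NULL until registration assigns it */
	int last_read;			/* value seen by the concurrent reader */
};

/* "show" callback: take the lock before touching cdev->dev. */
static int fake_brightness_show(struct fake_led_cdev *cdev)
{
	int value;

	pthread_mutex_lock(&cdev->led_access);
	value = cdev->dev ? cdev->dev->parent_id : -1;
	pthread_mutex_unlock(&cdev->led_access);
	return value;
}

/* Concurrent reader, standing in for the sysfs read from Process B. */
static void *reader_thread(void *arg)
{
	struct fake_led_cdev *cdev = arg;

	cdev->last_read = fake_brightness_show(cdev);
	return NULL;
}

/*
 * Registration: the reader is started while dev is still NULL, mimicking
 * an attribute that becomes reachable before the dev pointer is assigned.
 * Because the lock is held across that window, the reader blocks until
 * dev is valid. Returns the value the reader saw, or -1 on thread failure.
 */
static int fake_register(struct fake_led_cdev *cdev, struct fake_device *dev)
{
	pthread_t reader;

	assert(cdev != NULL && dev != NULL);
	pthread_mutex_init(&cdev->led_access, NULL);
	cdev->dev = NULL;
	cdev->last_read = -1;

	pthread_mutex_lock(&cdev->led_access);
	if (pthread_create(&reader, NULL, reader_thread, cdev) != 0) {
		pthread_mutex_unlock(&cdev->led_access);
		return -1;
	}
	cdev->dev = dev;		/* the late assignment */
	pthread_mutex_unlock(&cdev->led_access);

	pthread_join(reader, NULL);
	return cdev->last_read;
}
```

If fake_brightness_show() skipped the lock, the reader could run while dev is
still NULL, which corresponds to the unprotected brightness_show() path in the
report. Building the sketch may require -pthread, depending on the toolchain.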
end of thread, other threads:[~2024-10-25 10:01 UTC | newest]

Thread overview: 12+ messages
2024-10-15 16:25 [PATCH] leds: class: Protect brightness_show() with led_cdev->led_access mutex Mukesh Ojha
2024-10-15 17:59 ` anish kumar
2024-10-15 19:27   ` Mukesh Ojha
2024-10-15 22:28     ` anish kumar
2024-10-16  5:45       ` Mukesh Ojha
2024-10-16 16:37         ` anish kumar
2024-10-17 12:12           ` Jacek Anaszewski
2024-10-17 16:41             ` anish kumar
2024-10-17 17:58               ` Jacek Anaszewski
2024-10-17 20:30                 ` anish kumar
2024-10-18 20:10                   ` Jacek Anaszewski
2024-10-25 10:01                     ` Mukesh Ojha