* [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized @ 2022-07-22 8:43 Di Shen 2022-07-22 17:18 ` Rafael J. Wysocki 0 siblings, 1 reply; 8+ messages in thread From: Di Shen @ 2022-07-22 8:43 UTC (permalink / raw) To: rafael, daniel.lezcano, rui.zhang Cc: viresh.kumar, amitk, linux-pm, linux-kernel, jeson.gao, xuewen.yan, cindygm567 There's a space allocated for cooling_device_stats_attr_group within cooling_device_attr_groups. This space is shared by all cooling devices. If the stats structure of one cooling device successfully creates stats sysfs. After that, another cooling device fails to get max_states in cooling_device_stats_setup(). It can return directly without initializing the stats structure, but the cooling_device_stats_attr_group is still the attribute group of the last cooling device. At this time, read or write stats sysfs nodes can cause kernel crash. Like the following, kernel crashed when 'cat time_in_state_ms'. [<5baac8d4>] panic+0x1b4/0x3c8 [<9d287b0f>] arm_notify_die+0x0/0x78 [<094fc22c>] __do_kernel_fault+0x94/0xa4 [<3b4b69a4>] do_page_fault+0xd4/0x364 [<23793e7a>] do_translation_fault+0x38/0xc0 [<6e5cc52a>] do_DataAbort+0x4c/0xd0 [<a28c16b8>] __dabt_svc+0x5c/0xa0 [<747516ae>] _raw_spin_lock+0x20/0x60 [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 [<cb78325e>] dev_attr_show+0x38/0x64 [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 [<c0a843ab>] seq_read+0x244/0x620 [<b316b374>] vfs_read+0xd8/0x218 [<3aebf5fa>] sys_read+0x80/0xe4 [<7cf100f5>] ret_fast_syscall+0x0/0x28 [<08cbe22f>] 0xbe8c1198 stats sysfs: phone:/sys/class/thermal/cooling_device2/stats # ls reset time_in_state_ms total_trans trans_table The same as cat total_trans, trans_table, and echo reset. To avoid kernel crash, this patch set clears the cooling_device_attr_groups before stats structure is initialized. Signed-off-by: Di Shen <di.shen@unisoc.com> --- drivers/thermal/thermal_sysfs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c index 1c4aac8464a7..e3fae63fa0f7 100644 --- a/drivers/thermal/thermal_sysfs.c +++ b/drivers/thermal/thermal_sysfs.c @@ -817,6 +817,9 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) unsigned long states; int var; + var = ARRAY_SIZE(cooling_device_attr_groups) - 2; + cooling_device_attr_groups[var] = NULL; + if (cdev->ops->get_max_state(cdev, &states)) return; -- 2.17.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized 2022-07-22 8:43 [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized Di Shen @ 2022-07-22 17:18 ` Rafael J. Wysocki 2022-07-22 18:42 ` Rafael J. Wysocki 2022-07-26 7:31 ` Di Shen 0 siblings, 2 replies; 8+ messages in thread From: Rafael J. Wysocki @ 2022-07-22 17:18 UTC (permalink / raw) To: Di Shen Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang, Rui, Viresh Kumar, Amit Kucheria, Linux PM, Linux Kernel Mailing List, jeson.gao, xuewen.yan, cindygm567 On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@unisoc.com> wrote: > > There's a space allocated for cooling_device_stats_attr_group > within cooling_device_attr_groups. This space is shared by all > cooling devices. That's correct. > If the stats structure of one cooling device successfully > creates stats sysfs. After that, another cooling device fails > to get max_states in cooling_device_stats_setup(). It can > return directly without initializing the stats structure, but > the cooling_device_stats_attr_group is still the attribute > group of the last cooling device. I cannot parse the above, sorry. For example, how can a "stats structure of one cooling device" create anything? As a data structure, it is a passive entity, so it doesn't carry out any actions. I think (but I am not sure) that you are referring to the error code path in which the ->get_max_state() callback fails for a cooling device after cooling_device_stats_setup() has completed successfully for another one. > At this time, read or write stats sysfs nodes can cause kernel > crash. Like the following, kernel crashed when > 'cat time_in_state_ms'. > > [<5baac8d4>] panic+0x1b4/0x3c8 > [<9d287b0f>] arm_notify_die+0x0/0x78 > [<094fc22c>] __do_kernel_fault+0x94/0xa4 > [<3b4b69a4>] do_page_fault+0xd4/0x364 > [<23793e7a>] do_translation_fault+0x38/0xc0 > [<6e5cc52a>] do_DataAbort+0x4c/0xd0 > [<a28c16b8>] __dabt_svc+0x5c/0xa0 > [<747516ae>] _raw_spin_lock+0x20/0x60 > [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 > [<cb78325e>] dev_attr_show+0x38/0x64 > [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 > [<c0a843ab>] seq_read+0x244/0x620 > [<b316b374>] vfs_read+0xd8/0x218 > [<3aebf5fa>] sys_read+0x80/0xe4 > [<7cf100f5>] ret_fast_syscall+0x0/0x28 > [<08cbe22f>] 0xbe8c1198 > > stats sysfs: > phone:/sys/class/thermal/cooling_device2/stats # ls > reset time_in_state_ms total_trans trans_table > > The same as cat total_trans, trans_table, and echo reset. > > To avoid kernel crash, this patch set clears the > cooling_device_attr_groups before stats structure is initialized. > > Signed-off-by: Di Shen <di.shen@unisoc.com> > --- > drivers/thermal/thermal_sysfs.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c > index 1c4aac8464a7..e3fae63fa0f7 100644 > --- a/drivers/thermal/thermal_sysfs.c > +++ b/drivers/thermal/thermal_sysfs.c > @@ -817,6 +817,9 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > unsigned long states; > int var; > > + var = ARRAY_SIZE(cooling_device_attr_groups) - 2; > + cooling_device_attr_groups[var] = NULL; > + > if (cdev->ops->get_max_state(cdev, &states)) > return; > > -- ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized 2022-07-22 17:18 ` Rafael J. Wysocki @ 2022-07-22 18:42 ` Rafael J. Wysocki 2022-07-26 7:39 ` Di Shen 2022-07-26 7:31 ` Di Shen 1 sibling, 1 reply; 8+ messages in thread From: Rafael J. Wysocki @ 2022-07-22 18:42 UTC (permalink / raw) To: Di Shen Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang, Rui, Viresh Kumar, Amit Kucheria, Linux PM, Linux Kernel Mailing List, jeson.gao, xuewen.yan, cindygm567 On Friday, July 22, 2022 7:18:42 PM CEST Rafael J. Wysocki wrote: > On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@unisoc.com> wrote: > > > > There's a space allocated for cooling_device_stats_attr_group > > within cooling_device_attr_groups. This space is shared by all > > cooling devices. > > That's correct. > > > If the stats structure of one cooling device successfully > > creates stats sysfs. After that, another cooling device fails > > to get max_states in cooling_device_stats_setup(). It can > > return directly without initializing the stats structure, but > > the cooling_device_stats_attr_group is still the attribute > > group of the last cooling device. > > I cannot parse the above, sorry. > > For example, how can a "stats structure of one cooling device" create > anything? As a data structure, it is a passive entity, so it doesn't > carry out any actions. > > I think (but I am not sure) that you are referring to the error code > path in which the ->get_max_state() callback fails for a cooling > device after cooling_device_stats_setup() has completed successfully > for another one. > > > At this time, read or write stats sysfs nodes can cause kernel > > crash. Like the following, kernel crashed when > > 'cat time_in_state_ms'. > > > > [<5baac8d4>] panic+0x1b4/0x3c8 > > [<9d287b0f>] arm_notify_die+0x0/0x78 > > [<094fc22c>] __do_kernel_fault+0x94/0xa4 > > [<3b4b69a4>] do_page_fault+0xd4/0x364 > > [<23793e7a>] do_translation_fault+0x38/0xc0 > > [<6e5cc52a>] do_DataAbort+0x4c/0xd0 > > [<a28c16b8>] __dabt_svc+0x5c/0xa0 > > [<747516ae>] _raw_spin_lock+0x20/0x60 > > [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 > > [<cb78325e>] dev_attr_show+0x38/0x64 > > [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 > > [<c0a843ab>] seq_read+0x244/0x620 > > [<b316b374>] vfs_read+0xd8/0x218 > > [<3aebf5fa>] sys_read+0x80/0xe4 > > [<7cf100f5>] ret_fast_syscall+0x0/0x28 > > [<08cbe22f>] 0xbe8c1198 > > > > stats sysfs: > > phone:/sys/class/thermal/cooling_device2/stats # ls > > reset time_in_state_ms total_trans trans_table > > > > The same as cat total_trans, trans_table, and echo reset. So does the (untested) patch below work too? --- drivers/thermal/thermal_sysfs.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) Index: linux-pm/drivers/thermal/thermal_sysfs.c =================================================================== --- linux-pm.orig/drivers/thermal/thermal_sysfs.c +++ linux-pm/drivers/thermal/thermal_sysfs.c @@ -813,12 +813,13 @@ static const struct attribute_group cool static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) { + const struct attribute_group *stats_attr_group = NULL; struct cooling_dev_stats *stats; unsigned long states; int var; if (cdev->ops->get_max_state(cdev, &states)) - return; + goto out; states++; /* Total number of states is highest state + 1 */ @@ -828,7 +829,7 @@ static void cooling_device_stats_setup(s stats = kzalloc(var, GFP_KERNEL); if (!stats) - return; + goto out; stats->time_in_state = (ktime_t *)(stats + 1); stats->trans_table = (unsigned int *)(stats->time_in_state + states); @@ -838,9 +839,12 @@ static void cooling_device_stats_setup(s spin_lock_init(&stats->lock); + stats_attr_group = &cooling_device_stats_attr_group; + +out: /* Fill the empty slot left in cooling_device_attr_groups */ var = ARRAY_SIZE(cooling_device_attr_groups) - 2; - cooling_device_attr_groups[var] = &cooling_device_stats_attr_group; + cooling_device_attr_groups[var] = stats_attr_group; } static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev) ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized 2022-07-22 18:42 ` Rafael J. Wysocki @ 2022-07-26 7:39 ` Di Shen 2022-07-27 8:16 ` Di Shen 0 siblings, 1 reply; 8+ messages in thread From: Di Shen @ 2022-07-26 7:39 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Di Shen, Rafael J. Wysocki, Daniel Lezcano, Zhang, Rui, Viresh Kumar, Amit Kucheria, Linux PM, Linux Kernel Mailing List, jeson.gao, xuewen.yan, ke.wang On Sat, Jul 23, 2022 at 2:42 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > On Friday, July 22, 2022 7:18:42 PM CEST Rafael J. Wysocki wrote: > > On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@unisoc.com> wrote: > > > > > > There's a space allocated for cooling_device_stats_attr_group > > > within cooling_device_attr_groups. This space is shared by all > > > cooling devices. > > > > That's correct. > > > > > If the stats structure of one cooling device successfully > > > creates stats sysfs. After that, another cooling device fails > > > to get max_states in cooling_device_stats_setup(). It can > > > return directly without initializing the stats structure, but > > > the cooling_device_stats_attr_group is still the attribute > > > group of the last cooling device. > > > > I cannot parse the above, sorry. > > > > For example, how can a "stats structure of one cooling device" create > > anything? As a data structure, it is a passive entity, so it doesn't > > carry out any actions. > > > > I think (but I am not sure) that you are referring to the error code > > path in which the ->get_max_state() callback fails for a cooling > > device after cooling_device_stats_setup() has completed successfully > > for another one. > > > > > At this time, read or write stats sysfs nodes can cause kernel > > > crash. Like the following, kernel crashed when > > > 'cat time_in_state_ms'. > > > > > > [<5baac8d4>] panic+0x1b4/0x3c8 > > > [<9d287b0f>] arm_notify_die+0x0/0x78 > > > [<094fc22c>] __do_kernel_fault+0x94/0xa4 > > > [<3b4b69a4>] do_page_fault+0xd4/0x364 > > > [<23793e7a>] do_translation_fault+0x38/0xc0 > > > [<6e5cc52a>] do_DataAbort+0x4c/0xd0 > > > [<a28c16b8>] __dabt_svc+0x5c/0xa0 > > > [<747516ae>] _raw_spin_lock+0x20/0x60 > > > [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 > > > [<cb78325e>] dev_attr_show+0x38/0x64 > > > [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 > > > [<c0a843ab>] seq_read+0x244/0x620 > > > [<b316b374>] vfs_read+0xd8/0x218 > > > [<3aebf5fa>] sys_read+0x80/0xe4 > > > [<7cf100f5>] ret_fast_syscall+0x0/0x28 > > > [<08cbe22f>] 0xbe8c1198 > > > > > > stats sysfs: > > > phone:/sys/class/thermal/cooling_device2/stats # ls > > > reset time_in_state_ms total_trans trans_table > > > > > > The same as cat total_trans, trans_table, and echo reset. > > So does the (untested) patch below work too? > Yes, I agree with you. I will test it on our platform and give a reply later. Thanks. > --- > drivers/thermal/thermal_sysfs.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > Index: linux-pm/drivers/thermal/thermal_sysfs.c > =================================================================== > --- linux-pm.orig/drivers/thermal/thermal_sysfs.c > +++ linux-pm/drivers/thermal/thermal_sysfs.c > @@ -813,12 +813,13 @@ static const struct attribute_group cool > > static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > { > + const struct attribute_group *stats_attr_group = NULL; > struct cooling_dev_stats *stats; > unsigned long states; > int var; > > if (cdev->ops->get_max_state(cdev, &states)) > - return; > + goto out; > > states++; /* Total number of states is highest state + 1 */ > > @@ -828,7 +829,7 @@ static void cooling_device_stats_setup(s > > stats = kzalloc(var, GFP_KERNEL); > if (!stats) > - return; > + goto out; > > stats->time_in_state = (ktime_t *)(stats + 1); > stats->trans_table = (unsigned int *)(stats->time_in_state + states); > @@ -838,9 +839,12 @@ static void cooling_device_stats_setup(s > > spin_lock_init(&stats->lock); > > + stats_attr_group = &cooling_device_stats_attr_group; > + > +out: > /* Fill the empty slot left in cooling_device_attr_groups */ > var = ARRAY_SIZE(cooling_device_attr_groups) - 2; > - cooling_device_attr_groups[var] = &cooling_device_stats_attr_group; > + cooling_device_attr_groups[var] = stats_attr_group; > } > > static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev) > > > -- Best regards, ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized 2022-07-26 7:39 ` Di Shen @ 2022-07-27 8:16 ` Di Shen 2022-07-27 14:20 ` Rafael J. Wysocki 0 siblings, 1 reply; 8+ messages in thread From: Di Shen @ 2022-07-27 8:16 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Di Shen, Rafael J. Wysocki, Daniel Lezcano, Zhang, Rui, Viresh Kumar, Amit Kucheria, Linux PM, Linux Kernel Mailing List, jeson.gao, xuewen.yan, ke.wang Hi Rafael, I have tested this patch on our platform, and it works. Later, I will send the patch v3. Thanks! -- Di On Tue, Jul 26, 2022 at 3:39 PM Di Shen <cindygm567@gmail.com> wrote: > > On Sat, Jul 23, 2022 at 2:42 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > On Friday, July 22, 2022 7:18:42 PM CEST Rafael J. Wysocki wrote: > > > On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@unisoc.com> wrote: > > > > > > > > There's a space allocated for cooling_device_stats_attr_group > > > > within cooling_device_attr_groups. This space is shared by all > > > > cooling devices. > > > > > > That's correct. > > > > > > > If the stats structure of one cooling device successfully > > > > creates stats sysfs. After that, another cooling device fails > > > > to get max_states in cooling_device_stats_setup(). It can > > > > return directly without initializing the stats structure, but > > > > the cooling_device_stats_attr_group is still the attribute > > > > group of the last cooling device. > > > > > > I cannot parse the above, sorry. > > > > > > For example, how can a "stats structure of one cooling device" create > > > anything? As a data structure, it is a passive entity, so it doesn't > > > carry out any actions. > > > > > > I think (but I am not sure) that you are referring to the error code > > > path in which the ->get_max_state() callback fails for a cooling > > > device after cooling_device_stats_setup() has completed successfully > > > for another one. > > > > > > > At this time, read or write stats sysfs nodes can cause kernel > > > > crash. Like the following, kernel crashed when > > > > 'cat time_in_state_ms'. > > > > > > > > [<5baac8d4>] panic+0x1b4/0x3c8 > > > > [<9d287b0f>] arm_notify_die+0x0/0x78 > > > > [<094fc22c>] __do_kernel_fault+0x94/0xa4 > > > > [<3b4b69a4>] do_page_fault+0xd4/0x364 > > > > [<23793e7a>] do_translation_fault+0x38/0xc0 > > > > [<6e5cc52a>] do_DataAbort+0x4c/0xd0 > > > > [<a28c16b8>] __dabt_svc+0x5c/0xa0 > > > > [<747516ae>] _raw_spin_lock+0x20/0x60 > > > > [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 > > > > [<cb78325e>] dev_attr_show+0x38/0x64 > > > > [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 > > > > [<c0a843ab>] seq_read+0x244/0x620 > > > > [<b316b374>] vfs_read+0xd8/0x218 > > > > [<3aebf5fa>] sys_read+0x80/0xe4 > > > > [<7cf100f5>] ret_fast_syscall+0x0/0x28 > > > > [<08cbe22f>] 0xbe8c1198 > > > > > > > > stats sysfs: > > > > phone:/sys/class/thermal/cooling_device2/stats # ls > > > > reset time_in_state_ms total_trans trans_table > > > > > > > > The same as cat total_trans, trans_table, and echo reset. > > > > So does the (untested) patch below work too? > > > > Yes, I agree with you. I will test it on our platform and give > a reply later. Thanks. > > > --- > > drivers/thermal/thermal_sysfs.c | 10 +++++++--- > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > Index: linux-pm/drivers/thermal/thermal_sysfs.c > > =================================================================== > > --- linux-pm.orig/drivers/thermal/thermal_sysfs.c > > +++ linux-pm/drivers/thermal/thermal_sysfs.c > > @@ -813,12 +813,13 @@ static const struct attribute_group cool > > > > static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > > { > > + const struct attribute_group *stats_attr_group = NULL; > > struct cooling_dev_stats *stats; > > unsigned long states; > > int var; > > > > if (cdev->ops->get_max_state(cdev, &states)) > > - return; > > + goto out; > > > > states++; /* Total number of states is highest state + 1 */ > > > > @@ -828,7 +829,7 @@ static void cooling_device_stats_setup(s > > > > stats = kzalloc(var, GFP_KERNEL); > > if (!stats) > > - return; > > + goto out; > > > > stats->time_in_state = (ktime_t *)(stats + 1); > > stats->trans_table = (unsigned int *)(stats->time_in_state + states); > > @@ -838,9 +839,12 @@ static void cooling_device_stats_setup(s > > > > spin_lock_init(&stats->lock); > > > > + stats_attr_group = &cooling_device_stats_attr_group; > > + > > +out: > > /* Fill the empty slot left in cooling_device_attr_groups */ > > var = ARRAY_SIZE(cooling_device_attr_groups) - 2; > > - cooling_device_attr_groups[var] = &cooling_device_stats_attr_group; > > + cooling_device_attr_groups[var] = stats_attr_group; > > } > > > > static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev) > > > > > > > -- > Best regards, ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized 2022-07-27 8:16 ` Di Shen @ 2022-07-27 14:20 ` Rafael J. Wysocki 2022-07-28 1:54 ` Di Shen 0 siblings, 1 reply; 8+ messages in thread From: Rafael J. Wysocki @ 2022-07-27 14:20 UTC (permalink / raw) To: Di Shen Cc: Rafael J. Wysocki, Di Shen, Rafael J. Wysocki, Daniel Lezcano, Zhang, Rui, Viresh Kumar, Amit Kucheria, Linux PM, Linux Kernel Mailing List, jeson.gao, xuewen.yan, ke.wang On Wed, Jul 27, 2022 at 10:17 AM Di Shen <cindygm567@gmail.com> wrote: > > Hi Rafael, > I have tested this patch on our platform, and it works. Later, I will > send the patch v3. Well, no need. I'll use the one that you've just tested. Thanks! > On Tue, Jul 26, 2022 at 3:39 PM Di Shen <cindygm567@gmail.com> wrote: > > > > On Sat, Jul 23, 2022 at 2:42 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > > > On Friday, July 22, 2022 7:18:42 PM CEST Rafael J. Wysocki wrote: > > > > On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@unisoc.com> wrote: > > > > > > > > > > There's a space allocated for cooling_device_stats_attr_group > > > > > within cooling_device_attr_groups. This space is shared by all > > > > > cooling devices. > > > > > > > > That's correct. > > > > > > > > > If the stats structure of one cooling device successfully > > > > > creates stats sysfs. After that, another cooling device fails > > > > > to get max_states in cooling_device_stats_setup(). It can > > > > > return directly without initializing the stats structure, but > > > > > the cooling_device_stats_attr_group is still the attribute > > > > > group of the last cooling device. > > > > > > > > I cannot parse the above, sorry. > > > > > > > > For example, how can a "stats structure of one cooling device" create > > > > anything? As a data structure, it is a passive entity, so it doesn't > > > > carry out any actions. > > > > > > > > I think (but I am not sure) that you are referring to the error code > > > > path in which the ->get_max_state() callback fails for a cooling > > > > device after cooling_device_stats_setup() has completed successfully > > > > for another one. > > > > > > > > > At this time, read or write stats sysfs nodes can cause kernel > > > > > crash. Like the following, kernel crashed when > > > > > 'cat time_in_state_ms'. > > > > > > > > > > [<5baac8d4>] panic+0x1b4/0x3c8 > > > > > [<9d287b0f>] arm_notify_die+0x0/0x78 > > > > > [<094fc22c>] __do_kernel_fault+0x94/0xa4 > > > > > [<3b4b69a4>] do_page_fault+0xd4/0x364 > > > > > [<23793e7a>] do_translation_fault+0x38/0xc0 > > > > > [<6e5cc52a>] do_DataAbort+0x4c/0xd0 > > > > > [<a28c16b8>] __dabt_svc+0x5c/0xa0 > > > > > [<747516ae>] _raw_spin_lock+0x20/0x60 > > > > > [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 > > > > > [<cb78325e>] dev_attr_show+0x38/0x64 > > > > > [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 > > > > > [<c0a843ab>] seq_read+0x244/0x620 > > > > > [<b316b374>] vfs_read+0xd8/0x218 > > > > > [<3aebf5fa>] sys_read+0x80/0xe4 > > > > > [<7cf100f5>] ret_fast_syscall+0x0/0x28 > > > > > [<08cbe22f>] 0xbe8c1198 > > > > > > > > > > stats sysfs: > > > > > phone:/sys/class/thermal/cooling_device2/stats # ls > > > > > reset time_in_state_ms total_trans trans_table > > > > > > > > > > The same as cat total_trans, trans_table, and echo reset. > > > > > > So does the (untested) patch below work too? > > > > > > > Yes, I agree with you. I will test it on our platform and give > > a reply later. Thanks. > > > > > --- > > > drivers/thermal/thermal_sysfs.c | 10 +++++++--- > > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > > > Index: linux-pm/drivers/thermal/thermal_sysfs.c > > > =================================================================== > > > --- linux-pm.orig/drivers/thermal/thermal_sysfs.c > > > +++ linux-pm/drivers/thermal/thermal_sysfs.c > > > @@ -813,12 +813,13 @@ static const struct attribute_group cool > > > > > > static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > > > { > > > + const struct attribute_group *stats_attr_group = NULL; > > > struct cooling_dev_stats *stats; > > > unsigned long states; > > > int var; > > > > > > if (cdev->ops->get_max_state(cdev, &states)) > > > - return; > > > + goto out; > > > > > > states++; /* Total number of states is highest state + 1 */ > > > > > > @@ -828,7 +829,7 @@ static void cooling_device_stats_setup(s > > > > > > stats = kzalloc(var, GFP_KERNEL); > > > if (!stats) > > > - return; > > > + goto out; > > > > > > stats->time_in_state = (ktime_t *)(stats + 1); > > > stats->trans_table = (unsigned int *)(stats->time_in_state + states); > > > @@ -838,9 +839,12 @@ static void cooling_device_stats_setup(s > > > > > > spin_lock_init(&stats->lock); > > > > > > + stats_attr_group = &cooling_device_stats_attr_group; > > > + > > > +out: > > > /* Fill the empty slot left in cooling_device_attr_groups */ > > > var = ARRAY_SIZE(cooling_device_attr_groups) - 2; > > > - cooling_device_attr_groups[var] = &cooling_device_stats_attr_group; > > > + cooling_device_attr_groups[var] = stats_attr_group; > > > } > > > > > > static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev) > > > > > > > > > > > -- > > Best regards, ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized 2022-07-27 14:20 ` Rafael J. Wysocki @ 2022-07-28 1:54 ` Di Shen 0 siblings, 0 replies; 8+ messages in thread From: Di Shen @ 2022-07-28 1:54 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Rafael J. Wysocki, Di Shen, Daniel Lezcano, Zhang, Rui, Viresh Kumar, Amit Kucheria, Linux PM, Linux Kernel Mailing List, jeson.gao, xuewen.yan, ke.wang Ok, thanks. On Wed, Jul 27, 2022 at 10:20 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Wed, Jul 27, 2022 at 10:17 AM Di Shen <cindygm567@gmail.com> wrote: > > > > Hi Rafael, > > I have tested this patch on our platform, and it works. Later, I will > > send the patch v3. > > Well, no need. I'll use the one that you've just tested. > > Thanks! > > > > On Tue, Jul 26, 2022 at 3:39 PM Di Shen <cindygm567@gmail.com> wrote: > > > > > > On Sat, Jul 23, 2022 at 2:42 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > > > > > > > On Friday, July 22, 2022 7:18:42 PM CEST Rafael J. Wysocki wrote: > > > > > On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@unisoc.com> wrote: > > > > > > > > > > > > There's a space allocated for cooling_device_stats_attr_group > > > > > > within cooling_device_attr_groups. This space is shared by all > > > > > > cooling devices. > > > > > > > > > > That's correct. > > > > > > > > > > > If the stats structure of one cooling device successfully > > > > > > creates stats sysfs. After that, another cooling device fails > > > > > > to get max_states in cooling_device_stats_setup(). It can > > > > > > return directly without initializing the stats structure, but > > > > > > the cooling_device_stats_attr_group is still the attribute > > > > > > group of the last cooling device. > > > > > > > > > > I cannot parse the above, sorry. > > > > > > > > > > For example, how can a "stats structure of one cooling device" create > > > > > anything? As a data structure, it is a passive entity, so it doesn't > > > > > carry out any actions. > > > > > > > > > > I think (but I am not sure) that you are referring to the error code > > > > > path in which the ->get_max_state() callback fails for a cooling > > > > > device after cooling_device_stats_setup() has completed successfully > > > > > for another one. > > > > > > > > > > > At this time, read or write stats sysfs nodes can cause kernel > > > > > > crash. Like the following, kernel crashed when > > > > > > 'cat time_in_state_ms'. > > > > > > > > > > > > [<5baac8d4>] panic+0x1b4/0x3c8 > > > > > > [<9d287b0f>] arm_notify_die+0x0/0x78 > > > > > > [<094fc22c>] __do_kernel_fault+0x94/0xa4 > > > > > > [<3b4b69a4>] do_page_fault+0xd4/0x364 > > > > > > [<23793e7a>] do_translation_fault+0x38/0xc0 > > > > > > [<6e5cc52a>] do_DataAbort+0x4c/0xd0 > > > > > > [<a28c16b8>] __dabt_svc+0x5c/0xa0 > > > > > > [<747516ae>] _raw_spin_lock+0x20/0x60 > > > > > > [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 > > > > > > [<cb78325e>] dev_attr_show+0x38/0x64 > > > > > > [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 > > > > > > [<c0a843ab>] seq_read+0x244/0x620 > > > > > > [<b316b374>] vfs_read+0xd8/0x218 > > > > > > [<3aebf5fa>] sys_read+0x80/0xe4 > > > > > > [<7cf100f5>] ret_fast_syscall+0x0/0x28 > > > > > > [<08cbe22f>] 0xbe8c1198 > > > > > > > > > > > > stats sysfs: > > > > > > phone:/sys/class/thermal/cooling_device2/stats # ls > > > > > > reset time_in_state_ms total_trans trans_table > > > > > > > > > > > > The same as cat total_trans, trans_table, and echo reset. > > > > > > > > So does the (untested) patch below work too? > > > > > > > > > > Yes, I agree with you. I will test it on our platform and give > > > a reply later. Thanks. > > > > > > > --- > > > > drivers/thermal/thermal_sysfs.c | 10 +++++++--- > > > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > > > > > Index: linux-pm/drivers/thermal/thermal_sysfs.c > > > > =================================================================== > > > > --- linux-pm.orig/drivers/thermal/thermal_sysfs.c > > > > +++ linux-pm/drivers/thermal/thermal_sysfs.c > > > > @@ -813,12 +813,13 @@ static const struct attribute_group cool > > > > > > > > static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > > > > { > > > > + const struct attribute_group *stats_attr_group = NULL; > > > > struct cooling_dev_stats *stats; > > > > unsigned long states; > > > > int var; > > > > > > > > if (cdev->ops->get_max_state(cdev, &states)) > > > > - return; > > > > + goto out; > > > > > > > > states++; /* Total number of states is highest state + 1 */ > > > > > > > > @@ -828,7 +829,7 @@ static void cooling_device_stats_setup(s > > > > > > > > stats = kzalloc(var, GFP_KERNEL); > > > > if (!stats) > > > > - return; > > > > + goto out; > > > > > > > > stats->time_in_state = (ktime_t *)(stats + 1); > > > > stats->trans_table = (unsigned int *)(stats->time_in_state + states); > > > > @@ -838,9 +839,12 @@ static void cooling_device_stats_setup(s > > > > > > > > spin_lock_init(&stats->lock); > > > > > > > > + stats_attr_group = &cooling_device_stats_attr_group; > > > > + > > > > +out: > > > > /* Fill the empty slot left in cooling_device_attr_groups */ > > > > var = ARRAY_SIZE(cooling_device_attr_groups) - 2; > > > > - cooling_device_attr_groups[var] = &cooling_device_stats_attr_group; > > > > + cooling_device_attr_groups[var] = stats_attr_group; > > > > } > > > > > > > > static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev) > > > > > > > > > > > > > > > -- > > > Best regards, ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized 2022-07-22 17:18 ` Rafael J. Wysocki 2022-07-22 18:42 ` Rafael J. Wysocki @ 2022-07-26 7:31 ` Di Shen 1 sibling, 0 replies; 8+ messages in thread From: Di Shen @ 2022-07-26 7:31 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Di Shen, Daniel Lezcano, Zhang, Rui, Viresh Kumar, Amit Kucheria, Linux PM, Linux Kernel Mailing List, jeson.gao, xuewen.yan, ke.wang On Sat, Jul 23, 2022 at 1:18 AM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Fri, Jul 22, 2022 at 10:44 AM Di Shen <di.shen@unisoc.com> wrote: > > > > There's a space allocated for cooling_device_stats_attr_group > > within cooling_device_attr_groups. This space is shared by all > > cooling devices. > > That's correct. > > > If the stats structure of one cooling device successfully > > creates stats sysfs. After that, another cooling device fails > > to get max_states in cooling_device_stats_setup(). It can > > return directly without initializing the stats structure, but > > the cooling_device_stats_attr_group is still the attribute > > group of the last cooling device. > > I cannot parse the above, sorry. > > For example, how can a "stats structure of one cooling device" create > anything? As a data structure, it is a passive entity, so it doesn't > carry out any actions. > Sorry, I didn't describe it properly. I mean 'if it has been called back cooling_device_stats_setup() successfully for a cooling device'. > I think (but I am not sure) that you are referring to the error code > path in which the ->get_max_state() callback fails for a cooling > device after cooling_device_stats_setup() has completed successfully > for another one. > That's it. As you say, ->get_max_state() callback fails for a cooling device after cooling_device_stats_setup() has completed successfully for another one. > > At this time, read or write stats sysfs nodes can cause kernel > > crash. Like the following, kernel crashed when > > 'cat time_in_state_ms'. > > > > [<5baac8d4>] panic+0x1b4/0x3c8 > > [<9d287b0f>] arm_notify_die+0x0/0x78 > > [<094fc22c>] __do_kernel_fault+0x94/0xa4 > > [<3b4b69a4>] do_page_fault+0xd4/0x364 > > [<23793e7a>] do_translation_fault+0x38/0xc0 > > [<6e5cc52a>] do_DataAbort+0x4c/0xd0 > > [<a28c16b8>] __dabt_svc+0x5c/0xa0 > > [<747516ae>] _raw_spin_lock+0x20/0x60 > > [<9a9e4cd4>] time_in_state_ms_show+0x28/0x148 > > [<cb78325e>] dev_attr_show+0x38/0x64 > > [<aea3e364>] sysfs_kf_seq_show+0x8c/0xf0 > > [<c0a843ab>] seq_read+0x244/0x620 > > [<b316b374>] vfs_read+0xd8/0x218 > > [<3aebf5fa>] sys_read+0x80/0xe4 > > [<7cf100f5>] ret_fast_syscall+0x0/0x28 > > [<08cbe22f>] 0xbe8c1198 > > > > stats sysfs: > > phone:/sys/class/thermal/cooling_device2/stats # ls > > reset time_in_state_ms total_trans trans_table > > > > The same as cat total_trans, trans_table, and echo reset. > > > > To avoid kernel crash, this patch set clears the > > cooling_device_attr_groups before stats structure is initialized. > > > > Signed-off-by: Di Shen <di.shen@unisoc.com> > > --- > > drivers/thermal/thermal_sysfs.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c > > index 1c4aac8464a7..e3fae63fa0f7 100644 > > --- a/drivers/thermal/thermal_sysfs.c > > +++ b/drivers/thermal/thermal_sysfs.c > > @@ -817,6 +817,9 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev) > > unsigned long states; > > int var; > > > > + var = ARRAY_SIZE(cooling_device_attr_groups) - 2; > > + cooling_device_attr_groups[var] = NULL; > > + > > if (cdev->ops->get_max_state(cdev, &states)) > > return; > > > > -- -- Best regards, Di Di ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-07-28 1:54 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-07-22 8:43 [PATCH V2 1/1] thermal/sysfs: Clear cooling_device_stats_attr_group before initialized Di Shen 2022-07-22 17:18 ` Rafael J. Wysocki 2022-07-22 18:42 ` Rafael J. Wysocki 2022-07-26 7:39 ` Di Shen 2022-07-27 8:16 ` Di Shen 2022-07-27 14:20 ` Rafael J. Wysocki 2022-07-28 1:54 ` Di Shen 2022-07-26 7:31 ` Di Shen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox