public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* cpufreq: Scaling driver sysfs attribute issues.
@ 2022-08-19 19:03 Chris Hyser
  2022-08-19 19:03 ` [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus Chris Hyser
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Hyser @ 2022-08-19 19:03 UTC (permalink / raw)
  To: chris.hyser, linux-pm, Rafael J. Wysocki, Viresh Kumar

We found a NULL pointer dereference in cppc_cpufreq scaling driver, traced it
to root cause, and looked at the other scaling drivers to see what else may be
affected.

The simplest fix for cppc_cpufreq is basically identical to commit e25303676e18
("cpufreq: acpi_cpufreq: prevent crash on reading freqdomain_cpus") and is
included.

The problem is a scaling driver which provides a struct freq_attr sysfs
attributes array passed via struct cpufreq_driver during driver registration,
has .init() and .exit() functions and does _not_ provide .online()/.offline()
routines.  The freq_attr attributes are often backed by allocated memory from
.init() and freed on .exit(). The issue is that the cpufreq core does not remove
those driver freq_attr sysfs files in this scenario though the backing memory
has been freed and while some instances just show garbage values, others contain
pointers. Note: registering empty .online()/.offline() driver functions can also
solve this problem as .exit() is not called and the memory has not been freed.

Looking through the other scaling drivers, most that provide a .attr use
'cpufreq_generic_attr' for which the 'show' functions do check for a NULL data
pointer. Two others are worth mentioning: brcm_avs_cpufreq driver has it's own
attributes backed by it's own functions. These do not check to see if the data
pointer is NULL in the show() functions, however as it uses devm_kzalloc(), it
has no .exit() and the sysfs files remain backed by memory until driver unload.

The other is the 'amd_pstate' driver. I don't have a means of testing, but I
suspect that this driver does have a similar problem, ie if a cpu is removed
and the various driver sysfs attrs are accessed, it will crash.

Additionally, I spent some time looking at an alternative that removed the
driver .attr set of sysfs attributes prior to calling .exit() in
__cpufreq_offline() and puts them back following an .init() in cpufreq_online().
On the AMD and ARM systems I tested with, no policies are shared and thus
.init()/.exit() are called on every cpu online/offline operation.  A simple test
consisting of two tight loops one onlining/offlining a few cpus and the other
reading the sysfs cpumask file, ultimately results in deadlock between sysfs and
the cpu hotplug machinery.  This seems consistent with various commit comments
about the locking here and a good time to send an email.

-chrish


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-08-23  4:13 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-19 19:03 cpufreq: Scaling driver sysfs attribute issues Chris Hyser
2022-08-19 19:03 ` [PATCH] cpufreq: cppc_cpufreq: prevent crash on reading freqdomain_cpus Chris Hyser
2022-08-22  5:39   ` Viresh Kumar
2022-08-22 13:19     ` Chris Hyser
2022-08-22 15:54       ` Chris Hyser
2022-08-22 20:14         ` Chris Hyser
2022-08-23  4:13           ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox