Linux ACPI
 help / color / mirror / Atom feed
* OOB access on ACPI processor thermal device via sysfs write
@ 2020-04-02  7:41 Takashi Iwai
  2020-04-02  7:47 ` Zhang, Rui
  0 siblings, 1 reply; 6+ messages in thread
From: Takashi Iwai @ 2020-04-02  7:41 UTC (permalink / raw)
  To: linux-acpi; +Cc: Zhang Rui, Rafael J. Wysocki, Len Brown, linux-kernel

[ My post yesterday didn't seem go out properly, so I'm resending with
  a different MTA setup; sorry if you already received it ]

Hi,

after my recent patch [*], I started looking at the root cause more
closely, and found that it's a side-effect of the device
initialization order between acpi_processor_driver and cpufreq.
  [*] https://lore.kernel.org/linux-pm/20200330140859.12535-1-tiwai@suse.de/

In short:
  
- acpi_processor_driver_init() is called fairly early boot stage, and
  it registers a thermal device for each CPU. 

- A thermal device allocates a statistics table with the fixed size
  determined by get_max_state() ops call at its probe time.
  (The table entry is updated at each time a write to thermal state
  file happens.)

  For ACPI, processor_get_max_state() is called and it invokes
  acpi_processor_max_state() that calculates the max states depending
  on cpufreq, cpufreq_get_max_state().

  cpufreq_get_max_state() returns 0 at this stage because no cpufreq
  driver is available.  So the table size is set to 1.

- Later on, after cpufreq get initialized, the following sysfs write
  causes an OOB access and corrupts memory (eventually Oops):
    # echo 3 > /sys/class/thermal/cooling_device0/cur_state

  The reason is that now processor_get_max_state() returns 3 as
  cpufreq became available.  So the thermal driver believes as if it
  were a valid value, and updates the table accordingly although it
  overflows.


My patch above detects and avoids such an overflow, but it's no real
fix for the cause.  The proper fix needs either the shuffling of the
device creation order or some automatic resize of statistics table.

Do you guys have any suggestion how to solve it?

FWIW, I could reproduce the problem locally on my laptop with 5.6
kernel, and I believe this can be seen generically on most of
machines.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-04-02 18:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-02  7:41 OOB access on ACPI processor thermal device via sysfs write Takashi Iwai
2020-04-02  7:47 ` Zhang, Rui
2020-04-02  9:03   ` Takashi Iwai
2020-04-02 10:03     ` Zhang Rui
2020-04-02 10:09       ` Takashi Iwai
2020-04-02 18:07       ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox