* Re: [RFC] thermal/core: Disable uevent messages for cooling devices
From: John Stultz @ 2024-04-08 19:03 UTC (permalink / raw)
To: Roman Stratiienko
Cc: linux-pm, rafael, daniel.lezcano, amitk, rui.zhang, linux-kernel,
megi
In-Reply-To: <CAGphcdnK8Hx-YsA-HukRKbvC-HpnLktCtq8qFicQUL-8ZHC+1w@mail.gmail.com>
On Mon, Apr 8, 2024 at 11:59 AM Roman Stratiienko
<r.stratiienko@gmail.com> wrote:
>
> I haven't worked on it since I posted it initially. But it looks like
> there's an alternative patch already upstreamed and backported into
> stable:
>
> https://lore.kernel.org/linux-kernel/CAJZ5v0hHTuEXmQA=0D90eR_KUsOsfcxYbTS=zQYDTXuY6o_K_Q@mail.gmail.com/T/
Ah! Many thanks for the link! I'll check in with folks to better
understand if there is a functionality gap between what you submitted
and what landed upstream.
Thanks again!
-john
^ permalink raw reply
* Re: [RFC] thermal/core: Disable uevent messages for cooling devices
From: Roman Stratiienko @ 2024-04-08 18:59 UTC (permalink / raw)
To: John Stultz
Cc: linux-pm, rafael, daniel.lezcano, amitk, rui.zhang, linux-kernel,
megi
In-Reply-To: <CANDhNCpAhtgdwpSUTJ2jo2J5L6rHzQHVB9q+kkZ3ouTt12b-uw@mail.gmail.com>
Hi John,
I haven't worked on it since I posted it initially. But it looks like
there's an alternative patch already upstreamed and backported into
stable:
https://lore.kernel.org/linux-kernel/CAJZ5v0hHTuEXmQA=0D90eR_KUsOsfcxYbTS=zQYDTXuY6o_K_Q@mail.gmail.com/T/
BR,
Roman
пн, 8 апр. 2024 г. в 21:48, John Stultz <jstultz@google.com>:
>
> On Sun, Jul 10, 2022 at 9:40 AM Roman Stratiienko
> <r.stratiienko@gmail.com> wrote:
> >
> > During suspend, the big CPU cluster is turned off first while a little
> > is still running. This forcibly unregisters the cooling device which
> > sends a "REMOVE" uevent to all subscribers [1].
> >
> > In case userspace netlink subscriber has set the EPOLLWAKEUP flag, a
> > wakeup event is triggered that causes suspend to be aborted.
> >
> > Without this change, suspend doesn't work on PinePhone PRO with AOSP
> > userland.
> >
> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5c238a8b599f1ae25eaeb08ad0e9e13e2b9eb023
> > Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
>
> Hey Roman,
> I wanted to drudge this patch up, to ask what the current status of
> it was? Is there an alternative solution that you've been using since
> this was last sent out?
> I've heard of some vendors working around something similar, so I
> wanted to see if we could get a common fix upstream.
>
> thanks
> -john
^ permalink raw reply
* Re: [RFC] thermal/core: Disable uevent messages for cooling devices
From: John Stultz @ 2024-04-08 18:48 UTC (permalink / raw)
To: Roman Stratiienko
Cc: linux-pm, rafael, daniel.lezcano, amitk, rui.zhang, linux-kernel,
megi
In-Reply-To: <20220710164026.541466-1-r.stratiienko@gmail.com>
On Sun, Jul 10, 2022 at 9:40 AM Roman Stratiienko
<r.stratiienko@gmail.com> wrote:
>
> During suspend, the big CPU cluster is turned off first while a little
> is still running. This forcibly unregisters the cooling device which
> sends a "REMOVE" uevent to all subscribers [1].
>
> In case userspace netlink subscriber has set the EPOLLWAKEUP flag, a
> wakeup event is triggered that causes suspend to be aborted.
>
> Without this change, suspend doesn't work on PinePhone PRO with AOSP
> userland.
>
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5c238a8b599f1ae25eaeb08ad0e9e13e2b9eb023
> Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Hey Roman,
I wanted to drudge this patch up, to ask what the current status of
it was? Is there an alternative solution that you've been using since
this was last sent out?
I've heard of some vendors working around something similar, so I
wanted to see if we could get a common fix upstream.
thanks
-john
^ permalink raw reply
* Re: [PATCH v4 7/8] cpuidle/poll_state: replace cpu_relax with smp_cond_load_relaxed
From: Ankur Arora @ 2024-04-08 18:46 UTC (permalink / raw)
To: Okanovic, Haris
Cc: ankur.a.arora@oracle.com, joao.m.martins@oracle.com,
kvm@vger.kernel.org, dianders@chromium.org,
linux-arm-kernel@lists.infradead.org, pmladek@suse.com,
wanpengli@tencent.com, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
mingo@redhat.com, pbonzini@redhat.com, tglx@linutronix.de,
daniel.lezcano@linaro.org, mihai.carabas@oracle.com,
arnd@arndb.de, will@kernel.org, hpa@zytor.com,
peterz@infradead.org, mic@digikod.net, vkuznets@redhat.com,
bp@alien8.de, npiggin@gmail.com, linux-pm@vger.kernel.org,
rafael@kernel.org, juerg.haefliger@canonical.com, x86@kernel.org,
rick.p.edgecombe@intel.com
In-Reply-To: <aada0beae0b3479bfa311eea94a3b595bb8e5835.camel@amazon.com>
Okanovic, Haris <harisokn@amazon.com> writes:
> On Fri, 2024-04-05 at 16:14 -0700, Ankur Arora wrote:
>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>>
>>
>>
>> Okanovic, Haris <harisokn@amazon.com> writes:
>>
>> > On Thu, 2024-02-15 at 09:41 +0200, Mihai Carabas wrote:
>> > > cpu_relax on ARM64 does a simple "yield". Thus we replace it with
>> > > smp_cond_load_relaxed which basically does a "wfe".
>> > >
>> > > Suggested-by: Peter Zijlstra <peterz@infradead.org>
>> > > Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
>> > > ---
>> > > drivers/cpuidle/poll_state.c | 15 ++++++++++-----
>> > > 1 file changed, 10 insertions(+), 5 deletions(-)
>> > >
>> > > diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
>> > > index 9b6d90a72601..1e45be906e72 100644
>> > > --- a/drivers/cpuidle/poll_state.c
>> > > +++ b/drivers/cpuidle/poll_state.c
>> > > @@ -13,6 +13,7 @@
>> > > static int __cpuidle poll_idle(struct cpuidle_device *dev,
>> > > struct cpuidle_driver *drv, int index)
>> > > {
>> > > + unsigned long ret;
>> > > u64 time_start;
>> > >
>> > > time_start = local_clock_noinstr();
>> > > @@ -26,12 +27,16 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
>> > >
>> > > limit = cpuidle_poll_time(drv, dev);
>> > >
>> > > - while (!need_resched()) {
>> > > - cpu_relax();
>> > > - if (loop_count++ < POLL_IDLE_RELAX_COUNT)
>> > > - continue;
>> > > -
>> > > + for (;;) {
>> > > loop_count = 0;
>> > > +
>> > > + ret = smp_cond_load_relaxed(¤t_thread_info()->flags,
>> > > + VAL & _TIF_NEED_RESCHED ||
>> > > + loop_count++ >= POLL_IDLE_RELAX_COUNT);
>> >
>> > Is it necessary to repeat this 200 times with a wfe poll?
>>
>> The POLL_IDLE_RELAX_COUNT is there because on x86 each cpu_relax()
>> iteration is much shorter.
>>
>> With WFE, it makes less sense.
>>
>> > Does kvm not implement a timeout period?
>>
>> Not yet, but it does become more useful after a WFE haltpoll is
>> available on ARM64.
>
> Note that kvm conditionally traps WFE and WFI based on number of host
> CPU tasks. VMs will sometimes see hardware behavior - potentially
> polling for a long time before entering WFI.
>
> https://elixir.bootlin.com/linux/latest/source/arch/arm64/kvm/arm.c#L459
Yeah. There was a discussion on this
https://lore.kernel.org/lkml/871qc6qufy.fsf@oracle.com/.
>> Haltpoll does have a timeout, which you should be able to tune via
>> /sys/module/haltpoll/parameters/ but that, of course, won't help here.
>>
>> > Could you make it configurable? This patch improves certain workloads
>> > on AWS Graviton instances as well, but blocks up to 6ms in 200 * 30us
>> > increments before going to wfi, which is a bit excessive.
>>
>> Yeah, this looks like a problem. We could solve it by making it an
>> architectural parameter. Though I worry about ARM platforms with
>> much smaller default timeouts.
>> The other possibility is using WFET in the primitive, but then we
>> have that dependency and that's a bigger change.
>
> See arm64's delay() for inspiration:
>
> https://elixir.bootlin.com/linux/v6.9-rc2/source/arch/arm64/lib/delay.c#L26
Sure, that part is straight-forward enough. However, this will need a fallback
the case when WFET is not available. And, because this path is used on x86,
so we need a cross platform smp_cond*timeout(). Though given that the x86
version is based on cpu_relax() then that could just fold the sched_clock()
check in.
Maybe another place to do this would be by KVM forcing a WFE timeout. Arguably
that is needed regardless of whether we use a smp_cond*timeout() or not.
--
ankur
^ permalink raw reply
* Re: [PATCH 2/2] thermal/drivers/mediatek/lvts_thermal: Improve some memory allocation
From: Christophe JAILLET @ 2024-04-08 18:41 UTC (permalink / raw)
To: Dan Carpenter
Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, linux-kernel,
kernel-janitors, linux-pm, linux-arm-kernel, linux-mediatek
In-Reply-To: <d97f2a57-d318-455b-a860-8bd7972c8aaf@moroto.mountain>
Le 08/04/2024 à 10:09, Dan Carpenter a écrit :
> On Sun, Apr 07, 2024 at 10:01:49PM +0200, Christophe JAILLET wrote:
>> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
>> index 3003dc350766..b133f731c5ba 100644
>> --- a/drivers/thermal/mediatek/lvts_thermal.c
>> +++ b/drivers/thermal/mediatek/lvts_thermal.c
>> @@ -204,7 +204,7 @@ static const struct debugfs_reg32 lvts_regs[] = {
>>
>> static int lvts_debugfs_init(struct device *dev, struct lvts_domain *lvts_td)
>> {
>> - struct debugfs_regset32 *regset;
>> + struct debugfs_regset32 *regsets;
>> struct lvts_ctrl *lvts_ctrl;
>> struct dentry *dentry;
>> char name[64];
>> @@ -214,8 +214,14 @@ static int lvts_debugfs_init(struct device *dev, struct lvts_domain *lvts_td)
>> if (IS_ERR(lvts_td->dom_dentry))
>> return 0;
>>
>> + regsets = devm_kcalloc(dev, lvts_td->num_lvts_ctrl,
>> + sizeof(*regsets), GFP_KERNEL);
>> + if (!regsets)
>> + return 0;
>
> I understand that this preserved the behavior from the original code,
> but the original code was wrong. This should return -ENOMEM.
Hi Dan,
I don't agree.
For me, this memory allocation is of the same type as all debugfs
functions that we ignore the error code.
If it fails, it is not a reason good enough to have the probe fail.
(anyway, if we are short on memory at this point other errors will
likely occur)
>
>> +
>> for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
>>
>> + struct debugfs_regset32 *regset = ®sets[i];
>> lvts_ctrl = &lvts_td->lvts_ctrl[i];
>
> The blank line should come after the declaration.
The blank line was already there, and in this file, it looks like the
preferred style (even if not completely consistent)
Let see if there is some comment about 0 or -ENOMEM in case of memory
allocation error, and if needed, I'll repost without the blank line.
This patch being a really tiny tiny tiny improvement (IMHO), so it may
also just be ignored.
CJ
>
> regards,
> dan carpenter
>
>
>
^ permalink raw reply
* Re: [PATCH v2 0/2] cgroup/cpuset: Make cpuset hotplug processing synchronous
From: Tejun Heo @ 2024-04-08 17:40 UTC (permalink / raw)
To: Waiman Long
Cc: Zefan Li, Johannes Weiner, Thomas Gleixner, Peter Zijlstra,
Rafael J. Wysocki, Len Brown, Pavel Machek, Shuah Khan,
linux-kernel, cgroups, linux-pm, linux-kselftest,
Frederic Weisbecker, Paul E. McKenney, Ingo Molnar,
Valentin Schneider, Anna-Maria Behnsen, Alex Shi, Vincent Guittot,
Michal Koutný
In-Reply-To: <20240404134749.2857852-1-longman@redhat.com>
On Thu, Apr 04, 2024 at 09:47:47AM -0400, Waiman Long wrote:
> v2:
> - Found that rebuild_sched_domains() has external callers, revert its
> change and introduce rebuild_sched_domains_cpuslocked() instead.
>
> As discussed in the LKML thread [1], the asynchronous nature of cpuset
> hotplug handling code is causing problem with RCU testing. With recent
> changes in the way locking is being handled in the cpuset code, it is
> now possible to make the cpuset hotplug code synchronous again without
> major changes.
>
> This series enables the hotplug code to call directly into cpuset hotplug
> core without indirection with the exception of the special case of v1
> cpuset becoming empty still being handled indirectly with workqueue.
>
> A new simple test case was also written to test this special v1 cpuset
> case. The test_cpuset_prs.sh script was also run with LOCKDEP on to
> verify that there is no regression.
Applied to cgroup/for-6.10. Hopefully, this one will work out. Thanks for
working on this.
--
tejun
^ permalink raw reply
* Re: [PATCH v2] platform/x86/intel/hid: Don't wake on 5-button releases
From: Ilpo Järvinen @ 2024-04-08 15:42 UTC (permalink / raw)
To: Hans de Goede, David McFarland
Cc: linux-pm, platform-driver-x86, Rafael J . Wysocki, Enrik Berkhan
In-Reply-To: <878r1tpd6u.fsf_-_@gmail.com>
On Thu, 04 Apr 2024 08:41:45 -0300, David McFarland wrote:
> If, for example, the power button is configured to suspend, holding it
> and releasing it after the machine has suspended, will wake the machine.
>
> Also on some machines, power button release events are sent during
> hibernation, even if the button wasn't used to hibernate the machine.
> This causes hibernation to be aborted.
>
> [...]
Thank you for your contribution, it has been applied to my local
review-ilpo branch. Note it will show up in the public
platform-drivers-x86/review-ilpo branch only once I've pushed my
local branch there, which might take a while.
The list of commits applied:
[1/1] platform/x86/intel/hid: Don't wake on 5-button releases
commit: 5864e479ca4344f3a5df8074524da24c960f440b
--
i.
^ permalink raw reply
* Re: [PATCH v2 1/2] cgroup/cpuset: Make cpuset hotplug processing synchronous
From: Valentin Schneider @ 2024-04-08 15:22 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Zefan Li, Johannes Weiner,
Thomas Gleixner, Peter Zijlstra, Rafael J. Wysocki, Len Brown,
Pavel Machek, Shuah Khan
Cc: linux-kernel, cgroups, linux-pm, linux-kselftest,
Frederic Weisbecker, Paul E. McKenney, Ingo Molnar,
Anna-Maria Behnsen, Alex Shi, Vincent Guittot, Michal Koutný,
Waiman Long
In-Reply-To: <20240404134749.2857852-2-longman@redhat.com>
On 04/04/24 09:47, Waiman Long wrote:
> Since commit 3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside
> get_online_cpus()"), cpuset hotplug was done asynchronously via a work
> function. This is to avoid recursive locking of cgroup_mutex.
>
> Since then, the cgroup locking scheme has changed quite a bit. A
> cpuset_mutex was introduced to protect cpuset specific operations.
> The cpuset_mutex is then replaced by a cpuset_rwsem. With commit
> d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock
> order"), cpu_hotplug_lock is acquired before cpuset_rwsem. Later on,
> cpuset_rwsem is reverted back to cpuset_mutex. All these locking changes
> allow the hotplug code to call into cpuset core directly.
>
> The following commits were also merged due to the asynchronous nature
> of cpuset hotplug processing.
>
> - commit b22afcdf04c9 ("cpu/hotplug: Cure the cpusets trainwreck")
> - commit 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume
> bugs")
> - commit 28b89b9e6f7b ("cpuset: handle race between CPU hotplug and
> cpuset_hotplug_work")
>
> Clean up all these bandages by making cpuset hotplug
> processing synchronous again with the exception that the call to
> cgroup_transfer_tasks() to transfer tasks out of an empty cgroup v1
> cpuset, if necessary, will still be done via a work function due to the
> existing cgroup_mutex -> cpu_hotplug_lock dependency. It is possible
> to reverse that dependency, but that will require updating a number of
> different cgroup controllers. This special hotplug code path should be
> rarely taken anyway.
>
> As all the cpuset states will be updated by the end of the hotplug
> operation, we can revert most the above commits except commit
> 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs")
> which is partially reverted. Also removing some cpus_read_lock trylock
> attempts in the cpuset partition code as they are no longer necessary
> since the cpu_hotplug_lock is now held for the whole duration of the
> cpuset hotplug code path.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
Tested-by: Valentin Schneider <vschneid@redhat.com>
^ permalink raw reply
* Re: [PATCH 1/1] pm: Take advantage of %ps to simplify debug output
From: Rafael J. Wysocki @ 2024-04-08 15:09 UTC (permalink / raw)
To: Len Brown; +Cc: linux-pm, Len Brown
In-Reply-To: <2aaa9ba442f027fd9d1011c5fb30adea9bd1b799.1712344255.git.len.brown@intel.com>
On Fri, Apr 5, 2024 at 9:13 PM Len Brown <lenb@kernel.org> wrote:
>
> From: Len Brown <len.brown@intel.com>
>
> initcall_debug previous and new output:
>
> ...PM: calling pci_pm_suspend+0x0/0x1b0 @ 3233, parent: pci0000:00
>
> ...PM: calling pci_pm_suspend @ 3233, parent: pci0000:00
>
> Signed-off-by: Len Brown <len.brown@intel.com>
> ---
> drivers/base/power/main.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index f85f3515c258..de581c33095f 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -209,7 +209,7 @@ static ktime_t initcall_debug_start(struct device *dev, void *cb)
> if (!pm_print_times_enabled)
> return 0;
>
> - dev_info(dev, "calling %pS @ %i, parent: %s\n", cb,
> + dev_info(dev, "calling %ps @ %i, parent: %s\n", cb,
> task_pid_nr(current),
> dev->parent ? dev_name(dev->parent) : "none");
> return ktime_get();
> @@ -224,7 +224,7 @@ static void initcall_debug_report(struct device *dev, ktime_t calltime,
> return;
>
> rettime = ktime_get();
> - dev_info(dev, "%pS returned %d after %Ld usecs\n", cb, error,
> + dev_info(dev, "%ps returned %d after %Ld usecs\n", cb, error,
> (unsigned long long)ktime_us_delta(rettime, calltime));
> }
>
> @@ -1963,7 +1963,7 @@ EXPORT_SYMBOL_GPL(dpm_suspend_start);
> void __suspend_report_result(const char *function, struct device *dev, void *fn, int ret)
> {
> if (ret)
> - dev_err(dev, "%s(): %pS returns %d\n", function, fn, ret);
> + dev_err(dev, "%s(): %ps returns %d\n", function, fn, ret);
> }
> EXPORT_SYMBOL_GPL(__suspend_report_result);
>
> --
Applied as 6.10 material, thanks!
^ permalink raw reply
* Re: [PATCH 0/2] Deduplicate bin_attribute simple read() callbacks
From: Rafael J. Wysocki @ 2024-04-08 15:01 UTC (permalink / raw)
To: Lukas Wunner
Cc: Greg Kroah-Hartman, Rafael J. Wysocki, linux-kernel,
Michael Ellerman, linuxppc-dev, linux-acpi, Jean Delvare,
Ard Biesheuvel, linux-efi, Zhenyu Wang, Zhi Wang, intel-gvt-dev,
Daniel Lezcano, linux-pm, Luis Chamberlain, linux-modules
In-Reply-To: <cover.1712410202.git.lukas@wunner.de>
On Sat, Apr 6, 2024 at 3:52 PM Lukas Wunner <lukas@wunner.de> wrote:
>
> For my upcoming PCI device authentication v2 patches, I have the need
> to expose a simple buffer in virtual memory as a bin_attribute.
>
> It turns out we've duplicated the ->read() callback for such simple
> buffers a fair number of times across the tree.
>
> So instead of reinventing the wheel, I decided to introduce a common
> helper and eliminate all duplications I could find.
>
> I'm open to a bikeshedding discussion on the sysfs_bin_attr_simple_read()
> name. ;)
>
> Lukas Wunner (2):
> sysfs: Add sysfs_bin_attr_simple_read() helper
> treewide: Use sysfs_bin_attr_simple_read() helper
>
> arch/powerpc/platforms/powernv/opal.c | 10 +-------
> drivers/acpi/bgrt.c | 9 +-------
> drivers/firmware/dmi_scan.c | 12 ++--------
> drivers/firmware/efi/rci2-table.c | 10 +-------
> drivers/gpu/drm/i915/gvt/firmware.c | 26 +++++----------------
> .../intel/int340x_thermal/int3400_thermal.c | 9 +-------
> fs/sysfs/file.c | 27 ++++++++++++++++++++++
> include/linux/sysfs.h | 15 ++++++++++++
> init/initramfs.c | 10 +-------
> kernel/module/sysfs.c | 13 +----------
> 10 files changed, 56 insertions(+), 85 deletions(-)
>
> --
For the series
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
^ permalink raw reply
* RE: [PATCH 2/3] arm64: add __READ_ONCE_EX()
From: David Laight @ 2024-04-08 14:51 UTC (permalink / raw)
To: 'Haris Okanovic', linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org, linux-assembly@vger.kernel.org
Cc: peterz@infradead.org
In-Reply-To: <20240402014706.3969151-2-harisokn@amazon.com>
From: Haris Okanovic
> Sent: 02 April 2024 02:47
>
> Perform an exclusive load, which atomically loads a word and arms the
> execusive monitor to enable wfe() polling of an address.
>
> Adding this macro in preparation for an arm64 cpuidle driver which
> supports a wfe() based polling state.
>
> https://developer.arm.com/documentation/dht0008/a/arm-synchronization-primitives/exclusive-
> accesses/exclusive-monitors
>
> Signed-off-by: Haris Okanovic <harisokn@amazon.com>
> ---
> arch/arm64/include/asm/readex.h | 46 +++++++++++++++++++++++++++++++++
> 1 file changed, 46 insertions(+)
> create mode 100644 arch/arm64/include/asm/readex.h
>
> diff --git a/arch/arm64/include/asm/readex.h b/arch/arm64/include/asm/readex.h
> new file mode 100644
> index 000000000000..51963c3107e1
> --- /dev/null
> +++ b/arch/arm64/include/asm/readex.h
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Based on arch/arm64/include/asm/rwonce.h
> + *
> + * Copyright (C) 2020 Google LLC.
> + * Copyright (C) 2024 Amazon.com, Inc. or its affiliates.
> + */
> +
> +#ifndef __ASM_READEX_H
> +#define __ASM_READEX_H
> +
> +#define __LOAD_EX(sfx, regs...) "ldaxr" #sfx "\t" #regs
> +
> +#define __READ_ONCE_EX(x) \
> +({ \
> + typeof(&(x)) __x = &(x); \
> + int atomic = 1; \
> + union { __unqual_scalar_typeof(*__x) __val; char __c[1]; } __u; \
> + switch (sizeof(x)) { \
> + case 1: \
> + asm volatile(__LOAD_EX(b, %w0, %1) \
> + : "=r" (*(__u8 *)__u.__c) \
> + : "Q" (*__x) : "memory"); \
> + break; \
> + case 2: \
> + asm volatile(__LOAD_EX(h, %w0, %1) \
> + : "=r" (*(__u16 *)__u.__c) \
> + : "Q" (*__x) : "memory"); \
> + break; \
> + case 4: \
> + asm volatile(__LOAD_EX(, %w0, %1) \
> + : "=r" (*(__u32 *)__u.__c) \
> + : "Q" (*__x) : "memory"); \
> + break; \
> + case 8: \
> + asm volatile(__LOAD_EX(, %0, %1) \
> + : "=r" (*(__u64 *)__u.__c) \
> + : "Q" (*__x) : "memory"); \
> + break; \
> + default: \
> + atomic = 0; \
> + } \
> + atomic ? (typeof(*__x))__u.__val : (*(volatile typeof(__x))__x);\
I'm pretty sure that doesn't work the way you expect.
The ?: operator promotes 'unsigned char' to 'int'.
So you can fall foul of signedness tests (eg in min()).
It also isn't going to work for non-scalers.
Replacing the ?: with __builtin_choose_expr() may help.
(This is probably a bug in the code you copied.)
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply
* Re: [PATCH 1/3] thermal: intel: intel_tcc: Add model checks for temperature registers
From: Ricardo Neri @ 2024-04-08 14:23 UTC (permalink / raw)
To: Zhang, Rui
Cc: linux@roeck-us.net, Wysocki, Rafael J, jdelvare@suse.com,
srinivas.pandruvada@linux.intel.com, lukasz.luba@arm.com,
linux-pm@vger.kernel.org, linux-hwmon@vger.kernel.org,
daniel.lezcano@linaro.org, linux-kernel@vger.kernel.org,
Neri, Ricardo
In-Reply-To: <5e86524413ec2cfeb1096f49851bf18837c7e50b.camel@intel.com>
On Sun, Apr 07, 2024 at 08:13:28AM +0000, Zhang, Rui wrote:
> > +
> > +#define TCC_FAM6_MODEL_TEMP_MASKS
Thank your your review, Rui!
>
> Future non FAM6 processors can still use this macro, right?
> So I'd prefer to remove FAM6_MODEL in the macro name.
Yes, it is true, FAM6_MODEL it is restrictive and also not needed here.
I will update accodingly.
> [...]
> >
> > +
> > +/**
> > + * get_tcc_offset_mask() - Returns the model-specific bitmask for
> > TCC offset
> > + *
> > + * Get the model-specific bitmask to extract TCC_OFFSET from the
> > MSR_TEMPERATURE_
> > + * TARGET register. If the mask is 0, it means the processor does
> > not support TCC offset.
> > + *
> > + * Return: The model-specific bitmask for TCC offset.
> > + */
> > +u32 get_tcc_offset_mask(void)
> > +{
> > + return intel_tcc_temp_masks.tcc_offset;
> > +}
> > +EXPORT_SYMBOL_NS(get_tcc_offset_mask, INTEL_TCC);
>
> the name is not consistent with the other intel_tcc APIs.
>
> how about intel_tcc_get_offset_mask()?
Sure. I can make this change.
>
> [...]
>
> > diff --git a/include/linux/intel_tcc.h b/include/linux/intel_tcc.h
> > index 8ff8eabb4a98..e281cf06aeab 100644
> > --- a/include/linux/intel_tcc.h
> > +++ b/include/linux/intel_tcc.h
> > @@ -14,5 +14,13 @@ int intel_tcc_get_tjmax(int cpu);
> > int intel_tcc_get_offset(int cpu);
> > int intel_tcc_set_offset(int cpu, int offset);
> > int intel_tcc_get_temp(int cpu, int *temp, bool pkg);
> > +#ifdef CONFIG_INTEL_TCC
> > +u32 get_tcc_offset_mask(void);
> > +u32 intel_tcc_get_temp_mask(bool pkg);
> > +#else
> > +static inline u32 get_tcc_offset_mask(void) { return 0; }
> > +/* Use the architectural bitmask of the temperature readout. No
> > model checks. */
> > +static inline u32 intel_tcc_get_temp_mask(bool pkg) { return 0x7f; }
> > +#endif
>
> for intel_tcc_get_temp_mask()
> 1. with CONFIG_INTEL_TCC
> a) for a platform in the model list, return the hardcoded value
> b) for a platform not in the model list, return 0xff
> 2. without CONFIG_INTEL_TCC, return 0x7f
>
> This is a bit confusing. IMO, at least we should leave a comment about
> this difference.
If we don't do model checks, I think we should rely on what is architectural
as per the SDM. Hence the 0x7f value.
Perhaps I can expand the comment in this hunk to detail what we do when we
do model checks.
^ permalink raw reply
* Re: [PATCH] ACPI: DPTF: Add Lunar Lake support
From: Rafael J. Wysocki @ 2024-04-08 14:11 UTC (permalink / raw)
To: Sumeet Pawnikar
Cc: rafael, srinivas.pandruvada, linux-acpi, linux-kernel, linux-pm
In-Reply-To: <20240405121819.31331-1-sumeet.r.pawnikar@intel.com>
On Fri, Apr 5, 2024 at 2:31 PM Sumeet Pawnikar
<sumeet.r.pawnikar@intel.com> wrote:
>
> Add Lunar Lake ACPI IDs for DPTF devices.
>
> Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
> ---
> drivers/acpi/dptf/dptf_pch_fivr.c | 1 +
> drivers/acpi/dptf/dptf_power.c | 2 ++
> drivers/acpi/dptf/int340x_thermal.c | 6 ++++++
> drivers/acpi/fan.h | 1 +
> drivers/thermal/intel/int340x_thermal/int3400_thermal.c | 1 +
> drivers/thermal/intel/int340x_thermal/int3403_thermal.c | 1 +
> 6 files changed, 12 insertions(+)
>
> diff --git a/drivers/acpi/dptf/dptf_pch_fivr.c b/drivers/acpi/dptf/dptf_pch_fivr.c
> index 654aaa53c67f..d202730fafd8 100644
> --- a/drivers/acpi/dptf/dptf_pch_fivr.c
> +++ b/drivers/acpi/dptf/dptf_pch_fivr.c
> @@ -150,6 +150,7 @@ static const struct acpi_device_id pch_fivr_device_ids[] = {
> {"INTC1045", 0},
> {"INTC1049", 0},
> {"INTC1064", 0},
> + {"INTC106B", 0},
> {"INTC10A3", 0},
> {"", 0},
> };
> diff --git a/drivers/acpi/dptf/dptf_power.c b/drivers/acpi/dptf/dptf_power.c
> index b8187babbbbb..8023b3e23315 100644
> --- a/drivers/acpi/dptf/dptf_power.c
> +++ b/drivers/acpi/dptf/dptf_power.c
> @@ -232,6 +232,8 @@ static const struct acpi_device_id int3407_device_ids[] = {
> {"INTC1061", 0},
> {"INTC1065", 0},
> {"INTC1066", 0},
> + {"INTC106C", 0},
> + {"INTC106D", 0},
> {"INTC10A4", 0},
> {"INTC10A5", 0},
> {"", 0},
> diff --git a/drivers/acpi/dptf/int340x_thermal.c b/drivers/acpi/dptf/int340x_thermal.c
> index b7113fa92fa6..014ada759954 100644
> --- a/drivers/acpi/dptf/int340x_thermal.c
> +++ b/drivers/acpi/dptf/int340x_thermal.c
> @@ -43,6 +43,12 @@ static const struct acpi_device_id int340x_thermal_device_ids[] = {
> {"INTC1064"},
> {"INTC1065"},
> {"INTC1066"},
> + {"INTC1068"},
> + {"INTC1069"},
> + {"INTC106A"},
> + {"INTC106B"},
> + {"INTC106C"},
> + {"INTC106D"},
> {"INTC10A0"},
> {"INTC10A1"},
> {"INTC10A2"},
> diff --git a/drivers/acpi/fan.h b/drivers/acpi/fan.h
> index e7b4b4e4a55e..f89d19c922dc 100644
> --- a/drivers/acpi/fan.h
> +++ b/drivers/acpi/fan.h
> @@ -15,6 +15,7 @@
> {"INTC1044", }, /* Fan for Tiger Lake generation */ \
> {"INTC1048", }, /* Fan for Alder Lake generation */ \
> {"INTC1063", }, /* Fan for Meteor Lake generation */ \
> + {"INTC106A", }, /* Fan for Lunar Lake generation */ \
> {"INTC10A2", }, /* Fan for Raptor Lake generation */ \
> {"PNP0C0B", } /* Generic ACPI fan */
>
> diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> index 427d370648d5..f8ebdd19d340 100644
> --- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> +++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> @@ -705,6 +705,7 @@ static const struct acpi_device_id int3400_thermal_match[] = {
> {"INTC1040", 0},
> {"INTC1041", 0},
> {"INTC1042", 0},
> + {"INTC1068", 0},
> {"INTC10A0", 0},
> {}
> };
> diff --git a/drivers/thermal/intel/int340x_thermal/int3403_thermal.c b/drivers/thermal/intel/int340x_thermal/int3403_thermal.c
> index 9b33fd3a66da..86901f9f54d8 100644
> --- a/drivers/thermal/intel/int340x_thermal/int3403_thermal.c
> +++ b/drivers/thermal/intel/int340x_thermal/int3403_thermal.c
> @@ -284,6 +284,7 @@ static const struct acpi_device_id int3403_device_ids[] = {
> {"INTC1043", 0},
> {"INTC1046", 0},
> {"INTC1062", 0},
> + {"INTC1069", 0},
> {"INTC10A1", 0},
> {"", 0},
> };
> --
Applied as 6.10 material, thanks!
^ permalink raw reply
* Re: [PATCH v5 0/4] Update Energy Model after chip binning adjusted voltages
From: Rafael J. Wysocki @ 2024-04-08 14:06 UTC (permalink / raw)
To: Lukasz Luba
Cc: linux-kernel, linux-pm, rafael, dietmar.eggemann,
linux-arm-kernel, sboyd, nm, linux-samsung-soc, daniel.lezcano,
viresh.kumar, krzysztof.kozlowski, alim.akhtar, m.szyprowski,
mhiramat
In-Reply-To: <20240403154907.1420245-1-lukasz.luba@arm.com>
On Wed, Apr 3, 2024 at 5:49 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Hi all,
>
> This is a follow-up patch aiming to add EM modification due to chip binning.
> The first RFC and the discussion can be found here [1].
>
> It uses Exynos chip driver code as a 1st user. The EM framework has been
> extended to handle this use case easily, when the voltage has been changed
> after setup. On my Odroid-xu4 in some OPPs I can observe ~20% power difference.
> According to that data in driver tables it could be up to ~29%.
>
> This chip binning is applicable to a lot of SoCs, so the EM framework should
> make it easy to update. It uses the existing OPP and DT information to
> re-calculate the new power values.
>
> It has dependency on Exynos SoC driver tree.
>
> Changes:
> v5:
> - adjusted aligning of the function arguments in patch 1/4 (Dietmar)
> - adjusted the in-code comment patch 4/4 (Dietmar)
> - added Reviewed-by to all patches (Dietmar)
> v4:
> - added asterisk in the comment section (test robot)
> - change the patch 2/4 header name and use 'Refactor'
> v3:
> - updated header description patch 2/4 (Dietmar)
> - removed 2 sentences from comment and adjusted in patch 3/4 (Dietmar)
> - patch 4/4 re-phrased code comment (Dietmar)
> - collected tags (Krzysztof, Viresh)
> v2:
> - removed 'ret' from error message which wasn't initialized (Christian)
> v1:
> - exported the OPP calculation function from the OPP/OF so it can be
> used from EM fwk (Viresh)
> - refactored EM updating function to re-use common code
> - added new EM function which can be used by chip device drivers which
> modify the voltage in OPPs
> RFC is at [1]
>
> Regards,
> Lukasz Luba
>
> [1] https://lore.kernel.org/lkml/20231220110339.1065505-1-lukasz.luba@arm.com/
>
> Lukasz Luba (4):
> OPP: OF: Export dev_opp_pm_calc_power() for usage from EM
> PM: EM: Refactor em_adjust_new_capacity()
> PM: EM: Add em_dev_update_chip_binning()
> soc: samsung: exynos-asv: Update Energy Model after adjusting voltage
>
> drivers/opp/of.c | 17 +++--
> drivers/soc/samsung/exynos-asv.c | 10 ++-
> include/linux/energy_model.h | 5 ++
> include/linux/pm_opp.h | 8 +++
> kernel/power/energy_model.c | 106 +++++++++++++++++++++++++------
> 5 files changed, 121 insertions(+), 25 deletions(-)
>
> --
All patches in the series applied as 6.10 material, thanks!
^ permalink raw reply
* Re: [PATCH v2] PM: s2idle: Make sure CPUs will wakeup directly on resume
From: Rafael J. Wysocki @ 2024-04-08 13:39 UTC (permalink / raw)
To: Ulf Hansson, Anna-Maria Behnsen
Cc: linux-kernel, Rafael J . Wysocki, Pavel Machek, Len Brown,
linux-pm, Frederic Weisbecker, Thomas Gleixner, x86,
Mario Limonciello, stable
In-Reply-To: <CAPDyKFo7KT4V8Nvn58N3mNfeW6ai=-5hampjN7N19kYaR7zdVA@mail.gmail.com>
On Mon, Apr 8, 2024 at 2:43 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Mon, 8 Apr 2024 at 09:02, Anna-Maria Behnsen
> <anna-maria@linutronix.de> wrote:
> >
> > s2idle works like a regular suspend with freezing processes and freezing
> > devices. All CPUs except the control CPU go into idle. Once this is
> > completed the control CPU kicks all other CPUs out of idle, so that they
> > reenter the idle loop and then enter s2idle state. The control CPU then
> > issues an swait() on the suspend state and therefore enters the idle loop
> > as well.
> >
> > Due to being kicked out of idle, the other CPUs leave their NOHZ states,
> > which means the tick is active and the corresponding hrtimer is programmed
> > to the next jiffie.
> >
> > On entering s2idle the CPUs shut down their local clockevent device to
> > prevent wakeups. The last CPU which enters s2idle shuts down its local
> > clockevent and freezes timekeeping.
> >
> > On resume, one of the CPUs receives the wakeup interrupt, unfreezes
> > timekeeping and its local clockevent and starts the resume process. At that
> > point all other CPUs are still in s2idle with their clockevents switched
> > off. They only resume when they are kicked by another CPU or after resuming
> > devices and then receiving a device interrupt.
> >
> > That means there is no guarantee that all CPUs will wakeup directly on
> > resume. As a consequence there is no guarantee that timers which are queued
> > on those CPUs and should expire directly after resume, are handled. Also
> > timer list timers which are remotely queued to one of those CPUs after
> > resume will not result in a reprogramming IPI as the tick is
> > active. Queueing a hrtimer will also not result in a reprogramming IPI
> > because the first hrtimer event is already in the past.
> >
> > The recent introduction of the timer pull model (7ee988770326 ("timers:
> > Implement the hierarchical pull model")) amplifies this problem, if the
> > current migrator is one of the non woken up CPUs. When a non pinned timer
> > list timer is queued and the queuing CPU goes idle, it relies on the still
> > suspended migrator CPU to expire the timer which will happen by chance.
> >
> > The problem exists since commit 8d89835b0467 ("PM: suspend: Do not pause
> > cpuidle in the suspend-to-idle path"). There the cpuidle_pause() call which
> > in turn invoked a wakeup for all idle CPUs was moved to a later point in
> > the resume process. This might not be reached or reached very late because
> > it waits on a timer of a still suspended CPU.
> >
> > Address this by kicking all CPUs out of idle after the control CPU returns
> > from swait() so that they resume their timers and restore consistent system
> > state.
> >
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218641
> > Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
> > Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> > Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> > Tested-by: Mario Limonciello <mario.limonciello@amd.com>
> > Cc: stable@kernel.org
>
> Thanks for the detailed commit message! Please add:
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Applied as 6.9-rc material, many thanks to everyone involved!
^ permalink raw reply
* Re: [RFC PATCH 1/1] clk: sunxi-ng: h6-r: add GPU power domain
From: Ulf Hansson @ 2024-04-08 12:55 UTC (permalink / raw)
To: Andre Przywara
Cc: Jernej Skrabec, Chen-Yu Tsai, Samuel Holland, Michael Turquette,
Stephen Boyd, linux-clk, linux-pm, linux-arm-kernel, linux-sunxi
In-Reply-To: <20240225160616.15001-2-andre.przywara@arm.com>
On Sun, 25 Feb 2024 at 17:08, Andre Przywara <andre.przywara@arm.com> wrote:
>
> The Allwinner H616 features register 0x7010254 in the PRCM MMIO frame,
> where bit 0 needs to be cleared to enable operation of the Mali GPU.
> With this bit set (the reset default), any access to the Mali registers
> hangs the bus and thus the whole system. The BSP code clears this bit
> in U-Boot and their kernel never touches it again.
Is the bit representing a reset or power-rail? If it's a reset, it's
probably better to model it like that.
>
> Register a power-domain device to control this bit. Since we claim this
> MMIO region in the H6 R-CCU driver, add the code here, so that we don't
> need to artificially split the MMIO range in the DT.
> Since there seems to be at least one other register with similar behaviour
> nearby (0x7010260), make the power domain take one cell, even though we
> only support domain #0 for now.
Seems like we need some updated DT bindings too to cover this?
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
> drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c | 84 ++++++++++++++++++++++++++
> 1 file changed, 84 insertions(+)
>
> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
> index 02b28cfc5525e..363fb7a71e9f5 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
> @@ -4,9 +4,11 @@
> */
>
> #include <linux/clk-provider.h>
> +#include <linux/io.h>
> #include <linux/module.h>
> #include <linux/of.h>
> #include <linux/platform_device.h>
> +#include <linux/pm_domain.h>
>
> #include "ccu_common.h"
> #include "ccu_reset.h"
> @@ -217,6 +219,86 @@ static const struct sunxi_ccu_desc sun50i_h616_r_ccu_desc = {
> .num_resets = ARRAY_SIZE(sun50i_h616_r_ccu_resets),
> };
>
> +#define PD_H616_GPU_REG 0x254
> +
> +struct sun50i_h616_ppu_pd {
> + struct generic_pm_domain genpd;
> + void __iomem *base;
> +};
> +
> +#define to_sun50i_h616_ppu_pd(_genpd) \
> + container_of(_genpd, struct sun50i_h616_ppu_pd, genpd)
> +
> +static bool sun50i_h616_ppu_power_status(const struct sun50i_h616_ppu_pd *pd)
> +{
> + return !readl(pd->base + PD_H616_GPU_REG);
> +}
> +
> +static int sun50i_h616_ppu_pd_set_power(const struct sun50i_h616_ppu_pd *pd,
> + bool power_on)
> +{
> + if (power_on)
> + writel(0, pd->base + PD_H616_GPU_REG);
> + else
> + writel(1, pd->base + PD_H616_GPU_REG);
> +
> + return 0;
> +}
> +
> +static int sun50i_h616_ppu_pd_power_on(struct generic_pm_domain *genpd)
> +{
> + const struct sun50i_h616_ppu_pd *pd = to_sun50i_h616_ppu_pd(genpd);
> +
> + return sun50i_h616_ppu_pd_set_power(pd, true);
> +}
> +
> +static int sun50i_h616_ppu_pd_power_off(struct generic_pm_domain *genpd)
> +{
> + const struct sun50i_h616_ppu_pd *pd = to_sun50i_h616_ppu_pd(genpd);
> +
> + return sun50i_h616_ppu_pd_set_power(pd, false);
> +}
> +
> +static int sun50i_h616_register_ppu(struct platform_device *pdev,
> + void __iomem *base)
> +{
> + struct device *dev = &pdev->dev;
> + struct genpd_onecell_data *ppu;
> + struct sun50i_h616_ppu_pd *pd;
> + int ret;
> +
> + pd = devm_kzalloc(dev, sizeof(*pd), GFP_KERNEL);
> + if (!pd)
> + return -ENOMEM;
> +
> + ppu = devm_kzalloc(dev, sizeof(*ppu), GFP_KERNEL);
> + if (!ppu)
> + return -ENOMEM;
> +
> + ppu->domains = devm_kzalloc(dev, sizeof(*ppu->domains), GFP_KERNEL);
> + if (!ppu->domains)
> + return -ENOMEM;
> +
> + ppu->num_domains = 1;
> + pd->genpd.name = "GPU";
> + pd->genpd.power_off = sun50i_h616_ppu_pd_power_off;
> + pd->genpd.power_on = sun50i_h616_ppu_pd_power_on;
> + pd->base = base;
> +
> + ret = pm_genpd_init(&pd->genpd, NULL, !sun50i_h616_ppu_power_status(pd));
> + if (ret) {
> + dev_warn(dev, "Failed to add GPU power domain: %d\n", ret);
> + return ret;
> + }
> +
> + ppu->domains[0] = &pd->genpd;
> + ret = of_genpd_add_provider_onecell(dev->of_node, ppu);
> + if (ret)
> + dev_warn(dev, "Failed to add provider: %d\n", ret);
> +
> + return 0;
> +}
> +
> static int sun50i_h6_r_ccu_probe(struct platform_device *pdev)
> {
> const struct sunxi_ccu_desc *desc;
> @@ -230,6 +312,8 @@ static int sun50i_h6_r_ccu_probe(struct platform_device *pdev)
> if (IS_ERR(reg))
> return PTR_ERR(reg);
>
> + sun50i_h616_register_ppu(pdev, reg);
> +
> return devm_sunxi_ccu_probe(&pdev->dev, reg, desc);
> }
>
In general (for maintenance reasons) it's a good idea to put genpd
providers under drivers/pmdomain/*. It looks like that should work
this case too, right?
Kind regards
Uffe
^ permalink raw reply
* Re: [PATCH v2] PM: s2idle: Make sure CPUs will wakeup directly on resume
From: Ulf Hansson @ 2024-04-08 12:42 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: linux-kernel, Rafael J . Wysocki, Pavel Machek, Len Brown,
linux-pm, Frederic Weisbecker, Thomas Gleixner, x86,
Mario Limonciello, stable
In-Reply-To: <87r0fg5ocg.fsf@somnus>
On Mon, 8 Apr 2024 at 09:02, Anna-Maria Behnsen
<anna-maria@linutronix.de> wrote:
>
> s2idle works like a regular suspend with freezing processes and freezing
> devices. All CPUs except the control CPU go into idle. Once this is
> completed the control CPU kicks all other CPUs out of idle, so that they
> reenter the idle loop and then enter s2idle state. The control CPU then
> issues an swait() on the suspend state and therefore enters the idle loop
> as well.
>
> Due to being kicked out of idle, the other CPUs leave their NOHZ states,
> which means the tick is active and the corresponding hrtimer is programmed
> to the next jiffie.
>
> On entering s2idle the CPUs shut down their local clockevent device to
> prevent wakeups. The last CPU which enters s2idle shuts down its local
> clockevent and freezes timekeeping.
>
> On resume, one of the CPUs receives the wakeup interrupt, unfreezes
> timekeeping and its local clockevent and starts the resume process. At that
> point all other CPUs are still in s2idle with their clockevents switched
> off. They only resume when they are kicked by another CPU or after resuming
> devices and then receiving a device interrupt.
>
> That means there is no guarantee that all CPUs will wakeup directly on
> resume. As a consequence there is no guarantee that timers which are queued
> on those CPUs and should expire directly after resume, are handled. Also
> timer list timers which are remotely queued to one of those CPUs after
> resume will not result in a reprogramming IPI as the tick is
> active. Queueing a hrtimer will also not result in a reprogramming IPI
> because the first hrtimer event is already in the past.
>
> The recent introduction of the timer pull model (7ee988770326 ("timers:
> Implement the hierarchical pull model")) amplifies this problem, if the
> current migrator is one of the non woken up CPUs. When a non pinned timer
> list timer is queued and the queuing CPU goes idle, it relies on the still
> suspended migrator CPU to expire the timer which will happen by chance.
>
> The problem exists since commit 8d89835b0467 ("PM: suspend: Do not pause
> cpuidle in the suspend-to-idle path"). There the cpuidle_pause() call which
> in turn invoked a wakeup for all idle CPUs was moved to a later point in
> the resume process. This might not be reached or reached very late because
> it waits on a timer of a still suspended CPU.
>
> Address this by kicking all CPUs out of idle after the control CPU returns
> from swait() so that they resume their timers and restore consistent system
> state.
>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218641
> Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Mario Limonciello <mario.limonciello@amd.com>
> Cc: stable@kernel.org
Thanks for the detailed commit message! Please add:
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Kind regards
Uffe
> ---
> v2: Fix typos in commit message
> ---
> kernel/power/suspend.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -106,6 +106,12 @@ static void s2idle_enter(void)
> swait_event_exclusive(s2idle_wait_head,
> s2idle_state == S2IDLE_STATE_WAKE);
>
> + /*
> + * Kick all CPUs to ensure that they resume their timers and restore
> + * consistent system state.
> + */
> + wake_up_all_idle_cpus();
> +
> cpus_read_unlock();
>
> raw_spin_lock_irq(&s2idle_lock);
^ permalink raw reply
* Re: [PATCH 3/3] hwmon: (coretemp) Use a model-specific bitmask to read registers
From: Guenter Roeck @ 2024-04-08 11:40 UTC (permalink / raw)
To: Zhang, Rui
Cc: ricardo.neri-calderon@linux.intel.com, Neri, Ricardo,
linux-hwmon@vger.kernel.org, lukasz.luba@arm.com,
daniel.lezcano@linaro.org, jdelvare@suse.com,
srinivas.pandruvada@linux.intel.com, linux-kernel@vger.kernel.org,
Wysocki, Rafael J, linux-pm@vger.kernel.org
In-Reply-To: <d0b5ae04b4d08e2003114c4d6b6d3a040f585995.camel@intel.com>
On Sun, Apr 07, 2024 at 08:39:51AM +0000, Zhang, Rui wrote:
> > I do not think it is appropriate to make a hardware monitoring driver
> > depend on the thermal subsystem.
> >
> > NAK in the current form.
> >
> Hi, Guenter,
>
> Thanks for reviewing.
>
> We've seen a couple of hwmon drivers depends on or imply THERMAL.
> That's why we think this is an applicable solution.
So far this was - unless someone sneaked something by - for drivers
which registered thermal zones, not for calling code which resides
inside thermal subsystem.
> Using the intel_tcc APIs can effectively reduce the future maintenance
> burden because we don't need to duplicate the model list in multiple
> places.
>
> or do you have any suggestions?
The exported code should reside outside the thermal subsystem.
Also, as implemented, if INTEL_TCC=n, the returned temperature mask value
is 0x7f, and the offset mask is 0. So the alternative would be to just use
those values unconditionally since apparently that is sufficient.
Thanks,
Guenter
^ permalink raw reply
* Re: [PATCH 2/2] treewide: Use sysfs_bin_attr_simple_read() helper
From: Michael Ellerman @ 2024-04-08 11:11 UTC (permalink / raw)
To: Lukas Wunner, Greg Kroah-Hartman, Rafael J. Wysocki, linux-kernel
Cc: linuxppc-dev, linux-acpi, Jean Delvare, Ard Biesheuvel, linux-efi,
Zhenyu Wang, Zhi Wang, intel-gvt-dev, Daniel Lezcano, linux-pm,
Luis Chamberlain, linux-modules
In-Reply-To: <92ee0a0e83a5a3f3474845db6c8575297698933a.1712410202.git.lukas@wunner.de>
Lukas Wunner <lukas@wunner.de> writes:
> Deduplicate ->read() callbacks of bin_attributes which are backed by a
> simple buffer in memory:
>
> Use the newly introduced sysfs_bin_attr_simple_read() helper instead,
> either by referencing it directly or by declaring such bin_attributes
> with BIN_ATTR_SIMPLE_RO() or BIN_ATTR_SIMPLE_ADMIN_RO().
>
> Aside from a reduction of LoC, this shaves off a few bytes from vmlinux
> (304 bytes on an x86_64 allyesconfig).
>
> No functional change intended.
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
> arch/powerpc/platforms/powernv/opal.c | 10 +--------
> drivers/acpi/bgrt.c | 9 +-------
> drivers/firmware/dmi_scan.c | 12 ++--------
> drivers/firmware/efi/rci2-table.c | 10 +--------
> drivers/gpu/drm/i915/gvt/firmware.c | 26 +++++-----------------
> .../intel/int340x_thermal/int3400_thermal.c | 9 +-------
> init/initramfs.c | 10 +--------
> kernel/module/sysfs.c | 13 +----------
> 8 files changed, 14 insertions(+), 85 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
> index 45dd77e..5d0f35b 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -792,14 +792,6 @@ static int __init opal_sysfs_init(void)
> return 0;
> }
>
> -static ssize_t export_attr_read(struct file *fp, struct kobject *kobj,
> - struct bin_attribute *bin_attr, char *buf,
> - loff_t off, size_t count)
> -{
> - return memory_read_from_buffer(buf, count, &off, bin_attr->private,
> - bin_attr->size);
> -}
> -
> static int opal_add_one_export(struct kobject *parent, const char *export_name,
> struct device_node *np, const char *prop_name)
> {
> @@ -826,7 +818,7 @@ static int opal_add_one_export(struct kobject *parent, const char *export_name,
> sysfs_bin_attr_init(attr);
> attr->attr.name = name;
> attr->attr.mode = 0400;
> - attr->read = export_attr_read;
> + attr->read = sysfs_bin_attr_simple_read;
> attr->private = __va(vals[0]);
> attr->size = vals[1];
I gave it a quick boot and checked I could still read the attributes,
everything seems fine.
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
cheers
^ permalink raw reply
* Re: [PATCH 2/2] treewide: Use sysfs_bin_attr_simple_read() helper
From: Zhi Wang @ 2024-04-08 10:42 UTC (permalink / raw)
To: Lukas Wunner
Cc: Greg Kroah-Hartman, Rafael J. Wysocki, linux-kernel,
Michael Ellerman, linuxppc-dev, linux-acpi, Jean Delvare,
Ard Biesheuvel, linux-efi, Zhenyu Wang, Zhi Wang, intel-gvt-dev,
Daniel Lezcano, linux-pm, Luis Chamberlain, linux-modules
In-Reply-To: <92ee0a0e83a5a3f3474845db6c8575297698933a.1712410202.git.lukas@wunner.de>
On Sat, 6 Apr 2024 15:52:02 +0200
Lukas Wunner <lukas@wunner.de> wrote:
> Deduplicate ->read() callbacks of bin_attributes which are backed by a
> simple buffer in memory:
>
> Use the newly introduced sysfs_bin_attr_simple_read() helper instead,
> either by referencing it directly or by declaring such bin_attributes
> with BIN_ATTR_SIMPLE_RO() or BIN_ATTR_SIMPLE_ADMIN_RO().
>
> Aside from a reduction of LoC, this shaves off a few bytes from
> vmlinux (304 bytes on an x86_64 allyesconfig).
>
> No functional change intended.
>
As for GVT, looks good to me.
Acked-by: Zhi Wang <zhiwang@kernel.org>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
> arch/powerpc/platforms/powernv/opal.c | 10 +--------
> drivers/acpi/bgrt.c | 9 +-------
> drivers/firmware/dmi_scan.c | 12 ++--------
> drivers/firmware/efi/rci2-table.c | 10 +--------
> drivers/gpu/drm/i915/gvt/firmware.c | 26
> +++++----------------- .../intel/int340x_thermal/int3400_thermal.c
> | 9 +------- init/initramfs.c
> | 10 +-------- kernel/module/sysfs.c |
> 13 +---------- 8 files changed, 14 insertions(+), 85 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/opal.c
> b/arch/powerpc/platforms/powernv/opal.c index 45dd77e..5d0f35b 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -792,14 +792,6 @@ static int __init opal_sysfs_init(void)
> return 0;
> }
>
> -static ssize_t export_attr_read(struct file *fp, struct kobject
> *kobj,
> - struct bin_attribute *bin_attr, char
> *buf,
> - loff_t off, size_t count)
> -{
> - return memory_read_from_buffer(buf, count, &off,
> bin_attr->private,
> - bin_attr->size);
> -}
> -
> static int opal_add_one_export(struct kobject *parent, const char
> *export_name, struct device_node *np, const char *prop_name)
> {
> @@ -826,7 +818,7 @@ static int opal_add_one_export(struct kobject
> *parent, const char *export_name, sysfs_bin_attr_init(attr);
> attr->attr.name = name;
> attr->attr.mode = 0400;
> - attr->read = export_attr_read;
> + attr->read = sysfs_bin_attr_simple_read;
> attr->private = __va(vals[0]);
> attr->size = vals[1];
>
> diff --git a/drivers/acpi/bgrt.c b/drivers/acpi/bgrt.c
> index e4fb9e2..d1d9c92 100644
> --- a/drivers/acpi/bgrt.c
> +++ b/drivers/acpi/bgrt.c
> @@ -29,14 +29,7 @@
> BGRT_SHOW(xoffset, image_offset_x);
> BGRT_SHOW(yoffset, image_offset_y);
>
> -static ssize_t image_read(struct file *file, struct kobject *kobj,
> - struct bin_attribute *attr, char *buf, loff_t off,
> size_t count) -{
> - memcpy(buf, attr->private + off, count);
> - return count;
> -}
> -
> -static BIN_ATTR_RO(image, 0); /* size gets filled in later */
> +static BIN_ATTR_SIMPLE_RO(image);
>
> static struct attribute *bgrt_attributes[] = {
> &bgrt_attr_version.attr,
> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> index 015c95a..3d0f773 100644
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -746,16 +746,8 @@ static void __init dmi_scan_machine(void)
> pr_info("DMI not present or invalid.\n");
> }
>
> -static ssize_t raw_table_read(struct file *file, struct kobject
> *kobj,
> - struct bin_attribute *attr, char *buf,
> - loff_t pos, size_t count)
> -{
> - memcpy(buf, attr->private + pos, count);
> - return count;
> -}
> -
> -static BIN_ATTR(smbios_entry_point, S_IRUSR, raw_table_read, NULL,
> 0); -static BIN_ATTR(DMI, S_IRUSR, raw_table_read, NULL, 0);
> +static BIN_ATTR_SIMPLE_ADMIN_RO(smbios_entry_point);
> +static BIN_ATTR_SIMPLE_ADMIN_RO(DMI);
>
> static int __init dmi_init(void)
> {
> diff --git a/drivers/firmware/efi/rci2-table.c
> b/drivers/firmware/efi/rci2-table.c index de1a9a1..4fd45d6 100644
> --- a/drivers/firmware/efi/rci2-table.c
> +++ b/drivers/firmware/efi/rci2-table.c
> @@ -40,15 +40,7 @@ struct rci2_table_global_hdr {
> static u32 rci2_table_len;
> unsigned long rci2_table_phys __ro_after_init =
> EFI_INVALID_TABLE_ADDR;
> -static ssize_t raw_table_read(struct file *file, struct kobject
> *kobj,
> - struct bin_attribute *attr, char *buf,
> - loff_t pos, size_t count)
> -{
> - memcpy(buf, attr->private + pos, count);
> - return count;
> -}
> -
> -static BIN_ATTR(rci2, S_IRUSR, raw_table_read, NULL, 0);
> +static BIN_ATTR_SIMPLE_ADMIN_RO(rci2);
>
> static u16 checksum(void)
> {
> diff --git a/drivers/gpu/drm/i915/gvt/firmware.c
> b/drivers/gpu/drm/i915/gvt/firmware.c index 4dd52ac..5e66a26 100644
> --- a/drivers/gpu/drm/i915/gvt/firmware.c
> +++ b/drivers/gpu/drm/i915/gvt/firmware.c
> @@ -50,21 +50,7 @@ struct gvt_firmware_header {
>
> #define dev_to_drm_minor(d) dev_get_drvdata((d))
>
> -static ssize_t
> -gvt_firmware_read(struct file *filp, struct kobject *kobj,
> - struct bin_attribute *attr, char *buf,
> - loff_t offset, size_t count)
> -{
> - memcpy(buf, attr->private + offset, count);
> - return count;
> -}
> -
> -static struct bin_attribute firmware_attr = {
> - .attr = {.name = "gvt_firmware", .mode = (S_IRUSR)},
> - .read = gvt_firmware_read,
> - .write = NULL,
> - .mmap = NULL,
> -};
> +static BIN_ATTR_SIMPLE_ADMIN_RO(gvt_firmware);
>
> static int expose_firmware_sysfs(struct intel_gvt *gvt)
> {
> @@ -107,10 +93,10 @@ static int expose_firmware_sysfs(struct
> intel_gvt *gvt) crc32_start = offsetof(struct gvt_firmware_header,
> version); h->crc32 = crc32_le(0, firmware + crc32_start, size -
> crc32_start);
> - firmware_attr.size = size;
> - firmware_attr.private = firmware;
> + bin_attr_gvt_firmware.size = size;
> + bin_attr_gvt_firmware.private = firmware;
>
> - ret = device_create_bin_file(&pdev->dev, &firmware_attr);
> + ret = device_create_bin_file(&pdev->dev,
> &bin_attr_gvt_firmware); if (ret) {
> vfree(firmware);
> return ret;
> @@ -122,8 +108,8 @@ static void clean_firmware_sysfs(struct intel_gvt
> *gvt) {
> struct pci_dev *pdev = to_pci_dev(gvt->gt->i915->drm.dev);
>
> - device_remove_bin_file(&pdev->dev, &firmware_attr);
> - vfree(firmware_attr.private);
> + device_remove_bin_file(&pdev->dev, &bin_attr_gvt_firmware);
> + vfree(bin_attr_gvt_firmware.private);
> }
>
> /**
> diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c index
> 427d370..6d4b51a 100644 ---
> a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c +++
> b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c @@ -73,14
> +73,7 @@ struct odvp_attr { struct device_attribute attr;
> };
>
> -static ssize_t data_vault_read(struct file *file, struct kobject
> *kobj,
> - struct bin_attribute *attr, char *buf, loff_t off,
> size_t count) -{
> - memcpy(buf, attr->private + off, count);
> - return count;
> -}
> -
> -static BIN_ATTR_RO(data_vault, 0);
> +static BIN_ATTR_SIMPLE_RO(data_vault);
>
> static struct bin_attribute *data_attributes[] = {
> &bin_attr_data_vault,
> diff --git a/init/initramfs.c b/init/initramfs.c
> index da79760..5193fae 100644
> --- a/init/initramfs.c
> +++ b/init/initramfs.c
> @@ -575,15 +575,7 @@ static int __init initramfs_async_setup(char
> *str) #include <linux/initrd.h>
> #include <linux/kexec.h>
>
> -static ssize_t raw_read(struct file *file, struct kobject *kobj,
> - struct bin_attribute *attr, char *buf,
> - loff_t pos, size_t count)
> -{
> - memcpy(buf, attr->private + pos, count);
> - return count;
> -}
> -
> -static BIN_ATTR(initrd, 0440, raw_read, NULL, 0);
> +static BIN_ATTR(initrd, 0440, sysfs_bin_attr_simple_read, NULL, 0);
>
> void __init reserve_initrd_mem(void)
> {
> diff --git a/kernel/module/sysfs.c b/kernel/module/sysfs.c
> index d964167..26efe13 100644
> --- a/kernel/module/sysfs.c
> +++ b/kernel/module/sysfs.c
> @@ -146,17 +146,6 @@ struct module_notes_attrs {
> struct bin_attribute attrs[] __counted_by(notes);
> };
>
> -static ssize_t module_notes_read(struct file *filp, struct kobject
> *kobj,
> - struct bin_attribute *bin_attr,
> - char *buf, loff_t pos, size_t count)
> -{
> - /*
> - * The caller checked the pos and count against our size.
> - */
> - memcpy(buf, bin_attr->private + pos, count);
> - return count;
> -}
> -
> static void free_notes_attrs(struct module_notes_attrs *notes_attrs,
> unsigned int i)
> {
> @@ -205,7 +194,7 @@ static void add_notes_attrs(struct module *mod,
> const struct load_info *info) nattr->attr.mode = 0444;
> nattr->size = info->sechdrs[i].sh_size;
> nattr->private = (void
> *)info->sechdrs[i].sh_addr;
> - nattr->read = module_notes_read;
> + nattr->read = sysfs_bin_attr_simple_read;
> ++nattr;
> }
> ++loaded;
^ permalink raw reply
* [PATCH v2] cppc_cpufreq: Fix possible null pointer dereference
From: Aleksandr Mishin @ 2024-04-08 9:35 UTC (permalink / raw)
To: Ionela Voinescu
Cc: Aleksandr Mishin, Rafael J. Wysocki, Viresh Kumar, linux-pm,
linux-kernel, lvc-project
In-Reply-To: <20240405094005.18545-1-amishin@t-argos.ru>
cppc_cpufreq_get_rate() and hisi_cppc_cpufreq_get_rate() can be called from
different places with various parameters. So cpufreq_cpu_get() can return
null as 'policy' in some circumstances.
Fix this bug by adding null return check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: a28b2bfc099c ("cppc_cpufreq: replace per-cpu data array with a list")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
---
v2: Fix mixed declarations
drivers/cpufreq/cppc_cpufreq.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 64420d9cfd1e..15f1d41920a3 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -741,10 +741,15 @@ static unsigned int cppc_cpufreq_get_rate(unsigned int cpu)
{
struct cppc_perf_fb_ctrs fb_ctrs_t0 = {0}, fb_ctrs_t1 = {0};
struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
- struct cppc_cpudata *cpu_data = policy->driver_data;
+ struct cppc_cpudata *cpu_data;
u64 delivered_perf;
int ret;
+ if (!policy)
+ return -ENODEV;
+
+ cpu_data = policy->driver_data;
+
cpufreq_cpu_put(policy);
ret = cppc_get_perf_ctrs(cpu, &fb_ctrs_t0);
@@ -822,10 +827,15 @@ static struct cpufreq_driver cppc_cpufreq_driver = {
static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu)
{
struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
- struct cppc_cpudata *cpu_data = policy->driver_data;
+ struct cppc_cpudata *cpu_data;
u64 desired_perf;
int ret;
+ if (!policy)
+ return -ENODEV;
+
+ cpu_data = policy->driver_data;
+
cpufreq_cpu_put(policy);
ret = cppc_get_desired_perf(cpu, &desired_perf);
--
2.30.2
^ permalink raw reply related
* [Bug 218689] AMD_Pstate_EPP Ryzen 7000 issues. Freezing and static sound
From: bugzilla-daemon @ 2024-04-08 9:16 UTC (permalink / raw)
To: linux-pm
In-Reply-To: <bug-218689-137361@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=218689
Artem S. Tashkinov (aros@gmx.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC|aros@gmx.com |
--- Comment #7 from Artem S. Tashkinov (aros@gmx.com) ---
> if there were* to be a hardware issue^^^
AFAIK you're the _only_ person using Linux who experiences such issues. So far
there have been _zero_ reports about complications related to the amd-pstate
driver.
It may not work at all but it doesn't and cannot lead to freezes or sounds.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply
* Re: [PATCH 0/2] Deduplicate bin_attribute simple read() callbacks
From: Ard Biesheuvel @ 2024-04-08 8:42 UTC (permalink / raw)
To: Lukas Wunner
Cc: Greg Kroah-Hartman, Rafael J. Wysocki, linux-kernel,
Michael Ellerman, linuxppc-dev, linux-acpi, Jean Delvare,
linux-efi, Zhenyu Wang, Zhi Wang, intel-gvt-dev, Daniel Lezcano,
linux-pm, Luis Chamberlain, linux-modules
In-Reply-To: <cover.1712410202.git.lukas@wunner.de>
On Sat, 6 Apr 2024 at 15:52, Lukas Wunner <lukas@wunner.de> wrote:
>
> For my upcoming PCI device authentication v2 patches, I have the need
> to expose a simple buffer in virtual memory as a bin_attribute.
>
> It turns out we've duplicated the ->read() callback for such simple
> buffers a fair number of times across the tree.
>
> So instead of reinventing the wheel, I decided to introduce a common
> helper and eliminate all duplications I could find.
>
> I'm open to a bikeshedding discussion on the sysfs_bin_attr_simple_read()
> name. ;)
>
> Lukas Wunner (2):
> sysfs: Add sysfs_bin_attr_simple_read() helper
> treewide: Use sysfs_bin_attr_simple_read() helper
>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
> arch/powerpc/platforms/powernv/opal.c | 10 +-------
> drivers/acpi/bgrt.c | 9 +-------
> drivers/firmware/dmi_scan.c | 12 ++--------
> drivers/firmware/efi/rci2-table.c | 10 +-------
> drivers/gpu/drm/i915/gvt/firmware.c | 26 +++++----------------
> .../intel/int340x_thermal/int3400_thermal.c | 9 +-------
> fs/sysfs/file.c | 27 ++++++++++++++++++++++
> include/linux/sysfs.h | 15 ++++++++++++
> init/initramfs.c | 10 +-------
> kernel/module/sysfs.c | 13 +----------
> 10 files changed, 56 insertions(+), 85 deletions(-)
>
> --
> 2.43.0
>
^ permalink raw reply
* Re: [PATCH v2] PM: s2idle: Make sure CPUs will wakeup directly on resume
From: Peter Zijlstra @ 2024-04-08 8:40 UTC (permalink / raw)
To: Anna-Maria Behnsen
Cc: linux-kernel, Rafael J . Wysocki, Pavel Machek, Len Brown,
Ulf Hansson, linux-pm, Frederic Weisbecker, Thomas Gleixner, x86,
Mario Limonciello, stable
In-Reply-To: <87r0fg5ocg.fsf@somnus>
On Mon, Apr 08, 2024 at 09:02:23AM +0200, Anna-Maria Behnsen wrote:
> s2idle works like a regular suspend with freezing processes and freezing
> devices. All CPUs except the control CPU go into idle. Once this is
> completed the control CPU kicks all other CPUs out of idle, so that they
> reenter the idle loop and then enter s2idle state. The control CPU then
> issues an swait() on the suspend state and therefore enters the idle loop
> as well.
>
> Due to being kicked out of idle, the other CPUs leave their NOHZ states,
> which means the tick is active and the corresponding hrtimer is programmed
> to the next jiffie.
>
> On entering s2idle the CPUs shut down their local clockevent device to
> prevent wakeups. The last CPU which enters s2idle shuts down its local
> clockevent and freezes timekeeping.
>
> On resume, one of the CPUs receives the wakeup interrupt, unfreezes
> timekeeping and its local clockevent and starts the resume process. At that
> point all other CPUs are still in s2idle with their clockevents switched
> off. They only resume when they are kicked by another CPU or after resuming
> devices and then receiving a device interrupt.
>
> That means there is no guarantee that all CPUs will wakeup directly on
> resume. As a consequence there is no guarantee that timers which are queued
> on those CPUs and should expire directly after resume, are handled. Also
> timer list timers which are remotely queued to one of those CPUs after
> resume will not result in a reprogramming IPI as the tick is
> active. Queueing a hrtimer will also not result in a reprogramming IPI
> because the first hrtimer event is already in the past.
>
> The recent introduction of the timer pull model (7ee988770326 ("timers:
> Implement the hierarchical pull model")) amplifies this problem, if the
> current migrator is one of the non woken up CPUs. When a non pinned timer
> list timer is queued and the queuing CPU goes idle, it relies on the still
> suspended migrator CPU to expire the timer which will happen by chance.
>
> The problem exists since commit 8d89835b0467 ("PM: suspend: Do not pause
> cpuidle in the suspend-to-idle path"). There the cpuidle_pause() call which
> in turn invoked a wakeup for all idle CPUs was moved to a later point in
> the resume process. This might not be reached or reached very late because
> it waits on a timer of a still suspended CPU.
>
> Address this by kicking all CPUs out of idle after the control CPU returns
> from swait() so that they resume their timers and restore consistent system
> state.
>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218641
> Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Mario Limonciello <mario.limonciello@amd.com>
> Cc: stable@kernel.org
Cute,
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> kernel/power/suspend.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -106,6 +106,12 @@ static void s2idle_enter(void)
> swait_event_exclusive(s2idle_wait_head,
> s2idle_state == S2IDLE_STATE_WAKE);
>
> + /*
> + * Kick all CPUs to ensure that they resume their timers and restore
> + * consistent system state.
> + */
> + wake_up_all_idle_cpus();
> +
> cpus_read_unlock();
>
> raw_spin_lock_irq(&s2idle_lock);
^ permalink raw reply
* Re: [PATCH 2/2] thermal/drivers/mediatek/lvts_thermal: Improve some memory allocation
From: Dan Carpenter @ 2024-04-08 8:09 UTC (permalink / raw)
To: Christophe JAILLET
Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
Matthias Brugger, AngeloGioacchino Del Regno, linux-kernel,
kernel-janitors, linux-pm, linux-arm-kernel, linux-mediatek
In-Reply-To: <8cb69f245311a348164b0b5ca3dbc59386746035.1712520052.git.christophe.jaillet@wanadoo.fr>
On Sun, Apr 07, 2024 at 10:01:49PM +0200, Christophe JAILLET wrote:
> diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
> index 3003dc350766..b133f731c5ba 100644
> --- a/drivers/thermal/mediatek/lvts_thermal.c
> +++ b/drivers/thermal/mediatek/lvts_thermal.c
> @@ -204,7 +204,7 @@ static const struct debugfs_reg32 lvts_regs[] = {
>
> static int lvts_debugfs_init(struct device *dev, struct lvts_domain *lvts_td)
> {
> - struct debugfs_regset32 *regset;
> + struct debugfs_regset32 *regsets;
> struct lvts_ctrl *lvts_ctrl;
> struct dentry *dentry;
> char name[64];
> @@ -214,8 +214,14 @@ static int lvts_debugfs_init(struct device *dev, struct lvts_domain *lvts_td)
> if (IS_ERR(lvts_td->dom_dentry))
> return 0;
>
> + regsets = devm_kcalloc(dev, lvts_td->num_lvts_ctrl,
> + sizeof(*regsets), GFP_KERNEL);
> + if (!regsets)
> + return 0;
I understand that this preserved the behavior from the original code,
but the original code was wrong. This should return -ENOMEM.
> +
> for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
>
> + struct debugfs_regset32 *regset = ®sets[i];
> lvts_ctrl = &lvts_td->lvts_ctrl[i];
The blank line should come after the declaration.
regards,
dan carpenter
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox