All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthias Kaehlcke <mka@chromium.org>
To: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Abel Vesa <abel.vesa@linaro.org>,
	Bjorn Andersson <andersson@kernel.org>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Kevin Hilman <khilman@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Len Brown <len.brown@intel.com>, Pavel Machek <pavel@ucw.cz>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andy Gross <agross@kernel.org>,
	Konrad Dybcio <konrad.dybcio@linaro.org>,
	linux-pm@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-arm-msm@vger.kernel.org, Stephen Boyd <sboyd@kernel.org>,
	Doug Anderson <dianders@chromium.org>
Subject: Re: [RFC PATCH v2 1/2] PM: domains: Skip disabling unused domains if provider has sync_state
Date: Fri, 3 Feb 2023 21:18:37 +0000	[thread overview]
Message-ID: <Y916LZXfwi17uVn5@google.com> (raw)
In-Reply-To: <9b8af6b3-9ab5-12f8-5576-1a93c58a26c1@linaro.org>

On Fri, Feb 03, 2023 at 10:00:27PM +0200, Dmitry Baryshkov wrote:
> On 03/02/2023 03:20, Matthias Kaehlcke wrote:
> > Hi Dmitry,
> > 
> > On Thu, Feb 02, 2023 at 09:53:41PM +0200, Dmitry Baryshkov wrote:
> > > On 02/02/2023 20:24, Matthias Kaehlcke wrote:
> > > > Hi Abel,
> > > > 
> > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote:
> > > > > Currently, there are cases when a domain needs to remain enabled until
> > > > > the consumer driver probes. Sometimes such consumer drivers may be built
> > > > > as modules. Since the genpd_power_off_unused is called too early for
> > > > > such consumer driver modules to get a chance to probe, the domain, since
> > > > > it is unused, will get disabled. On the other hand, the best time for
> > > > > an unused domain to be disabled is on the provider's sync_state
> > > > > callback. So, if the provider has registered a sync_state callback,
> > > > > assume the unused domains for that provider will be disabled on its
> > > > > sync_state callback. Also provide a generic sync_state callback which
> > > > > disables all the domains unused for the provider that registers it.
> > > > > 
> > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
> > > > > ---
> > > > > 
> > > > > This approach has been applied for unused clocks as well.
> > > > > With this patch merged in, all the providers that have sync_state
> > > > > callback registered will leave the domains enabled unless the provider's
> > > > > sync_state callback explicitly disables them. So those providers will
> > > > > need to add the disabling part to their sync_state callback. On the
> > > > > other hand, the platforms that have cases where domains need to remain
> > > > > enabled (even if unused) until the consumer driver probes, will be able,
> > > > > with this patch in, to run without the pd_ignore_unused kernel argument,
> > > > > which seems to be the case for most Qualcomm platforms, at this moment.
> > > > 
> > > > I recently encountered a related issue on a Qualcomm platform with a
> > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use
> > > > highest corner until sync_state"). The issue involves a DT node with a
> > > > rpmhpd, the DT node is enabled, however the corresponding device driver
> > > > is not enabled in the kernel. In such a scenario the sync_state callback
> > > > is never called, because the genpd consumer never probes. As a result
> > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during
> > > > system suspend, which results in a substantially higher power consumption
> > > > in S3.
> > > > 
> > > > I wonder if genpd (and some other frameworks) needs something like
> > > > regulator_init_complete(), which turns off unused regulators 30s after
> > > > system boot. That's conceptually similar to the current
> > > > genpd_power_off_unused(), but would provide time for modules being loaded.
> > > 
> > > I think the overall goal is to move away from ad-hoc implementations like
> > > clk_disable_unused/genpd_power_off_unused/regulator_init_complete towards
> > > the sync_state.
> > 
> > I generally agree with the goal of using common mechanisms whenever possible.
> > 
> > > So inherently one either has to provide drivers for all devices in question
> > > or disable unused devices in DT.
> > 
> > I don't think that's a great solution, it essentially hands the issue down to
> > the users or downstream maintainers of the kernel, who might not be aware that
> > there is an issue, nor know about the specifics of genpd (or interconnects and
> > clocks which have similar problems).
> 
> The goal is to move the control down to individual drivers. Previously we
> had issues with clk_disable_unused() disabling mdss/mdp clocks incorrectly,
> which frequently led to broken display output. Other clock/genpd/regulator
> drivers might have other internal dependencies. Thus it is not really
> possible to handle resource shutdown in the common  (framework) code.
> 
> > 
> > In general symptoms are probably subtle, like a (potentially substantially)
> > increased power consumption during system suspend. The issue might have been
> > introduced by an update to a newer kernel, which now includes a DT node for a
> > new SoC feature which wasn't supported by the 'old' kernel. It's common
> > practice to use the 'old' .config, at least as a starting point, which
> > obviously doesn't enable the new driver. That happend to me with [1] when
> > testing v6.1. It took me quite some time to track the 'culprit' commit down
> > and then some debugging to understand what's going on. Shortly after that I
> > ran into a related issue involving genpds when testing v6.2-rc, which again
> > took a non-trivial amount of time to track down (and I'm familiar with the SoC
> > platform and the general nature of the issue). I don't think it's reasonable
> > to expect every user/downstream maintainer of an impacted system to go through
> > this, one person at a time.
> 
> I think it would be nice to have some way of 'sync_pending' debug available
> (compare this to debugfs/devices_deferred).

Most folks are probably not even aware that they have a 'sync_state' issue and
wouldn't look in debugfs, so I think this would have to be something proactive,
like a warning log that is enabled by default (possibly with the option to
disable it). Something in debugfs could be a nice complement.

> Note, we are trying to make sure that all supported drivers are enabled at
> least as modules (if possible). If we fail, please send a patch fixing the
> defconfig.

That's great, however not everybody uses the defconfig, it's just a default.

> > Maybe there could be a generic solution for drivers with a 'sync_state'
> > callback, e.g. a the driver (or framework) could have a 'sync_state_timeout'
> > callback (or similar), which is called by the driver framework if 'sync_state'
> > wasn't called (for example) 30s after the device was probed. Then the provider
> > can power off or throttle unclaimed resources.
> 
> I might be missing a point somewhere, but for me it looks like a logical
> solution. Please send a proposal.

I started working on a patch, I'll probably send it out next week if I don't
encounter any evident major issues.

  reply	other threads:[~2023-02-03 21:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-27 10:40 [RFC PATCH v2 1/2] PM: domains: Skip disabling unused domains if provider has sync_state Abel Vesa
2023-01-27 10:40 ` [RFC PATCH v2 2/2] soc: qcom: rmphpd: Call the genpd unused power off sync state callback Abel Vesa
2023-01-28 21:53 ` [RFC PATCH v2 1/2] PM: domains: Skip disabling unused domains if provider has sync_state kernel test robot
2023-02-02 18:24 ` Matthias Kaehlcke
2023-02-02 19:20   ` Doug Anderson
2023-02-02 19:53   ` Dmitry Baryshkov
2023-02-02 22:20     ` Doug Anderson
2023-02-06 16:24       ` Abel Vesa
2023-02-06 17:37         ` Matthias Kaehlcke
2023-02-03  1:20     ` Matthias Kaehlcke
2023-02-03 20:00       ` Dmitry Baryshkov
2023-02-03 21:18         ` Matthias Kaehlcke [this message]
2023-02-06 16:21         ` Abel Vesa
2023-02-06 16:39           ` Abel Vesa
2023-02-06 16:31   ` Abel Vesa
2023-02-06 17:22     ` Matthias Kaehlcke
2023-02-06 17:48       ` Abel Vesa
2023-02-06 18:36         ` Matthias Kaehlcke
2023-02-06 19:32         ` Saravana Kannan
2023-02-06 21:10           ` Doug Anderson
2023-02-06 21:35             ` Saravana Kannan
2023-02-07 23:45               ` Doug Anderson
2023-02-20 17:15                 ` Bjorn Andersson
2023-02-21 18:32                   ` Matthias Kaehlcke
2023-02-22 18:51                     ` Bjorn Andersson
2023-02-23 23:20                       ` Matthias Kaehlcke
2023-03-01 22:40                   ` Doug Anderson
2023-02-15 11:57 ` Ulf Hansson
2023-02-15 12:21   ` Abel Vesa
2023-02-20 16:43     ` Bjorn Andersson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y916LZXfwi17uVn5@google.com \
    --to=mka@chromium.org \
    --cc=abel.vesa@linaro.org \
    --cc=agross@kernel.org \
    --cc=andersson@kernel.org \
    --cc=dianders@chromium.org \
    --cc=dmitry.baryshkov@linaro.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=khilman@kernel.org \
    --cc=konrad.dybcio@linaro.org \
    --cc=len.brown@intel.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=pavel@ucw.cz \
    --cc=rafael@kernel.org \
    --cc=sboyd@kernel.org \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.