From: Christian Loehle <christian.loehle@arm.com>
To: "King, Colin" <colin.king@intel.com>,
Bart Van Assche <bvanassche@acm.org>,
Jens Axboe <axboe@kernel.dk>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] cpuidle: psd: add power sleep demotion prevention for fast I/O devices
Date: Tue, 1 Apr 2025 16:15:06 +0100
Message-ID: <743b3472-e1a4-4fdc-ac2d-6e74c51022c4@arm.com>
In-Reply-To: <SJ2PR11MB7670E05E066CCC16AFEA16A18DAC2@SJ2PR11MB7670.namprd11.prod.outlook.com>
On 4/1/25 16:03, King, Colin wrote:
> Hi,
>
> Reply at end..
>
>> -----Original Message-----
>> From: Christian Loehle <christian.loehle@arm.com>
>> Sent: 26 March 2025 16:27
>> To: King, Colin <colin.king@intel.com>; Bart Van Assche
>> <bvanassche@acm.org>; Jens Axboe <axboe@kernel.dk>; Rafael J. Wysocki
>> <rafael@kernel.org>; Daniel Lezcano <daniel.lezcano@linaro.org>;
>> linux-block@vger.kernel.org; linux-pm@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH] cpuidle: psd: add power sleep demotion prevention for
>> fast I/O devices
>>
>> On 3/26/25 15:04, King, Colin wrote:
>>> Hi,
>>>
>>>> -----Original Message-----
>>>> From: Bart Van Assche <bvanassche@acm.org>
>>>> Sent: 23 March 2025 12:36
>>>> To: King, Colin <colin.king@intel.com>; Christian Loehle
>>>> <christian.loehle@arm.com>; Jens Axboe <axboe@kernel.dk>; Rafael J.
>>>> Wysocki <rafael@kernel.org>; Daniel Lezcano
>>>> <daniel.lezcano@linaro.org>; linux-block@vger.kernel.org;
>>>> linux-pm@vger.kernel.org
>>>> Cc: linux-kernel@vger.kernel.org
>>>> Subject: Re: [PATCH] cpuidle: psd: add power sleep demotion
>>>> prevention for fast I/O devices
>>>>
>>>> On 3/17/25 3:03 AM, King, Colin wrote:
>>>>> This code is optional, one can enable it or disable it via the
>>>>> config option. Also, even when it is built-in one can disable it by
>>>>> writing 0 to the sysfs file
>>>>> /sys/devices/system/cpu/cpuidle/psd_cpu_lat_timeout_ms
>>>>
>>>> I'm not sure we need even more configuration knobs in sysfs.
>>>
>>> It's useful for enabling / disabling the functionality, as well as some form of
>>> tuning for slower I/O devices, so I think it is justifiable.
>>>
>>>> How are users
>>>> expected to find this configuration option? How should they decide
>>>> whether to enable or to disable it?
>>>
>>> I can send a V2 with some documentation if that's required.
>>>
>>>>
>>>> Please take a look at this proposal and let me know whether this
>>>> would solve the issue that you are looking into: "[LSF/MM/BPF Topic]
>>>> Energy-Efficient I/O"
>>>> (https://lore.kernel.org/linux-block/ad1018b6-7c0b-4d70-b845-c869287d3cf3@acm.org/).
>>>> The only disadvantage of this approach
>>>> compared to the cpuidle patch is that it requires RPM (runtime power
>>>> management) to be enabled. Maybe I should look into modifying the
>>>> approach such that it does not rely on RPM.
>>>
>>> I've had a look, the scope of my patch is a bit wider. If my patch
>>> gets accepted I'm going to also look at putting the psd call into
>>> other devices (such as network devices) to also stop deep states while
>>> these devices are busy. Since the code is very lightweight I was hoping this
>>> was going to be relatively easy and simple to use in various devices in the
>>> future.
>>
>> IMO this needs to be a lot more fine-grained then, both in terms of which
>> devices or even IO is affected (Surely some IO is fine with at least *some*
>> latency) but also how aggressive we are in blocking.
>> Just looking at some common latency/residency of idle states out there I don't
>> think it's reasonable to force polling for a 3-10ms (rounding up with the jiffie)
>> period.
>
> The current solution by a customer is that they are resorting to disabling C6/C6P and hence
> all the CPUs are essentially in a non-low power state all the time. The opt-in solution
> provided in the patch provides nearly the same performance and will re-enable deeper
> C-states once the I/O is completed.
>
> As I mentioned earlier, the jiffies are used because it's low-touch and very fast with negligible
> impact on the I/O paths. Using finer grained timing is far more an expensive operation and
> is a huge overhead on very fast I/O devices.
>
> Also, this is a user config and tune-able choice. Users can opt-in to using this if they want
> to pay for the extra CPU overhead for a bit more I/O performance. If they don't want it, they
> don't need to enable it.
>
>> Playing devil's advocate if the system is under some thermal/power pressure
>> we might actually reduce throughput by burning so much power on this.
>> This seems like the stuff that is easily convincing because it improves
>> throughput and then taking care of power afterwards is really hard. :/
>>
>
> The current solution is when the user is trying to get maximum bandwidth and disabling C6/C6P
> so they are already keeping the system busy. This solution at least will save power when I/O is idling.
>
No. They can set a PM QoS latency constraint when they care about 'maximum bandwidth'
and remove the constraint when they don't.
If they just disable the idle states at boot and never re-enable them, at least they have no
grounds to complain to kernel people; they should know that what they're doing is detrimental
to power.
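
To make the PM QoS suggestion concrete, here is a minimal userspace sketch, assuming the documented /dev/cpu_dma_latency interface: writing a 32-bit latency limit in microseconds registers the request, and it stays in force for as long as the file descriptor is held open. The helper names and the path parameter are hypothetical, introduced only for illustration; this is not code from the patch under discussion.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Documented PM QoS CPU latency device node. */
#define CPU_DMA_LATENCY_PATH "/dev/cpu_dma_latency"

/*
 * Hypothetical helper: request a CPU wakeup-latency limit in microseconds.
 * Returns an open fd that holds the request, or -1 on error.
 * The path is a parameter only so the sketch can be tried on a regular file;
 * real use would pass CPU_DMA_LATENCY_PATH (and typically needs root).
 */
int pm_qos_latency_hold(const char *path, int32_t max_latency_us)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;

    /* The kernel interprets a 4-byte write as a binary s32 value. */
    if (write(fd, &max_latency_us, sizeof(max_latency_us)) !=
        (ssize_t)sizeof(max_latency_us)) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: closing the fd drops the request, after which
 * deeper idle states become eligible again.
 */
int pm_qos_latency_release(int fd)
{
    return close(fd);
}
```

A bandwidth-sensitive application would call pm_qos_latency_hold(CPU_DMA_LATENCY_PATH, limit) before its I/O burst and pm_qos_latency_release() afterwards; the limit itself is workload-specific and not something this thread specifies.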
Furthermore, we might be better off disabling C6/C6P than staying in a polling state (whenever we've
completed an IO in the last ~5 to 20ms, depending on the HZ setting).
Again, the wastefulness of a polling state can hardly be overestimated, especially given
that it doesn't seem to be necessary at all here.
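
For illustration, the HZ dependence comes from rounding a millisecond timeout up to whole jiffies, as msecs_to_jiffies() does. The sketch below works through that arithmetic in plain C; the function names and any timeout value passed in are hypothetical and are not taken from the patch, so it does not attempt to reproduce the exact figures above.

```c
/* Length of one jiffy in microseconds for a given CONFIG_HZ value. */
unsigned int jiffy_us(unsigned int hz)
{
    return 1000000u / hz;
}

/* Round a millisecond timeout up to a whole number of jiffies. */
unsigned int timeout_jiffies(unsigned int timeout_ms, unsigned int hz)
{
    return (timeout_ms * hz + 999u) / 1000u;
}

/* Length of the rounded-up timeout in microseconds. */
unsigned int effective_timeout_us(unsigned int timeout_ms, unsigned int hz)
{
    return timeout_jiffies(timeout_ms, hz) * jiffy_us(hz);
}
```

With a hypothetical 5 ms timeout, HZ=1000 keeps it at 5 jiffies (5000 us), HZ=250 rounds it to 2 jiffies (8000 us), and HZ=100 rounds it to a single 10000 us jiffy, so the same setting can mean a noticeably longer window on a lower-HZ kernel.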