From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Len Brown <lenb@kernel.org>
Cc: linux-pm@lists.linux-foundation.org
Subject: Re: Ottawa Linux Power Management Summit, June 25-26, 2007 - Minutes
Date: Sun, 9 Sep 2007 14:26:39 +0200
Message-ID: <200709091426.39854.rjw@sisk.pl>
In-Reply-To: <200709050426.04793.lenb@kernel.org>
On Wednesday, 5 September 2007 10:26, Len Brown wrote:
> A Linux Power Management "mini-summit" was held in Ottawa
> on June 25 and 26, 2007, immediately preceding the Ottawa Linux Symposium.
>
> An effort was made to follow the best-known-method
> for a Linux mini-summit, thought to be the most recent
> storage-summit. The invitation to the meeting was open --
> sent to linux-pm@lists.linux-foundation.org in early May.
> The focus of the meeting was on technical discussion. Thus,
> only presentations which supported discussion were encouraged,
> and the size of the forum was capped at 20. The agenda was set
> by consensus of the attendees.
>
> Thank you to the Intel Open Source Technology Center
> for sponsoring the meeting.
>
> Day 1 attendees:
>
> Len Brown, Intel OTC, Linux Kernel ACPI Maintainer
> Mark Gross, Intel OTC, embedded Linux team
> Paul Mundt, Renesas, Linux Kernel Super-H Maintainer
> Kevin Hilman, MontaVista, MV DPM Maintainer
> Igor Stoppa, Nokia, OSSO Power Management
> Sakari Poussa, Nokia, OSSO Power Management
> Dave Jones, Red Hat, Fedora Maintainer, Linux Kernel Cpufreq Maintainer
> Klaus Pedersen, Nokia, OSSO Power Management
> Ken Rozendal, IBM, Linux on Power
> Vivek Kashyap, IBM LTC
> Adam Belay, Novell/MIT, cpuidle developer
> Eugeny S. Mints, NGS Power Management
> Scott E. Preece, Motorola
> Marcelo Tosatti, Red Hat, One Laptop Per Child
>
> Day 2 additional attendees:
>
> Tariq Shureih, Intel OTC, MID power policy manager
> Rishi Bhattacharya, Texas Instruments
> Ilias Biris, Instituto de Tecnologia
>
> notes:
>
> Mark Gross showed off a Classmate PC. The unit he had was a 900MHz
> Celeron (model 13). Find out more at http://classmatepc.com
>
> Mark led a discussion about constraints/quality of service.
> An application specifies a QOS/SLA to some middle-ware, which
> translates that into operation constraints. We discussed the
> vocabulary for constraints. More on this below.
>
> Igor Stoppa presented findings from the Nokia Tablet team.
> The OMAP1 used in the n770 had idle/big-sleep/deep-sleep.
> The OMAP2, used in the n800, is built on 90nm technology.
> The OMAP3 is expected to be built on high leakage 65nm technology,
> and thus to require software to take advantage of power-gating off states.
> Indeed, the OMAP3 has over 30 power gates.
>
> http://linux.omap.com has OMAP Linux resources.
> http://source.mvista.com hosts OMAP patches before they get
> to kernel.org
>
> Re: Performance States
>
> Igor asserted that once a voltage is selected, it is always
> the best policy to run at the maximum frequency supported by
> that voltage.
>
> However, the OMAP2 throws Linux a curve ball: when the ARM core is
> increased to its maximum speed, the hardware will _reduce_ the speed of
> the DSP, e.g. to 400MHz and 133MHz respectively. cpufreq doesn't
> have a concept of this kind of dependency.
>
> cpufreq_set_policy() doesn't match Nokia's needs as it is a 1-way
> notification, and there is no way to register constraints.
>
> Igor reported a scaling frequency bug where the current polling
> interval and minimum residency formulas in ondemand don't work
> on Nokia's hardware.
>
> He also described "spread to deadline" in contrast to "race to
> idle". In spread-to-deadline, the work is run at the minimum rate
> such that it will complete in time for a known future deadline.
> The deadline might be an expected external periodic communication
> event, for example.
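
To make the contrast concrete, here is a minimal sketch of the two
policies. The frequency table, the cycle count and the function names are
hypothetical; they are not taken from the Nokia presentation.

```python
# Hypothetical operating frequencies in Hz (not real OMAP values).
FREQS_HZ = [100_000_000, 200_000_000, 400_000_000]


def race_to_idle(freqs_hz):
    """Always run flat out, then idle until the next event."""
    return max(freqs_hz)


def spread_to_deadline(cycles, deadline_s, freqs_hz):
    """Return the lowest frequency that still finishes `cycles` in time.

    Falls back to the maximum frequency when no frequency can meet
    the deadline.
    """
    for freq in sorted(freqs_hz):
        if cycles / freq <= deadline_s:
            return freq
    return max(freqs_hz)


# 3 million cycles of work, next periodic event expected in 20 ms.
chosen = spread_to_deadline(3_000_000, 0.020, FREQS_HZ)
```

With these made-up numbers the work does not fit at 100MHz (30ms) but does
at 200MHz (15ms), so spread-to-deadline settles on 200MHz while
race-to-idle would pick 400MHz.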
>
> Re: pause/resume
> Total pause/resume on the n800 is 20-80ms.
> PLL re-lock takes about 0.1ms and the voltage ramp is about 5ms
> by comparison. The big time consumer is drivers, in particular
> syncing with screen updates.
>
> Paul Mundt contrasted the clock framework with cpufreq, saying
> that one could build a rate table of all P-state transitions.
> Though this would need to be prototyped to see if it is viable.
>
> Marcelo Tosatti showed off an OLPC XO-1 (http://laptop.org/).
> It includes a 433MHz AMD Geode LX.
> (this replaced the previous cache-less Geode GX)
> The XO-1 has 1G of NAND flash and a 1200x900 LED screen which uses
> 0.2W min, 1.0W max. These screen power numbers are truly impressive.
>
> OLPC wants to aggressively auto-suspend to a suspend-to-RAM
> like state, except the screen stays on (and wireless stays on).
> The system wakes upon user-input. The requirement for this state
> is < 100ms resume latency. Jim Gettys asserts that the iPAQ could
> resume in 10ms by comparison. Marcelo reports that the XO-1
> can resume in 160ms today if USB is disabled. However, if USB
> is enabled, it resumes in 250ms. He thinks that resume needs to
> be multi-threaded, and it needs to be smarter so that it doesn't
> blindly resume every device in the system.
>
> The XO-1 has a Display Controller (DCON), which will refresh the display
> even when the processor is completely powered off.
>
> Regarding wake, enable_irqwake(irq) is ugly b/c it is IRQ specific.
> Needs to be enable-wakeup(device) -- a generic API.
>
> Audio amplifier must delay ~100ms power-up to avoid a pop.
>
> OLPC is not using suspend-to-disk, yet.
>
> Discussed the STD vs STR path. The expectation is that STR can be
> faster if it doesn't follow the same path as STD. Per the list,
> Rafael is working on this.

Well, in 2.6.23 the hibernation (STD is a PCish name) and suspend (i.e.
STR, standby, etc.) code paths will be separate on the highest level. Still,
they both use the freezer and device_suspend()/device_resume(), which consume
the majority of the suspend/resume time.

> OLPC is using OHM - Open HW Manager -- a generic system manager,
> of which power management is just one part.
>
> olpc-pm.c olpc_pm_enter() is kicked off by OHM on detecting idle.
>
> Dave Jones led a discussion on cpufreq.
>
> Re: Accounting vs cpufreq.
> Enterprise capacity planning applications get confused by cpufreq.
> cpufreq lowers the MHz due to low demand, the management application
> sees no idle time left -- indicating that the system has reached capacity
> and needs to be upgraded.
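
The confusion comes down to simple arithmetic. The sketch below is only an
illustration: the function name and the frequencies are made up, and
scaling busy time by the frequency ratio is an assumed way of compensating,
not something any existing tool is claimed to do.

```python
def capacity_utilization(busy_fraction, cur_khz, max_khz):
    """Scale the observed busy fraction by how far the clock was lowered."""
    return busy_fraction * cur_khz / max_khz


# A CPU that looks 100% busy while cpufreq holds it at 800MHz out of a
# possible 2400MHz is only using about a third of its real capacity.
observed = 1.0
normalized = capacity_utilization(observed, 800_000, 2_400_000)
```

A planning tool that looks only at `observed` concludes that the box is
full, while `normalized` shows roughly two thirds of the capacity unused.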
>
> Dave commented that the cpufreq conservative governor should
> be deleted and whatever hooks are needed should simply be added
> to ondemand.
>
> MHz vs scheduler: today cpufreq simply tracks idle time and the
> scheduler is completely unaware that cpufreq changes the frequency.
> Application hints may be appropriate for apps to tell the scheduler
> about their MHz needs. Also, the scheduler may be better off
> scheduling cycles instead of scheduling time.
>
> Discussion on APERF/MPERF MSRs on Intel processors: The APERF/MPERF
> ratio conveys the "actual" to "maximum" MHz ratio since the
> MSRs were last reset. Note that with Intel Dynamic Acceleration
> (IDA), this ratio can be greater than 1 -- so maybe "maximum"
> needs to be re-worded as "marketing":-)
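
For reference, a minimal sketch of how the ratio can be computed. It works
on differences between two samples instead of on the time since the last
reset, which is an assumption made here for illustration. The sample values
are invented and no MSR is actually read.

```python
MSR_MASK = (1 << 64) - 1  # both counters are 64-bit MSRs


def counter_delta(old, new):
    """Difference between two counter samples, tolerating one wrap-around."""
    return (new - old) & MSR_MASK


def aperf_mperf_ratio(aperf_old, aperf_new, mperf_old, mperf_new):
    """Return delta(APERF) / delta(MPERF), or None if MPERF did not advance."""
    mperf = counter_delta(mperf_old, mperf_new)
    if mperf == 0:
        return None
    return counter_delta(aperf_old, aperf_new) / mperf


half_speed = aperf_mperf_ratio(0, 500, 0, 1000)    # running below "maximum"
boosted = aperf_mperf_ratio(0, 1100, 0, 1000)      # above "maximum", as with IDA
```

Multiplying the ratio by the nominal maximum frequency gives the average
frequency over the sampling interval.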
>
> governors: It isn't clear why there needs to be a governor
> per core. It seems to be unused today, except on incorrectly
> administered systems.
>
> user-space: cpuspeed, powernowd not used so much these days.
>
> The fabled DPM/PowerOP/cpufreq integration isn't happening fast.
> Per previous discussion, an abstract notion of Operating Points
> makes the most sense, and perhaps dealing in units of absolute
> MHz is not the right model. Though users are now accustomed to
> thinking they know the absolute MHz....
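
A rough sketch of what an abstract operating point could look like,
assuming it is nothing more than a named set of clock rates. The 400/133
pair is the OMAP2 example from the minutes; the second entry, the class and
the function names are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class OperatingPoint(NamedTuple):
    name: str
    arm_mhz: int
    dsp_mhz: int


# Assumed to be listed in order of preference.
OPPS = [
    OperatingPoint("balanced", 330, 220),  # hypothetical values
    OperatingPoint("arm-max", 400, 133),   # pairing quoted in the minutes
]


def pick(opps: Sequence[OperatingPoint],
         min_arm_mhz: int = 0,
         min_dsp_mhz: int = 0) -> Optional[OperatingPoint]:
    """Return the first operating point that meets every registered constraint."""
    for opp in opps:
        if opp.arm_mhz >= min_arm_mhz and opp.dsp_mhz >= min_dsp_mhz:
            return opp
    return None
```

Selecting a whole operating point instead of a single frequency makes the
ARM/DSP dependency explicit: asking for 400MHz on the ARM side and 200MHz
on the DSP side simply has no answer in this table.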
>
> Dave Jones was open to the idea of transforming cpufreq into a
> generic clock scaling implementation.
>
> Dave mentioned that Fedora Core 7 32-bit is now shipping with
> CONFIG_NOHZ=y and CONFIG_HZ=1000.

CONFIG_NOHZ is known to break suspend and resume on some machines. These
problems are being fixed over time, but it's a risky decision for a
distribution to switch it on by default.

> Kevin Hilman led a discussion on DPM (Dynamic Power Management,
> http://dynamicpower.sourceforge.net/)
>
> DPM has been shipping since Linux-2.4 and is a part of many
> successful products, so it will continue to be supported.
>
> One key aspect of DPM is that it allows customers to put their
> platform-specific proprietary control code in user-space.
>
> DPM has hooks in the scheduler where applications explicitly
> request an operating state.
>
> MontaVista is hoping to migrate to mainline, now that mainline is
> becoming more capable. In particular, they need solid tickless,
> cpufreq, and wake-up events.
>
> Paul Mundt described the cutting edge in the Super-H space.
> The SH4A-SMP has 4 cores and is expected to be used in high-end
> consumer electronics, navigation etc. It has per-core voltage
> regulation, and CPU offline saves real power. Often ITRON is
> run on a core.
>
> Mark Gross led a discussion on Device QOS Parameters, to see
> if a common language might be suitable, say in a sysfs interface.
> We brain-stormed on how throughput, rate, power gain, latency,
> acoustic and timeout applied to various classes of devices;
> such as storage, wired and wireless networks, and the display.
>
> Suspend/Resume:
> Earlier on the list, Linus stated that he might
> prefer multiple entry points that do simpler functions rather
> than the over-loaded .suspend/.resume I/F we have today.
>
> Adam Belay described a 2-pass device suspend to ram loop, where .stop is
> first called for each device before the first .suspend is called:
>
> .start .stop
> don't touch hardware able to return failure
> .suspend(target state)
> saves HW state enable wake feature invoke D-state
> (power-off)
> [take STD snapshot here] .resume
>
> There is also a .reset especially for kexec that can be called
> after .stop. It removes the IRQ and int src.

I think we'll need some more callbacks than that. For example, we may need to
add a prepare_to_stop() callback allowing the driver to allocate additional
memory etc. before .stop() is called.

> The .stop loop allows a device to veto the suspend and for the
> system to quickly back out of the operation.

If we want to remove the freezer, we may want to use .stop() to make the driver
start blocking I/O data going from processes to the device and the other way
around.

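
The two-pass loop with a veto can be sketched as follows. This is only an
illustration of the control flow described in the minutes: the Device class,
its method names and the "mem" target state are hypothetical stand-ins, not
driver-model code.

```python
class Device:
    """Toy device that records which callbacks were invoked on it."""

    def __init__(self, name, veto=False):
        self.name = name
        self.veto = veto
        self.log = []

    def stop(self):
        # Must not touch the hardware; may refuse the transition.
        if self.veto:
            return False
        self.log.append("stop")
        return True

    def start(self):
        self.log.append("start")

    def suspend(self, target_state):
        self.log.append(f"suspend({target_state})")


def suspend_to_ram(devices, target_state="mem"):
    """Pass 1 calls .stop on every device, pass 2 calls .suspend."""
    stopped = []
    for dev in devices:
        if not dev.stop():
            # Veto: restart what was already stopped, in reverse order.
            for done in reversed(stopped):
                done.start()
            return False
        stopped.append(dev)
    for dev in devices:
        dev.suspend(target_state)
    return True
```

Because no .suspend callback runs until every .stop has succeeded, backing
out after a veto only has to undo the cheap .stop step.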
Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth