Linux Power Management development
 help / color / mirror / Atom feed
* [GIT PATCH] ACPI patches for Linux 3.1
From: Len Brown @ 2011-08-02 21:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-acpi, linux-pm, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 22580 bytes --]

Hi Linus,

please pull from: 

git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git release

This will update the files shown below.

thanks!

Len Brown
Intel Open Source Technology Center

ps. individual patches are available on linux-acpi@vger.kernel.org

 Documentation/feature-removal-schedule.txt |    9 --
 Documentation/kernel-parameters.txt        |    5 +
 drivers/acpi/acpica/acglobal.h             |    6 +
 drivers/acpi/acpica/aclocal.h              |    1 +
 drivers/acpi/acpica/acpredef.h             |    1 +
 drivers/acpi/acpica/nspredef.c             |   19 +++-
 drivers/acpi/acpica/nsrepair2.c            |   15 +++
 drivers/acpi/acpica/tbinstal.c             |   27 +++++-
 drivers/acpi/battery.c                     |   82 ++++++++++------
 drivers/acpi/dock.c                        |    4 +-
 drivers/acpi/ec_sys.c                      |    2 +-
 drivers/acpi/fan.c                         |    2 +-
 drivers/acpi/osl.c                         |   25 +++++-
 drivers/acpi/pci_irq.c                     |   58 +++++++++++
 drivers/acpi/pci_root.c                    |    3 +-
 drivers/acpi/processor_thermal.c           |    2 +-
 drivers/acpi/sbs.c                         |   13 ++-
 drivers/acpi/sleep.c                       |   16 +++
 drivers/acpi/sysfs.c                       |    4 +-
 drivers/acpi/thermal.c                     |    2 +-
 drivers/acpi/video.c                       |    2 +-
 drivers/ata/libata-acpi.c                  |    4 +-
 drivers/pci/hotplug/acpiphp_glue.c         |    2 +-
 drivers/thermal/Kconfig                    |    8 +-
 drivers/thermal/thermal_sys.c              |  142 ++++++++++++++++++++++------
 include/acpi/acpi_drivers.h                |    2 +-
 include/acpi/acpixf.h                      |    3 +-
 include/acpi/processor.h                   |    2 +-
 include/linux/acpi.h                       |    1 -
 include/linux/thermal.h                    |   22 -----
 30 files changed, 352 insertions(+), 132 deletions(-)

through these commits:

Bob Moore (5):
      ACPICA: Load operator: re-instate most restrictions on incoming table signature
      ACPICA: Add missing _TDL to list of predefined names
      ACPICA: Update to version 20110527
      ACPICA: Add option to disable method return value validation and repair
      ACPICA: Update to version 20110623

David Rientjes (1):
      ACPI: remove NID_INVAL

Fenghua Yu (1):
      ACPICA: Do not repair _TSS return package if _PSS is present

Jean Delvare (3):
      thermal: hide CONFIG_THERMAL_HWMON
      thermal: split hwmon lookup to a separate function
      thermal: make THERMAL_HWMON implementation fully internal

Jon Mason (1):
      ACPI: fix 80 char overflow

Lan Tianyu (8):
      ACPI / SBS: Add getting state operation in the acpi_sbs_battery_get_property()
      ACPI / SBS: Correct the value of power_now and power_avg in the sysfs
      ACPI / Battery: Add the power unit macro
      ACPI / Battery: Change 16-bit signed negative battery current into correct value
      ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
      ACPI / Battery: Add the hibernation process in the battery_notify()
      ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
      ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()

Len Brown (2):
      ACPI print OSI(Linux) warning only once
      ACPI:  delete stale reference in kernel-parameters.txt

Shaohua Li (1):
      ACPI: add missing _OSI strings

Stefan Assmann (1):
      ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS

Stefan Hajnoczi (2):
      ACPI / Battery: avoid acpi_battery_add() use-after-free
      ACPI / Battery: propagate sysfs error in acpi_battery_add()

Takao Indoh (1):
      ACPI: introduce "acpi_rsdp=" parameter for kdump

Vasiliy Kulikov (1):
      ACPI: constify ops structs

Zhang Rui (1):
      ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX

with this log:

commit 4a8f5058bde15d737abe39b5bed3f21dcb6599d2
Merge: 3eb208f eb03cb0 d7f6169 e410829 bb0c5ed aa16597 4996c02 6c8111e
Author: Len Brown <len.brown@intel.com>
Date:   Tue Aug 2 17:22:09 2011 -0400

    Merge branches 'acpica', 'battery', 'boot-irqs', 'bz-24492', 'bz-9528', 'from-akpm', 'kexec-param' and 'misc' into release
    
    Conflicts:
    	Documentation/kernel-parameters.txt
    
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 6c8111e9a0e73ef1e58a1bf0a10c23ee1512e7a2
Author: Len Brown <len.brown@intel.com>
Date:   Tue Aug 2 17:15:33 2011 -0400

    ACPI:  delete stale reference in kernel-parameters.txt
    
    Says for acpi=
                            See also Documentation/power/pm.txt, pci=noacpi
    
    but this file does not exist
    
    Reported-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit aa165971c2923d05988f920c978e438dbc7b0de6
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Jul 28 13:48:43 2011 -0700

    ACPI: add missing _OSI strings
    
    Linux supports some optional features, but it should notify the BIOS about
    them via the _OSI method.  Currently Linux doesn't notify any, which might
    make such features not work because the BIOS doesn't know about them.
    
    Jarosz has a system which needs this to make ACPI processor aggregator
    device work.
    
    Reported-by: "Jarosz, Sebastian" <sebastian.jarosz@intel.com>
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Acked-by: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit f52e00c668669c9c290e84adf859c76db6d92a5a
Author: David Rientjes <rientjes@google.com>
Date:   Thu Jul 28 13:48:43 2011 -0700

    ACPI: remove NID_INVAL
    
    b552a8c56db8 ("ACPI: remove NID_INVAL") removed the left over uses of
    NID_INVAL, but didn't actually remove the definition.  Remove it.
    
    Signed-off-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 31f5396ad3bde23c8416e8d23ba425e27f413314
Author: Jean Delvare <khali@linux-fr.org>
Date:   Thu Jul 28 13:48:42 2011 -0700

    thermal: make THERMAL_HWMON implementation fully internal
    
    THERMAL_HWMON is implemented inside the thermal_sys driver and has no
    effect on drivers implementing thermal zones, so they shouldn't see
    anything related to it in <linux/thermal.h>.  Making the THERMAL_HWMON
    implementation fully internal has two advantages beyond the cleaner
    design:
    
    * This avoids rebuilding all thermal drivers if the THERMAL_HWMON
      implementation changes, or if CONFIG_THERMAL_HWMON gets enabled or
      disabled.
    
    * This avoids breaking the thermal kABI in these cases too, which should
      make distributions happy.
    
    The only drawback I can see is slightly higher memory fragmentation, as
    the number of kzalloc() calls will increase by one per thermal zone.  But
    I doubt it will be a problem in practice, as I've never seen a system with
    more than two thermal zones.
    
    Signed-off-by: Jean Delvare <khali@linux-fr.org>
    Cc: Rene Herman <rene.herman@gmail.com>
    Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0d97d7a494d43be77f57e688369be0aae33d1ade
Author: Jean Delvare <khali@linux-fr.org>
Date:   Thu Jul 28 13:48:41 2011 -0700

    thermal: split hwmon lookup to a separate function
    
    We'll soon need to reuse it.
    
    Signed-off-by: Jean Delvare <khali@linux-fr.org>
    Cc: Rene Herman <rene.herman@gmail.com>
    Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit ab92402af04c151c26541eb28e1c26286a429db5
Author: Jean Delvare <khali@linux-fr.org>
Date:   Thu Jul 28 13:48:40 2011 -0700

    thermal: hide CONFIG_THERMAL_HWMON
    
    It's about time to revert 16d752397301b9 ("thermal: Create
    CONFIG_THERMAL_HWMON=n").  Anybody running a kernel >= 2.6.40 would also
    be running a recent enough version of lm-sensors.
    
    Actually having CONFIG_THERMAL_HWMON is pretty convenient so instead of
    dropping it, we keep it but hide it.
    
    Signed-off-by: Jean Delvare <khali@linux-fr.org>
    Cc: Rene Herman <rene.herman@gmail.com>
    Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8997621bb2daaf19a4e9d82f118224159d8054e2
Author: Len Brown <len.brown@intel.com>
Date:   Tue Aug 2 00:45:48 2011 -0400

    ACPI print OSI(Linux) warning only once
    
    This message gets repeated on some machines:
    https://bugzilla.kernel.org/show_bug.cgi?id=29292
    
    Signed-off-by: Len Brown <len.brown@intel.com>

commit bb0c5ed6ec523199e34e81dcef8e987507553b63
Author: Zhang Rui <rui.zhang@intel.com>
Date:   Sat Jul 30 01:40:48 2011 -0400

    ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX
    
    DMI workaround for A8N-SLI Premium and A8N-SLI DELUXE
    to enable the s3 suspend old ordering.
    http://bugzilla.kernel.org/show_bug.cgi?id=9528
    
    Tested-by: Heiko Ettelbrück <hbruckynews@gmx.de>
    Tested-by: Brian Beardall <brian@rapsure.net>
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit eb03cb02b74df6dd0b653d5f6d976f16a434dfaf
Author: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Date:   Tue Jul 12 09:03:29 2011 +0100

    ACPI / Battery: propagate sysfs error in acpi_battery_add()
    
    Make sure the error return from sysfs_add_battery() is checked and
    propagated out from acpi_battery_add().
    
    Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e80bba4b5108c6479379740201b0a5d9da5ffbac
Author: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Date:   Tue Jul 12 09:03:28 2011 +0100

    ACPI / Battery: avoid acpi_battery_add() use-after-free
    
    When acpi_battery_add_fs() fails the error handling code does not clean
    up completely.  Moreover, it does not return resulting in a
    use-after-free.
    
    Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 4996c02306a25def1d352ec8e8f48895bbc7dea9
Author: Takao Indoh <indou.takao@jp.fujitsu.com>
Date:   Thu Jul 14 18:05:21 2011 -0400

    ACPI: introduce "acpi_rsdp=" parameter for kdump
    
    There is a problem with putting the first kernel in EFI virtual mode,
    it is that when the second kernel comes up it tries to initialize the
    EFI again and once we have put EFI in virtual mode we can not really
    do that.
    
    Actually, EFI is not necessary for kdump, we can boot the second kernel
    with "noefi" parameter, but the boot will mostly fail because 2nd kernel
    cannot find RSDP.
    
    In this situation, we introduced "acpi_rsdp=" kernel parameter, so that
    kexec-tools can pass the "noefi acpi_rsdp=X" to the second kernel to
    make kdump works. The physical address of the RSDP can be got from
    sysfs(/sys/firmware/efi/systab).
    
    Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
    Reviewed-by: WANG Cong <amwang@redhat.com>
    Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 9c8b04be443b33939f374a811c82abeebe0a61d1
Author: Vasiliy Kulikov <segoon@openwall.com>
Date:   Sat Jun 25 21:07:52 2011 +0400

    ACPI: constify ops structs
    
    Structs battery_file, acpi_dock_ops, file_operations,
    thermal_cooling_device_ops, thermal_zone_device_ops, kernel_param_ops
    are not changed in runtime.  It is safe to make them const.
    register_hotplug_dock_device() was altered to take const "ops" argument
    to respect acpi_dock_ops' const notion.
    
    Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
    Acked-by: Jeff Garzik <jgarzik@redhat.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d7f6169a0d32002657886fee561c641acddb9a75
Author: Stefan Assmann <sassmann@kpanic.de>
Date:   Fri Jul 15 14:52:30 2011 +0200

    ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS
    
    The following was observed by Steve Rostedt on 3.0.0-rc5
    Backtrace:
    irq 16: nobody cared (try booting with the "irqpoll" option)
    Pid: 65, comm: irq/16-uhci_hcd Not tainted 3.0.0-rc5-test+ #94
    Call Trace:
     [<ffffffff810aa643>] __report_bad_irq+0x37/0xc1
     [<ffffffff810aaa2d>] note_interrupt+0x14e/0x1c9
     [<ffffffff810a9a05>] ? irq_thread_fn+0x3c/0x3c
     [<ffffffff810a990e>] irq_thread+0xf6/0x1b1
     [<ffffffff810a9818>] ? irq_finalize_oneshot+0xb3/0xb3
     [<ffffffff8106b4d6>] kthread+0x9f/0xa7
     [<ffffffff814f1f04>] kernel_thread_helper+0x4/0x10
     [<ffffffff8103ca09>] ? finish_task_switch+0x7b/0xc0
     [<ffffffff814eac78>] ? retint_restore_args+0x13/0x13
     [<ffffffff8106b437>] ? __init_kthread_worker+0x5a/0x5a
     [<ffffffff814f1f00>] ? gs_change+0x13/0x13
    handlers:
    [<ffffffff810a912d>] irq_default_primary_handler threaded [<ffffffff8135eaa6>] usb_hcd_irq
    [<ffffffff810a912d>] irq_default_primary_handler threaded [<ffffffff8135eaa6>] usb_hcd_irq
    Disabling IRQ #16
    
    The problem being that a device triggers boot interrupts (due to threaded
    interrupt handling and masking of the IO-APIC), which are forwarded
    to the PIRQ line of the device. These interrupts are not handled on the PIRQ
    line because the interrupt handler is not present there.
    This should have already been fixed by CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS.
    However some parts of the quirk got lost in the ACPI merge. This is a resent of
    the patch proposed in 2009.
    See http://lkml.org/lkml/2009/9/7/192
    
    Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
    Tested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e545b55a1e980cbb6a158886286106bbf39722b1
Author: Jon Mason <jdmason@kudzu.us>
Date:   Sun Jun 19 18:51:37 2011 -0500

    ACPI: fix 80 char overflow
    
    Trivial fix for 80 char line overflow in drivers/acpi/pci_root.c
    
    Signed-off-by: Jon Mason <jdmason@kudzu.us>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 9c921c22a7f33397a6774d7fa076db9b6a0fd669
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:34:12 2011 +0800

    ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()
    
    Use battery->lock in sysfs_remove_battery() to make
    checking, removing, and clearing bat.dev atomic.
    This is necessary because sysfs_remove_battery() may
    be invoked concurrently from different paths.
    
        https://bugzilla.kernel.org/show_bug.cgi?id=35642
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 6e17fb6aa1a67afa1827ae317c3594040f055730
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:58 2011 +0800

    ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
    
    In the commit 25be5821, add the refresh sysfs when system resumes
    from suspending. But it didn't check that the battery exists. This
    will cause battery sysfs files added when the battery doesn't exist.
    This patch add the check before refreshing.
    
    	https://bugzilla.kernel.org/show_bug.cgi?id=35642
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d5a5911b3278bad6515a9958f7318f74d534ef64
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:40 2011 +0800

    ACPI / Battery: Add the hibernation process in the battery_notify()
    
    The Commit 25be58215 has added a PM notifier to refresh the sys in order
    to deal with the unit change of the Battery Present Rate. But it just
    consided the suspend situation. The problem also will happen during the
    hibernation according the bug 28192.
    
        https://bugzilla.kernel.org/show_bug.cgi?id=28192
    
    This patch adds the hibernation process and fix the bug.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 7b78622d0f9df9f274a319eea1494240119d3734
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:27 2011 +0800

    ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
    
    This patch is cosmetic only, and makes no functional change.
    Since the acpi_battery_quirks has been deleted, rename
    acpi_battery_quirks2 with acpi_battery_quirks to clean the code.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 55003b2105a4578736f3e868fbaa889bb1ff3ce0
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:33:12 2011 +0800

    ACPI / Battery: Change 16-bit signed negative battery current into correct value
    
    This patch is for some machines which report the battery current
    as a 16-bit signed negative when it is charging. This is caused
    by DSDT bug. The commit bc76f90b8a5cf4aceedf210d08d5e8292f820cec
    has resolved the problem for Acer laptops. But some other machines
    also have such problem.
    
        https://bugzilla.kernel.org/show_bug.cgi?id=33722
    
    Since it is improper that the current is above 32A on laptops
    whether on AC or on battery, this patch is to check the current and
     take its absolute value as current and producing a message when it
    is negative in s16.
    
    Remove Acer quirk, as this workaround handles Acer too.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit ae6f61870490c10a0b0436e5afffa00c9dacffef
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Thu Jun 30 11:32:40 2011 +0800

    ACPI / Battery: Add the power unit macro
    
    This patch is cosmetic only, and makes no functional change.
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e4108292cc5b5ca07abc83af31a78338362810ca
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Fri Jul 1 16:03:39 2011 +0800

    ACPI / SBS: Correct the value of power_now and power_avg in the sysfs
    
    https://bugzilla.kernel.org/show_bug.cgi?id=24492
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1dd5c715e5b7524da8c1030f5cf1ea903e45c457
Author: Lan Tianyu <tianyu.lan@intel.com>
Date:   Fri Jul 1 16:03:15 2011 +0800

    ACPI / SBS: Add getting state operation in the acpi_sbs_battery_get_property()
    
    https://bugzilla.kernel.org/show_bug.cgi?id=24492
    
    Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 3eb208f0a36cf7a86953afa7a4eb6776294e6768
Author: Bob Moore <robert.moore@intel.com>
Date:   Mon Jul 4 08:38:51 2011 +0000

    ACPICA: Update to version 20110623
    
    Version 20110623.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8f9c91273e36e5762c617c23e4fd48d5172e0dac
Author: Fenghua Yu <fenghua.yu@intel.com>
Date:   Mon Jul 4 08:36:16 2011 +0000

    ACPICA: Do not repair _TSS return package if _PSS is present
    
    We can only sort the _TSS return package if there is no _PSS
    in the same scope. This is because if _PSS is present, the ACPI
    specification dictates that the _TSS Power Dissipation field is
    to be ignored, and therefore some BIOSs leave garbage values in
    the _TSS Power field(s).  In this case, it is best to just return
    the _TSS package as-is.
    
    Reported-by: Fenghua Yu <fenghua.yu@intel.com>
    Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d57b23ad0ca7a46931e4d98eb6b4b73b112f0969
Author: Bob Moore <robert.moore@intel.com>
Date:   Mon Jul 4 08:24:03 2011 +0000

    ACPICA: Add option to disable method return value validation and repair
    
    Runtime option can be used to disable return value repair if this
    is causing a problem on a particular machine.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 80f40ce0f10e87d9a27f1e29325c6f39245fbcb1
Author: Bob Moore <robert.moore@intel.com>
Date:   Tue Jun 14 10:45:57 2011 +0800

    ACPICA: Update to version 20110527
    
    Version 20110527.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit f631f9cafbc47bbfe5eb2aa831cd323175298448
Author: Bob Moore <robert.moore@intel.com>
Date:   Tue Jun 14 10:44:51 2011 +0800

    ACPICA: Add missing _TDL to list of predefined names
    
    Affects both iASL and core ACPICA.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c8cefe307d79c57b32ca1d4c9ebff528f8dd914c
Author: Bob Moore <robert.moore@intel.com>
Date:   Tue Jun 14 10:42:53 2011 +0800

    ACPICA: Load operator: re-instate most restrictions on incoming table signature
    
    Now, only allow "SSDT" "OEM", and a null signature.
    
    Signed-off-by: Bob Moore <robert.moore@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply

* Re: [PATCH 02/11] PM: extend PM QoS with per-device wake-up constraints
From: Kevin Hilman @ 2011-08-02 21:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: markgross, Mark Brown, Linux PM mailing list, linux-omap,
	Jean Pihet
In-Reply-To: <201108022220.00019.rjw@sisk.pl>

[adding Mark Brown as we discussed similar topics a couple plumbers ago]

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

[...]

>> >> The new class is only available from kernel drivers and so is not exported
>> >> to user space.
>> >
>> > It should be available to user space, however, because in many cases drivers
>> > simply have no idea what values to use (after all, the use decides if he
>> > wants to trade worse video playback quality for better battery life, for
>> > example).
>> >
>> 
>> FWIW, I think it's wrong to expose the raw per-device constraints
>> directly to userspace.
>> 
>> I think it's the responsibility of the subsystems (video, audio, input,
>> etc.) to expose QoS knobs to userspace as they see fit and now allow
>> userspace to tinker directly with QoS constraints.
>
> This assumes that those "subsystems" or rather "frameworks" (a bus type or
> a device class is a subsystem in the terminology used throughout the PM
> documentation) will (a) know about PM QoS and (b) will care to handle it.
> Both (a) and (b) seem to be unrealistic IMHO.

I disagree and think that both are quite realistic (mainly because they
exist today, albiet mostly out of tree because no generic QoS framework
exist.  e.g. on OMAP, we have OMAP-specific *kernel* APIs for requesting
per-device wakeup latencies, and drivers and frameworks are using them.)

Most of these frameworks already have QoS constraints/requirements but
have no generic way to express them.  That's why we're pushing for a
generic constraints framework.

Consider video for example.  It's the kernel-side drivers, not user
space apps, that know about the latency or throughput constraints based
on e.g. frame rate, bytes/pixel, double/triple buffering, PIP, multiple
displays, etc. etc.   

In this case, the video framework (V4L2) might not want any knobs
exposed to userspace because userspace simply doesn't have the knowledge
to set appropriate constraints.  I'm less familiar with audio, but I
believe audio would be similar (sample rate, number of channels, mixing
with other concurrent audio streams, etc. etc. are all known by the
kernel-side code.)

On the other hand, consider touchscreen.  Touchscreens have a
configurable sample rate which allows a trade-off between power savings
and accuracy.  For example, low accuracy (and thus low power) would be
fine for a UI which is only taking finger gestures, but if the
application was doing handwriting recognition with a stylus, it would
likely want higher accuracy (and consume more power.)

In this case, the kernel driver has no way of knowing what the
application is doing, so some way for touchscreen apps to request this
kind of constraint would be required.

My point is it should be up to each framework (audio, video,
input/touchscreen) to expose a userspace interface to their users that
makes sense for the *specific needs* of the framework.

Using the above examples, audio and video might not need (or want) to
expose anything to userspace, where touchscreen would.  IMO, it would be
much more obvious for a touchscreen app to use a new API in tslib (which
it is already using) to set its constraints rather than having to use
tslib for most things but a sysfs file for QoS.

> We already export wakeup and runtime PM knobs per device via sysfs and
> I'm not so sure why PM QoS is different in that respect.

As stated above, because for many frameworks userspace simply does not
have all (or any) of the knowledge to set the right constraints.

Kevin

^ permalink raw reply

* Re: [PATCH 04/13] PM: QoS: implement the per-device latency constraints
From: Rafael J. Wysocki @ 2011-08-02 21:13 UTC (permalink / raw)
  To: Jean Pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <CAORVsuXzn8wGZsyxUcAN7stjbU4Kshwdg2CW06w17Y5O7oJVsA@mail.gmail.com>

Hi,

On Tuesday, August 02, 2011, Jean Pihet wrote:

...
> >> -static inline void pm_qos_set_value(struct pm_qos_object *o, s32 value)
> >> +static inline void pm_qos_set_value(struct pm_qos_constraints *c, s32 value)
> >>  {
> >> -     o->target_value = value;
> >> +     c->target_value = value;
> >>  }
> >
> > Well, I'm not sure that this function is necessary at all.  You might as well
> > simply remove it as far as I'm concerned.
> The idea is to provide an efficient and lockless way to get the
> aggregated constraint class value. When the constraints a re changing
> the new value is calculated and stored using pm_qos_get_value and
> pm_qos_set_value. Then pm_qos_read_value is used to retrieve the
> value. For example cpuidle calls pm_qos_request which uses
> pm_qos_read_value to get the CPU/DMA minimum latency.

Still, pm_qos_set_value() is static in this file and doesn't really serve
any specific purpose except for symmetry with pm_qos_read_value().

Anyway, as I said I'm not sure, so it's OK to leave it as is to me too.

> 
> >
> >> -static void update_target(struct pm_qos_object *o, struct plist_node *node,
> >> -                       int del, int value)
> >> +static void update_target(struct pm_qos_request *req,
> >> +                       enum pm_qos_req_action action, int value)
> >>  {
> >>       unsigned long flags;
> >> -     int prev_value, curr_value;
> >> +     int prev_value, curr_value, new_value;
> >> +     struct pm_qos_object *o = pm_qos_array[req->class];
> >> +     struct pm_qos_constraints *c;
> >> +
> >> +     switch (req->class) {
> >> +     case PM_QOS_DEV_LATENCY:
> >> +             if (!req->dev) {
> >> +                     WARN(1, KERN_ERR "PM QoS API called with NULL dev\n");
> >> +                     return;
> >> +             }
> >> +             c = &req->dev->power.latency_constraints;
> >> +             break;
> >> +     case PM_QOS_CPU_DMA_LATENCY:
> >> +     case PM_QOS_NETWORK_LATENCY:
> >> +     case PM_QOS_NETWORK_THROUGHPUT:
> >> +             c = o->constraints;
> >> +             break;
> >> +     case PM_QOS_RESERVED:
> >> +     default:
> >> +             WARN(1, KERN_ERR "PM QoS API called with wrong class %d, "
> >> +                     "req 0x%p\n", req->class, req);
> >> +             return;
> >> +     }
> >
> > Do we _really_ need that switch()?
> >
> > What about introducing dev_pm_qos_add_request() and friends specifically
> > for devices, such that they will take the target device object (dev) as
> > their first argument?  Then, you could keep pm_qos_add_request() pretty
> > much as is, right?
> Yes but in that case I need to duplicate the API functions for devices
> (add, update, remove). Those functions will call update_target.
> I prefer the way this patch does it: simplify the API functions and
> have only 1 place with the constraints management and notification
> logic (in update_target).

The device functions would still call update_target(), but directly on the
device's struct struct pm_qos_constraints, so they wouldn't really duplicate
the existing pm_qos_*_request().

> 
> >
> >>
> >>       spin_lock_irqsave(&pm_qos_lock, flags);
> >> -     prev_value = pm_qos_get_value(o);
> >> -     /* PM_QOS_DEFAULT_VALUE is a signal that the value is unchanged */
> >> -     if (value != PM_QOS_DEFAULT_VALUE) {
> >> +
> >> +     prev_value = pm_qos_get_value(c);
> >> +     if (value == PM_QOS_DEFAULT_VALUE)
> >> +             new_value = c->default_value;
> >> +     else
> >> +             new_value = value;
> >
> > What about writing that as
> >
> >        new_value = value != PM_QOS_DEFAULT_VALUE ? value : c->default_value;
> This is shorter but more difficult to read ;8

That depends on who you ask. :-)

> >
> >> +
> >> +     switch (action) {
> >> +     case PM_QOS_REMOVE_REQ:
> >> +             plist_del(&req->node, &c->list);
> >> +             break;
> >> +     case PM_QOS_UPDATE_REQ:
> >>               /*
> >>                * to change the list, we atomically remove, reinit
> >>                * with new value and add, then see if the extremal
> >>                * changed
> >>                */
> >> -             plist_del(node, &o->requests);
> >> -             plist_node_init(node, value);
> >> -             plist_add(node, &o->requests);
> >> -     } else if (del) {
> >> -             plist_del(node, &o->requests);
> >> -     } else {
> >> -             plist_add(node, &o->requests);
> >> +             plist_del(&req->node, &c->list);
> >> +     case PM_QOS_ADD_REQ:
> >> +             plist_node_init(&req->node, new_value);
> >> +             plist_add(&req->node, &c->list);
> >> +             break;
> >> +     default:
> >> +             /* no action */
> >> +             ;
> >>       }
> >> -     curr_value = pm_qos_get_value(o);
> >> -     pm_qos_set_value(o, curr_value);
> >> +
> >> +     curr_value = pm_qos_get_value(c);
> >> +     pm_qos_set_value(c, curr_value);
> >>       spin_unlock_irqrestore(&pm_qos_lock, flags);
> >>
> >>       if (prev_value != curr_value)
> >>               blocking_notifier_call_chain(o->notifiers,
> >
> > That's why I'm thinking that it would be helpful to have a pointer
> > to the notifier list from struct pm_qos_constraints .
> I think having a per-device notifier list is complicating things. If a
> subsystem needs te be notified it needs to register a notifier
> callback for _every_ device in the system.

I'm not really sure.  For example, why would a video subsystem want to be
notified of a PM QoS change in a keyboard?

> As a first implementation for OMAP the low-level PM code is using a
> single notifier to retrieve all the devices latency constraints and
> apply them to the pwer domains. I do not see how this could work with
> a per-device notifier list.

If you need a global notifier chain, it can be implemented as a separate
static variable as I said in my last reply in the [03/13] thread.

> 
> >
> > Besides, you can use "pm_qos_array[req->class]->notifiers" instead of
> > "o->notifiers".
> Ok
> 
> ...
> >> +static int find_pm_qos_object_by_minor(int minor)
> >> +{
> >> +     int pm_qos_class;
> >> +
> >> +     for (pm_qos_class = 0;
> >> +             pm_qos_class < PM_QOS_NUM_CLASSES; pm_qos_class++) {
> >> +             if (minor ==
> >> +                     pm_qos_array[pm_qos_class]->pm_qos_power_miscdev.minor)
> >> +                     return pm_qos_class;
> >> +     }
> >> +     return -1;
> >> +}
> >
> > This function doesn't seem to be used anywhere, what's the purpose of it?
> It is used by pm_qos_power_open in order to retrieve the class
> associated with the MISC device.
> BTW this patch is moving the code so that all the MISC related
> functions are grouped together.

OK, one more thing that needs to be done in a separate patch. :-)

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH 03/13] PM: QoS: extend the in-kernel API with per-device latency constraints
From: Rafael J. Wysocki @ 2011-08-02 21:02 UTC (permalink / raw)
  To: Jean Pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <CAORVsuVY63VYGEB7CM50UzJS3wLt+jZn5U6S9X+277GB_ETV1g@mail.gmail.com>

Hi,

On Tuesday, August 02, 2011, Jean Pihet wrote:
...
> I will keep pm_qos_class if the rename brings in some confusion. The
> intention was to simplify the names of the fields in structs with
> explicit names.

Please keep it for now.  You can rename it later if there's a clear
benefit, but I'm not sure still.

> >
> >> +     struct device *dev;
> >>  };
> >>
> >> -void pm_qos_add_request(struct pm_qos_request_list *l, int pm_qos_class, s32 value);
> >> -void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
> >> -             s32 new_value);
> >> -void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req);
> >> +struct pm_qos_parameters {
> >> +     int class;
> >> +     struct device *dev;
> >> +     s32 value;
> >> +};
> >
> > What exactly is the "dev" member needed for?
> This is the target device that is passed as parameter to the API. It
> is used for constraints of the devices latency class.

Well, I don't like this change.

In fact, I'd prefer it if the old interface were kept unmodified.

> ...
> >> +static BLOCKING_NOTIFIER_HEAD(dev_lat_notifier);
> >> +static struct pm_qos_object dev_pm_qos = {
> >> +     .requests = PLIST_HEAD_INIT(dev_pm_qos.requests, pm_qos_lock),
> >> +     .notifiers = &dev_lat_notifier,
> >> +     .name = "dev_latency",
> >> +     .target_value = PM_QOS_DEV_LAT_DEFAULT_VALUE,
> >> +     .default_value = PM_QOS_DEV_LAT_DEFAULT_VALUE,
> >> +     .type = PM_QOS_MIN,
> >> +};
> >> +
> >
> > You seem to be confusing things here.  Since devices will have their own lists
> > of requests now (as per the previous patch), the .requests member above is not
> > necessary.  Moreover, it seems to be used incorrectly below.
> The idea was to split the patches as requested previously: first the
> API changes (this very patch [03/13]) and then the actual
> implementation ([04/13]). Is this correct?

The idea is right in general, but so to speak it didn't work out too well.

The most important change is the introduction of struct pm_qos_constraints,
which doesn't need any preparation IMO.  I'd do this possibly early in the
series and I'd do it like this:

+struct pm_qos_constraints {
+       struct plist_head list;
+       s32 target_value;
+       s32 default_value;
+       struct blocking_notifier_head *notifiers;
+};

(more on the notifiers later).  Please note the lack of the type field.

At the same time I'd redefine struct pm_qos_object in the following way:

 struct pm_qos_object {
-       struct plist_head requests;
+       struct pm_qos_constraints *constraints;
-       struct blocking_notifier_head *notifiers;
        struct miscdevice pm_qos_power_miscdev;
        char *name;
-       s32 target_value;       /* Do not change to 64 bit */
-       s32 default_value;
        enum pm_qos_type type;
 };

so for the _existing_ PM QoS classes you can simply use
constraints->list instead of requests, constraints->target_value
instead of target_value, constraints->defaul_value instead of default_value
and constraints->notifiers instead of notifiers.

I wouldn't introduce a "global" PM QoS class for devices.

That should be a relatively simple patch that doesn't change the
existing interface and functionality.

The next patch would be to modify update_target() so that it takes
a pointer to struct pm_qos_constraints instead of the pointers to
struct pm_qos_object and struct plist_node (possibly also to take
an "operation" argument instead of the "del" one).  All it needs is
in struct pm_qos_constraints, so that should be simple too.
The modified update_target() should return a value indicating
whether or not prev_value was different from curr_value, so that
the device PM QoS functions (introduced by the next patch) can use it
to trigger system-wide notifications.

The next patch would introduce the per-device PM QoS by (1) adding
a struct pm_qos_constraints _pointer_ to struct dev_pm_info (assuming
that at least _some_ devices won't use PM QoS) and (2) adding
dev_pm_qos_add/update/remove_request() in drivers/base/power/qos.c,
all of them using the modified update_target().

Now if system-wide notifier chain for devices is necessary, you can
simply define it as a static variable in drivers/base/power/qos.c
without actually adding any "PM QoS object" for this purpose.  Now,
you can use the value returned by the modified update_target() to
decide whether or not to run notifiers from dev_pm_qos_*_request().

Does it make sense to you?

Rafael

^ permalink raw reply

* Re: [PATCH 02/11] PM: extend PM QoS with per-device wake-up constraints
From: Rafael J. Wysocki @ 2011-08-02 20:19 UTC (permalink / raw)
  To: Kevin Hilman; +Cc: markgross, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <87bow7ln4c.fsf@ti.com>

On Tuesday, August 02, 2011, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > Hi,
> >
> > On Thursday, June 30, 2011, jean.pihet@newoldbits.com wrote:
> >> From: Jean Pihet <j-pihet@ti.com>
> >> 
> >> - add a new PM QoS class PM_QOS_DEV_WAKEUP_LATENCY for device wake-up
> >> constraints. Due to the per-device nature of the new class the constraints
> >> list is stored inside the device dev_pm_info struct instead of the internal
> >> per-class constraints lists.
> >
> > I think PM_QOS_DEV_LATENCY might be a better name.
> >
> >> The new class is only available from kernel drivers and so is not exported
> >> to user space.
> >
> > It should be available to user space, however, because in many cases drivers
> > simply have no idea what values to use (after all, the use decides if he
> > wants to trade worse video playback quality for better battery life, for
> > example).
> >
> 
> FWIW, I think it's wrong to expose the raw per-device constraints
> directly to userspace.
> 
> I think it's the responsibility of the subsystems (video, audio, input,
> etc.) to expose QoS knobs to userspace as they see fit and now allow
> userspace to tinker directly with QoS constraints.

This assumes that those "subsystems" or rather "frameworks" (a bus type or
a device class is a subsystem in the terminology used throughout the PM
documentation) will (a) know about PM QoS and (b) will care to handle it.
Both (a) and (b) seem to be unrealistic IMHO.

We already export wakeup and runtime PM knobs per device via sysfs and
I'm not so sure why PM QoS is different in that respect.

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH v4 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
From: Kevin Hilman @ 2011-08-02 18:45 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, linux-pm, Greg Kroah-Hartman, Thomas Gleixner,
	Kyungmin Park
In-Reply-To: <1310717510-19002-2-git-send-email-myungjoo.ham@samsung.com>

MyungJoo Ham <myungjoo.ham@samsung.com> writes:

> With OPPs, a device may have multiple operable frequency and voltage
> sets. However, there can be multiple possible operable sets and a system
> will need to choose one from them. In order to reduce the power
> consumption (by reducing frequency and voltage) without affecting the
> performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
> scheme may be used.
>
> This patch introduces the DVFS capability to non-CPU devices with OPPs.
> DVFS is a techique whereby the frequency and supplied voltage of a
> device is adjusted on-the-fly. DVFS usually sets the frequency as low
> as possible with given conditions (such as QoS assurance) and adjusts
> voltage according to the chosen frequency in order to reduce power
> consumption and heat dissipation.
>
> The generic DVFS for devices, DEVFREQ, may appear quite similar with
> /drivers/cpufreq.  However, CPUFREQ does not allow to have multiple
> devices registered and is not suitable to have multiple heterogenous
> devices with different (but simple) governors.
>
> Normally, DVFS mechanism controls frequency based on the demand for
> the device, and then, chooses voltage based on the chosen frequency.
> DEVFREQ also controls the frequency based on the governor's frequency
> recommendation and let OPP pick up the pair of frequency and voltage
> based on the recommended frequency. Then, the chosen OPP is passed to
> device driver's "target" callback.
>
> Tested with memory bus of Exynos4-NURI board.
>
> The test code with board support for Exynos4-NURI is at
> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>
> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>
> --
> Thank you for your valuable comments, Rafael, Greg, Pavel, and Colin.
>
> Changed from v3
> - In kerneldoc comments, DEVFREQ has ben replaced by devfreq

FYI... there are still lots of kerneldoc comments in this version with
DEVFREQ instead of devfreq, particularily in devfreq.h.

Kevin

^ permalink raw reply

* Re: [PATCH 03/13] PM: QoS: extend the in-kernel API with per-device latency constraints
From: Kevin Hilman @ 2011-08-02 18:01 UTC (permalink / raw)
  To: jean.pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <1311841821-10252-4-git-send-email-j-pihet@ti.com>

jean.pihet@newoldbits.com writes:

> From: Jean Pihet <j-pihet@ti.com>
>
> Extend the PM QoS kernel API:
> - add a new PM QoS class PM_QOS_DEV_LATENCY for device wake-up latency
> constraints
> - make the pm_qos_add_request API more generic by using a parameter of
> type struct pm_qos_parameters
> - minor clean-ups and rename of struct fields:
>   . rename pm_qos_class to class and pm_qos_req to req in internal code
>   . consistenly use req and params as the API parameters
>   . rename struct pm_qos_request_list to struct pm_qos_request
> - update the in-kernel API callers to the new API
>
> Signed-off-by: Jean Pihet <j-pihet@ti.com>
> ---
>  arch/arm/plat-omap/i2c.c               |   20 -----
>  drivers/i2c/busses/i2c-omap.c          |   35 ++++++---

More on breaking this patch up...

Since the OMAP I2C driver is not a current PM QoS API user, it doesn't
really belong in this patch which is converting existing PM QoS API
users.  The OMAP I2C conversion should be its own patch.

Kevin

^ permalink raw reply

* Re: [PATCH 02/11] PM: extend PM QoS with per-device wake-up constraints
From: Kevin Hilman @ 2011-08-02 17:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: markgross, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <201107022310.50175.rjw@sisk.pl>

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> Hi,
>
> On Thursday, June 30, 2011, jean.pihet@newoldbits.com wrote:
>> From: Jean Pihet <j-pihet@ti.com>
>> 
>> - add a new PM QoS class PM_QOS_DEV_WAKEUP_LATENCY for device wake-up
>> constraints. Due to the per-device nature of the new class the constraints
>> list is stored inside the device dev_pm_info struct instead of the internal
>> per-class constraints lists.
>
> I think PM_QOS_DEV_LATENCY might be a better name.
>
>> The new class is only available from kernel drivers and so is not exported
>> to user space.
>
> It should be available to user space, however, because in many cases drivers
> simply have no idea what values to use (after all, the use decides if he
> wants to trade worse video playback quality for better battery life, for
> example).
>

FWIW, I think it's wrong to expose the raw per-device constraints
directly to userspace.

I think it's the responsibility of the subsystems (video, audio, input,
etc.) to expose QoS knobs to userspace as they see fit and now allow
userspace to tinker directly with QoS constraints.

Kevin

^ permalink raw reply

* [PATCH v4 2/2] i8042: Enable OLPC's EC-based i8042 wakeup control
From: Daniel Drake @ 2011-08-02 15:49 UTC (permalink / raw)
  To: dtor, dmitry.torokhov; +Cc: linux-pm, dilinger, linux-input

The OLPC XO laptop can be resumed from suspend via keyboard or mouse
activity. Hook up the i8042 driver to the OLPC EC controls to make
this possible.

Signed-off-by: Daniel Drake <dsd@laptop.org>
---
 drivers/input/serio/i8042-x86ia64io.h |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h
index 76b2e58..36f6015 100644
--- a/drivers/input/serio/i8042-x86ia64io.h
+++ b/drivers/input/serio/i8042-x86ia64io.h
@@ -9,6 +9,7 @@
 
 #ifdef CONFIG_X86
 #include <asm/x86_init.h>
+#include <asm/olpc.h>
 #endif
 
 /*
@@ -877,6 +878,13 @@ static inline void i8042_pnp_exit(void) { }
 
 static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
 {
+	if (!machine_is_olpc())
+		return;
+
+	if (may_wakeup)
+		olpc_ec_wakeup_set(EC_SCI_SRC_GAME);
+	else
+		olpc_ec_wakeup_clear(EC_SCI_SRC_GAME);
 }
 
 static int __init i8042_platform_init(void)
@@ -923,6 +931,9 @@ static int __init i8042_platform_init(void)
 
 	if (dmi_check_system(i8042_dmi_dritek_table))
 		i8042_dritek = true;
+
+	if (olpc_ec_wakeup_available())
+		i8042_enable_wakeup = true;
 #endif /* CONFIG_X86 */
 
 	return retval;
-- 
1.7.6

^ permalink raw reply related

* [PATCH v4 1/2] Input: enable i8042-level wakeup control
From: Daniel Drake @ 2011-08-02 15:49 UTC (permalink / raw)
  To: dtor, dmitry.torokhov; +Cc: linux-pm, dilinger, linux-input

The OLPC XO laptop is able to use the PS/2 controller as a wakeup source.
When used as a wakeup source, the key press/mouse motion must not be lost.

Add infrastructure to allow controllers to be marked as wakeup-capable,
and avoid doing power control/reset on the i8042/serio/input devices when
running in this mode. For systems where this functionality is available,
you are expected to enable wakeups on the i8042 device, the serio
devices, and the relevant input devices, to ensure that the hardware is
left powered and untouched throughout the suspend/resume.

Signed-off-by: Daniel Drake <dsd@laptop.org>
---
 drivers/input/input.c                 |    6 +++-
 drivers/input/keyboard/atkbd.c        |    4 ++-
 drivers/input/mouse/hgpk.c            |    2 +
 drivers/input/mouse/psmouse-base.c    |    4 ++-
 drivers/input/serio/i8042-io.h        |    4 ++
 drivers/input/serio/i8042-ip22io.h    |    4 ++
 drivers/input/serio/i8042-jazzio.h    |    4 ++
 drivers/input/serio/i8042-ppcio.h     |    4 ++
 drivers/input/serio/i8042-snirm.h     |    4 ++
 drivers/input/serio/i8042-sparcio.h   |    4 ++
 drivers/input/serio/i8042-x86ia64io.h |    4 ++
 drivers/input/serio/i8042.c           |   62 +++++++++++++++++++++++++++++---
 drivers/input/serio/serio.c           |   29 +++++++++++++--
 13 files changed, 122 insertions(+), 13 deletions(-)

On original submission, Dmitry was worried about this functionality not
working at all on other platforms. I agree, it will only work where the
hardware has been specifically designed with this consideration. v2 of
the patch therefore removes the module param option, meaning that it
will only be activated on platforms that explicitly enable it at the
code level.

v2 also performs a more extensive job. We avoid resetting the device
at the input_device level during suspend/resume. We also disable i8042
interrupts when going into suspend, to avoid races handling interrupts
in the wrong order during resume.

v3 includes a cleanup suggested by Rafael, and explains the corner case
that we have to handle in the input layer.

v4 changes direction a little: each layer now just looks at the wakeup
capability of its own device, avoiding some of the earlier tree
traversal. Userspace must now enable wakeups on 5 devices:
	i8042
	i8042/serio0
	i8042/serio0/input*
	i8042/serio1
	i8042/serio1/input*
This matches the behaviour of USB devices, where both the device and the
parent controller must have wakeups enabled for wakeup functionality
to work. Suggested by Rafael J. Wysocki.

diff --git a/drivers/input/input.c b/drivers/input/input.c
index da38d97..639aba7 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1588,6 +1588,9 @@ static int input_dev_suspend(struct device *dev)
 {
 	struct input_dev *input_dev = to_input_dev(dev);
 
+	if (device_may_wakeup(dev))
+		return 0;
+
 	mutex_lock(&input_dev->mutex);
 
 	if (input_dev->users)
@@ -1602,7 +1605,8 @@ static int input_dev_resume(struct device *dev)
 {
 	struct input_dev *input_dev = to_input_dev(dev);
 
-	input_reset_device(input_dev);
+	if (!device_may_wakeup(dev))
+		input_reset_device(input_dev);
 
 	return 0;
 }
diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c
index 19cfc0c..4bb81c2 100644
--- a/drivers/input/keyboard/atkbd.c
+++ b/drivers/input/keyboard/atkbd.c
@@ -1027,6 +1027,7 @@ static void atkbd_set_keycode_table(struct atkbd *atkbd)
 static void atkbd_set_device_attrs(struct atkbd *atkbd)
 {
 	struct input_dev *input_dev = atkbd->dev;
+	struct device *parent = &atkbd->ps2dev.serio->dev;
 	int i;
 
 	if (atkbd->extra)
@@ -1047,7 +1048,8 @@ static void atkbd_set_device_attrs(struct atkbd *atkbd)
 	input_dev->id.product = atkbd->translated ? 1 : atkbd->set;
 	input_dev->id.version = atkbd->id;
 	input_dev->event = atkbd_event;
-	input_dev->dev.parent = &atkbd->ps2dev.serio->dev;
+	input_dev->dev.parent = parent;
+	device_set_wakeup_capable(&input_dev->dev, device_can_wakeup(parent));
 
 	input_set_drvdata(input_dev, atkbd);
 
diff --git a/drivers/input/mouse/hgpk.c b/drivers/input/mouse/hgpk.c
index 95577c1..d5dd990 100644
--- a/drivers/input/mouse/hgpk.c
+++ b/drivers/input/mouse/hgpk.c
@@ -548,6 +548,8 @@ static void hgpk_setup_input_device(struct input_dev *input,
 		input->phys = old_input->phys;
 		input->id = old_input->id;
 		input->dev.parent = old_input->dev.parent;
+		device_set_wakeup_capable(&input->dev,
+			device_can_wakeup(&old_input->dev));
 	}
 
 	memset(input->evbit, 0, sizeof(input->evbit));
diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c
index 3f74bae..de8ecd5 100644
--- a/drivers/input/mouse/psmouse-base.c
+++ b/drivers/input/mouse/psmouse-base.c
@@ -1232,8 +1232,10 @@ static int psmouse_switch_protocol(struct psmouse *psmouse,
 {
 	const struct psmouse_protocol *selected_proto;
 	struct input_dev *input_dev = psmouse->dev;
+	struct device *parent = &psmouse->ps2dev.serio->dev;
 
-	input_dev->dev.parent = &psmouse->ps2dev.serio->dev;
+	input_dev->dev.parent = parent;
+	device_set_wakeup_capable(&input_dev->dev, device_can_wakeup(parent));
 
 	input_dev->evbit[0] = BIT_MASK(EV_KEY) | BIT_MASK(EV_REL);
 	input_dev->keybit[BIT_WORD(BTN_MOUSE)] =
diff --git a/drivers/input/serio/i8042-io.h b/drivers/input/serio/i8042-io.h
index 5d48bb6..296633c 100644
--- a/drivers/input/serio/i8042-io.h
+++ b/drivers/input/serio/i8042-io.h
@@ -92,4 +92,8 @@ static inline void i8042_platform_exit(void)
 #endif
 }
 
+static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
+{
+}
+
 #endif /* _I8042_IO_H */
diff --git a/drivers/input/serio/i8042-ip22io.h b/drivers/input/serio/i8042-ip22io.h
index ee1ad27..c5b76a4 100644
--- a/drivers/input/serio/i8042-ip22io.h
+++ b/drivers/input/serio/i8042-ip22io.h
@@ -73,4 +73,8 @@ static inline void i8042_platform_exit(void)
 #endif
 }
 
+static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
+{
+}
+
 #endif /* _I8042_IP22_H */
diff --git a/drivers/input/serio/i8042-jazzio.h b/drivers/input/serio/i8042-jazzio.h
index 13fd710..a11913a 100644
--- a/drivers/input/serio/i8042-jazzio.h
+++ b/drivers/input/serio/i8042-jazzio.h
@@ -66,4 +66,8 @@ static inline void i8042_platform_exit(void)
 #endif
 }
 
+static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
+{
+}
+
 #endif /* _I8042_JAZZ_H */
diff --git a/drivers/input/serio/i8042-ppcio.h b/drivers/input/serio/i8042-ppcio.h
index f708c75..c9f4292 100644
--- a/drivers/input/serio/i8042-ppcio.h
+++ b/drivers/input/serio/i8042-ppcio.h
@@ -52,6 +52,10 @@ static inline void i8042_platform_exit(void)
 {
 }
 
+static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
+{
+}
+
 #else
 
 #include "i8042-io.h"
diff --git a/drivers/input/serio/i8042-snirm.h b/drivers/input/serio/i8042-snirm.h
index 409a934..96d034f 100644
--- a/drivers/input/serio/i8042-snirm.h
+++ b/drivers/input/serio/i8042-snirm.h
@@ -72,4 +72,8 @@ static inline void i8042_platform_exit(void)
 
 }
 
+static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
+{
+}
+
 #endif /* _I8042_SNIRM_H */
diff --git a/drivers/input/serio/i8042-sparcio.h b/drivers/input/serio/i8042-sparcio.h
index 395a9af..e5381d3 100644
--- a/drivers/input/serio/i8042-sparcio.h
+++ b/drivers/input/serio/i8042-sparcio.h
@@ -154,4 +154,8 @@ static inline void i8042_platform_exit(void)
 }
 #endif /* !CONFIG_PCI */
 
+static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
+{
+}
+
 #endif /* _I8042_SPARCIO_H */
diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h
index bb9f5d3..76b2e58 100644
--- a/drivers/input/serio/i8042-x86ia64io.h
+++ b/drivers/input/serio/i8042-x86ia64io.h
@@ -875,6 +875,10 @@ static inline int i8042_pnp_init(void) { return 0; }
 static inline void i8042_pnp_exit(void) { }
 #endif
 
+static inline void i8042_platform_suspend(struct device *dev, bool may_wakeup)
+{
+}
+
 static int __init i8042_platform_init(void)
 {
 	int retval;
diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c
index d37a48e..d275130 100644
--- a/drivers/input/serio/i8042.c
+++ b/drivers/input/serio/i8042.c
@@ -87,6 +87,7 @@ MODULE_PARM_DESC(debug, "Turn i8042 debugging mode on and off");
 #endif
 
 static bool i8042_bypass_aux_irq_test;
+static bool i8042_enable_wakeup;
 
 #include "i8042.h"
 
@@ -1082,10 +1083,17 @@ static void i8042_dritek_enable(void)
  * before suspending.
  */
 
-static int i8042_controller_resume(bool force_reset)
+static int i8042_controller_resume(bool force_reset, bool soft_resume)
 {
 	int error;
 
+	/*
+	 * If device is selected as a wakeup source, it was not powered down
+	 * or reset during suspend, so we have very little to do.
+	 */
+	if (soft_resume)
+		goto soft;
+
 	error = i8042_controller_check();
 	if (error)
 		return error;
@@ -1129,6 +1137,7 @@ static int i8042_controller_resume(bool force_reset)
 	if (i8042_ports[I8042_KBD_PORT_NO].serio)
 		i8042_enable_kbd_port();
 
+soft:
 	i8042_interrupt(0, NULL);
 
 	return 0;
@@ -1146,14 +1155,48 @@ static int i8042_pm_reset(struct device *dev)
 	return 0;
 }
 
+static int i8042_pm_suspend(struct device *dev)
+{
+	i8042_platform_suspend(dev, device_may_wakeup(dev));
+
+	/*
+	 * If device is selected as a wakeup source, don't powerdown or reset
+	 * during suspend. Just disable IRQs to ensure race-free resume.
+	 */
+	if (device_may_wakeup(dev)) {
+		if (i8042_kbd_irq_registered)
+			disable_irq(I8042_KBD_IRQ);
+		if (i8042_aux_irq_registered)
+			disable_irq(I8042_AUX_IRQ);
+		return 0;
+	}
+
+	return i8042_pm_reset(dev);
+}
+
 static int i8042_pm_resume(struct device *dev)
 {
 	/*
 	 * On resume from S2R we always try to reset the controller
 	 * to bring it in a sane state. (In case of S2D we expect
 	 * BIOS to reset the controller for us.)
+	 * This function call will also handle any pending keypress event
+	 * (perhaps the system wakeup reason)
+	 */
+	int r = i8042_controller_resume(true, device_may_wakeup(dev));
+
+	/* If the device was left running during suspend, enable IRQs again
+	 * now. Must be done last to avoid races with interrupt processing
+	 * inside i8042_controller_resume.
 	 */
-	return i8042_controller_resume(true);
+	if (device_may_wakeup(dev)) {
+		if (i8042_kbd_irq_registered)
+			enable_irq(I8042_KBD_IRQ);
+		if (i8042_aux_irq_registered)
+			enable_irq(I8042_AUX_IRQ);
+	}
+
+	return r;
 }
 
 static int i8042_pm_thaw(struct device *dev)
@@ -1165,11 +1208,11 @@ static int i8042_pm_thaw(struct device *dev)
 
 static int i8042_pm_restore(struct device *dev)
 {
-	return i8042_controller_resume(false);
+	return i8042_controller_resume(false, false);
 }
 
 static const struct dev_pm_ops i8042_pm_ops = {
-	.suspend	= i8042_pm_reset,
+	.suspend	= i8042_pm_suspend,
 	.resume		= i8042_pm_resume,
 	.thaw		= i8042_pm_thaw,
 	.poweroff	= i8042_pm_reset,
@@ -1192,6 +1235,7 @@ static int __init i8042_create_kbd_port(void)
 {
 	struct serio *serio;
 	struct i8042_port *port = &i8042_ports[I8042_KBD_PORT_NO];
+	struct device *parent = &i8042_platform_device->dev;
 
 	serio = kzalloc(sizeof(struct serio), GFP_KERNEL);
 	if (!serio)
@@ -1203,7 +1247,8 @@ static int __init i8042_create_kbd_port(void)
 	serio->stop		= i8042_stop;
 	serio->close		= i8042_port_close;
 	serio->port_data	= port;
-	serio->dev.parent	= &i8042_platform_device->dev;
+	serio->dev.parent	= parent;
+	device_set_wakeup_capable(&serio->dev, device_can_wakeup(parent));
 	strlcpy(serio->name, "i8042 KBD port", sizeof(serio->name));
 	strlcpy(serio->phys, I8042_KBD_PHYS_DESC, sizeof(serio->phys));
 
@@ -1218,6 +1263,7 @@ static int __init i8042_create_aux_port(int idx)
 	struct serio *serio;
 	int port_no = idx < 0 ? I8042_AUX_PORT_NO : I8042_MUX_PORT_NO + idx;
 	struct i8042_port *port = &i8042_ports[port_no];
+	struct device *parent = &i8042_platform_device->dev;
 
 	serio = kzalloc(sizeof(struct serio), GFP_KERNEL);
 	if (!serio)
@@ -1228,7 +1274,8 @@ static int __init i8042_create_aux_port(int idx)
 	serio->start		= i8042_start;
 	serio->stop		= i8042_stop;
 	serio->port_data	= port;
-	serio->dev.parent	= &i8042_platform_device->dev;
+	serio->dev.parent	= parent;
+	device_set_wakeup_capable(&serio->dev, device_can_wakeup(parent));
 	if (idx < 0) {
 		strlcpy(serio->name, "i8042 AUX port", sizeof(serio->name));
 		strlcpy(serio->phys, I8042_AUX_PHYS_DESC, sizeof(serio->phys));
@@ -1403,6 +1450,9 @@ static int __init i8042_probe(struct platform_device *dev)
 		i8042_dritek_enable();
 #endif
 
+	if (i8042_enable_wakeup)
+		device_set_wakeup_capable(&dev->dev, true);
+
 	if (!i8042_noaux) {
 		error = i8042_setup_aux();
 		if (error && error != -ENODEV && error != -EBUSY)
diff --git a/drivers/input/serio/serio.c b/drivers/input/serio/serio.c
index ba70058..9e2fdb4 100644
--- a/drivers/input/serio/serio.c
+++ b/drivers/input/serio/serio.c
@@ -931,7 +931,7 @@ static int serio_uevent(struct device *dev, struct kobj_uevent_env *env)
 #endif /* CONFIG_HOTPLUG */
 
 #ifdef CONFIG_PM
-static int serio_suspend(struct device *dev)
+static int serio_poweroff(struct device *dev)
 {
 	struct serio *serio = to_serio_port(dev);
 
@@ -940,7 +940,16 @@ static int serio_suspend(struct device *dev)
 	return 0;
 }
 
-static int serio_resume(struct device *dev)
+static int serio_suspend(struct device *dev)
+{
+	/* If configured as a wakeup source, don't power off. */
+	if (device_may_wakeup(dev))
+		return 0;
+
+	return serio_poweroff(dev);
+}
+
+static int serio_restore(struct device *dev)
 {
 	struct serio *serio = to_serio_port(dev);
 
@@ -953,11 +962,23 @@ static int serio_resume(struct device *dev)
 	return 0;
 }
 
+static int serio_resume(struct device *dev)
+{
+	/*
+	 * If configured as a wakeup source, we didn't power off during
+	 * suspend, and hence have nothing to do.
+	 */
+	if (device_may_wakeup(dev))
+		return 0;
+
+	return serio_restore(dev);
+}
+
 static const struct dev_pm_ops serio_pm_ops = {
 	.suspend	= serio_suspend,
 	.resume		= serio_resume,
-	.poweroff	= serio_suspend,
-	.restore	= serio_resume,
+	.poweroff	= serio_poweroff,
+	.restore	= serio_restore,
 };
 #endif /* CONFIG_PM */
 
-- 
1.7.6

^ permalink raw reply related

* Re: [PATCH 05/13] PM: QoS: support the dynamic insertion and removal of devices
From: Jean Pihet @ 2011-08-02 10:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <201107310038.30710.rjw@sisk.pl>

Rafael,

2011/7/31 Rafael J. Wysocki <rjw@sisk.pl>:
> On Thursday, July 28, 2011, jean.pihet@newoldbits.com wrote:
...

>> @@ -113,6 +109,8 @@ void device_pm_remove(struct device *dev)
>>  {
>>       pr_debug("PM: Removing info for %s:%s\n",
>>                dev->bus ? dev->bus->name : "No Bus", dev_name(dev));
>> +     /* Call PM QoS to de-init the per-device latency constraints */
>> +     pm_qos_dev_constraints_deinit(dev);
>
> I'd call this function "dev_pm_qos_constraints_destroy()" (and the previous
> one "dev_pm_qos_constraints_init()" for consistency).
Ok

...
>> +/* Called from the device PM subsystem at device init */
>> +void pm_qos_dev_constraints_init(struct device *dev)
>> +{
>> +     plist_head_init(&dev->power.latency_constraints.list, &dev->power.lock);
>> +     dev->power.latency_constraints.target_value =
>> +                                     PM_QOS_DEV_LAT_DEFAULT_VALUE;
>> +     dev->power.latency_constraints.default_value =
>> +                                     PM_QOS_DEV_LAT_DEFAULT_VALUE;
>> +     dev->power.latency_constraints.type = PM_QOS_MIN;
>> +     dev->power.latency_constraints_init = 1;
>
> You could avoid adding this field if there were a PM_QOS_UNINITIALIZED
> (or PM_QOS_UNKNOWN) type.
>
> And if you _really_ want to have a separate field, why don't you put it
> into latency_constraints ?
Ok I remove latency_constraints_init and use the type field instead.

...
>
> Thanks,
> Rafael

Thanks,
Jean

^ permalink raw reply

* Re: [PATCH 04/13] PM: QoS: implement the per-device latency constraints
From: Jean Pihet @ 2011-08-02 10:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <201107310030.29286.rjw@sisk.pl>

Rafael,

2011/7/31 Rafael J. Wysocki <rjw@sisk.pl>:
> On Thursday, July 28, 2011, jean.pihet@newoldbits.com wrote:
>> From: Jean Pihet <j-pihet@ti.com>
>>
>> Re-design the PM QoS implementation to support the per-device
>> constraints:
>
> Well, I guess I should have reviewed this patch before [03/13].
Hmm indeed. The split of patches into API and implementation patches
wakes it rather confusing. I could add a comment in [03/13] about it.

...

>> - Misc fixes to improve code readability:
>>   . rename of fields names (request, list, constraints, class),
>
> Please avoid doing renames along with functional changes.  It makes reviewing
> extremely hard.
Ok I will move the renames in a different patch.

...
>> @@ -97,7 +97,12 @@ void device_pm_add(struct device *dev)
>>                       dev_name(dev->parent));
>>       list_add_tail(&dev->power.entry, &dpm_list);
>>       mutex_unlock(&dpm_list_mtx);
>> -     plist_head_init(&dev->power.latency_constraints, &dev->power.lock);
>> +     plist_head_init(&dev->power.latency_constraints.list, &dev->power.lock);
>> +     dev->power.latency_constraints.target_value =
>> +                                     PM_QOS_DEV_LAT_DEFAULT_VALUE;
>> +     dev->power.latency_constraints.default_value =
>> +                                     PM_QOS_DEV_LAT_DEFAULT_VALUE;
>> +     dev->power.latency_constraints.type = PM_QOS_MIN;
>
> Perhaps add a helper doing these assignments?
This code is later moved into the device insertion and removal
functions. Cf. [05/13].

...
>> @@ -464,7 +465,7 @@ struct dev_pm_info {
>>       unsigned long           accounting_timestamp;
>>       void                    *subsys_data;  /* Owned by the subsystem. */
>>  #endif
>> -     struct plist_head       latency_constraints;
>> +     struct pm_qos_constraints       latency_constraints;
>
> Why don't you simply call it "qos"?  The data type provides the information
> about what it's for now.
Ok

...
>> +struct pm_qos_constraints {
>> +     struct plist_head list;
>> +     /*
>> +      * Do not change target_value to 64 bit in order to guarantee
>> +      * accesses atomicity
>> +      */
>
> The comment doesn't belong here.  Please put it outside of the structure
> definition or after the field name (or both, in which case the "inline"
> one may be shorter, like "must not be 64-bit").
Ok

>
>> +     s32 target_value;
>> +     s32 default_value;
>> +     enum pm_qos_type type;
>> +};
>> +
>> +/*
>> + * Struct that is pre-allocated by the caller.
>> + * The handle is kept for future use (update, removal)
>> + */
>>  struct pm_qos_request {
>> -     struct plist_node list;
>> +     struct plist_node node;
>
> Please avoid doing such things along with functional changes.
Ok

...
>> +enum pm_qos_req_action {
>> +     PM_QOS_ADD_REQ,
>> +     PM_QOS_UPDATE_REQ,
>> +     PM_QOS_REMOVE_REQ
>>  };
>
> A comment describing the meaning of these would be helpful.
Ok

...
>> -static inline int pm_qos_get_value(struct pm_qos_object *o)
>> +static inline int pm_qos_get_value(struct pm_qos_constraints *c)
>
> I'd remove the "inline" if you're at it.  It's only a hint if the kernel
> is not built with "always inline" and the compiler should do the inlining
> anyway if that's a better choice.
Ok

...
>> -static inline s32 pm_qos_read_value(struct pm_qos_object *o)
>> +/*
>> + * pm_qos_read_value atomically reads and returns target_value.
>
> We have a standard for writing kerneldoc comments, please follow it.
Ok

>
>> + * target_value is updated upon update of the constraints list, using
>> + * pm_qos_set_value.
>> + *
>> + * Note: The lockless read path depends on the CPU accessing
>> + * target_value atomically.  Atomic access is only guaranteed on all CPU
>> + * types linux supports for 32 bit quantites
>
> You should say "data types" rather than "quantities" here.
Ok

>
>> + */
>> +static inline s32 pm_qos_read_value(struct pm_qos_constraints *c)
>>  {
>> -     return o->target_value;
>> +     if (c)
>> +             return c->target_value;
>> +     else
>> +             return 0;
>>  }
>>
>> -static inline void pm_qos_set_value(struct pm_qos_object *o, s32 value)
>> +static inline void pm_qos_set_value(struct pm_qos_constraints *c, s32 value)
>>  {
>> -     o->target_value = value;
>> +     c->target_value = value;
>>  }
>
> Well, I'm not sure that this function is necessary at all.  You might as well
> simply remove it as far as I'm concerned.
The idea is to provide an efficient and lockless way to get the
aggregated constraint class value. When the constraints a re changing
the new value is calculated and stored using pm_qos_get_value and
pm_qos_set_value. Then pm_qos_read_value is used to retrieve the
value. For example cpuidle calls pm_qos_request which uses
pm_qos_read_value to get the CPU/DMA minimum latency.

>
>> -static void update_target(struct pm_qos_object *o, struct plist_node *node,
>> -                       int del, int value)
>> +static void update_target(struct pm_qos_request *req,
>> +                       enum pm_qos_req_action action, int value)
>>  {
>>       unsigned long flags;
>> -     int prev_value, curr_value;
>> +     int prev_value, curr_value, new_value;
>> +     struct pm_qos_object *o = pm_qos_array[req->class];
>> +     struct pm_qos_constraints *c;
>> +
>> +     switch (req->class) {
>> +     case PM_QOS_DEV_LATENCY:
>> +             if (!req->dev) {
>> +                     WARN(1, KERN_ERR "PM QoS API called with NULL dev\n");
>> +                     return;
>> +             }
>> +             c = &req->dev->power.latency_constraints;
>> +             break;
>> +     case PM_QOS_CPU_DMA_LATENCY:
>> +     case PM_QOS_NETWORK_LATENCY:
>> +     case PM_QOS_NETWORK_THROUGHPUT:
>> +             c = o->constraints;
>> +             break;
>> +     case PM_QOS_RESERVED:
>> +     default:
>> +             WARN(1, KERN_ERR "PM QoS API called with wrong class %d, "
>> +                     "req 0x%p\n", req->class, req);
>> +             return;
>> +     }
>
> Do we _really_ need that switch()?
>
> What about introducing dev_pm_qos_add_request() and friends specifically
> for devices, such that they will take the target device object (dev) as
> their first argument?  Then, you could keep pm_qos_add_request() pretty
> much as is, right?
Yes but in that case I need to duplicate the API functions for devices
(add, update, remove). Those functions will call update_target.
I prefer the way this patch does it: simplify the API functions and
have only 1 place with the constraints management and notification
logic (in update_target).

>
>>
>>       spin_lock_irqsave(&pm_qos_lock, flags);
>> -     prev_value = pm_qos_get_value(o);
>> -     /* PM_QOS_DEFAULT_VALUE is a signal that the value is unchanged */
>> -     if (value != PM_QOS_DEFAULT_VALUE) {
>> +
>> +     prev_value = pm_qos_get_value(c);
>> +     if (value == PM_QOS_DEFAULT_VALUE)
>> +             new_value = c->default_value;
>> +     else
>> +             new_value = value;
>
> What about writing that as
>
>        new_value = value != PM_QOS_DEFAULT_VALUE ? value : c->default_value;
This is shorter but more difficult to read ;8

>
>> +
>> +     switch (action) {
>> +     case PM_QOS_REMOVE_REQ:
>> +             plist_del(&req->node, &c->list);
>> +             break;
>> +     case PM_QOS_UPDATE_REQ:
>>               /*
>>                * to change the list, we atomically remove, reinit
>>                * with new value and add, then see if the extremal
>>                * changed
>>                */
>> -             plist_del(node, &o->requests);
>> -             plist_node_init(node, value);
>> -             plist_add(node, &o->requests);
>> -     } else if (del) {
>> -             plist_del(node, &o->requests);
>> -     } else {
>> -             plist_add(node, &o->requests);
>> +             plist_del(&req->node, &c->list);
>> +     case PM_QOS_ADD_REQ:
>> +             plist_node_init(&req->node, new_value);
>> +             plist_add(&req->node, &c->list);
>> +             break;
>> +     default:
>> +             /* no action */
>> +             ;
>>       }
>> -     curr_value = pm_qos_get_value(o);
>> -     pm_qos_set_value(o, curr_value);
>> +
>> +     curr_value = pm_qos_get_value(c);
>> +     pm_qos_set_value(c, curr_value);
>>       spin_unlock_irqrestore(&pm_qos_lock, flags);
>>
>>       if (prev_value != curr_value)
>>               blocking_notifier_call_chain(o->notifiers,
>
> That's why I'm thinking that it would be helpful to have a pointer
> to the notifier list from struct pm_qos_constraints .
I think having a per-device notifier list is complicating things. If a
subsystem needs te be notified it needs to register a notifier
callback for _every_ device in the system.
As a first implementation for OMAP the low-level PM code is using a
single notifier to retrieve all the devices latency constraints and
apply them to the pwer domains. I do not see how this could work with
a per-device notifier list.

>
> Besides, you can use "pm_qos_array[req->class]->notifiers" instead of
> "o->notifiers".
Ok

...
>> +static int find_pm_qos_object_by_minor(int minor)
>> +{
>> +     int pm_qos_class;
>> +
>> +     for (pm_qos_class = 0;
>> +             pm_qos_class < PM_QOS_NUM_CLASSES; pm_qos_class++) {
>> +             if (minor ==
>> +                     pm_qos_array[pm_qos_class]->pm_qos_power_miscdev.minor)
>> +                     return pm_qos_class;
>> +     }
>> +     return -1;
>> +}
>
> This function doesn't seem to be used anywhere, what's the purpose of it?
It is used by pm_qos_power_open in order to retrieve the class
associated with the MISC device.
BTW this patch is moving the code so that all the MISC related
functions are grouped together.

>
>> +
>>  static int pm_qos_power_open(struct inode *inode, struct file *filp)
>>  {
>>       struct pm_qos_parameters pm_qos_params;
...
>
> Thanks,
> Rafael

Thanks,
Jean

^ permalink raw reply

* Re: [PATCH 01/13] PM: QoS: rename pm_qos_params files to pm_qos
From: Rafael J. Wysocki @ 2011-08-02  9:47 UTC (permalink / raw)
  To: Jean Pihet
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <CAORVsuX1iXnQS5LZeKb7nJa1_QCmnHDpDZBWyQab6rcLqig=TQ@mail.gmail.com>

On Tuesday, August 02, 2011, Jean Pihet wrote:
> Rafael,
> 
> 2011/7/29 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Thursday, July 28, 2011, jean.pihet@newoldbits.com wrote:
> >> From: Jean Pihet <j-pihet@ti.com>
> >>
> >> The PM QoS implementation files are better named
> >> kernel/pm_qos.c and include/linux/pm_qos.h.
> >>
> ...
> 
> >>  create mode 100644 include/linux/pm_qos.h
> >>  delete mode 100644 include/linux/pm_qos_params.h
> >
> > That I agree with.
> >
> >>  create mode 100644 kernel/pm_qos.c
> >>  delete mode 100644 kernel/pm_qos_params.c
> >
> > As I said, please move that file to kernel/power and call it qos.c.
> Ok
> 
> >
> > That said the device interface should be located in drivers/base/power
> > to follow our current conventions.
> By device interface I understand the following:
> - the user space API (per-device sysfs entry),
> - the in-kernel device specific PM QoS API. If needed, cf. comments on
> [04/13] about that.
> 
> Is that correct?

Yes, it is.

I think that the definitions of the dev_pm_qos_*() funtions mentioned
in there may go into drivers/base/power/qos.c and they can call functions
from kernel/power/qos.c.

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH 03/13] PM: QoS: extend the in-kernel API with per-device latency constraints
From: Jean Pihet @ 2011-08-02  9:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <201107300055.49468.rjw@sisk.pl>

Rafael,

2011/7/30 Rafael J. Wysocki <rjw@sisk.pl>:
> On Thursday, July 28, 2011, jean.pihet@newoldbits.com wrote:
>> From: Jean Pihet <j-pihet@ti.com>
>>
>> Extend the PM QoS kernel API:
>> - add a new PM QoS class PM_QOS_DEV_LATENCY for device wake-up latency
>> constraints
>> - make the pm_qos_add_request API more generic by using a parameter of
>> type struct pm_qos_parameters
>> - minor clean-ups and rename of struct fields:
>>   . rename pm_qos_class to class and pm_qos_req to req in internal code
>>   . consistenly use req and params as the API parameters
>>   . rename struct pm_qos_request_list to struct pm_qos_request
>> - update the in-kernel API callers to the new API
>
> There should be more about the motivation in the changelog.  It says
> what the patch is doing, but it doesn't say a word of _why_ it is done in
> the first place.
Ok will add a comment

>
> Second, you're renaming a lot of things in the same patch that makes
> functional changes.  This is confusing and makes it very difficult to
> understand what's going on.  Please use separate patches to rename
> things, when possible, and avoid renaming things that don't need to be
> renamed.
Sorry about that! I will split up the patches.

...
>>
>> -struct pm_qos_request_list {
>> +struct pm_qos_request {
>
> This renaming should go in a separate patch, really.
Ok

>
>>       struct plist_node list;
>> -     int pm_qos_class;
>> +     int class;
>
> This renaming doesn't seem to be necessary.  Moreover, it's confusing,
> because there already is struct class (for device classes) and a member
> called "class" in struct device and they aren't related to this one.
I will keep pm_qos_class if the rename brings in some confusion. The
intention was to simplify the names of the fields in structs with
explicit names.

>
>> +     struct device *dev;
>>  };
>>
>> -void pm_qos_add_request(struct pm_qos_request_list *l, int pm_qos_class, s32 value);
>> -void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
>> -             s32 new_value);
>> -void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req);
>> +struct pm_qos_parameters {
>> +     int class;
>> +     struct device *dev;
>> +     s32 value;
>> +};
>
> What exactly is the "dev" member needed for?
This is the target device that is passed as parameter to the API. It
is used for constraints of the devices latency class.

...
>> +static BLOCKING_NOTIFIER_HEAD(dev_lat_notifier);
>> +static struct pm_qos_object dev_pm_qos = {
>> +     .requests = PLIST_HEAD_INIT(dev_pm_qos.requests, pm_qos_lock),
>> +     .notifiers = &dev_lat_notifier,
>> +     .name = "dev_latency",
>> +     .target_value = PM_QOS_DEV_LAT_DEFAULT_VALUE,
>> +     .default_value = PM_QOS_DEV_LAT_DEFAULT_VALUE,
>> +     .type = PM_QOS_MIN,
>> +};
>> +
>
> You seem to be confusing things here.  Since devices will have their own lists
> of requests now (as per the previous patch), the .requests member above is not
> necessary.  Moreover, it seems to be used incorrectly below.
The idea was to split the patches as requested previously: first the
API changes (this very patch [03/13]) and then the actual
implementation ([04/13]). Is this correct?

...
>> -int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier)
>> +int pm_qos_add_notifier(int class, struct notifier_block *notifier)
>>  {
>>       int retval;
>>
>>       retval = blocking_notifier_chain_register(
>> -                     pm_qos_array[pm_qos_class]->notifiers, notifier);
>> +                     pm_qos_array[class]->notifiers, notifier);
>
> Now, there's a question if we want to have one notifier chain for all
> devices or if it's better to have a per-device notifier chain.  Both
> approaches have their own advantages and drawbacks, but in the latter
> case this code will have to be reworked.
I really think the need is for a per-class notifier. Cf. comments on
[04/13] about that.

>
> Thanks,
> Rafael

Thanks,
Jean

^ permalink raw reply

* Re: [PATCH 01/13] PM: QoS: rename pm_qos_params files to pm_qos
From: Jean Pihet @ 2011-08-02  9:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <201107292357.48969.rjw@sisk.pl>

Rafael,

2011/7/29 Rafael J. Wysocki <rjw@sisk.pl>:
> On Thursday, July 28, 2011, jean.pihet@newoldbits.com wrote:
>> From: Jean Pihet <j-pihet@ti.com>
>>
>> The PM QoS implementation files are better named
>> kernel/pm_qos.c and include/linux/pm_qos.h.
>>
...

>>  create mode 100644 include/linux/pm_qos.h
>>  delete mode 100644 include/linux/pm_qos_params.h
>
> That I agree with.
>
>>  create mode 100644 kernel/pm_qos.c
>>  delete mode 100644 kernel/pm_qos_params.c
>
> As I said, please move that file to kernel/power and call it qos.c.
Ok

>
> That said the device interface should be located in drivers/base/power
> to follow our current conventions.
By device interface I understand the following:
- the user space API (per-device sysfs entry),
- the in-kernel device specific PM QoS API. If needed, cf. comments on
[04/13] about that.

Is that correct?

>
> Thanks,
> Rafael

Thanks,
Jean

^ permalink raw reply

* Re: [Regression] Commit 02c24a82187d5a628c68edfe71ae60dc135cd178 breaks s2disk
From: Rafael J. Wysocki @ 2011-08-02  9:26 UTC (permalink / raw)
  To: Al Viro; +Cc: Jan Kara, Linux PM mailing list, Linus Torvalds, LKML,
	Josef Bacik
In-Reply-To: <20110802013947.GK2203@ZenIV.linux.org.uk>

On Tuesday, August 02, 2011, Al Viro wrote:
> On Tue, Aug 02, 2011 at 02:30:27AM +0100, Al Viro wrote:
> 
> > Applied (in file->f_mapping variant; it is equal to bdev->bd_mapping,
> > but what's wrong with using ->f_mapping here?)
> 
> BTW, I've put s-o-b: Rafael J. Wysocki <rjw@sisk.pl> in there.  Rafael,
> do you have any problems with that?

Not at all.

Thanks a lot,
Rafael

^ permalink raw reply

* Re: [PATCH 07/13] OMAP PM: early init of the pwrdms states
From: Jean Pihet @ 2011-08-02  8:57 UTC (permalink / raw)
  To: Todd Poynor
  Cc: markgross, broonie, Linux PM mailing list, linux-omap, Jean Pihet
In-Reply-To: <CAORVsuUovg7px5SGC+mt7CoDcXZ8r2d09obkXqSp=g44U1G9EQ@mail.gmail.com>

Todd,

On Fri, Jul 29, 2011 at 10:50 AM, Jean Pihet <jean.pihet@newoldbits.com> wrote:
> On Fri, Jul 29, 2011 at 10:08 AM, Todd Poynor <toddpoynor@google.com> wrote:
>> On Thu, Jul 28, 2011 at 10:30:14AM +0200, jean.pihet@newoldbits.com wrote:
...

>>> diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
>>> index 9af0847..63c3e7a 100644
>>> --- a/arch/arm/mach-omap2/powerdomain.c
>>> +++ b/arch/arm/mach-omap2/powerdomain.c
>>> @@ -108,6 +108,9 @@ static int _pwrdm_register(struct powerdomain *pwrdm)
>>>       pwrdm->state = pwrdm_read_pwrst(pwrdm);
>>>       pwrdm->state_counter[pwrdm->state] = 1;
>>>
>>> +     /* Early init of the next power state */
>>> +     pwrdm_set_next_pwrst(pwrdm, PWRDM_POWER_RET);
>>> +
>>
>> Wanted to check that it's OK to initialize the next state of a power
>> domain to RETENTION early in the boot sequence.  I believe patches
>> have been previously discussed that set the state to ON to ensure the
>> domain doesn't go to a lower state, and possibly lose context, before
>> the PM subsystem is setup to handle it?  Not sure, thought maybe worth
>> a doublecheck.
> Indeed I need to check the behavior for OMAP3 & 4 which seem to
> initialize the pwrdm states differently.
> BTW the patch that inits all pwrdms to ON is not yet in l-o master
> that is why I (lazily) submitted this one for now.

Ok I will update the patch to make it compliant with [1]. v4 will
include this change.

Thanks,
Jean

[1] http://marc.info/?l=linux-arm-kernel&m=131052762623823&w=2

>
>>
>>
>> Todd

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: MyungJoo Ham @ 2011-08-02  7:17 UTC (permalink / raw)
  To: Turquette, Mike
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, Thomas Gleixner,
	linux-pm
In-Reply-To: <CAJOA=zPz6oqga=gM7bh7762Vr=43VjweDy30KUb4EcgaWZavNg@mail.gmail.com>

On Tue, Aug 2, 2011 at 7:01 AM, Turquette, Mike <mturquette@ti.com> wrote:
>
> Maybe I'm not understanding how the devfreq requests would be made
> from drivers.  Can you explain an example where a single target device
> named X has constraints placed on it's clock rate from two different
> drivers Y & Z?  Imagine in this case that there are no performance
> counters or any way in hardware to monitor device saturation.

Ok, what you want to see is the case where X has a clock with OPP and
DEVFREQ and Y and Z are going to give constraints on that X's clock,
right?

In such a case, DEVFREQ has nothing to interfere directly with the
relation between X <--> Y/Z.

Y and Z can give constraints on X's clock with OPP interface (
opp_enable(dev, freq) and opp_disable(dev, freq) ) without the need
for DEVFREQ-aware.

DEVFREQ chooses frequencies from enabled OPPs regardless of the governor chosen.

However, if your concern is about the inconsistency between Y and Z
caused by calling opp_enable to "cancel" opp_disable, DEVFREQ provides
no protection against it and writers of Y/Z will need to do some
bothersome work unless QoS request feature (or generalized tickle) is
added to DEVFREQ. Anyway, (as will be discussed below) I guess the QoS
request feature might wait for next version of DEVFREQ.

>
>>> Some examples include a MMC controller, which might change its clock
>>> rate depending on the class of card that the user has inserted.  Or
>>> even a "smartish" device like a GPU lacking performance counters; it's
>>> driver will ramp up frequency when there is work to be done and kick
>>> off a timeout.  If no new work comes in before the timeout then the
>>> driver will drop the frequency.
>>
>> In the "simple MMC controller w/o performance counter" case, there are
>> following ways to use devfreq even if using the number of requests or
>> functions calls is not possible.
>>
>> Method 1) use "userspace" governor and let user process choose
>> frequency based on the class
>
> I'm less interested in userspace control of MMC controller operating
> frequency and much more interested in how devfreq might arbitrate QoS
> requests from multiple "client" devices.
>
>> Method 2) use any "reasonable" governor and let the device driver set
>> only "valid" frequencies enabled.
>
> Can you elaborate on this?  I'm not sure I understand how this will
> look in driver code.  Maybe the example I requested above will shed
> some light.

If you are concerned about the consistency (between Y and Z's
enable/disable calls) problem in the previous X/Y/Z example, it'd be
addressed with "generalized tickle" or "QoS requests" unless those Y
and Z are aware of each other. I'd say such a feature is for the
"next" version of DEVFREQ as it is not going to affect the framework
itself significantly.

Anyway, the interface I'm thinking about are:
Method1:
   id = devfreq_qos_request(dev, freq); /* sets dev's frequency at
freq or higher */
   devfreq_qos_release(dev, id);
OR
   devfreq_qos_request(this_dev, target_dev, freq); /* this_dev sets
target_dev's frequency at freq or higher */
   devfreq_qos_release(this_dev, target_dev);

Method1 would be suitable for usual qos requests from related devices.
If there are multiple qos requests active, the highest requested freq
is used.
Internally, devfreq will manage a sorted (descending with freq) list
of "this_dev" or "id" per target_dev and enforce the target-freq to be
>= the highest freq in the list.

Method2:
   devfreq_tickle(dev, rate, duration); /* sets dev's frequency at its
maximum frequency * rate / 100 for duration in ms */

Method2 would be suitable for reacting to inputs (e,g, a user hitting
a key, clicking a mouse, touching a screen, ...).



>
>>   For a rough example, we may do if class < 6, disable freq > 40MHz,
>> class < 10, disable freq > 80MHz, and so on. If we do not have
>> performance counters or any other mechanisms to monitor the
>> activities, "performance" governor along with clock-gated MMC driver
>> will save enough power.
>>
>> For GPUs without anything to monitor the activities, we may do the
>> same as the MMC case.
>>
>> However, with the H/W I've got now, (Exynos4210), we have performance
>> counters (PPMU) for many blocks: 3D(MALI GPU), ACP, CAMIF, CPU, DMC0,
>> DMC1 (memory controllers), FSYS, IMAGE, LCD0, LCD1, MFC_L, MFC_R, TV,
>> LEFT_BUS, and RIGHT_BUS. I don't think Exynos4 is an exceptionally
>> fancy SoC (already millions are sold for phones) and other mobile SoCs
>> (at least for flagship models) will have them very soon (or already
>> have them). Along with this patch, in the example with git branch
>> link, we control DMC0/DMC1 blocks. And,
>
> I agree devfreq is well-suited for such hardware.
>
>>> A governor is not required in these cases (as they are event driven)
>>> and devfreq is quite heavyweight for such applications.  What is
>>> needed is a QoS-style software layer that allows throughput requests
>>> to be made from an initiator device towards a target device.  This
>>> layer should aggregate requests since many initator devices may make
>>> requests to the same target device.  This layer I'm describing, which
>>> does not exist today, should be where the actual DVFS transition takes
>>> place.  That could take the form of a clk_set_rate call in the clock
>>> framework (as described by Colin in V1 of this series), or some other
>>> not-yet-realized dvfs_set_opp ,or something like Jean Pihet's
>>> per-device PM QoS patches or whatever.  For the purposes of this email
>>> I don't really care which framework implements the QoS request
>>> aggregation.
>>
>> Such aggregation could be also done with governors. If the
>> governor-device pair does not want to poll devfreq wouldn't loop
>> unless there is any governor-device pair that wants to do so. If it is
>> event-driven, users may just "allow/disallow" frequencies with OPP
>> framework and devfreq will choose proper frequency with the given
>> governor for the device. If every device uses "static" or
>> "event-driven" governors such as powersave/performance/userspace,
>> there will be no polling/looping.
>
> So drivers must disable OPPs, and then the non-polling devfreq
> governor will have to be notified by the OPP code and then run it's
> ->target code again?  This sounds backwards to me.

DEVFREQ (not its governors. governors only "recommend" proper
frequency to DEVFREQ framework when requested by DEVFREQ.) is already
being notified by any OPP changes (add/disable/enable) so that DEVFREQ
wouldn't choose disabled frequencies. That way, disabling and enabling
frequencies at OPP takes effects immediately with DEVFREQ.

More semantically sound approach may be to let OPP have a notifier
(per device) so that the changes in the opp availability go to the
"OPP consumers". However, at least for now, DEVFREQ is the only one
that needs such notification. Therefore, using such a notifier per
device only for DEVFREQ (moreover, not all of OPP'ed devices are using
DEVFREQ) could incur too much overhead as notifier is heavier than a
simple function call.

>
> devfreq seems like an ideal bit of code to understand the constraints
> needed by a device (via the workqueue/monitor loop) and then request
> those needs via the proper API.  It seems entirely wrong to me to have
> other device drivers send their QoS needs to devfreq.

Tickle is an approach for temporal QoS requests. And, I understand
that there are needs for non-tempoeral QoS requests. However, I guess
it might be ok to let QoS requests be "next TODO" subjects for
DEVFREQ. Besides, some engineers have already requested QoS request
feature for DEVFREQ in my side as well. :)

>
> I'm starting to sound like a broken record though, and I've rescinded
> my NAK in my reply to Rafael.  If you could explain how multiple
> drivers can request their performance needs to a devfreq governor
> (same question I asked above) then that would be really helpful.

Without QoS request feature (the Method1 up there), using
opp_enable/disable is the only feasible way unless tickle fits for the
need. (managing "canceling disable" could be bothersome without the
Method1 anyway...)

>
> Thanks,
> Mike
>
-- 
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

^ permalink raw reply

* Re: [Regression] Commit 02c24a82187d5a628c68edfe71ae60dc135cd178 breaks s2disk
From: Al Viro @ 2011-08-02  1:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux PM mailing list, Jan Kara, LKML, Josef Bacik
In-Reply-To: <20110802013027.GJ2203@ZenIV.linux.org.uk>

On Tue, Aug 02, 2011 at 02:30:27AM +0100, Al Viro wrote:

> Applied (in file->f_mapping variant; it is equal to bdev->bd_mapping,
> but what's wrong with using ->f_mapping here?)

BTW, I've put s-o-b: Rafael J. Wysocki <rjw@sisk.pl> in there.  Rafael,
do you have any problems with that?

^ permalink raw reply

* Re: [Regression] Commit 02c24a82187d5a628c68edfe71ae60dc135cd178 breaks s2disk
From: Linus Torvalds @ 2011-08-02  1:38 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux PM mailing list, Jan Kara, LKML, Josef Bacik
In-Reply-To: <20110802013027.GJ2203@ZenIV.linux.org.uk>

On Mon, Aug 1, 2011 at 3:30 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Applied (in file->f_mapping variant; it is equal to bdev->bd_mapping,
> but what's wrong with using ->f_mapping here?)

My only issue was that from a "mindless conversion" standpoint, most
of the other users had been changed to use inode->i_mapping.

I dunno why. I agree that file->f_mapping looks cleaner and is what
the code used to do. But maybe there was some reason why the other
fsync methods had been changed to use

  filemap_write_and_wait_range(inode->i_mapping, start, end);

instead.

           Linus

^ permalink raw reply

* Re: [Regression] Commit 02c24a82187d5a628c68edfe71ae60dc135cd178 breaks s2disk
From: Al Viro @ 2011-08-02  1:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux PM mailing list, Jan Kara, LKML, Josef Bacik
In-Reply-To: <CA+55aFxPAowBwgK7JqN0uH-ytvg+BKdE3yBGF0z_6kNDNcOYcQ@mail.gmail.com>

On Mon, Aug 01, 2011 at 03:22:02PM -1000, Linus Torvalds wrote:
> On Mon, Aug 1, 2011 at 2:17 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >
> > Well, I'm not sure if the patch below is the right fix, but it evidently makes
> > s2disk work for me again.
> 
> Looks right to me.
> 
> The only issue is whether we should use "db_mapping->i_mapping" or
> "file->f_mapping". I think they are the same for block devices.
> 
> Al?

Applied (in file->f_mapping variant; it is equal to bdev->bd_mapping,
but what's wrong with using ->f_mapping here?)

> Also, what about some of the other fsync things that apparently
> weren't updated to write back page caches. ps3flash_fsync? Others?

ps3flash, IIRC, doesn't go through page cache on read and write...
If anything, we probably have instances that bother with pagecache
for no reason...

^ permalink raw reply

* Re: [Regression] Commit 02c24a82187d5a628c68edfe71ae60dc135cd178 breaks s2disk
From: Linus Torvalds @ 2011-08-02  1:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM mailing list, Jan Kara, LKML, Josef Bacik, Al Viro
In-Reply-To: <201108020217.48952.rjw@sisk.pl>

On Mon, Aug 1, 2011 at 2:17 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> Well, I'm not sure if the patch below is the right fix, but it evidently makes
> s2disk work for me again.

Looks right to me.

The only issue is whether we should use "db_mapping->i_mapping" or
"file->f_mapping". I think they are the same for block devices.

Al?

Also, what about some of the other fsync things that apparently
weren't updated to write back page caches. ps3flash_fsync? Others?

                Linus

^ permalink raw reply

* Re: [Regression] Commit 02c24a82187d5a628c68edfe71ae60dc135cd178 breaks s2disk
From: Rafael J. Wysocki @ 2011-08-02  0:17 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, Linux PM mailing list, Linus Torvalds, LKML, Al Viro
In-Reply-To: <201108020147.01136.rjw@sisk.pl>

On Tuesday, August 02, 2011, Rafael J. Wysocki wrote:
> Hi,
> 
> Unfortunately s2disk (which is a user space tool for saving hibernate images)
> doesn't work any more after your commit 02c24a82187d5a628c68edfe71ae60dc135cd178
> (fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers).
> The symptom is that the s2disk's resume counterpart cannot find the image in
> the device even though s2disk considers it as successfully saved.
> 
> What happens is that s2disk opens a block device (disk partition to be precise)
> directly and writes to it eventually calling fsync().  My interpretation of the
> failure is that before commit 02c24a82187d5a628c68ed, when s2disk called fsync()
> the VFS layer would run filemap_write_and_wait_range() that in turn would cause
> the data to go to the disk, but after that commit this doesn't happen any more.
> 
> If I'm right, then not only s2disk, but any process writing directly to a block
> device will have a problem with fsync().

Well, I'm not sure if the patch below is the right fix, but it evidently makes
s2disk work for me again.

Thanks,
Rafael

---
 fs/block_dev.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux-2.6/fs/block_dev.c
===================================================================
--- linux-2.6.orig/fs/block_dev.c
+++ linux-2.6/fs/block_dev.c
@@ -387,6 +387,10 @@ int blkdev_fsync(struct file *filp, loff
 	struct inode *bd_inode = filp->f_mapping->host;
 	struct block_device *bdev = I_BDEV(bd_inode);
 	int error;
+	
+	error = filemap_write_and_wait_range(filp->f_mapping, start, end);
+	if (error)
+		return error;
 
 	/*
 	 * There is no need to serialise calls to blkdev_issue_flush with

^ permalink raw reply

* [Regression] Commit 02c24a82187d5a628c68edfe71ae60dc135cd178 breaks s2disk
From: Rafael J. Wysocki @ 2011-08-01 23:47 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, Linux PM mailing list, Linus Torvalds, LKML, Al Viro

Hi,

Unfortunately s2disk (which is a user space tool for saving hibernate images)
doesn't work any more after your commit 02c24a82187d5a628c68edfe71ae60dc135cd178
(fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers).
The symptom is that the s2disk's resume counterpart cannot find the image in
the device even though s2disk considers it as successfully saved.

What happens is that s2disk opens a block device (disk partition to be precise)
directly and writes to it eventually calling fsync().  My interpretation of the
failure is that before commit 02c24a82187d5a628c68ed, when s2disk called fsync()
the VFS layer would run filemap_write_and_wait_range() that in turn would cause
the data to go to the disk, but after that commit this doesn't happen any more.

If I'm right, then not only s2disk, but any process writing directly to a block
device will have a problem with fsync().

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH v4 0/3] DEVFREQ, DVFS framework for non-CPU devices
From: Turquette, Mike @ 2011-08-01 22:01 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Len Brown, Greg Kroah-Hartman, Kyungmin Park, Thomas Gleixner,
	linux-pm
In-Reply-To: <CAJ0PZbT0+ovjPAnVKaaJNVChG+kU3cgjC9sEU4ppnMGqGL+UoA@mail.gmail.com>

On Sun, Jul 31, 2011 at 11:22 PM, MyungJoo Ham <myungjoo.ham@samsung.com> wrote:
> Hello.
>
> On Sat, Jul 30, 2011 at 10:02 AM, Turquette, Mike <mturquette@ti.com> wrote:
>> On Fri, Jul 29, 2011 at 2:10 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>> On Friday, July 29, 2011, Turquette, Mike wrote:
>>>> On Thu, Jul 28, 2011 at 3:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>> > On Friday, July 15, 2011, MyungJoo Ham wrote:
>>>> >> For a usage example, please look at
>>>> >> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>>> >>
>>>> >> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>>>> >> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>>>> >> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>>>> >> and other related clocks simply follow the determined DDR RAM clock.
>>>> >>
>>>> >> The DEVFREQ driver for Exynos4210 memory bus is at
>>>> >> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>>>> >>
>>>> >> MyungJoo Ham (3):
>>>> >>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>>> >>     OPPs
>>>> >>   PM / DEVFREQ: add example governors
>>>> >>   PM / DEVFREQ: add sysfs interface (including user tickling)
>>>> >
>>>> > OK, I'm going to take the patches for 3.2.
>>>>
>>>> Have any other platforms signed up to use this mechanism to manage
>>>> their peripheral DVFS?
>>>
>>> Not that I know of, but one initial user is sufficient for me.
>>> So if you have anything _against_ the patches, please speak up.
>>
>> I do have some concerns.  Let me start by saying that I'm defining a
>> "governor" as some active piece of executing code, probably a looping
>> workqueue that inspects activity/idleness of a device and then makes a
>> determination regarding clock frequency.
>>
>> devfreq seems to be good framework for creating DVFS governors.
>> However I think that most scalable devices on an SoC do *not* need a
>> governor, and many scalable devices won't have performance counters or
>> any other way to implement such introspection.
>
> Yes, governors except for some static or userspace-driven ones (such
> as "performance", "powersave", and "userspace" although "userspace" is
> not implemented for devfreq yet), they loop workqueue that inspects
> activity/idleness of a device and determines frequency. However, the
> inspection is done with a callback provided by each device, not done
> directly by the devfreq itself. Therefore, if there is any way to
> measure the activities (not just performance counters, number of
> requests/function calls should be fine for may cases), normal
> governors like "simple-ondemand" will work.

Maybe I'm not understanding how the devfreq requests would be made
from drivers.  Can you explain an example where a single target device
named X has constraints placed on it's clock rate from two different
drivers Y & Z?  Imagine in this case that there are no performance
counters or any way in hardware to monitor device saturation.

>> Some examples include a MMC controller, which might change its clock
>> rate depending on the class of card that the user has inserted.  Or
>> even a "smartish" device like a GPU lacking performance counters; it's
>> driver will ramp up frequency when there is work to be done and kick
>> off a timeout.  If no new work comes in before the timeout then the
>> driver will drop the frequency.
>
> In the "simple MMC controller w/o performance counter" case, there are
> following ways to use devfreq even if using the number of requests or
> functions calls is not possible.
>
> Method 1) use "userspace" governor and let user process choose
> frequency based on the class

I'm less interested in userspace control of MMC controller operating
frequency and much more interested in how devfreq might arbitrate QoS
requests from multiple "client" devices.

> Method 2) use any "reasonable" governor and let the device driver set
> only "valid" frequencies enabled.

Can you elaborate on this?  I'm not sure I understand how this will
look in driver code.  Maybe the example I requested above will shed
some light.

>   For a rough example, we may do if class < 6, disable freq > 40MHz,
> class < 10, disable freq > 80MHz, and so on. If we do not have
> performance counters or any other mechanisms to monitor the
> activities, "performance" governor along with clock-gated MMC driver
> will save enough power.
>
> For GPUs without anything to monitor the activities, we may do the
> same as the MMC case.
>
> However, with the H/W I've got now, (Exynos4210), we have performance
> counters (PPMU) for many blocks: 3D(MALI GPU), ACP, CAMIF, CPU, DMC0,
> DMC1 (memory controllers), FSYS, IMAGE, LCD0, LCD1, MFC_L, MFC_R, TV,
> LEFT_BUS, and RIGHT_BUS. I don't think Exynos4 is an exceptionally
> fancy SoC (already millions are sold for phones) and other mobile SoCs
> (at least for flagship models) will have them very soon (or already
> have them). Along with this patch, in the example with git branch
> link, we control DMC0/DMC1 blocks. And,

I agree devfreq is well-suited for such hardware.

>> A governor is not required in these cases (as they are event driven)
>> and devfreq is quite heavyweight for such applications.  What is
>> needed is a QoS-style software layer that allows throughput requests
>> to be made from an initiator device towards a target device.  This
>> layer should aggregate requests since many initator devices may make
>> requests to the same target device.  This layer I'm describing, which
>> does not exist today, should be where the actual DVFS transition takes
>> place.  That could take the form of a clk_set_rate call in the clock
>> framework (as described by Colin in V1 of this series), or some other
>> not-yet-realized dvfs_set_opp ,or something like Jean Pihet's
>> per-device PM QoS patches or whatever.  For the purposes of this email
>> I don't really care which framework implements the QoS request
>> aggregation.
>
> Such aggregation could be also done with governors. If the
> governor-device pair does not want to poll devfreq wouldn't loop
> unless there is any governor-device pair that wants to do so. If it is
> event-driven, users may just "allow/disallow" frequencies with OPP
> framework and devfreq will choose proper frequency with the given
> governor for the device. If every device uses "static" or
> "event-driven" governors such as powersave/performance/userspace,
> there will be no polling/looping.

So drivers must disable OPPs, and then the non-polling devfreq
governor will have to be notified by the OPP code and then run it's
->target code again?  This sounds backwards to me.

devfreq seems like an ideal bit of code to understand the constraints
needed by a device (via the workqueue/monitor loop) and then request
those needs via the proper API.  It seems entirely wrong to me to have
other device drivers send their QoS needs to devfreq.

I'm starting to sound like a broken record though, and I've rescinded
my NAK in my reply to Rafael.  If you could explain how multiple
drivers can request their performance needs to a devfreq governor
(same question I asked above) then that would be really helpful.

Thanks,
Mike

> When it is going to be directly controlled by userspace, we'll need a
> "userspace" governor (same with userspace governor of cpufreq).
>
> If there is a QoS request for a devfreq-ed device, the request could
> be done with OPP's frequency enable/disable. If a device is to be
> executed at 400MHz or faster, all frequencies under 400MHz could be
> simply disabled w/ OPP. Devfreq governors cannot override such
> frequency enable/disable configurations.
>
> However, if such QoS requests need delays (timers) like tickle, a
> generalized tickle supplied with frequency or percent of max-frequency
> might work. (i.e., tickle(dev, freuqency, duration); ) Then, this
> generalized tickle will hold at the request frequency or higher by
> disabling lower frequencies temporarily.
>
>>
>> The point of describing this non-existant API is that devfreq should
>> really be just another input into it.  A governor that can measure bus
>> saturation is really cool, but it may not yield optimal results
>> compared to several drivers which make QoS-style requests and insure
>> that performance is guaranteed for their particular needs during their
>> transactions.  The good news is that we don't have to choose between
>> performance counter introspection and software QoS requests: both the
>> driver requests and the governor should all feed as inputs into the
>> QoS-style DVFS mechanism.
>>
>> Taking that logic to its inevitable conclusion, tickle doesn't belong
>> inside the governor at all.  If some device X wants to ramp up the
>> frequency of device Y, it should just make a QoS-style throughput
>> request towards device Y, possibly with a timeout (keeping the
>> original idea of tickle intact).  This is entirely a separate idea
>> from a governor's introspective workqueue loop.
>
> Although tickle is sharing the same loop with governors, tickle does
> not belong inside governors. Tickle overrides the decisions of
> governors; governor's decision function is not called if the device is
> being tickled. However, generalizing the tickle function so that it
> may take "at least at xx % of max frequency" or "operate at least xx
> khz" as an option seems reasonable for QoS requests. And such options
> might be implemented for next version of devfreq later. This requires
> modification in tickle function interface or adding another interface
> for tickle function. However, if such QoS requests do not need
> duration set, we can just go with OPP's frequency enable/disable and
> disable lower-than-QoS-requirement frequencies.
>
> Thus, I guess this QoS issue is somewhat not very significant for
> devfreq. And it can be easily mitigated by adding another interface or
> modifying the interface of tickle function.
>
>>
>> For userspace, a sysfs entry for tickle would also not feed into the
>> governor, but some dummy struct device *user would probably be the
>> initiator device and it would simply call the QoS-style throughput
>> API.
>>
>> In summary my objections to this series are:
>> 1) devfreq should not be the *final* software layer to invoke a DVFS
>> transition as it has not taken all constraints into account.
>> 2) a devfreq governor represents just one constraint out of many to be
>> considered for any given scalable device.
>
> If the concern is about the QoS requests, I guess generalizing tickle
> would be sufficient as above. For devices without performance counters
> and any other mechanisms to infer the usage statistics, "performance"
> governor with event-driven OPP freq-enable/disable should be fine.
>
>>
>> My objection to these patches getting merged is that I think they are
>> a bit ahead of their time.  We need to know what the real DVFS API
>> looks like underneath devfreq first, since devfreq should really be
>> built on top of it.
>>
>> Regards,
>> Mike
>>
>>> Thanks,
>>> Rafael
>>>
>> _______________________________________________
>> linux-pm mailing list
>> linux-pm@lists.linux-foundation.org
>> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>>
>
> Cheers!
> MyungJoo.
>
>
> --
> MyungJoo Ham (함명주), Ph.D.
> Mobile Software Platform Lab,
> Digital Media and Communications (DMC) Business
> Samsung Electronics
> cell: 82-10-6714-2858
>
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox