* Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches
From: Waiman Long @ 2019-07-02 19:15 UTC (permalink / raw)
To: David Rientjes
Cc: Christoph Lameter, Pekka Enberg, Joonsoo Kim, Andrew Morton,
Alexander Viro, Jonathan Corbet, Luis Chamberlain, Kees Cook,
Johannes Weiner, Michal Hocko, Vladimir Davydov, linux-mm,
linux-doc, linux-fsdevel, cgroups, linux-kernel, Roman Gushchin,
Shakeel Butt, Andrea Arcangeli
In-Reply-To: <alpine.DEB.2.21.1907021206000.67286@chino.kir.corp.google.com>
On 7/2/19 3:09 PM, David Rientjes wrote:
> On Tue, 2 Jul 2019, Waiman Long wrote:
>
>> diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
>> index 29601d93a1c2..2a3d0fc4b4ac 100644
>> --- a/Documentation/ABI/testing/sysfs-kernel-slab
>> +++ b/Documentation/ABI/testing/sysfs-kernel-slab
>> @@ -429,10 +429,12 @@ KernelVersion: 2.6.22
>> Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
>> Christoph Lameter <cl@linux-foundation.org>
>> Description:
>> - The shrink file is written when memory should be reclaimed from
>> - a cache. Empty partial slabs are freed and the partial list is
>> - sorted so the slabs with the fewest available objects are used
>> - first.
>> + A value of '1' is written to the shrink file when memory should
>> + be reclaimed from a cache. Empty partial slabs are freed and
>> + the partial list is sorted so the slabs with the fewest
>> + available objects are used first. When a value of '2' is
>> + written, all the corresponding child memory cgroup caches
>> + should be shrunk as well. All other values are invalid.
>>
> This should likely call out that '2' also does '1', that might not be
> clear enough.
You are right. I will reword the text to make it clearer.
>> What: /sys/kernel/slab/cache/slab_size
>> Date: May 2007
>> diff --git a/mm/slab.h b/mm/slab.h
>> index 3b22931bb557..a16b2c7ff4dd 100644
>> --- a/mm/slab.h
>> +++ b/mm/slab.h
>> @@ -174,6 +174,7 @@ int __kmem_cache_shrink(struct kmem_cache *);
>> void __kmemcg_cache_deactivate(struct kmem_cache *s);
>> void __kmemcg_cache_deactivate_after_rcu(struct kmem_cache *s);
>> void slab_kmem_cache_release(struct kmem_cache *);
>> +int kmem_cache_shrink_all(struct kmem_cache *s);
>>
>> struct seq_file;
>> struct file;
>> diff --git a/mm/slab_common.c b/mm/slab_common.c
>> index 464faaa9fd81..493697ba1da5 100644
>> --- a/mm/slab_common.c
>> +++ b/mm/slab_common.c
>> @@ -981,6 +981,49 @@ int kmem_cache_shrink(struct kmem_cache *cachep)
>> }
>> EXPORT_SYMBOL(kmem_cache_shrink);
>>
>> +/**
>> + * kmem_cache_shrink_all - shrink a cache and all its memcg children
>> + * @s: The root cache to shrink.
>> + *
>> + * Return: 0 if successful, -EINVAL if not a root cache
>> + */
>> +int kmem_cache_shrink_all(struct kmem_cache *s)
>> +{
>> + struct kmem_cache *c;
>> +
>> + if (!IS_ENABLED(CONFIG_MEMCG_KMEM)) {
>> + kmem_cache_shrink(s);
>> + return 0;
>> + }
>> + if (!is_root_cache(s))
>> + return -EINVAL;
>> +
>> + /*
>> + * The caller should have a reference to the root cache and so
>> + * we don't need to take the slab_mutex. We have to take the
>> + * slab_mutex, however, to iterate the memcg caches.
>> + */
>> + get_online_cpus();
>> + get_online_mems();
>> + kasan_cache_shrink(s);
>> + __kmem_cache_shrink(s);
>> +
>> + mutex_lock(&slab_mutex);
>> + for_each_memcg_cache(c, s) {
>> + /*
>> + * Don't need to shrink deactivated memcg caches.
>> + */
>> + if (s->flags & SLAB_DEACTIVATED)
>> + continue;
>> + kasan_cache_shrink(c);
>> + __kmem_cache_shrink(c);
>> + }
>> + mutex_unlock(&slab_mutex);
>> + put_online_mems();
>> + put_online_cpus();
>> + return 0;
>> +}
>> +
>> bool slab_is_available(void)
>> {
>> return slab_state >= UP;
> I'm wondering how long this could take, i.e. how long we hold slab_mutex
> while we traverse each cache and shrink it.
It will depends on how many memcg caches are there. Actually, I have
been thinking about using the show method to show the time spent in the
last shrink operation. I am just not sure if it is worth doing. What do
you think?
-Longman
^ permalink raw reply
* Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches
From: Roman Gushchin @ 2019-07-02 19:30 UTC (permalink / raw)
To: Waiman Long
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
Andrew Morton, Alexander Viro, Jonathan Corbet, Luis Chamberlain,
Kees Cook, Johannes Weiner, Michal Hocko, Vladimir Davydov,
linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org, Shakeel Butt, Andrea Arcangeli
In-Reply-To: <20190702183730.14461-1-longman@redhat.com>
On Tue, Jul 02, 2019 at 02:37:30PM -0400, Waiman Long wrote:
> Currently, a value of '1" is written to /sys/kernel/slab/<slab>/shrink
> file to shrink the slab by flushing all the per-cpu slabs and free
> slabs in partial lists. This applies only to the root caches, though.
>
> Extends this capability by shrinking all the child memcg caches and
> the root cache when a value of '2' is written to the shrink sysfs file.
>
> On a 4-socket 112-core 224-thread x86-64 system after a parallel kernel
> build, the the amount of memory occupied by slabs before shrinking
> slabs were:
>
> # grep task_struct /proc/slabinfo
> task_struct 7114 7296 7744 4 8 : tunables 0 0
> 0 : slabdata 1824 1824 0
> # grep "^S[lRU]" /proc/meminfo
> Slab: 1310444 kB
> SReclaimable: 377604 kB
> SUnreclaim: 932840 kB
>
> After shrinking slabs:
>
> # grep "^S[lRU]" /proc/meminfo
> Slab: 695652 kB
> SReclaimable: 322796 kB
> SUnreclaim: 372856 kB
> # grep task_struct /proc/slabinfo
> task_struct 2262 2572 7744 4 8 : tunables 0 0
> 0 : slabdata 643 643 0
>
> Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Thanks, Waiman!
^ permalink raw reply
* Re: [PATCH v1 1/2] Documentation/filesystems: add binderfs
From: Christian Brauner @ 2019-07-02 19:51 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Jonathan Corbet, linux-doc, linux-kernel
In-Reply-To: <20190702175729.GF1729@bombadil.infradead.org>
On Tue, Jul 02, 2019 at 10:57:29AM -0700, Matthew Wilcox wrote:
> On Mon, Jan 14, 2019 at 05:24:01PM -0700, Jonathan Corbet wrote:
> > On Fri, 11 Jan 2019 14:40:59 +0100
> > Christian Brauner <christian@brauner.io> wrote:
> > > This documents the Android binderfs filesystem used to dynamically add and
> > > remove binder devices that are private to each instance.
> >
> > You didn't add it to index.rst, so it won't actually become part of the
> > docs build.
>
> I think you added it in the wrong place.
>
> From 8167b80c950834da09a9204b6236f238197c197b Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Date: Tue, 2 Jul 2019 13:54:38 -0400
> Subject: [PATCH] docs: Move binderfs to admin-guide
>
> The documentation is more appropriate for the administrator than for
> the internal kernel API section it is currently in.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Don't feel very strong about where this ends up. :)
Acked-by: Christian Brauner <christian@brauner.io>
> ---
> .../{filesystems => admin-guide}/binderfs.rst | 0
> Documentation/admin-guide/index.rst | 1 +
> Documentation/filesystems/index.rst | 10 ----------
> 3 files changed, 1 insertion(+), 10 deletions(-)
> rename Documentation/{filesystems => admin-guide}/binderfs.rst (100%)
>
> diff --git a/Documentation/filesystems/binderfs.rst b/Documentation/admin-guide/binderfs.rst
> similarity index 100%
> rename from Documentation/filesystems/binderfs.rst
> rename to Documentation/admin-guide/binderfs.rst
> diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> index 8001917ee012..24fbe0568eff 100644
> --- a/Documentation/admin-guide/index.rst
> +++ b/Documentation/admin-guide/index.rst
> @@ -70,6 +70,7 @@ configure specific aspects of kernel behavior to your liking.
> ras
> bcache
> ext4
> + binderfs
> pm/index
> thunderbolt
> LSM/index
> diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
> index 1131c34d77f6..970c0a3ec377 100644
> --- a/Documentation/filesystems/index.rst
> +++ b/Documentation/filesystems/index.rst
> @@ -31,13 +31,3 @@ filesystem implementations.
>
> journalling
> fscrypt
> -
> -Filesystem-specific documentation
> -=================================
> -
> -Documentation for individual filesystem types can be found here.
> -
> -.. toctree::
> - :maxdepth: 2
> -
> - binderfs.rst
> --
> 2.20.1
>
^ permalink raw reply
* Re: [PATCH] Documentation: misc-devices: mei: Convert mei txt files to reST
From: Shreeya Patel @ 2019-07-02 19:57 UTC (permalink / raw)
To: Winkler, Tomas, skhan@linuxfoundation.org, corbet@lwn.net,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-kernel-mentees@lists.linuxfoundation.org
In-Reply-To: <5B8DA87D05A7694D9FA63FD143655C1B9DC547D0@hasmsx108.ger.corp.intel.com>
On Sun, 2019-06-30 at 06:23 +0000, Winkler, Tomas wrote:
> > -----Original Message-----
> > From: Shreeya Patel [mailto:shreeya.patel23498@gmail.com]
> > Sent: Sunday, June 30, 2019 00:32
> > To: skhan@linuxfoundation.org; corbet@lwn.net; Winkler, Tomas
> > <tomas.winkler@intel.com>; linux-doc@vger.kernel.org; linux-
> > kernel@vger.kernel.org;
> > linux-kernel-mentees@lists.linuxfoundation.org
> > Subject: [PATCH] Documentation: misc-devices: mei: Convert mei txt
> > files to
> > reST
> >
> > Convert the MEI misc device's documentation files from .txt to
> > reStructuredText format. Make a minor change of correcting the
> > wrong macro
> > name MEI_CONNECT_CLIENT_IOCTL to IOCTL_MEI_CONNECT_CLIENT.
> > Add an index file in mei as there are two sections for it in the
> > documentation.
> >
> > Signed-off-by: Shreeya Patel <shreeya.patel23498@gmail.com>
> > ---
>
> Sorry you are late, we've already done that, it should be merged via
> Greg's char-misc tree.
> Thanks
> Tomas
>
Oh okay.
Thanks
>
>
> > I am not sure if I have placed the Documentation in the right place
> > so I would
> > like to get some suggestions from the MAINTAINERS on this part.
> >
> > Documentation/misc-devices/index.rst | 1 +
> > Documentation/misc-devices/mei/index.rst | 15 +
> > .../misc-devices/mei/mei-client-bus.rst | 151 +++++++++
> > .../misc-devices/mei/mei-client-bus.txt | 141 ---------
> > Documentation/misc-devices/mei/mei.rst | 289
> > ++++++++++++++++++
> > Documentation/misc-devices/mei/mei.txt | 266 --------------
> > --
> > 6 files changed, 456 insertions(+), 407 deletions(-) create mode
> > 100644
> > Documentation/misc-devices/mei/index.rst
> > create mode 100644 Documentation/misc-devices/mei/mei-client-
> > bus.rst
> > delete mode 100644 Documentation/misc-devices/mei/mei-client-
> > bus.txt
> > create mode 100644 Documentation/misc-devices/mei/mei.rst
> > delete mode 100644 Documentation/misc-devices/mei/mei.txt
> >
> > diff --git a/Documentation/misc-devices/index.rst
> > b/Documentation/misc-
> > devices/index.rst
> > index dfd1f45a3127..e788a12b2b19 100644
> > --- a/Documentation/misc-devices/index.rst
> > +++ b/Documentation/misc-devices/index.rst
> > @@ -15,3 +15,4 @@ fit into other categories.
> > :maxdepth: 2
> >
> > ibmvmc
> > + mei/index
> > diff --git a/Documentation/misc-devices/mei/index.rst
> > b/Documentation/misc-
> > devices/mei/index.rst
> > new file mode 100644
> > index 000000000000..3018098ad075
> > --- /dev/null
> > +++ b/Documentation/misc-devices/mei/index.rst
> > @@ -0,0 +1,15 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +===============================================================
> > ==
> > +Intel(R) Management Engine Interface Kernel Driver (Intel(R) MEI)
> > +===============================================================
> > ==
> > +
> > +.. class:: toc-title
> > +
> > + Table of contents
> > +
> > +.. toctree::
> > + :maxdepth: 2
> > +
> > + mei
> > + mei-client-bus
> > diff --git a/Documentation/misc-devices/mei/mei-client-bus.rst
> > b/Documentation/misc-devices/mei/mei-client-bus.rst
> > new file mode 100644
> > index 000000000000..82d455afae78
> > --- /dev/null
> > +++ b/Documentation/misc-devices/mei/mei-client-bus.rst
> > @@ -0,0 +1,151 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +==============================================
> > +Intel(R) Management Engine (ME) Client bus API
> > +==============================================
> > +
> > +
> > +Rationale
> > +=========
> > +
> > +MEI misc character device is useful for dedicated applications to
> > send
> > +and receive data to the many FW appliance found in Intel's ME from
> > the user
> > space.
> > +However for some of the ME functionalities it make sense to
> > leverage
> > +existing software stack and expose them through existing kernel
> > subsystems.
> > +
> > +In order to plug seamlessly into the kernel device driver model we
> > add
> > +kernel virtual bus abstraction on top of the MEI driver. This
> > allows
> > +implementing linux kernel drivers for the various MEI features as
> > a stand
> > alone entities found in their respective subsystem.
> > +Existing device drivers can even potentially be re-used by adding
> > an
> > +MEI CL bus layer to the existing code.
> > +
> > +
> > +MEI CL bus API
> > +==============
> > +
> > +A driver implementation for an MEI Client is very similar to
> > existing
> > +bus based device drivers. The driver registers itself as an MEI CL
> > bus
> > +driver through the :c:type:`mei_cl_driver` structure:
> > +
> > +::
> > +
> > + struct mei_cl_driver {
> > + struct device_driver driver;
> > + const char *name;
> > +
> > + const struct mei_cl_device_id *id_table;
> > +
> > + int (*probe)(struct mei_cl_device *dev, const struct
> > mei_cl_id
> > *id);
> > + int (*remove)(struct mei_cl_device *dev);
> > + };
> > +
> > + struct mei_cl_id {
> > + char name[MEI_NAME_SIZE];
> > + kernel_ulong_t driver_info;
> > + };
> > +
> > +
> > +The :c:type:`mei_cl_id` structure allows the driver to bind itself
> > against a
> > device name.
> > +
> > +To actually register a driver on the ME Client bus one must call
> > the
> > +:c:func:`mei_cl_add_driver()` API. This is typically called at
> > module init time.
> > +
> > +Once registered on the ME Client bus, a driver will typically try
> > to do
> > +some I/O on this bus and this should be done through the
> > +:c:func:`mei_cl_send()` and :c:func:`mei_cl_recv()` routines. The
> > latter is
> > synchronous (blocks and sleeps until data shows up).
> > +In order for drivers to be notified of pending events waiting for
> > them (e.g.
> > +an Rx event) they can register an event handler through the
> > +:c:func:`mei_cl_register_event_cb()` routine. Currently only the
> > +:c:macro:`MEI_EVENT_RX` event will trigger an event handler call
> > and
> > +the driver implementation is supposed to call :c:func:`mei_recv()`
> > from
> > +the event handler in order to fetch the pending received buffers.
> > +
> > +
> > +Example
> > +=======
> > +
> > +As a theoretical example let's pretend the ME comes with a
> > "contact" NFC IP.
> > +The driver init and exit routines for this device would look like:
> > +
> > +::
> > +
> > + #define CONTACT_DRIVER_NAME "contact"
> > +
> > + static struct mei_cl_device_id contact_mei_cl_tbl[] = {
> > + { CONTACT_DRIVER_NAME, },
> > + /* required last entry */
> > + { }
> > + };
> > + MODULE_DEVICE_TABLE(mei_cl, contact_mei_cl_tbl);
> > +
> > + static struct mei_cl_driver contact_driver = {
> > + .id_table = contact_mei_tbl,
> > + .name = CONTACT_DRIVER_NAME,
> > + .probe = contact_probe,
> > + .remove = contact_remove,
> > + };
> > +
> > + static int contact_init(void)
> > + {
> > + int r;
> > +
> > + r = mei_cl_driver_register(&contact_driver);
> > + if (r) {
> > + pr_err(CONTACT_DRIVER_NAME ": driver
> > registration
> > failed\n");
> > + return r;
> > + }
> > +
> > + return 0;
> > + }
> > +
> > + static void __exit contact_exit(void)
> > + {
> > + mei_cl_driver_unregister(&contact_driver);
> > + }
> > +
> > + module_init(contact_init);
> > + module_exit(contact_exit);
> > +
> > +And the driver's simplified probe routine would look like that:
> > +
> > +::
> > +
> > + int contact_probe(struct mei_cl_device *dev, struct
> > mei_cl_device_id
> > *id)
> > + {
> > + struct contact_driver *contact;
> > +
> > + [...]
> > + mei_cl_enable_device(dev);
> > +
> > + mei_cl_register_event_cb(dev, contact_event_cb,
> > contact);
> > +
> > + return 0;
> > + }
> > +
> > +In the probe routine the driver first enable the MEI device and
> > then
> > +registers an ME bus event handler which is as close as it can get
> > to
> > +registering a threaded IRQ handler.
> > +The handler implementation will typically call some I/O routine
> > +depending on the pending events:
> > +
> > +::
> > +
> > + #define MAX_NFC_PAYLOAD 128
> > +
> > + static void contact_event_cb(struct mei_cl_device *dev,
> > u32 events,
> > + void *context)
> > + {
> > + struct contact_driver *contact = context;
> > +
> > + if (events & BIT(MEI_EVENT_RX)) {
> > + u8 payload[MAX_NFC_PAYLOAD];
> > + int payload_size;
> > +
> > + payload_size = mei_recv(dev, payload,
> > MAX_NFC_PAYLOAD);
> > + if (payload_size <= 0)
> > + return;
> > +
> > + /* Hook to the NFC subsystem */
> > + nfc_hci_recv_frame(contact->hdev, payload,
> > payload_size);
> > + }
> > + }
> > diff --git a/Documentation/misc-devices/mei/mei-client-bus.txt
> > b/Documentation/misc-devices/mei/mei-client-bus.txt
> > deleted file mode 100644
> > index 743be4ec8989..000000000000
> > --- a/Documentation/misc-devices/mei/mei-client-bus.txt
> > +++ /dev/null
> > @@ -1,141 +0,0 @@
> > -Intel(R) Management Engine (ME) Client bus API -
> > ==============================================
> > -
> > -
> > -Rationale
> > -=========
> > -
> > -MEI misc character device is useful for dedicated applications to
> > send and
> > receive -data to the many FW appliance found in Intel's ME from the
> > user
> > space.
> > -However for some of the ME functionalities it make sense to
> > leverage existing
> > software -stack and expose them through existing kernel subsystems.
> > -
> > -In order to plug seamlessly into the kernel device driver model we
> > add kernel
> > virtual -bus abstraction on top of the MEI driver. This allows
> > implementing linux
> > kernel drivers -for the various MEI features as a stand alone
> > entities found in
> > their respective subsystem.
> > -Existing device drivers can even potentially be re-used by adding
> > an MEI CL
> > bus layer to -the existing code.
> > -
> > -
> > -MEI CL bus API
> > -==============
> > -
> > -A driver implementation for an MEI Client is very similar to
> > existing bus -based
> > device drivers. The driver registers itself as an MEI CL bus driver
> > through -the
> > mei_cl_driver structure:
> > -
> > -struct mei_cl_driver {
> > - struct device_driver driver;
> > - const char *name;
> > -
> > - const struct mei_cl_device_id *id_table;
> > -
> > - int (*probe)(struct mei_cl_device *dev, const struct mei_cl_id
> > *id);
> > - int (*remove)(struct mei_cl_device *dev);
> > -};
> > -
> > -struct mei_cl_id {
> > - char name[MEI_NAME_SIZE];
> > - kernel_ulong_t driver_info;
> > -};
> > -
> > -The mei_cl_id structure allows the driver to bind itself against a
> > device name.
> > -
> > -To actually register a driver on the ME Client bus one must call
> > the
> > mei_cl_add_driver() -API. This is typically called at module init
> > time.
> > -
> > -Once registered on the ME Client bus, a driver will typically try
> > to do some I/O
> > on -this bus and this should be done through the mei_cl_send() and
> > mei_cl_recv() -routines. The latter is synchronous (blocks and
> > sleeps until data
> > shows up).
> > -In order for drivers to be notified of pending events waiting for
> > them (e.g.
> > -an Rx event) they can register an event handler through the
> > -mei_cl_register_event_cb() routine. Currently only the
> > MEI_EVENT_RX event -
> > will trigger an event handler call and the driver implementation is
> > supposed -to
> > call mei_recv() from the event handler in order to fetch the
> > pending -received
> > buffers.
> > -
> > -
> > -Example
> > -=======
> > -
> > -As a theoretical example let's pretend the ME comes with a
> > "contact" NFC IP.
> > -The driver init and exit routines for this device would look like:
> > -
> > -#define CONTACT_DRIVER_NAME "contact"
> > -
> > -static struct mei_cl_device_id contact_mei_cl_tbl[] = {
> > - { CONTACT_DRIVER_NAME, },
> > -
> > - /* required last entry */
> > - { }
> > -};
> > -MODULE_DEVICE_TABLE(mei_cl, contact_mei_cl_tbl);
> > -
> > -static struct mei_cl_driver contact_driver = {
> > - .id_table = contact_mei_tbl,
> > - .name = CONTACT_DRIVER_NAME,
> > -
> > - .probe = contact_probe,
> > - .remove = contact_remove,
> > -};
> > -
> > -static int contact_init(void)
> > -{
> > - int r;
> > -
> > - r = mei_cl_driver_register(&contact_driver);
> > - if (r) {
> > - pr_err(CONTACT_DRIVER_NAME ": driver registration
> > failed\n");
> > - return r;
> > - }
> > -
> > - return 0;
> > -}
> > -
> > -static void __exit contact_exit(void)
> > -{
> > - mei_cl_driver_unregister(&contact_driver);
> > -}
> > -
> > -module_init(contact_init);
> > -module_exit(contact_exit);
> > -
> > -And the driver's simplified probe routine would look like that:
> > -
> > -int contact_probe(struct mei_cl_device *dev, struct
> > mei_cl_device_id *id) -{
> > - struct contact_driver *contact;
> > -
> > - [...]
> > - mei_cl_enable_device(dev);
> > -
> > - mei_cl_register_event_cb(dev, contact_event_cb, contact);
> > -
> > - return 0;
> > -}
> > -
> > -In the probe routine the driver first enable the MEI device and
> > then registers -
> > an ME bus event handler which is as close as it can get to
> > registering a -
> > threaded IRQ handler.
> > -The handler implementation will typically call some I/O routine
> > depending on -
> > the pending events:
> > -
> > -#define MAX_NFC_PAYLOAD 128
> > -
> > -static void contact_event_cb(struct mei_cl_device *dev, u32
> > events,
> > - void *context)
> > -{
> > - struct contact_driver *contact = context;
> > -
> > - if (events & BIT(MEI_EVENT_RX)) {
> > - u8 payload[MAX_NFC_PAYLOAD];
> > - int payload_size;
> > -
> > - payload_size = mei_recv(dev, payload, MAX_NFC_PAYLOAD);
> > - if (payload_size <= 0)
> > - return;
> > -
> > - /* Hook to the NFC subsystem */
> > - nfc_hci_recv_frame(contact->hdev, payload,
> > payload_size);
> > - }
> > -}
> > diff --git a/Documentation/misc-devices/mei/mei.rst
> > b/Documentation/misc-
> > devices/mei/mei.rst
> > new file mode 100644
> > index 000000000000..e91ac2570b4d
> > --- /dev/null
> > +++ b/Documentation/misc-devices/mei/mei.rst
> > @@ -0,0 +1,289 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +====================================
> > +Intel(R) Management Engine Interface
> > +====================================
> > +
> > +Introduction
> > +============
> > +
> > +The Intel Management Engine (Intel ME) is an isolated and
> > protected
> > +computing resource (Co-processor) residing inside certain Intel
> > +chipsets. The Intel ME provides support for computer/IT management
> > +features. The feature set depends on the Intel chipset SKU.
> > +
> > +The Intel Management Engine Interface (Intel MEI, previously known
> > as
> > +HECI) is the interface between the Host and Intel ME. This
> > interface is
> > +exposed to the host as a PCI device. The Intel MEI Driver is in
> > charge
> > +of the communication channel between a host application and the
> > Intel ME
> > feature.
> > +
> > +Each Intel ME feature (Intel ME Client) is addressed by a
> > GUID/UUID and
> > +each client has its own protocol. The protocol is message-based
> > with a
> > +header and payload up to 512 bytes.
> > +
> > +Prominent usage of the Intel ME Interface is to communicate with
> > +Intel(R) Active Management Technology (Intel AMT) implemented in
> > +firmware running on the Intel ME.
> > +
> > +Intel AMT provides the ability to manage a host remotely out-of-
> > band
> > +(OOB) even when the operating system running on the host processor
> > has
> > +crashed or is in a sleep state.
> > +
> > +Some examples of Intel AMT usage are:
> > + * Monitoring hardware state and platform components
> > + * Remote power off/on (useful for green computing or
> > overnight IT
> > + maintenance)
> > + * OS updates
> > + * Storage of useful platform information such as software
> > assets
> > + * Built-in hardware KVM
> > + * Selective network isolation of Ethernet and IP protocol
> > flows based
> > + on policies set by a remote management console
> > + * IDE device redirection from remote management console
> > +
> > +Intel AMT (OOB) communication is based on SOAP (deprecated
> > starting
> > +with Release 6.0) over HTTP/S or WS-Management protocol over
> > HTTP/S
> > +that are received from a remote management console application.
> > +
> > +For more information about Intel AMT:
> > +`<
> > http://software.intel.com/sites/manageability/AMT_Implementation_and_
> > +Reference_Guide>`_
> > +
> > +
> > +Intel MEI Driver
> > +================
> > +
> > +The driver exposes a misc device called :file:`/dev/mei`.
> > +
> > +An application maintains communication with an Intel ME feature
> > while
> > +:file:`/dev/mei` is open. The binding to a specific feature is
> > +performed by calling :c:macro:`IOCTL_MEI_CONNECT_CLIENT`, which
> > passes
> > the desired UUID.
> > +The number of instances of an Intel ME feature that can be opened
> > at
> > +the same time depends on the Intel ME feature, but most of the
> > features
> > +allow only a single instance.
> > +
> > +The Intel AMT Host Interface (Intel AMTHI) feature supports
> > multiple
> > +simultaneous user connected applications. The Intel MEI driver
> > handles
> > +this internally by maintaining request queues for the
> > applications.
> > +
> > +The driver is transparent to data that are passed between firmware
> > +feature and host application.
> > +
> > +Because some of the Intel ME features can change the system
> > +configuration, the driver by default allows only a privileged user
> > to
> > +access it.
> > +
> > +A code snippet for an application communicating with Intel AMTHI
> > client:
> > +
> > +::
> > +
> > + struct mei_connect_client_data data;
> > + fd = open(MEI_DEVICE);
> > +
> > + data.d.in_client_uuid = AMTHI_UUID;
> > +
> > + ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &data);
> > +
> > + printf("Ver=%d, MaxLen=%ld\n",
> > + data.d.in_client_uuid.protocol_version,
> > + data.d.in_client_uuid.max_msg_length);
> > +
> > + [...]
> > +
> > + write(fd, amthi_req_data, amthi_req_data_len);
> > +
> > + [...]
> > +
> > + read(fd, &amthi_res_data, amthi_res_data_len);
> > +
> > + [...]
> > +
> > + close(fd);
> > +
> > +
> > +IOCTL
> > +=====
> > +
> > +The Intel MEI Driver supports the following IOCTL commands:
> > +
> > +
> > +:c:macro:`IOCTL_MEI_CONNECT_CLIENT`
> > +-------------------------------------
> > +Connect to firmware Feature (client)
> > +
> > +**Usage:**
> > +
> > +::
> > +
> > + struct mei_connect_client_data clientData;
> > + ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &clientData);
> > +
> > +**Inputs:**
> > + :c:type:`mei_connect_client_data` - structure contain the
> > following
> > + input field.
> > +
> > + :c:data:`in_client_uuid` - UUID of the FW Feature that
> > needs to connect
> > to.
> > +
> > +**Outputs:**
> > + :c:data:`out_client_properties` - Client Properties: MTU
> > and Protocol
> > Version.
> > +
> > +**Error returns:**
> > + | :c:macro:`EINVAL` - Wrong IOCTL Number.
> > + | :c:macro:`ENODEV` - Device or Connection is not
> > initialized or ready.
> > (e.g. Wrong UUID).
> > + | :c:macro:`ENOMEM` - Unable to allocate memory to client
> > internal
> > data.
> > + | :c:macro:`EFAULT` - Fatal Error (e.g. Unable to access
> > user input data).
> > + | :c:macro:`EBUSY` - Connection Already Open.
> > +
> > +**Notes:**
> > + :c:data:`max_msg_length` (MTU) in client properties
> > describes the
> > maximum
> > + data that can be sent or received. (e.g. if MTU=2K, can
> > send
> > + requests up to bytes 2k and received responses up to 2k
> > bytes).
> > +
> > +
> > +:c:macro:`IOCTL_MEI_NOTIFY_SET`
> > +-------------------------------
> > +Enable or disable event notifications
> > +
> > +**Usage:**
> > +
> > +::
> > +
> > + uint32_t enable;
> > + ioctl(fd, IOCTL_MEI_NOTIFY_SET, &enable);
> > +
> > +**Inputs:**
> > + | :c:data:`uint32_t enable = 1;`
> > + | or
> > + | :c:data:`uint32_t enable[disable] = 0;`
> > +
> > +**Error returns:**
> > + | :c:macro:`EINVAL` - Wrong IOCTL Number.
> > + | :c:macro:`ENODEV` - Device is not initialized or the
> > client not
> > connected.
> > + | :c:macro:`ENOMEM` - Unable to allocate memory to client
> > internal
> > data.
> > + | :c:macro:`EFAULT` - Fatal Error (e.g. Unable to access
> > user input data).
> > + | :c:macro:`EOPNOTSUPP` - if the device doesn't support
> > the feature.
> > +
> > +**Notes:**
> > + The client must be connected in order to enable
> > notification events.
> > +
> > +
> > +:c:macro:`IOCTL_MEI_NOTIFY_GET`
> > +-------------------------------
> > +Retrieve event
> > +
> > +**Usage:**
> > +
> > +::
> > +
> > + uint32_t event;
> > + ioctl(fd, IOCTL_MEI_NOTIFY_GET, &event);
> > +
> > +**Outputs:**
> > + | 1 - if an event is pending.
> > + | 0 - if there is no even pending.
> > +
> > +**Error returns:**
> > + | :c:macro:`EINVAL` - Wrong IOCTL Number.
> > + | :c:macro:`ENODEV` - Device is not initialized or the
> > client not
> > connected.
> > + | :c:macro:`ENOMEM` - Unable to allocate memory to client
> > internal
> > data.
> > + | :c:macro:`EFAULT` - Fatal Error (e.g. Unable to access
> > user input data).
> > + | :c:macro:`EOPNOTSUPP` - if the device doesn't support
> > the feature.
> > +
> > +**Notes:**
> > + The client must be connected and event notification has to
> > be enabled
> > + in order to receive an event.
> > +
> > +
> > +Intel ME Applications
> > +=====================
> > +
> > +1) Intel Local Management Service (Intel LMS)
> > +
> > + Applications running locally on the platform communicate
> > with Intel AMT
> > Release
> > + 2.0 and later releases in the same way that network
> > applications do via
> > SOAP
> > + over HTTP (deprecated starting with Release 6.0) or with
> > WS-
> > Management over
> > + SOAP over HTTP. This means that some Intel AMT features
> > can be
> > accessed from a
> > + local application using the same network interface as a
> > remote
> > application
> > + communicating with Intel AMT over the network.
> > +
> > + When a local application sends a message addressed to the
> > local Intel
> > AMT host
> > + name, the Intel LMS, which listens for traffic directed to
> > the host name,
> > + intercepts the message and routes it to the Intel MEI.
> > + For more information:
> > +
> > `<
> > http://software.intel.com/sites/manageability/AMT_Implementation_and_Re
> > ference_Guide>`_
> > + Under "About Intel AMT" => "Local Access"
> > +
> > + For downloading Intel LMS:
> > +
> > + `<
> > http://software.intel.com/en-us/articles/download-the-latest-intel-a
> > + mt-open-source-drivers/>`_
> > +
> > + The Intel LMS opens a connection using the Intel MEI
> > driver to the Intel
> > LMS
> > + firmware feature using a defined UUID and then
> > communicates with the
> > feature
> > + using a protocol called Intel AMT Port Forwarding Protocol
> > (Intel APF
> > protocol).
> > + The protocol is used to maintain multiple sessions with
> > Intel AMT from a
> > + single application.
> > +
> > + See the protocol specification in the `Intel AMT Software
> > Development
> > Kit (SDK)
> > +
> > <
> > http://software.intel.com/sites/manageability/AMT_Implementation_and_Ref
> > erence_Guide>`_
> > + Under "SDK Resources" => "Intel(R) vPro(TM) Gateway (MPS)"
> > + => "Information for Intel(R) vPro(TM) Gateway Developers"
> > + => "Description of the Intel AMT Port Forwarding (APF)
> > Protocol"
> > +
> > +2) Intel AMT Remote configuration using a Local Agent
> > +
> > + A Local Agent enables IT personnel to configure Intel AMT
> > out-of-the-box
> > + without requiring installing additional data to enable
> > setup. The remote
> > + configuration process may involve an ISV-developed remote
> > configuration
> > + agent that runs on the host.
> > + For more information:
> > +
> > `<
> > http://software.intel.com/sites/manageability/AMT_Implementation_and_Re
> > ference_Guide>`_
> > + Under "Setup and Configuration of Intel AMT" =>
> > + "SDK Tools Supporting Setup and Configuration" =>
> > + "Using the Local Agent Sample"
> > +
> > + An open source Intel AMT configuration utility, impleme
> > nting a local
> > agent
> > + that accesses the Intel MEI driver, can be found here:
> > +
> > + `<
> > http://software.intel.com/en-us/articles/download-the-latest-intel-a
> > + mt-open-source-drivers/>`
> > +
> > +
> > +Intel AMT OS Health Watchdog
> > +============================
> > +
> > +The Intel AMT Watchdog is an OS Health (Hang/Crash) watchdog.
> > +Whenever the OS hangs or crashes, Intel AMT will send an event to
> > any
> > +subscriber to this event. This mechanism means that IT knows when
> > a
> > +platform crashes even when there is a hard failure on the host.
> > +
> > +The Intel AMT Watchdog is composed of two parts:
> > + 1) Firmware feature - receives the heartbeats
> > + and sends an event when the heartbeats stop.
> > + 2) Intel MEI iAMT watchdog driver - connects to the
> > watchdog feature,
> > + configures the watchdog and sends the heartbeats.
> > +
> > +The Intel iAMT watchdog MEI driver uses the kernel watchdog API to
> > +configure the Intel AMT Watchdog and to send heartbeats to it. The
> > +default timeout of the watchdog is 120 seconds.
> > +
> > +If the Intel AMT is not enabled in the firmware then the watchdog
> > +client won't enumerate on the me client bus and watchdog devices
> > won't be
> > exposed.
> > +
> > +
> > +Supported Chipsets
> > +==================
> > +
> > +| 7 Series Chipset Family
> > +| 6 Series Chipset Family
> > +| 5 Series Chipset Family
> > +| 4 Series Chipset Family
> > +| Mobile 4 Series Chipset Family
> > +| ICH9
> > +| 82946GZ/GL
> > +| 82G35 Express
> > +| 82Q963/Q965
> > +| 82P965/G965
> > +| Mobile PM965/GM965
> > +| Mobile GME965/GLE960
> > +| 82Q35 Express
> > +| 82G33/G31/P35/P31 Express
> > +| 82Q33 Express
> > +| 82X38/X48 Express
> > +
> > +---
> > +linux-mei@linux.intel.com
> > diff --git a/Documentation/misc-devices/mei/mei.txt
> > b/Documentation/misc-
> > devices/mei/mei.txt
> > deleted file mode 100644
> > index 2b80a0cd621f..000000000000
> > --- a/Documentation/misc-devices/mei/mei.txt
> > +++ /dev/null
> > @@ -1,266 +0,0 @@
> > -Intel(R) Management Engine Interface (Intel(R) MEI) -
> > ===================================================
> > -
> > -Introduction
> > -============
> > -
> > -The Intel Management Engine (Intel ME) is an isolated and
> > protected
> > computing -resource (Co-processor) residing inside certain Intel
> > chipsets. The
> > Intel ME -provides support for computer/IT management features. The
> > feature
> > set -depends on the Intel chipset SKU.
> > -
> > -The Intel Management Engine Interface (Intel MEI, previously known
> > as HECI)
> > -is the interface between the Host and Intel ME. This interface is
> > exposed -to
> > the host as a PCI device. The Intel MEI Driver is in charge of the
> > -
> > communication channel between a host application and the Intel ME
> > feature.
> > -
> > -Each Intel ME feature (Intel ME Client) is addressed by a
> > GUID/UUID and -each
> > client has its own protocol. The protocol is message-based with a
> > -header and
> > payload up to 512 bytes.
> > -
> > -Prominent usage of the Intel ME Interface is to communicate with
> > Intel(R) -
> > Active Management Technology (Intel AMT) implemented in firmware
> > running
> > on -the Intel ME.
> > -
> > -Intel AMT provides the ability to manage a host remotely out-of-
> > band (OOB) -
> > even when the operating system running on the host processor has
> > crashed or -
> > is in a sleep state.
> > -
> > -Some examples of Intel AMT usage are:
> > - - Monitoring hardware state and platform components
> > - - Remote power off/on (useful for green computing or overnight
> > IT
> > - maintenance)
> > - - OS updates
> > - - Storage of useful platform information such as software
> > assets
> > - - Built-in hardware KVM
> > - - Selective network isolation of Ethernet and IP protocol flows
> > based
> > - on policies set by a remote management console
> > - - IDE device redirection from remote management console
> > -
> > -Intel AMT (OOB) communication is based on SOAP (deprecated
> > -starting with
> > Release 6.0) over HTTP/S or WS-Management protocol over -HTTP/S
> > that are
> > received from a remote management console application.
> > -
> > -For more information about Intel AMT:
> > -
> >
http://software.intel.com/sites/manageability/AMT_Implementation_and_Refe
> > rence_Guide
> > -
> > -
> > -Intel MEI Driver
> > -================
> > -
> > -The driver exposes a misc device called /dev/mei.
> > -
> > -An application maintains communication with an Intel ME feature
> > while -
> > /dev/mei is open. The binding to a specific feature is performed by
> > calling -
> > MEI_CONNECT_CLIENT_IOCTL, which passes the desired UUID.
> > -The number of instances of an Intel ME feature that can be opened
> > -at the
> > same time depends on the Intel ME feature, but most of the
> > -features allow
> > only a single instance.
> > -
> > -The Intel AMT Host Interface (Intel AMTHI) feature supports
> > multiple -
> > simultaneous user connected applications. The Intel MEI driver
> > -handles this
> > internally by maintaining request queues for the applications.
> > -
> > -The driver is transparent to data that are passed between firmware
> > feature -
> > and host application.
> > -
> > -Because some of the Intel ME features can change the system
> > -configuration,
> > the driver by default allows only a privileged -user to access it.
> > -
> > -A code snippet for an application communicating with Intel AMTHI
> > client:
> > -
> > - struct mei_connect_client_data data;
> > - fd = open(MEI_DEVICE);
> > -
> > - data.d.in_client_uuid = AMTHI_UUID;
> > -
> > - ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &data);
> > -
> > - printf("Ver=%d, MaxLen=%ld\n",
> > - data.d.in_client_uuid.protocol_version,
> > - data.d.in_client_uuid.max_msg_length);
> > -
> > - [...]
> > -
> > - write(fd, amthi_req_data, amthi_req_data_len);
> > -
> > - [...]
> > -
> > - read(fd, &amthi_res_data, amthi_res_data_len);
> > -
> > - [...]
> > - close(fd);
> > -
> > -
> > -IOCTL
> > -=====
> > -
> > -The Intel MEI Driver supports the following IOCTL commands:
> > - IOCTL_MEI_CONNECT_CLIENT Connect to firmware Feature
> > (client).
> > -
> > - usage:
> > - struct mei_connect_client_data clientData;
> > - ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &clientData);
> > -
> > - inputs:
> > - mei_connect_client_data struct contain the following
> > - input field:
> > -
> > - in_client_uuid - UUID of the FW Feature that needs
> > - to connect to.
> > - outputs:
> > - out_client_properties - Client Properties: MTU and
> > Protocol
> > Version.
> > -
> > - error returns:
> > - EINVAL Wrong IOCTL Number
> > - ENODEV Device or Connection is not initialized or
> > ready.
> > - (e.g. Wrong UUID)
> > - ENOMEM Unable to allocate memory to client
> > internal
> > data.
> > - EFAULT Fatal Error (e.g. Unable to access user
> > input data)
> > - EBUSY Connection Already Open
> > -
> > - Notes:
> > - max_msg_length (MTU) in client properties describes the
> > maximum
> > - data that can be sent or received. (e.g. if MTU=2K, can
> > send
> > - requests up to bytes 2k and received responses up to 2k
> > bytes).
> > -
> > - IOCTL_MEI_NOTIFY_SET: enable or disable event notifications
> > -
> > - Usage:
> > - uint32_t enable;
> > - ioctl(fd, IOCTL_MEI_NOTIFY_SET, &enable);
> > -
> > - Inputs:
> > - uint32_t enable = 1;
> > - or
> > - uint32_t enable[disable] = 0;
> > -
> > - Error returns:
> > - EINVAL Wrong IOCTL Number
> > - ENODEV Device is not initialized or the client
> > not
> > connected
> > - ENOMEM Unable to allocate memory to client
> > internal
> > data.
> > - EFAULT Fatal Error (e.g. Unable to access user
> > input data)
> > - EOPNOTSUPP if the device doesn't support the feature
> > -
> > - Notes:
> > - The client must be connected in order to enable notification
> > events
> > -
> > -
> > - IOCTL_MEI_NOTIFY_GET : retrieve event
> > -
> > - Usage:
> > - uint32_t event;
> > - ioctl(fd, IOCTL_MEI_NOTIFY_GET, &event);
> > -
> > - Outputs:
> > - 1 - if an event is pending
> > - 0 - if there is no even pending
> > -
> > - Error returns:
> > - EINVAL Wrong IOCTL Number
> > - ENODEV Device is not initialized or the client not
> > connected
> > - ENOMEM Unable to allocate memory to client
> > internal
> > data.
> > - EFAULT Fatal Error (e.g. Unable to access user
> > input data)
> > - EOPNOTSUPP if the device doesn't support the feature
> > -
> > - Notes:
> > - The client must be connected and event notification has to be
> > enabled
> > - in order to receive an event
> > -
> > -
> > -Intel ME Applications
> > -=====================
> > -
> > - 1) Intel Local Management Service (Intel LMS)
> > -
> > - Applications running locally on the platform communicate
> > with Intel
> > AMT Release
> > - 2.0 and later releases in the same way that network
> > applications do
> > via SOAP
> > - over HTTP (deprecated starting with Release 6.0) or with WS-
> > Management over
> > - SOAP over HTTP. This means that some Intel AMT features can
> > be
> > accessed from a
> > - local application using the same network interface as a
> > remote
> > application
> > - communicating with Intel AMT over the network.
> > -
> > - When a local application sends a message addressed to the
> > local Intel
> > AMT host
> > - name, the Intel LMS, which listens for traffic directed to
> > the host
> > name,
> > - intercepts the message and routes it to the Intel MEI.
> > - For more information:
> > -
> >
http://software.intel.com/sites/manageability/AMT_Implementation_and_Refe
> > rence_Guide
> > - Under "About Intel AMT" => "Local Access"
> > -
> > - For downloading Intel LMS:
> > -
> > http://software.intel.com/en-us/articles/download-the-latest-intel-
> > amt-open-source-drivers/
> > -
> > - The Intel LMS opens a connection using the Intel MEI driver
> > to the
> > Intel LMS
> > - firmware feature using a defined UUID and then communicates
> > with
> > the feature
> > - using a protocol called Intel AMT Port Forwarding Protocol
> > (Intel APF
> > protocol).
> > - The protocol is used to maintain multiple sessions with
> > Intel AMT
> > from a
> > - single application.
> > -
> > - See the protocol specification in the Intel AMT Software
> > Development
> > Kit (SDK)
> > -
> >
http://software.intel.com/sites/manageability/AMT_Implementation_and_Refe
> > rence_Guide
> > - Under "SDK Resources" => "Intel(R) vPro(TM) Gateway (MPS)"
> > - => "Information for Intel(R) vPro(TM) Gateway Developers"
> > - => "Description of the Intel AMT Port Forwarding (APF)
> > Protocol"
> > -
> > - 2) Intel AMT Remote configuration using a Local Agent
> > -
> > - A Local Agent enables IT personnel to configure Intel AMT
> > out-of-the-
> > box
> > - without requiring installing additional data to enable
> > setup. The
> > remote
> > - configuration process may involve an ISV-developed remote
> > configuration
> > - agent that runs on the host.
> > - For more information:
> > -
> >
http://software.intel.com/sites/manageability/AMT_Implementation_and_Refe
> > rence_Guide
> > - Under "Setup and Configuration of Intel AMT" =>
> > - "SDK Tools Supporting Setup and Configuration" =>
> > - "Using the Local Agent Sample"
> > -
> > - An open source Intel AMT configuration utility, implementin
> > g a local
> > agent
> > - that accesses the Intel MEI driver, can be found here:
> > -
> > http://software.intel.com/en-us/articles/download-the-latest-intel-
> > amt-open-source-drivers/
> > -
> > -
> > -Intel AMT OS Health Watchdog
> > -============================
> > -
> > -The Intel AMT Watchdog is an OS Health (Hang/Crash) watchdog.
> > -Whenever the OS hangs or crashes, Intel AMT will send an event -to
> > any
> > subscriber to this event. This mechanism means that -IT knows when
> > a platform
> > crashes even when there is a hard failure on the host.
> > -
> > -The Intel AMT Watchdog is composed of two parts:
> > - 1) Firmware feature - receives the heartbeats
> > - and sends an event when the heartbeats stop.
> > - 2) Intel MEI iAMT watchdog driver - connects to the watchdog
> > feature,
> > - configures the watchdog and sends the heartbeats.
> > -
> > -The Intel iAMT watchdog MEI driver uses the kernel watchdog API to
> > configure
> > -the Intel AMT Watchdog and to send heartbeats to it. The default
> > timeout of
> > the -watchdog is 120 seconds.
> > -
> > -If the Intel AMT is not enabled in the firmware then the watchdog
> > client won't
> > enumerate -on the me client bus and watchdog devices won't be
> > exposed.
> > -
> > -
> > -Supported Chipsets
> > -==================
> > -
> > -7 Series Chipset Family
> > -6 Series Chipset Family
> > -5 Series Chipset Family
> > -4 Series Chipset Family
> > -Mobile 4 Series Chipset Family
> > -ICH9
> > -82946GZ/GL
> > -82G35 Express
> > -82Q963/Q965
> > -82P965/G965
> > -Mobile PM965/GM965
> > -Mobile GME965/GLE960
> > -82Q35 Express
> > -82G33/G31/P35/P31 Express
> > -82Q33 Express
> > -82X38/X48 Express
> > -
> > ----
> > -linux-mei@linux.intel.com
> > --
> > 2.17.1
>
>
^ permalink raw reply
* Re: [PATCH v6 2/6] ARM: Disable instrumentation for some code
From: Linus Walleij @ 2019-07-02 21:56 UTC (permalink / raw)
To: Florian Fainelli
Cc: Linux ARM, bcm-kernel-feedback-list, Andrey Ryabinin, Abbott Liu,
Alexander Potapenko, Dmitry Vyukov, Jonathan Corbet, Russell King,
christoffer.dall, Marc Zyngier, Arnd Bergmann, Nicolas Pitre,
Vladimir Murzin, Kees Cook, jinb.park7, Alexandre Belloni,
Ard Biesheuvel, Daniel Lezcano, Philippe Ombredanne, Rob Landley,
Greg KH, Andrew Morton, Mark Rutland, Catalin Marinas,
Masahiro Yamada, Thomas Gleixner, thgarnie, David Howells,
Geert Uytterhoeven, Andre Przywara, julien.thierry, drjones,
philip, mhocko, kirill.shutemov, kasan-dev,
Linux Doc Mailing List, linux-kernel@vger.kernel.org, kvmarm,
Andrey Ryabinin
In-Reply-To: <20190617221134.9930-3-f.fainelli@gmail.com>
On Tue, Jun 18, 2019 at 12:11 AM Florian Fainelli <f.fainelli@gmail.com> wrote:
> @@ -236,7 +236,8 @@ static int unwind_pop_register(struct unwind_ctrl_block *ctrl,
> if (*vsp >= (unsigned long *)ctrl->sp_high)
> return -URC_FAILURE;
>
> - ctrl->vrs[reg] = *(*vsp)++;
> + ctrl->vrs[reg] = READ_ONCE_NOCHECK(*(*vsp));
> + (*vsp)++;
I would probably even put in a comment here so it is clear why we
do this. Passers-by may not know that READ_ONCE_NOCHECK() is
even related to KASan.
Other than that,
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches
From: Andrew Morton @ 2019-07-02 20:03 UTC (permalink / raw)
To: Waiman Long
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
Alexander Viro, Jonathan Corbet, Luis Chamberlain, Kees Cook,
Johannes Weiner, Michal Hocko, Vladimir Davydov, linux-mm,
linux-doc, linux-fsdevel, cgroups, linux-kernel, Roman Gushchin,
Shakeel Butt, Andrea Arcangeli
In-Reply-To: <20190702183730.14461-1-longman@redhat.com>
On Tue, 2 Jul 2019 14:37:30 -0400 Waiman Long <longman@redhat.com> wrote:
> Currently, a value of '1" is written to /sys/kernel/slab/<slab>/shrink
> file to shrink the slab by flushing all the per-cpu slabs and free
> slabs in partial lists. This applies only to the root caches, though.
>
> Extends this capability by shrinking all the child memcg caches and
> the root cache when a value of '2' is written to the shrink sysfs file.
Why?
Please fully describe the value of the proposed feature to or users.
Always.
>
> ...
>
> --- a/Documentation/ABI/testing/sysfs-kernel-slab
> +++ b/Documentation/ABI/testing/sysfs-kernel-slab
> @@ -429,10 +429,12 @@ KernelVersion: 2.6.22
> Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
> Christoph Lameter <cl@linux-foundation.org>
> Description:
> - The shrink file is written when memory should be reclaimed from
> - a cache. Empty partial slabs are freed and the partial list is
> - sorted so the slabs with the fewest available objects are used
> - first.
> + A value of '1' is written to the shrink file when memory should
> + be reclaimed from a cache. Empty partial slabs are freed and
> + the partial list is sorted so the slabs with the fewest
> + available objects are used first. When a value of '2' is
> + written, all the corresponding child memory cgroup caches
> + should be shrunk as well. All other values are invalid.
One would expect this to be a bitfield, like /proc/sys/vm/drop_caches.
So writing 3 does both forms of shrinking.
Yes, it happens to be the case that 2 is a superset of 1, but what
about if we add "4"?
^ permalink raw reply
* Klientskie bazy. Email: prodawez@armyspy.com Uznajte podrobnee!
From: NAdRkvadroshturman @ 2019-07-02 19:10 UTC (permalink / raw)
To: CnBDFkvadroshturman
Klientskie bazy. Email: prodawez@armyspy.com Uznajte podrobnee!
^ permalink raw reply
* Re: [PATCH v6 0/6] KASan for arm
From: Linus Walleij @ 2019-07-02 21:06 UTC (permalink / raw)
To: Florian Fainelli
Cc: Linux ARM, bcm-kernel-feedback-list, Alexander Potapenko,
Dmitry Vyukov, Jonathan Corbet, Russell King, christoffer.dall,
Marc Zyngier, Arnd Bergmann, Nicolas Pitre, Vladimir Murzin,
Kees Cook, jinb.park7, Alexandre Belloni, Ard Biesheuvel,
Daniel Lezcano, Philippe Ombredanne, liuwenliang, Rob Landley,
Greg KH, Andrew Morton, Mark Rutland, Catalin Marinas,
Masahiro Yamada, Thomas Gleixner, thgarnie, David Howells,
Geert Uytterhoeven, Andre Przywara, julien.thierry, drjones,
philip, mhocko, kirill.shutemov, kasan-dev,
Linux Doc Mailing List, linux-kernel@vger.kernel.org, kvmarm,
Andrey Ryabinin
In-Reply-To: <20190617221134.9930-1-f.fainelli@gmail.com>
Hi Florian,
On Tue, Jun 18, 2019 at 12:11 AM Florian Fainelli <f.fainelli@gmail.com> wrote:
> Abbott submitted a v5 about a year ago here:
>
> and the series was not picked up since then, so I rebased it against
> v5.2-rc4 and re-tested it on a Brahma-B53 (ARMv8 running AArch32 mode)
> and Brahma-B15, both LPAE and test-kasan is consistent with the ARM64
> counter part.
>
> We were in a fairly good shape last time with a few different people
> having tested it, so I am hoping we can get that included for 5.4 if
> everything goes well.
Thanks for picking this up. I was trying out KASan in the past,
got sidetracked and honestly lost interest a bit because it was
boring. But I do realize that it is really neat, so I will try to help
out with some review and test on a bunch of hardware I have.
At one point I even had this running on the ARMv4 SA1100
(no joke!) and if I recall correctly, I got stuck because of things
that might very well have been related to using a very fragile
Arm testchip that later broke down completely in the l2cache
when we added the spectre/meltdown fixes.
I start reviewing and testing.
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH v5 07/18] kunit: test: add initial tests
From: Luis Chamberlain @ 2019-07-02 20:57 UTC (permalink / raw)
To: Brendan Higgins
Cc: Frank Rowand, Greg KH, Josh Poimboeuf, Kees Cook, Kieran Bingham,
Peter Zijlstra, Rob Herring, Stephen Boyd, shuah,
Theodore Ts'o, Masahiro Yamada, devicetree, dri-devel,
kunit-dev, open list:DOCUMENTATION, linux-fsdevel, linux-kbuild,
Linux Kernel Mailing List, open list:KERNEL SELFTEST FRAMEWORK,
linux-nvdimm, linux-um, Sasha Levin, Bird, Timothy,
Amir Goldstein, Dan Carpenter, Daniel Vetter, Jeff Dike,
Joel Stanley, Julia Lawall, Kevin Hilman, Knut Omang,
Logan Gunthorpe, Michael Ellerman, Petr Mladek, Randy Dunlap,
Richard Weinberger, David Rientjes, Steven Rostedt, wfg
In-Reply-To: <CAFd5g46=7OQDREdLDTiMgVWq-Xj2zfOw8cRhPJEihSbO89MDyA@mail.gmail.com>
On Tue, Jul 02, 2019 at 10:52:50AM -0700, Brendan Higgins wrote:
> On Wed, Jun 26, 2019 at 12:53 AM Brendan Higgins
> <brendanhiggins@google.com> wrote:
> >
> > On Tue, Jun 25, 2019 at 4:22 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Mon, Jun 17, 2019 at 01:26:02AM -0700, Brendan Higgins wrote:
> > > > diff --git a/kunit/example-test.c b/kunit/example-test.c
> > > > new file mode 100644
> > > > index 0000000000000..f44b8ece488bb
> > > > --- /dev/null
> > > > +++ b/kunit/example-test.c
> > >
> > > <-- snip -->
> > >
> > > > +/*
> > > > + * This defines a suite or grouping of tests.
> > > > + *
> > > > + * Test cases are defined as belonging to the suite by adding them to
> > > > + * `kunit_cases`.
> > > > + *
> > > > + * Often it is desirable to run some function which will set up things which
> > > > + * will be used by every test; this is accomplished with an `init` function
> > > > + * which runs before each test case is invoked. Similarly, an `exit` function
> > > > + * may be specified which runs after every test case and can be used to for
> > > > + * cleanup. For clarity, running tests in a test module would behave as follows:
> > > > + *
> > >
> > > To be clear this is not the kernel module init, but rather the kunit
> > > module init. I think using kmodule would make this clearer to a reader.
> >
> > Seems reasonable. Will fix in next revision.
> >
> > > > + * module.init(test);
> > > > + * module.test_case[0](test);
> > > > + * module.exit(test);
> > > > + * module.init(test);
> > > > + * module.test_case[1](test);
> > > > + * module.exit(test);
> > > > + * ...;
> > > > + */
>
> Do you think it might be clearer yet to rename `struct kunit_module
> *module;` to `struct kunit_suite *suite;`?
Yes. Definitely. Or struct kunit_test. Up to you.
Luis
^ permalink raw reply
* Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches
From: Andrew Morton @ 2019-07-02 21:33 UTC (permalink / raw)
To: Waiman Long
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
Alexander Viro, Jonathan Corbet, Luis Chamberlain, Kees Cook,
Johannes Weiner, Michal Hocko, Vladimir Davydov, linux-mm,
linux-doc, linux-fsdevel, cgroups, linux-kernel, Roman Gushchin,
Shakeel Butt, Andrea Arcangeli
In-Reply-To: <78879b79-1b8f-cdfd-d4fa-610afe5e5d48@redhat.com>
On Tue, 2 Jul 2019 16:44:24 -0400 Waiman Long <longman@redhat.com> wrote:
> On 7/2/19 4:03 PM, Andrew Morton wrote:
> > On Tue, 2 Jul 2019 14:37:30 -0400 Waiman Long <longman@redhat.com> wrote:
> >
> >> Currently, a value of '1" is written to /sys/kernel/slab/<slab>/shrink
> >> file to shrink the slab by flushing all the per-cpu slabs and free
> >> slabs in partial lists. This applies only to the root caches, though.
> >>
> >> Extends this capability by shrinking all the child memcg caches and
> >> the root cache when a value of '2' is written to the shrink sysfs file.
> > Why?
> >
> > Please fully describe the value of the proposed feature to or users.
> > Always.
>
> Sure. Essentially, the sysfs shrink interface is not complete. It allows
> the root cache to be shrunk, but not any of the memcg caches.
But that doesn't describe anything of value. Who wants to use this,
and why? How will it be used? What are the use-cases?
^ permalink raw reply
* Re: [PATCH v6 1/6] ARM: Add TTBR operator for kasan_init
From: Linus Walleij @ 2019-07-02 21:03 UTC (permalink / raw)
To: Florian Fainelli, Russell King
Cc: Linux ARM, bcm-kernel-feedback-list, Abbott Liu, Andrey Ryabinin,
Alexander Potapenko, Dmitry Vyukov, Jonathan Corbet, Russell King,
christoffer.dall, Marc Zyngier, Arnd Bergmann, Nicolas Pitre,
Vladimir Murzin, Kees Cook, jinb.park7, Alexandre Belloni,
Ard Biesheuvel, Daniel Lezcano, Philippe Ombredanne, Rob Landley,
Greg KH, Andrew Morton, Mark Rutland, Catalin Marinas,
Masahiro Yamada, Thomas Gleixner, thgarnie, David Howells,
Geert Uytterhoeven, Andre Przywara, julien.thierry, drjones,
philip, mhocko, kirill.shutemov, kasan-dev,
Linux Doc Mailing List, linux-kernel@vger.kernel.org, kvmarm,
Andrey Ryabinin
In-Reply-To: <20190617221134.9930-2-f.fainelli@gmail.com>
Hi Florian!
thanks for your patch!
On Tue, Jun 18, 2019 at 12:11 AM Florian Fainelli <f.fainelli@gmail.com> wrote:
> From: Abbott Liu <liuwenliang@huawei.com>
>
> The purpose of this patch is to provide set_ttbr0/get_ttbr0 to
> kasan_init function. The definitions of cp15 registers should be in
> arch/arm/include/asm/cp15.h rather than arch/arm/include/asm/kvm_hyp.h,
> so move them.
>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Reported-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Abbott Liu <liuwenliang@huawei.com>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> +#include <linux/stringify.h>
What is this for? I think it can be dropped.
This stuff adding a whole bunch of accessors:
> +static inline void set_par(u64 val)
> +{
> + if (IS_ENABLED(CONFIG_ARM_LPAE))
> + write_sysreg(val, PAR_64);
> + else
> + write_sysreg(val, PAR_32);
> +}
Can we put that in a separate patch since it is not
adding any users, so this is a pure refactoring patch for
the current code?
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches
From: Waiman Long @ 2019-07-02 20:44 UTC (permalink / raw)
To: Andrew Morton
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
Alexander Viro, Jonathan Corbet, Luis Chamberlain, Kees Cook,
Johannes Weiner, Michal Hocko, Vladimir Davydov, linux-mm,
linux-doc, linux-fsdevel, cgroups, linux-kernel, Roman Gushchin,
Shakeel Butt, Andrea Arcangeli
In-Reply-To: <20190702130318.39d187dc27dbdd9267788165@linux-foundation.org>
On 7/2/19 4:03 PM, Andrew Morton wrote:
> On Tue, 2 Jul 2019 14:37:30 -0400 Waiman Long <longman@redhat.com> wrote:
>
>> Currently, a value of '1" is written to /sys/kernel/slab/<slab>/shrink
>> file to shrink the slab by flushing all the per-cpu slabs and free
>> slabs in partial lists. This applies only to the root caches, though.
>>
>> Extends this capability by shrinking all the child memcg caches and
>> the root cache when a value of '2' is written to the shrink sysfs file.
> Why?
>
> Please fully describe the value of the proposed feature to or users.
> Always.
Sure. Essentially, the sysfs shrink interface is not complete. It allows
the root cache to be shrunk, but not any of the memcg caches.
The same can also be said for others slab sysfs files which show current
cache status. I don't think sysfs files are created for the memcg
caches, but I may be wrong. In many cases, information can be available
elsewhere like the slabinfo file. The shrink operation, however, has no
other alternative available.
>> ...
>>
>> --- a/Documentation/ABI/testing/sysfs-kernel-slab
>> +++ b/Documentation/ABI/testing/sysfs-kernel-slab
>> @@ -429,10 +429,12 @@ KernelVersion: 2.6.22
>> Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
>> Christoph Lameter <cl@linux-foundation.org>
>> Description:
>> - The shrink file is written when memory should be reclaimed from
>> - a cache. Empty partial slabs are freed and the partial list is
>> - sorted so the slabs with the fewest available objects are used
>> - first.
>> + A value of '1' is written to the shrink file when memory should
>> + be reclaimed from a cache. Empty partial slabs are freed and
>> + the partial list is sorted so the slabs with the fewest
>> + available objects are used first. When a value of '2' is
>> + written, all the corresponding child memory cgroup caches
>> + should be shrunk as well. All other values are invalid.
> One would expect this to be a bitfield, like /proc/sys/vm/drop_caches.
> So writing 3 does both forms of shrinking.
>
> Yes, it happens to be the case that 2 is a superset of 1, but what
> about if we add "4"?
>
Yes, I can make it into a bit fields of 2 bits, just like
/proc/sys/vm/drop_caches.
Cheers,
Longman
^ permalink raw reply
* [PATCH] MAINTAINERS: Update for Intel Speed Select Technology
From: Srinivas Pandruvada @ 2019-07-03 1:53 UTC (permalink / raw)
To: dvhart, andy, andriy.shevchenko, corbet
Cc: rjw, alan, lenb, prarit, darcari, linux-doc, linux-kernel,
platform-driver-x86, Srinivas Pandruvada
Added myself as the maintainer.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
MAINTAINERS | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 5cfbea4ce575..b6ed7958372d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8101,6 +8101,14 @@ S: Supported
F: drivers/infiniband/hw/i40iw/
F: include/uapi/rdma/i40iw-abi.h
+INTEL SPEED SELECT TECHNOLOGY
+M: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
+L: platform-driver-x86@vger.kernel.org
+S: Maintained
+F: drivers/platform/x86/intel_speed_select_if/
+F: tools/power/x86/intel-speed-select/
+F: include/uapi/linux/isst_if.h
+
INTEL TELEMETRY DRIVER
M: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
M: "David E. Box" <david.e.box@linux.intel.com>
--
2.17.2
^ permalink raw reply related
* Re: [PATCH 0/2] arm64: Introduce boot parameter to disable TLB flush instruction within the same inner shareable domain
From: qi.fuli @ 2019-07-03 2:45 UTC (permalink / raw)
To: Will Deacon, qi.fuli@fujitsu.com
Cc: Will Deacon, indou.takao@fujitsu.com, linux-doc@vger.kernel.org,
peterz@infradead.org, Catalin Marinas, Jonathan Corbet,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
In-Reply-To: <20190627102724.vif6zh6zfqktpmjx@willie-the-truck>
Hi Will,
Thanks for your comments.
On 6/27/19 7:27 PM, Will Deacon wrote:
> On Mon, Jun 24, 2019 at 10:34:02AM +0000, qi.fuli@fujitsu.com wrote:
>> On 6/18/19 2:03 AM, Will Deacon wrote:
>>> On Mon, Jun 17, 2019 at 11:32:53PM +0900, Takao Indoh wrote:
>>>> From: Takao Indoh <indou.takao@fujitsu.com>
>>>>
>>>> I found a performance issue related on the implementation of Linux's TLB
>>>> flush for arm64.
>>>>
>>>> When I run a single-threaded test program on moderate environment, it
>>>> usually takes 39ms to finish its work. However, when I put a small
>>>> apprication, which just calls mprotest() continuously, on one of sibling
>>>> cores and run it simultaneously, the test program slows down significantly.
>>>> It becomes 49ms(125%) on ThunderX2. I also detected the same problem on
>>>> ThunderX1 and Fujitsu A64FX.
>>> This is a problem for any applications that share hardware resources with
>>> each other, so I don't think it's something we should be too concerned about
>>> addressing unless there is a practical DoS scenario, which there doesn't
>>> appear to be in this case. It may be that the real answer is "don't call
>>> mprotect() in a loop".
>> I think there has been a misunderstanding, please let me explain.
>> This application is just an example using for reproducing the
>> performance issue we found.
>> Our original purpose is reducing OS jitter by this series.
>> The OS jitter on massively parallel processing systems have been known
>> and studied for many years.
>> The 2.5% OS jitter can result in over a factor of 20 slowdown for the
>> same application [1].
> I think it's worth pointing out that the system in question was neither
> ARM-based nor running Linux, so I'd be cautious in applying the conclusions
> of that paper directly to our TLB invalidation code. Furthermore, the noise
> being generated in their experiments uses a timer interrupt, which has a
> /vastly/ different profile to a DVM message in terms of both system impact
> and frequency.
My original purpose was to explain that the OS jitter is a vital issue for
large-scale HPC environment by referencing this paper.
Please allow me to introduce the issue that had occurred to our HPC
environment.
We used FWQ [1] to do an experiment on 1 node of our HPC environment,
we expected it would be tens of microseconds of maximum OS jitter, but
it was
hundreds of microseconds, which didn't meet our requirement. We tried to
find
out the cause by using ftrace, but we cannot find any processes which would
cause noise and only knew the extension of processing time. Then we
confirmed
the CPU instruction count through CPU PMU, we also didn't find any changes.
However, we found that with the increase of that the TLB flash was called,
the noise was also increasing. Here we understood that the cause of this
issue
is the implementation of Linux's TLB flush for arm64, especially use of
TLBI-is
instruction which is a broadcast to all processor core on the system.
Therefore,
we made this patch set to fix this issue. After testing for several
times, the
noise was reduced and our original goal was achieved, so we do think
this patch
makes sense.
As I mentioned, the OS jitter is a vital issue for large-scale HPC
environment.
We tried a lot of things to reduce the OS jitter. One of them is task
separation
between the CPUs which are used for computing and the CPUs which are
used for
maintenance. All of the daemon processes and I/O interrupts are bounden
to the
maintenance CPUs. Further more, we used nohz_full to avoid the noise
caused by
computing CPU interruption, but all of the CPUs were affected by TLBI-is
instruction, the task separation of CPUs didn't work. Therefore, we
would like
to implement that TLB flush is done on minimal CPUs to reducing the OS
jitter
by using this patch set.
[1] https://asc.llnl.gov/sequoia/benchmarks/FTQ_summary_v1.1.pdf
Thanks,
QI Fuli
>> Though it may be an extreme example, reducing the OS jitter has been an
>> issue in HPC environment.
>>
>> [1] Ferreira, Kurt B., Patrick Bridges, and Ron Brightwell.
>> "Characterizing application sensitivity to OS interference using
>> kernel-level noise injection." Proceedings of the 2008 ACM/IEEE
>> conference on Supercomputing. IEEE Press, 2008.
>>
>>>> I suppose the root cause of this issue is the implementation of Linux's TLB
>>>> flush for arm64, especially use of TLBI-is instruction which is a broadcast
>>>> to all processor core on the system. In case of the above situation,
>>>> TLBI-is is called by mprotect().
>>> On the flip side, Linux is providing the hardware with enough information
>>> not to broadcast to cores for which the remote TLBs don't have entries
>>> allocated for the ASID being invalidated. I would say that the root cause
>>> of the issue is that this filtering is not taking place.
>> Do you mean that the filter should be implemented in hardware?
> Yes. If you're building a large system and you care about "jitter", then
> you either need to partition it in such a way that sources of noise are
> contained, or you need to introduce filters to limit their scope. Rewriting
> the low-level memory-management parts of the operating system is a red
> herring and imposes a needless burden on everybody else without solving
> the real problem, which is that contended use of shared resources doesn't
> scale.
>
> Will
^ permalink raw reply
* [PATCH v4] docs: aha152x.txt convert it to ReST
From: Sushma Unnibhavi @ 2019-07-03 6:14 UTC (permalink / raw)
To: skhan
Cc: Sushma Unnibhavi, corbet, mchehab, linux-kernel-mentees,
linux-doc, linux-kernel
This patch converts aha152x.rst
to ReST format, No content change.
Added aha152x.rst to sh/index.rst
Added SPDX tag in index.rst
Signed-off-by: Sushma Unnibhavi <sushmaunnibhavi425@gmail.com>
---
Documentation/driver-api/index.rst | 1 +
Documentation/scsi/aha152x.rst | 203 ++++++++++++++++++++++++++++
Documentation/scsi/aha152x.txt | 183 -------------------------
Documentation/scsi/source/conf.py | 52 +++++++
Documentation/scsi/source/index.rst | 22 +++
5 files changed, 278 insertions(+), 183 deletions(-)
create mode 100644 Documentation/scsi/aha152x.rst
delete mode 100644 Documentation/scsi/aha152x.txt
create mode 100644 Documentation/scsi/source/conf.py
create mode 100644 Documentation/scsi/source/index.rst
diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
index d26308af6036..e26809c95c79 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -32,6 +32,7 @@ available subsections can be seen below.
usb/index
firewire
pci/index
+ scsi/index
spi
i2c
i3c/index
diff --git a/Documentation/scsi/aha152x.rst b/Documentation/scsi/aha152x.rst
new file mode 100644
index 000000000000..3c4d558b9daf
--- /dev/null
+++ b/Documentation/scsi/aha152x.rst
@@ -0,0 +1,203 @@
+
+=====================================================
+Adaptec AHA-1520/1522 SCSI driver for Linux (aha152x)
+=====================================================
+
+Copyright 1993-1999 Jürgen Fischer <fischer@norbit.de>
+TC1550 patches by Luuk van Dijk (ldz@xs4all.nl)
+
+
+In Revision 2 the driver was modified a lot (especially the
+bottom-half handler complete()).
+
+The driver is much cleaner now, has support for the new
+error handling code in 2.3, produced less cpu load (much
+less polling loops), has slightly higher throughput (at
+least on my ancient test box; a i486/33Mhz/20MB).
+
+
+========================
+Configuration Arguments
+========================
++-----------+------------------------------------------+---------------------------+
+|IOPORT| | base io address | (0x340/0x140) |
++-----------+------------------------------------------+---------------------------+
+|IRQ | interrupt level | (9-12; default 11)| |
++-----------+------------------------------------------+---------------------------+
+|SCSI_ID | scsi id of controller | (0-7; default 7) |
++-----------+------------------------------------------+---------------------------+
+|RECONNECT | allow targets to disconnect from the bus| (0/1; default 1 [on]) |
++-----------+------------------------------------------+---------------------------+
+|PARITY | enable parity checking | (0/1; default 1 [on]) |
++-----------+------------------------------------------+---------------------------+
+|SYNCHRONOUS| enable synchronous transfers | (0/1; default 1 [on]) |
++-----------+------------------------------------------+---------------------------+
+|DELAY: | bus reset delay | (default 100) |
++-----------+------------------------------------------+---------------------------+
+|EXT_TRANS: | enable extended translation (see NOTES) | (0/1: default 0 [off]) |
++-----------+------------------------------------------+---------------------------+
+
+========================================================================
+Compile Time Configuration (go into AHA152X in drivers/scsi/Makefile)
+========================================================================
+
+-DAUTOCONF
+ use configuration the controller reports (AHA-152x only)
+
+-DSKIP_BIOSTEST
+ Don't test for BIOS signature (AHA-1510 or disabled BIOS)
+
+-DSETUP0="{ IOPORT, IRQ, SCSI_ID, RECONNECT, PARITY, SYNCHRONOUS, DELAY, EXT_TRANS }"
+ override for the first controller
+
+-DSETUP1="{ IOPORT, IRQ, SCSI_ID, RECONNECT, PARITY, SYNCHRONOUS, DELAY, EXT_TRANS }"
+ override for the second controller
+
+-DAHA152X_DEBUG
+ enable debugging output
+
+-DAHA152X_STAT
+ enable some statistics
+
+
+==========================
+Lilo Command Line Options
+==========================
+
+aha152x=<IOPORT>[,<IRQ>[,<SCSI-ID>[,<RECONNECT>[,<PARITY>[,<SYNCHRONOUS>[,<DELAY> [,<EXT_TRANS]]]]]]]
+
+The normal configuration can be overridden by specifying a command
+line.When you do this, the BIOS test is skipped. Entered values
+have to be valid (known). Don't use values that aren't supported
+under normal operation. If you think that you need other values:
+contact me. For two controllers use the aha152x statement twice.
+
+
+=================================
+Symbols For Module Configuration
+=================================
+---------------------------
+Choose From 2 Alternatives
+---------------------------
+1. specify everything (old)
+
+ aha152x=IOPORT,IRQ,SCSI_ID,RECONNECT,PARITY,SYNCHRONOUS,DELAY,EXT_TRANS
+ configuration override for first controller
+
+
+ aha152x1=IOPORT,IRQ,SCSI_ID,RECONNECT,PARITY,SYNCHRONOUS,DELAY,EXT_TRANS
+ configuration override for second controller
+
+2. specify only what you need to (irq or io is required; new)
+
+ io=IOPORT0[,IOPORT1]
+ IOPORT for first and second controller
+
+ irq=IRQ0[,IRQ1]
+ IRQ for first and second controller
+
+ scsiid=SCSIID0[,SCSIID1]
+ SCSIID for first and second controller
+
+ reconnect=RECONNECT0[,RECONNECT1]
+ allow targets to disconnect for first and second controller
+
+ parity=PAR0[PAR1]
+ use parity for first and second controller
+
+ sync=SYNCHRONOUS0[,SYNCHRONOUS1]
+ enable synchronous transfers for first and second controller
+
+ delay=DELAY0[,DELAY1]
+ reset DELAY for first and second controller
+
+ exttrans=EXTTRANS0[,EXTTRANS1]
+ enable extended translation for first and second controller
+
+
+If you use both alternatives the first will be taken.
+
+
+====================
+NOTES ON EXT_TRANS:
+====================
+
+SCSI uses block numbers to address blocks/sectors on a device.
+The BIOS uses a cylinder/head/sector addressing scheme (C/H/S)
+scheme instead. DOS expects a BIOS or driver that understands
+this C/H/S addressing.
+
+The number of cylinders/heads/sectors is called geometry and is
+required as base for requests in C/H/S addressing. SCSI only
+knows about the total capacity of disks in blocks (sectors).
+
+Therefore the SCSI BIOS/DOS driver has to calculate a logical/virtual
+geometry just to be able to support that addressing scheme. The
+geometry returned by the SCSI BIOS is a pure calculation and has
+nothing to do with the real/physical geometry of the disk (which
+is usually irrelevant anyway).
+
+Basically this has no impact at all on Linux, because it also uses block
+instead of C/H/S addressing. Unfortunately C/H/S addressing is also used
+in the partition table and therefore every operating system has to know
+the right geometry to be able to interpret it.
+
+Moreover there are certain limitations to the C/H/S addressing scheme,
+namely the address space is limited to up to 255 heads, up to 63 sectors
+and a maximum of 1023 cylinders.
+
+The AHA-1522 BIOS calculates the geometry by fixing the number of heads
+to 64, the number of sectors to 32 and by calculating the number of
+cylinders by dividing the capacity reported by the disk by 64*32 (1 MB).
+This is considered to be the default translation.
+
+With respect to the limit of 1023 cylinders using C/H/S you can only
+address the first GB of your disk in the partition table. Therefore
+BIOSes of some newer controllers based on the AIC-6260/6360 support
+extended translation. This means that the BIOS uses 255 for heads,
+63 for sectors and then divides the capacity of the disk by 255*63
+(about 8 MB), as soon it sees a disk greater than 1 GB. That results
+in a maximum of about 8 GB addressable diskspace in the partition
+table (but there are already bigger disks out there today).
+
+To make it even more complicated the translation mode might/might
+not be configurable in certain BIOS setups.
+
+This driver does some more or less failsafe guessing to get the
+geometry right in most cases:
+
+- for disks<1GB:
+ -use default translation (C/32/64)
+
+- for disks>1GB:
+ - take current geometry from the partition table (using scsicam_bios_param
+ and accept only `valid` geometries, ie. either (C/32/64) or (C/63/255)).
+ This can be extended translation even if it's not enabled in the driver.
+
+ - if that fails, take extended translation if enabled by override,
+ kernel or module parameter, otherwise take default translation and
+ ask the user for verification. This might on not yet partitioned
+ disks.
+
+
+==================
+REFERENCES USED:
+==================
+ "AIC-6260 SCSI Chip Specification", Adaptec Corporation.
+
+ "SCSI COMPUTER SYSTEM INTERFACE - 2 (SCSI-2)", X3T9.2/86-109 rev. 10h
+
+ "Writing a SCSI device driver for Linux", Rik Faith (faith@cs.unc.edu)
+
+ "Kernel Hacker's Guide", Michael K. Johnson (johnsonm@sunsite.unc.edu)
+
+ "Adaptec 1520/1522 User's Guide", Adaptec Corporation.
+
+ Michael K. Johnson (johnsonm@sunsite.unc.edu)
+
+ Drew Eckhardt (drew@cs.colorado.edu)
+
+ Eric Youngdale (eric@andante.org)
+
+ special thanks to Eric Youngdale for the free(!) supplying the
+ documentation on the chip.
diff --git a/Documentation/scsi/aha152x.txt b/Documentation/scsi/aha152x.txt
deleted file mode 100644
index 94848734ac66..000000000000
--- a/Documentation/scsi/aha152x.txt
+++ /dev/null
@@ -1,183 +0,0 @@
-$Id: README.aha152x,v 1.2 1999/12/25 15:32:30 fischer Exp fischer $
-Adaptec AHA-1520/1522 SCSI driver for Linux (aha152x)
-
-Copyright 1993-1999 Jürgen Fischer <fischer@norbit.de>
-TC1550 patches by Luuk van Dijk (ldz@xs4all.nl)
-
-
-In Revision 2 the driver was modified a lot (especially the
-bottom-half handler complete()).
-
-The driver is much cleaner now, has support for the new
-error handling code in 2.3, produced less cpu load (much
-less polling loops), has slightly higher throughput (at
-least on my ancient test box; a i486/33Mhz/20MB).
-
-
-CONFIGURATION ARGUMENTS:
-
-IOPORT base io address (0x340/0x140)
-IRQ interrupt level (9-12; default 11)
-SCSI_ID scsi id of controller (0-7; default 7)
-RECONNECT allow targets to disconnect from the bus (0/1; default 1 [on])
-PARITY enable parity checking (0/1; default 1 [on])
-SYNCHRONOUS enable synchronous transfers (0/1; default 1 [on])
-DELAY: bus reset delay (default 100)
-EXT_TRANS: enable extended translation (0/1: default 0 [off])
- (see NOTES)
-
-COMPILE TIME CONFIGURATION (go into AHA152X in drivers/scsi/Makefile):
-
--DAUTOCONF
- use configuration the controller reports (AHA-152x only)
-
--DSKIP_BIOSTEST
- Don't test for BIOS signature (AHA-1510 or disabled BIOS)
-
--DSETUP0="{ IOPORT, IRQ, SCSI_ID, RECONNECT, PARITY, SYNCHRONOUS, DELAY, EXT_TRANS }"
- override for the first controller
-
--DSETUP1="{ IOPORT, IRQ, SCSI_ID, RECONNECT, PARITY, SYNCHRONOUS, DELAY, EXT_TRANS }"
- override for the second controller
-
--DAHA152X_DEBUG
- enable debugging output
-
--DAHA152X_STAT
- enable some statistics
-
-
-LILO COMMAND LINE OPTIONS:
-
-aha152x=<IOPORT>[,<IRQ>[,<SCSI-ID>[,<RECONNECT>[,<PARITY>[,<SYNCHRONOUS>[,<DELAY> [,<EXT_TRANS]]]]]]]
-
- The normal configuration can be overridden by specifying a command line.
- When you do this, the BIOS test is skipped. Entered values have to be
- valid (known). Don't use values that aren't supported under normal
- operation. If you think that you need other values: contact me.
- For two controllers use the aha152x statement twice.
-
-
-SYMBOLS FOR MODULE CONFIGURATION:
-
-Choose from 2 alternatives:
-
-1. specify everything (old)
-
-aha152x=IOPORT,IRQ,SCSI_ID,RECONNECT,PARITY,SYNCHRONOUS,DELAY,EXT_TRANS
- configuration override for first controller
-
-
-aha152x1=IOPORT,IRQ,SCSI_ID,RECONNECT,PARITY,SYNCHRONOUS,DELAY,EXT_TRANS
- configuration override for second controller
-
-2. specify only what you need to (irq or io is required; new)
-
-io=IOPORT0[,IOPORT1]
- IOPORT for first and second controller
-
-irq=IRQ0[,IRQ1]
- IRQ for first and second controller
-
-scsiid=SCSIID0[,SCSIID1]
- SCSIID for first and second controller
-
-reconnect=RECONNECT0[,RECONNECT1]
- allow targets to disconnect for first and second controller
-
-parity=PAR0[PAR1]
- use parity for first and second controller
-
-sync=SYNCHRONOUS0[,SYNCHRONOUS1]
- enable synchronous transfers for first and second controller
-
-delay=DELAY0[,DELAY1]
- reset DELAY for first and second controller
-
-exttrans=EXTTRANS0[,EXTTRANS1]
- enable extended translation for first and second controller
-
-
-If you use both alternatives the first will be taken.
-
-
-NOTES ON EXT_TRANS:
-
-SCSI uses block numbers to address blocks/sectors on a device.
-The BIOS uses a cylinder/head/sector addressing scheme (C/H/S)
-scheme instead. DOS expects a BIOS or driver that understands this
-C/H/S addressing.
-
-The number of cylinders/heads/sectors is called geometry and is required
-as base for requests in C/H/S addressing. SCSI only knows about the
-total capacity of disks in blocks (sectors).
-
-Therefore the SCSI BIOS/DOS driver has to calculate a logical/virtual
-geometry just to be able to support that addressing scheme. The geometry
-returned by the SCSI BIOS is a pure calculation and has nothing to
-do with the real/physical geometry of the disk (which is usually
-irrelevant anyway).
-
-Basically this has no impact at all on Linux, because it also uses block
-instead of C/H/S addressing. Unfortunately C/H/S addressing is also used
-in the partition table and therefore every operating system has to know
-the right geometry to be able to interpret it.
-
-Moreover there are certain limitations to the C/H/S addressing scheme,
-namely the address space is limited to up to 255 heads, up to 63 sectors
-and a maximum of 1023 cylinders.
-
-The AHA-1522 BIOS calculates the geometry by fixing the number of heads
-to 64, the number of sectors to 32 and by calculating the number of
-cylinders by dividing the capacity reported by the disk by 64*32 (1 MB).
-This is considered to be the default translation.
-
-With respect to the limit of 1023 cylinders using C/H/S you can only
-address the first GB of your disk in the partition table. Therefore
-BIOSes of some newer controllers based on the AIC-6260/6360 support
-extended translation. This means that the BIOS uses 255 for heads,
-63 for sectors and then divides the capacity of the disk by 255*63
-(about 8 MB), as soon it sees a disk greater than 1 GB. That results
-in a maximum of about 8 GB addressable diskspace in the partition table
-(but there are already bigger disks out there today).
-
-To make it even more complicated the translation mode might/might
-not be configurable in certain BIOS setups.
-
-This driver does some more or less failsafe guessing to get the
-geometry right in most cases:
-
-- for disks<1GB: use default translation (C/32/64)
-
-- for disks>1GB:
- - take current geometry from the partition table
- (using scsicam_bios_param and accept only `valid' geometries,
- ie. either (C/32/64) or (C/63/255)). This can be extended translation
- even if it's not enabled in the driver.
-
- - if that fails, take extended translation if enabled by override,
- kernel or module parameter, otherwise take default translation and
- ask the user for verification. This might on not yet partitioned
- disks.
-
-
-REFERENCES USED:
-
- "AIC-6260 SCSI Chip Specification", Adaptec Corporation.
-
- "SCSI COMPUTER SYSTEM INTERFACE - 2 (SCSI-2)", X3T9.2/86-109 rev. 10h
-
- "Writing a SCSI device driver for Linux", Rik Faith (faith@cs.unc.edu)
-
- "Kernel Hacker's Guide", Michael K. Johnson (johnsonm@sunsite.unc.edu)
-
- "Adaptec 1520/1522 User's Guide", Adaptec Corporation.
-
- Michael K. Johnson (johnsonm@sunsite.unc.edu)
-
- Drew Eckhardt (drew@cs.colorado.edu)
-
- Eric Youngdale (eric@andante.org)
-
- special thanks to Eric Youngdale for the free(!) supplying the
- documentation on the chip.
diff --git a/Documentation/scsi/source/conf.py b/Documentation/scsi/source/conf.py
new file mode 100644
index 000000000000..8f60483b49fb
--- /dev/null
+++ b/Documentation/scsi/source/conf.py
@@ -0,0 +1,52 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# This file only contains a selection of the most common options. For a full
+# list see the documentation:
+# http://www.sphinx-doc.org/en/master/config
+
+# -- Path setup --------------------------------------------------------------
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+# import os
+# import sys
+# sys.path.insert(0, os.path.abspath('.'))
+
+
+# -- Project information -----------------------------------------------------
+
+project = 'doc'
+copyright = '2019, sushma'
+author = 'sushma'
+
+
+# -- General configuration ---------------------------------------------------
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = [
+]
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# This pattern also affects html_static_path and html_extra_path.
+exclude_patterns = []
+
+
+# -- Options for HTML output -------------------------------------------------
+
+# The theme to use for HTML and HTML Help pages. See the documentation for
+# a list of builtin themes.
+#
+html_theme = 'alabaster'
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']
diff --git a/Documentation/scsi/source/index.rst b/Documentation/scsi/source/index.rst
new file mode 100644
index 000000000000..003259e30a59
--- /dev/null
+++ b/Documentation/scsi/source/index.rst
@@ -0,0 +1,22 @@
+.. doc documentation master file, created by
+ sphinx-quickstart on Mon Jul 1 11:21:20 2019.
+ You can adapt this file completely to your liking, but it should at least
+ contain the root `toctree` directive.
+.SPDX-License-Identifier: GPL-2.0
+
+===============================
+SCSI Subsystem
+===============================
+
+.. toctree::
+ :maxdepth: 2
+ :caption: Contents:
+
+aha152x
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`
--
2.17.1
^ permalink raw reply related
* Re: [PATCH] mm, slab: Extend slab/shrink to shrink all the memcg caches
From: Michal Hocko @ 2019-07-03 6:56 UTC (permalink / raw)
To: Waiman Long
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
Andrew Morton, Alexander Viro, Jonathan Corbet, Luis Chamberlain,
Kees Cook, Johannes Weiner, Vladimir Davydov, linux-mm, linux-doc,
linux-fsdevel, cgroups, linux-kernel, Roman Gushchin,
Shakeel Butt, Andrea Arcangeli
In-Reply-To: <20190702183730.14461-1-longman@redhat.com>
On Tue 02-07-19 14:37:30, Waiman Long wrote:
> Currently, a value of '1" is written to /sys/kernel/slab/<slab>/shrink
> file to shrink the slab by flushing all the per-cpu slabs and free
> slabs in partial lists. This applies only to the root caches, though.
>
> Extends this capability by shrinking all the child memcg caches and
> the root cache when a value of '2' is written to the shrink sysfs file.
Why do we need a new value for this functionality? I would tend to think
that skipping memcg caches is a bug/incomplete implementation. Or is it
a deliberate decision to cover root caches only?
--
Michal Hocko
SUSE Labs
^ permalink raw reply
* Re: [PATCH v7 1/2] fTPM: firmware TPM running in TEE
From: Ilias Apalodimas @ 2019-07-03 6:58 UTC (permalink / raw)
To: Thirupathaiah Annapureddy
Cc: Jarkko Sakkinen, Sasha Levin, peterhuewe@gmx.de, jgg@ziepe.ca,
corbet@lwn.net, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-integrity@vger.kernel.org,
Microsoft Linux Kernel List, Bryan Kelly (CSI),
tee-dev@lists.linaro.org, sumit.garg@linaro.org,
rdunlap@infradead.org
In-Reply-To: <CY4PR21MB0279B99FB0097309ADE83809BCF80@CY4PR21MB0279.namprd21.prod.outlook.com>
Hi Thirupathaiah,
>
> First of all, Thanks a lot for trying to test the driver.
>
np
[...]
> > I managed to do some quick testing in QEMU.
> > Everything works fine when i build this as a module (using IBM's TPM 2.0
> > TSS)
> >
> > - As module
> > # insmod /lib/modules/5.2.0-rc1/kernel/drivers/char/tpm/tpm_ftpm_tee.ko
> > # getrandom -by 8
> > randomBytes length 8
> > 23 b9 3d c3 90 13 d9 6b
> >
> > - Built-in
> > # dmesg | grep optee
> > ftpm-tee firmware:optee: ftpm_tee_probe:tee_client_open_session failed,
> > err=ffff0008
> This (0xffff0008) translates to TEE_ERROR_ITEM_NOT_FOUND.
>
> Where is fTPM TA located in the your test setup?
> Is it stitched into TEE binary as an EARLY_TA or
> Is it expected to be loaded during run-time with the help of user mode OP-TEE supplicant?
>
> My guess is that you are trying to load fTPM TA through user mode OP-TEE supplicant.
> Can you confirm?
I tried both
> If that is the true,
> - In the case of driver built as a module (CONFIG_TCG_FTPM_TEE=m), this is works fine
> as user mode supplicant is ready.
> - In the built-in case (CONFIG_TCG_FTPM_TEE=y),
> This would result in the above error 0xffff0008 as TEE is unable to find fTPM TA.
Maybe i did something wrong and never noticed it wasn't built as an earlyTA
>
> The expectation is that fTPM TA is built as an EARLY_TA (in BL32) so that
> U-boot and Linux driver stacks work seamlessly without dependency on supplicant.
>
You can add my tested-by tag for the module. I'll go back to testing it as
built-in at some point in real hardware and let you know if i have any issues.
If someone's is interested in the QEMU testing:
1. compile this https://github.com/jbech-linaro/manifest/blob/ftpm/README.md
2. replace the whole linux kernel on the root-dir with a latest version + fTPM
char driver
3. Apply a hack on kernel and disable dynamic shm (Need for this depends on
kernel + op-tee version)
diff --git a/drivers/tee/optee/core.c b/drivers/tee/optee/core.c
index 1854a3db..7aea8a5 100644
--- a/drivers/tee/optee/core.c
+++ b/drivers/tee/optee/core.c
@@ -588,13 +588,15 @@ static struct optee *optee_probe(struct device_node *np)
/*
* Try to use dynamic shared memory if possible
*/
+#if 0
if (sec_caps & OPTEE_SMC_SEC_CAP_DYNAMIC_SHM)
pool = optee_config_dyn_shm();
+#endif
/*
* If dynamic shared memory is not available or failed - try static one
*/
- if (IS_ERR(pool) && (sec_caps & OPTEE_SMC_SEC_CAP_HAVE_RESERVED_SHM))
+ if (sec_caps & OPTEE_SMC_SEC_CAP_HAVE_RESERVED_SHM)
pool = optee_config_shm_memremap(invoke_fn, &memremaped_shm);
if (IS_ERR(pool))
For the module part:
Tested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
^ permalink raw reply related
* Re: [PATCH v7 1/2] fTPM: firmware TPM running in TEE
From: Ilias Apalodimas @ 2019-07-03 8:12 UTC (permalink / raw)
To: Thirupathaiah Annapureddy
Cc: Jarkko Sakkinen, Sasha Levin, peterhuewe@gmx.de, jgg@ziepe.ca,
corbet@lwn.net, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-integrity@vger.kernel.org,
Microsoft Linux Kernel List, Bryan Kelly (CSI),
tee-dev@lists.linaro.org, sumit.garg@linaro.org,
rdunlap@infradead.org, Joakim Bech
In-Reply-To: <20190703065813.GA12724@apalos>
Hi Thirupathaiah,
(+Joakim)
On Wed, 3 Jul 2019 at 09:58, Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> Hi Thirupathaiah,
> >
> > First of all, Thanks a lot for trying to test the driver.
> >
> np
>
> [...]
> > > I managed to do some quick testing in QEMU.
> > > Everything works fine when i build this as a module (using IBM's TPM 2.0
> > > TSS)
> > >
> > > - As module
> > > # insmod /lib/modules/5.2.0-rc1/kernel/drivers/char/tpm/tpm_ftpm_tee.ko
> > > # getrandom -by 8
> > > randomBytes length 8
> > > 23 b9 3d c3 90 13 d9 6b
> > >
> > > - Built-in
> > > # dmesg | grep optee
> > > ftpm-tee firmware:optee: ftpm_tee_probe:tee_client_open_session failed,
> > > err=ffff0008
> > This (0xffff0008) translates to TEE_ERROR_ITEM_NOT_FOUND.
> >
> > Where is fTPM TA located in the your test setup?
> > Is it stitched into TEE binary as an EARLY_TA or
> > Is it expected to be loaded during run-time with the help of user mode OP-TEE supplicant?
> >
> > My guess is that you are trying to load fTPM TA through user mode OP-TEE supplicant.
> > Can you confirm?
> I tried both
>
Ok apparently there was a failure with my built-in binary which i
didn't notice. I did a full rebuilt and checked the elf this time :)
Built as an earlyTA my error now is:
ftpm-tee firmware:optee: ftpm_tee_probe:tee_client_open_session
failed, err=ffff3024 (translates to TEE_ERROR_TARGET_DEAD)
Since you tested it on real hardware i guess you tried both
module/built-in. Which TEE version are you using?
Thanks
/Ilias
^ permalink raw reply
* Re: [PATCH 39/39] docs: gpio: add sysfs interface to the admin-guide
From: Linus Walleij @ 2019-07-03 8:44 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Linux Doc Mailing List, Mauro Carvalho Chehab,
linux-kernel@vger.kernel.org, Jonathan Corbet,
Bartosz Golaszewski, Rafael J. Wysocki, Len Brown, Harry Wei,
Alex Shi, open list:GPIO SUBSYSTEM, ACPI Devel Maling List
In-Reply-To: <1ecff14ec37c0c434f003d93c4b86b1cd3dac834.1561724493.git.mchehab+samsung@kernel.org>
On Fri, Jun 28, 2019 at 2:30 PM Mauro Carvalho Chehab
<mchehab+samsung@kernel.org> wrote:
> While this is stated as obsoleted, the sysfs interface described
> there is still valid, and belongs to the admin-guide.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
This doesn't apply to my tree because of dependencies in the
index so I guess it's best if you merge it:
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH v7 1/2] fTPM: firmware TPM running in TEE
From: Sumit Garg @ 2019-07-03 10:03 UTC (permalink / raw)
To: Ilias Apalodimas, Thirupathaiah Annapureddy
Cc: Jarkko Sakkinen, Sasha Levin, peterhuewe@gmx.de, jgg@ziepe.ca,
corbet@lwn.net, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-integrity@vger.kernel.org,
Microsoft Linux Kernel List, Bryan Kelly (CSI),
tee-dev@lists.linaro.org, rdunlap@infradead.org, Joakim Bech
In-Reply-To: <CAC_iWjK2F13QxjuvqzqNLx00SiGz_FQ5X=MQxJyDev57bo3=LQ@mail.gmail.com>
On Wed, 3 Jul 2019 at 13:42, Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> Hi Thirupathaiah,
>
> (+Joakim)
>
> On Wed, 3 Jul 2019 at 09:58, Ilias Apalodimas
> <ilias.apalodimas@linaro.org> wrote:
> >
> > Hi Thirupathaiah,
> > >
> > > First of all, Thanks a lot for trying to test the driver.
> > >
> > np
> >
> > [...]
> > > > I managed to do some quick testing in QEMU.
> > > > Everything works fine when i build this as a module (using IBM's TPM 2.0
> > > > TSS)
> > > >
> > > > - As module
> > > > # insmod /lib/modules/5.2.0-rc1/kernel/drivers/char/tpm/tpm_ftpm_tee.ko
> > > > # getrandom -by 8
> > > > randomBytes length 8
> > > > 23 b9 3d c3 90 13 d9 6b
> > > >
> > > > - Built-in
> > > > # dmesg | grep optee
> > > > ftpm-tee firmware:optee: ftpm_tee_probe:tee_client_open_session failed,
> > > > err=ffff0008
> > > This (0xffff0008) translates to TEE_ERROR_ITEM_NOT_FOUND.
> > >
> > > Where is fTPM TA located in the your test setup?
> > > Is it stitched into TEE binary as an EARLY_TA or
> > > Is it expected to be loaded during run-time with the help of user mode OP-TEE supplicant?
> > >
> > > My guess is that you are trying to load fTPM TA through user mode OP-TEE supplicant.
> > > Can you confirm?
> > I tried both
> >
>
> Ok apparently there was a failure with my built-in binary which i
> didn't notice. I did a full rebuilt and checked the elf this time :)
>
> Built as an earlyTA my error now is:
> ftpm-tee firmware:optee: ftpm_tee_probe:tee_client_open_session
> failed, err=ffff3024 (translates to TEE_ERROR_TARGET_DEAD)
> Since you tested it on real hardware i guess you tried both
> module/built-in. Which TEE version are you using?
>
> > > U-boot and Linux driver stacks work seamlessly without dependency on supplicant.
Is this true?
It looks like this fTPM driver can't work as a built-in driver. The
reason seems to be secure storage access required by OP-TEE fTPM TA
that is provided via OP-TEE supplicant that's not available during
kernel boot.
Snippet from ms-tpm-20-ref/Samples/ARM32-FirmwareTPM/optee_ta/fTPM/fTPM.c +145:
// If we fail to open fTPM storage we cannot continue.
if (_plat__NVEnable(NULL) == 0) {
TEE_Panic(TEE_ERROR_BAD_STATE);
}
So it seems like this module will work as a loadable module only after
OP-TEE supplicant is up.
-Sumit
> Thanks
> /Ilias
^ permalink raw reply
* [PATCH 0/3] docs: s390: restore content and update s390dbf.rst
From: Steffen Maier @ 2019-07-03 10:19 UTC (permalink / raw)
To: linux-doc
Cc: linux-s390, Mauro Carvalho Chehab, Mauro Carvalho Chehab,
Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
linux-kernel
This is based on top of the 3 s390 patches Heiko already queued on our
s390 features branch.
[("Re: [PATCH v3 00/33] Convert files to ReST - part 1")
https://www.spinics.net/lists/linux-doc/msg66137.html
https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/log/Documentation/s390?h=features]
If I was not mistaken, some documentation was accidentally lost
and patch 1 restores it.
After having looked closer, I came up with patches 2 and 3.
Rendered successfully on a current Fedora 30 and it looks good:
$ make SPHINXDIRS="s390" htmldocs
Steffen Maier (3):
docs: s390: restore important non-kdoc parts of s390dbf.rst
docs: s390: unify and update s390dbf kdocs at debug.c
docs: s390: s390dbf: typos and formatting, update crash command
Documentation/s390/s390dbf.rst | 390 +++++++++++++++++++++++++++++++++++++++--
arch/s390/include/asm/debug.h | 112 ++----------
arch/s390/kernel/debug.c | 105 +++++++++--
3 files changed, 473 insertions(+), 134 deletions(-)
--
1.8.3.1
^ permalink raw reply
* [PATCH 3/3] docs: s390: s390dbf: typos and formatting, update crash command
From: Steffen Maier @ 2019-07-03 10:19 UTC (permalink / raw)
To: linux-doc
Cc: linux-s390, Mauro Carvalho Chehab, Mauro Carvalho Chehab,
Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
linux-kernel
In-Reply-To: <1562149189-1417-1-git-send-email-maier@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
Documentation/s390/s390dbf.rst | 122 +++++++++++++++++++++++------------------
1 file changed, 68 insertions(+), 54 deletions(-)
diff --git a/Documentation/s390/s390dbf.rst b/Documentation/s390/s390dbf.rst
index be42892b159e..cdb36842b898 100644
--- a/Documentation/s390/s390dbf.rst
+++ b/Documentation/s390/s390dbf.rst
@@ -23,7 +23,8 @@ The debug feature may also very useful for kernel and driver development.
Design:
-------
Kernel components (e.g. device drivers) can register themselves at the debug
-feature with the function call debug_register(). This function initializes a
+feature with the function call :c:func:`debug_register()`.
+This function initializes a
debug log for the caller. For each debug log exists a number of debug areas
where exactly one is active at one time. Each debug area consists of contiguous
pages in memory. In the debug areas there are stored debug entries (log records)
@@ -44,8 +45,9 @@ The debug areas themselves are also ordered in form of a ring buffer.
When an exception is thrown in the last debug area, the following debug
entries are then written again in the very first area.
-There are three versions for the event- and exception-calls: One for
-logging raw data, one for text and one for numbers.
+There are four versions for the event- and exception-calls: One for
+logging raw data, one for text, one for numbers (unsigned int and long),
+and one for sprintf-like formatted strings.
Each debug entry contains the following data:
@@ -56,29 +58,29 @@ Each debug entry contains the following data:
- Flag, if entry is an exception or not
The debug logs can be inspected in a live system through entries in
-the debugfs-filesystem. Under the toplevel directory "s390dbf" there is
+the debugfs-filesystem. Under the toplevel directory "``s390dbf``" there is
a directory for each registered component, which is named like the
corresponding component. The debugfs normally should be mounted to
-/sys/kernel/debug therefore the debug feature can be accessed under
-/sys/kernel/debug/s390dbf.
+``/sys/kernel/debug`` therefore the debug feature can be accessed under
+``/sys/kernel/debug/s390dbf``.
The content of the directories are files which represent different views
to the debug log. Each component can decide which views should be
-used through registering them with the function debug_register_view().
+used through registering them with the function :c:func:`debug_register_view()`.
Predefined views for hex/ascii, sprintf and raw binary data are provided.
It is also possible to define other views. The content of
a view can be inspected simply by reading the corresponding debugfs file.
All debug logs have an actual debug level (range from 0 to 6).
-The default level is 3. Event and Exception functions have a 'level'
+The default level is 3. Event and Exception functions have a :c:data:`level`
parameter. Only debug entries with a level that is lower or equal
than the actual level are written to the log. This means, when
writing events, high priority log entries should have a low level
value whereas low priority entries should have a high one.
The actual debug level can be changed with the help of the debugfs-filesystem
-through writing a number string "x" to the 'level' debugfs file which is
+through writing a number string "x" to the ``level`` debugfs file which is
provided for every debug log. Debugging can be switched off completely
-by using "-" on the 'level' debugfs file.
+by using "-" on the ``level`` debugfs file.
Example::
@@ -86,21 +88,21 @@ Example::
It is also possible to deactivate the debug feature globally for every
debug log. You can change the behavior using 2 sysctl parameters in
-/proc/sys/s390dbf:
+``/proc/sys/s390dbf``:
There are currently 2 possible triggers, which stop the debug feature
-globally. The first possibility is to use the "debug_active" sysctl. If
-set to 1 the debug feature is running. If "debug_active" is set to 0 the
+globally. The first possibility is to use the ``debug_active`` sysctl. If
+set to 1 the debug feature is running. If ``debug_active`` is set to 0 the
debug feature is turned off.
The second trigger which stops the debug feature is a kernel oops.
That prevents the debug feature from overwriting debug information that
happened before the oops. After an oops you can reactivate the debug feature
-by piping 1 to /proc/sys/s390dbf/debug_active. Nevertheless, its not
+by piping 1 to ``/proc/sys/s390dbf/debug_active``. Nevertheless, it's not
suggested to use an oopsed kernel in a production environment.
If you want to disallow the deactivation of the debug feature, you can use
-the "debug_stoppable" sysctl. If you set "debug_stoppable" to 0 the debug
+the ``debug_stoppable`` sysctl. If you set ``debug_stoppable`` to 0 the debug
feature cannot be stopped. If the debug feature is already stopped, it
will stay deactivated.
@@ -113,16 +115,18 @@ Kernel Interfaces:
Predefined views:
-----------------
-extern struct debug_view debug_hex_ascii_view;
+.. code-block:: c
-extern struct debug_view debug_raw_view;
+ extern struct debug_view debug_hex_ascii_view;
-extern struct debug_view debug_sprintf_view;
+ extern struct debug_view debug_raw_view;
+
+ extern struct debug_view debug_sprintf_view;
Examples
--------
-::
+.. code-block:: c
/*
* hex_ascii- + raw-view Example
@@ -131,15 +135,15 @@ Examples
#include <linux/init.h>
#include <asm/debug.h>
- static debug_info_t* debug_info;
+ static debug_info_t *debug_info;
static int init(void)
{
/* register 4 debug areas with one page each and 4 byte data field */
- debug_info = debug_register ("test", 1, 4, 4 );
- debug_register_view(debug_info,&debug_hex_ascii_view);
- debug_register_view(debug_info,&debug_raw_view);
+ debug_info = debug_register("test", 1, 4, 4 );
+ debug_register_view(debug_info, &debug_hex_ascii_view);
+ debug_register_view(debug_info, &debug_raw_view);
debug_text_event(debug_info, 4 , "one ");
debug_int_exception(debug_info, 4, 4711);
@@ -150,13 +154,13 @@ Examples
static void cleanup(void)
{
- debug_unregister (debug_info);
+ debug_unregister(debug_info);
}
module_init(init);
module_exit(cleanup);
-::
+.. code-block:: c
/*
* sprintf-view Example
@@ -165,15 +169,15 @@ Examples
#include <linux/init.h>
#include <asm/debug.h>
- static debug_info_t* debug_info;
+ static debug_info_t *debug_info;
static int init(void)
{
/* register 4 debug areas with one page each and data field for */
/* format string pointer + 2 varargs (= 3 * sizeof(long)) */
- debug_info = debug_register ("test", 1, 4, sizeof(long) * 3);
- debug_register_view(debug_info,&debug_sprintf_view);
+ debug_info = debug_register("test", 1, 4, sizeof(long) * 3);
+ debug_register_view(debug_info, &debug_sprintf_view);
debug_sprintf_event(debug_info, 2 , "first event in %s:%i\n",__FILE__,__LINE__);
debug_sprintf_exception(debug_info, 1, "pointer to debug info: %p\n",&debug_info);
@@ -183,7 +187,7 @@ Examples
static void cleanup(void)
{
- debug_unregister (debug_info);
+ debug_unregister(debug_info);
}
module_init(init);
@@ -252,7 +256,7 @@ Define 4 pages for the debug areas of debug feature "dasd"::
> echo "4" > /sys/kernel/debug/s390dbf/dasd/pages
-Stooping the debug feature
+Stopping the debug feature
--------------------------
Example:
@@ -264,10 +268,11 @@ Example:
> echo 0 > /proc/sys/s390dbf/debug_active
-lcrash Interface
+crash Interface
----------------
-It is planned that the dump analysis tool lcrash gets an additional command
-'s390dbf' to display all the debug logs. With this tool it will be possible
+The ``crash`` tool since v5.1.0 has a built-in command
+``s390dbf`` to display all the debug logs or export them to the file system.
+With this tool it is possible
to investigate the debug logs on a live system and with a memory dump after
a system crash.
@@ -276,8 +281,8 @@ Investigating raw memory
One last possibility to investigate the debug logs at a live
system and after a system crash is to look at the raw memory
under VM or at the Service Element.
-It is possible to find the anker of the debug-logs through
-the 'debug_area_first' symbol in the System map. Then one has
+It is possible to find the anchor of the debug-logs through
+the ``debug_area_first`` symbol in the System map. Then one has
to follow the correct pointers of the data-structures defined
in debug.h and find the debug-areas in memory.
Normally modules which use the debug feature will also have
@@ -286,7 +291,7 @@ this pointer it will also be possible to find the debug logs in
memory.
For this method it is recommended to use '16 * x + 4' byte (x = 0..n)
-for the length of the data field in debug_register() in
+for the length of the data field in :c:func:`debug_register()` in
order to see the debug entries well formatted.
@@ -295,7 +300,7 @@ Predefined Views
There are three predefined views: hex_ascii, raw and sprintf.
The hex_ascii view shows the data field in hex and ascii representation
-(e.g. '45 43 4b 44 | ECKD').
+(e.g. ``45 43 4b 44 | ECKD``).
The raw view returns a bytestream as the debug areas are stored in memory.
The sprintf view formats the debug entries in the same way as the sprintf
@@ -335,18 +340,20 @@ The format of the raw view is:
- datafield
A typical line of the hex_ascii view will look like the following (first line
-is only for explanation and will not be displayed when 'cating' the view):
+is only for explanation and will not be displayed when 'cating' the view)::
-area time level exception cpu caller data (hex + ascii)
---------------------------------------------------------------------------
-00 00964419409:440690 1 - 00 88023fe
+ area time level exception cpu caller data (hex + ascii)
+ --------------------------------------------------------------------------
+ 00 00964419409:440690 1 - 00 88023fe
Defining views
--------------
Views are specified with the 'debug_view' structure. There are defined
-callback functions which are used for reading and writing the debugfs files::
+callback functions which are used for reading and writing the debugfs files:
+
+.. code-block:: c
struct debug_view {
char name[DEBUG_MAX_PROCF_LEN];
@@ -357,7 +364,9 @@ callback functions which are used for reading and writing the debugfs files::
void* private_data;
};
-where::
+where:
+
+.. code-block:: c
typedef int (debug_header_proc_t) (debug_info_t* id,
struct debug_view* view,
@@ -395,10 +404,10 @@ Then 'header_proc' and 'format_proc' are called for each
existing debug entry.
The input_proc can be used to implement functionality when it is written to
-the view (e.g. like with 'echo "0" > /sys/kernel/debug/s390dbf/dasd/level).
+the view (e.g. like with ``echo "0" > /sys/kernel/debug/s390dbf/dasd/level``).
For header_proc there can be used the default function
-debug_dflt_header_fn() which is defined in debug.h.
+:c:func:`debug_dflt_header_fn()` which is defined in debug.h.
and which produces the same header output as the predefined views.
E.g::
@@ -407,7 +416,9 @@ E.g::
In order to see how to use the callback functions check the implementation
of the default views!
-Example::
+Example:
+
+.. code-block:: c
#include <asm/debug.h>
@@ -423,21 +434,20 @@ Example::
};
static int debug_test_format_fn(
- debug_info_t * id, struct debug_view *view,
+ debug_info_t *id, struct debug_view *view,
char *out_buf, const char *in_buf
)
{
int i, rc = 0;
- if(id->buf_size >= 4) {
+ if (id->buf_size >= 4) {
int msg_nr = *((int*)in_buf);
- if(msg_nr < sizeof(messages)/sizeof(char*) - 1)
+ if (msg_nr < sizeof(messages) / sizeof(char*) - 1)
rc += sprintf(out_buf, "%s", messages[msg_nr]);
else
rc += sprintf(out_buf, UNKNOWNSTR, msg_nr);
}
- out:
- return rc;
+ return rc;
}
struct debug_view debug_test_view = {
@@ -452,13 +462,17 @@ Example::
test:
=====
-::
+.. code-block:: c
debug_info_t *debug_info;
+ int i;
...
- debug_info = debug_register ("test", 0, 4, 4 ));
+ debug_info = debug_register("test", 0, 4, 4);
debug_register_view(debug_info, &debug_test_view);
- for(i = 0; i < 10; i ++) debug_int_event(debug_info, 1, i);
+ for (i = 0; i < 10; i ++)
+ debug_int_event(debug_info, 1, i);
+
+::
> cat /sys/kernel/debug/s390dbf/test/myview
00 00964419734:611402 1 - 00 88042ca This error...........
--
1.8.3.1
^ permalink raw reply related
* [PATCH 1/3] docs: s390: restore important non-kdoc parts of s390dbf.rst
From: Steffen Maier @ 2019-07-03 10:19 UTC (permalink / raw)
To: linux-doc
Cc: linux-s390, Mauro Carvalho Chehab, Mauro Carvalho Chehab,
Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
linux-kernel
In-Reply-To: <1562149189-1417-1-git-send-email-maier@linux.ibm.com>
Complements previous ("s390: include/asm/debug.h add kerneldoc markups")
which seemed to have dropped important non-kdoc parts such as
user space interface (level, size, flush)
as well as views and caution regarding strings in the sprintf view.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
Documentation/s390/s390dbf.rst | 339 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 339 insertions(+)
diff --git a/Documentation/s390/s390dbf.rst b/Documentation/s390/s390dbf.rst
index d2595b548879..01d66251643d 100644
--- a/Documentation/s390/s390dbf.rst
+++ b/Documentation/s390/s390dbf.rst
@@ -112,6 +112,345 @@ Kernel Interfaces:
Predefined views:
-----------------
+extern struct debug_view debug_hex_ascii_view;
+
+extern struct debug_view debug_raw_view;
+
+extern struct debug_view debug_sprintf_view;
+
+Examples
+--------
+
+::
+
+ /*
+ * hex_ascii- + raw-view Example
+ */
+
+ #include <linux/init.h>
+ #include <asm/debug.h>
+
+ static debug_info_t* debug_info;
+
+ static int init(void)
+ {
+ /* register 4 debug areas with one page each and 4 byte data field */
+
+ debug_info = debug_register ("test", 1, 4, 4 );
+ debug_register_view(debug_info,&debug_hex_ascii_view);
+ debug_register_view(debug_info,&debug_raw_view);
+
+ debug_text_event(debug_info, 4 , "one ");
+ debug_int_exception(debug_info, 4, 4711);
+ debug_event(debug_info, 3, &debug_info, 4);
+
+ return 0;
+ }
+
+ static void cleanup(void)
+ {
+ debug_unregister (debug_info);
+ }
+
+ module_init(init);
+ module_exit(cleanup);
+
+::
+
+ /*
+ * sprintf-view Example
+ */
+
+ #include <linux/init.h>
+ #include <asm/debug.h>
+
+ static debug_info_t* debug_info;
+
+ static int init(void)
+ {
+ /* register 4 debug areas with one page each and data field for */
+ /* format string pointer + 2 varargs (= 3 * sizeof(long)) */
+
+ debug_info = debug_register ("test", 1, 4, sizeof(long) * 3);
+ debug_register_view(debug_info,&debug_sprintf_view);
+
+ debug_sprintf_event(debug_info, 2 , "first event in %s:%i\n",__FILE__,__LINE__);
+ debug_sprintf_exception(debug_info, 1, "pointer to debug info: %p\n",&debug_info);
+
+ return 0;
+ }
+
+ static void cleanup(void)
+ {
+ debug_unregister (debug_info);
+ }
+
+ module_init(init);
+ module_exit(cleanup);
+
+Debugfs Interface
+-----------------
+Views to the debug logs can be investigated through reading the corresponding
+debugfs-files:
+
+Example::
+
+ > ls /sys/kernel/debug/s390dbf/dasd
+ flush hex_ascii level pages raw
+ > cat /sys/kernel/debug/s390dbf/dasd/hex_ascii | sort -k2,2 -s
+ 00 00974733272:680099 2 - 02 0006ad7e 07 ea 4a 90 | ....
+ 00 00974733272:682210 2 - 02 0006ade6 46 52 45 45 | FREE
+ 00 00974733272:682213 2 - 02 0006adf6 07 ea 4a 90 | ....
+ 00 00974733272:682281 1 * 02 0006ab08 41 4c 4c 43 | EXCP
+ 01 00974733272:682284 2 - 02 0006ab16 45 43 4b 44 | ECKD
+ 01 00974733272:682287 2 - 02 0006ab28 00 00 00 04 | ....
+ 01 00974733272:682289 2 - 02 0006ab3e 00 00 00 20 | ...
+ 01 00974733272:682297 2 - 02 0006ad7e 07 ea 4a 90 | ....
+ 01 00974733272:684384 2 - 00 0006ade6 46 52 45 45 | FREE
+ 01 00974733272:684388 2 - 00 0006adf6 07 ea 4a 90 | ....
+
+See section about predefined views for explanation of the above output!
+
+Changing the debug level
+------------------------
+
+Example::
+
+
+ > cat /sys/kernel/debug/s390dbf/dasd/level
+ 3
+ > echo "5" > /sys/kernel/debug/s390dbf/dasd/level
+ > cat /sys/kernel/debug/s390dbf/dasd/level
+ 5
+
+Flushing debug areas
+--------------------
+Debug areas can be flushed with piping the number of the desired
+area (0...n) to the debugfs file "flush". When using "-" all debug areas
+are flushed.
+
+Examples:
+
+1. Flush debug area 0::
+
+ > echo "0" > /sys/kernel/debug/s390dbf/dasd/flush
+
+2. Flush all debug areas::
+
+ > echo "-" > /sys/kernel/debug/s390dbf/dasd/flush
+
+Changing the size of debug areas
+------------------------------------
+It is possible the change the size of debug areas through piping
+the number of pages to the debugfs file "pages". The resize request will
+also flush the debug areas.
+
+Example:
+
+Define 4 pages for the debug areas of debug feature "dasd"::
+
+ > echo "4" > /sys/kernel/debug/s390dbf/dasd/pages
+
+Stooping the debug feature
+--------------------------
+Example:
+
+1. Check if stopping is allowed::
+
+ > cat /proc/sys/s390dbf/debug_stoppable
+
+2. Stop debug feature::
+
+ > echo 0 > /proc/sys/s390dbf/debug_active
+
+lcrash Interface
+----------------
+It is planned that the dump analysis tool lcrash gets an additional command
+'s390dbf' to display all the debug logs. With this tool it will be possible
+to investigate the debug logs on a live system and with a memory dump after
+a system crash.
+
+Investigating raw memory
+------------------------
+One last possibility to investigate the debug logs at a live
+system and after a system crash is to look at the raw memory
+under VM or at the Service Element.
+It is possible to find the anker of the debug-logs through
+the 'debug_area_first' symbol in the System map. Then one has
+to follow the correct pointers of the data-structures defined
+in debug.h and find the debug-areas in memory.
+Normally modules which use the debug feature will also have
+a global variable with the pointer to the debug-logs. Following
+this pointer it will also be possible to find the debug logs in
+memory.
+
+For this method it is recommended to use '16 * x + 4' byte (x = 0..n)
+for the length of the data field in debug_register() in
+order to see the debug entries well formatted.
+
+
+Predefined Views
+----------------
+
+There are three predefined views: hex_ascii, raw and sprintf.
+The hex_ascii view shows the data field in hex and ascii representation
+(e.g. '45 43 4b 44 | ECKD').
+The raw view returns a bytestream as the debug areas are stored in memory.
+
+The sprintf view formats the debug entries in the same way as the sprintf
+function would do. The sprintf event/exception functions write to the
+debug entry a pointer to the format string (size = sizeof(long))
+and for each vararg a long value. So e.g. for a debug entry with a format
+string plus two varargs one would need to allocate a (3 * sizeof(long))
+byte data area in the debug_register() function.
+
+IMPORTANT:
+ Using "%s" in sprintf event functions is dangerous. You can only
+ use "%s" in the sprintf event functions, if the memory for the passed string
+ is available as long as the debug feature exists. The reason behind this is
+ that due to performance considerations only a pointer to the string is stored
+ in the debug feature. If you log a string that is freed afterwards, you will
+ get an OOPS when inspecting the debug feature, because then the debug feature
+ will access the already freed memory.
+
+NOTE:
+ If using the sprintf view do NOT use other event/exception functions
+ than the sprintf-event and -exception functions.
+
+The format of the hex_ascii and sprintf view is as follows:
+
+- Number of area
+- Timestamp (formatted as seconds and microseconds since 00:00:00 Coordinated
+ Universal Time (UTC), January 1, 1970)
+- level of debug entry
+- Exception flag (* = Exception)
+- Cpu-Number of calling task
+- Return Address to caller
+- data field
+
+The format of the raw view is:
+
+- Header as described in debug.h
+- datafield
+
+A typical line of the hex_ascii view will look like the following (first line
+is only for explanation and will not be displayed when 'cating' the view):
+
+area time level exception cpu caller data (hex + ascii)
+--------------------------------------------------------------------------
+00 00964419409:440690 1 - 00 88023fe
+
+
+Defining views
+--------------
+
+Views are specified with the 'debug_view' structure. There are defined
+callback functions which are used for reading and writing the debugfs files::
+
+ struct debug_view {
+ char name[DEBUG_MAX_PROCF_LEN];
+ debug_prolog_proc_t* prolog_proc;
+ debug_header_proc_t* header_proc;
+ debug_format_proc_t* format_proc;
+ debug_input_proc_t* input_proc;
+ void* private_data;
+ };
+
+where::
+
+ typedef int (debug_header_proc_t) (debug_info_t* id,
+ struct debug_view* view,
+ int area,
+ debug_entry_t* entry,
+ char* out_buf);
+
+ typedef int (debug_format_proc_t) (debug_info_t* id,
+ struct debug_view* view, char* out_buf,
+ const char* in_buf);
+ typedef int (debug_prolog_proc_t) (debug_info_t* id,
+ struct debug_view* view,
+ char* out_buf);
+ typedef int (debug_input_proc_t) (debug_info_t* id,
+ struct debug_view* view,
+ struct file* file, const char* user_buf,
+ size_t in_buf_size, loff_t* offset);
+
+
+The "private_data" member can be used as pointer to view specific data.
+It is not used by the debug feature itself.
+
+The output when reading a debugfs file is structured like this::
+
+ "prolog_proc output"
+
+ "header_proc output 1" "format_proc output 1"
+ "header_proc output 2" "format_proc output 2"
+ "header_proc output 3" "format_proc output 3"
+ ...
+
+When a view is read from the debugfs, the Debug Feature calls the
+'prolog_proc' once for writing the prolog.
+Then 'header_proc' and 'format_proc' are called for each
+existing debug entry.
+
+The input_proc can be used to implement functionality when it is written to
+the view (e.g. like with 'echo "0" > /sys/kernel/debug/s390dbf/dasd/level).
+
+For header_proc there can be used the default function
+debug_dflt_header_fn() which is defined in debug.h.
+and which produces the same header output as the predefined views.
+E.g::
+
+ 00 00964419409:440761 2 - 00 88023ec
+
+In order to see how to use the callback functions check the implementation
+of the default views!
+
+Example::
+
+ #include <asm/debug.h>
+
+ #define UNKNOWNSTR "data: %08x"
+
+ const char* messages[] =
+ {"This error...........\n",
+ "That error...........\n",
+ "Problem..............\n",
+ "Something went wrong.\n",
+ "Everything ok........\n",
+ NULL
+ };
+
+ static int debug_test_format_fn(
+ debug_info_t * id, struct debug_view *view,
+ char *out_buf, const char *in_buf
+ )
+ {
+ int i, rc = 0;
+
+ if(id->buf_size >= 4) {
+ int msg_nr = *((int*)in_buf);
+ if(msg_nr < sizeof(messages)/sizeof(char*) - 1)
+ rc += sprintf(out_buf, "%s", messages[msg_nr]);
+ else
+ rc += sprintf(out_buf, UNKNOWNSTR, msg_nr);
+ }
+ out:
+ return rc;
+ }
+
+ struct debug_view debug_test_view = {
+ "myview", /* name of view */
+ NULL, /* no prolog */
+ &debug_dflt_header_fn, /* default header for each entry */
+ &debug_test_format_fn, /* our own format function */
+ NULL, /* no input function */
+ NULL /* no private data */
+ };
+
+test:
+=====
+
::
debug_info_t *debug_info;
--
1.8.3.1
^ permalink raw reply related
* [PATCH 2/3] docs: s390: unify and update s390dbf kdocs at debug.c
From: Steffen Maier @ 2019-07-03 10:19 UTC (permalink / raw)
To: linux-doc
Cc: linux-s390, Mauro Carvalho Chehab, Mauro Carvalho Chehab,
Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
linux-kernel
In-Reply-To: <1562149189-1417-1-git-send-email-maier@linux.ibm.com>
For non-static-inlines, debug.c already had non-compliant function
header docs. So move the pure prototype kdocs of
("s390: include/asm/debug.h add kerneldoc markups")
from debug.h to debug.c and merge them with the old function docs.
Also, I had the impression that kdoc typically is at the implementation
in the compile unit rather than at the prototype in the header file.
While at it, update the short kdoc description to distinguish the
different functions. And a few more consistency cleanups.
Added a new kdoc for debug_set_critical() since debug.h comments it
as part of the API.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
Documentation/s390/s390dbf.rst | 1 +
arch/s390/include/asm/debug.h | 112 ++++++-----------------------------------
arch/s390/kernel/debug.c | 105 +++++++++++++++++++++++++++++++-------
3 files changed, 102 insertions(+), 116 deletions(-)
diff --git a/Documentation/s390/s390dbf.rst b/Documentation/s390/s390dbf.rst
index 01d66251643d..be42892b159e 100644
--- a/Documentation/s390/s390dbf.rst
+++ b/Documentation/s390/s390dbf.rst
@@ -107,6 +107,7 @@ will stay deactivated.
Kernel Interfaces:
------------------
+.. kernel-doc:: arch/s390/kernel/debug.c
.. kernel-doc:: arch/s390/include/asm/debug.h
Predefined views:
diff --git a/arch/s390/include/asm/debug.h b/arch/s390/include/asm/debug.h
index 02c36eedd780..310134015541 100644
--- a/arch/s390/include/asm/debug.h
+++ b/arch/s390/include/asm/debug.h
@@ -95,77 +95,19 @@ debug_entry_t *debug_exception_common(debug_info_t *id, int level,
/* Debug Feature API: */
-/**
- * debug_register() - allocates memory for a debug log.
- *
- * @name: Name of debug log (e.g. used for debugfs entry)
- * @pages: Number of pages, which will be allocated per area
- * @nr_areas: Number of debug areas
- * @buf_size: Size of data area in each debug entry
- *
- * Return:
- * - Handler for generated debug area
- * - %NULL if register failed
- *
- * Must not be called within an interrupt handler.
- */
debug_info_t *debug_register(const char *name, int pages, int nr_areas,
int buf_size);
-/**
- * debug_register_mode() - allocates memory for a debug log.
- *
- * @name: Name of debug log (e.g. used for debugfs entry)
- * @pages: Number of pages, which will be allocated per area
- * @nr_areas: Number of debug areas
- * @buf_size: Size of data area in each debug entry
- * @mode: File mode for debugfs files. E.g. S_IRWXUGO
- * @uid: User ID for debugfs files. Currently only 0 is supported.
- * @gid: Group ID for debugfs files. Currently only 0 is supported.
- *
- * Return:
- * - Handler for generated debug area
- * - %NULL if register failed
- *
- * Must not be called within an interrupt handler
- */
debug_info_t *debug_register_mode(const char *name, int pages, int nr_areas,
int buf_size, umode_t mode, uid_t uid,
gid_t gid);
-/**
- * debug_unregister() - frees memory for a debug log and removes all
- * registered debug
- * views.
- *
- * @id: handle for debug log
- *
- * Return:
- * none
- *
- * Must not be called within an interrupt handler
- */
void debug_unregister(debug_info_t *id);
-/**
- * debug_set_level() - Sets new actual debug level if new_level is valid.
- *
- * @id: handle for debug log
- * @new_level: new debug level
- *
- * Return:
- * none
- */
void debug_set_level(debug_info_t *id, int new_level);
void debug_set_critical(void);
-/**
- * debug_stop_all() - stops the debug feature if stopping is allowed.
- *
- * Return:
- * - none
- */
void debug_stop_all(void);
/**
@@ -184,7 +126,7 @@ static inline bool debug_level_enabled(debug_info_t *id, int level)
}
/**
- * debug_event() - writes debug entry to active debug area
+ * debug_event() - writes binary debug entry to active debug area
* (if level <= actual debug level)
*
* @id: handle for debug log
@@ -194,6 +136,7 @@ static inline bool debug_level_enabled(debug_info_t *id, int level)
*
* Return:
* - Address of written debug entry
+ * - %NULL if error
*/
static inline debug_entry_t *debug_event(debug_info_t *id, int level,
void *data, int length)
@@ -204,7 +147,7 @@ static inline debug_entry_t *debug_event(debug_info_t *id, int level,
}
/**
- * debug_int_event() - writes debug entry to active debug area
+ * debug_int_event() - writes unsigned integer debug entry to active debug area
* (if level <= actual debug level)
*
* @id: handle for debug log
@@ -226,12 +169,12 @@ static inline debug_entry_t *debug_int_event(debug_info_t *id, int level,
}
/**
- * debug_long_event() - writes debug entry to active debug area
+ * debug_long_event() - writes unsigned long debug entry to active debug area
* (if level <= actual debug level)
*
* @id: handle for debug log
* @level: debug level
- * @tag: integer value for debug entry
+ * @tag: long integer value for debug entry
*
* Return:
* - Address of written debug entry
@@ -248,7 +191,7 @@ static inline debug_entry_t *debug_long_event(debug_info_t *id, int level,
}
/**
- * debug_text_event() - writes debug entry in ascii format to active
+ * debug_text_event() - writes string debug entry in ascii format to active
* debug area (if level <= actual debug level)
*
* @id: handle for debug log
@@ -306,9 +249,9 @@ static inline debug_entry_t *debug_text_event(debug_info_t *id, int level,
})
/**
- * debug_exception() - writes debug entry to active debug area
- * (if level <= actual debug level) and switches
- * to next debug area
+ * debug_exception() - writes binary debug entry to active debug area
+ * (if level <= actual debug level)
+ * and switches to next debug area
*
* @id: handle for debug log
* @level: debug level
@@ -328,7 +271,7 @@ static inline debug_entry_t *debug_exception(debug_info_t *id, int level,
}
/**
- * debug_int_exception() - writes debug entry to active debug area
+ * debug_int_exception() - writes unsigned int debug entry to active debug area
* (if level <= actual debug level)
* and switches to next debug area
*
@@ -351,13 +294,13 @@ static inline debug_entry_t *debug_int_exception(debug_info_t *id, int level,
}
/**
- * debug_long_exception() - writes debug entry to active debug area
+ * debug_long_exception() - writes long debug entry to active debug area
* (if level <= actual debug level)
* and switches to next debug area
*
* @id: handle for debug log
* @level: debug level
- * @tag: integer value for debug entry
+ * @tag: long integer value for debug entry
*
* Return:
* - Address of written debug entry
@@ -374,9 +317,9 @@ static inline debug_entry_t *debug_long_exception (debug_info_t *id, int level,
}
/**
- * debug_text_exception() - writes debug entry in ascii format to active
+ * debug_text_exception() - writes string debug entry in ascii format to active
* debug area (if level <= actual debug level)
- * and switches to next debug
+ * and switches to next debug area
* area
*
* @id: handle for debug log
@@ -407,7 +350,7 @@ static inline debug_entry_t *debug_text_exception(debug_info_t *id, int level,
/**
* debug_sprintf_exception() - writes debug entry with format string and
* varargs (longs) to active debug area
- * (if level $<=$ actual debug level)
+ * (if level <= actual debug level)
* and switches to next debug area.
*
* @_id: handle for debug log
@@ -435,33 +378,8 @@ static inline debug_entry_t *debug_text_exception(debug_info_t *id, int level,
__ret; \
})
-/**
- * debug_register_view() - registers new debug view and creates debugfs
- * dir entry
- *
- * @id: handle for debug log
- * @view: pointer to debug view struct
- *
- * Return:
- * - 0 : ok
- * - < 0: Error
- */
int debug_register_view(debug_info_t *id, struct debug_view *view);
-/**
- * debug_unregister_view()
- *
- * @id: handle for debug log
- * @view: pointer to debug view struct
- *
- * Return:
- * - 0 : ok
- * - < 0: Error
- *
- *
- * unregisters debug view and removes debugfs dir entry
- */
-
int debug_unregister_view(debug_info_t *id, struct debug_view *view);
/*
diff --git a/arch/s390/kernel/debug.c b/arch/s390/kernel/debug.c
index 0ebf08c3b35e..70a44bad625f 100644
--- a/arch/s390/kernel/debug.c
+++ b/arch/s390/kernel/debug.c
@@ -647,11 +647,23 @@ static int debug_close(struct inode *inode, struct file *file)
return 0; /* success */
}
-/*
- * debug_register_mode:
- * - Creates and initializes debug area for the caller
- * The mode parameter allows to specify access rights for the s390dbf files
- * - Returns handle for debug area
+/**
+ * debug_register_mode() - creates and initializes debug area.
+ *
+ * @name: Name of debug log (e.g. used for debugfs entry)
+ * @pages_per_area: Number of pages, which will be allocated per area
+ * @nr_areas: Number of debug areas
+ * @buf_size: Size of data area in each debug entry
+ * @mode: File mode for debugfs files. E.g. S_IRWXUGO
+ * @uid: User ID for debugfs files. Currently only 0 is supported.
+ * @gid: Group ID for debugfs files. Currently only 0 is supported.
+ *
+ * Return:
+ * - Handle for generated debug area
+ * - %NULL if register failed
+ *
+ * Allocates memory for a debug log.
+ * Must not be called within an interrupt handler.
*/
debug_info_t *debug_register_mode(const char *name, int pages_per_area,
int nr_areas, int buf_size, umode_t mode,
@@ -681,10 +693,21 @@ debug_info_t *debug_register_mode(const char *name, int pages_per_area,
}
EXPORT_SYMBOL(debug_register_mode);
-/*
- * debug_register:
- * - creates and initializes debug area for the caller
- * - returns handle for debug area
+/**
+ * debug_register() - creates and initializes debug area with default file mode.
+ *
+ * @name: Name of debug log (e.g. used for debugfs entry)
+ * @pages_per_area: Number of pages, which will be allocated per area
+ * @nr_areas: Number of debug areas
+ * @buf_size: Size of data area in each debug entry
+ *
+ * Return:
+ * - Handle for generated debug area
+ * - %NULL if register failed
+ *
+ * Allocates memory for a debug log.
+ * The debugfs file mode access permisions are read and write for user.
+ * Must not be called within an interrupt handler.
*/
debug_info_t *debug_register(const char *name, int pages_per_area,
int nr_areas, int buf_size)
@@ -694,9 +717,13 @@ debug_info_t *debug_register(const char *name, int pages_per_area,
}
EXPORT_SYMBOL(debug_register);
-/*
- * debug_unregister:
- * - give back debug area
+/**
+ * debug_unregister() - give back debug area.
+ *
+ * @id: handle for debug log
+ *
+ * Return:
+ * none
*/
void debug_unregister(debug_info_t *id)
{
@@ -745,9 +772,14 @@ static int debug_set_size(debug_info_t *id, int nr_areas, int pages_per_area)
return rc;
}
-/*
- * debug_set_level:
- * - set actual debug level
+/**
+ * debug_set_level() - Sets new actual debug level if new_level is valid.
+ *
+ * @id: handle for debug log
+ * @new_level: new debug level
+ *
+ * Return:
+ * none
*/
void debug_set_level(debug_info_t *id, int new_level)
{
@@ -873,6 +905,14 @@ static int s390dbf_procactive(struct ctl_table *table, int write,
static struct ctl_table_header *s390dbf_sysctl_header;
+/**
+ * debug_stop_all() - stops the debug feature if stopping is allowed.
+ *
+ * Return:
+ * - none
+ *
+ * Currently used in case of a kernel oops.
+ */
void debug_stop_all(void)
{
if (debug_stoppable)
@@ -880,6 +920,17 @@ void debug_stop_all(void)
}
EXPORT_SYMBOL(debug_stop_all);
+/**
+ * debug_set_critical() - event/exception functions try lock instead of spin.
+ *
+ * Return:
+ * - none
+ *
+ * Currently used in case of stopping all CPUs but the current one.
+ * Once in this state, functions to write a debug entry for an
+ * event or exception no longer spin on the debug area lock,
+ * but only try to get it and fail if they do not get the lock.
+ */
void debug_set_critical(void)
{
debug_critical = 1;
@@ -1036,8 +1087,16 @@ debug_entry_t *__debug_sprintf_exception(debug_info_t *id, int level, char *stri
}
EXPORT_SYMBOL(__debug_sprintf_exception);
-/*
- * debug_register_view:
+/**
+ * debug_register_view() - registers new debug view and creates debugfs
+ * dir entry
+ *
+ * @id: handle for debug log
+ * @view: pointer to debug view struct
+ *
+ * Return:
+ * - 0 : ok
+ * - < 0: Error
*/
int debug_register_view(debug_info_t *id, struct debug_view *view)
{
@@ -1077,8 +1136,16 @@ int debug_register_view(debug_info_t *id, struct debug_view *view)
}
EXPORT_SYMBOL(debug_register_view);
-/*
- * debug_unregister_view:
+/**
+ * debug_unregister_view() - unregisters debug view and removes debugfs
+ * dir entry
+ *
+ * @id: handle for debug log
+ * @view: pointer to debug view struct
+ *
+ * Return:
+ * - 0 : ok
+ * - < 0: Error
*/
int debug_unregister_view(debug_info_t *id, struct debug_view *view)
{
--
1.8.3.1
^ permalink raw reply related
* Re: [UPDATE][PATCH 10/10] tools/power/x86: A tool to validate Intel Speed Select commands
From: Rafael J. Wysocki @ 2019-07-03 11:58 UTC (permalink / raw)
To: Srinivas Pandruvada
Cc: dvhart, andy, andriy.shevchenko, corbet, alan, lenb, prarit,
darcari, linux-doc, linux-kernel, platform-driver-x86
In-Reply-To: <20190630171408.8673-1-srinivas.pandruvada@linux.intel.com>
On Sunday, June 30, 2019 7:14:08 PM CEST Srinivas Pandruvada wrote:
> The Intel(R) Speed select technologies contains four features.
>
> Performance profile:An non architectural mechanism that allows multiple
> optimized performance profiles per system via static and/or dynamic
> adjustment of core count, workload, Tjmax, and TDP, etc. aka ISS
> in the documentation.
>
> Base Frequency: Enables users to increase guaranteed base frequency on
> certain cores (high priority cores) in exchange for lower base frequency
> on remaining cores (low priority cores). aka PBF in the documenation.
>
> Turbo frequency: Enables the ability to set different turbo ratio limits
> to cores based on priority. aka FACT in the documentation.
>
> Core power: An Interface that allows user to define per core/tile
> priority.
>
> There is a multi level help for commands and options. This can be used
> to check required arguments for each feature and commands for the
> feature.
>
> To start navigating the features start with
>
> $sudo intel-speed-select --help
>
> For help on a specific feature for example
> $sudo intel-speed-select perf-profile --help
>
> To get help for a command for a feature for example
> $sudo intel-speed-select perf-profile get-lock-status --help
>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> Updates:
> - Copied Makefile from tools/gpio and moified the Makefile here
> - Added entry to tools/build/Makefile
> - Rename directory to match the executable name
> - Fix one error message
>
> tools/Makefile | 12 +-
> tools/power/x86/intel-speed-select/Build | 1 +
> tools/power/x86/intel-speed-select/Makefile | 56 +
> .../x86/intel-speed-select/isst-config.c | 1607 +++++++++++++++++
> .../power/x86/intel-speed-select/isst-core.c | 721 ++++++++
> .../x86/intel-speed-select/isst-display.c | 479 +++++
> tools/power/x86/intel-speed-select/isst.h | 231 +++
> 7 files changed, 3102 insertions(+), 5 deletions(-)
> create mode 100644 tools/power/x86/intel-speed-select/Build
> create mode 100644 tools/power/x86/intel-speed-select/Makefile
> create mode 100644 tools/power/x86/intel-speed-select/isst-config.c
> create mode 100644 tools/power/x86/intel-speed-select/isst-core.c
> create mode 100644 tools/power/x86/intel-speed-select/isst-display.c
> create mode 100644 tools/power/x86/intel-speed-select/isst.h
>
> diff --git a/tools/Makefile b/tools/Makefile
> index 3dfd72ae6c1a..68defd7ecf5d 100644
> --- a/tools/Makefile
> +++ b/tools/Makefile
> @@ -19,6 +19,7 @@ help:
> @echo ' gpio - GPIO tools'
> @echo ' hv - tools used when in Hyper-V clients'
> @echo ' iio - IIO tools'
> + @echo ' intel-speed-select - Intel Speed Select tool'
> @echo ' kvm_stat - top-like utility for displaying kvm statistics'
> @echo ' leds - LEDs tools'
> @echo ' liblockdep - user-space wrapper for kernel locking-validator'
> @@ -82,7 +83,7 @@ perf: FORCE
> selftests: FORCE
> $(call descend,testing/$@)
>
> -turbostat x86_energy_perf_policy: FORCE
> +turbostat x86_energy_perf_policy intel-speed-select: FORCE
> $(call descend,power/x86/$@)
>
> tmon: FORCE
> @@ -115,7 +116,7 @@ liblockdep_install:
> selftests_install:
> $(call descend,testing/$(@:_install=),install)
>
> -turbostat_install x86_energy_perf_policy_install:
> +turbostat_install x86_energy_perf_policy_install intel-speed-select_install:
> $(call descend,power/x86/$(@:_install=),install)
>
> tmon_install:
> @@ -132,7 +133,7 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
> perf_install selftests_install turbostat_install usb_install \
> virtio_install vm_install bpf_install x86_energy_perf_policy_install \
> tmon_install freefall_install objtool_install kvm_stat_install \
> - wmi_install pci_install debugging_install
> + wmi_install pci_install debugging_install intel-speed-select_install
>
> acpi_clean:
> $(call descend,power/acpi,clean)
> @@ -162,7 +163,7 @@ perf_clean:
> selftests_clean:
> $(call descend,testing/$(@:_clean=),clean)
>
> -turbostat_clean x86_energy_perf_policy_clean:
> +turbostat_clean x86_energy_perf_policy_clean intel-speed-select_clean:
> $(call descend,power/x86/$(@:_clean=),clean)
>
> tmon_clean:
> @@ -178,6 +179,7 @@ clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
> perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
> vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
> freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
> - gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean
> + gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean \
> + intel-speed-select_clean
>
> .PHONY: FORCE
> diff --git a/tools/power/x86/intel-speed-select/Build b/tools/power/x86/intel-speed-select/Build
> new file mode 100644
> index 000000000000..b61456d75190
> --- /dev/null
> +++ b/tools/power/x86/intel-speed-select/Build
> @@ -0,0 +1 @@
> +intel-speed-select-y += isst-config.o isst-core.o isst-display.o
> diff --git a/tools/power/x86/intel-speed-select/Makefile b/tools/power/x86/intel-speed-select/Makefile
> new file mode 100644
> index 000000000000..12c6939dca2a
> --- /dev/null
> +++ b/tools/power/x86/intel-speed-select/Makefile
> @@ -0,0 +1,56 @@
> +# SPDX-License-Identifier: GPL-2.0
> +include ../../../scripts/Makefile.include
> +
> +bindir ?= /usr/bin
> +
> +ifeq ($(srctree),)
> +srctree := $(patsubst %/,%,$(dir $(CURDIR)))
> +srctree := $(patsubst %/,%,$(dir $(srctree)))
> +srctree := $(patsubst %/,%,$(dir $(srctree)))
> +srctree := $(patsubst %/,%,$(dir $(srctree)))
> +endif
> +
> +# Do not use make's built-in rules
> +# (this improves performance and avoids hard-to-debug behaviour);
> +MAKEFLAGS += -r
> +
> +override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
> +
> +ALL_TARGETS := intel-speed-select
> +ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
> +
> +all: $(ALL_PROGRAMS)
> +
> +export srctree OUTPUT CC LD CFLAGS
> +include $(srctree)/tools/build/Makefile.include
> +
> +#
> +# We need the following to be outside of kernel tree
> +#
> +$(OUTPUT)include/linux/isst_if.h: ../../../../include/uapi/linux/isst_if.h
> + mkdir -p $(OUTPUT)include/linux 2>&1 || true
> + ln -sf $(CURDIR)/../../../../include/uapi/linux/isst_if.h $@
> +
> +prepare: $(OUTPUT)include/linux/isst_if.h
> +
> +ISST_IN := $(OUTPUT)intel-speed-select-in.o
> +
> +$(ISST_IN): prepare FORCE
> + $(Q)$(MAKE) $(build)=intel-speed-select
> +$(OUTPUT)intel-speed-select: $(ISST_IN)
> + $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@
> +
> +clean:
> + rm -f $(ALL_PROGRAMS)
> + rm -rf $(OUTPUT)include/linux/isst_if.h
> + find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.d' -delete
> +
> +install: $(ALL_PROGRAMS)
> + install -d -m 755 $(DESTDIR)$(bindir); \
> + for program in $(ALL_PROGRAMS); do \
> + install $$program $(DESTDIR)$(bindir); \
> + done
> +
> +FORCE:
> +
> +.PHONY: all install clean FORCE prepare
> diff --git a/tools/power/x86/intel-speed-select/isst-config.c b/tools/power/x86/intel-speed-select/isst-config.c
> new file mode 100644
> index 000000000000..91c5ad1685a1
> --- /dev/null
> +++ b/tools/power/x86/intel-speed-select/isst-config.c
> @@ -0,0 +1,1607 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel Speed Select -- Enumerate and control features
> + * Copyright (c) 2019 Intel Corporation.
> + */
> +
> +#include <linux/isst_if.h>
> +
> +#include "isst.h"
> +
> +struct process_cmd_struct {
> + char *feature;
> + char *command;
> + void (*process_fn)(void);
> +};
> +
> +static const char *version_str = "v1.0";
> +static const int supported_api_ver = 1;
> +static struct isst_if_platform_info isst_platform_info;
> +static char *progname;
> +static int debug_flag;
> +static FILE *outf;
> +
> +static int cpu_model;
> +
> +#define MAX_CPUS_IN_ONE_REQ 64
> +static short max_target_cpus;
> +static unsigned short target_cpus[MAX_CPUS_IN_ONE_REQ];
> +
> +static int topo_max_cpus;
> +static size_t present_cpumask_size;
> +static cpu_set_t *present_cpumask;
> +static size_t target_cpumask_size;
> +static cpu_set_t *target_cpumask;
> +static int tdp_level = 0xFF;
> +static int fact_bucket = 0xFF;
> +static int fact_avx = 0xFF;
> +static unsigned long long fact_trl;
> +static int out_format_json;
> +static int cmd_help;
> +
> +/* clos related */
> +static int current_clos = -1;
> +static int clos_epp = -1;
> +static int clos_prop_prio = -1;
> +static int clos_min = -1;
> +static int clos_max = -1;
> +static int clos_desired = -1;
> +static int clos_priority_type;
> +
> +struct _cpu_map {
> + unsigned short core_id;
> + unsigned short pkg_id;
> + unsigned short die_id;
> + unsigned short punit_cpu;
> + unsigned short punit_cpu_core;
> +};
> +struct _cpu_map *cpu_map;
> +
> +void debug_printf(const char *format, ...)
> +{
> + va_list args;
> +
> + va_start(args, format);
> +
> + if (debug_flag)
> + vprintf(format, args);
> +
> + va_end(args);
> +}
> +
> +static void update_cpu_model(void)
> +{
> + unsigned int ebx, ecx, edx;
> + unsigned int fms, family;
> +
> + __cpuid(1, fms, ebx, ecx, edx);
> + family = (fms >> 8) & 0xf;
> + cpu_model = (fms >> 4) & 0xf;
> + if (family == 6 || family == 0xf)
> + cpu_model += ((fms >> 16) & 0xf) << 4;
> +}
> +
> +/* Open a file, and exit on failure */
> +static FILE *fopen_or_exit(const char *path, const char *mode)
> +{
> + FILE *filep = fopen(path, mode);
> +
> + if (!filep)
> + err(1, "%s: open failed", path);
> +
> + return filep;
> +}
> +
> +/* Parse a file containing a single int */
> +static int parse_int_file(int fatal, const char *fmt, ...)
> +{
> + va_list args;
> + char path[PATH_MAX];
> + FILE *filep;
> + int value;
> +
> + va_start(args, fmt);
> + vsnprintf(path, sizeof(path), fmt, args);
> + va_end(args);
> + if (fatal) {
> + filep = fopen_or_exit(path, "r");
> + } else {
> + filep = fopen(path, "r");
> + if (!filep)
> + return -1;
> + }
> + if (fscanf(filep, "%d", &value) != 1)
> + err(1, "%s: failed to parse number from file", path);
> + fclose(filep);
> +
> + return value;
> +}
> +
> +int cpufreq_sysfs_present(void)
> +{
> + DIR *dir;
> +
> + dir = opendir("/sys/devices/system/cpu/cpu0/cpufreq");
> + if (dir) {
> + closedir(dir);
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +int out_format_is_json(void)
> +{
> + return out_format_json;
> +}
> +
> +int get_physical_package_id(int cpu)
> +{
> + return parse_int_file(
> + 1, "/sys/devices/system/cpu/cpu%d/topology/physical_package_id",
> + cpu);
> +}
> +
> +int get_physical_core_id(int cpu)
> +{
> + return parse_int_file(
> + 1, "/sys/devices/system/cpu/cpu%d/topology/core_id", cpu);
> +}
> +
> +int get_physical_die_id(int cpu)
> +{
> + int ret;
> +
> + ret = parse_int_file(0, "/sys/devices/system/cpu/cpu%d/topology/die_id",
> + cpu);
> + if (ret < 0)
> + ret = 0;
> +
> + return ret;
> +}
> +
> +int get_topo_max_cpus(void)
> +{
> + return topo_max_cpus;
> +}
> +
> +#define MAX_PACKAGE_COUNT 8
> +#define MAX_DIE_PER_PACKAGE 2
> +static void for_each_online_package_in_set(void (*callback)(int, void *, void *,
> + void *, void *),
> + void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int max_packages[MAX_PACKAGE_COUNT * MAX_PACKAGE_COUNT];
> + int pkg_index = 0, i;
> +
> + memset(max_packages, 0xff, sizeof(max_packages));
> + for (i = 0; i < topo_max_cpus; ++i) {
> + int j, online, pkg_id, die_id = 0, skip = 0;
> +
> + if (!CPU_ISSET_S(i, present_cpumask_size, present_cpumask))
> + continue;
> + if (i)
> + online = parse_int_file(
> + 1, "/sys/devices/system/cpu/cpu%d/online", i);
> + else
> + online =
> + 1; /* online entry for CPU 0 needs some special configs */
> +
> + die_id = get_physical_die_id(i);
> + if (die_id < 0)
> + die_id = 0;
> + pkg_id = get_physical_package_id(i);
> + /* Create an unique id for package, die combination to store */
> + pkg_id = (MAX_PACKAGE_COUNT * pkg_id + die_id);
> +
> + for (j = 0; j < pkg_index; ++j) {
> + if (max_packages[j] == pkg_id) {
> + skip = 1;
> + break;
> + }
> + }
> +
> + if (!skip && online && callback) {
> + callback(i, arg1, arg2, arg3, arg4);
> + max_packages[pkg_index++] = pkg_id;
> + }
> + }
> +}
> +
> +static void for_each_online_target_cpu_in_set(
> + void (*callback)(int, void *, void *, void *, void *), void *arg1,
> + void *arg2, void *arg3, void *arg4)
> +{
> + int i;
> +
> + for (i = 0; i < topo_max_cpus; ++i) {
> + int online;
> +
> + if (!CPU_ISSET_S(i, target_cpumask_size, target_cpumask))
> + continue;
> + if (i)
> + online = parse_int_file(
> + 1, "/sys/devices/system/cpu/cpu%d/online", i);
> + else
> + online =
> + 1; /* online entry for CPU 0 needs some special configs */
> +
> + if (online && callback)
> + callback(i, arg1, arg2, arg3, arg4);
> + }
> +}
> +
> +#define BITMASK_SIZE 32
> +static void set_max_cpu_num(void)
> +{
> + FILE *filep;
> + unsigned long dummy;
> +
> + topo_max_cpus = 0;
> + filep = fopen_or_exit(
> + "/sys/devices/system/cpu/cpu0/topology/thread_siblings", "r");
> + while (fscanf(filep, "%lx,", &dummy) == 1)
> + topo_max_cpus += BITMASK_SIZE;
> + fclose(filep);
> + topo_max_cpus--; /* 0 based */
> +
> + debug_printf("max cpus %d\n", topo_max_cpus);
> +}
> +
> +size_t alloc_cpu_set(cpu_set_t **cpu_set)
> +{
> + cpu_set_t *_cpu_set;
> + size_t size;
> +
> + _cpu_set = CPU_ALLOC((topo_max_cpus + 1));
> + if (_cpu_set == NULL)
> + err(3, "CPU_ALLOC");
> + size = CPU_ALLOC_SIZE((topo_max_cpus + 1));
> + CPU_ZERO_S(size, _cpu_set);
> +
> + *cpu_set = _cpu_set;
> + return size;
> +}
> +
> +void free_cpu_set(cpu_set_t *cpu_set)
> +{
> + CPU_FREE(cpu_set);
> +}
> +
> +static int cpu_cnt[MAX_PACKAGE_COUNT][MAX_DIE_PER_PACKAGE];
> +static void set_cpu_present_cpu_mask(void)
> +{
> + size_t size;
> + DIR *dir;
> + int i;
> +
> + size = alloc_cpu_set(&present_cpumask);
> + present_cpumask_size = size;
> + for (i = 0; i < topo_max_cpus; ++i) {
> + char buffer[256];
> +
> + snprintf(buffer, sizeof(buffer),
> + "/sys/devices/system/cpu/cpu%d", i);
> + dir = opendir(buffer);
> + if (dir) {
> + int pkg_id, die_id;
> +
> + CPU_SET_S(i, size, present_cpumask);
> + die_id = get_physical_die_id(i);
> + if (die_id < 0)
> + die_id = 0;
> +
> + pkg_id = get_physical_package_id(i);
> + if (pkg_id < MAX_PACKAGE_COUNT &&
> + die_id < MAX_DIE_PER_PACKAGE)
> + cpu_cnt[pkg_id][die_id]++;
> + }
> + closedir(dir);
> + }
> +}
> +
> +int get_cpu_count(int pkg_id, int die_id)
> +{
> + if (pkg_id < MAX_PACKAGE_COUNT && die_id < MAX_DIE_PER_PACKAGE)
> + return cpu_cnt[pkg_id][die_id] + 1;
> +
> + return 0;
> +}
> +
> +static void set_cpu_target_cpu_mask(void)
> +{
> + size_t size;
> + int i;
> +
> + size = alloc_cpu_set(&target_cpumask);
> + target_cpumask_size = size;
> + for (i = 0; i < max_target_cpus; ++i) {
> + if (!CPU_ISSET_S(target_cpus[i], present_cpumask_size,
> + present_cpumask))
> + continue;
> +
> + CPU_SET_S(target_cpus[i], size, target_cpumask);
> + }
> +}
> +
> +static void create_cpu_map(void)
> +{
> + const char *pathname = "/dev/isst_interface";
> + int i, fd = 0;
> + struct isst_if_cpu_maps map;
> +
> + cpu_map = malloc(sizeof(*cpu_map) * topo_max_cpus);
> + if (!cpu_map)
> + err(3, "cpumap");
> +
> + fd = open(pathname, O_RDWR);
> + if (fd < 0)
> + err(-1, "%s open failed", pathname);
> +
> + for (i = 0; i < topo_max_cpus; ++i) {
> + if (!CPU_ISSET_S(i, present_cpumask_size, present_cpumask))
> + continue;
> +
> + map.cmd_count = 1;
> + map.cpu_map[0].logical_cpu = i;
> +
> + debug_printf(" map logical_cpu:%d\n",
> + map.cpu_map[0].logical_cpu);
> + if (ioctl(fd, ISST_IF_GET_PHY_ID, &map) == -1) {
> + perror("ISST_IF_GET_PHY_ID");
> + fprintf(outf, "Error: map logical_cpu:%d\n",
> + map.cpu_map[0].logical_cpu);
> + continue;
> + }
> + cpu_map[i].core_id = get_physical_core_id(i);
> + cpu_map[i].pkg_id = get_physical_package_id(i);
> + cpu_map[i].die_id = get_physical_die_id(i);
> + cpu_map[i].punit_cpu = map.cpu_map[0].physical_cpu;
> + cpu_map[i].punit_cpu_core = (map.cpu_map[0].physical_cpu >>
> + 1); // shift to get core id
> +
> + debug_printf(
> + "map logical_cpu:%d core: %d die:%d pkg:%d punit_cpu:%d punit_core:%d\n",
> + i, cpu_map[i].core_id, cpu_map[i].die_id,
> + cpu_map[i].pkg_id, cpu_map[i].punit_cpu,
> + cpu_map[i].punit_cpu_core);
> + }
> +
> + if (fd)
> + close(fd);
> +}
> +
> +int find_logical_cpu(int pkg_id, int die_id, int punit_core_id)
> +{
> + int i;
> +
> + for (i = 0; i < topo_max_cpus; ++i) {
> + if (cpu_map[i].pkg_id == pkg_id &&
> + cpu_map[i].die_id == die_id &&
> + cpu_map[i].punit_cpu_core == punit_core_id)
> + return i;
> + }
> +
> + return -EINVAL;
> +}
> +
> +void set_cpu_mask_from_punit_coremask(int cpu, unsigned long long core_mask,
> + size_t core_cpumask_size,
> + cpu_set_t *core_cpumask, int *cpu_cnt)
> +{
> + int i, cnt = 0;
> + int die_id, pkg_id;
> +
> + *cpu_cnt = 0;
> + die_id = get_physical_die_id(cpu);
> + pkg_id = get_physical_package_id(cpu);
> +
> + for (i = 0; i < 64; ++i) {
> + if (core_mask & BIT(i)) {
> + int j;
> +
> + for (j = 0; j < topo_max_cpus; ++j) {
> + if (cpu_map[j].pkg_id == pkg_id &&
> + cpu_map[j].die_id == die_id &&
> + cpu_map[j].punit_cpu_core == i) {
> + CPU_SET_S(j, core_cpumask_size,
> + core_cpumask);
> + ++cnt;
> + }
> + }
> + }
> + }
> +
> + *cpu_cnt = cnt;
> +}
> +
> +int find_phy_core_num(int logical_cpu)
> +{
> + if (logical_cpu < topo_max_cpus)
> + return cpu_map[logical_cpu].punit_cpu_core;
> +
> + return -EINVAL;
> +}
> +
> +static int isst_send_mmio_command(unsigned int cpu, unsigned int reg, int write,
> + unsigned int *value)
> +{
> + struct isst_if_io_regs io_regs;
> + const char *pathname = "/dev/isst_interface";
> + int cmd;
> + int fd;
> +
> + debug_printf("mmio_cmd cpu:%d reg:%d write:%d\n", cpu, reg, write);
> +
> + fd = open(pathname, O_RDWR);
> + if (fd < 0)
> + err(-1, "%s open failed", pathname);
> +
> + io_regs.req_count = 1;
> + io_regs.io_reg[0].logical_cpu = cpu;
> + io_regs.io_reg[0].reg = reg;
> + cmd = ISST_IF_IO_CMD;
> + if (write) {
> + io_regs.io_reg[0].read_write = 1;
> + io_regs.io_reg[0].value = *value;
> + } else {
> + io_regs.io_reg[0].read_write = 0;
> + }
> +
> + if (ioctl(fd, cmd, &io_regs) == -1) {
> + perror("ISST_IF_IO_CMD");
> + fprintf(outf, "Error: mmio_cmd cpu:%d reg:%x read_write:%x\n",
> + cpu, reg, write);
> + } else {
> + if (!write)
> + *value = io_regs.io_reg[0].value;
> +
> + debug_printf(
> + "mmio_cmd response: cpu:%d reg:%x rd_write:%x resp:%x\n",
> + cpu, reg, write, *value);
> + }
> +
> + close(fd);
> +
> + return 0;
> +}
> +
> +int isst_send_mbox_command(unsigned int cpu, unsigned char command,
> + unsigned char sub_command, unsigned int parameter,
> + unsigned int req_data, unsigned int *resp)
> +{
> + const char *pathname = "/dev/isst_interface";
> + int fd;
> + struct isst_if_mbox_cmds mbox_cmds = { 0 };
> +
> + debug_printf(
> + "mbox_send: cpu:%d command:%x sub_command:%x parameter:%x req_data:%x\n",
> + cpu, command, sub_command, parameter, req_data);
> +
> + if (isst_platform_info.mmio_supported && command == CONFIG_CLOS) {
> + unsigned int value;
> + int write = 0;
> + int clos_id, core_id, ret = 0;
> +
> + debug_printf("CLOS %d\n", cpu);
> +
> + if (parameter & BIT(MBOX_CMD_WRITE_BIT)) {
> + value = req_data;
> + write = 1;
> + }
> +
> + switch (sub_command) {
> + case CLOS_PQR_ASSOC:
> + core_id = parameter & 0xff;
> + ret = isst_send_mmio_command(
> + cpu, PQR_ASSOC_OFFSET + core_id * 4, write,
> + &value);
> + if (!ret && !write)
> + *resp = value;
> + break;
> + case CLOS_PM_CLOS:
> + clos_id = parameter & 0x03;
> + ret = isst_send_mmio_command(
> + cpu, PM_CLOS_OFFSET + clos_id * 4, write,
> + &value);
> + if (!ret && !write)
> + *resp = value;
> + break;
> + case CLOS_PM_QOS_CONFIG:
> + ret = isst_send_mmio_command(cpu, PM_QOS_CONFIG_OFFSET,
> + write, &value);
> + if (!ret && !write)
> + *resp = value;
> + break;
> + case CLOS_STATUS:
> + break;
> + default:
> + break;
> + }
> + return ret;
> + }
> +
> + mbox_cmds.cmd_count = 1;
> + mbox_cmds.mbox_cmd[0].logical_cpu = cpu;
> + mbox_cmds.mbox_cmd[0].command = command;
> + mbox_cmds.mbox_cmd[0].sub_command = sub_command;
> + mbox_cmds.mbox_cmd[0].parameter = parameter;
> + mbox_cmds.mbox_cmd[0].req_data = req_data;
> +
> + fd = open(pathname, O_RDWR);
> + if (fd < 0)
> + err(-1, "%s open failed", pathname);
> +
> + if (ioctl(fd, ISST_IF_MBOX_COMMAND, &mbox_cmds) == -1) {
> + perror("ISST_IF_MBOX_COMMAND");
> + fprintf(outf,
> + "Error: mbox_cmd cpu:%d command:%x sub_command:%x parameter:%x req_data:%x\n",
> + cpu, command, sub_command, parameter, req_data);
> + } else {
> + *resp = mbox_cmds.mbox_cmd[0].resp_data;
> + debug_printf(
> + "mbox_cmd response: cpu:%d command:%x sub_command:%x parameter:%x req_data:%x resp:%x\n",
> + cpu, command, sub_command, parameter, req_data, *resp);
> + }
> +
> + close(fd);
> +
> + return 0;
> +}
> +
> +int isst_send_msr_command(unsigned int cpu, unsigned int msr, int write,
> + unsigned long long *req_resp)
> +{
> + struct isst_if_msr_cmds msr_cmds;
> + const char *pathname = "/dev/isst_interface";
> + int fd;
> +
> + fd = open(pathname, O_RDWR);
> + if (fd < 0)
> + err(-1, "%s open failed", pathname);
> +
> + msr_cmds.cmd_count = 1;
> + msr_cmds.msr_cmd[0].logical_cpu = cpu;
> + msr_cmds.msr_cmd[0].msr = msr;
> + msr_cmds.msr_cmd[0].read_write = write;
> + if (write)
> + msr_cmds.msr_cmd[0].data = *req_resp;
> +
> + if (ioctl(fd, ISST_IF_MSR_COMMAND, &msr_cmds) == -1) {
> + perror("ISST_IF_MSR_COMMAD");
> + fprintf(outf, "Error: msr_cmd cpu:%d msr:%x read_write:%d\n",
> + cpu, msr, write);
> + } else {
> + if (!write)
> + *req_resp = msr_cmds.msr_cmd[0].data;
> +
> + debug_printf(
> + "msr_cmd response: cpu:%d msr:%x rd_write:%x resp:%llx %llx\n",
> + cpu, msr, write, *req_resp, msr_cmds.msr_cmd[0].data);
> + }
> +
> + close(fd);
> +
> + return 0;
> +}
> +
> +static int isst_fill_platform_info(void)
> +{
> + const char *pathname = "/dev/isst_interface";
> + int fd;
> +
> + fd = open(pathname, O_RDWR);
> + if (fd < 0)
> + err(-1, "%s open failed", pathname);
> +
> + if (ioctl(fd, ISST_IF_GET_PLATFORM_INFO, &isst_platform_info) == -1) {
> + perror("ISST_IF_GET_PLATFORM_INFO");
> + close(fd);
> + return -1;
> + }
> +
> + close(fd);
> +
> + return 0;
> +}
> +
> +static void isst_print_platform_information(void)
> +{
> + struct isst_if_platform_info platform_info;
> + const char *pathname = "/dev/isst_interface";
> + int fd;
> +
> + fd = open(pathname, O_RDWR);
> + if (fd < 0)
> + err(-1, "%s open failed", pathname);
> +
> + if (ioctl(fd, ISST_IF_GET_PLATFORM_INFO, &platform_info) == -1) {
> + perror("ISST_IF_GET_PLATFORM_INFO");
> + } else {
> + fprintf(outf, "Platform: API version : %d\n",
> + platform_info.api_version);
> + fprintf(outf, "Platform: Driver version : %d\n",
> + platform_info.driver_version);
> + fprintf(outf, "Platform: mbox supported : %d\n",
> + platform_info.mbox_supported);
> + fprintf(outf, "Platform: mmio supported : %d\n",
> + platform_info.mmio_supported);
> + }
> +
> + close(fd);
> +
> + exit(0);
> +}
> +
> +static void exec_on_get_ctdp_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int (*fn_ptr)(int cpu, void *arg);
> + int ret;
> +
> + fn_ptr = arg1;
> + ret = fn_ptr(cpu, arg2);
> + if (ret)
> + perror("get_tdp_*");
> + else
> + isst_display_result(cpu, outf, "perf-profile", (char *)arg3,
> + *(unsigned int *)arg4);
> +}
> +
> +#define _get_tdp_level(desc, suffix, object, help) \
> + static void get_tdp_##object(void) \
> + { \
> + struct isst_pkg_ctdp ctdp; \
> +\
> + if (cmd_help) { \
> + fprintf(stderr, \
> + "Print %s [No command arguments are required]\n", \
> + help); \
> + exit(0); \
> + } \
> + isst_ctdp_display_information_start(outf); \
> + if (max_target_cpus) \
> + for_each_online_target_cpu_in_set( \
> + exec_on_get_ctdp_cpu, isst_get_ctdp_##suffix, \
> + &ctdp, desc, &ctdp.object); \
> + else \
> + for_each_online_package_in_set(exec_on_get_ctdp_cpu, \
> + isst_get_ctdp_##suffix, \
> + &ctdp, desc, \
> + &ctdp.object); \
> + isst_ctdp_display_information_end(outf); \
> + }
> +
> +_get_tdp_level("get-config-levels", levels, levels, "TDP levels");
> +_get_tdp_level("get-config-version", levels, version, "TDP version");
> +_get_tdp_level("get-config-enabled", levels, enabled, "TDP enable status");
> +_get_tdp_level("get-config-current_level", levels, current_level,
> + "Current TDP Level");
> +_get_tdp_level("get-lock-status", levels, locked, "TDP lock status");
> +
> +static void dump_isst_config_for_cpu(int cpu, void *arg1, void *arg2,
> + void *arg3, void *arg4)
> +{
> + struct isst_pkg_ctdp pkg_dev;
> + int ret;
> +
> + memset(&pkg_dev, 0, sizeof(pkg_dev));
> + ret = isst_get_process_ctdp(cpu, tdp_level, &pkg_dev);
> + if (ret) {
> + perror("isst_get_process_ctdp");
> + } else {
> + isst_ctdp_display_information(cpu, outf, tdp_level, &pkg_dev);
> + isst_get_process_ctdp_complete(cpu, &pkg_dev);
> + }
> +}
> +
> +static void dump_isst_config(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr,
> + "Print Intel(R) Speed Select Technology Performance profile configuration\n");
> + fprintf(stderr,
> + "including base frequency and turbo frequency configurations\n");
> + fprintf(stderr, "Optional: -l|--level : Specify tdp level\n");
> + fprintf(stderr,
> + "\tIf no arguments, dump information for all TDP levels\n");
> + exit(0);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> +
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(dump_isst_config_for_cpu,
> + NULL, NULL, NULL, NULL);
> + else
> + for_each_online_package_in_set(dump_isst_config_for_cpu, NULL,
> + NULL, NULL, NULL);
> +
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_tdp_level_for_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int ret;
> +
> + ret = isst_set_tdp_level(cpu, tdp_level);
> + if (ret)
> + perror("set_tdp_level_for_cpu");
> + else
> + isst_display_result(cpu, outf, "perf-profile", "set_tdp_level",
> + ret);
> +}
> +
> +static void set_tdp_level(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr, "Set Config TDP level\n");
> + fprintf(stderr,
> + "\t Arguments: -l|--level : Specify tdp level\n");
> + exit(0);
> + }
> +
> + if (tdp_level == 0xff) {
> + fprintf(outf, "Invalid command: specify tdp_level\n");
> + exit(1);
> + }
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(set_tdp_level_for_cpu, NULL,
> + NULL, NULL, NULL);
> + else
> + for_each_online_package_in_set(set_tdp_level_for_cpu, NULL,
> + NULL, NULL, NULL);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void dump_pbf_config_for_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + struct isst_pbf_info pbf_info;
> + int ret;
> +
> + ret = isst_get_pbf_info(cpu, tdp_level, &pbf_info);
> + if (ret) {
> + perror("isst_get_pbf_info");
> + } else {
> + isst_pbf_display_information(cpu, outf, tdp_level, &pbf_info);
> + isst_get_pbf_info_complete(&pbf_info);
> + }
> +}
> +
> +static void dump_pbf_config(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr,
> + "Print Intel(R) Speed Select Technology base frequency configuration for a TDP level\n");
> + fprintf(stderr,
> + "\tArguments: -l|--level : Specify tdp level\n");
> + exit(0);
> + }
> +
> + if (tdp_level == 0xff) {
> + fprintf(outf, "Invalid command: specify tdp_level\n");
> + exit(1);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(dump_pbf_config_for_cpu, NULL,
> + NULL, NULL, NULL);
> + else
> + for_each_online_package_in_set(dump_pbf_config_for_cpu, NULL,
> + NULL, NULL, NULL);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_pbf_for_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int ret;
> + int status = *(int *)arg4;
> +
> + ret = isst_set_pbf_fact_status(cpu, 1, status);
> + if (ret) {
> + perror("isst_set_pbf");
> + } else {
> + if (status)
> + isst_display_result(cpu, outf, "base-freq", "enable",
> + ret);
> + else
> + isst_display_result(cpu, outf, "base-freq", "disable",
> + ret);
> + }
> +}
> +
> +static void set_pbf_enable(void)
> +{
> + int status = 1;
> +
> + if (cmd_help) {
> + fprintf(stderr,
> + "Enable Intel Speed Select Technology base frequency feature [No command arguments are required]\n");
> + exit(0);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(set_pbf_for_cpu, NULL, NULL,
> + NULL, &status);
> + else
> + for_each_online_package_in_set(set_pbf_for_cpu, NULL, NULL,
> + NULL, &status);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_pbf_disable(void)
> +{
> + int status = 0;
> +
> + if (cmd_help) {
> + fprintf(stderr,
> + "Disable Intel Speed Select Technology base frequency feature [No command arguments are required]\n");
> + exit(0);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(set_pbf_for_cpu, NULL, NULL,
> + NULL, &status);
> + else
> + for_each_online_package_in_set(set_pbf_for_cpu, NULL, NULL,
> + NULL, &status);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void dump_fact_config_for_cpu(int cpu, void *arg1, void *arg2,
> + void *arg3, void *arg4)
> +{
> + struct isst_fact_info fact_info;
> + int ret;
> +
> + ret = isst_get_fact_info(cpu, tdp_level, &fact_info);
> + if (ret)
> + perror("isst_get_fact_bucket_info");
> + else
> + isst_fact_display_information(cpu, outf, tdp_level, fact_bucket,
> + fact_avx, &fact_info);
> +}
> +
> +static void dump_fact_config(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr,
> + "Print complete Intel Speed Select Technology turbo frequency configuration for a TDP level. Other arguments are optional.\n");
> + fprintf(stderr,
> + "\tArguments: -l|--level : Specify tdp level\n");
> + fprintf(stderr,
> + "\tArguments: -b|--bucket : Bucket index to dump\n");
> + fprintf(stderr,
> + "\tArguments: -r|--trl-type : Specify trl type: sse|avx2|avx512\n");
> + exit(0);
> + }
> +
> + if (tdp_level == 0xff) {
> + fprintf(outf, "Invalid command: specify tdp_level\n");
> + exit(1);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(dump_fact_config_for_cpu,
> + NULL, NULL, NULL, NULL);
> + else
> + for_each_online_package_in_set(dump_fact_config_for_cpu, NULL,
> + NULL, NULL, NULL);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_fact_for_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int ret;
> + int status = *(int *)arg4;
> +
> + ret = isst_set_pbf_fact_status(cpu, 0, status);
> + if (ret)
> + perror("isst_set_fact");
> + else {
> + if (status) {
> + struct isst_pkg_ctdp pkg_dev;
> +
> + ret = isst_get_ctdp_levels(cpu, &pkg_dev);
> + if (ret) {
> + isst_display_result(cpu, outf, "turbo-freq",
> + "enable", ret);
> + return;
> + }
> + ret = isst_set_trl(cpu, fact_trl);
> + isst_display_result(cpu, outf, "turbo-freq", "enable",
> + ret);
> + } else {
> + /* Since we modified TRL during Fact enable, restore it */
> + isst_set_trl_from_current_tdp(cpu, fact_trl);
> + isst_display_result(cpu, outf, "turbo-freq", "disable",
> + ret);
> + }
> + }
> +}
> +
> +static void set_fact_enable(void)
> +{
> + int status = 1;
> +
> + if (cmd_help) {
> + fprintf(stderr,
> + "Enable Intel Speed Select Technology Turbo frequency feature\n");
> + fprintf(stderr,
> + "Optional: -t|--trl : Specify turbo ratio limit\n");
> + exit(0);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(set_fact_for_cpu, NULL, NULL,
> + NULL, &status);
> + else
> + for_each_online_package_in_set(set_fact_for_cpu, NULL, NULL,
> + NULL, &status);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_fact_disable(void)
> +{
> + int status = 0;
> +
> + if (cmd_help) {
> + fprintf(stderr,
> + "Disable Intel Speed Select Technology turbo frequency feature\n");
> + fprintf(stderr,
> + "Optional: -t|--trl : Specify turbo ratio limit\n");
> + exit(0);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(set_fact_for_cpu, NULL, NULL,
> + NULL, &status);
> + else
> + for_each_online_package_in_set(set_fact_for_cpu, NULL, NULL,
> + NULL, &status);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void enable_clos_qos_config(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int ret;
> + int status = *(int *)arg4;
> +
> + ret = isst_pm_qos_config(cpu, status, clos_priority_type);
> + if (ret) {
> + perror("isst_pm_qos_config");
> + } else {
> + if (status)
> + isst_display_result(cpu, outf, "core-power", "enable",
> + ret);
> + else
> + isst_display_result(cpu, outf, "core-power", "disable",
> + ret);
> + }
> +}
> +
> +static void set_clos_enable(void)
> +{
> + int status = 1;
> +
> + if (cmd_help) {
> + fprintf(stderr, "Enable core-power for a package/die\n");
> + fprintf(stderr,
> + "\tClos Enable: Specify priority type with [--priority|-p]\n");
> + fprintf(stderr, "\t\t 0: Proportional, 1: Ordered\n");
> + exit(0);
> + }
> +
> + if (cpufreq_sysfs_present()) {
> + fprintf(stderr,
> + "cpufreq subsystem and core-power enable will interfere with each other!\n");
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(enable_clos_qos_config, NULL,
> + NULL, NULL, &status);
> + else
> + for_each_online_package_in_set(enable_clos_qos_config, NULL,
> + NULL, NULL, &status);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_clos_disable(void)
> +{
> + int status = 0;
> +
> + if (cmd_help) {
> + fprintf(stderr,
> + "Disable core-power: [No command arguments are required]\n");
> + exit(0);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(enable_clos_qos_config, NULL,
> + NULL, NULL, &status);
> + else
> + for_each_online_package_in_set(enable_clos_qos_config, NULL,
> + NULL, NULL, &status);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void dump_clos_config_for_cpu(int cpu, void *arg1, void *arg2,
> + void *arg3, void *arg4)
> +{
> + struct isst_clos_config clos_config;
> + int ret;
> +
> + ret = isst_pm_get_clos(cpu, current_clos, &clos_config);
> + if (ret)
> + perror("isst_pm_get_clos");
> + else
> + isst_clos_display_information(cpu, outf, current_clos,
> + &clos_config);
> +}
> +
> +static void dump_clos_config(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr,
> + "Print Intel Speed Select Technology core power configuration\n");
> + fprintf(stderr,
> + "\tArguments: [-c | --clos]: Specify clos id\n");
> + exit(0);
> + }
> + if (current_clos < 0 || current_clos > 3) {
> + fprintf(stderr, "Invalid clos id\n");
> + exit(0);
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(dump_clos_config_for_cpu,
> + NULL, NULL, NULL, NULL);
> + else
> + for_each_online_package_in_set(dump_clos_config_for_cpu, NULL,
> + NULL, NULL, NULL);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_clos_config_for_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + struct isst_clos_config clos_config;
> + int ret;
> +
> + clos_config.pkg_id = get_physical_package_id(cpu);
> + clos_config.die_id = get_physical_die_id(cpu);
> +
> + clos_config.epp = clos_epp;
> + clos_config.clos_prop_prio = clos_prop_prio;
> + clos_config.clos_min = clos_min;
> + clos_config.clos_max = clos_max;
> + clos_config.clos_desired = clos_desired;
> + ret = isst_set_clos(cpu, current_clos, &clos_config);
> + if (ret)
> + perror("isst_set_clos");
> + else
> + isst_display_result(cpu, outf, "core-power", "config", ret);
> +}
> +
> +static void set_clos_config(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr,
> + "Set core-power configuration for one of the four clos ids\n");
> + fprintf(stderr,
> + "\tSpecify targeted clos id with [--clos|-c]\n");
> + fprintf(stderr, "\tSpecify clos EPP with [--epp|-e]\n");
> + fprintf(stderr,
> + "\tSpecify clos Proportional Priority [--weight|-w]\n");
> + fprintf(stderr, "\tSpecify clos min with [--min|-n]\n");
> + fprintf(stderr, "\tSpecify clos max with [--max|-m]\n");
> + fprintf(stderr, "\tSpecify clos desired with [--desired|-d]\n");
> + exit(0);
> + }
> +
> + if (current_clos < 0 || current_clos > 3) {
> + fprintf(stderr, "Invalid clos id\n");
> + exit(0);
> + }
> + if (clos_epp < 0 || clos_epp > 0x0F) {
> + fprintf(stderr, "clos epp is not specified, default: 0\n");
> + clos_epp = 0;
> + }
> + if (clos_prop_prio < 0 || clos_prop_prio > 0x0F) {
> + fprintf(stderr,
> + "clos frequency weight is not specified, default: 0\n");
> + clos_prop_prio = 0;
> + }
> + if (clos_min < 0) {
> + fprintf(stderr, "clos min is not specified, default: 0\n");
> + clos_min = 0;
> + }
> + if (clos_max < 0) {
> + fprintf(stderr, "clos max is not specified, default: 0xff\n");
> + clos_max = 0xff;
> + }
> + if (clos_desired < 0) {
> + fprintf(stderr, "clos desired is not specified, default: 0\n");
> + clos_desired = 0x00;
> + }
> +
> + isst_ctdp_display_information_start(outf);
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(set_clos_config_for_cpu, NULL,
> + NULL, NULL, NULL);
> + else
> + for_each_online_package_in_set(set_clos_config_for_cpu, NULL,
> + NULL, NULL, NULL);
> + isst_ctdp_display_information_end(outf);
> +}
> +
> +static void set_clos_assoc_for_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int ret;
> +
> + ret = isst_clos_associate(cpu, current_clos);
> + if (ret)
> + perror("isst_clos_associate");
> + else
> + isst_display_result(cpu, outf, "core-power", "assoc", ret);
> +}
> +
> +static void set_clos_assoc(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr, "Associate a clos id to a CPU\n");
> + fprintf(stderr,
> + "\tSpecify targeted clos id with [--clos|-c]\n");
> + exit(0);
> + }
> +
> + if (current_clos < 0 || current_clos > 3) {
> + fprintf(stderr, "Invalid clos id\n");
> + exit(0);
> + }
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(set_clos_assoc_for_cpu, NULL,
> + NULL, NULL, NULL);
> + else {
> + fprintf(stderr,
> + "Invalid target cpu. Specify with [-c|--cpu]\n");
> + }
> +}
> +
> +static void get_clos_assoc_for_cpu(int cpu, void *arg1, void *arg2, void *arg3,
> + void *arg4)
> +{
> + int clos, ret;
> +
> + ret = isst_clos_get_assoc_status(cpu, &clos);
> + if (ret)
> + perror("isst_clos_get_assoc_status");
> + else
> + isst_display_result(cpu, outf, "core-power", "get-assoc", clos);
> +}
> +
> +static void get_clos_assoc(void)
> +{
> + if (cmd_help) {
> + fprintf(stderr, "Get associate clos id to a CPU\n");
> + fprintf(stderr, "\tSpecify targeted cpu id with [--cpu|-c]\n");
> + exit(0);
> + }
> + if (max_target_cpus)
> + for_each_online_target_cpu_in_set(get_clos_assoc_for_cpu, NULL,
> + NULL, NULL, NULL);
> + else {
> + fprintf(stderr,
> + "Invalid target cpu. Specify with [-c|--cpu]\n");
> + }
> +}
> +
> +static struct process_cmd_struct isst_cmds[] = {
> + { "perf-profile", "get-lock-status", get_tdp_locked },
> + { "perf-profile", "get-config-levels", get_tdp_levels },
> + { "perf-profile", "get-config-version", get_tdp_version },
> + { "perf-profile", "get-config-enabled", get_tdp_enabled },
> + { "perf-profile", "get-config-current-level", get_tdp_current_level },
> + { "perf-profile", "set-config-level", set_tdp_level },
> + { "perf-profile", "info", dump_isst_config },
> + { "base-freq", "info", dump_pbf_config },
> + { "base-freq", "enable", set_pbf_enable },
> + { "base-freq", "disable", set_pbf_disable },
> + { "turbo-freq", "info", dump_fact_config },
> + { "turbo-freq", "enable", set_fact_enable },
> + { "turbo-freq", "disable", set_fact_disable },
> + { "core-power", "info", dump_clos_config },
> + { "core-power", "enable", set_clos_enable },
> + { "core-power", "disable", set_clos_disable },
> + { "core-power", "config", set_clos_config },
> + { "core-power", "assoc", set_clos_assoc },
> + { "core-power", "get-assoc", get_clos_assoc },
> + { NULL, NULL, NULL }
> +};
> +
> +/*
> + * parse cpuset with following syntax
> + * 1,2,4..6,8-10 and set bits in cpu_subset
> + */
> +void parse_cpu_command(char *optarg)
> +{
> + unsigned int start, end;
> + char *next;
> +
> + next = optarg;
> +
> + while (next && *next) {
> + if (*next == '-') /* no negative cpu numbers */
> + goto error;
> +
> + start = strtoul(next, &next, 10);
> +
> + if (max_target_cpus < MAX_CPUS_IN_ONE_REQ)
> + target_cpus[max_target_cpus++] = start;
> +
> + if (*next == '\0')
> + break;
> +
> + if (*next == ',') {
> + next += 1;
> + continue;
> + }
> +
> + if (*next == '-') {
> + next += 1; /* start range */
> + } else if (*next == '.') {
> + next += 1;
> + if (*next == '.')
> + next += 1; /* start range */
> + else
> + goto error;
> + }
> +
> + end = strtoul(next, &next, 10);
> + if (end <= start)
> + goto error;
> +
> + while (++start <= end) {
> + if (max_target_cpus < MAX_CPUS_IN_ONE_REQ)
> + target_cpus[max_target_cpus++] = start;
> + }
> +
> + if (*next == ',')
> + next += 1;
> + else if (*next != '\0')
> + goto error;
> + }
> +
> +#ifdef DEBUG
> + {
> + int i;
> +
> + for (i = 0; i < max_target_cpus; ++i)
> + printf("cpu [%d] in arg\n", target_cpus[i]);
> + }
> +#endif
> + return;
> +
> +error:
> + fprintf(stderr, "\"--cpu %s\" malformed\n", optarg);
> + exit(-1);
> +}
> +
> +static void parse_cmd_args(int argc, int start, char **argv)
> +{
> + int opt;
> + int option_index;
> +
> + static struct option long_options[] = {
> + { "bucket", required_argument, 0, 'b' },
> + { "level", required_argument, 0, 'l' },
> + { "trl-type", required_argument, 0, 'r' },
> + { "trl", required_argument, 0, 't' },
> + { "help", no_argument, 0, 'h' },
> + { "clos", required_argument, 0, 'c' },
> + { "desired", required_argument, 0, 'd' },
> + { "epp", required_argument, 0, 'e' },
> + { "min", required_argument, 0, 'n' },
> + { "max", required_argument, 0, 'm' },
> + { "priority", required_argument, 0, 'p' },
> + { "weight", required_argument, 0, 'w' },
> + { 0, 0, 0, 0 }
> + };
> +
> + option_index = start;
> +
> + optind = start + 1;
> + while ((opt = getopt_long(argc, argv, "b:l:t:c:d:e:n:m:p:w:h",
> + long_options, &option_index)) != -1) {
> + switch (opt) {
> + case 'b':
> + fact_bucket = atoi(optarg);
> + break;
> + case 'h':
> + cmd_help = 1;
> + break;
> + case 'l':
> + tdp_level = atoi(optarg);
> + break;
> + case 't':
> + sscanf(optarg, "0x%llx", &fact_trl);
> + break;
> + case 'r':
> + if (!strncmp(optarg, "sse", 3)) {
> + fact_avx = 0x01;
> + } else if (!strncmp(optarg, "avx2", 4)) {
> + fact_avx = 0x02;
> + } else if (!strncmp(optarg, "avx512", 4)) {
> + fact_avx = 0x04;
> + } else {
> + fprintf(outf, "Invalid sse,avx options\n");
> + exit(1);
> + }
> + break;
> + /* CLOS related */
> + case 'c':
> + current_clos = atoi(optarg);
> + printf("clos %d\n", current_clos);
> + break;
> + case 'd':
> + clos_desired = atoi(optarg);
> + break;
> + case 'e':
> + clos_epp = atoi(optarg);
> + break;
> + case 'n':
> + clos_min = atoi(optarg);
> + break;
> + case 'm':
> + clos_max = atoi(optarg);
> + break;
> + case 'p':
> + clos_priority_type = atoi(optarg);
> + break;
> + case 'w':
> + clos_prop_prio = atoi(optarg);
> + break;
> + default:
> + printf("no match\n");
> + }
> + }
> +}
> +
> +static void isst_help(void)
> +{
> + printf("perf-profile:\tAn architectural mechanism that allows multiple optimized \n\
> + performance profiles per system via static and/or dynamic\n\
> + adjustment of core count, workload, Tjmax, and\n\
> + TDP, etc.\n");
> + printf("\nCommands : For feature=perf-profile\n");
> + printf("\tinfo\n");
> + printf("\tget-lock-status\n");
> + printf("\tget-config-levels\n");
> + printf("\tget-config-version\n");
> + printf("\tget-config-enabled\n");
> + printf("\tget-config-current-level\n");
> + printf("\tset-config-level\n");
> +}
> +
> +static void pbf_help(void)
> +{
> + printf("base-freq:\tEnables users to increase guaranteed base frequency\n\
> + on certain cores (high priority cores) in exchange for lower\n\
> + base frequency on remaining cores (low priority cores).\n");
> + printf("\tcommand : info\n");
> + printf("\tcommand : enable\n");
> + printf("\tcommand : disable\n");
> +}
> +
> +static void fact_help(void)
> +{
> + printf("turbo-freq:\tEnables the ability to set different turbo ratio\n\
> + limits to cores based on priority.\n");
> + printf("\nCommand: For feature=turbo-freq\n");
> + printf("\tcommand : info\n");
> + printf("\tcommand : enable\n");
> + printf("\tcommand : disable\n");
> +}
> +
> +static void core_power_help(void)
> +{
> + printf("core-power:\tInterface that allows user to define per core/tile\n\
> + priority.\n");
> + printf("\nCommands : For feature=core-power\n");
> + printf("\tinfo\n");
> + printf("\tenable\n");
> + printf("\tdisable\n");
> + printf("\tconfig\n");
> + printf("\tassoc\n");
> + printf("\tget-assoc\n");
> +}
> +
> +struct process_cmd_help_struct {
> + char *feature;
> + void (*process_fn)(void);
> +};
> +
> +static struct process_cmd_help_struct isst_help_cmds[] = {
> + { "perf-profile", isst_help },
> + { "base-freq", pbf_help },
> + { "turbo-freq", fact_help },
> + { "core-power", core_power_help },
> + { NULL, NULL }
> +};
> +
> +void process_command(int argc, char **argv)
> +{
> + int i = 0, matched = 0;
> + char *feature = argv[optind];
> + char *cmd = argv[optind + 1];
> +
> + if (!feature || !cmd)
> + return;
> +
> + debug_printf("feature name [%s] command [%s]\n", feature, cmd);
> + if (!strcmp(cmd, "-h") || !strcmp(cmd, "--help")) {
> + while (isst_help_cmds[i].feature) {
> + if (!strcmp(isst_help_cmds[i].feature, feature)) {
> + isst_help_cmds[i].process_fn();
> + exit(0);
> + }
> + ++i;
> + }
> + }
> +
> + create_cpu_map();
> +
> + i = 0;
> + while (isst_cmds[i].feature) {
> + if (!strcmp(isst_cmds[i].feature, feature) &&
> + !strcmp(isst_cmds[i].command, cmd)) {
> + parse_cmd_args(argc, optind + 1, argv);
> + isst_cmds[i].process_fn();
> + matched = 1;
> + break;
> + }
> + ++i;
> + }
> +
> + if (!matched)
> + fprintf(stderr, "Invalid command\n");
> +}
> +
> +static void usage(void)
> +{
> + printf("Intel(R) Speed Select Technology\n");
> + printf("\nUsage:\n");
> + printf("intel-speed-select [OPTIONS] FEATURE COMMAND COMMAND_ARGUMENTS\n");
> + printf("\nUse this tool to enumerate and control the Intel Speed Select Technology features,\n");
> + printf("\nFEATURE : [perf-profile|base-freq|turbo-freq|core-power]\n");
> + printf("\nFor help on each feature, use --h|--help\n");
> + printf("\tFor example: intel-speed-select perf-profile -h\n");
> +
> + printf("\nFor additional help on each command for a feature, use --h|--help\n");
> + printf("\tFor example: intel-speed-select perf-profile get-lock-status -h\n");
> + printf("\t\t This will print help for the command \"get-lock-status\" for the feature \"perf-profile\"\n");
> +
> + printf("\nOPTIONS\n");
> + printf("\t[-c|--cpu] : logical cpu number\n");
> + printf("\t\tDefault: Die scoped for all dies in the system with multiple dies/package\n");
> + printf("\t\t\t Or Package scoped for all Packages when each package contains one die\n");
> + printf("\t[-d|--debug] : Debug mode\n");
> + printf("\t[-h|--help] : Print help\n");
> + printf("\t[-i|--info] : Print platform information\n");
> + printf("\t[-o|--out] : Output file\n");
> + printf("\t\t\tDefault : stderr\n");
> + printf("\t[-f|--format] : output format [json|text]. Default: text\n");
> + printf("\t[-v|--version] : Print version\n");
> +
> + printf("\nResult format\n");
> + printf("\tResult display uses a common format for each command:\n");
> + printf("\tResults are formatted in text/JSON with\n");
> + printf("\t\tPackage, Die, CPU, and command specific results.\n");
> + printf("\t\t\tFor Set commands, status is 0 for success and rest for failures\n");
> + exit(1);
> +}
> +
> +static void print_version(void)
> +{
> + fprintf(outf, "Version %s\n", version_str);
> + fprintf(outf, "Build date %s time %s\n", __DATE__, __TIME__);
> + exit(0);
> +}
> +
> +static void cmdline(int argc, char **argv)
> +{
> + int opt;
> + int option_index = 0;
> +
> + static struct option long_options[] = {
> + { "cpu", required_argument, 0, 'c' },
> + { "debug", no_argument, 0, 'd' },
> + { "format", required_argument, 0, 'f' },
> + { "help", no_argument, 0, 'h' },
> + { "info", no_argument, 0, 'i' },
> + { "out", required_argument, 0, 'o' },
> + { "version", no_argument, 0, 'v' },
> + { 0, 0, 0, 0 }
> + };
> +
> + progname = argv[0];
> + while ((opt = getopt_long_only(argc, argv, "+c:df:hio:v", long_options,
> + &option_index)) != -1) {
> + switch (opt) {
> + case 'c':
> + parse_cpu_command(optarg);
> + break;
> + case 'd':
> + debug_flag = 1;
> + printf("Debug Mode ON\n");
> + break;
> + case 'f':
> + if (!strncmp(optarg, "json", 4))
> + out_format_json = 1;
> + break;
> + case 'h':
> + usage();
> + break;
> + case 'i':
> + isst_print_platform_information();
> + break;
> + case 'o':
> + if (outf)
> + fclose(outf);
> + outf = fopen_or_exit(optarg, "w");
> + break;
> + case 'v':
> + print_version();
> + break;
> + default:
> + usage();
> + }
> + }
> +
> + if (geteuid() != 0) {
> + fprintf(stderr, "Must run as root\n");
> + exit(0);
> + }
> +
> + if (optind > (argc - 2)) {
> + fprintf(stderr, "Feature name and|or command not specified\n");
> + exit(0);
> + }
> + update_cpu_model();
> + printf("Intel(R) Speed Select Technology\n");
> + printf("Executing on CPU model:%d[0x%x]\n", cpu_model, cpu_model);
> + set_max_cpu_num();
> + set_cpu_present_cpu_mask();
> + set_cpu_target_cpu_mask();
> + isst_fill_platform_info();
> + if (isst_platform_info.api_version > supported_api_ver) {
> + printf("Incompatible API versions; Upgrade of tool is required\n");
> + exit(0);
> + }
> +
> + process_command(argc, argv);
> +}
> +
> +int main(int argc, char **argv)
> +{
> + outf = stderr;
> + cmdline(argc, argv);
> + return 0;
> +}
> diff --git a/tools/power/x86/intel-speed-select/isst-core.c b/tools/power/x86/intel-speed-select/isst-core.c
> new file mode 100644
> index 000000000000..8de4ac39a008
> --- /dev/null
> +++ b/tools/power/x86/intel-speed-select/isst-core.c
> @@ -0,0 +1,721 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel Speed Select -- Enumerate and control features
> + * Copyright (c) 2019 Intel Corporation.
> + */
> +
> +#include "isst.h"
> +
> +int isst_get_ctdp_levels(int cpu, struct isst_pkg_ctdp *pkg_dev)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_LEVELS_INFO, 0, 0, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CONFIG_TDP_GET_LEVELS_INFO resp:%x\n", cpu, resp);
> +
> + pkg_dev->version = resp & 0xff;
> + pkg_dev->levels = (resp >> 8) & 0xff;
> + pkg_dev->current_level = (resp >> 16) & 0xff;
> + pkg_dev->locked = !!(resp & BIT(24));
> + pkg_dev->enabled = !!(resp & BIT(31));
> +
> + return 0;
> +}
> +
> +int isst_get_ctdp_control(int cpu, int config_index,
> + struct isst_pkg_ctdp_level_info *ctdp_level)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_TDP_CONTROL, 0,
> + config_index, &resp);
> + if (ret)
> + return ret;
> +
> + ctdp_level->fact_support = resp & BIT(0);
> + ctdp_level->pbf_support = !!(resp & BIT(1));
> + ctdp_level->fact_enabled = !!(resp & BIT(16));
> + ctdp_level->pbf_enabled = !!(resp & BIT(17));
> +
> + debug_printf(
> + "cpu:%d CONFIG_TDP_GET_TDP_CONTROL resp:%x fact_support:%d pbf_support: %d fact_enabled:%d pbf_enabled:%d\n",
> + cpu, resp, ctdp_level->fact_support, ctdp_level->pbf_support,
> + ctdp_level->fact_enabled, ctdp_level->pbf_enabled);
> +
> + return 0;
> +}
> +
> +int isst_get_tdp_info(int cpu, int config_index,
> + struct isst_pkg_ctdp_level_info *ctdp_level)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP, CONFIG_TDP_GET_TDP_INFO,
> + 0, config_index, &resp);
> + if (ret)
> + return ret;
> +
> + ctdp_level->pkg_tdp = resp & GENMASK(14, 0);
> + ctdp_level->tdp_ratio = (resp & GENMASK(23, 16)) >> 16;
> +
> + debug_printf(
> + "cpu:%d ctdp:%d CONFIG_TDP_GET_TDP_INFO resp:%x tdp_ratio:%d pkg_tdp:%d\n",
> + cpu, config_index, resp, ctdp_level->tdp_ratio,
> + ctdp_level->pkg_tdp);
> + return 0;
> +}
> +
> +int isst_get_pwr_info(int cpu, int config_index,
> + struct isst_pkg_ctdp_level_info *ctdp_level)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP, CONFIG_TDP_GET_PWR_INFO,
> + 0, config_index, &resp);
> + if (ret)
> + return ret;
> +
> + ctdp_level->pkg_max_power = resp & GENMASK(14, 0);
> + ctdp_level->pkg_min_power = (resp & GENMASK(30, 16)) >> 16;
> +
> + debug_printf(
> + "cpu:%d ctdp:%d CONFIG_TDP_GET_PWR_INFO resp:%x pkg_max_power:%d pkg_min_power:%d\n",
> + cpu, config_index, resp, ctdp_level->pkg_max_power,
> + ctdp_level->pkg_min_power);
> +
> + return 0;
> +}
> +
> +int isst_get_tjmax_info(int cpu, int config_index,
> + struct isst_pkg_ctdp_level_info *ctdp_level)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP, CONFIG_TDP_GET_TJMAX_INFO,
> + 0, config_index, &resp);
> + if (ret)
> + return ret;
> +
> + ctdp_level->t_proc_hot = resp & GENMASK(7, 0);
> +
> + debug_printf(
> + "cpu:%d ctdp:%d CONFIG_TDP_GET_TJMAX_INFO resp:%x t_proc_hot:%d\n",
> + cpu, config_index, resp, ctdp_level->t_proc_hot);
> +
> + return 0;
> +}
> +
> +int isst_get_coremask_info(int cpu, int config_index,
> + struct isst_pkg_ctdp_level_info *ctdp_level)
> +{
> + unsigned int resp;
> + int i, ret;
> +
> + ctdp_level->cpu_count = 0;
> + for (i = 0; i < 2; ++i) {
> + unsigned long long mask;
> + int cpu_count = 0;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_CORE_MASK, 0,
> + (i << 8) | config_index, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf(
> + "cpu:%d ctdp:%d mask:%d CONFIG_TDP_GET_CORE_MASK resp:%x\n",
> + cpu, config_index, i, resp);
> +
> + mask = (unsigned long long)resp << (32 * i);
> + set_cpu_mask_from_punit_coremask(cpu, mask,
> + ctdp_level->core_cpumask_size,
> + ctdp_level->core_cpumask,
> + &cpu_count);
> + ctdp_level->cpu_count += cpu_count;
> + debug_printf("cpu:%d ctdp:%d mask:%d cpu count:%d\n", cpu,
> + config_index, i, ctdp_level->cpu_count);
> + }
> +
> + return 0;
> +}
> +
> +int isst_get_get_trl(int cpu, int level, int avx_level, int *trl)
> +{
> + unsigned int req, resp;
> + int ret;
> +
> + req = level | (avx_level << 16);
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_TURBO_LIMIT_RATIOS, 0, req,
> + &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf(
> + "cpu:%d CONFIG_TDP_GET_TURBO_LIMIT_RATIOS req:%x resp:%x\n",
> + cpu, req, resp);
> +
> + trl[0] = resp & GENMASK(7, 0);
> + trl[1] = (resp & GENMASK(15, 8)) >> 8;
> + trl[2] = (resp & GENMASK(23, 16)) >> 16;
> + trl[3] = (resp & GENMASK(31, 24)) >> 24;
> +
> + req = level | BIT(8) | (avx_level << 16);
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_TURBO_LIMIT_RATIOS, 0, req,
> + &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CONFIG_TDP_GET_TURBO_LIMIT req:%x resp:%x\n", cpu,
> + req, resp);
> +
> + trl[4] = resp & GENMASK(7, 0);
> + trl[5] = (resp & GENMASK(15, 8)) >> 8;
> + trl[6] = (resp & GENMASK(23, 16)) >> 16;
> + trl[7] = (resp & GENMASK(31, 24)) >> 24;
> +
> + return 0;
> +}
> +
> +int isst_set_tdp_level_msr(int cpu, int tdp_level)
> +{
> + int ret;
> +
> + debug_printf("cpu: tdp_level via MSR %d\n", cpu, tdp_level);
> +
> + if (isst_get_config_tdp_lock_status(cpu)) {
> + debug_printf("cpu: tdp_locked %d\n", cpu);
> + return -1;
> + }
> +
> + if (tdp_level > 2)
> + return -1; /* invalid value */
> +
> + ret = isst_send_msr_command(cpu, 0x64b, 1,
> + (unsigned long long *)&tdp_level);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu: tdp_level via MSR successful %d\n", cpu, tdp_level);
> +
> + return 0;
> +}
> +
> +int isst_set_tdp_level(int cpu, int tdp_level)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP, CONFIG_TDP_SET_LEVEL, 0,
> + tdp_level, &resp);
> + if (ret)
> + return isst_set_tdp_level_msr(cpu, tdp_level);
> +
> + return 0;
> +}
> +
> +int isst_get_pbf_info(int cpu, int level, struct isst_pbf_info *pbf_info)
> +{
> + unsigned int req, resp;
> + int i, ret;
> +
> + pbf_info->core_cpumask_size = alloc_cpu_set(&pbf_info->core_cpumask);
> +
> + for (i = 0; i < 2; ++i) {
> + unsigned long long mask;
> + int count;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_PBF_GET_CORE_MASK_INFO,
> + 0, (i << 8) | level, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf(
> + "cpu:%d CONFIG_TDP_PBF_GET_CORE_MASK_INFO resp:%x\n",
> + cpu, resp);
> +
> + mask = (unsigned long long)resp << (32 * i);
> + set_cpu_mask_from_punit_coremask(cpu, mask,
> + pbf_info->core_cpumask_size,
> + pbf_info->core_cpumask,
> + &count);
> + }
> +
> + req = level;
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_PBF_GET_P1HI_P1LO_INFO, 0, req,
> + &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CONFIG_TDP_PBF_GET_P1HI_P1LO_INFO resp:%x\n", cpu,
> + resp);
> +
> + pbf_info->p1_low = resp & 0xff;
> + pbf_info->p1_high = (resp & GENMASK(15, 8)) >> 8;
> +
> + req = level;
> + ret = isst_send_mbox_command(
> + cpu, CONFIG_TDP, CONFIG_TDP_PBF_GET_TDP_INFO, 0, req, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CONFIG_TDP_PBF_GET_TDP_INFO resp:%x\n", cpu, resp);
> +
> + pbf_info->tdp = resp & 0xffff;
> +
> + req = level;
> + ret = isst_send_mbox_command(
> + cpu, CONFIG_TDP, CONFIG_TDP_PBF_GET_TJ_MAX_INFO, 0, req, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CONFIG_TDP_PBF_GET_TJ_MAX_INFO resp:%x\n", cpu,
> + resp);
> + pbf_info->t_control = (resp >> 8) & 0xff;
> + pbf_info->t_prochot = resp & 0xff;
> +
> + return 0;
> +}
> +
> +void isst_get_pbf_info_complete(struct isst_pbf_info *pbf_info)
> +{
> + free_cpu_set(pbf_info->core_cpumask);
> +}
> +
> +int isst_set_pbf_fact_status(int cpu, int pbf, int enable)
> +{
> + struct isst_pkg_ctdp pkg_dev;
> + struct isst_pkg_ctdp_level_info ctdp_level;
> + int current_level;
> + unsigned int req = 0, resp;
> + int ret;
> +
> + ret = isst_get_ctdp_levels(cpu, &pkg_dev);
> + if (ret)
> + return ret;
> +
> + current_level = pkg_dev.current_level;
> +
> + ret = isst_get_ctdp_control(cpu, current_level, &ctdp_level);
> + if (ret)
> + return ret;
> +
> + if (pbf) {
> + if (ctdp_level.fact_enabled)
> + req = BIT(16);
> +
> + if (enable)
> + req |= BIT(17);
> + else
> + req &= ~BIT(17);
> + } else {
> + if (ctdp_level.pbf_enabled)
> + req = BIT(17);
> +
> + if (enable)
> + req |= BIT(16);
> + else
> + req &= ~BIT(16);
> + }
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_SET_TDP_CONTROL, 0, req, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CONFIG_TDP_SET_TDP_CONTROL pbf/fact:%d req:%x\n",
> + cpu, pbf, req);
> +
> + return 0;
> +}
> +
> +int isst_get_fact_bucket_info(int cpu, int level,
> + struct isst_fact_bucket_info *bucket_info)
> +{
> + unsigned int resp;
> + int i, k, ret;
> +
> + for (i = 0; i < 2; ++i) {
> + int j;
> +
> + ret = isst_send_mbox_command(
> + cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_FACT_HP_TURBO_LIMIT_NUMCORES, 0,
> + (i << 8) | level, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf(
> + "cpu:%d CONFIG_TDP_GET_FACT_HP_TURBO_LIMIT_NUMCORES index:%d level:%d resp:%x\n",
> + cpu, i, level, resp);
> +
> + for (j = 0; j < 4; ++j) {
> + bucket_info[j + (i * 4)].high_priority_cores_count =
> + (resp >> (j * 8)) & 0xff;
> + }
> + }
> +
> + for (k = 0; k < 3; ++k) {
> + for (i = 0; i < 2; ++i) {
> + int j;
> +
> + ret = isst_send_mbox_command(
> + cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_FACT_HP_TURBO_LIMIT_RATIOS, 0,
> + (k << 16) | (i << 8) | level, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf(
> + "cpu:%d CONFIG_TDP_GET_FACT_HP_TURBO_LIMIT_RATIOS index:%d level:%d avx:%d resp:%x\n",
> + cpu, i, level, k, resp);
> +
> + for (j = 0; j < 4; ++j) {
> + switch (k) {
> + case 0:
> + bucket_info[j + (i * 4)].sse_trl =
> + (resp >> (j * 8)) & 0xff;
> + break;
> + case 1:
> + bucket_info[j + (i * 4)].avx_trl =
> + (resp >> (j * 8)) & 0xff;
> + break;
> + case 2:
> + bucket_info[j + (i * 4)].avx512_trl =
> + (resp >> (j * 8)) & 0xff;
> + break;
> + default:
> + break;
> + }
> + }
> + }
> + }
> +
> + return 0;
> +}
> +
> +int isst_get_fact_info(int cpu, int level, struct isst_fact_info *fact_info)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_TDP,
> + CONFIG_TDP_GET_FACT_LP_CLIPPING_RATIO, 0,
> + level, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CONFIG_TDP_GET_FACT_LP_CLIPPING_RATIO resp:%x\n",
> + cpu, resp);
> +
> + fact_info->lp_clipping_ratio_license_sse = resp & 0xff;
> + fact_info->lp_clipping_ratio_license_avx2 = (resp >> 8) & 0xff;
> + fact_info->lp_clipping_ratio_license_avx512 = (resp >> 16) & 0xff;
> +
> + ret = isst_get_fact_bucket_info(cpu, level, fact_info->bucket_info);
> +
> + return ret;
> +}
> +
> +int isst_set_trl(int cpu, unsigned long long trl)
> +{
> + int ret;
> +
> + if (!trl)
> + trl = 0xFFFFFFFFFFFFFFFFULL;
> +
> + ret = isst_send_msr_command(cpu, 0x1AD, 1, &trl);
> + if (ret)
> + return ret;
> +
> + return 0;
> +}
> +
> +int isst_set_trl_from_current_tdp(int cpu, unsigned long long trl)
> +{
> + unsigned long long msr_trl;
> + int ret;
> +
> + if (trl) {
> + msr_trl = trl;
> + } else {
> + struct isst_pkg_ctdp pkg_dev;
> + int trl[8];
> + int i;
> +
> + ret = isst_get_ctdp_levels(cpu, &pkg_dev);
> + if (ret)
> + return ret;
> +
> + ret = isst_get_get_trl(cpu, pkg_dev.current_level, 0, trl);
> + if (ret)
> + return ret;
> +
> + msr_trl = 0;
> + for (i = 0; i < 8; ++i) {
> + unsigned long long _trl = trl[i];
> +
> + msr_trl |= (_trl << (i * 8));
> + }
> + }
> + ret = isst_send_msr_command(cpu, 0x1AD, 1, &msr_trl);
> + if (ret)
> + return ret;
> +
> + return 0;
> +}
> +
> +/* Return 1 if locked */
> +int isst_get_config_tdp_lock_status(int cpu)
> +{
> + unsigned long long tdp_control = 0;
> + int ret;
> +
> + ret = isst_send_msr_command(cpu, 0x64b, 0, &tdp_control);
> + if (ret)
> + return ret;
> +
> + ret = !!(tdp_control & BIT(31));
> +
> + return ret;
> +}
> +
> +void isst_get_process_ctdp_complete(int cpu, struct isst_pkg_ctdp *pkg_dev)
> +{
> + int i;
> +
> + if (!pkg_dev->processed)
> + return;
> +
> + for (i = 0; i < pkg_dev->levels; ++i) {
> + struct isst_pkg_ctdp_level_info *ctdp_level;
> +
> + ctdp_level = &pkg_dev->ctdp_level[i];
> + if (ctdp_level->pbf_support)
> + free_cpu_set(ctdp_level->pbf_info.core_cpumask);
> + free_cpu_set(ctdp_level->core_cpumask);
> + }
> +}
> +
> +int isst_get_process_ctdp(int cpu, int tdp_level, struct isst_pkg_ctdp *pkg_dev)
> +{
> + int i, ret;
> +
> + if (pkg_dev->processed)
> + return 0;
> +
> + ret = isst_get_ctdp_levels(cpu, pkg_dev);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu: %d ctdp enable:%d current level: %d levels:%d\n",
> + cpu, pkg_dev->enabled, pkg_dev->current_level,
> + pkg_dev->levels);
> +
> + for (i = 0; i <= pkg_dev->levels; ++i) {
> + struct isst_pkg_ctdp_level_info *ctdp_level;
> +
> + if (tdp_level != 0xff && i != tdp_level)
> + continue;
> +
> + debug_printf("cpu:%d Get Information for TDP level:%d\n", cpu,
> + i);
> + ctdp_level = &pkg_dev->ctdp_level[i];
> +
> + ctdp_level->processed = 1;
> + ctdp_level->level = i;
> + ctdp_level->control_cpu = cpu;
> + ctdp_level->pkg_id = get_physical_package_id(cpu);
> + ctdp_level->die_id = get_physical_die_id(cpu);
> +
> + ret = isst_get_ctdp_control(cpu, i, ctdp_level);
> + if (ret)
> + return ret;
> +
> + ret = isst_get_tdp_info(cpu, i, ctdp_level);
> + if (ret)
> + return ret;
> +
> + ret = isst_get_pwr_info(cpu, i, ctdp_level);
> + if (ret)
> + return ret;
> +
> + ret = isst_get_tjmax_info(cpu, i, ctdp_level);
> + if (ret)
> + return ret;
> +
> + ctdp_level->core_cpumask_size =
> + alloc_cpu_set(&ctdp_level->core_cpumask);
> + ret = isst_get_coremask_info(cpu, i, ctdp_level);
> + if (ret)
> + return ret;
> +
> + ret = isst_get_get_trl(cpu, i, 0,
> + ctdp_level->trl_sse_active_cores);
> + if (ret)
> + return ret;
> +
> + ret = isst_get_get_trl(cpu, i, 1,
> + ctdp_level->trl_avx_active_cores);
> + if (ret)
> + return ret;
> +
> + ret = isst_get_get_trl(cpu, i, 2,
> + ctdp_level->trl_avx_512_active_cores);
> + if (ret)
> + return ret;
> +
> + if (ctdp_level->pbf_support) {
> + ret = isst_get_pbf_info(cpu, i, &ctdp_level->pbf_info);
> + if (!ret)
> + ctdp_level->pbf_found = 1;
> + }
> +
> + if (ctdp_level->fact_support) {
> + ret = isst_get_fact_info(cpu, i,
> + &ctdp_level->fact_info);
> + if (ret)
> + return ret;
> + }
> + }
> +
> + pkg_dev->processed = 1;
> +
> + return 0;
> +}
> +
> +int isst_pm_qos_config(int cpu, int enable_clos, int priority_type)
> +{
> + unsigned int req, resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_CLOS, CLOS_PM_QOS_CONFIG, 0, 0,
> + &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CLOS_PM_QOS_CONFIG resp:%x\n", cpu, resp);
> +
> + req = resp;
> +
> + if (enable_clos)
> + req = req | BIT(1);
> + else
> + req = req & ~BIT(1);
> +
> + if (priority_type)
> + req = req | BIT(2);
> + else
> + req = req & ~BIT(2);
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_CLOS, CLOS_PM_QOS_CONFIG,
> + BIT(MBOX_CMD_WRITE_BIT), req, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CLOS_PM_QOS_CONFIG priority type:%d req:%x\n", cpu,
> + priority_type, req);
> +
> + return 0;
> +}
> +
> +int isst_pm_get_clos(int cpu, int clos, struct isst_clos_config *clos_config)
> +{
> + unsigned int resp;
> + int ret;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_CLOS, CLOS_PM_CLOS, clos, 0,
> + &resp);
> + if (ret)
> + return ret;
> +
> + clos_config->pkg_id = get_physical_package_id(cpu);
> + clos_config->die_id = get_physical_die_id(cpu);
> +
> + clos_config->epp = resp & 0x0f;
> + clos_config->clos_prop_prio = (resp >> 4) & 0x0f;
> + clos_config->clos_min = (resp >> 8) & 0xff;
> + clos_config->clos_max = (resp >> 16) & 0xff;
> + clos_config->clos_desired = (resp >> 24) & 0xff;
> +
> + return 0;
> +}
> +
> +int isst_set_clos(int cpu, int clos, struct isst_clos_config *clos_config)
> +{
> + unsigned int req, resp;
> + unsigned int param;
> + int ret;
> +
> + req = clos_config->epp & 0x0f;
> + req |= (clos_config->clos_prop_prio & 0x0f) << 4;
> + req |= (clos_config->clos_min & 0xff) << 8;
> + req |= (clos_config->clos_max & 0xff) << 16;
> + req |= (clos_config->clos_desired & 0xff) << 24;
> +
> + param = BIT(MBOX_CMD_WRITE_BIT) | clos;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_CLOS, CLOS_PM_CLOS, param, req,
> + &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CLOS_PM_CLOS param:%x req:%x\n", cpu, param, req);
> +
> + return 0;
> +}
> +
> +int isst_clos_get_assoc_status(int cpu, int *clos_id)
> +{
> + unsigned int resp;
> + unsigned int param;
> + int core_id, ret;
> +
> + core_id = find_phy_core_num(cpu);
> + param = core_id;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_CLOS, CLOS_PQR_ASSOC, param, 0,
> + &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CLOS_PQR_ASSOC param:%x resp:%x\n", cpu, param,
> + resp);
> + *clos_id = (resp >> 16) & 0x03;
> +
> + return 0;
> +}
> +
> +int isst_clos_associate(int cpu, int clos_id)
> +{
> + unsigned int req, resp;
> + unsigned int param;
> + int core_id, ret;
> +
> + req = (clos_id & 0x03) << 16;
> + core_id = find_phy_core_num(cpu);
> + param = BIT(MBOX_CMD_WRITE_BIT) | core_id;
> +
> + ret = isst_send_mbox_command(cpu, CONFIG_CLOS, CLOS_PQR_ASSOC, param,
> + req, &resp);
> + if (ret)
> + return ret;
> +
> + debug_printf("cpu:%d CLOS_PQR_ASSOC param:%x req:%x\n", cpu, param,
> + req);
> +
> + return 0;
> +}
> diff --git a/tools/power/x86/intel-speed-select/isst-display.c b/tools/power/x86/intel-speed-select/isst-display.c
> new file mode 100644
> index 000000000000..f368b8323742
> --- /dev/null
> +++ b/tools/power/x86/intel-speed-select/isst-display.c
> @@ -0,0 +1,479 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel dynamic_speed_select -- Enumerate and control features
> + * Copyright (c) 2019 Intel Corporation.
> + */
> +
> +#include "isst.h"
> +
> +#define DISP_FREQ_MULTIPLIER 100000
> +
> +static void printcpumask(int str_len, char *str, int mask_size,
> + cpu_set_t *cpu_mask)
> +{
> + int i, max_cpus = get_topo_max_cpus();
> + unsigned int *mask;
> + int size, index, curr_index;
> +
> + size = max_cpus / (sizeof(unsigned int) * 8);
> + if (max_cpus % (sizeof(unsigned int) * 8))
> + size++;
> +
> + mask = calloc(size, sizeof(unsigned int));
> + if (!mask)
> + return;
> +
> + for (i = 0; i < max_cpus; ++i) {
> + int mask_index, bit_index;
> +
> + if (!CPU_ISSET_S(i, mask_size, cpu_mask))
> + continue;
> +
> + mask_index = i / (sizeof(unsigned int) * 8);
> + bit_index = i % (sizeof(unsigned int) * 8);
> + mask[mask_index] |= BIT(bit_index);
> + }
> +
> + curr_index = 0;
> + for (i = size - 1; i >= 0; --i) {
> + index = snprintf(&str[curr_index], str_len - curr_index, "%08x",
> + mask[i]);
> + curr_index += index;
> + if (i) {
> + strncat(&str[curr_index], ",", str_len - curr_index);
> + curr_index++;
> + }
> + }
> +
> + free(mask);
> +}
> +
> +static void format_and_print_txt(FILE *outf, int level, char *header,
> + char *value)
> +{
> + char *spaces = " ";
> + static char delimiters[256];
> + int i, j = 0;
> +
> + if (!level)
> + return;
> +
> + if (level == 1) {
> + strcpy(delimiters, " ");
> + } else {
> + for (i = 0; i < level - 1; ++i)
> + j += snprintf(&delimiters[j], sizeof(delimiters) - j,
> + "%s", spaces);
> + }
> +
> + if (header && value) {
> + fprintf(outf, "%s", delimiters);
> + fprintf(outf, "%s:%s\n", header, value);
> + } else if (header) {
> + fprintf(outf, "%s", delimiters);
> + fprintf(outf, "%s\n", header);
> + }
> +}
> +
> +static int last_level;
> +static void format_and_print(FILE *outf, int level, char *header, char *value)
> +{
> + char *spaces = " ";
> + static char delimiters[256];
> + int i;
> +
> + if (!out_format_is_json()) {
> + format_and_print_txt(outf, level, header, value);
> + return;
> + }
> +
> + if (level == 0) {
> + if (header)
> + fprintf(outf, "{");
> + else
> + fprintf(outf, "\n}\n");
> +
> + } else {
> + int j = 0;
> +
> + for (i = 0; i < level; ++i)
> + j += snprintf(&delimiters[j], sizeof(delimiters) - j,
> + "%s", spaces);
> +
> + if (last_level == level)
> + fprintf(outf, ",\n");
> +
> + if (value) {
> + if (last_level != level)
> + fprintf(outf, "\n");
> +
> + fprintf(outf, "%s\"%s\": ", delimiters, header);
> + fprintf(outf, "\"%s\"", value);
> + } else {
> + for (i = last_level - 1; i >= level; --i) {
> + int k = 0;
> +
> + for (j = i; j > 0; --j)
> + k += snprintf(&delimiters[k],
> + sizeof(delimiters) - k,
> + "%s", spaces);
> + if (i == level && header)
> + fprintf(outf, "\n%s},", delimiters);
> + else
> + fprintf(outf, "\n%s}", delimiters);
> + }
> + if (abs(last_level - level) < 3)
> + fprintf(outf, "\n");
> + if (header)
> + fprintf(outf, "%s\"%s\": {", delimiters,
> + header);
> + }
> + }
> +
> + last_level = level;
> +}
> +
> +static void print_packag_info(int cpu, FILE *outf)
> +{
> + char header[256];
> +
> + snprintf(header, sizeof(header), "package-%d",
> + get_physical_package_id(cpu));
> + format_and_print(outf, 1, header, NULL);
> + snprintf(header, sizeof(header), "die-%d", get_physical_die_id(cpu));
> + format_and_print(outf, 2, header, NULL);
> + snprintf(header, sizeof(header), "cpu-%d", cpu);
> + format_and_print(outf, 3, header, NULL);
> +}
> +
> +static void _isst_pbf_display_information(int cpu, FILE *outf, int level,
> + struct isst_pbf_info *pbf_info,
> + int disp_level)
> +{
> + char header[256];
> + char value[256];
> +
> + snprintf(header, sizeof(header), "speed-select-base-freq");
> + format_and_print(outf, disp_level, header, NULL);
> +
> + snprintf(header, sizeof(header), "high-priority-base-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + pbf_info->p1_high * DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, disp_level + 1, header, value);
> +
> + snprintf(header, sizeof(header), "high-priority-cpu-mask");
> + printcpumask(sizeof(value), value, pbf_info->core_cpumask_size,
> + pbf_info->core_cpumask);
> + format_and_print(outf, disp_level + 1, header, value);
> +
> + snprintf(header, sizeof(header), "low-priority-base-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + pbf_info->p1_low * DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, disp_level + 1, header, value);
> +
> + snprintf(header, sizeof(header), "tjunction-temperature(C)");
> + snprintf(value, sizeof(value), "%d", pbf_info->t_prochot);
> + format_and_print(outf, disp_level + 1, header, value);
> +
> + snprintf(header, sizeof(header), "thermal-design-power(W)");
> + snprintf(value, sizeof(value), "%d", pbf_info->tdp);
> + format_and_print(outf, disp_level + 1, header, value);
> +}
> +
> +static void _isst_fact_display_information(int cpu, FILE *outf, int level,
> + int fact_bucket, int fact_avx,
> + struct isst_fact_info *fact_info,
> + int base_level)
> +{
> + struct isst_fact_bucket_info *bucket_info = fact_info->bucket_info;
> + char header[256];
> + char value[256];
> + int j;
> +
> + snprintf(header, sizeof(header), "speed-select-turbo-freq");
> + format_and_print(outf, base_level, header, NULL);
> + for (j = 0; j < ISST_FACT_MAX_BUCKETS; ++j) {
> + if (fact_bucket != 0xff && fact_bucket != j)
> + continue;
> +
> + if (!bucket_info[j].high_priority_cores_count)
> + break;
> +
> + snprintf(header, sizeof(header), "bucket-%d", j);
> + format_and_print(outf, base_level + 1, header, NULL);
> +
> + snprintf(header, sizeof(header), "high-priority-cores-count");
> + snprintf(value, sizeof(value), "%d",
> + bucket_info[j].high_priority_cores_count);
> + format_and_print(outf, base_level + 2, header, value);
> +
> + if (fact_avx & 0x01) {
> + snprintf(header, sizeof(header),
> + "high-priority-max-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + bucket_info[j].sse_trl * DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, base_level + 2, header, value);
> + }
> +
> + if (fact_avx & 0x02) {
> + snprintf(header, sizeof(header),
> + "high-priority-max-avx2-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + bucket_info[j].avx_trl * DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, base_level + 2, header, value);
> + }
> +
> + if (fact_avx & 0x04) {
> + snprintf(header, sizeof(header),
> + "high-priority-max-avx512-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + bucket_info[j].avx512_trl *
> + DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, base_level + 2, header, value);
> + }
> + }
> + snprintf(header, sizeof(header),
> + "speed-select-turbo-freq-clip-frequencies");
> + format_and_print(outf, base_level + 1, header, NULL);
> + snprintf(header, sizeof(header), "low-priority-max-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + fact_info->lp_clipping_ratio_license_sse *
> + DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, base_level + 2, header, value);
> + snprintf(header, sizeof(header),
> + "low-priority-max-avx2-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + fact_info->lp_clipping_ratio_license_avx2 *
> + DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, base_level + 2, header, value);
> + snprintf(header, sizeof(header),
> + "low-priority-max-avx512-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + fact_info->lp_clipping_ratio_license_avx512 *
> + DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, base_level + 2, header, value);
> +}
> +
> +void isst_ctdp_display_information(int cpu, FILE *outf, int tdp_level,
> + struct isst_pkg_ctdp *pkg_dev)
> +{
> + char header[256];
> + char value[256];
> + int i, base_level = 1;
> +
> + print_packag_info(cpu, outf);
> +
> + for (i = 0; i <= pkg_dev->levels; ++i) {
> + struct isst_pkg_ctdp_level_info *ctdp_level;
> + int j;
> +
> + ctdp_level = &pkg_dev->ctdp_level[i];
> + if (!ctdp_level->processed)
> + continue;
> +
> + snprintf(header, sizeof(header), "perf-profile-level-%d",
> + ctdp_level->level);
> + format_and_print(outf, base_level + 3, header, NULL);
> +
> + snprintf(header, sizeof(header), "cpu-count");
> + j = get_cpu_count(get_physical_die_id(cpu),
> + get_physical_die_id(cpu));
> + snprintf(value, sizeof(value), "%d", j);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header), "enable-cpu-mask");
> + printcpumask(sizeof(value), value,
> + ctdp_level->core_cpumask_size,
> + ctdp_level->core_cpumask);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header), "thermal-design-power-ratio");
> + snprintf(value, sizeof(value), "%d", ctdp_level->tdp_ratio);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header), "base-frequency(KHz)");
> + snprintf(value, sizeof(value), "%d",
> + ctdp_level->tdp_ratio * DISP_FREQ_MULTIPLIER);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header),
> + "speed-select-turbo-freq-support");
> + snprintf(value, sizeof(value), "%d", ctdp_level->fact_support);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header),
> + "speed-select-base-freq-support");
> + snprintf(value, sizeof(value), "%d", ctdp_level->pbf_support);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header),
> + "speed-select-base-freq-enabled");
> + snprintf(value, sizeof(value), "%d", ctdp_level->pbf_enabled);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header),
> + "speed-select-turbo-freq-enabled");
> + snprintf(value, sizeof(value), "%d", ctdp_level->fact_enabled);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header), "thermal-design-power(W)");
> + snprintf(value, sizeof(value), "%d", ctdp_level->pkg_tdp);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header), "tjunction-max(C)");
> + snprintf(value, sizeof(value), "%d", ctdp_level->t_proc_hot);
> + format_and_print(outf, base_level + 4, header, value);
> +
> + snprintf(header, sizeof(header), "turbo-ratio-limits-sse");
> + format_and_print(outf, base_level + 4, header, NULL);
> + for (j = 0; j < 8; ++j) {
> + snprintf(header, sizeof(header), "bucket-%d", j);
> + format_and_print(outf, base_level + 5, header, NULL);
> +
> + snprintf(header, sizeof(header), "core-count");
> + snprintf(value, sizeof(value), "%d", j);
> + format_and_print(outf, base_level + 6, header, value);
> +
> + snprintf(header, sizeof(header), "turbo-ratio");
> + snprintf(value, sizeof(value), "%d",
> + ctdp_level->trl_sse_active_cores[j]);
> + format_and_print(outf, base_level + 6, header, value);
> + }
> + snprintf(header, sizeof(header), "turbo-ratio-limits-avx");
> + format_and_print(outf, base_level + 4, header, NULL);
> + for (j = 0; j < 8; ++j) {
> + snprintf(header, sizeof(header), "bucket-%d", j);
> + format_and_print(outf, base_level + 5, header, NULL);
> +
> + snprintf(header, sizeof(header), "core-count");
> + snprintf(value, sizeof(value), "%d", j);
> + format_and_print(outf, base_level + 6, header, value);
> +
> + snprintf(header, sizeof(header), "turbo-ratio");
> + snprintf(value, sizeof(value), "%d",
> + ctdp_level->trl_avx_active_cores[j]);
> + format_and_print(outf, base_level + 6, header, value);
> + }
> +
> + snprintf(header, sizeof(header), "turbo-ratio-limits-avx512");
> + format_and_print(outf, base_level + 4, header, NULL);
> + for (j = 0; j < 8; ++j) {
> + snprintf(header, sizeof(header), "bucket-%d", j);
> + format_and_print(outf, base_level + 5, header, NULL);
> +
> + snprintf(header, sizeof(header), "core-count");
> + snprintf(value, sizeof(value), "%d", j);
> + format_and_print(outf, base_level + 6, header, value);
> +
> + snprintf(header, sizeof(header), "turbo-ratio");
> + snprintf(value, sizeof(value), "%d",
> + ctdp_level->trl_avx_512_active_cores[j]);
> + format_and_print(outf, base_level + 6, header, value);
> + }
> + if (ctdp_level->pbf_support)
> + _isst_pbf_display_information(cpu, outf, i,
> + &ctdp_level->pbf_info,
> + base_level + 4);
> + if (ctdp_level->fact_support)
> + _isst_fact_display_information(cpu, outf, i, 0xff, 0xff,
> + &ctdp_level->fact_info,
> + base_level + 4);
> + }
> +
> + format_and_print(outf, 1, NULL, NULL);
> +}
> +
> +void isst_ctdp_display_information_start(FILE *outf)
> +{
> + last_level = 0;
> + format_and_print(outf, 0, "start", NULL);
> +}
> +
> +void isst_ctdp_display_information_end(FILE *outf)
> +{
> + format_and_print(outf, 0, NULL, NULL);
> +}
> +
> +void isst_pbf_display_information(int cpu, FILE *outf, int level,
> + struct isst_pbf_info *pbf_info)
> +{
> + print_packag_info(cpu, outf);
> + _isst_pbf_display_information(cpu, outf, level, pbf_info, 4);
> + format_and_print(outf, 1, NULL, NULL);
> +}
> +
> +void isst_fact_display_information(int cpu, FILE *outf, int level,
> + int fact_bucket, int fact_avx,
> + struct isst_fact_info *fact_info)
> +{
> + print_packag_info(cpu, outf);
> + _isst_fact_display_information(cpu, outf, level, fact_bucket, fact_avx,
> + fact_info, 4);
> + format_and_print(outf, 1, NULL, NULL);
> +}
> +
> +void isst_clos_display_information(int cpu, FILE *outf, int clos,
> + struct isst_clos_config *clos_config)
> +{
> + char header[256];
> + char value[256];
> +
> + snprintf(header, sizeof(header), "package-%d",
> + get_physical_package_id(cpu));
> + format_and_print(outf, 1, header, NULL);
> + snprintf(header, sizeof(header), "die-%d", get_physical_die_id(cpu));
> + format_and_print(outf, 2, header, NULL);
> + snprintf(header, sizeof(header), "cpu-%d", cpu);
> + format_and_print(outf, 3, header, NULL);
> +
> + snprintf(header, sizeof(header), "core-power");
> + format_and_print(outf, 4, header, NULL);
> +
> + snprintf(header, sizeof(header), "clos");
> + snprintf(value, sizeof(value), "%d", clos);
> + format_and_print(outf, 5, header, value);
> +
> + snprintf(header, sizeof(header), "epp");
> + snprintf(value, sizeof(value), "%d", clos_config->epp);
> + format_and_print(outf, 5, header, value);
> +
> + snprintf(header, sizeof(header), "clos-proportional-priority");
> + snprintf(value, sizeof(value), "%d", clos_config->clos_prop_prio);
> + format_and_print(outf, 5, header, value);
> +
> + snprintf(header, sizeof(header), "clos-min");
> + snprintf(value, sizeof(value), "%d", clos_config->clos_min);
> + format_and_print(outf, 5, header, value);
> +
> + snprintf(header, sizeof(header), "clos-max");
> + snprintf(value, sizeof(value), "%d", clos_config->clos_max);
> + format_and_print(outf, 5, header, value);
> +
> + snprintf(header, sizeof(header), "clos-desired");
> + snprintf(value, sizeof(value), "%d", clos_config->clos_desired);
> + format_and_print(outf, 5, header, value);
> +
> + format_and_print(outf, 1, NULL, NULL);
> +}
> +
> +void isst_display_result(int cpu, FILE *outf, char *feature, char *cmd,
> + int result)
> +{
> + char header[256];
> + char value[256];
> +
> + snprintf(header, sizeof(header), "package-%d",
> + get_physical_package_id(cpu));
> + format_and_print(outf, 1, header, NULL);
> + snprintf(header, sizeof(header), "die-%d", get_physical_die_id(cpu));
> + format_and_print(outf, 2, header, NULL);
> + snprintf(header, sizeof(header), "cpu-%d", cpu);
> + format_and_print(outf, 3, header, NULL);
> + snprintf(header, sizeof(header), "%s", feature);
> + format_and_print(outf, 4, header, NULL);
> + snprintf(header, sizeof(header), "%s", cmd);
> + snprintf(value, sizeof(value), "%d", result);
> + format_and_print(outf, 5, header, value);
> +
> + format_and_print(outf, 1, NULL, NULL);
> +}
> diff --git a/tools/power/x86/intel-speed-select/isst.h b/tools/power/x86/intel-speed-select/isst.h
> new file mode 100644
> index 000000000000..221881761609
> --- /dev/null
> +++ b/tools/power/x86/intel-speed-select/isst.h
> @@ -0,0 +1,231 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Intel Speed Select -- Enumerate and control features
> + * Copyright (c) 2019 Intel Corporation.
> + */
> +
> +#ifndef _ISST_H_
> +#define _ISST_H_
> +
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sched.h>
> +#include <sys/stat.h>
> +#include <sys/resource.h>
> +#include <getopt.h>
> +#include <err.h>
> +#include <fcntl.h>
> +#include <signal.h>
> +#include <sys/time.h>
> +#include <limits.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <cpuid.h>
> +#include <dirent.h>
> +#include <errno.h>
> +
> +#include <stdarg.h>
> +#include <sys/ioctl.h>
> +
> +#define BIT(x) (1 << (x))
> +#define GENMASK(h, l) (((~0UL) << (l)) & (~0UL >> (sizeof(long) * 8 - 1 - (h))))
> +#define GENMASK_ULL(h, l) \
> + (((~0ULL) << (l)) & (~0ULL >> (sizeof(long long) * 8 - 1 - (h))))
> +
> +#define CONFIG_TDP 0x7f
> +#define CONFIG_TDP_GET_LEVELS_INFO 0x00
> +#define CONFIG_TDP_GET_TDP_CONTROL 0x01
> +#define CONFIG_TDP_SET_TDP_CONTROL 0x02
> +#define CONFIG_TDP_GET_TDP_INFO 0x03
> +#define CONFIG_TDP_GET_PWR_INFO 0x04
> +#define CONFIG_TDP_GET_TJMAX_INFO 0x05
> +#define CONFIG_TDP_GET_CORE_MASK 0x06
> +#define CONFIG_TDP_GET_TURBO_LIMIT_RATIOS 0x07
> +#define CONFIG_TDP_SET_LEVEL 0x08
> +#define CONFIG_TDP_GET_UNCORE_P0_P1_INFO 0X09
> +#define CONFIG_TDP_GET_P1_INFO 0x0a
> +#define CONFIG_TDP_GET_MEM_FREQ 0x0b
> +
> +#define CONFIG_TDP_GET_FACT_HP_TURBO_LIMIT_NUMCORES 0x10
> +#define CONFIG_TDP_GET_FACT_HP_TURBO_LIMIT_RATIOS 0x11
> +#define CONFIG_TDP_GET_FACT_LP_CLIPPING_RATIO 0x12
> +
> +#define CONFIG_TDP_PBF_GET_CORE_MASK_INFO 0x20
> +#define CONFIG_TDP_PBF_GET_P1HI_P1LO_INFO 0x21
> +#define CONFIG_TDP_PBF_GET_TJ_MAX_INFO 0x22
> +#define CONFIG_TDP_PBF_GET_TDP_INFO 0X23
> +
> +#define CONFIG_CLOS 0xd0
> +#define CLOS_PQR_ASSOC 0x00
> +#define CLOS_PM_CLOS 0x01
> +#define CLOS_PM_QOS_CONFIG 0x02
> +#define CLOS_STATUS 0x03
> +
> +#define MBOX_CMD_WRITE_BIT 0x08
> +
> +#define PM_QOS_INFO_OFFSET 0x00
> +#define PM_QOS_CONFIG_OFFSET 0x04
> +#define PM_CLOS_OFFSET 0x08
> +#define PQR_ASSOC_OFFSET 0x20
> +
> +struct isst_clos_config {
> + int pkg_id;
> + int die_id;
> + unsigned char epp;
> + unsigned char clos_prop_prio;
> + unsigned char clos_min;
> + unsigned char clos_max;
> + unsigned char clos_desired;
> +};
> +
> +struct isst_fact_bucket_info {
> + int high_priority_cores_count;
> + int sse_trl;
> + int avx_trl;
> + int avx512_trl;
> +};
> +
> +struct isst_pbf_info {
> + int pbf_acticated;
> + int pbf_available;
> + size_t core_cpumask_size;
> + cpu_set_t *core_cpumask;
> + int p1_high;
> + int p1_low;
> + int t_control;
> + int t_prochot;
> + int tdp;
> +};
> +
> +#define ISST_TRL_MAX_ACTIVE_CORES 8
> +#define ISST_FACT_MAX_BUCKETS 8
> +struct isst_fact_info {
> + int lp_clipping_ratio_license_sse;
> + int lp_clipping_ratio_license_avx2;
> + int lp_clipping_ratio_license_avx512;
> + struct isst_fact_bucket_info bucket_info[ISST_FACT_MAX_BUCKETS];
> +};
> +
> +struct isst_pkg_ctdp_level_info {
> + int processed;
> + int control_cpu;
> + int pkg_id;
> + int die_id;
> + int level;
> + int fact_support;
> + int pbf_support;
> + int fact_enabled;
> + int pbf_enabled;
> + int tdp_ratio;
> + int active;
> + int tdp_control;
> + int pkg_tdp;
> + int pkg_min_power;
> + int pkg_max_power;
> + int fact;
> + int t_proc_hot;
> + int uncore_p0;
> + int uncore_p1;
> + int sse_p1;
> + int avx2_p1;
> + int avx512_p1;
> + int mem_freq;
> + size_t core_cpumask_size;
> + cpu_set_t *core_cpumask;
> + int cpu_count;
> + int trl_sse_active_cores[ISST_TRL_MAX_ACTIVE_CORES];
> + int trl_avx_active_cores[ISST_TRL_MAX_ACTIVE_CORES];
> + int trl_avx_512_active_cores[ISST_TRL_MAX_ACTIVE_CORES];
> + int kobj_bucket_index;
> + int active_bucket;
> + int fact_max_index;
> + int fact_max_config;
> + int pbf_found;
> + int pbf_active;
> + struct isst_pbf_info pbf_info;
> + struct isst_fact_info fact_info;
> +};
> +
> +#define ISST_MAX_TDP_LEVELS (4 + 1) /* +1 for base config */
> +struct isst_pkg_ctdp {
> + int locked;
> + int version;
> + int processed;
> + int levels;
> + int current_level;
> + int enabled;
> + struct isst_pkg_ctdp_level_info ctdp_level[ISST_MAX_TDP_LEVELS];
> +};
> +
> +extern int get_topo_max_cpus(void);
> +extern int get_cpu_count(int pkg_id, int die_id);
> +
> +/* Common interfaces */
> +extern void debug_printf(const char *format, ...);
> +extern int out_format_is_json(void);
> +extern int get_physical_package_id(int cpu);
> +extern int get_physical_die_id(int cpu);
> +extern size_t alloc_cpu_set(cpu_set_t **cpu_set);
> +extern void free_cpu_set(cpu_set_t *cpu_set);
> +extern int find_logical_cpu(int pkg_id, int die_id, int phy_cpu);
> +extern int find_phy_cpu_num(int logical_cpu);
> +extern int find_phy_core_num(int logical_cpu);
> +extern void set_cpu_mask_from_punit_coremask(int cpu,
> + unsigned long long core_mask,
> + size_t core_cpumask_size,
> + cpu_set_t *core_cpumask,
> + int *cpu_cnt);
> +
> +extern int isst_send_mbox_command(unsigned int cpu, unsigned char command,
> + unsigned char sub_command,
> + unsigned int write,
> + unsigned int req_data, unsigned int *resp);
> +
> +extern int isst_send_msr_command(unsigned int cpu, unsigned int command,
> + int write, unsigned long long *req_resp);
> +
> +extern int isst_get_ctdp_levels(int cpu, struct isst_pkg_ctdp *pkg_dev);
> +extern int isst_get_process_ctdp(int cpu, int tdp_level,
> + struct isst_pkg_ctdp *pkg_dev);
> +extern void isst_get_process_ctdp_complete(int cpu,
> + struct isst_pkg_ctdp *pkg_dev);
> +extern void isst_ctdp_display_information(int cpu, FILE *outf, int tdp_level,
> + struct isst_pkg_ctdp *pkg_dev);
> +extern void isst_ctdp_display_information_start(FILE *outf);
> +extern void isst_ctdp_display_information_end(FILE *outf);
> +extern void isst_pbf_display_information(int cpu, FILE *outf, int level,
> + struct isst_pbf_info *info);
> +extern int isst_set_tdp_level(int cpu, int tdp_level);
> +extern int isst_set_tdp_level_msr(int cpu, int tdp_level);
> +extern int isst_set_pbf_fact_status(int cpu, int pbf, int enable);
> +extern int isst_get_pbf_info(int cpu, int level,
> + struct isst_pbf_info *pbf_info);
> +extern void isst_get_pbf_info_complete(struct isst_pbf_info *pbf_info);
> +extern int isst_get_fact_info(int cpu, int level,
> + struct isst_fact_info *fact_info);
> +extern int isst_get_fact_bucket_info(int cpu, int level,
> + struct isst_fact_bucket_info *bucket_info);
> +extern void isst_fact_display_information(int cpu, FILE *outf, int level,
> + int fact_bucket, int fact_avx,
> + struct isst_fact_info *fact_info);
> +extern int isst_set_trl(int cpu, unsigned long long trl);
> +extern int isst_set_trl_from_current_tdp(int cpu, unsigned long long trl);
> +extern int isst_get_config_tdp_lock_status(int cpu);
> +
> +extern int isst_pm_qos_config(int cpu, int enable_clos, int priority_type);
> +extern int isst_pm_get_clos(int cpu, int clos,
> + struct isst_clos_config *clos_config);
> +extern int isst_set_clos(int cpu, int clos,
> + struct isst_clos_config *clos_config);
> +extern int isst_clos_associate(int cpu, int clos);
> +extern int isst_clos_get_assoc_status(int cpu, int *clos_id);
> +extern void isst_clos_display_information(int cpu, FILE *outf, int clos,
> + struct isst_clos_config *clos_config);
> +
> +extern int isst_read_reg(unsigned short reg, unsigned int *val);
> +extern int isst_write_reg(int reg, unsigned int val);
> +
> +extern void isst_display_result(int cpu, FILE *outf, char *feature, char *cmd,
> + int result);
> +#endif
>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox