linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/9] drivers: cacheinfo support
@ 2014-06-25 17:30 Sudeep Holla
  2014-06-25 17:30 ` [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs Sudeep Holla
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-06-25 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This series adds generic cacheinfo support, similar to the existing topology
support. The implementation is based on the x86 cacheinfo code. Currently
x86, powerpc, ia64 and s390 each have their own implementations; rather than
adding yet another one for ARM and ARM64, this is an attempt to make the
support generic, much like the topology information support. It also adds
the missing ABI documentation for the cacheinfo sysfs interface, which is
already in use.

It also moves the existing implementations on x86, ia64, powerpc and s390
over to the generic cacheinfo infrastructure introduced here. The changes to
these architectures are compile tested only, except on x86, where they have
also been tested at runtime.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

Since there was no objection to the idea at the RFC stage, I am posting the
non-RFC version here.

The code can be fetched from:
 git://linux-arm.org/linux-skn cacheinfo

Previous RFCs:
[1] https://lkml.org/lkml/2014/1/8/523
[2] https://lkml.org/lkml/2014/2/7/654
[3] https://lkml.org/lkml/2014/2/19/391

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-ia64 at vger.kernel.org
Cc: linux390 at de.ibm.com
Cc: linux-s390 at vger.kernel.org
Cc: x86 at kernel.org
Cc: linuxppc-dev at lists.ozlabs.org
Cc: linux-arm-kernel at lists.infradead.org

---
Sudeep Holla (9):
  drivers: base: add new class "cpu" to group cpu devices
  drivers: base: support cpu cache information interface to userspace
    via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 arch/arm/include/asm/outercache.h                  |  13 +
 arch/arm/kernel/Makefile                           |   1 +
 arch/arm/kernel/cacheinfo.c                        | 249 +++++++
 arch/arm/mm/Kconfig                                |  13 +
 arch/arm/mm/cache-l2x0.c                           |  10 +
 arch/arm/mm/cache-tauros2.c                        |  34 +
 arch/arm/mm/cache-xsc3l2.c                         |  15 +
 arch/arm64/kernel/Makefile                         |   3 +-
 arch/arm64/kernel/cacheinfo.c                      | 135 ++++
 arch/ia64/kernel/topology.c                        | 401 ++--------
 arch/powerpc/kernel/cacheinfo.c                    | 813 +++------------------
 arch/powerpc/kernel/cacheinfo.h                    |   8 -
 arch/powerpc/kernel/sysfs.c                        |  12 +-
 arch/s390/kernel/cache.c                           | 388 +++-------
 arch/x86/kernel/cpu/intel_cacheinfo.c              | 655 ++++-------------
 drivers/base/Makefile                              |   2 +-
 drivers/base/cacheinfo.c                           | 564 ++++++++++++++
 drivers/base/core.c                                |  39 +-
 drivers/base/cpu.c                                 |   7 +
 include/linux/cacheinfo.h                          |  56 ++
 include/linux/cpu.h                                |   2 +
 22 files changed, 1590 insertions(+), 1871 deletions(-)
 create mode 100644 arch/arm/kernel/cacheinfo.c
 create mode 100644 arch/arm64/kernel/cacheinfo.c
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs
  2014-06-25 17:30 [PATCH 0/9] drivers: cacheinfo support Sudeep Holla
@ 2014-06-25 17:30 ` Sudeep Holla
  2014-06-25 22:23   ` Russell King - ARM Linux
  2014-07-10  0:09   ` Greg Kroah-Hartman
  2014-06-25 17:30 ` [PATCH 7/9] ARM64: kernel: add support for cpu cache information Sudeep Holla
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-06-25 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This patch adds initial support for providing processor cache information
to userspace through the sysfs interface. It is based on the already
existing implementations (x86, ia64, s390 and powerpc), and hence the
interface is intended to be fully compatible with them.

The main purpose of this generic support is to avoid further code
duplication when supporting new architectures, and to unify all the
existing implementations.
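
To hook into this, an architecture implements the two callbacks declared
in <linux/cacheinfo.h>. As a rough, untested sketch (for a hypothetical
architecture with a single level of separate instruction and data caches,
with the geometry detection left out):

	#include <linux/cacheinfo.h>

	int init_cache_level(unsigned int cpu)
	{
		struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);

		/* illustration only: one level, separate I and D caches */
		this_cpu_ci->num_levels = 1;
		this_cpu_ci->num_leaves = 2;
		return 0;
	}

	int populate_cache_leaves(unsigned int cpu)
	{
		struct cacheinfo *this_leaf = get_cpu_cacheinfo(cpu)->info_list;

		/* illustration only: fill in level/type/geometry as detected */
		this_leaf[0].level = 1;
		this_leaf[0].type = CACHE_TYPE_DATA;
		this_leaf[1].level = 1;
		this_leaf[1].type = CACHE_TYPE_INST;
		return 0;
	}

The generic code calls init_cache_level() first, to size the per-cpu
info_list allocation, and then populate_cache_leaves() to fill it in.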

This implementation maintains a hierarchy of cache objects reflecting the
system's cache topology. Cache devices are instantiated as needed as CPUs
come online. The cache information is replicated per CPU, even for caches
that are shared between CPUs; the per-cpu array of cache information is
maintained mainly for sysfs-related bookkeeping.

It also implements the shared_cpu_map attribute, which is essential for
enabling both kernel and user-space to discover the system's overall cache
topology.
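
For example (illustrative paths and values only), on a system where cpu0
and cpu1 share a unified L2 cache, the interface would look roughly like
this (shown as file:contents pairs):

	/sys/devices/system/cpu/cpu0/cache/index2/type:Unified
	/sys/devices/system/cpu/cpu0/cache/index2/level:2
	/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list:0-1
	/sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map:3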

This patch also adds the previously missing ABI documentation for the
already existing cacheinfo sysfs interface, which is well defined and
widely used.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rob Herring <robh@kernel.org>
Cc: linux-doc at vger.kernel.org
Cc: linux-ia64 at vger.kernel.org
Cc: linux390 at de.ibm.com
Cc: linux-s390 at vger.kernel.org
Cc: x86 at kernel.org
Cc: linuxppc-dev at lists.ozlabs.org
Cc: linux-arm-kernel at lists.infradead.org
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |  41 ++
 drivers/base/Makefile                              |   2 +-
 drivers/base/cacheinfo.c                           | 564 +++++++++++++++++++++
 include/linux/cacheinfo.h                          |  56 ++
 4 files changed, 662 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index acb9bfc..5827f4e 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -224,3 +224,44 @@ Description:	Parameters for the Intel P-state driver
 		frequency range.
 
 		More details can be found in Documentation/cpu-freq/intel-pstate.txt
+
+What:		/sys/devices/system/cpu/cpu*/cache/index*/<set_of_attributes_mentioned_below>
+Date:		June 2014 (documented, existed before August 2008)
+Contact:	Sudeep Holla <sudeep.holla@arm.com>
+		Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Parameters for the CPU cache attributes
+
+		attributes:
+			- writethrough: data is written to both the cache line
+					and to the block in the lower-level memory
+			- writeback: data is written only to the cache line and
+				     the modified cache line is written to main
+				     memory only when it is replaced
+			- writeallocate: allocate a memory location to a cache line
+					 on a cache miss because of a write
+			- readallocate: allocate a memory location to a cache line
+					on a cache miss because of a read
+
+		coherency_line_size: the minimum amount of data that gets transferred
+
+		level: the cache level in a multi-level cache hierarchy
+
+		number_of_sets: total number of sets in the cache, a set is a
+				collection of cache lines with the same cache index
+
+		physical_line_partition: number of physical cache lines per cache tag
+
+		shared_cpu_list: the list of cpus sharing the cache
+
+		shared_cpu_map: logical cpu mask containing the list of cpus sharing
+				the cache
+
+		size: the total cache size in kB
+
+		type:
+			- instruction: cache that only holds instructions
+			- data: cache that only caches data
+			- unified: cache that holds both data and instructions
+
+		ways_of_associativity: degree of freedom in placing a particular block
+					of memory in the cache
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 04b314e..bad2ff8 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -4,7 +4,7 @@ obj-y			:= component.o core.o bus.o dd.o syscore.o \
 			   driver.o class.o platform.o \
 			   cpu.o firmware.o init.o map.o devres.o \
 			   attribute_container.o transport_class.o \
-			   topology.o container.o
+			   topology.o container.o cacheinfo.o
 obj-$(CONFIG_DEVTMPFS)	+= devtmpfs.o
 obj-$(CONFIG_DMA_CMA) += dma-contiguous.o
 obj-y			+= power/
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
new file mode 100644
index 0000000..c12e03c
--- /dev/null
+++ b/drivers/base/cacheinfo.c
@@ -0,0 +1,564 @@
+/*
+ * cacheinfo support - processor cache information via sysfs
+ *
+ * Based on arch/x86/kernel/cpu/intel_cacheinfo.c
+ * Author: Sudeep Holla <sudeep.holla@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+#include <linux/bitops.h>
+#include <linux/cacheinfo.h>
+#include <linux/compiler.h>
+#include <linux/cpu.h>
+#include <linux/device.h>
+#include <linux/init.h>
+#include <linux/of.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/sysfs.h>
+
+/* pointer to per cpu cacheinfo */
+static DEFINE_PER_CPU(struct cpu_cacheinfo, ci_cpu_cacheinfo);
+#define ci_cacheinfo(cpu)	(&per_cpu(ci_cpu_cacheinfo, cpu))
+#define cache_leaves(cpu)	(ci_cacheinfo(cpu)->num_leaves)
+#define per_cpu_cacheinfo(cpu)	(ci_cacheinfo(cpu)->info_list)
+
+struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
+{
+	return ci_cacheinfo(cpu);
+}
+
+#ifdef CONFIG_OF
+static int cache_setup_of_node(unsigned int cpu)
+{
+	struct device_node *np;
+	struct cacheinfo *this_leaf;
+	struct device *cpu_dev = get_cpu_device(cpu);
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	unsigned int index = 0;
+
+	/* skip if of_node is already populated */
+	if (this_cpu_ci->info_list->of_node)
+		return 0;
+
+	if (!cpu_dev) {
+		pr_err("No cpu device for CPU %d\n", cpu);
+		return -ENODEV;
+	}
+	np = cpu_dev->of_node;
+	if (!np) {
+		pr_err("Failed to find cpu%d device node\n", cpu);
+		return -ENOENT;
+	}
+
+	while (np && index < cache_leaves(cpu)) {
+		this_leaf = this_cpu_ci->info_list + index;
+		if (this_leaf->level != 1)
+			np = of_find_next_cache_node(np);
+		else
+			np = of_node_get(np);/* cpu node itself */
+		this_leaf->of_node = np;
+		index++;
+	}
+	return 0;
+}
+
+static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
+					   struct cacheinfo *sib_leaf)
+{
+	return sib_leaf->of_node == this_leaf->of_node;
+}
+
+static int of_cache_shared_cpu_map_setup(unsigned int cpu)
+{
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf, *sib_leaf;
+	unsigned int index;
+	int ret;
+
+	ret = cache_setup_of_node(cpu);
+	if (ret)
+		return ret;
+
+	for (index = 0; index < cache_leaves(cpu); index++) {
+		unsigned int i;
+
+		this_leaf = this_cpu_ci->info_list + index;
+		cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map);
+
+		for_each_online_cpu(i) {
+			struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);
+
+			if (i == cpu || !sib_cpu_ci->info_list)
+				continue;/* skip if itself or no cacheinfo */
+			sib_leaf = sib_cpu_ci->info_list + index;
+			if (cache_leaves_are_shared(this_leaf, sib_leaf)) {
+				cpumask_set_cpu(cpu, &sib_leaf->shared_cpu_map);
+				cpumask_set_cpu(i, &this_leaf->shared_cpu_map);
+			}
+		}
+	}
+
+	return 0;
+}
+#else
+static inline int of_cache_shared_cpu_map_setup(unsigned int cpu)
+{
+	return 0;
+}
+#endif
+
+static void cache_shared_cpu_map_remove(unsigned int cpu)
+{
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf, *sib_leaf;
+	unsigned int sibling, index;
+
+	for (index = 0; index < cache_leaves(cpu); index++) {
+		this_leaf = this_cpu_ci->info_list + index;
+		for_each_cpu(sibling, &this_leaf->shared_cpu_map) {
+			struct cpu_cacheinfo *sib_cpu_ci;
+
+			if (sibling == cpu) /* skip itself */
+				continue;
+			sib_cpu_ci = get_cpu_cacheinfo(sibling);
+			sib_leaf = sib_cpu_ci->info_list + index;
+			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
+			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
+		}
+		of_node_put(this_leaf->of_node);
+	}
+}
+
+int __weak init_cache_level(unsigned int cpu)
+{
+	return -ENOENT;
+}
+
+int __weak populate_cache_leaves(unsigned int cpu)
+{
+	return -ENOENT;
+}
+
+static void free_cache_attributes(unsigned int cpu)
+{
+	cache_shared_cpu_map_remove(cpu);
+
+	kfree(per_cpu_cacheinfo(cpu));
+	per_cpu_cacheinfo(cpu) = NULL;
+}
+
+/*
+ * Helpers to make sure "func" is executed on the cpu whose cache
+ * attributes are being detected
+ */
+#define DEFINE_SMP_CALL_FUNCTION(func)		\
+static void _##func(void *ret)			\
+{						\
+	int cpu = smp_processor_id();		\
+	*(int *)ret = func(cpu);		\
+}						\
+static int __##func(unsigned int cpu)		\
+{						\
+	int ret;				\
+	smp_call_function_single(cpu, _##func, &ret, true);	\
+	return ret;				\
+}
+DEFINE_SMP_CALL_FUNCTION(init_cache_level)
+DEFINE_SMP_CALL_FUNCTION(populate_cache_leaves)
+
+static int detect_cache_attributes(unsigned int cpu)
+{
+	int ret;
+
+	if (__init_cache_level(cpu))
+		return -ENOENT;
+
+	per_cpu_cacheinfo(cpu) = kzalloc(sizeof(struct cacheinfo) *
+					 cache_leaves(cpu), GFP_KERNEL);
+	if (per_cpu_cacheinfo(cpu) == NULL)
+		return -ENOMEM;
+
+	ret = __populate_cache_leaves(cpu);
+	if (ret)
+		goto free_ci;
+	/*
+	 * For systems using DT for cache hierarchy, of_node and shared_cpu_map
+	 * will be set up here. Otherwise populate_cache_leaves needs to set
+	 * shared_cpu_map and next-level-cache should not be specified in DT
+	 */
+	ret = of_cache_shared_cpu_map_setup(cpu);
+	if (ret)
+		goto free_ci;
+	return 0;
+
+free_ci:
+	free_cache_attributes(cpu);
+	return ret;
+}
+
+#ifdef CONFIG_SYSFS
+
+/* pointer to cpuX/cache device */
+static DEFINE_PER_CPU(struct device *, ci_cache_dev);
+#define per_cpu_cache_dev(cpu)	(per_cpu(ci_cache_dev, cpu))
+
+static cpumask_t cache_dev_map;
+
+/* pointer to array of devices for cpuX/cache/indexY */
+static DEFINE_PER_CPU(struct device **, ci_index_dev);
+#define per_cpu_index_dev(cpu)	(per_cpu(ci_index_dev, cpu))
+#define per_cache_index_dev(cpu, idx)	((per_cpu_index_dev(cpu))[idx])
+
+#define show_one(file_name, object)				\
+static ssize_t file_name##_show(struct device *dev,		\
+		struct device_attribute *attr, char *buf)	\
+{								\
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev);	\
+	if (!this_leaf->object)					\
+		return -EINVAL;					\
+	return sprintf(buf, "%u\n", this_leaf->object);		\
+}
+
+show_one(level, level);
+show_one(coherency_line_size, coherency_line_size);
+show_one(number_of_sets, number_of_sets);
+show_one(physical_line_partition, physical_line_partition);
+
+static ssize_t ways_of_associativity_show(struct device *dev,
+					  struct device_attribute *attr,
+					  char *buf)
+{
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev);
+
+	/* will be zero for fully associative cache, but check for size */
+	if (!this_leaf->size)
+		return -EINVAL;
+	return sprintf(buf, "%u\n", this_leaf->ways_of_associativity);
+}
+
+static ssize_t size_show(struct device *dev,
+			 struct device_attribute *attr, char *buf)
+{
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev);
+
+	if (!this_leaf->size)
+		return -EINVAL;
+	return sprintf(buf, "%uK\n", this_leaf->size >> 10);
+}
+
+static ssize_t shared_cpumap_show_func(struct device *dev, int type, char *buf)
+{
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev);
+	ptrdiff_t len = PTR_ALIGN(buf + PAGE_SIZE - 1, PAGE_SIZE) - buf;
+	int n = 0;
+
+	if (len > 1) {
+		const struct cpumask *mask = &this_leaf->shared_cpu_map;
+
+		n = type ? cpulist_scnprintf(buf, len - 2, mask) :
+			   cpumask_scnprintf(buf, len - 2, mask);
+		buf[n++] = '\n';
+		buf[n] = '\0';
+	}
+	return n;
+}
+
+static ssize_t shared_cpu_map_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	return shared_cpumap_show_func(dev, 0, buf);
+}
+
+static ssize_t shared_cpu_list_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	return shared_cpumap_show_func(dev, 1, buf);
+}
+
+static ssize_t type_show(struct device *dev,
+			 struct device_attribute *attr, char *buf)
+{
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev);
+
+	switch (this_leaf->type) {
+	case CACHE_TYPE_DATA:
+		return sprintf(buf, "Data\n");
+	case CACHE_TYPE_INST:
+		return sprintf(buf, "Instruction\n");
+	case CACHE_TYPE_UNIFIED:
+		return sprintf(buf, "Unified\n");
+	default:
+		return -EINVAL;
+	}
+}
+
+static ssize_t attributes_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct cacheinfo *this_leaf = dev_get_drvdata(dev);
+	unsigned int ci_attr = this_leaf->attributes;
+	ptrdiff_t len = PTR_ALIGN(buf + PAGE_SIZE - 1, PAGE_SIZE) - buf - 2;
+	int n = 0;
+
+	if (!ci_attr)
+		return -EINVAL;
+
+	if (ci_attr & CACHE_WRITE_THROUGH)
+		n += snprintf(buf + n, len - n, "WriteThrough\n");
+	if (ci_attr & CACHE_WRITE_BACK)
+		n += snprintf(buf + n, len - n, "WriteBack\n");
+	if (ci_attr & CACHE_READ_ALLOCATE)
+		n += snprintf(buf + n, len - n, "ReadAllocate\n");
+	if (ci_attr & CACHE_WRITE_ALLOCATE)
+		n += snprintf(buf + n, len - n, "WriteAllocate\n");
+	buf[n] = '\0';
+	return n;
+}
+
+static DEVICE_ATTR_RO(level);
+static DEVICE_ATTR_RO(type);
+static DEVICE_ATTR_RO(coherency_line_size);
+static DEVICE_ATTR_RO(ways_of_associativity);
+static DEVICE_ATTR_RO(number_of_sets);
+static DEVICE_ATTR_RO(size);
+static DEVICE_ATTR_RO(attributes);
+static DEVICE_ATTR_RO(shared_cpu_map);
+static DEVICE_ATTR_RO(shared_cpu_list);
+static DEVICE_ATTR_RO(physical_line_partition);
+
+static struct attribute *cache_default_attrs[] = {
+	&dev_attr_type.attr,
+	&dev_attr_level.attr,
+	&dev_attr_shared_cpu_map.attr,
+	&dev_attr_shared_cpu_list.attr,
+	NULL
+};
+
+ATTRIBUTE_GROUPS(cache_default);
+
+static const struct device_attribute *cache_optional_attrs[] = {
+	&dev_attr_coherency_line_size,
+	&dev_attr_ways_of_associativity,
+	&dev_attr_number_of_sets,
+	&dev_attr_size,
+	&dev_attr_attributes,
+	&dev_attr_physical_line_partition,
+	NULL
+};
+
+static int device_add_attrs(struct device *dev,
+			    const struct device_attribute **dev_attrs)
+{
+	int i, error = 0;
+	struct device_attribute *dev_attr;
+	char *buf;
+
+	if (!dev_attrs)
+		return 0;
+
+	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	for (i = 0; dev_attrs[i]; i++) {
+		dev_attr = (struct device_attribute *)dev_attrs[i];
+
+		/* only create attributes that provide a meaningful value */
+		if (dev_attr->show(dev, dev_attr, buf) < 0)
+			continue;
+
+		error = device_create_file(dev, dev_attrs[i]);
+		if (error) {
+			while (--i >= 0)
+				device_remove_file(dev, dev_attrs[i]);
+			break;
+		}
+	}
+
+	kfree(buf);
+	return error;
+}
+
+static void device_remove_attrs(struct device *dev,
+				const struct device_attribute **dev_attrs)
+{
+	int i;
+
+	if (!dev_attrs)
+		return;
+
+	for (i = 0; dev_attrs[i]; i++)
+		device_remove_file(dev, dev_attrs[i]);
+}
+
+const struct device_attribute **
+__weak cache_get_priv_attr(struct device *cache_idx_dev)
+{
+	return NULL;
+}
+
+/* Add/Remove cache interface for CPU device */
+static void cpu_cache_sysfs_exit(unsigned int cpu)
+{
+	int i;
+	struct device *tmp_dev;
+	const struct device_attribute **ci_priv_attr;
+
+	if (per_cpu_index_dev(cpu)) {
+		for (i = 0; i < cache_leaves(cpu); i++) {
+			tmp_dev = per_cache_index_dev(cpu, i);
+			if (!tmp_dev)
+				continue;
+			ci_priv_attr = cache_get_priv_attr(tmp_dev);
+			device_remove_attrs(tmp_dev, ci_priv_attr);
+			device_remove_attrs(tmp_dev, cache_optional_attrs);
+			device_unregister(tmp_dev);
+		}
+		kfree(per_cpu_index_dev(cpu));
+		per_cpu_index_dev(cpu) = NULL;
+	}
+	device_unregister(per_cpu_cache_dev(cpu));
+	per_cpu_cache_dev(cpu) = NULL;
+}
+
+static int cpu_cache_sysfs_init(unsigned int cpu)
+{
+	struct device *dev = get_cpu_device(cpu);
+
+	if (per_cpu_cacheinfo(cpu) == NULL)
+		return -ENOENT;
+
+	per_cpu_cache_dev(cpu) = device_create(dev->class, dev, cpu,
+					       NULL, "cache");
+	if (IS_ERR_OR_NULL(per_cpu_cache_dev(cpu)))
+		return PTR_ERR(per_cpu_cache_dev(cpu));
+
+	/* Allocate all required memory */
+	per_cpu_index_dev(cpu) = kzalloc(sizeof(struct device *) *
+					 cache_leaves(cpu), GFP_KERNEL);
+	if (unlikely(per_cpu_index_dev(cpu) == NULL))
+		goto err_out;
+
+	return 0;
+
+err_out:
+	cpu_cache_sysfs_exit(cpu);
+	return -ENOMEM;
+}
+
+static int cache_add_dev(unsigned int cpu)
+{
+	unsigned short i;
+	int rc;
+	struct device *tmp_dev, *parent;
+	struct cacheinfo *this_leaf;
+	const struct device_attribute **ci_priv_attr;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	rc = cpu_cache_sysfs_init(cpu);
+	if (unlikely(rc < 0))
+		return rc;
+
+	parent = per_cpu_cache_dev(cpu);
+	for (i = 0; i < cache_leaves(cpu); i++) {
+		this_leaf = this_cpu_ci->info_list + i;
+		if (this_leaf->disable_sysfs)
+			continue;
+		tmp_dev = device_create_with_groups(parent->class, parent, i,
+						    this_leaf,
+						    cache_default_groups,
+						    "index%1u", i);
+		if (IS_ERR_OR_NULL(tmp_dev)) {
+			rc = PTR_ERR(tmp_dev);
+			goto err;
+		}
+
+		rc = device_add_attrs(tmp_dev, cache_optional_attrs);
+		if (unlikely(rc))
+			goto err;
+
+		ci_priv_attr = cache_get_priv_attr(tmp_dev);
+		rc = device_add_attrs(tmp_dev, ci_priv_attr);
+		if (unlikely(rc))
+			goto err;
+
+		per_cache_index_dev(cpu, i) = tmp_dev;
+	}
+	cpumask_set_cpu(cpu, &cache_dev_map);
+
+	return 0;
+err:
+	cpu_cache_sysfs_exit(cpu);
+	return rc;
+}
+
+static void cache_remove_dev(unsigned int cpu)
+{
+	if (!cpumask_test_cpu(cpu, &cache_dev_map))
+		return;
+	cpumask_clear_cpu(cpu, &cache_dev_map);
+
+	cpu_cache_sysfs_exit(cpu);
+}
+
+static int cacheinfo_cpu_callback(struct notifier_block *nfb,
+				  unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+	int rc = 0;
+
+	switch (action) {
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+		rc = detect_cache_attributes(cpu);
+		if (!rc)
+			rc = cache_add_dev(cpu);
+		break;
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		cache_remove_dev(cpu);
+		if (per_cpu_cacheinfo(cpu))
+			free_cache_attributes(cpu);
+		break;
+	}
+	return notifier_from_errno(rc);
+}
+
+static int __init cacheinfo_sysfs_init(void)
+{
+	int cpu, rc = 0;
+
+	cpu_notifier_register_begin();
+
+	for_each_online_cpu(cpu) {
+		rc = detect_cache_attributes(cpu);
+		if (rc) {
+			pr_err("error detecting cacheinfo..cpu%d\n", cpu);
+			goto out;
+		}
+		rc = cache_add_dev(cpu);
+		if (rc) {
+			free_cache_attributes(cpu);
+			pr_err("error populating cacheinfo..cpu%d\n", cpu);
+			goto out;
+		}
+	}
+	__hotcpu_notifier(cacheinfo_cpu_callback, 0);
+
+out:
+	cpu_notifier_register_done();
+	return rc;
+}
+
+device_initcall(cacheinfo_sysfs_init);
+
+#endif	/* CONFIG_SYSFS */
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
new file mode 100644
index 0000000..a9bd9f9
--- /dev/null
+++ b/include/linux/cacheinfo.h
@@ -0,0 +1,56 @@
+#ifndef _LINUX_CACHEINFO_H
+#define _LINUX_CACHEINFO_H
+
+#include <linux/bitops.h>
+#include <linux/compiler.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/of.h>
+#include <linux/sysfs.h>
+
+enum cache_type {
+	CACHE_TYPE_NOCACHE = 0,
+	CACHE_TYPE_INST = BIT(0),
+	CACHE_TYPE_DATA = BIT(1),
+	CACHE_TYPE_SEPARATE = CACHE_TYPE_INST | CACHE_TYPE_DATA,
+	CACHE_TYPE_UNIFIED = BIT(2),
+};
+
+struct cacheinfo {
+	/* core properties */
+	enum cache_type type; /* data, inst or unified */
+	unsigned int level;
+	unsigned int coherency_line_size; /* cache line size  */
+	unsigned int number_of_sets; /* no. of sets per way */
+	unsigned int ways_of_associativity; /* no. of ways */
+	unsigned int physical_line_partition; /* no. of lines per tag */
+	unsigned int size; /* total cache size */
+	cpumask_t shared_cpu_map;
+	unsigned int attributes;
+#define CACHE_WRITE_THROUGH	BIT(0)
+#define CACHE_WRITE_BACK	BIT(1)
+#define CACHE_READ_ALLOCATE	BIT(2)
+#define CACHE_WRITE_ALLOCATE	BIT(3)
+
+	/* book keeping */
+	struct device_node *of_node;	/* cpu if no explicit cache node */
+	bool disable_sysfs; /* don't expose this leaf through sysfs */
+	void *priv;
+};
+
+struct cpu_cacheinfo {
+	struct cacheinfo *info_list;
+	unsigned int num_levels;
+	unsigned int num_leaves;
+};
+
+struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
+int init_cache_level(unsigned int cpu);
+int populate_cache_leaves(unsigned int cpu);
+
+#ifdef CONFIG_SYSFS
+const struct device_attribute **
+cache_get_priv_attr(struct device *cache_idx_dev);
+#endif
+
+#endif /* _LINUX_CACHEINFO_H */
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 7/9] ARM64: kernel: add support for cpu cache information
  2014-06-25 17:30 [PATCH 0/9] drivers: cacheinfo support Sudeep Holla
  2014-06-25 17:30 ` [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs Sudeep Holla
@ 2014-06-25 17:30 ` Sudeep Holla
  2014-06-27 10:36   ` Mark Rutland
  2014-06-25 17:30 ` [PATCH 8/9] ARM: " Sudeep Holla
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2014-06-25 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This patch adds support for cacheinfo on ARM64.

On ARMv8, the cache hierarchy can be identified through the Cache Level ID
register (CLIDR), while the cache geometry is provided by the Cache Size ID
register (CCSIDR).

Since the architecture doesn't provide any way of detecting which CPUs
share a particular cache, the device tree is used for that purpose.
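
As a purely illustrative example of the CCSIDR decode (the value below is
made up, not taken from any particular CPU): a CCSIDR_EL1 reading of
0x600FE01A for the L1 data cache has LineSize = 2 (16 words, i.e. 64-byte
lines), Associativity = 3 (4 ways) and NumSets = 127 (128 sets), with the
write-back and read-allocate bits set, so the leaf would report a 64-byte
line, 4-way, 128-set, 32KB write-back/read-allocate cache.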

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm64/kernel/Makefile    |   3 +-
 arch/arm64/kernel/cacheinfo.c | 135 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/cacheinfo.c

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index cdaedad..754a3d0 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -15,7 +15,8 @@ CFLAGS_REMOVE_return_address.o = -pg
 arm64-obj-y		:= cputable.o debug-monitors.o entry.o irq.o fpsimd.o	\
 			   entry-fpsimd.o process.o ptrace.o setup.o signal.o	\
 			   sys.o stacktrace.o time.o traps.o io.o vdso.o	\
-			   hyp-stub.o psci.o cpu_ops.o insn.o return_address.o
+			   hyp-stub.o psci.o cpu_ops.o insn.o return_address.o	\
+			   cacheinfo.o
 
 arm64-obj-$(CONFIG_COMPAT)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o
diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
new file mode 100644
index 0000000..a6f81cc
--- /dev/null
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -0,0 +1,135 @@
+/*
+ *  ARM64 cacheinfo support
+ *
+ *  Copyright (C) 2014 ARM Ltd.
+ *  All Rights Reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/compiler.h>
+#include <linux/of.h>
+
+#include <asm/processor.h>
+
+#define MAX_CACHE_LEVEL			7	/* Max 7 levels supported */
+/* Ctypen, bits[3(n - 1) + 2 : 3(n - 1)], for n = 1 to 7 */
+#define CLIDR_CTYPE_SHIFT(level)	(3 * (level - 1))
+#define CLIDR_CTYPE_MASK(level)		(7 << CLIDR_CTYPE_SHIFT(level))
+#define CLIDR_CTYPE(clidr, level)	\
+	(((clidr) & CLIDR_CTYPE_MASK(level)) >> CLIDR_CTYPE_SHIFT(level))
+
+static inline enum cache_type get_cache_type(int level)
+{
+	unsigned int clidr;
+
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	asm volatile ("mrs     %0, clidr_el1" : "=r" (clidr));
+	return CLIDR_CTYPE(clidr, level);
+}
+
+/*
+ * NumSets, bits[27:13] - (Number of sets in cache) - 1
+ * Associativity, bits[12:3] - (Associativity of cache) - 1
+ * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
+ */
+#define CCSIDR_WRITE_THROUGH	BIT(31)
+#define CCSIDR_WRITE_BACK	BIT(30)
+#define CCSIDR_READ_ALLOCATE	BIT(29)
+#define CCSIDR_WRITE_ALLOCATE	BIT(28)
+#define CCSIDR_LINESIZE_MASK	0x7
+#define CCSIDR_ASSOCIAT_SHIFT	3
+#define CCSIDR_ASSOCIAT_MASK	0x3FF
+#define CCSIDR_NUMSETS_SHIFT	13
+#define CCSIDR_NUMSETS_MASK	0x7FFF
+
+/*
+ * Which cache CCSIDR represents depends on CSSELR value
+ * Make sure no one else changes CSSELR during this
+ * smp_call_function_single prevents preemption for us
+ */
+static inline u32 get_ccsidr(u32 csselr)
+{
+	u32 ccsidr;
+
+	/* Put value into CSSELR */
+	asm volatile("msr csselr_el1, %x0" : : "r" (csselr));
+	isb();
+	/* Read result out of CCSIDR */
+	asm volatile("mrs %0, ccsidr_el1" : "=r" (ccsidr));
+
+	return ccsidr;
+}
+
+static void ci_leaf_init(struct cacheinfo *this_leaf,
+			 enum cache_type type, unsigned int level)
+{
+	bool is_instr_cache = type & CACHE_TYPE_INST;
+	u32 tmp = get_ccsidr((level - 1) << 1 | is_instr_cache);
+
+	this_leaf->level = level;
+	this_leaf->type = type;
+	this_leaf->coherency_line_size =
+	    (1 << ((tmp & CCSIDR_LINESIZE_MASK) + 2)) * 4;
+	this_leaf->number_of_sets =
+	    ((tmp >> CCSIDR_NUMSETS_SHIFT) & CCSIDR_NUMSETS_MASK) + 1;
+	this_leaf->ways_of_associativity =
+	    ((tmp >> CCSIDR_ASSOCIAT_SHIFT) & CCSIDR_ASSOCIAT_MASK) + 1;
+	this_leaf->size = this_leaf->number_of_sets *
+	    this_leaf->coherency_line_size * this_leaf->ways_of_associativity;
+	this_leaf->attributes =
+		((tmp & CCSIDR_WRITE_THROUGH) ? CACHE_WRITE_THROUGH : 0) |
+		((tmp & CCSIDR_WRITE_BACK) ? CACHE_WRITE_BACK : 0) |
+		((tmp & CCSIDR_READ_ALLOCATE) ? CACHE_READ_ALLOCATE : 0) |
+		((tmp & CCSIDR_WRITE_ALLOCATE) ? CACHE_WRITE_ALLOCATE : 0);
+}
+
+int init_cache_level(unsigned int cpu)
+{
+	unsigned int ctype, level = 1, leaves = 0;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	if (!this_cpu_ci)
+		return -EINVAL;
+
+	do {
+		ctype = get_cache_type(level);
+		if (ctype == CACHE_TYPE_NOCACHE)
+			break;
+		/* Separate instruction and data caches */
+		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+	} while (++level <= MAX_CACHE_LEVEL);
+
+	this_cpu_ci->num_levels = level - 1;
+	this_cpu_ci->num_leaves = leaves;
+	return 0;
+}
+
+int populate_cache_leaves(unsigned int cpu)
+{
+	unsigned int level, idx;
+	enum cache_type type;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+
+	for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
+	     idx < this_cpu_ci->num_leaves; idx++, level++) {
+		if (!this_leaf)
+			return -EINVAL;
+
+		type = get_cache_type(level);
+		if (type == CACHE_TYPE_SEPARATE) {
+			ci_leaf_init(this_leaf++, CACHE_TYPE_DATA, level);
+			ci_leaf_init(this_leaf++, CACHE_TYPE_INST, level);
+		} else {
+			ci_leaf_init(this_leaf++, type, level);
+		}
+	}
+	return 0;
+}
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ARM: kernel: add support for cpu cache information
  2014-06-25 17:30 [PATCH 0/9] drivers: cacheinfo support Sudeep Holla
  2014-06-25 17:30 ` [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs Sudeep Holla
  2014-06-25 17:30 ` [PATCH 7/9] ARM64: kernel: add support for cpu cache information Sudeep Holla
@ 2014-06-25 17:30 ` Sudeep Holla
  2014-06-25 22:33   ` Russell King - ARM Linux
  2014-06-26  0:19   ` Stephen Boyd
  2014-06-25 17:30 ` [PATCH 9/9] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
       [not found] ` <1406306692-7135-1-git-send-email-sudeep.holla@arm.com>
  4 siblings, 2 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-06-25 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This patch adds support for cacheinfo on ARM platforms.

On ARMv7, the cache hierarchy can be identified through the Cache Level ID
register (CLIDR), while the cache geometry is provided by the Cache Size ID
register (CCSIDR).

On architecture versions before ARMv7, CLIDR and CCSIDR are not
implemented. The cache type register (CTR) provides both the cache
hierarchy and the geometry, if it is implemented. For implementations
that don't support CTR, we keep a table that maps the CPU ID to the CTR
value the CPU would report if CTR were implemented, which keeps the
common code simple.

Since the architecture doesn't provide any way of detecting which CPUs
share a particular cache, the device tree is used for that purpose.
On non-DT platforms, first level caches are assumed to be per-CPU while
higher level caches are assumed to be system-wide.
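
As an illustration of the CPU-ID-to-CTR table mentioned above, an entry
would take the following form (both values are placeholders, not real
register values for any particular part):

	static struct ctr_info cache_ctr_list[] = {
		/* placeholder: { main ID register value, equivalent CTR } */
		{ .cpuid_id = 0x41000000, .ctr = 0x00000000 },
	};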

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm/kernel/Makefile    |   1 +
 arch/arm/kernel/cacheinfo.c | 229 ++++++++++++++++++++++++++++++++++++++++++++
 arch/arm/mm/Kconfig         |  13 +++
 3 files changed, 243 insertions(+)
 create mode 100644 arch/arm/kernel/cacheinfo.c

diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f..2c5ff0e 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -29,6 +29,7 @@ obj-y		+= entry-v7m.o v7m.o
 else
 obj-y		+= entry-armv.o
 endif
+obj-$(CONFIG_CPU_HAS_CACHE) += cacheinfo.o
 
 obj-$(CONFIG_OC_ETM)		+= etm.o
 obj-$(CONFIG_CPU_IDLE)		+= cpuidle.o
diff --git a/arch/arm/kernel/cacheinfo.c b/arch/arm/kernel/cacheinfo.c
new file mode 100644
index 0000000..ab70993
--- /dev/null
+++ b/arch/arm/kernel/cacheinfo.c
@@ -0,0 +1,229 @@
+/*
+ *  ARM cacheinfo support
+ *
+ *  Copyright (C) 2014 ARM Ltd.
+ *  All Rights Reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/compiler.h>
+#include <linux/of.h>
+
+#include <asm/cputype.h>
+#include <asm/processor.h>
+
+#if __LINUX_ARM_ARCH__ < 7 /* pre ARMv7 */
+
+#define MAX_CACHE_LEVEL		1	/* Only 1 level supported */
+#define CTR_CTYPE_SHIFT		24
+#define CTR_CTYPE_MASK		(1 << CTR_CTYPE_SHIFT)
+
+struct ctr_info {
+	unsigned int cpuid_id;
+	unsigned int ctr;
+};
+
+static struct ctr_info cache_ctr_list[] = {
+};
+
+static int get_unimplemented_ctr(unsigned int *ctr)
+{
+	int i, cpuid_id = read_cpuid_id();
+
+	for (i = 0; i < ARRAY_SIZE(cache_ctr_list); i++)
+		if (cache_ctr_list[i].cpuid_id == cpuid_id) {
+			*ctr = cache_ctr_list[i].ctr;
+			return 0;
+		}
+	return -ENOENT;
+}
+
+static unsigned int get_ctr(void)
+{
+	unsigned int ctr;
+
+	if (get_unimplemented_ctr(&ctr))
+		asm volatile ("mrc p15, 0, %0, c0, c0, 1" : "=r" (ctr));
+	return ctr;
+}
+
+static enum cache_type get_cache_type(int level)
+{
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	return get_ctr() & CTR_CTYPE_MASK ?
+		CACHE_TYPE_SEPARATE : CACHE_TYPE_UNIFIED;
+}
+
+/*
+ *  +---------------------------------+
+ *  | 9  8  7  6 | 5  4  3 | 2 | 1  0 |
+ *  +---------------------------------+
+ *  |    size    |  assoc  | m |  len |
+ *  +---------------------------------+
+ * linelen        = 1 << (len + 3)
+ * multiplier     = 2 + m
+ * nsets          = 1 << (size + 6 - assoc - len)
+ * associativity  = multiplier << (assoc - 1)
+ * cache_size     = multiplier << (size + 8)
+ */
+#define CTR_LINESIZE_MASK	0x3
+#define CTR_MULTIPLIER_SHIFT	2
+#define CTR_MULTIPLIER_MASK	0x1
+#define CTR_ASSOCIAT_SHIFT	3
+#define CTR_ASSOCIAT_MASK	0x7
+#define CTR_SIZE_SHIFT		6
+#define CTR_SIZE_MASK		0xF
+#define CTR_DCACHE_SHIFT	12
+
+static void __ci_leaf_init(enum cache_type type, struct cacheinfo *this_leaf)
+{
+	unsigned int size, multiplier, assoc, len, tmp = get_ctr();
+
+	if (type == CACHE_TYPE_DATA)
+		tmp >>= CTR_DCACHE_SHIFT;
+
+	len = tmp & CTR_LINESIZE_MASK;
+	size = (tmp >> CTR_SIZE_SHIFT) & CTR_SIZE_MASK;
+	assoc = (tmp >> CTR_ASSOCIAT_SHIFT) & CTR_ASSOCIAT_MASK;
+	multiplier = ((tmp >> CTR_MULTIPLIER_SHIFT) & CTR_MULTIPLIER_MASK) + 2;
+
+	this_leaf->type = type;
+	this_leaf->coherency_line_size = 1 << (len + 3);
+	this_leaf->number_of_sets = 1 << (size + 6 - assoc - len);
+	this_leaf->ways_of_associativity = multiplier << (assoc - 1);
+	this_leaf->size = multiplier << (size + 8);
+}
+
+#else /* ARMv7 */
+
+#define MAX_CACHE_LEVEL			7	/* Max 7 levels supported */
+/* Ctypen, bits[3(n - 1) + 2 : 3(n - 1)], for n = 1 to 7 */
+#define CLIDR_CTYPE_SHIFT(level)	(3 * (level - 1))
+#define CLIDR_CTYPE_MASK(level)		(7 << CLIDR_CTYPE_SHIFT(level))
+#define CLIDR_CTYPE(clidr, level)	\
+	(((clidr) & CLIDR_CTYPE_MASK(level)) >> CLIDR_CTYPE_SHIFT(level))
+
+static inline enum cache_type get_cache_type(int level)
+{
+	unsigned int clidr;
+
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	asm volatile ("mrc p15, 1, %0, c0, c0, 1" : "=r" (clidr));
+	return CLIDR_CTYPE(clidr, level);
+}
+
+/*
+ * NumSets, bits[27:13] - (Number of sets in cache) - 1
+ * Associativity, bits[12:3] - (Associativity of cache) - 1
+ * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
+ */
+#define CCSIDR_WRITE_THROUGH	BIT(31)
+#define CCSIDR_WRITE_BACK	BIT(30)
+#define CCSIDR_READ_ALLOCATE	BIT(29)
+#define CCSIDR_WRITE_ALLOCATE	BIT(28)
+#define CCSIDR_LINESIZE_MASK	0x7
+#define CCSIDR_ASSOCIAT_SHIFT	3
+#define CCSIDR_ASSOCIAT_MASK	0x3FF
+#define CCSIDR_NUMSETS_SHIFT	13
+#define CCSIDR_NUMSETS_MASK	0x7FFF
+
+/*
+ * Which cache CCSIDR represents depends on CSSELR value
+ * Make sure no one else changes CSSELR during this
+ * smp_call_function_single prevents preemption for us
+ */
+static inline u32 get_ccsidr(u32 csselr)
+{
+	u32 ccsidr;
+
+	/* Put value into CSSELR */
+	asm volatile ("mcr p15, 2, %0, c0, c0, 0" : : "r" (csselr));
+	isb();
+	/* Read result out of CCSIDR */
+	asm volatile ("mrc p15, 1, %0, c0, c0, 0" : "=r" (ccsidr));
+
+	return ccsidr;
+}
+
+static void __ci_leaf_init(enum cache_type type, struct cacheinfo *this_leaf)
+{
+	bool is_instr_cache = type & CACHE_TYPE_INST;
+	u32 tmp = get_ccsidr((this_leaf->level - 1) << 1 | is_instr_cache);
+
+	this_leaf->type = type;
+	this_leaf->coherency_line_size =
+	    (1 << ((tmp & CCSIDR_LINESIZE_MASK) + 2)) * 4;
+	this_leaf->number_of_sets =
+	    ((tmp >> CCSIDR_NUMSETS_SHIFT) & CCSIDR_NUMSETS_MASK) + 1;
+	this_leaf->ways_of_associativity =
+	    ((tmp >> CCSIDR_ASSOCIAT_SHIFT) & CCSIDR_ASSOCIAT_MASK) + 1;
+	this_leaf->size = this_leaf->number_of_sets *
+	    this_leaf->coherency_line_size * this_leaf->ways_of_associativity;
+	this_leaf->attributes =
+		((tmp & CCSIDR_WRITE_THROUGH) ? CACHE_WRITE_THROUGH : 0) |
+		((tmp & CCSIDR_WRITE_BACK) ? CACHE_WRITE_BACK : 0) |
+		((tmp & CCSIDR_READ_ALLOCATE) ? CACHE_READ_ALLOCATE : 0) |
+		((tmp & CCSIDR_WRITE_ALLOCATE) ? CACHE_WRITE_ALLOCATE : 0);
+}
+
+#endif
+
+static void ci_leaf_init(struct cacheinfo *this_leaf,
+			 enum cache_type type, unsigned int level)
+{
+	this_leaf->level = level;
+	__ci_leaf_init(type, this_leaf);
+}
+
+int init_cache_level(unsigned int cpu)
+{
+	unsigned int ctype, level = 1, leaves = 0;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	if (!this_cpu_ci)
+		return -EINVAL;
+
+	do {
+		ctype = get_cache_type(level);
+		if (ctype == CACHE_TYPE_NOCACHE)
+			break;
+		/* Separate instruction and data caches */
+		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+	} while (++level <= MAX_CACHE_LEVEL);
+
+	this_cpu_ci->num_levels = level - 1;
+	this_cpu_ci->num_leaves = leaves;
+
+	return 0;
+}
+
+int populate_cache_leaves(unsigned int cpu)
+{
+	unsigned int level, idx;
+	enum cache_type type;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+
+	for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
+	     idx < this_cpu_ci->num_leaves; idx++, level++) {
+		if (!this_leaf)
+			return -EINVAL;
+
+		type = get_cache_type(level);
+		if (type == CACHE_TYPE_SEPARATE) {
+			ci_leaf_init(this_leaf++, CACHE_TYPE_DATA, level);
+			ci_leaf_init(this_leaf++, CACHE_TYPE_INST, level);
+		} else {
+			ci_leaf_init(this_leaf++, type, level);
+		}
+	}
+	return 0;
+}
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index eda0dd0..fac8646 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -494,30 +494,42 @@ config CPU_PABRT_V7
 # The cache model
 config CPU_CACHE_V4
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V4WT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V4WB
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V6
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V7
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_NOP
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_VIVT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_VIPT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_FA
 	bool
+	select CPU_HAS_CACHE
+
+config CPU_HAS_CACHE
+	bool
 
 if MMU
 # The copy-page model
@@ -845,6 +857,7 @@ config DMA_CACHE_RWFO
 
 config OUTER_CACHE
 	bool
+	select CPU_HAS_CACHE
 
 config OUTER_CACHE_SYNC
 	bool
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 9/9] ARM: kernel: add outer cache support for cacheinfo implementation
  2014-06-25 17:30 [PATCH 0/9] drivers: cacheinfo support Sudeep Holla
                   ` (2 preceding siblings ...)
  2014-06-25 17:30 ` [PATCH 8/9] ARM: " Sudeep Holla
@ 2014-06-25 17:30 ` Sudeep Holla
  2014-06-25 22:37   ` Russell King - ARM Linux
       [not found] ` <1406306692-7135-1-git-send-email-sudeep.holla@arm.com>
  4 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2014-06-25 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

In order to support outer caches in the cacheinfo infrastructure, a new
callback 'get_info' is added to outer_cache_fns. It is used to retrieve
the outer cache geometry, namely the line size, the number of ways of
associativity and the number of sets.

This patch adds 'get_info' support to all L2 cache implementations on
ARM except Marvell's Feroceon L2 cache.
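
A hypothetical outer-cache driver would wire this up roughly as follows
(sketch only, with made-up geometry; not taken from any of the drivers
touched below):

	static void foo_l2_get_info(struct outer_cache_info *info)
	{
		info->line_size = 32;	/* example values only */
		info->num_ways  = 8;
		info->num_sets  = 1024;
	}

	/* ...and during init, after the rest of outer_cache is set up: */
	outer_cache.get_info = foo_l2_get_info;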

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm/include/asm/outercache.h | 13 +++++++++++++
 arch/arm/kernel/cacheinfo.c       | 22 +++++++++++++++++++++-
 arch/arm/mm/cache-l2x0.c          | 10 ++++++++++
 arch/arm/mm/cache-tauros2.c       | 34 ++++++++++++++++++++++++++++++++++
 arch/arm/mm/cache-xsc3l2.c        | 15 +++++++++++++++
 5 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index 891a56b..991cf63 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -23,7 +23,14 @@
 
 #include <linux/types.h>
 
+struct outer_cache_info {
+	unsigned int num_ways;
+	unsigned int num_sets;
+	unsigned int line_size;
+};
+
 struct outer_cache_fns {
+	void (*get_info)(struct outer_cache_info *info);
 	void (*inv_range)(unsigned long, unsigned long);
 	void (*clean_range)(unsigned long, unsigned long);
 	void (*flush_range)(unsigned long, unsigned long);
@@ -112,6 +119,11 @@ static inline void outer_resume(void)
 		outer_cache.resume();
 }
 
+static inline void outer_get_info(struct outer_cache_info *info)
+{
+	if (outer_cache.get_info)
+		outer_cache.get_info(info);
+}
 #else
 
 static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
@@ -123,6 +135,7 @@ static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 static inline void outer_flush_all(void) { }
 static inline void outer_disable(void) { }
 static inline void outer_resume(void) { }
+static inline void outer_get_info(struct outer_cache_info *info) { }
 
 #endif
 
diff --git a/arch/arm/kernel/cacheinfo.c b/arch/arm/kernel/cacheinfo.c
index ab70993..88b552b 100644
--- a/arch/arm/kernel/cacheinfo.c
+++ b/arch/arm/kernel/cacheinfo.c
@@ -16,6 +16,7 @@
 #include <linux/of.h>
 
 #include <asm/cputype.h>
+#include <asm/outercache.h>
 #include <asm/processor.h>
 
 #if __LINUX_ARM_ARCH__ < 7 /* pre ARMv7 */
@@ -176,11 +177,27 @@ static void __ci_leaf_init(enum cache_type type, struct cacheinfo *this_leaf)
 
 #endif
 
+static void __outer_ci_leaf_init(struct cacheinfo *this_leaf)
+{
+	struct outer_cache_info info;
+
+	outer_get_info(&info);
+
+	this_leaf->type = CACHE_TYPE_UNIFIED;/* record it as Unified */
+	this_leaf->ways_of_associativity = info.num_ways;
+	this_leaf->number_of_sets = info.num_sets;
+	this_leaf->coherency_line_size = info.line_size;
+	this_leaf->size = info.num_ways * info.num_sets * info.line_size;
+}
+
 static void ci_leaf_init(struct cacheinfo *this_leaf,
 			 enum cache_type type, unsigned int level)
 {
 	this_leaf->level = level;
-	__ci_leaf_init(type, this_leaf);
+	if (type == CACHE_TYPE_NOCACHE)	/* must be outer cache */
+		__outer_ci_leaf_init(this_leaf);
+	else
+		__ci_leaf_init(type, this_leaf);
 }
 
 int init_cache_level(unsigned int cpu)
@@ -202,6 +219,9 @@ int init_cache_level(unsigned int cpu)
 	this_cpu_ci->num_levels = level - 1;
 	this_cpu_ci->num_leaves = leaves;
 
+	if (IS_ENABLED(CONFIG_OUTER_CACHE) && outer_cache.get_info)
+		this_cpu_ci->num_leaves++, this_cpu_ci->num_levels++;
+
 	return 0;
 }
 
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index efc5cab..30ca151 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -105,6 +105,15 @@ static inline void l2c_unlock(void __iomem *base, unsigned num)
 	}
 }
 
+static void l2x0_getinfo(struct outer_cache_info *info)
+{
+	if (!info)
+		return;
+	info->num_ways = get_count_order(l2x0_way_mask);
+	info->line_size = CACHE_LINE_SIZE;
+	info->num_sets = l2x0_size / (info->num_ways * CACHE_LINE_SIZE);
+}
+
 /*
  * Enable the L2 cache controller.  This function must only be
  * called when the cache controller is known to be disabled.
@@ -894,6 +903,7 @@ static void __init __l2c_init(const struct l2c_init_data *data,
 		data->enable(l2x0_base, aux, data->num_lock);
 
 	outer_cache = fns;
+	outer_cache.get_info = l2x0_getinfo;
 
 	/*
 	 * It is strange to save the register state before initialisation,
diff --git a/arch/arm/mm/cache-tauros2.c b/arch/arm/mm/cache-tauros2.c
index b273739..8708684 100644
--- a/arch/arm/mm/cache-tauros2.c
+++ b/arch/arm/mm/cache-tauros2.c
@@ -60,6 +60,7 @@ static inline void tauros2_inv_pa(unsigned long addr)
  * noninclusive.
  */
 #define CACHE_LINE_SIZE		32
+#define CACHE_LINE_SHIFT	5
 
 static void tauros2_inv_range(unsigned long start, unsigned long end)
 {
@@ -131,6 +132,38 @@ static void tauros2_resume(void)
 	"mcr	p15, 0, %0, c1, c0, 0 @Enable L2 Cache\n\t"
 	: : "r" (0x0));
 }
+
+/*
+ *  +----------------------------------------+
+ *  | 11 10 9  8 | 7  6  5  4  3 | 2 |  1  0 |
+ *  +----------------------------------------+
+ *  |  way size  | associativity | - |line_sz|
+ *  +----------------------------------------+
+ */
+#define L2CTR_ASSOCIAT_SHIFT	3
+#define L2CTR_ASSOCIAT_MASK	0x1F
+#define L2CTR_WAYSIZE_SHIFT	8
+#define L2CTR_WAYSIZE_MASK	0xF
+#define CACHE_WAY_PER_SET(l2ctr)	\
+	(((l2ctr) >> L2CTR_ASSOCIAT_SHIFT) & L2CTR_ASSOCIAT_MASK)
+#define CACHE_WAY_SIZE(l2ctr)		\
+	(8192 << (((l2ctr) >> L2CTR_WAYSIZE_SHIFT) & L2CTR_WAYSIZE_MASK))
+#define CACHE_SET_SIZE(l2ctr)	(CACHE_WAY_SIZE(l2ctr) >> CACHE_LINE_SHIFT)
+
+static void tauros2_getinfo(struct outer_cache_info *info)
+{
+	unsigned int l2_ctr;
+
+	if (!info)
+		return;
+
+	__asm__("mrc p15, 1, %0, c0, c0, 1" : "=r" (l2_ctr));
+
+	info->line_size = CACHE_LINE_SIZE;
+	info->num_ways = CACHE_WAY_PER_SET(l2_ctr);
+	info->num_sets = CACHE_SET_SIZE(l2_ctr);
+}
+
 #endif
 
 static inline u32 __init read_extra_features(void)
@@ -226,6 +259,7 @@ static void __init tauros2_internal_init(unsigned int features)
 		outer_cache.flush_range = tauros2_flush_range;
 		outer_cache.disable = tauros2_disable;
 		outer_cache.resume = tauros2_resume;
+		outer_cache.get_info = tauros2_getinfo;
 	}
 #endif
 
diff --git a/arch/arm/mm/cache-xsc3l2.c b/arch/arm/mm/cache-xsc3l2.c
index 6c3edeb..353c642 100644
--- a/arch/arm/mm/cache-xsc3l2.c
+++ b/arch/arm/mm/cache-xsc3l2.c
@@ -201,6 +201,20 @@ static void xsc3_l2_flush_range(unsigned long start, unsigned long end)
 	dsb();
 }
 
+static void xsc3_l2_getinfo(struct outer_cache_info *info)
+{
+	unsigned long l2ctype;
+
+	if (!info)
+		return;
+
+	__asm__("mrc p15, 1, %0, c0, c0, 1" : "=r" (l2ctype));
+
+	info->num_ways = CACHE_WAY_PER_SET;
+	info->line_size = CACHE_LINE_SIZE;
+	info->num_sets = CACHE_SET_SIZE(l2ctype);
+}
+
 static int __init xsc3_l2_init(void)
 {
 	if (!cpu_is_xsc3() || !xsc3_l2_present())
@@ -213,6 +227,7 @@ static int __init xsc3_l2_init(void)
 		outer_cache.inv_range = xsc3_l2_inv_range;
 		outer_cache.clean_range = xsc3_l2_clean_range;
 		outer_cache.flush_range = xsc3_l2_flush_range;
+		outer_cache.get_info    = xsc3_l2_getinfo;
 	}
 
 	return 0;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs
  2014-06-25 17:30 ` [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs Sudeep Holla
@ 2014-06-25 22:23   ` Russell King - ARM Linux
  2014-06-26 18:41     ` Sudeep Holla
  2014-07-10  0:09   ` Greg Kroah-Hartman
  1 sibling, 1 reply; 29+ messages in thread
From: Russell King - ARM Linux @ 2014-06-25 22:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:
> +		coherency_line_size: the minimum amount of data that gets transferred

So, what value do you envision this taking for a CPU where the cache
line size is 32 bytes, but each cache line has two dirty bits which
allow it to only evict either the upper or lower 16 bytes depending
on which are dirty?

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ARM: kernel: add support for cpu cache information
  2014-06-25 17:30 ` [PATCH 8/9] ARM: " Sudeep Holla
@ 2014-06-25 22:33   ` Russell King - ARM Linux
  2014-06-26 11:33     ` Sudeep Holla
  2014-06-26  0:19   ` Stephen Boyd
  1 sibling, 1 reply; 29+ messages in thread
From: Russell King - ARM Linux @ 2014-06-25 22:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 25, 2014 at 06:30:43PM +0100, Sudeep Holla wrote:
> diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
> index 38ddd9f..2c5ff0e 100644
> --- a/arch/arm/kernel/Makefile
> +++ b/arch/arm/kernel/Makefile
> @@ -29,6 +29,7 @@ obj-y		+= entry-v7m.o v7m.o
>  else
>  obj-y		+= entry-armv.o
>  endif
> +obj-$(CONFIG_CPU_HAS_CACHE) += cacheinfo.o
>  
>  obj-$(CONFIG_OC_ETM)		+= etm.o
>  obj-$(CONFIG_CPU_IDLE)		+= cpuidle.o
> diff --git a/arch/arm/kernel/cacheinfo.c b/arch/arm/kernel/cacheinfo.c
> new file mode 100644
> index 0000000..ab70993
> --- /dev/null
> +++ b/arch/arm/kernel/cacheinfo.c
> @@ -0,0 +1,229 @@
> +/*
> + *  ARM cacheinfo support
> + *
> + *  Copyright (C) 2014 ARM Ltd.
> + *  All Rights Reserved
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/bitops.h>
> +#include <linux/cacheinfo.h>
> +#include <linux/cpu.h>
> +#include <linux/compiler.h>
> +#include <linux/of.h>
> +
> +#include <asm/cputype.h>
> +#include <asm/processor.h>
> +
> +#if __LINUX_ARM_ARCH__ < 7 /* pre ARMv7 */

__LINUX_ARM_ARCH__ defines the minimum architecture version we are building
for - we may support later versions than the architecture version denoted
by this symbol.  It does not define which CPUs we are building for.  Are
you sure that this is correct here?  What if we build a kernel supporting
both v6 + v7, as the OMAP guys do?

> +
> +#define MAX_CACHE_LEVEL		1	/* Only 1 level supported */
> +#define CTR_CTYPE_SHIFT		24
> +#define CTR_CTYPE_MASK		(1 << CTR_CTYPE_SHIFT)
> +
> +struct ctr_info {
> +	unsigned int cpuid_id;
> +	unsigned int ctr;
> +};
> +
> +static struct ctr_info cache_ctr_list[] = {
> +};

This list needs to be populated.  Early CPUs (such as StrongARM) do not
have the CTR register.

> +static int get_unimplemented_ctr(unsigned int *ctr)
> +{
> +	int i, cpuid_id = read_cpuid_id();
> +
> +	for (i = 0; i < ARRAY_SIZE(cache_ctr_list); i++)
> +		if (cache_ctr_list[i].cpuid_id == cpuid_id) {
> +			*ctr = cache_ctr_list[i].ctr;
> +			return 0;
> +		}
> +	return -ENOENT;
> +}
> +
> +static unsigned int get_ctr(void)
> +{
> +	unsigned int ctr;
> +
> +	if (get_unimplemented_ctr(&ctr))
> +		asm volatile ("mrc p15, 0, %0, c0, c0, 1" : "=r" (ctr));

read_cpuid_cachetype() ?

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 9/9] ARM: kernel: add outer cache support for cacheinfo implementation
  2014-06-25 17:30 ` [PATCH 9/9] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
@ 2014-06-25 22:37   ` Russell King - ARM Linux
  2014-06-26 13:02     ` Sudeep Holla
  0 siblings, 1 reply; 29+ messages in thread
From: Russell King - ARM Linux @ 2014-06-25 22:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 25, 2014 at 06:30:44PM +0100, Sudeep Holla wrote:
> diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
> index efc5cab..30ca151 100644
> --- a/arch/arm/mm/cache-l2x0.c
> +++ b/arch/arm/mm/cache-l2x0.c
> @@ -105,6 +105,15 @@ static inline void l2c_unlock(void __iomem *base, unsigned num)
>  	}
>  }
>  
> +static void l2x0_getinfo(struct outer_cache_info *info)
> +{
> +	if (!info)
> +		return;

Pointless NULL test.  If someone passes NULL to this function (which
you never do in this file) then we want to know about it because _that_
is a kernel bug - it is invalid to pass NULL.  Hence the kernel should
oops.

Please, don't go around adding stupid NULL tests for conditions which
should _never_ happen; instead, rely on the kernel to oops if these
invalid conditions occur.  That's why we produce a backtrace from such
events, to allow invalid conditions to be debugged and fixed.

Having stuff silently ignored in this way does not detect these bugs, so
they go unnoticed.

Take a moment to read some of the fs/ or kernel/ code, and you'll find
a lack of NULL checks in there.  That's what gives that code performance,
because it's not spending its time doing loads of useless NULL checks.

> @@ -894,6 +903,7 @@ static void __init __l2c_init(const struct l2c_init_data *data,
>  		data->enable(l2x0_base, aux, data->num_lock);
>  
>  	outer_cache = fns;
> +	outer_cache.get_info = l2x0_getinfo;

NAK.  Think about it.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ARM: kernel: add support for cpu cache information
  2014-06-25 17:30 ` [PATCH 8/9] ARM: " Sudeep Holla
  2014-06-25 22:33   ` Russell King - ARM Linux
@ 2014-06-26  0:19   ` Stephen Boyd
  2014-06-26 11:36     ` Sudeep Holla
  1 sibling, 1 reply; 29+ messages in thread
From: Stephen Boyd @ 2014-06-26  0:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/25/14 10:30, Sudeep Holla wrote:
> +
> +/*
> + * Which cache CCSIDR represents depends on CSSELR value
> + * Make sure no one else changes CSSELR during this
> + * smp_call_function_single prevents preemption for us
> + */

Where's the smp_call_function_single() or preemption disable happening?

> +static inline u32 get_ccsidr(u32 csselr)
> +{
> +	u32 ccsidr;
> +
> +	/* Put value into CSSELR */
> +	asm volatile ("mcr p15, 2, %0, c0, c0, 0" : : "r" (csselr));
> +	isb();
> +	/* Read result out of CCSIDR */
> +	asm volatile ("mrc p15, 1, %0, c0, c0, 0" : "=r" (ccsidr));
> +
> +	return ccsidr;
> +}
> +

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ARM: kernel: add support for cpu cache information
  2014-06-25 22:33   ` Russell King - ARM Linux
@ 2014-06-26 11:33     ` Sudeep Holla
  0 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-06-26 11:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

Thanks for the reviews.

On 25/06/14 23:33, Russell King - ARM Linux wrote:
> On Wed, Jun 25, 2014 at 06:30:43PM +0100, Sudeep Holla wrote:
[...]
>> +
>> +#include <linux/bitops.h>
>> +#include <linux/cacheinfo.h>
>> +#include <linux/cpu.h>
>> +#include <linux/compiler.h>
>> +#include <linux/of.h>
>> +
>> +#include <asm/cputype.h>
>> +#include <asm/processor.h>
>> +
>> +#if __LINUX_ARM_ARCH__ < 7 /* pre ARMv7 */
>
> __LINUX_ARM_ARCH__ defines the minimum architecture version we are building
> for - we may support later versions than the architecture version denoted
> by this symbol.  It does not define which CPUs we are building for.  Are
> you sure that this is correct here?  What if we build a kernel supporting
> both v6 + v7, as the OMAP guys do?
>

You are right, I had not considered v6 + v7; I will use cpu_architecture() and
make the check runtime.
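
Roughly something like this (untested sketch, helper names are just
placeholders):

	#define cache_is_armv7()	(cpu_architecture() >= CPU_ARCH_ARMv7)

	static enum cache_type get_cache_type(int level)
	{
		if (cache_is_armv7())
			return armv7_get_cache_type(level);	/* CLIDR based */
		return legacy_get_cache_type(level);		/* CTR based */
	}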

>> +
>> +#define MAX_CACHE_LEVEL		1	/* Only 1 level supported */
>> +#define CTR_CTYPE_SHIFT		24
>> +#define CTR_CTYPE_MASK		(1 << CTR_CTYPE_SHIFT)
>> +
>> +struct ctr_info {
>> +	unsigned int cpuid_id;
>> +	unsigned int ctr;
>> +};
>> +
>> +static struct ctr_info cache_ctr_list[] = {
>> +};
>
> This list needs to be populated.  Early CPUs (such as StrongARM) do not
> have the CTR register.
>

Right, since I didn't have the list, I left it empty. I will compile the list
soon, but I need your help. The list of StrongARM parts I can come up with is:
1. SA-110
2. SA-1100
3. SA-1110
4. SA-1500 (grep didn't show this in the kernel, not sure if it's supported)

I also have to find all the other ARMv4 implementations that don't have the CTR.

>> +static int get_unimplemented_ctr(unsigned int *ctr)
>> +{
>> +	int i, cpuid_id = read_cpuid_id();
>> +
>> +	for (i = 0; i < ARRAY_SIZE(cache_ctr_list); i++)
>> +		if (cache_ctr_list[i].cpuid_id == cpuid_id) {
>> +			*ctr = cache_ctr_list[i].ctr;
>> +			return 0;
>> +		}
>> +	return -ENOENT;
>> +}
>> +
>> +static unsigned int get_ctr(void)
>> +{
>> +	unsigned int ctr;
>> +
>> +	if (get_unimplemented_ctr(&ctr))
>> +		asm volatile ("mrc p15, 0, %0, c0, c0, 1" : "=r" (ctr));
>
> read_cpuid_cachetype() ?
>
Ah, I missed that; I will use it.

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ARM: kernel: add support for cpu cache information
  2014-06-26  0:19   ` Stephen Boyd
@ 2014-06-26 11:36     ` Sudeep Holla
  2014-06-26 18:45       ` Stephen Boyd
  0 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2014-06-26 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Stephen,

On 26/06/14 01:19, Stephen Boyd wrote:
> On 06/25/14 10:30, Sudeep Holla wrote:
>> +
>> +/*
>> + * Which cache CCSIDR represents depends on CSSELR value
>> + * Make sure no one else changes CSSELR during this
>> + * smp_call_function_single prevents preemption for us
>> + */
>
> Where's the smp_call_function_single() or preemption disable happening?
>

init_cache_level is called using smp_call_function_single in
drivers/base/cacheinfo.c(PATCH 2/9)
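
Roughly, the generic code there does something like this (paraphrased,
not a verbatim quote of the patch):

	static void _init_cache_level(void *ret)
	{
		*(int *)ret = init_cache_level(smp_processor_id());
	}

	/* runs the arch callback on the target cpu with preemption disabled */
	smp_call_function_single(cpu, _init_cache_level, &ret, true);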

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 9/9] ARM: kernel: add outer cache support for cacheinfo implementation
  2014-06-25 22:37   ` Russell King - ARM Linux
@ 2014-06-26 13:02     ` Sudeep Holla
  0 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-06-26 13:02 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On 25/06/14 23:37, Russell King - ARM Linux wrote:
> On Wed, Jun 25, 2014 at 06:30:44PM +0100, Sudeep Holla wrote:
>> diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
>> index efc5cab..30ca151 100644
>> --- a/arch/arm/mm/cache-l2x0.c
>> +++ b/arch/arm/mm/cache-l2x0.c
>> @@ -105,6 +105,15 @@ static inline void l2c_unlock(void __iomem *base, unsigned num)
>>   	}
>>   }
>>
>> +static void l2x0_getinfo(struct outer_cache_info *info)
>> +{
>> +	if (!info)
>> +		return;
>
> Pointless NULL test.  If someone passes NULL to this function (which
> you never do in this file) then we want to know about it because _that_
> is a kernel bug - it is invalid to pass NULL.  Hence the kernel should
> oops.
>
> Please, don't go around adding stupid NULL tests for conditions which
> should _never_ happen, instead, rely on the kernel to oops if these
> invalid conditions occur.  That's why we produce a backtrace from such
> events, to allow invalid conditions to be debugged and fixed.
>
> Having stuff silently ignored in this way does not detect these bugs, so
> they go unnoticed.
>
> Take a moment to read some of the fs/ or kernel/ code, and you'll find
> a lack of NULL checks in there.  That's what gives that code performance,
> because it's not spending its time doing loads of useless NULL checks.
>

Understood, will get rid of it.

>> @@ -894,6 +903,7 @@ static void __init __l2c_init(const struct l2c_init_data *data,
>>   		data->enable(l2x0_base, aux, data->num_lock);
>>
>>   	outer_cache = fns;
>> +	outer_cache.get_info = l2x0_getinfo;
>
> NAK.  Think about it.
>

Ah, I will specify it in l2c_init_data for the individual implementations so
that fixups of get_info are possible if needed. Sorry for missing this.
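
I.e. roughly (sketch, untested):

	static const struct l2c_init_data l2c210_data __initconst = {
		...
		.outer_cache = {
			...
			.get_info = l2x0_getinfo,
		},
	};

so that __l2c_init() picks it up via "outer_cache = fns" and any fixup
can still override get_info if needed.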

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs
  2014-06-25 22:23   ` Russell King - ARM Linux
@ 2014-06-26 18:41     ` Sudeep Holla
  2014-06-26 18:50       ` Russell King - ARM Linux
  0 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2014-06-26 18:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 25/06/14 23:23, Russell King - ARM Linux wrote:
> On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:
>> +		coherency_line_size: the minimum amount of data that gets transferred
>
> So, what value do you envision this taking for a CPU where the cache
> line size is 32 bytes, but each cache line has two dirty bits which
> allow it to only evict either the upper or lower 16 bytes depending
> on which are dirty?
>

IIUC most of the existing implementations of cacheinfo on various architectures
represent the cache line size as coherency_line_size, in which case I need to
fix the definition in this file.

BTW, will there be any architectural way of finding such a configuration?

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ARM: kernel: add support for cpu cache information
  2014-06-26 11:36     ` Sudeep Holla
@ 2014-06-26 18:45       ` Stephen Boyd
  2014-06-27  9:38         ` Sudeep Holla
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Boyd @ 2014-06-26 18:45 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/26/14 04:36, Sudeep Holla wrote:
> Hi Stephen,
>
> On 26/06/14 01:19, Stephen Boyd wrote:
>> On 06/25/14 10:30, Sudeep Holla wrote:
>>> +
>>> +/*
>>> + * Which cache CCSIDR represents depends on CSSELR value
>>> + * Make sure no one else changes CSSELR during this
>>> + * smp_call_function_single prevents preemption for us
>>> + */
>>
>> Where's the smp_call_function_single() or preemption disable happening?
>>
>
> init_cache_level is called using smp_call_function_single in
> drivers/base/cacheinfo.c(PATCH 2/9)

Oh that's unexpected. Do other architectures require the use of
smp_call_function_single() to read their cache information? It seems
like an ARM architecture specific detail that has been pushed up into
the generic layer.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs
  2014-06-26 18:41     ` Sudeep Holla
@ 2014-06-26 18:50       ` Russell King - ARM Linux
  2014-06-26 19:03         ` Sudeep Holla
  0 siblings, 1 reply; 29+ messages in thread
From: Russell King - ARM Linux @ 2014-06-26 18:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 26, 2014 at 07:41:32PM +0100, Sudeep Holla wrote:
> Hi,
>
> On 25/06/14 23:23, Russell King - ARM Linux wrote:
>> On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:
>>> +		coherency_line_size: the minimum amount of data that gets transferred
>>
>> So, what value do you envision this taking for a CPU where the cache
>> line size is 32 bytes, but each cache line has two dirty bits which
>> allow it to only evict either the upper or lower 16 bytes depending
>> on which are dirty?
>>
>
> IIUC most of the existing implementations of cacheinfo on various architectures
> represent the cache line size as coherency_line_size, in which case I need to
> fix the definition in this file.

As an example, here's an extract from the SA110 TRM:

StrongARM contains a 16KByte writeback data cache. The DC has 512 lines
of 32 bytes (8 words), arranged as a 32 way set associative cache, and
uses the virtual addresses generated by the processor. A line also
contains the physical address the block was fetched from and two dirty
bits. There is a dirty bit associated with both the first and second
half of the block. When a store hits in the cache the dirty bit
associated with it is set. When a block is evicted from the cache the
dirty bits are used to decide if all, half, or none of the block will
be written back to memory using the physical address stored with the
block. The DC is always reloaded a line at a time (8 words).

> BTW, will there be any architectural way of finding such a configuration?

Not that I know of.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs
  2014-06-26 18:50       ` Russell King - ARM Linux
@ 2014-06-26 19:03         ` Sudeep Holla
  0 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-06-26 19:03 UTC (permalink / raw)
  To: linux-arm-kernel



On 26/06/14 19:50, Russell King - ARM Linux wrote:
> On Thu, Jun 26, 2014 at 07:41:32PM +0100, Sudeep Holla wrote:
>> Hi,
>>
>> On 25/06/14 23:23, Russell King - ARM Linux wrote:
>>> On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:
>>>> +		coherency_line_size: the minimum amount of data that gets transferred
>>>
>>> So, what value do you envision this taking for a CPU where the cache
>>> line size is 32 bytes, but each cache line has two dirty bits which
>>> allow it to only evict either the upper or lower 16 bytes depending
>>> on which are dirty?
>>>
>>
>> IIUC most of the existing implementations of cacheinfo on various architectures
>> represent the cache line size as coherency_line_size, in which case I need to
>> fix the definition in this file.
>
> As an example, here's an extract from the SA110 TRM:
>
> StrongARM contains a 16KByte writeback data cache. The DC has 512 lines
> of 32 bytes (8 words), arranged as a 32 way set associative cache, and
> uses the virtual addresses generated by the processor. A line also
> contains the physical address the block was fetched from and two dirty
> bits. There is a dirty bit associated with both the first and second
> half of the block. When a store hits in the cache the dirty bit
> associated with it is set. When a block is evicted from the cache the
> dirty bits are used to decide if all, half, or none of the block will
> be written back to memory using the physical address stored with the
> block. The DC is always reloaded a line at a time (8 words).
>

Thanks for the information. It's interesting that the line is referred to as a
block when describing the two dirty bits. I am not sure if this can be mapped
to physical_line_partition = 2. Thoughts?

>> BTW, will there be any architectural way of finding such a configuration?
>
> Not that I know of.
>
That's bad :)

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ARM: kernel: add support for cpu cache information
  2014-06-26 18:45       ` Stephen Boyd
@ 2014-06-27  9:38         ` Sudeep Holla
  0 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-06-27  9:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 26/06/14 19:45, Stephen Boyd wrote:
> On 06/26/14 04:36, Sudeep Holla wrote:
>> Hi Stephen,
>>
>> On 26/06/14 01:19, Stephen Boyd wrote:
>>> On 06/25/14 10:30, Sudeep Holla wrote:
>>>> +
>>>> +/*
>>>> + * Which cache CCSIDR represents depends on CSSELR value
>>>> + * Make sure no one else changes CSSELR during this
>>>> + * smp_call_function_single prevents preemption for us
>>>> + */
>>>
>>> Where's the smp_call_function_single() or preemption disable happening?
>>>
>>
>> init_cache_level is called using smp_call_function_single in
>> drivers/base/cacheinfo.c(PATCH 2/9)
>
> Oh that's unexpected. Do other architectures require the use of
> smp_call_function_single() to read their cache information? It seems
> like an ARM architecture specific detail that has been pushed up into
> the generic layer.
>

Right, since I started with x86 as the reference and it requires it, I missed
checking the others. So x86 and ARM{32,64} require it, while ppc, s390 and ia64
don't. I will see how to fix that. Thanks for spotting it.
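
One option is to keep smp_call_function_single() out of the generic code
and let the architectures that need it wrap their callbacks themselves,
e.g. with a helper macro (rough sketch, untested):

	#define DEFINE_SMP_CALL_FUNCTION(func)				\
	static void _##func(void *ret)					\
	{								\
		*(int *)ret = __##func(smp_processor_id());		\
	}								\
	int func(unsigned int cpu)					\
	{								\
		int ret;						\
		smp_call_function_single(cpu, _##func, &ret, true);	\
		return ret;						\
	}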

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 7/9] ARM64: kernel: add support for cpu cache information
  2014-06-25 17:30 ` [PATCH 7/9] ARM64: kernel: add support for cpu cache information Sudeep Holla
@ 2014-06-27 10:36   ` Mark Rutland
  2014-06-27 11:22     ` Sudeep Holla
  0 siblings, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2014-06-27 10:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Sudeep,

On Wed, Jun 25, 2014 at 06:30:42PM +0100, Sudeep Holla wrote:
> From: Sudeep Holla <sudeep.holla@arm.com>
> 
> This patch adds support for cacheinfo on ARM64.
> 
> On ARMv8, the cache hierarchy can be identified through Cache Level ID
> (CLIDR) register while the cache geometry is provided by Cache Size ID
> (CCSIDR) register.
> 
> Since the architecture doesn't provide any way of detecting the cpus
> sharing particular cache, device tree is used for the same purpose.
> 
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: linux-arm-kernel at lists.infradead.org
> ---
>  arch/arm64/kernel/Makefile    |   3 +-
>  arch/arm64/kernel/cacheinfo.c | 135 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 137 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kernel/cacheinfo.c

[...]

> +static inline enum cache_type get_cache_type(int level)
> +{
> +	unsigned int clidr;
> +
> +	if (level > MAX_CACHE_LEVEL)
> +		return CACHE_TYPE_NOCACHE;
> +	asm volatile ("mrs     %0, clidr_el1" : "=r" (clidr));

Can't that allocate a w register?

You can make clidr a u64 to avoid that.

> +	return CLIDR_CTYPE(clidr, level);
> +}
> +
> +/*
> + * NumSets, bits[27:13] - (Number of sets in cache) - 1
> + * Associativity, bits[12:3] - (Associativity of cache) - 1
> + * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
> + */
> +#define CCSIDR_WRITE_THROUGH	BIT(31)
> +#define CCSIDR_WRITE_BACK	BIT(30)
> +#define CCSIDR_READ_ALLOCATE	BIT(29)
> +#define CCSIDR_WRITE_ALLOCATE	BIT(28)
> +#define CCSIDR_LINESIZE_MASK	0x7
> +#define CCSIDR_ASSOCIAT_SHIFT	3
> +#define CCSIDR_ASSOCIAT_MASK	0x3FF

ASSOCIAT doesn't quite roll off of the tongue...

> +#define CCSIDR_NUMSETS_SHIFT	13
> +#define CCSIDR_NUMSETS_MASK	0x7FF
> +
> +/*
> + * Which cache CCSIDR represents depends on CSSELR value
> + * Make sure no one else changes CSSELR during this
> + * smp_call_function_single prevents preemption for us
> + */
> +static inline u32 get_ccsidr(u32 csselr)
> +{
> +	u32 ccsidr;
> +
> +	/* Put value into CSSELR */
> +	asm volatile("msr csselr_el1, %x0" : : "r" (csselr));

This looks a little dodgy. I think GCC can leave the upper 32 bits in a
random state. Why not cast csselr to a u64 here?

> +	isb();
> +	/* Read result out of CCSIDR */
> +	asm volatile("mrs %0, ccsidr_el1" : "=r" (ccsidr));
> +
> +	return ccsidr;

Similarly it might make sense to make the temporary variable a u64.

[...]

> +int init_cache_level(unsigned int cpu)
> +{
> +	unsigned int ctype, level = 1, leaves = 0;
> +	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> +
> +	if (!this_cpu_ci)
> +		return -EINVAL;
> +
> +	do {
> +		ctype = get_cache_type(level);
> +		if (ctype == CACHE_TYPE_NOCACHE)
> +			break;
> +		/* Separate instruction and data caches */
> +		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
> +	} while (++level <= MAX_CACHE_LEVEL);

I think this would be clearer with:

for (level = 1; level <= MAX_CACHE_LEVEL; level++)

We do something like that in populate_cache_leaves below.
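
i.e. something like this (sketch only):

	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
		ctype = get_cache_type(level);
		if (ctype == CACHE_TYPE_NOCACHE)
			break;
		/* Separate instruction and data caches */
		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
	}

	this_cpu_ci->num_levels = level - 1;
	this_cpu_ci->num_leaves = leaves;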

> +
> +	this_cpu_ci->num_levels = level - 1;
> +	this_cpu_ci->num_leaves = leaves;
> +	return 0;
> +}
> +
> +int populate_cache_leaves(unsigned int cpu)
> +{
> +	unsigned int level, idx;
> +	enum cache_type type;
> +	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> +	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
> +
> +	for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
> +	     idx < this_cpu_ci->num_leaves; idx++, level++) {
> +		if (!this_leaf)
> +			return -EINVAL;
> +
> +		type = get_cache_type(level);
> +		if (type == CACHE_TYPE_SEPARATE) {
> +			ci_leaf_init(this_leaf++, CACHE_TYPE_DATA, level);
> +			ci_leaf_init(this_leaf++, CACHE_TYPE_INST, level);
> +		} else {
> +			ci_leaf_init(this_leaf++, type, level);
> +		}
> +	}
> +	return 0;
> +}

Cheers,
Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 7/9] ARM64: kernel: add support for cpu cache information
  2014-06-27 10:36   ` Mark Rutland
@ 2014-06-27 11:22     ` Sudeep Holla
  2014-06-27 11:34       ` Mark Rutland
  0 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2014-06-27 11:22 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

Thanks for the review.

On 27/06/14 11:36, Mark Rutland wrote:
> Hi Sudeep,
>
> On Wed, Jun 25, 2014 at 06:30:42PM +0100, Sudeep Holla wrote:
>> From: Sudeep Holla <sudeep.holla@arm.com>
>>
>> This patch adds support for cacheinfo on ARM64.
>>
>> On ARMv8, the cache hierarchy can be identified through Cache Level ID
>> (CLIDR) register while the cache geometry is provided by Cache Size ID
>> (CCSIDR) register.
>>
>> Since the architecture doesn't provide any way of detecting the cpus
>> sharing particular cache, device tree is used for the same purpose.
>>
>> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>> Cc: linux-arm-kernel at lists.infradead.org
>> ---
>>   arch/arm64/kernel/Makefile    |   3 +-
>>   arch/arm64/kernel/cacheinfo.c | 135 ++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 137 insertions(+), 1 deletion(-)
>>   create mode 100644 arch/arm64/kernel/cacheinfo.c
>
> [...]
>
>> +static inline enum cache_type get_cache_type(int level)
>> +{
>> +	unsigned int clidr;
>> +
>> +	if (level > MAX_CACHE_LEVEL)
>> +		return CACHE_TYPE_NOCACHE;
>> +	asm volatile ("mrs     %0, clidr_el1" : "=r" (clidr));
>
> Can't that allocate a w register?
>

That should be fine, as all of these cache info registers are 32-bit.

> You can make clidr a u64 to avoid that.
>

What would be the preference?
Using w registers for all these cache registers, or using u64 with x registers?

>> +	return CLIDR_CTYPE(clidr, level);
>> +}
>> +
>> +/*
>> + * NumSets, bits[27:13] - (Number of sets in cache) - 1
>> + * Associativity, bits[12:3] - (Associativity of cache) - 1
>> + * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
>> + */
>> +#define CCSIDR_WRITE_THROUGH	BIT(31)
>> +#define CCSIDR_WRITE_BACK	BIT(30)
>> +#define CCSIDR_READ_ALLOCATE	BIT(29)
>> +#define CCSIDR_WRITE_ALLOCATE	BIT(28)
>> +#define CCSIDR_LINESIZE_MASK	0x7
>> +#define CCSIDR_ASSOCIAT_SHIFT	3
>> +#define CCSIDR_ASSOCIAT_MASK	0x3FF
>
> ASSOCIAT doesn't quite roll off of the tongue...
>

I have no idea why I chose that incomplete name :(

>> +#define CCSIDR_NUMSETS_SHIFT	13
>> +#define CCSIDR_NUMSETS_MASK	0x7FF
>> +
>> +/*
>> + * Which cache CCSIDR represents depends on CSSELR value
>> + * Make sure no one else changes CSSELR during this
>> + * smp_call_function_single prevents preemption for us
>> + */
>> +static inline u32 get_ccsidr(u32 csselr)
>> +{
>> +	u32 ccsidr;
>> +
>> +	/* Put value into CSSELR */
>> +	asm volatile("msr csselr_el1, %x0" : : "r" (csselr));
>
> This looks a little dodgy. I think GCC can leave the upper 32 bits in a
> random state. Why not cast csselr to a u64 here?
>
>> +	isb();
>> +	/* Read result out of CCSIDR */
>> +	asm volatile("mrs %0, ccsidr_el1" : "=r" (ccsidr));
>> +
>> +	return ccsidr;
>
> Similarly it might make sense to make the temporary variable a u64.
>
> [...]
>
>> +int init_cache_level(unsigned int cpu)
>> +{
>> +	unsigned int ctype, level = 1, leaves = 0;
>> +	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>> +
>> +	if (!this_cpu_ci)
>> +		return -EINVAL;
>> +
>> +	do {
>> +		ctype = get_cache_type(level);
>> +		if (ctype == CACHE_TYPE_NOCACHE)
>> +			break;
>> +		/* Separate instruction and data caches */
>> +		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
>> +	} while (++level <= MAX_CACHE_LEVEL);
>
> I think this would be clearer with:
>
> for (level = 1; level <= MAX_CACHE_LEVEL; level++)
>
> We do something like that in populate_cache_leaves below.
>

Right, will change it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 7/9] ARM64: kernel: add support for cpu cache information
  2014-06-27 11:22     ` Sudeep Holla
@ 2014-06-27 11:34       ` Mark Rutland
  0 siblings, 0 replies; 29+ messages in thread
From: Mark Rutland @ 2014-06-27 11:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 12:22:17PM +0100, Sudeep Holla wrote:
> Hi Mark,
> 
> Thanks for the review.
> 
> On 27/06/14 11:36, Mark Rutland wrote:
> > Hi Sudeep,
> >
> > On Wed, Jun 25, 2014 at 06:30:42PM +0100, Sudeep Holla wrote:
> >> From: Sudeep Holla <sudeep.holla@arm.com>
> >>
> >> This patch adds support for cacheinfo on ARM64.
> >>
> >> On ARMv8, the cache hierarchy can be identified through Cache Level ID
> >> (CLIDR) register while the cache geometry is provided by Cache Size ID
> >> (CCSIDR) register.
> >>
> >> Since the architecture doesn't provide any way of detecting the cpus
> >> sharing particular cache, device tree is used for the same purpose.
> >>
> >> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> >> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >> Cc: Will Deacon <will.deacon@arm.com>
> >> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> >> Cc: linux-arm-kernel at lists.infradead.org
> >> ---
> >>   arch/arm64/kernel/Makefile    |   3 +-
> >>   arch/arm64/kernel/cacheinfo.c | 135 ++++++++++++++++++++++++++++++++++++++++++
> >>   2 files changed, 137 insertions(+), 1 deletion(-)
> >>   create mode 100644 arch/arm64/kernel/cacheinfo.c
> >
> > [...]
> >
> >> +static inline enum cache_type get_cache_type(int level)
> >> +{
> >> +	unsigned int clidr;
> >> +
> >> +	if (level > MAX_CACHE_LEVEL)
> >> +		return CACHE_TYPE_NOCACHE;
> >> +	asm volatile ("mrs     %0, clidr_el1" : "=r" (clidr));
> >
> > Can't that allocate a w register?
> >
> 
> That should be fine, as all of these cache info registers are 32-bit.

In A64 mrs/msr only works for x registers, and gas will barf for w registers:

[mark at leverpostej:~]% echo "mrs x0, clidr_el1" | aarch64-linux-gnu-as - 
[mark at leverpostej:~]% echo "mrs w0, clidr_el1" | aarch64-linux-gnu-as - 
{standard input}: Assembler messages:
{standard input}:1: Error: operand mismatch -- `mrs w0,clidr_el1'
[mark at leverpostej:~]% echo "msr clidr_el1, x0" | aarch64-linux-gnu-as - 
[mark at leverpostej:~]% echo "msr clidr_el1, w0" | aarch64-linux-gnu-as - 
{standard input}: Assembler messages:
{standard input}:1: Error: operand mismatch -- `msr clidr_el1,w0'
[mark at leverpostej:~]% 

> > You can make clidr a u64 to avoid that.
> >
> 
> What would be the preference?
> Using w registers for all these cache registers, or using u64 with x registers?

You must use x registers.

To prevent GCC from making the assumption that the upper 32-bits are
irrelevant, it's better to cast to a u64 than use %xN in the asm.
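
e.g. (sketch):

	asm volatile("msr csselr_el1, %0" : : "r" ((u64)csselr));

and likewise make the ccsidr temporary a u64, so the compiler always
allocates an x register with well-defined upper bits.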

> >> +	return CLIDR_CTYPE(clidr, level);
> >> +}
> >> +
> >> +/*
> >> + * NumSets, bits[27:13] - (Number of sets in cache) - 1
> >> + * Associativity, bits[12:3] - (Associativity of cache) - 1
> >> + * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
> >> + */
> >> +#define CCSIDR_WRITE_THROUGH	BIT(31)
> >> +#define CCSIDR_WRITE_BACK	BIT(30)
> >> +#define CCSIDR_READ_ALLOCATE	BIT(29)
> >> +#define CCSIDR_WRITE_ALLOCATE	BIT(28)
> >> +#define CCSIDR_LINESIZE_MASK	0x7
> >> +#define CCSIDR_ASSOCIAT_SHIFT	3
> >> +#define CCSIDR_ASSOCIAT_MASK	0x3FF
> >
> > ASSOCIAT doesn't quite roll off of the tongue...
> >
> 
> I have no idea why I chose that incomplete name :(

At least we can fix it :)

Cheers,
Mark.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs
  2014-06-25 17:30 ` [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs Sudeep Holla
  2014-06-25 22:23   ` Russell King - ARM Linux
@ 2014-07-10  0:09   ` Greg Kroah-Hartman
  2014-07-10 13:37     ` Sudeep Holla
  1 sibling, 1 reply; 29+ messages in thread
From: Greg Kroah-Hartman @ 2014-07-10  0:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:
> +static const struct device_attribute *cache_optional_attrs[] = {
> +	&dev_attr_coherency_line_size,
> +	&dev_attr_ways_of_associativity,
> +	&dev_attr_number_of_sets,
> +	&dev_attr_size,
> +	&dev_attr_attributes,
> +	&dev_attr_physical_line_partition,
> +	NULL
> +};
> +
> +static int device_add_attrs(struct device *dev,
> +			    const struct device_attribute **dev_attrs)
> +{
> +	int i, error = 0;
> +	struct device_attribute *dev_attr;
> +	char *buf;
> +
> +	if (!dev_attrs)
> +		return 0;
> +
> +	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
> +	if (!buf)
> +		return -ENOMEM;
> +
> +	for (i = 0; dev_attrs[i]; i++) {
> +		dev_attr = (struct device_attribute *)dev_attrs[i];
> +
> +		/* create attributes that provides meaningful value */
> +		if (dev_attr->show(dev, dev_attr, buf) < 0)
> +			continue;
> +
> +		error = device_create_file(dev, dev_attrs[i]);
> +		if (error) {
> +			while (--i >= 0)
> +				device_remove_file(dev, dev_attrs[i]);
> +			break;
> +		}
> +	}
> +
> +	kfree(buf);
> +	return error;
> +}

Ick, why create your own function for this when the driver core has this
functionality built into it?  Look at the is_visible() callback, and how
it is use for an attribute group please.

> +static void device_remove_attrs(struct device *dev,
> +				const struct device_attribute **dev_attrs)
> +{
> +	int i;
> +
> +	if (!dev_attrs)
> +		return;
> +
> +	for (i = 0; dev_attrs[i]; dev_attrs++, i++)
> +		device_remove_file(dev, dev_attrs[i]);
> +}

You should just remove a whole group at once, not individually.

> +
> +const struct device_attribute **
> +__weak cache_get_priv_attr(struct device *cache_idx_dev)
> +{
> +	return NULL;
> +}
> +
> +/* Add/Remove cache interface for CPU device */
> +static void cpu_cache_sysfs_exit(unsigned int cpu)
> +{
> +	int i;
> +	struct device *tmp_dev;
> +	const struct device_attribute **ci_priv_attr;
> +
> +	if (per_cpu_index_dev(cpu)) {
> +		for (i = 0; i < cache_leaves(cpu); i++) {
> +			tmp_dev = per_cache_index_dev(cpu, i);
> +			if (!tmp_dev)
> +				continue;
> +			ci_priv_attr = cache_get_priv_attr(tmp_dev);
> +			device_remove_attrs(tmp_dev, ci_priv_attr);
> +			device_remove_attrs(tmp_dev, cache_optional_attrs);
> +			device_unregister(tmp_dev);
> +		}
> +		kfree(per_cpu_index_dev(cpu));
> +		per_cpu_index_dev(cpu) = NULL;
> +	}
> +	device_unregister(per_cpu_cache_dev(cpu));
> +	per_cpu_cache_dev(cpu) = NULL;
> +}
> +
> +static int cpu_cache_sysfs_init(unsigned int cpu)
> +{
> +	struct device *dev = get_cpu_device(cpu);
> +
> +	if (per_cpu_cacheinfo(cpu) == NULL)
> +		return -ENOENT;
> +
> +	per_cpu_cache_dev(cpu) = device_create(dev->class, dev, cpu,
> +					       NULL, "cache");
> +	if (IS_ERR_OR_NULL(per_cpu_cache_dev(cpu)))
> +		return PTR_ERR(per_cpu_cache_dev(cpu));
> +
> +	/* Allocate all required memory */
> +	per_cpu_index_dev(cpu) = kzalloc(sizeof(struct device *) *
> +					 cache_leaves(cpu), GFP_KERNEL);
> +	if (unlikely(per_cpu_index_dev(cpu) == NULL))
> +		goto err_out;
> +
> +	return 0;
> +
> +err_out:
> +	cpu_cache_sysfs_exit(cpu);
> +	return -ENOMEM;
> +}
> +
> +static int cache_add_dev(unsigned int cpu)
> +{
> +	unsigned short i;
> +	int rc;
> +	struct device *tmp_dev, *parent;
> +	struct cacheinfo *this_leaf;
> +	const struct device_attribute **ci_priv_attr;
> +	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> +
> +	rc = cpu_cache_sysfs_init(cpu);
> +	if (unlikely(rc < 0))
> +		return rc;
> +
> +	parent = per_cpu_cache_dev(cpu);
> +	for (i = 0; i < cache_leaves(cpu); i++) {
> +		this_leaf = this_cpu_ci->info_list + i;
> +		if (this_leaf->disable_sysfs)
> +			continue;
> +		tmp_dev = device_create_with_groups(parent->class, parent, i,
> +						    this_leaf,
> +						    cache_default_groups,
> +						    "index%1u", i);
> +		if (IS_ERR_OR_NULL(tmp_dev)) {
> +			rc = PTR_ERR(tmp_dev);
> +			goto err;
> +		}
> +
> +		rc = device_add_attrs(tmp_dev, cache_optional_attrs);
> +		if (unlikely(rc))
> +			goto err;
> +
> +		ci_priv_attr = cache_get_priv_attr(tmp_dev);
> +		rc = device_add_attrs(tmp_dev, ci_priv_attr);
> +		if (unlikely(rc))
> +			goto err;

You just raced with userspace here, creating these files _after_ the
device was announced to userspace, causing problems with anyone wanting
to read these attributes :(

I think if you fix up the is_visible() thing above, these calls will go
away, right?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs
  2014-07-10  0:09   ` Greg Kroah-Hartman
@ 2014-07-10 13:37     ` Sudeep Holla
  0 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-07-10 13:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Greg,

Thanks for reviewing this.

On 10/07/14 01:09, Greg Kroah-Hartman wrote:
> On Wed, Jun 25, 2014 at 06:30:37PM +0100, Sudeep Holla wrote:
>> +static const struct device_attribute *cache_optional_attrs[] = {
>> +	&dev_attr_coherency_line_size,
>> +	&dev_attr_ways_of_associativity,
>> +	&dev_attr_number_of_sets,
>> +	&dev_attr_size,
>> +	&dev_attr_attributes,
>> +	&dev_attr_physical_line_partition,
>> +	NULL
>> +};
>> +
>> +static int device_add_attrs(struct device *dev,
>> +			    const struct device_attribute **dev_attrs)
>> +{
>> +	int i, error = 0;
>> +	struct device_attribute *dev_attr;
>> +	char *buf;
>> +
>> +	if (!dev_attrs)
>> +		return 0;
>> +
>> +	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
>> +	if (!buf)
>> +		return -ENOMEM;
>> +
>> +	for (i = 0; dev_attrs[i]; i++) {
>> +		dev_attr = (struct device_attribute *)dev_attrs[i];
>> +
>> +		/* create attributes that provides meaningful value */
>> +		if (dev_attr->show(dev, dev_attr, buf) < 0)
>> +			continue;
>> +
>> +		error = device_create_file(dev, dev_attrs[i]);
>> +		if (error) {
>> +			while (--i >= 0)
>> +				device_remove_file(dev, dev_attrs[i]);
>> +			break;
>> +		}
>> +	}
>> +
>> +	kfree(buf);
>> +	return error;
>> +}
>
> Ick, why create your own function for this when the driver core has this
> functionality built into it?  Look at the is_visible() callback, and how
> it is use for an attribute group please.
>

I agree; I added this function hesitantly as I didn't realize that I could use
is_visible for this purpose. Thanks for pointing that out, I will have a look
at it.
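
Something along these lines, I suppose (rough sketch, untested; names are
only illustrative):

	static umode_t cache_default_attrs_is_visible(struct kobject *kobj,
						      struct attribute *attr,
						      int unused)
	{
		struct device *dev = kobj_to_dev(kobj);
		struct cacheinfo *this_leaf = dev_get_drvdata(dev);

		/* hide attributes the architecture did not populate */
		if (attr == &dev_attr_size.attr && !this_leaf->size)
			return 0;
		/* ...similar checks for the other optional attributes... */
		return attr->mode;
	}

	static const struct attribute_group cache_default_group = {
		.attrs = cache_default_attrs,
		.is_visible = cache_default_attrs_is_visible,
	};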

>> +static void device_remove_attrs(struct device *dev,
>> +				const struct device_attribute **dev_attrs)
>> +{
>> +	int i;
>> +
>> +	if (!dev_attrs)
>> +		return;
>> +
>> +	for (i = 0; dev_attrs[i]; dev_attrs++, i++)
>> +		device_remove_file(dev, dev_attrs[i]);
>> +}
>
> You should just remove a whole group at once, not individually.
>

Right, I should be able to get rid of these 2 functions once I use the
is_visible callback.

>> +
>> +const struct device_attribute **
>> +__weak cache_get_priv_attr(struct device *cache_idx_dev)
>> +{
>> +	return NULL;
>> +}
>> +
>> +/* Add/Remove cache interface for CPU device */
>> +static void cpu_cache_sysfs_exit(unsigned int cpu)
>> +{
>> +	int i;
>> +	struct device *tmp_dev;
>> +	const struct device_attribute **ci_priv_attr;
>> +
>> +	if (per_cpu_index_dev(cpu)) {
>> +		for (i = 0; i < cache_leaves(cpu); i++) {
>> +			tmp_dev = per_cache_index_dev(cpu, i);
>> +			if (!tmp_dev)
>> +				continue;
>> +			ci_priv_attr = cache_get_priv_attr(tmp_dev);
>> +			device_remove_attrs(tmp_dev, ci_priv_attr);
>> +			device_remove_attrs(tmp_dev, cache_optional_attrs);
>> +			device_unregister(tmp_dev);
>> +		}
>> +		kfree(per_cpu_index_dev(cpu));
>> +		per_cpu_index_dev(cpu) = NULL;
>> +	}
>> +	device_unregister(per_cpu_cache_dev(cpu));
>> +	per_cpu_cache_dev(cpu) = NULL;
>> +}
>> +
>> +static int cpu_cache_sysfs_init(unsigned int cpu)
>> +{
>> +	struct device *dev = get_cpu_device(cpu);
>> +
>> +	if (per_cpu_cacheinfo(cpu) == NULL)
>> +		return -ENOENT;
>> +
>> +	per_cpu_cache_dev(cpu) = device_create(dev->class, dev, cpu,
>> +					       NULL, "cache");
>> +	if (IS_ERR_OR_NULL(per_cpu_cache_dev(cpu)))
>> +		return PTR_ERR(per_cpu_cache_dev(cpu));
>> +
>> +	/* Allocate all required memory */
>> +	per_cpu_index_dev(cpu) = kzalloc(sizeof(struct device *) *
>> +					 cache_leaves(cpu), GFP_KERNEL);
>> +	if (unlikely(per_cpu_index_dev(cpu) == NULL))
>> +		goto err_out;
>> +
>> +	return 0;
>> +
>> +err_out:
>> +	cpu_cache_sysfs_exit(cpu);
>> +	return -ENOMEM;
>> +}
>> +
>> +static int cache_add_dev(unsigned int cpu)
>> +{
>> +	unsigned short i;
>> +	int rc;
>> +	struct device *tmp_dev, *parent;
>> +	struct cacheinfo *this_leaf;
>> +	const struct device_attribute **ci_priv_attr;
>> +	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>> +
>> +	rc = cpu_cache_sysfs_init(cpu);
>> +	if (unlikely(rc < 0))
>> +		return rc;
>> +
>> +	parent = per_cpu_cache_dev(cpu);
>> +	for (i = 0; i < cache_leaves(cpu); i++) {
>> +		this_leaf = this_cpu_ci->info_list + i;
>> +		if (this_leaf->disable_sysfs)
>> +			continue;
>> +		tmp_dev = device_create_with_groups(parent->class, parent, i,
>> +						    this_leaf,
>> +						    cache_default_groups,
>> +						    "index%1u", i);
>> +		if (IS_ERR_OR_NULL(tmp_dev)) {
>> +			rc = PTR_ERR(tmp_dev);
>> +			goto err;
>> +		}
>> +
>> +		rc = device_add_attrs(tmp_dev, cache_optional_attrs);
>> +		if (unlikely(rc))
>> +			goto err;
>> +
>> +		ci_priv_attr = cache_get_priv_attr(tmp_dev);
>> +		rc = device_add_attrs(tmp_dev, ci_priv_attr);
>> +		if (unlikely(rc))
>> +			goto err;
>
> You just raced with userspace here, creating these files _after_ the
> device was announced to userspace, causing problems with anyone wanting
> to read these attributes :(
>
> I think if you fix up the is_visible() thing above, these calls will go
> away, right?
>

Yes I agree.

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 7/9] ARM64: kernel: add support for cpu cache information
       [not found] ` <1406306692-7135-1-git-send-email-sudeep.holla@arm.com>
@ 2014-07-25 16:44   ` Sudeep Holla
  2014-07-25 16:44   ` [PATCH v2 8/9] ARM: " Sudeep Holla
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-07-25 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This patch adds support for cacheinfo on ARM64.

On ARMv8, the cache hierarchy can be identified through Cache Level ID
(CLIDR) register while the cache geometry is provided by Cache Size ID
(CCSIDR) register.

Since the architecture doesn't provide any way of detecting the cpus
sharing a particular cache, device tree is used for the same purpose.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm64/kernel/Makefile    |   3 +-
 arch/arm64/kernel/cacheinfo.c | 142 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/cacheinfo.c

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index cdaedad3afe5..754a3d07b6b8 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -15,7 +15,8 @@ CFLAGS_REMOVE_return_address.o = -pg
 arm64-obj-y		:= cputable.o debug-monitors.o entry.o irq.o fpsimd.o	\
 			   entry-fpsimd.o process.o ptrace.o setup.o signal.o	\
 			   sys.o stacktrace.o time.o traps.o io.o vdso.o	\
-			   hyp-stub.o psci.o cpu_ops.o insn.o return_address.o
+			   hyp-stub.o psci.o cpu_ops.o insn.o return_address.o	\
+			   cacheinfo.o
 
 arm64-obj-$(CONFIG_COMPAT)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o
diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
new file mode 100644
index 000000000000..6c559659b460
--- /dev/null
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -0,0 +1,142 @@
+/*
+ *  ARM64 cacheinfo support
+ *
+ *  Copyright (C) 2014 ARM Ltd.
+ *  All Rights Reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/compiler.h>
+#include <linux/of.h>
+
+#include <asm/processor.h>
+
+#define MAX_CACHE_LEVEL			7	/* Max 7 level supported */
+/* Ctypen, bits[3(n - 1) + 2 : 3(n - 1)], for n = 1 to 7 */
+#define CLIDR_CTYPE_SHIFT(level)	(3 * (level - 1))
+#define CLIDR_CTYPE_MASK(level)		(7 << CLIDR_CTYPE_SHIFT(level))
+#define CLIDR_CTYPE(clidr, level)	\
+	(((clidr) & CLIDR_CTYPE_MASK(level)) >> CLIDR_CTYPE_SHIFT(level))
+
+static inline enum cache_type get_cache_type(int level)
+{
+	u64 clidr;
+
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	asm volatile ("mrs     %x0, clidr_el1" : "=r" (clidr));
+	return CLIDR_CTYPE(clidr, level);
+}
+
+/*
+ * NumSets, bits[27:13] - (Number of sets in cache) - 1
+ * Associativity, bits[12:3] - (Associativity of cache) - 1
+ * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
+ */
+#define CCSIDR_WRITE_THROUGH		BIT(31)
+#define CCSIDR_WRITE_BACK		BIT(30)
+#define CCSIDR_READ_ALLOCATE		BIT(29)
+#define CCSIDR_WRITE_ALLOCATE		BIT(28)
+#define CCSIDR_LINESIZE_MASK		0x7
+#define CCSIDR_ASSOCIATIVITY_SHIFT	3
+#define CCSIDR_ASSOCIATIVITY_MASK	0x3FF
+#define CCSIDR_NUMSETS_SHIFT		13
+#define CCSIDR_NUMSETS_MASK		0x7FF
+
+/*
+ * Which cache CCSIDR represents depends on CSSELR value
+ * Make sure no one else changes CSSELR during this
+ * smp_call_function_single prevents preemption for us
+ */
+static inline u32 get_ccsidr(u64 csselr)
+{
+	u64 ccsidr;
+
+	/* Put value into CSSELR */
+	asm volatile("msr csselr_el1, %x0" : : "r" (csselr));
+	isb();
+	/* Read result out of CCSIDR */
+	asm volatile("mrs %x0, ccsidr_el1" : "=r" (ccsidr));
+
+	return (u32)ccsidr;
+}
+
+static void ci_leaf_init(struct cacheinfo *this_leaf,
+			 enum cache_type type, unsigned int level)
+{
+	bool is_instr_cache = type & CACHE_TYPE_INST;
+	u32 tmp = get_ccsidr((level - 1) << 1 | is_instr_cache);
+
+	this_leaf->level = level;
+	this_leaf->type = type;
+	this_leaf->coherency_line_size =
+	    (1 << ((tmp & CCSIDR_LINESIZE_MASK) + 2)) * 4;
+	this_leaf->number_of_sets =
+	    ((tmp >> CCSIDR_NUMSETS_SHIFT) & CCSIDR_NUMSETS_MASK) + 1;
+	this_leaf->ways_of_associativity = ((tmp >> CCSIDR_ASSOCIATIVITY_SHIFT)
+					    & CCSIDR_ASSOCIATIVITY_MASK) + 1;
+	this_leaf->size = this_leaf->number_of_sets *
+	    this_leaf->coherency_line_size * this_leaf->ways_of_associativity;
+	this_leaf->attributes =
+		((tmp & CCSIDR_WRITE_THROUGH) ? CACHE_WRITE_THROUGH : 0) |
+		((tmp & CCSIDR_WRITE_BACK) ? CACHE_WRITE_BACK : 0) |
+		((tmp & CCSIDR_READ_ALLOCATE) ? CACHE_READ_ALLOCATE : 0) |
+		((tmp & CCSIDR_WRITE_ALLOCATE) ? CACHE_WRITE_ALLOCATE : 0);
+}
+
+static int __init_cache_level(unsigned int cpu)
+{
+	unsigned int ctype, level, leaves;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
+		ctype = get_cache_type(level);
+		if (ctype == CACHE_TYPE_NOCACHE) {
+			level--;
+			break;
+		}
+		/* Separate instruction and data caches */
+		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+	}
+
+	this_cpu_ci->num_levels = level;
+	this_cpu_ci->num_leaves = leaves;
+	return 0;
+}
+
+static int __populate_cache_leaves(unsigned int cpu)
+{
+	unsigned int level, idx;
+	enum cache_type type;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+
+	for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
+	     idx < this_cpu_ci->num_leaves; idx++, level++) {
+		type = get_cache_type(level);
+		if (type == CACHE_TYPE_SEPARATE) {
+			ci_leaf_init(this_leaf++, CACHE_TYPE_DATA, level);
+			ci_leaf_init(this_leaf++, CACHE_TYPE_INST, level);
+		} else {
+			ci_leaf_init(this_leaf++, type, level);
+		}
+	}
+	return 0;
+}
+
+DEFINE_SMP_CALL_FUNCTION(init_cache_level)
+DEFINE_SMP_CALL_FUNCTION(populate_cache_leaves)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 8/9] ARM: kernel: add support for cpu cache information
       [not found] ` <1406306692-7135-1-git-send-email-sudeep.holla@arm.com>
  2014-07-25 16:44   ` [PATCH v2 7/9] ARM64: kernel: add support for cpu cache information Sudeep Holla
@ 2014-07-25 16:44   ` Sudeep Holla
  2014-07-25 16:44   ` [PATCH v2 9/9] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
  2014-08-21 10:59   ` [PATCH v3 00/11] drivers: cacheinfo support Sudeep Holla
  3 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-07-25 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This patch adds support for cacheinfo on ARM platforms.

On ARMv7, the cache hierarchy can be identified through the Cache Level ID
register (CLIDR) while the cache geometry is provided by the Cache Size ID
register (CCSIDR).

On architecture versions before ARMv7, CLIDR and CCSIDR are not
implemented. The cache type register (CTR) provides both cache hierarchy
and geometry if implemented. For implementations that don't have the
CTR, we need to list the probable value the CTR would have had, along
with the cpuid, for the sake of simplicity in handling them.

Since the architecture doesn't provide any way of detecting the cpus
sharing a particular cache, device tree is used for the same purpose.
On non-DT platforms, first level caches are per-cpu while higher level
caches are assumed to be system-wide.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm/kernel/Makefile    |   1 +
 arch/arm/kernel/cacheinfo.c | 272 ++++++++++++++++++++++++++++++++++++++++++++
 arch/arm/mm/Kconfig         |  13 +++
 3 files changed, 286 insertions(+)
 create mode 100644 arch/arm/kernel/cacheinfo.c

--->8

Hi Russell,

Since for a few CPUs, like the ARM11 MPCore, which implement VMSAv6 + Advanced
OS Features, Linux reports cpu_architecture() as ARMv7, I have added a list
for them as they are really ARMv6 and don't implement CLIDR and the related
registers. Let me know if there's an alternative way to handle that.

This also depends on your patch
"ARM: make it easier to check the CPU part number correctly"

Regards,
Sudeep

diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f83d0e..2c5ff0efb670 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -29,6 +29,7 @@ obj-y		+= entry-v7m.o v7m.o
 else
 obj-y		+= entry-armv.o
 endif
+obj-$(CONFIG_CPU_HAS_CACHE) += cacheinfo.o
 
 obj-$(CONFIG_OC_ETM)		+= etm.o
 obj-$(CONFIG_CPU_IDLE)		+= cpuidle.o
diff --git a/arch/arm/kernel/cacheinfo.c b/arch/arm/kernel/cacheinfo.c
new file mode 100644
index 000000000000..427ba641b33a
--- /dev/null
+++ b/arch/arm/kernel/cacheinfo.c
@@ -0,0 +1,272 @@
+/*
+ *  ARM cacheinfo support
+ *
+ *  Copyright (C) 2014 ARM Ltd.
+ *  All Rights Reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/compiler.h>
+#include <linux/of.h>
+
+#include <asm/cputype.h>
+#include <asm/processor.h>
+#include <asm/system_info.h>
+
+#define cache_is_armv7() \
+	(cpu_architecture() >= CPU_ARCH_ARMv7 && !armv6_extended())
+#define MAX_CACHE_LEVEL		(cache_is_armv7() ? 7 : 1)
+
+#define CTR_CTYPE_SHIFT		24
+#define CTR_CTYPE_MASK		(1 << CTR_CTYPE_SHIFT)
+
+struct ctr_info {
+	unsigned int cpuid_part;
+	unsigned int ctr;
+};
+/*
+ *  Cache Type Register
+ *  +---------------------------------+
+ *  | 31 29 | 28 25 |24| 23 12 | 11 0 |
+ *  +---------------------------------+
+ *  | 0 0 0 | Ctype | S| Dsize | Isize|
+ *  +---------------------------------+
+ * The table below encodes only Dsize and Isize
+ */
+static struct ctr_info cache_ctr_list[] = {
+	{0x4400a100, 0x0016A16A }, /* 32kB D$, 32kB I$ */
+	{0x4400a110, 0x0012A16A }, /* 16kB D$, 32kB I$ */
+	{0x6900b110, 0x0012A16A }, /* 16kB D$, 32kB I$ */
+};
+
+/*
+ * List of CPUs reported as ARMv7 but don't implement CLIDR,
+ * CSSELR and CCSIDR. Cache information is still available from CTR
+ */
+static int armv6_ext_cpuid_part[] = { 0x4100b020, };
+
+static bool armv6_extended(void)
+{
+	int i, cpuid_part = read_cpuid_part();
+
+	for (i = 0; i < ARRAY_SIZE(armv6_ext_cpuid_part); i++)
+		if (armv6_ext_cpuid_part[i] == cpuid_part)
+			return true;
+	return false;
+}
+
+static int get_unimplemented_ctr(unsigned int *ctr)
+{
+	int i, cpuid_part = read_cpuid_part();
+
+	for (i = 0; i < ARRAY_SIZE(cache_ctr_list); i++)
+		if (cache_ctr_list[i].cpuid_part == cpuid_part) {
+			*ctr = cache_ctr_list[i].ctr;
+			return 0;
+		}
+	return -ENOENT;
+}
+
+static unsigned int get_ctr(void)
+{
+	unsigned int ctr;
+
+	if (get_unimplemented_ctr(&ctr))
+		ctr = read_cpuid_cachetype();
+	return ctr;
+}
+
+static enum cache_type __get_cache_type(int level)
+{
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	return get_ctr() & CTR_CTYPE_MASK ?
+		CACHE_TYPE_SEPARATE : CACHE_TYPE_UNIFIED;
+}
+
+/*
+ *  +---------------------------------+
+ *  | 9  8  7  6 | 5  4  3 | 2 | 1  0 |
+ *  +---------------------------------+
+ *  |    size    |  assoc  | m |  len |
+ *  +---------------------------------+
+ * linelen        = 1 << (len + 3)
+ * multiplier     = 2 + m
+ * nsets          = 1 << (size + 6 - assoc - len)
+ * associativity  = multiplier << (assoc - 1)
+ * cache_size     = multiplier << (size + 8)
+ */
+#define CTR_LINESIZE_MASK	0x3
+#define CTR_MULTIPLIER_SHIFT	2
+#define CTR_MULTIPLIER_MASK	0x1
+#define CTR_ASSOCIAT_SHIFT	3
+#define CTR_ASSOCIAT_MASK	0x7
+#define CTR_SIZE_SHIFT		6
+#define CTR_SIZE_MASK		0xF
+#define CTR_DCACHE_SHIFT	12
+
+static void __ci_leaf_init(enum cache_type type, struct cacheinfo *this_leaf)
+{
+	unsigned int size, multiplier, assoc, len, tmp = get_ctr();
+
+	if (type == CACHE_TYPE_DATA)
+		tmp >>= CTR_DCACHE_SHIFT;
+
+	len = tmp & CTR_LINESIZE_MASK;
+	size = (tmp >> CTR_SIZE_SHIFT) & CTR_SIZE_MASK;
+	assoc = (tmp >> CTR_ASSOCIAT_SHIFT) & CTR_ASSOCIAT_MASK;
+	multiplier = ((tmp >> CTR_MULTIPLIER_SHIFT) & CTR_MULTIPLIER_MASK) + 2;
+
+	this_leaf->type = type;
+	this_leaf->coherency_line_size = 1 << (len + 3);
+	this_leaf->number_of_sets = 1 << (size + 6 - assoc - len);
+	this_leaf->ways_of_associativity = multiplier << (assoc - 1);
+	this_leaf->size = multiplier << (size + 8);
+}
+
+/* Ctypen, bits[3(n - 1) + 2 : 3(n - 1)], for n = 1 to 7 */
+#define CLIDR_CTYPE_SHIFT(level)	(3 * (level - 1))
+#define CLIDR_CTYPE_MASK(level)		(7 << CLIDR_CTYPE_SHIFT(level))
+#define CLIDR_CTYPE(clidr, level)	\
+	(((clidr) & CLIDR_CTYPE_MASK(level)) >> CLIDR_CTYPE_SHIFT(level))
+
+static inline enum cache_type __armv7_get_cache_type(int level)
+{
+	unsigned int clidr;
+
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	asm volatile ("mrc p15, 1, %0, c0, c0, 1" : "=r" (clidr));
+	return CLIDR_CTYPE(clidr, level);
+}
+
+/*
+ * NumSets, bits[27:13] - (Number of sets in cache) - 1
+ * Associativity, bits[12:3] - (Associativity of cache) - 1
+ * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
+ */
+#define CCSIDR_WRITE_THROUGH		BIT(31)
+#define CCSIDR_WRITE_BACK		BIT(30)
+#define CCSIDR_READ_ALLOCATE		BIT(29)
+#define CCSIDR_WRITE_ALLOCATE		BIT(28)
+#define CCSIDR_LINESIZE_MASK		0x7
+#define CCSIDR_ASSOCIATIVITY_SHIFT	3
+#define CCSIDR_ASSOCIATIVITY_MASK	0x3FF
+#define CCSIDR_NUMSETS_SHIFT		13
+#define CCSIDR_NUMSETS_MASK		0x7FF
+
+/*
+ * Which cache CCSIDR represents depends on CSSELR value
+ * Make sure no one else changes CSSELR during this
+ * smp_call_function_single prevents preemption for us
+ */
+static inline u32 get_ccsidr(u32 csselr)
+{
+	u32 ccsidr;
+
+	/* Put value into CSSELR */
+	asm volatile ("mcr p15, 2, %0, c0, c0, 0" : : "r" (csselr));
+	isb();
+	/* Read result out of CCSIDR */
+	asm volatile ("mrc p15, 1, %0, c0, c0, 0" : "=r" (ccsidr));
+
+	return ccsidr;
+}
+
+static void __armv7_ci_leaf_init(enum cache_type type,
+				 struct cacheinfo *this_leaf)
+{
+	bool is_instr_cache = type & CACHE_TYPE_INST;
+	u32 tmp = get_ccsidr((this_leaf->level - 1) << 1 | is_instr_cache);
+
+	this_leaf->type = type;
+	this_leaf->coherency_line_size =
+	    (1 << ((tmp & CCSIDR_LINESIZE_MASK) + 2)) * 4;
+	this_leaf->number_of_sets =
+	    ((tmp >> CCSIDR_NUMSETS_SHIFT) & CCSIDR_NUMSETS_MASK) + 1;
+	this_leaf->ways_of_associativity = ((tmp >> CCSIDR_ASSOCIATIVITY_SHIFT)
+					    & CCSIDR_ASSOCIATIVITY_MASK) + 1;
+	this_leaf->size = this_leaf->number_of_sets *
+	    this_leaf->coherency_line_size * this_leaf->ways_of_associativity;
+	this_leaf->attributes =
+		((tmp & CCSIDR_WRITE_THROUGH) ? CACHE_WRITE_THROUGH : 0) |
+		((tmp & CCSIDR_WRITE_BACK) ? CACHE_WRITE_BACK : 0) |
+		((tmp & CCSIDR_READ_ALLOCATE) ? CACHE_READ_ALLOCATE : 0) |
+		((tmp & CCSIDR_WRITE_ALLOCATE) ? CACHE_WRITE_ALLOCATE : 0);
+}
+
+static inline enum cache_type get_cache_type(int level)
+{
+	if (cache_is_armv7())
+		return __armv7_get_cache_type(level);
+	else
+		return __get_cache_type(level);
+}
+
+static void ci_leaf_init(struct cacheinfo *this_leaf,
+			 enum cache_type type, unsigned int level)
+{
+	this_leaf->level = level;
+	if (cache_is_armv7())
+		__armv7_ci_leaf_init(type, this_leaf);
+	else
+		__ci_leaf_init(type, this_leaf);
+}
+
+static int __init_cache_level(unsigned int cpu)
+{
+	unsigned int ctype, level, leaves;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
+		ctype = get_cache_type(level);
+		if (ctype == CACHE_TYPE_NOCACHE) {
+			level--;
+			break;
+		}
+		/* Separate instruction and data caches */
+		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+	}
+
+	this_cpu_ci->num_levels = level;
+	this_cpu_ci->num_leaves = leaves;
+
+	return 0;
+}
+
+static int __populate_cache_leaves(unsigned int cpu)
+{
+	unsigned int level, idx;
+	enum cache_type type;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+
+	for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
+	     idx < this_cpu_ci->num_leaves; idx++, level++) {
+		type = get_cache_type(level);
+		if (type == CACHE_TYPE_SEPARATE) {
+			ci_leaf_init(this_leaf++, CACHE_TYPE_DATA, level);
+			ci_leaf_init(this_leaf++, CACHE_TYPE_INST, level);
+		} else {
+			ci_leaf_init(this_leaf++, type, level);
+		}
+	}
+	return 0;
+}
+
+DEFINE_SMP_CALL_FUNCTION(init_cache_level)
+DEFINE_SMP_CALL_FUNCTION(populate_cache_leaves)
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index c348eaee7ee2..153abc3bac4e 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -494,30 +494,42 @@ config CPU_PABRT_V7
 # The cache model
 config CPU_CACHE_V4
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V4WT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V4WB
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V6
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V7
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_NOP
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_VIVT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_VIPT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_FA
 	bool
+	select CPU_HAS_CACHE
+
+config CPU_HAS_CACHE
+	bool
 
 if MMU
 # The copy-page model
@@ -845,6 +857,7 @@ config DMA_CACHE_RWFO
 
 config OUTER_CACHE
 	bool
+	select CPU_HAS_CACHE
 
 config OUTER_CACHE_SYNC
 	bool
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 9/9] ARM: kernel: add outer cache support for cacheinfo implementation
       [not found] ` <1406306692-7135-1-git-send-email-sudeep.holla@arm.com>
  2014-07-25 16:44   ` [PATCH v2 7/9] ARM64: kernel: add support for cpu cache information Sudeep Holla
  2014-07-25 16:44   ` [PATCH v2 8/9] ARM: " Sudeep Holla
@ 2014-07-25 16:44   ` Sudeep Holla
  2014-08-21 10:59   ` [PATCH v3 00/11] drivers: cacheinfo support Sudeep Holla
  3 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-07-25 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

In order to support outer cache in the cacheinfo infrastructure, a new
function 'get_info' is added to outer_cache_fns. This function is used
to get the outer cache information namely: line size, number of ways of
associativity and number of sets.

This patch adds 'get_info' supports to all L2 cache implementations on
ARM except Marvell's Feroceon L2 cache.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm/include/asm/outercache.h |  9 +++++++++
 arch/arm/kernel/cacheinfo.c       | 14 +++++++++++++-
 arch/arm/mm/cache-l2x0.c          | 35 ++++++++++++++++++++++++++++++++++-
 arch/arm/mm/cache-tauros2.c       | 35 +++++++++++++++++++++++++++++++++++
 arch/arm/mm/cache-xsc3l2.c        | 16 ++++++++++++++++
 5 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index 891a56b35bcf..2765c8c61c8c 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -23,7 +23,10 @@
 
 #include <linux/types.h>
 
+struct cacheinfo;
+
 struct outer_cache_fns {
+	void (*get_info)(struct cacheinfo *info);
 	void (*inv_range)(unsigned long, unsigned long);
 	void (*clean_range)(unsigned long, unsigned long);
 	void (*flush_range)(unsigned long, unsigned long);
@@ -112,6 +115,11 @@ static inline void outer_resume(void)
 		outer_cache.resume();
 }
 
+static inline void outer_get_info(struct cacheinfo *info)
+{
+	if (outer_cache.get_info)
+		outer_cache.get_info(info);
+}
 #else
 
 static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
@@ -123,6 +131,7 @@ static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 static inline void outer_flush_all(void) { }
 static inline void outer_disable(void) { }
 static inline void outer_resume(void) { }
+static inline void outer_get_info(struct cacheinfo *info) { }
 
 #endif
 
diff --git a/arch/arm/kernel/cacheinfo.c b/arch/arm/kernel/cacheinfo.c
index 427ba641b33a..9a516e43bbf5 100644
--- a/arch/arm/kernel/cacheinfo.c
+++ b/arch/arm/kernel/cacheinfo.c
@@ -24,6 +24,7 @@
 #include <linux/of.h>
 
 #include <asm/cputype.h>
+#include <asm/outercache.h>
 #include <asm/processor.h>
 #include <asm/system_info.h>
 
@@ -217,11 +218,19 @@ static inline enum cache_type get_cache_type(int level)
 		return __get_cache_type(level);
 }
 
+static inline void __outer_ci_leaf_init(struct cacheinfo *this_leaf)
+{
+	outer_get_info(this_leaf);
+	BUG_ON(this_leaf->type == CACHE_TYPE_SEPARATE);
+}
+
 static void ci_leaf_init(struct cacheinfo *this_leaf,
 			 enum cache_type type, unsigned int level)
 {
 	this_leaf->level = level;
-	if (cache_is_armv7())
+	if (type == CACHE_TYPE_NOCACHE)	/* must be outer cache */
+		__outer_ci_leaf_init(this_leaf);
+	else if (cache_is_armv7())
 		__armv7_ci_leaf_init(type, this_leaf);
 	else
 		__ci_leaf_init(type, this_leaf);
@@ -245,6 +254,9 @@ static int __init_cache_level(unsigned int cpu)
 	this_cpu_ci->num_levels = level;
 	this_cpu_ci->num_leaves = leaves;
 
+	if (IS_ENABLED(CONFIG_OUTER_CACHE) && outer_cache.get_info)
+		this_cpu_ci->num_leaves++, this_cpu_ci->num_levels++;
+
 	return 0;
 }
 
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 7c3fb41a462e..503bafdeb25b 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -17,6 +17,7 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */
 #include <linux/cpu.h>
+#include <linux/cacheinfo.h>
 #include <linux/err.h>
 #include <linux/init.h>
 #include <linux/smp.h>
@@ -105,6 +106,22 @@ static inline void l2c_unlock(void __iomem *base, unsigned num)
 	}
 }
 
+static void __l2x0_getinfo(struct cacheinfo *this_leaf)
+{
+	unsigned int assoc = get_count_order(l2x0_way_mask);
+
+	this_leaf->size = l2x0_size;
+	this_leaf->coherency_line_size = CACHE_LINE_SIZE;
+	this_leaf->ways_of_associativity = assoc;
+	this_leaf->number_of_sets = l2x0_size / (assoc * CACHE_LINE_SIZE);
+}
+
+static void l2x0_getinfo(struct cacheinfo *this_leaf)
+{
+	this_leaf->type = CACHE_TYPE_UNIFIED;
+	__l2x0_getinfo(this_leaf);
+}
+
 /*
  * Enable the L2 cache controller.  This function must only be
  * called when the cache controller is known to be disabled.
@@ -309,6 +326,7 @@ static const struct l2c_init_data l2c210_data __initconst = {
 		.disable = l2c_disable,
 		.sync = l2c210_sync,
 		.resume = l2c210_resume,
+		.get_info = l2x0_getinfo,
 	},
 };
 
@@ -466,6 +484,7 @@ static const struct l2c_init_data l2c220_data = {
 		.disable = l2c_disable,
 		.sync = l2c220_sync,
 		.resume = l2c210_resume,
+		.get_info = l2x0_getinfo,
 	},
 };
 
@@ -814,6 +833,7 @@ static const struct l2c_init_data l2c310_init_fns __initconst = {
 		.disable = l2c310_disable,
 		.sync = l2c210_sync,
 		.resume = l2c310_resume,
+		.get_info = l2x0_getinfo,
 	},
 };
 
@@ -894,7 +914,6 @@ static void __init __l2c_init(const struct l2c_init_data *data,
 		data->enable(l2x0_base, aux, data->num_lock);
 
 	outer_cache = fns;
-
 	/*
 	 * It is strange to save the register state before initialisation,
 	 * but hey, this is what the DT implementations decided to do.
@@ -994,6 +1013,7 @@ static const struct l2c_init_data of_l2c210_data __initconst = {
 		.disable     = l2c_disable,
 		.sync        = l2c210_sync,
 		.resume      = l2c210_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1012,6 +1032,7 @@ static const struct l2c_init_data of_l2c220_data __initconst = {
 		.disable     = l2c_disable,
 		.sync        = l2c220_sync,
 		.resume      = l2c210_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1065,6 +1086,7 @@ static const struct l2c_init_data of_l2c310_data __initconst = {
 		.disable     = l2c310_disable,
 		.sync        = l2c210_sync,
 		.resume      = l2c310_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1092,6 +1114,7 @@ static const struct l2c_init_data of_l2c310_coherent_data __initconst = {
 		.flush_all   = l2c210_flush_all,
 		.disable     = l2c310_disable,
 		.resume      = l2c310_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1255,6 +1278,12 @@ static void __init aurora_of_parse(const struct device_node *np,
 	*aux_mask &= ~mask;
 }
 
+static void aurora_no_outer_data_getinfo(struct cacheinfo *this_leaf)
+{
+	this_leaf->type = CACHE_TYPE_INST;
+	__l2x0_getinfo(this_leaf);
+}
+
 static const struct l2c_init_data of_aurora_with_outer_data __initconst = {
 	.type = "Aurora",
 	.way_size_0 = SZ_4K,
@@ -1271,6 +1300,7 @@ static const struct l2c_init_data of_aurora_with_outer_data __initconst = {
 		.disable     = l2x0_disable,
 		.sync        = l2x0_cache_sync,
 		.resume      = aurora_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1284,6 +1314,7 @@ static const struct l2c_init_data of_aurora_no_outer_data __initconst = {
 	.save  = aurora_save,
 	.outer_cache = {
 		.resume      = aurora_resume,
+		.get_info    = aurora_no_outer_data_getinfo,
 	},
 };
 
@@ -1439,6 +1470,7 @@ static const struct l2c_init_data of_bcm_l2x0_data __initconst = {
 		.disable     = l2c310_disable,
 		.sync        = l2c210_sync,
 		.resume      = l2c310_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1475,6 +1507,7 @@ static const struct l2c_init_data of_tauros3_data __initconst = {
 	/* Tauros3 broadcasts L1 cache operations to L2 */
 	.outer_cache = {
 		.resume      = tauros3_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
diff --git a/arch/arm/mm/cache-tauros2.c b/arch/arm/mm/cache-tauros2.c
index b273739e6359..98efe3d12d94 100644
--- a/arch/arm/mm/cache-tauros2.c
+++ b/arch/arm/mm/cache-tauros2.c
@@ -60,6 +60,7 @@ static inline void tauros2_inv_pa(unsigned long addr)
  * noninclusive.
  */
 #define CACHE_LINE_SIZE		32
+#define CACHE_LINE_SHIFT	5
 
 static void tauros2_inv_range(unsigned long start, unsigned long end)
 {
@@ -131,6 +132,39 @@ static void tauros2_resume(void)
 	"mcr	p15, 0, %0, c1, c0, 0 @Enable L2 Cache\n\t"
 	: : "r" (0x0));
 }
+
+/*
+ *  +----------------------------------------+
+ *  | 11 10 9  8 | 7  6  5  4  3 | 2 |  1  0 |
+ *  +----------------------------------------+
+ *  |  way size  | associativity | - |line_sz|
+ *  +----------------------------------------+
+ */
+#define L2CTR_ASSOCIAT_SHIFT	3
+#define L2CTR_ASSOCIAT_MASK	0x1F
+#define L2CTR_WAYSIZE_SHIFT	8
+#define L2CTR_WAYSIZE_MASK	0xF
+#define CACHE_WAY_PER_SET(l2ctr)	\
+	(((l2ctr) >> L2CTR_ASSOCIAT_SHIFT) & L2CTR_ASSOCIAT_MASK)
+#define CACHE_WAY_SIZE(l2ctr)		\
+	(8192 << (((l2ctr) >> L2CTR_WAYSIZE_SHIFT) & L2CTR_WAYSIZE_MASK))
+#define CACHE_SET_SIZE(l2ctr)	(CACHE_WAY_SIZE(l2ctr) >> CACHE_LINE_SHIFT)
+
+static void tauros2_getinfo(struct cacheinfo *this_leaf)
+{
+	unsigned int l2_ctr;
+
+	__asm__("mrc p15, 1, %0, c0, c0, 1" : "=r" (l2_ctr));
+
+	this_leaf->type = CACHE_TYPE_UNIFIED;
+	this_leaf->coherency_line_size = CACHE_LINE_SIZE;
+	this_leaf->ways_of_associativity = CACHE_WAY_PER_SET(l2_ctr);
+	this_leaf->number_of_sets = CACHE_SET_SIZE(l2_ctr);
+	this_leaf->size = this_leaf->coherency_line_size *
+			  this_leaf->number_of_sets *
+			  this_leaf->ways_of_associativity;
+}
+
 #endif
 
 static inline u32 __init read_extra_features(void)
@@ -226,6 +260,7 @@ static void __init tauros2_internal_init(unsigned int features)
 		outer_cache.flush_range = tauros2_flush_range;
 		outer_cache.disable = tauros2_disable;
 		outer_cache.resume = tauros2_resume;
+		outer_cache.get_info = tauros2_getinfo;
 	}
 #endif
 
diff --git a/arch/arm/mm/cache-xsc3l2.c b/arch/arm/mm/cache-xsc3l2.c
index 6c3edeb66e74..0f8ad0431a1b 100644
--- a/arch/arm/mm/cache-xsc3l2.c
+++ b/arch/arm/mm/cache-xsc3l2.c
@@ -201,6 +201,21 @@ static void xsc3_l2_flush_range(unsigned long start, unsigned long end)
 	dsb();
 }
 
+static void xsc3_l2_getinfo(struct cacheinfo *this_leaf)
+{
+	unsigned long l2ctype;
+
+	__asm__("mrc p15, 1, %0, c0, c0, 1" : "=r" (l2ctype));
+
+	this_leaf->type = CACHE_TYPE_UNIFIED;
+	this_leaf->coherency_line_size = CACHE_LINE_SIZE;
+	this_leaf->ways_of_associativity = CACHE_WAY_PER_SET;
+	this_leaf->number_of_sets = CACHE_SET_SIZE(l2ctype);
+	this_leaf->size = this_leaf->coherency_line_size *
+			  this_leaf->number_of_sets *
+			  this_leaf->ways_of_associativity;
+}
+
 static int __init xsc3_l2_init(void)
 {
 	if (!cpu_is_xsc3() || !xsc3_l2_present())
@@ -213,6 +228,7 @@ static int __init xsc3_l2_init(void)
 		outer_cache.inv_range = xsc3_l2_inv_range;
 		outer_cache.clean_range = xsc3_l2_clean_range;
 		outer_cache.flush_range = xsc3_l2_flush_range;
+		outer_cache.get_info    = xsc3_l2_getinfo;
 	}
 
 	return 0;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 00/11] drivers: cacheinfo support
       [not found] ` <1406306692-7135-1-git-send-email-sudeep.holla@arm.com>
                     ` (2 preceding siblings ...)
  2014-07-25 16:44   ` [PATCH v2 9/9] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
@ 2014-08-21 10:59   ` Sudeep Holla
  2014-08-21 10:59     ` [PATCH v3 09/11] ARM64: kernel: add support for cpu cache information Sudeep Holla
                       ` (2 more replies)
  3 siblings, 3 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-08-21 10:59 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This series adds a generic cacheinfo support similar to topology. The
implementation is based on x86 cacheinfo support. Currently x86, powerpc,
ia64 and s390 have their own implementations. While adding similar support
to ARM and ARM64, here is the attempt to make it generic quite similar to
topology info support. It also adds the missing ABI documentation for
the cacheinfo sysfs which is already being used.

It moves all the existing different implementations on x86, ia64, powerpc
and s390 to use the generic cacheinfo infrastructure introduced here.
These changes on non-ARM platforms are only compile tested and tested on x86.

This series also adds support for ARM and ARM64 architectures based on
the generic support.

The code can be fetched from:
 git://linux-arm.org/linux-skn cacheinfo

Changes v2->v3:
	- Added {allocation,write}_policy instead of a single 'attributes' sysfs
	  entry (attributes retained privately on ia64 as it was the only user)
	- factored out show_cpumap into a separate helper in cpumask.h
	- populate cpu_{map,list} for non-DT systems if they are not populated
	  by arch-specific callbacks
	- removed use of the sysfs *_show callback in cache_attrs_is_visible
	- implemented all the review comments from Stephen Boyd

Changes v1->v2:
	- removed custom device_{add,remove}_attrs, using is_visible callback
	  instead(suggested by GregKH)
	- arm64: changes as per MarkR review comments
	- Moved smp_call_function_single to architectures using it (arm, arm64,
	  x86) (suggested by Stephen Boyd)
	- arm (mostly changes as per RMK's review comments)
		- fixed to allow v7 + v6 build
		- l2 cache changes to remove extra structure
		- populated CTR for a few StrongARM CPUs not implementing CTR

[v1] https://lkml.org/lkml/2014/6/25/603
[v2] https://lkml.org/lkml/2014/7/25/467

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-ia64 at vger.kernel.org
Cc: linux390 at de.ibm.com
Cc: linux-s390 at vger.kernel.org
Cc: x86 at kernel.org
Cc: linuxppc-dev at lists.ozlabs.org
Cc: linux-arm-kernel at lists.infradead.org

Sudeep Holla (11):
  cpumask: factor out show_cpumap into separate helper function
  topology: replace custom attribute macros with standard DEVICE_ATTR*
  drivers: base: add new class "cpu" to group cpu devices
  drivers: base: support cpu cache information interface to userspace
    via sysfs
  ia64: move cacheinfo sysfs to generic cacheinfo infrastructure
  s390: move cacheinfo sysfs to generic cacheinfo infrastructure
  x86: move cacheinfo sysfs to generic cacheinfo infrastructure
  powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
  ARM64: kernel: add support for cpu cache information
  ARM: kernel: add support for cpu cache information
  ARM: kernel: add outer cache support for cacheinfo implementation

 Documentation/ABI/testing/sysfs-devices-system-cpu |  47 ++
 arch/arm/include/asm/outercache.h                  |   9 +
 arch/arm/kernel/Makefile                           |   1 +
 arch/arm/kernel/cacheinfo.c                        | 287 ++++++++
 arch/arm/mm/Kconfig                                |  13 +
 arch/arm/mm/cache-l2x0.c                           |  35 +-
 arch/arm/mm/cache-tauros2.c                        |  36 +
 arch/arm/mm/cache-xsc3l2.c                         |  17 +
 arch/arm64/kernel/Makefile                         |   2 +-
 arch/arm64/kernel/cacheinfo.c                      | 142 ++++
 arch/ia64/kernel/topology.c                        | 421 +++--------
 arch/powerpc/kernel/cacheinfo.c                    | 812 +++------------------
 arch/powerpc/kernel/cacheinfo.h                    |   8 -
 arch/powerpc/kernel/sysfs.c                        |  12 +-
 arch/s390/kernel/cache.c                           | 388 +++-------
 arch/x86/kernel/cpu/intel_cacheinfo.c              | 709 +++++-------------
 arch/x86/kernel/cpu/perf_event_amd_iommu.c         |   5 +-
 arch/x86/kernel/cpu/perf_event_amd_uncore.c        |   6 +-
 arch/x86/kernel/cpu/perf_event_intel_rapl.c        |   6 +-
 arch/x86/kernel/cpu/perf_event_intel_uncore.c      |   6 +-
 drivers/acpi/acpi_pad.c                            |   6 +-
 drivers/base/Makefile                              |   2 +-
 drivers/base/cacheinfo.c                           | 543 ++++++++++++++
 drivers/base/core.c                                |  39 +-
 drivers/base/cpu.c                                 |  12 +-
 drivers/base/node.c                                |  14 +-
 drivers/base/topology.c                            |  71 +-
 drivers/pci/pci-sysfs.c                            |  39 +-
 include/linux/cacheinfo.h                          | 100 +++
 include/linux/cpu.h                                |   2 +
 include/linux/cpumask.h                            |  27 +
 31 files changed, 1826 insertions(+), 1991 deletions(-)
 create mode 100644 arch/arm/kernel/cacheinfo.c
 create mode 100644 arch/arm64/kernel/cacheinfo.c
 delete mode 100644 arch/powerpc/kernel/cacheinfo.h
 create mode 100644 drivers/base/cacheinfo.c
 create mode 100644 include/linux/cacheinfo.h

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 09/11] ARM64: kernel: add support for cpu cache information
  2014-08-21 10:59   ` [PATCH v3 00/11] drivers: cacheinfo support Sudeep Holla
@ 2014-08-21 10:59     ` Sudeep Holla
  2014-08-21 10:59     ` [PATCH v3 10/11] ARM: " Sudeep Holla
  2014-08-21 10:59     ` [PATCH v3 11/11] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
  2 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-08-21 10:59 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This patch adds support for cacheinfo on ARM64.

On ARMv8, the cache hierarchy can be identified through Cache Level ID
(CLIDR) register while the cache geometry is provided by Cache Size ID
(CCSIDR) register.
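
As a quick worked example of the decoding this patch performs (the
CCSIDR field values here are assumed for illustration, not read from
any particular implementation):

	LineSize field      = 2    ->  (1 << (2 + 2)) words * 4 = 64 bytes/line
	Associativity field = 15   ->  15 + 1  = 16 ways
	NumSets field       = 511  ->  511 + 1 = 512 sets
	size = 512 sets * 16 ways * 64 bytes/line = 512 KiB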

Since the architecture doesn't provide any way of detecting the cpus
sharing a particular cache, device tree is used for the same purpose.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm64/kernel/Makefile    |   2 +-
 arch/arm64/kernel/cacheinfo.c | 142 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 143 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/cacheinfo.c

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index df7ef8768fc2..285cd88c1e37 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -15,7 +15,7 @@ arm64-obj-y		:= cputable.o debug-monitors.o entry.o irq.o fpsimd.o	\
 			   entry-fpsimd.o process.o ptrace.o setup.o signal.o	\
 			   sys.o stacktrace.o time.o traps.o io.o vdso.o	\
 			   hyp-stub.o psci.o cpu_ops.o insn.o return_address.o	\
-			   cpuinfo.o
+			   cpuinfo.o cacheinfo.o
 
 arm64-obj-$(CONFIG_COMPAT)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o
diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
new file mode 100644
index 000000000000..a9cbf3b40a1f
--- /dev/null
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -0,0 +1,142 @@
+/*
+ *  ARM64 cacheinfo support
+ *
+ *  Copyright (C) 2014 ARM Ltd.
+ *  All Rights Reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/compiler.h>
+#include <linux/of.h>
+
+#include <asm/processor.h>
+
+#define MAX_CACHE_LEVEL			7	/* Max 7 level supported */
+/* Ctypen, bits[3(n - 1) + 2 : 3(n - 1)], for n = 1 to 7 */
+#define CLIDR_CTYPE_SHIFT(level)	(3 * (level - 1))
+#define CLIDR_CTYPE_MASK(level)		(7 << CLIDR_CTYPE_SHIFT(level))
+#define CLIDR_CTYPE(clidr, level)	\
+	(((clidr) & CLIDR_CTYPE_MASK(level)) >> CLIDR_CTYPE_SHIFT(level))
+
+static inline enum cache_type get_cache_type(int level)
+{
+	u64 clidr;
+
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	asm volatile ("mrs     %x0, clidr_el1" : "=r" (clidr));
+	return CLIDR_CTYPE(clidr, level);
+}
+
+/*
+ * NumSets, bits[27:13] - (Number of sets in cache) - 1
+ * Associativity, bits[12:3] - (Associativity of cache) - 1
+ * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
+ */
+#define CCSIDR_WRITE_THROUGH		BIT(31)
+#define CCSIDR_WRITE_BACK		BIT(30)
+#define CCSIDR_READ_ALLOCATE		BIT(29)
+#define CCSIDR_WRITE_ALLOCATE		BIT(28)
+#define CCSIDR_LINESIZE_MASK		0x7
+#define CCSIDR_ASSOCIATIVITY_SHIFT	3
+#define CCSIDR_ASSOCIATIVITY_MASK	0x3FF
+#define CCSIDR_NUMSETS_SHIFT		13
+#define CCSIDR_NUMSETS_MASK		0x7FF
+
+/*
+ * Which cache CCSIDR represents depends on CSSELR value
+ * Make sure no one else changes CSSELR during this
+ * smp_call_function_single prevents preemption for us
+ */
+static inline u32 get_ccsidr(u64 csselr)
+{
+	u64 ccsidr;
+
+	/* Put value into CSSELR */
+	asm volatile("msr csselr_el1, %x0" : : "r" (csselr));
+	isb();
+	/* Read result out of CCSIDR */
+	asm volatile("mrs %x0, ccsidr_el1" : "=r" (ccsidr));
+
+	return (u32)ccsidr;
+}
+
+static void ci_leaf_init(struct cacheinfo *this_leaf,
+			 enum cache_type type, unsigned int level)
+{
+	bool is_instr_cache = type & CACHE_TYPE_INST;
+	u32 tmp = get_ccsidr((level - 1) << 1 | is_instr_cache);
+
+	this_leaf->level = level;
+	this_leaf->type = type;
+	this_leaf->coherency_line_size =
+	    (1 << ((tmp & CCSIDR_LINESIZE_MASK) + 2)) * 4;
+	this_leaf->number_of_sets =
+	    ((tmp >> CCSIDR_NUMSETS_SHIFT) & CCSIDR_NUMSETS_MASK) + 1;
+	this_leaf->ways_of_associativity = ((tmp >> CCSIDR_ASSOCIATIVITY_SHIFT)
+					    & CCSIDR_ASSOCIATIVITY_MASK) + 1;
+	this_leaf->size = this_leaf->number_of_sets *
+	    this_leaf->coherency_line_size * this_leaf->ways_of_associativity;
+	this_leaf->attributes =
+		((tmp & CCSIDR_WRITE_THROUGH) ? CACHE_WRITE_THROUGH : 0) |
+		((tmp & CCSIDR_WRITE_BACK) ? CACHE_WRITE_BACK : 0) |
+		((tmp & CCSIDR_READ_ALLOCATE) ? CACHE_READ_ALLOCATE : 0) |
+		((tmp & CCSIDR_WRITE_ALLOCATE) ? CACHE_WRITE_ALLOCATE : 0);
+}
+
+static int __init_cache_level(unsigned int cpu)
+{
+	unsigned int ctype, level, leaves;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
+		ctype = get_cache_type(level);
+		if (ctype == CACHE_TYPE_NOCACHE) {
+			level--;
+			break;
+		}
+		/* Separate instruction and data caches */
+		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+	}
+
+	this_cpu_ci->num_levels = level;
+	this_cpu_ci->num_leaves = leaves;
+	return 0;
+}
+
+static int __populate_cache_leaves(unsigned int cpu)
+{
+	unsigned int level, idx;
+	enum cache_type type;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+
+	for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
+	     idx < this_cpu_ci->num_leaves; idx++, level++) {
+		type = get_cache_type(level);
+		if (type == CACHE_TYPE_SEPARATE) {
+			ci_leaf_init(this_leaf++, CACHE_TYPE_DATA, level);
+			ci_leaf_init(this_leaf++, CACHE_TYPE_INST, level);
+		} else {
+			ci_leaf_init(this_leaf++, type, level);
+		}
+	}
+	return 0;
+}
+
+DEFINE_SMP_CALL_CACHE_FUNCTION(init_cache_level)
+DEFINE_SMP_CALL_CACHE_FUNCTION(populate_cache_leaves)
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 10/11] ARM: kernel: add support for cpu cache information
  2014-08-21 10:59   ` [PATCH v3 00/11] drivers: cacheinfo support Sudeep Holla
  2014-08-21 10:59     ` [PATCH v3 09/11] ARM64: kernel: add support for cpu cache information Sudeep Holla
@ 2014-08-21 10:59     ` Sudeep Holla
  2014-08-21 10:59     ` [PATCH v3 11/11] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
  2 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-08-21 10:59 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

This patch adds support for cacheinfo on ARM platforms.

On ARMv7, the cache hierarchy can be identified through Cache Level ID
register(CLIDR) while the cache geometry is provided by Cache Size ID
register(CCSIDR).

On architecture versions before ARMv7, CLIDR and CCSIDR are not
implemented. The cache type register (CTR) provides both the cache
hierarchy and geometry if it is implemented. For CPUs that do not
implement CTR, we list the probable value CTR would have had, keyed
by the cpuid, to keep handling them simple.
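
For reference, a worked decode of one CTR cache field using the
formulas this patch applies (the field values are assumed for
illustration, not taken from the StrongARM CTR table):

	len = 2, m = 0, assoc = 5, size = 5
	linelen       = 1 << (len + 3)                = 32 bytes
	multiplier    = 2 + m                         = 2
	associativity = multiplier << (assoc - 1)     = 32 ways
	nsets         = 1 << (size + 6 - assoc - len) = 16 sets
	cache_size    = multiplier << (size + 8)      = 16 KiB
	(cross-check: 16 sets * 32 ways * 32 bytes    = 16 KiB)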

Since the architecture doesn't provide any way of detecting the cpus
sharing a particular cache, device tree is used for the same purpose.
On non-DT platforms, first level caches are per-cpu while higher level
caches are assumed system-wide.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm/kernel/Makefile    |   1 +
 arch/arm/kernel/cacheinfo.c | 275 ++++++++++++++++++++++++++++++++++++++++++++
 arch/arm/mm/Kconfig         |  13 +++
 3 files changed, 289 insertions(+)
 create mode 100644 arch/arm/kernel/cacheinfo.c

diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f83d0e..2c5ff0efb670 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -29,6 +29,7 @@ obj-y		+= entry-v7m.o v7m.o
 else
 obj-y		+= entry-armv.o
 endif
+obj-$(CONFIG_CPU_HAS_CACHE) += cacheinfo.o
 
 obj-$(CONFIG_OC_ETM)		+= etm.o
 obj-$(CONFIG_CPU_IDLE)		+= cpuidle.o
diff --git a/arch/arm/kernel/cacheinfo.c b/arch/arm/kernel/cacheinfo.c
new file mode 100644
index 000000000000..d2659563de31
--- /dev/null
+++ b/arch/arm/kernel/cacheinfo.c
@@ -0,0 +1,275 @@
+/*
+ *  ARM cacheinfo support
+ *
+ *  Copyright (C) 2014 ARM Ltd.
+ *  All Rights Reserved
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/bitops.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/compiler.h>
+#include <linux/of.h>
+
+#include <asm/cputype.h>
+#include <asm/processor.h>
+#include <asm/system_info.h>
+
+#define cache_is_armv7() \
+	(cpu_architecture() >= CPU_ARCH_ARMv7 && !armv6_extended())
+#define MAX_CACHE_LEVEL		(cache_is_armv7() ? 7 : 1)
+
+#define CTR_CTYPE_SHIFT		24
+#define CTR_CTYPE_MASK		(1 << CTR_CTYPE_SHIFT)
+
+struct ctr_info {
+	unsigned int cpuid_part;
+	unsigned int ctr;
+};
+
+/*
+ *  Cache Type Register
+ *  +---------------------------------+
+ *  | 31 29 | 28 25 |24| 23 12 | 11 0 |
+ *  +---------------------------------+
+ *  | 0 0 0 | Ctype | S| Dsize | Isize|
+ *  +---------------------------------+
+ * The table below encodes only Dsize and Isize
+ */
+static struct ctr_info cache_ctr_list[] = {
+	{0x4400a100, 0x0016A16A }, /* SA-110:  32kB D$, 32kB I$ */
+	{0x4400a110, 0x0012A16A }, /* SA-1100: 16kB D$, 32kB I$ */
+	{0x6900b110, 0x0012A16A }, /* SA-1110: 16kB D$, 32kB I$ */
+};
+
+/*
+ * List of CPUs reported as ARMv7 but don't implement CLIDR,
+ * CSSELR and CCSIDR. Cache information is still available from CTR
+ */
+static int armv6_ext_cpuid_part[] = {
+	0x4100b020, /* ARM11MP */
+	0x4100b760, /* ARM1176 */
+};
+
+static bool armv6_extended(void)
+{
+	int i, cpuid_part = read_cpuid_part();
+
+	for (i = 0; i < ARRAY_SIZE(armv6_ext_cpuid_part); i++)
+		if (armv6_ext_cpuid_part[i] == cpuid_part)
+			return true;
+	return false;
+}
+
+static int get_unimplemented_ctr(unsigned int *ctr)
+{
+	int i, cpuid_part = read_cpuid_part();
+
+	for (i = 0; i < ARRAY_SIZE(cache_ctr_list); i++)
+		if (cache_ctr_list[i].cpuid_part == cpuid_part) {
+			*ctr = cache_ctr_list[i].ctr;
+			return 0;
+		}
+	return -ENOENT;
+}
+
+static unsigned int get_ctr(void)
+{
+	unsigned int ctr;
+
+	if (get_unimplemented_ctr(&ctr))
+		ctr = read_cpuid_cachetype();
+	return ctr;
+}
+
+static enum cache_type __get_cache_type(int level)
+{
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	return get_ctr() & CTR_CTYPE_MASK ?
+		CACHE_TYPE_SEPARATE : CACHE_TYPE_UNIFIED;
+}
+
+/*
+ *  +---------------------------------+
+ *  | 9  8  7  6 | 5  4  3 | 2 | 1  0 |
+ *  +---------------------------------+
+ *  |    size    |  assoc  | m |  len |
+ *  +---------------------------------+
+ * linelen        = 1 << (len + 3)
+ * multiplier     = 2 + m
+ * nsets          = 1 << (size + 6 - assoc - len)
+ * associativity  = multiplier << (assoc - 1)
+ * cache_size     = multiplier << (size + 8)
+ */
+#define CTR_LINESIZE_MASK	0x3
+#define CTR_MULTIPLIER_SHIFT	2
+#define CTR_MULTIPLIER_MASK	0x1
+#define CTR_ASSOCIAT_SHIFT	3
+#define CTR_ASSOCIAT_MASK	0x7
+#define CTR_SIZE_SHIFT		6
+#define CTR_SIZE_MASK		0xF
+#define CTR_DCACHE_SHIFT	12
+
+static void __ci_leaf_init(enum cache_type type, struct cacheinfo *this_leaf)
+{
+	unsigned int size, multiplier, assoc, len, tmp = get_ctr();
+
+	if (type == CACHE_TYPE_DATA)
+		tmp >>= CTR_DCACHE_SHIFT;
+
+	len = tmp & CTR_LINESIZE_MASK;
+	size = (tmp >> CTR_SIZE_SHIFT) & CTR_SIZE_MASK;
+	assoc = (tmp >> CTR_ASSOCIAT_SHIFT) & CTR_ASSOCIAT_MASK;
+	multiplier = ((tmp >> CTR_MULTIPLIER_SHIFT) & CTR_MULTIPLIER_MASK) + 2;
+
+	this_leaf->type = type;
+	this_leaf->coherency_line_size = 1 << (len + 3);
+	this_leaf->number_of_sets = 1 << (size + 6 - assoc - len);
+	this_leaf->ways_of_associativity = multiplier << (assoc - 1);
+	this_leaf->size = multiplier << (size + 8);
+}
+
+/* Ctypen, bits[3(n - 1) + 2 : 3(n - 1)], for n = 1 to 7 */
+#define CLIDR_CTYPE_SHIFT(level)	(3 * (level - 1))
+#define CLIDR_CTYPE_MASK(level)		(7 << CLIDR_CTYPE_SHIFT(level))
+#define CLIDR_CTYPE(clidr, level)	\
+	(((clidr) & CLIDR_CTYPE_MASK(level)) >> CLIDR_CTYPE_SHIFT(level))
+
+static inline enum cache_type __armv7_get_cache_type(int level)
+{
+	unsigned int clidr;
+
+	if (level > MAX_CACHE_LEVEL)
+		return CACHE_TYPE_NOCACHE;
+	asm volatile ("mrc p15, 1, %0, c0, c0, 1" : "=r" (clidr));
+	return CLIDR_CTYPE(clidr, level);
+}
+
+/*
+ * NumSets, bits[27:13] - (Number of sets in cache) - 1
+ * Associativity, bits[12:3] - (Associativity of cache) - 1
+ * LineSize, bits[2:0] - (Log2(Number of words in cache line)) - 2
+ */
+#define CCSIDR_WRITE_THROUGH		BIT(31)
+#define CCSIDR_WRITE_BACK		BIT(30)
+#define CCSIDR_READ_ALLOCATE		BIT(29)
+#define CCSIDR_WRITE_ALLOCATE		BIT(28)
+#define CCSIDR_LINESIZE_MASK		0x7
+#define CCSIDR_ASSOCIATIVITY_SHIFT	3
+#define CCSIDR_ASSOCIATIVITY_MASK	0x3FF
+#define CCSIDR_NUMSETS_SHIFT		13
+#define CCSIDR_NUMSETS_MASK		0x7FF
+
+/*
+ * Which cache CCSIDR represents depends on CSSELR value
+ * Make sure no one else changes CSSELR during this
+ * smp_call_function_single prevents preemption for us
+ */
+static inline u32 get_ccsidr(u32 csselr)
+{
+	u32 ccsidr;
+
+	/* Put value into CSSELR */
+	asm volatile ("mcr p15, 2, %0, c0, c0, 0" : : "r" (csselr));
+	isb();
+	/* Read result out of CCSIDR */
+	asm volatile ("mrc p15, 1, %0, c0, c0, 0" : "=r" (ccsidr));
+
+	return ccsidr;
+}
+
+static void __armv7_ci_leaf_init(enum cache_type type,
+				 struct cacheinfo *this_leaf)
+{
+	bool is_instr_cache = type & CACHE_TYPE_INST;
+	u32 tmp = get_ccsidr((this_leaf->level - 1) << 1 | is_instr_cache);
+
+	this_leaf->type = type;
+	this_leaf->coherency_line_size =
+	    (1 << ((tmp & CCSIDR_LINESIZE_MASK) + 2)) * 4;
+	this_leaf->number_of_sets =
+	    ((tmp >> CCSIDR_NUMSETS_SHIFT) & CCSIDR_NUMSETS_MASK) + 1;
+	this_leaf->ways_of_associativity = ((tmp >> CCSIDR_ASSOCIATIVITY_SHIFT)
+					    & CCSIDR_ASSOCIATIVITY_MASK) + 1;
+	this_leaf->size = this_leaf->number_of_sets *
+	    this_leaf->coherency_line_size * this_leaf->ways_of_associativity;
+	this_leaf->attributes =
+		((tmp & CCSIDR_WRITE_THROUGH) ? CACHE_WRITE_THROUGH : 0) |
+		((tmp & CCSIDR_WRITE_BACK) ? CACHE_WRITE_BACK : 0) |
+		((tmp & CCSIDR_READ_ALLOCATE) ? CACHE_READ_ALLOCATE : 0) |
+		((tmp & CCSIDR_WRITE_ALLOCATE) ? CACHE_WRITE_ALLOCATE : 0);
+}
+
+static inline enum cache_type get_cache_type(int level)
+{
+	if (cache_is_armv7())
+		return __armv7_get_cache_type(level);
+	return __get_cache_type(level);
+}
+
+static void ci_leaf_init(struct cacheinfo *this_leaf,
+			 enum cache_type type, unsigned int level)
+{
+	this_leaf->level = level;
+	if (cache_is_armv7())
+		__armv7_ci_leaf_init(type, this_leaf);
+	else
+		__ci_leaf_init(type, this_leaf);
+}
+
+static int __init_cache_level(unsigned int cpu)
+{
+	unsigned int ctype, level, leaves;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
+		ctype = get_cache_type(level);
+		if (ctype == CACHE_TYPE_NOCACHE) {
+			level--;
+			break;
+		}
+		/* Separate instruction and data caches */
+		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+	}
+
+	this_cpu_ci->num_levels = level;
+	this_cpu_ci->num_leaves = leaves;
+
+	return 0;
+}
+
+static int __populate_cache_leaves(unsigned int cpu)
+{
+	unsigned int level, idx;
+	enum cache_type type;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+
+	for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
+	     idx < this_cpu_ci->num_leaves; idx++, level++) {
+		type = get_cache_type(level);
+		if (type == CACHE_TYPE_SEPARATE) {
+			ci_leaf_init(this_leaf++, CACHE_TYPE_DATA, level);
+			ci_leaf_init(this_leaf++, CACHE_TYPE_INST, level);
+		} else {
+			ci_leaf_init(this_leaf++, type, level);
+		}
+	}
+	return 0;
+}
+
+DEFINE_SMP_CALL_CACHE_FUNCTION(init_cache_level)
+DEFINE_SMP_CALL_CACHE_FUNCTION(populate_cache_leaves)
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index ae69809a9e47..9cfbf2fa8bc4 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -494,30 +494,42 @@ config CPU_PABRT_V7
 # The cache model
 config CPU_CACHE_V4
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V4WT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V4WB
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V6
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_V7
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_NOP
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_VIVT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_VIPT
 	bool
+	select CPU_HAS_CACHE
 
 config CPU_CACHE_FA
 	bool
+	select CPU_HAS_CACHE
+
+config CPU_HAS_CACHE
+	bool
 
 if MMU
 # The copy-page model
@@ -845,6 +857,7 @@ config DMA_CACHE_RWFO
 
 config OUTER_CACHE
 	bool
+	select CPU_HAS_CACHE
 
 config OUTER_CACHE_SYNC
 	bool
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 11/11] ARM: kernel: add outer cache support for cacheinfo implementation
  2014-08-21 10:59   ` [PATCH v3 00/11] drivers: cacheinfo support Sudeep Holla
  2014-08-21 10:59     ` [PATCH v3 09/11] ARM64: kernel: add support for cpu cache information Sudeep Holla
  2014-08-21 10:59     ` [PATCH v3 10/11] ARM: " Sudeep Holla
@ 2014-08-21 10:59     ` Sudeep Holla
  2 siblings, 0 replies; 29+ messages in thread
From: Sudeep Holla @ 2014-08-21 10:59 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sudeep Holla <sudeep.holla@arm.com>

In order to support outer cache in the cacheinfo infrastructure, a new
function 'get_info' is added to outer_cache_fns. This function is used
to get the outer cache information namely: line size, number of ways of
associativity and number of sets.

This patch adds 'get_info' support to all L2 cache implementations on
ARM except Marvell's Feroceon L2 cache.
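
With the 'get_info' hook in place, the outer cache simply shows up as
one extra leaf in the existing cacheinfo sysfs layout, for example
(values purely illustrative, for a hypothetical system with split L1
caches and a unified outer L2):

	/sys/devices/system/cpu/cpu0/cache/index0/level : 1 (type: Data)
	/sys/devices/system/cpu/cpu0/cache/index1/level : 1 (type: Instruction)
	/sys/devices/system/cpu/cpu0/cache/index2/level : 2 (type: Unified)
	/sys/devices/system/cpu/cpu0/cache/index2/coherency_line_size : 32
	/sys/devices/system/cpu/cpu0/cache/index2/ways_of_associativity : 16
	/sys/devices/system/cpu/cpu0/cache/index2/size : 1024K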

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
---
 arch/arm/include/asm/outercache.h |  9 +++++++++
 arch/arm/kernel/cacheinfo.c       | 14 +++++++++++++-
 arch/arm/mm/cache-l2x0.c          | 35 ++++++++++++++++++++++++++++++++++-
 arch/arm/mm/cache-tauros2.c       | 36 ++++++++++++++++++++++++++++++++++++
 arch/arm/mm/cache-xsc3l2.c        | 17 +++++++++++++++++
 5 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index 891a56b35bcf..e063d8c87077 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -23,7 +23,10 @@
 
 #include <linux/types.h>
 
+struct cacheinfo;
+
 struct outer_cache_fns {
+	void (*get_info)(struct cacheinfo *);
 	void (*inv_range)(unsigned long, unsigned long);
 	void (*clean_range)(unsigned long, unsigned long);
 	void (*flush_range)(unsigned long, unsigned long);
@@ -112,6 +115,11 @@ static inline void outer_resume(void)
 		outer_cache.resume();
 }
 
+static inline void outer_get_info(struct cacheinfo *this_leaf)
+{
+	if (outer_cache.get_info)
+		outer_cache.get_info(this_leaf);
+}
 #else
 
 static inline void outer_inv_range(phys_addr_t start, phys_addr_t end)
@@ -123,6 +131,7 @@ static inline void outer_flush_range(phys_addr_t start, phys_addr_t end)
 static inline void outer_flush_all(void) { }
 static inline void outer_disable(void) { }
 static inline void outer_resume(void) { }
+static inline void outer_get_info(struct cacheinfo *this_leaf) { }
 
 #endif
 
diff --git a/arch/arm/kernel/cacheinfo.c b/arch/arm/kernel/cacheinfo.c
index d2659563de31..8b8043a2c73f 100644
--- a/arch/arm/kernel/cacheinfo.c
+++ b/arch/arm/kernel/cacheinfo.c
@@ -24,6 +24,7 @@
 #include <linux/of.h>
 
 #include <asm/cputype.h>
+#include <asm/outercache.h>
 #include <asm/processor.h>
 #include <asm/system_info.h>
 
@@ -220,11 +221,19 @@ static inline enum cache_type get_cache_type(int level)
 	return __get_cache_type(level);
 }
 
+static inline void __outer_ci_leaf_init(struct cacheinfo *this_leaf)
+{
+	outer_get_info(this_leaf);
+	BUG_ON(this_leaf->type == CACHE_TYPE_SEPARATE);
+}
+
 static void ci_leaf_init(struct cacheinfo *this_leaf,
 			 enum cache_type type, unsigned int level)
 {
 	this_leaf->level = level;
-	if (cache_is_armv7())
+	if (type == CACHE_TYPE_NOCACHE)	/* must be outer cache */
+		__outer_ci_leaf_init(this_leaf);
+	else if (cache_is_armv7())
 		__armv7_ci_leaf_init(type, this_leaf);
 	else
 		__ci_leaf_init(type, this_leaf);
@@ -248,6 +257,9 @@ static int __init_cache_level(unsigned int cpu)
 	this_cpu_ci->num_levels = level;
 	this_cpu_ci->num_leaves = leaves;
 
+	if (IS_ENABLED(CONFIG_OUTER_CACHE) && outer_cache.get_info)
+		this_cpu_ci->num_leaves++, this_cpu_ci->num_levels++;
+
 	return 0;
 }
 
diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
index 5f2c988a06ac..4114b1944807 100644
--- a/arch/arm/mm/cache-l2x0.c
+++ b/arch/arm/mm/cache-l2x0.c
@@ -17,6 +17,7 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */
 #include <linux/cpu.h>
+#include <linux/cacheinfo.h>
 #include <linux/err.h>
 #include <linux/init.h>
 #include <linux/smp.h>
@@ -105,6 +106,22 @@ static inline void l2c_unlock(void __iomem *base, unsigned num)
 	}
 }
 
+static void __l2x0_getinfo(struct cacheinfo *this_leaf)
+{
+	unsigned int assoc = get_count_order(l2x0_way_mask);
+
+	this_leaf->size = l2x0_size;
+	this_leaf->coherency_line_size = CACHE_LINE_SIZE;
+	this_leaf->ways_of_associativity = assoc;
+	this_leaf->number_of_sets = l2x0_size / (assoc * CACHE_LINE_SIZE);
+}
+
+static void l2x0_getinfo(struct cacheinfo *this_leaf)
+{
+	this_leaf->type = CACHE_TYPE_UNIFIED;
+	__l2x0_getinfo(this_leaf);
+}
+
 /*
  * Enable the L2 cache controller.  This function must only be
  * called when the cache controller is known to be disabled.
@@ -309,6 +326,7 @@ static const struct l2c_init_data l2c210_data __initconst = {
 		.disable = l2c_disable,
 		.sync = l2c210_sync,
 		.resume = l2c210_resume,
+		.get_info = l2x0_getinfo,
 	},
 };
 
@@ -466,6 +484,7 @@ static const struct l2c_init_data l2c220_data = {
 		.disable = l2c_disable,
 		.sync = l2c220_sync,
 		.resume = l2c210_resume,
+		.get_info = l2x0_getinfo,
 	},
 };
 
@@ -814,6 +833,7 @@ static const struct l2c_init_data l2c310_init_fns __initconst = {
 		.disable = l2c310_disable,
 		.sync = l2c210_sync,
 		.resume = l2c310_resume,
+		.get_info = l2x0_getinfo,
 	},
 };
 
@@ -894,7 +914,6 @@ static void __init __l2c_init(const struct l2c_init_data *data,
 		data->enable(l2x0_base, aux, data->num_lock);
 
 	outer_cache = fns;
-
 	/*
 	 * It is strange to save the register state before initialisation,
 	 * but hey, this is what the DT implementations decided to do.
@@ -994,6 +1013,7 @@ static const struct l2c_init_data of_l2c210_data __initconst = {
 		.disable     = l2c_disable,
 		.sync        = l2c210_sync,
 		.resume      = l2c210_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1012,6 +1032,7 @@ static const struct l2c_init_data of_l2c220_data __initconst = {
 		.disable     = l2c_disable,
 		.sync        = l2c220_sync,
 		.resume      = l2c210_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1065,6 +1086,7 @@ static const struct l2c_init_data of_l2c310_data __initconst = {
 		.disable     = l2c310_disable,
 		.sync        = l2c210_sync,
 		.resume      = l2c310_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1092,6 +1114,7 @@ static const struct l2c_init_data of_l2c310_coherent_data __initconst = {
 		.flush_all   = l2c210_flush_all,
 		.disable     = l2c310_disable,
 		.resume      = l2c310_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1255,6 +1278,12 @@ static void __init aurora_of_parse(const struct device_node *np,
 	*aux_mask &= ~mask;
 }
 
+static void aurora_no_outer_data_getinfo(struct cacheinfo *this_leaf)
+{
+	this_leaf->type = CACHE_TYPE_INST;
+	__l2x0_getinfo(this_leaf);
+}
+
 static const struct l2c_init_data of_aurora_with_outer_data __initconst = {
 	.type = "Aurora",
 	.way_size_0 = SZ_4K,
@@ -1271,6 +1300,7 @@ static const struct l2c_init_data of_aurora_with_outer_data __initconst = {
 		.disable     = l2x0_disable,
 		.sync        = l2x0_cache_sync,
 		.resume      = aurora_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1284,6 +1314,7 @@ static const struct l2c_init_data of_aurora_no_outer_data __initconst = {
 	.save  = aurora_save,
 	.outer_cache = {
 		.resume      = aurora_resume,
+		.get_info    = aurora_no_outer_data_getinfo,
 	},
 };
 
@@ -1439,6 +1470,7 @@ static const struct l2c_init_data of_bcm_l2x0_data __initconst = {
 		.disable     = l2c310_disable,
 		.sync        = l2c210_sync,
 		.resume      = l2c310_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
@@ -1475,6 +1507,7 @@ static const struct l2c_init_data of_tauros3_data __initconst = {
 	/* Tauros3 broadcasts L1 cache operations to L2 */
 	.outer_cache = {
 		.resume      = tauros3_resume,
+		.get_info    = l2x0_getinfo,
 	},
 };
 
diff --git a/arch/arm/mm/cache-tauros2.c b/arch/arm/mm/cache-tauros2.c
index b273739e6359..0d4dd8ff6a56 100644
--- a/arch/arm/mm/cache-tauros2.c
+++ b/arch/arm/mm/cache-tauros2.c
@@ -14,6 +14,7 @@
  *   Document ID MV-S105190-00, Rev 0.7, March 14 2008.
  */
 
+#include <linux/cacheinfo.h>
 #include <linux/init.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
@@ -60,6 +61,7 @@ static inline void tauros2_inv_pa(unsigned long addr)
  * noninclusive.
  */
 #define CACHE_LINE_SIZE		32
+#define CACHE_LINE_SHIFT	5
 
 static void tauros2_inv_range(unsigned long start, unsigned long end)
 {
@@ -131,6 +133,39 @@ static void tauros2_resume(void)
 	"mcr	p15, 0, %0, c1, c0, 0 @Enable L2 Cache\n\t"
 	: : "r" (0x0));
 }
+
+/*
+ *  +----------------------------------------+
+ *  | 11 10 9  8 | 7  6  5  4  3 | 2 |  1  0 |
+ *  +----------------------------------------+
+ *  |  way size  | associativity | - |line_sz|
+ *  +----------------------------------------+
+ */
+#define L2CTR_ASSOCIAT_SHIFT	3
+#define L2CTR_ASSOCIAT_MASK	0x1F
+#define L2CTR_WAYSIZE_SHIFT	8
+#define L2CTR_WAYSIZE_MASK	0xF
+#define CACHE_WAY_PER_SET(l2ctr)	\
+	(((l2ctr) >> L2CTR_ASSOCIAT_SHIFT) & L2CTR_ASSOCIAT_MASK)
+#define CACHE_WAY_SIZE(l2ctr)		\
+	(8192 << (((l2ctr) >> L2CTR_WAYSIZE_SHIFT) & L2CTR_WAYSIZE_MASK))
+#define CACHE_SET_SIZE(l2ctr)	(CACHE_WAY_SIZE(l2ctr) >> CACHE_LINE_SHIFT)
+
+static void tauros2_getinfo(struct cacheinfo *this_leaf)
+{
+	unsigned int l2_ctr;
+
+	__asm__("mrc p15, 1, %0, c0, c0, 1" : "=r" (l2_ctr));
+
+	this_leaf->type = CACHE_TYPE_UNIFIED;
+	this_leaf->coherency_line_size = CACHE_LINE_SIZE;
+	this_leaf->ways_of_associativity = CACHE_WAY_PER_SET(l2_ctr);
+	this_leaf->number_of_sets = CACHE_SET_SIZE(l2_ctr);
+	this_leaf->size = this_leaf->coherency_line_size *
+			  this_leaf->number_of_sets *
+			  this_leaf->ways_of_associativity;
+}
+
 #endif
 
 static inline u32 __init read_extra_features(void)
@@ -226,6 +261,7 @@ static void __init tauros2_internal_init(unsigned int features)
 		outer_cache.flush_range = tauros2_flush_range;
 		outer_cache.disable = tauros2_disable;
 		outer_cache.resume = tauros2_resume;
+		outer_cache.get_info = tauros2_getinfo;
 	}
 #endif
 
diff --git a/arch/arm/mm/cache-xsc3l2.c b/arch/arm/mm/cache-xsc3l2.c
index 6c3edeb66e74..175bf44eb039 100644
--- a/arch/arm/mm/cache-xsc3l2.c
+++ b/arch/arm/mm/cache-xsc3l2.c
@@ -16,6 +16,7 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */
+#include <linux/cacheinfo.h>
 #include <linux/init.h>
 #include <linux/highmem.h>
 #include <asm/cp15.h>
@@ -201,6 +202,21 @@ static void xsc3_l2_flush_range(unsigned long start, unsigned long end)
 	dsb();
 }
 
+static void xsc3_l2_getinfo(struct cacheinfo *this_leaf)
+{
+	unsigned long l2ctype;
+
+	__asm__("mrc p15, 1, %0, c0, c0, 1" : "=r" (l2ctype));
+
+	this_leaf->type = CACHE_TYPE_UNIFIED;
+	this_leaf->coherency_line_size = CACHE_LINE_SIZE;
+	this_leaf->ways_of_associativity = CACHE_WAY_PER_SET;
+	this_leaf->number_of_sets = CACHE_SET_SIZE(l2ctype);
+	this_leaf->size = this_leaf->coherency_line_size *
+			  this_leaf->number_of_sets *
+			  this_leaf->ways_of_associativity;
+}
+
 static int __init xsc3_l2_init(void)
 {
 	if (!cpu_is_xsc3() || !xsc3_l2_present())
@@ -213,6 +229,7 @@ static int __init xsc3_l2_init(void)
 		outer_cache.inv_range = xsc3_l2_inv_range;
 		outer_cache.clean_range = xsc3_l2_clean_range;
 		outer_cache.flush_range = xsc3_l2_flush_range;
+		outer_cache.get_info    = xsc3_l2_getinfo;
 	}
 
 	return 0;
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-08-21 10:59 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-25 17:30 [PATCH 0/9] drivers: cacheinfo support Sudeep Holla
2014-06-25 17:30 ` [PATCH 2/9] drivers: base: support cpu cache information interface to userspace via sysfs Sudeep Holla
2014-06-25 22:23   ` Russell King - ARM Linux
2014-06-26 18:41     ` Sudeep Holla
2014-06-26 18:50       ` Russell King - ARM Linux
2014-06-26 19:03         ` Sudeep Holla
2014-07-10  0:09   ` Greg Kroah-Hartman
2014-07-10 13:37     ` Sudeep Holla
2014-06-25 17:30 ` [PATCH 7/9] ARM64: kernel: add support for cpu cache information Sudeep Holla
2014-06-27 10:36   ` Mark Rutland
2014-06-27 11:22     ` Sudeep Holla
2014-06-27 11:34       ` Mark Rutland
2014-06-25 17:30 ` [PATCH 8/9] ARM: " Sudeep Holla
2014-06-25 22:33   ` Russell King - ARM Linux
2014-06-26 11:33     ` Sudeep Holla
2014-06-26  0:19   ` Stephen Boyd
2014-06-26 11:36     ` Sudeep Holla
2014-06-26 18:45       ` Stephen Boyd
2014-06-27  9:38         ` Sudeep Holla
2014-06-25 17:30 ` [PATCH 9/9] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
2014-06-25 22:37   ` Russell King - ARM Linux
2014-06-26 13:02     ` Sudeep Holla
     [not found] ` <1406306692-7135-1-git-send-email-sudeep.holla@arm.com>
2014-07-25 16:44   ` [PATCH v2 7/9] ARM64: kernel: add support for cpu cache information Sudeep Holla
2014-07-25 16:44   ` [PATCH v2 8/9] ARM: " Sudeep Holla
2014-07-25 16:44   ` [PATCH v2 9/9] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
2014-08-21 10:59   ` [PATCH v3 00/11] drivers: cacheinfo support Sudeep Holla
2014-08-21 10:59     ` [PATCH v3 09/11] ARM64: kernel: add support for cpu cache information Sudeep Holla
2014-08-21 10:59     ` [PATCH v3 10/11] ARM: " Sudeep Holla
2014-08-21 10:59     ` [PATCH v3 11/11] ARM: kernel: add outer cache support for cacheinfo implementation Sudeep Holla
