* [RFC PATCH v5 05/19] memory-hotplug: check whether memory is present or not
From: Wen Congyang @ 2012-07-27 10:28 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
In-Reply-To: <50126B83.3050201@cn.fujitsu.com>
From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
If the system supports memory hot-remove, online_pages() may try to online
pages that have already been removed. So online_pages() needs to check
whether the pages being onlined are present.
CC: David Rientjes <rientjes@google.com>
CC: Jiang Liu <liuj97@gmail.com>
CC: Len Brown <len.brown@intel.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
include/linux/mmzone.h | 19 +++++++++++++++++++
mm/memory_hotplug.c | 13 +++++++++++++
2 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 458988b..822f705 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1168,6 +1168,25 @@ void sparse_init(void);
#define sparse_index_init(_sec, _nid) do {} while (0)
#endif /* CONFIG_SPARSEMEM */
+#ifdef CONFIG_SPARSEMEM
+static inline int pfns_present(unsigned long pfn, unsigned long nr_pages)
+{
+ unsigned long i;
+ for (i = 0; i < nr_pages; i++) {
+ if (pfn_present(pfn + i))
+ continue;
+ else
+ return -EINVAL;
+ }
+ return 0;
+}
+#else
+static inline int pfns_present(unsigned long pfn, unsigned long nr_pages)
+{
+ return 0;
+}
+#endif /* CONFIG_SPARSEMEM*/
+
#ifdef CONFIG_NODES_SPAN_OTHER_NODES
bool early_pfn_in_nid(unsigned long pfn, int nid);
#else
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 5af0a9f..d510be0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -467,6 +467,19 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages)
struct memory_notify arg;
lock_memory_hotplug();
+ /*
+ * If system supports memory hot-remove, the memory may have been
+ * removed. So we check whether the memory has been removed or not.
+ *
+ * Note: When CONFIG_SPARSEMEM is defined, pfns_present() becomes
+ * effective. If CONFIG_SPARSEMEM is not defined, pfns_present()
+ * always returns 0.
+ */
+ ret = pfns_present(pfn, nr_pages);
+ if (ret) {
+ unlock_memory_hotplug();
+ return ret;
+ }
arg.start_pfn = pfn;
arg.nr_pages = nr_pages;
arg.status_change_nid = -1;
--
1.7.1
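For readers who want to try the presence check outside the kernel, here is a minimal userspace sketch. It is not the patch's code: the presence table and every sketch_* name are hypothetical stand-ins for the sparsemem section map and pfn_present(). It only shows the shape of the loop, where the loop index is added to the base pfn so that every page in the range is tested.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the section map: one flag per pfn. */
#define SKETCH_MAX_PFN 64UL
static bool sketch_present[SKETCH_MAX_PFN];

/* Hypothetical replacement for pfn_present(); out-of-range pfns are absent. */
static bool sketch_pfn_present(unsigned long pfn)
{
	return pfn < SKETCH_MAX_PFN && sketch_present[pfn];
}

/* Return 0 if every pfn in [pfn, pfn + nr_pages) is present, else -EINVAL. */
static int sketch_pfns_present(unsigned long pfn, unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		if (!sketch_pfn_present(pfn + i))
			return -EINVAL;
	}
	return 0;
}

/* Test helper: mark [pfn, pfn + nr_pages) as present or absent. */
static void sketch_set_present(unsigned long pfn, unsigned long nr_pages,
			       bool present)
{
	unsigned long i;

	assert(pfn + nr_pages <= SKETCH_MAX_PFN);
	for (i = 0; i < nr_pages; i++)
		sketch_present[pfn + i] = present;
}
```

With a single hole in the middle of a range, the check fails for the whole range but still succeeds for the sub-ranges on either side of the hole.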
* [RFC PATCH v5 04/19] memory-hotplug: offline and remove memory when removing the memory device
From: Wen Congyang @ 2012-07-27 10:27 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
In-Reply-To: <50126B83.3050201@cn.fujitsu.com>
From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
We should offline and remove memory when removing the memory device.
The memory device can be removed in two ways:
1. send an eject request by SCI
2. echo 1 >/sys/bus/acpi/devices/PNP0C80:XX/eject
In the 1st case, acpi_memory_disable_device() will be called. In the 2nd
case, acpi_memory_device_remove() will be called. acpi_memory_device_remove()
will also be called when we unbind the memory device from the driver
acpi_memhotplug. If the type is ACPI_BUS_REMOVAL_EJECT, it means
that the user wants to eject the memory device, and we should offline
and remove memory in acpi_memory_device_remove().
The function remove_memory() is not implemented yet. For now it only checks
whether all of the memory has been offlined.
CC: David Rientjes <rientjes@google.com>
CC: Jiang Liu <liuj97@gmail.com>
CC: Len Brown <len.brown@intel.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
drivers/acpi/acpi_memhotplug.c | 42 +++++++++++++++++++++++++++++++++------
drivers/base/memory.c | 39 +++++++++++++++++++++++++++++++++++++
include/linux/memory.h | 5 ++++
include/linux/memory_hotplug.h | 5 ++++
mm/memory_hotplug.c | 22 ++++++++++++++++++++
5 files changed, 106 insertions(+), 7 deletions(-)
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 293d718..ed37fc2 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -29,6 +29,7 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/types.h>
+#include <linux/memory.h>
#include <linux/memory_hotplug.h>
#include <linux/slab.h>
#include <acpi/acpi_drivers.h>
@@ -310,26 +311,42 @@ static int acpi_memory_powerdown_device(struct acpi_memory_device *mem_device)
return 0;
}
-static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
+static int
+acpi_memory_device_remove_memory(struct acpi_memory_device *mem_device)
{
int result;
struct acpi_memory_info *info, *n;
+ int node = mem_device->nid;
-
- /*
- * Ask the VM to offline this memory range.
- * Note: Assume that this function returns zero on success
- */
list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
if (info->enabled) {
result = offline_memory(info->start_addr, info->length);
if (result)
return result;
+
+ result = remove_memory(node, info->start_addr,
+ info->length);
+ if (result)
+ return result;
}
+
list_del(&info->list);
kfree(info);
}
+ return 0;
+}
+
+static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
+{
+ int result;
+
+ /*
+ * Ask the VM to offline this memory range.
+ * Note: Assume that this function returns zero on success
+ */
+ result = acpi_memory_device_remove_memory(mem_device);
+
/* Power-off and eject the device */
result = acpi_memory_powerdown_device(mem_device);
if (result) {
@@ -478,12 +495,23 @@ static int acpi_memory_device_add(struct acpi_device *device)
static int acpi_memory_device_remove(struct acpi_device *device, int type)
{
struct acpi_memory_device *mem_device = NULL;
-
+ int result;
if (!device || !acpi_driver_data(device))
return -EINVAL;
mem_device = acpi_driver_data(device);
+
+ if (type == ACPI_BUS_REMOVAL_EJECT) {
+ /*
+ * offline and remove memory only when the memory device is
+ * ejected.
+ */
+ result = acpi_memory_device_remove_memory(mem_device);
+ if (result)
+ return result;
+ }
+
kfree(mem_device);
return 0;
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 86c8821..038be73 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -70,6 +70,45 @@ void unregister_memory_isolate_notifier(struct notifier_block *nb)
}
EXPORT_SYMBOL(unregister_memory_isolate_notifier);
+bool is_memblk_offline(unsigned long start, unsigned long size)
+{
+ struct memory_block *mem = NULL;
+ struct mem_section *section;
+ unsigned long start_pfn, end_pfn;
+ unsigned long pfn, section_nr;
+
+ start_pfn = PFN_DOWN(start);
+ end_pfn = PFN_UP(start + size);
+
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ section_nr = pfn_to_section_nr(pfn);
+ if (!present_section_nr(section_nr))
+ continue;
+
+ section = __nr_to_section(section_nr);
+ /* same memblock? */
+ if (mem)
+ if ((section_nr >= mem->start_section_nr) &&
+ (section_nr <= mem->end_section_nr))
+ continue;
+
+ mem = find_memory_block_hinted(section, mem);
+ if (!mem)
+ continue;
+ if (mem->state == MEM_OFFLINE)
+ continue;
+
+ kobject_put(&mem->dev.kobj);
+ return false;
+ }
+
+ if (mem)
+ kobject_put(&mem->dev.kobj);
+
+ return true;
+}
+EXPORT_SYMBOL(is_memblk_offline);
+
/*
* register_memory - Setup a sysfs device for a memory block
*/
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 1ac7f6e..7c66126 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -106,6 +106,10 @@ static inline int memory_isolate_notify(unsigned long val, void *v)
{
return 0;
}
+static inline bool is_memblk_offline(unsigned long start, unsigned long size)
+{
+ return false;
+}
#else
extern int register_memory_notifier(struct notifier_block *nb);
extern void unregister_memory_notifier(struct notifier_block *nb);
@@ -120,6 +124,7 @@ extern int memory_isolate_notify(unsigned long val, void *v);
extern struct memory_block *find_memory_block_hinted(struct mem_section *,
struct memory_block *);
extern struct memory_block *find_memory_block(struct mem_section *);
+extern bool is_memblk_offline(unsigned long start, unsigned long size);
#define CONFIG_MEM_BLOCK_SIZE (PAGES_PER_SECTION<<PAGE_SHIFT)
enum mem_add_context { BOOT, HOTPLUG };
#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 0b040bb..fd84ea9 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -222,6 +222,7 @@ static inline void unlock_memory_hotplug(void) {}
#ifdef CONFIG_MEMORY_HOTREMOVE
extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
+extern int remove_memory(int nid, u64 start, u64 size);
#else
static inline int is_mem_section_removable(unsigned long pfn,
@@ -229,6 +230,10 @@ static inline int is_mem_section_removable(unsigned long pfn,
{
return 0;
}
+static inline int remove_memory(int nid, u64 start, u64 size)
+{
+ return -EBUSY;
+}
#endif /* CONFIG_MEMORY_HOTREMOVE */
extern int mem_online_node(int nid);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 992454a..5af0a9f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1034,6 +1034,28 @@ int offline_memory(u64 start, u64 size)
return 0;
}
+
+int remove_memory(int nid, u64 start, u64 size)
+{
+ int ret = -EBUSY;
+ lock_memory_hotplug();
+ /*
+ * The memory might be onlined by another task, even after you offline it.
+ * So we check whether the memory has been onlined or not.
+ */
+ if (!is_memblk_offline(start, size)) {
+ pr_warn("memory removing [mem %#010llx-%#010llx] failed, "
+ "because the memory range is online\n",
+ start, start + size);
+ ret = -EAGAIN;
+ }
+
+ unlock_memory_hotplug();
+ return ret;
+
+}
+EXPORT_SYMBOL_GPL(remove_memory);
+
#else
int offline_pages(u64 start, u64 size)
{
--
1.7.1
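The range walk in is_memblk_offline() can be tried in userspace with the sketch below. It is a simplification under stated assumptions, not the kernel code: state is tracked per section rather than per memory_block, the page size and section size are made-up small values, the start address is assumed to be section-aligned, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT        12    /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE         (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGES_PER_SECTION 8ULL  /* tiny on purpose */
#define SKETCH_NR_SECTIONS       16ULL

enum sketch_state { SKETCH_ABSENT = 0, SKETCH_OFFLINE, SKETCH_ONLINE };

/* Hypothetical stand-in for the section/memory_block lookup. */
static enum sketch_state sketch_section_state[SKETCH_NR_SECTIONS];

/*
 * Return true if no present section in [start, start + size) is online.
 * Absent sections are skipped, mirroring the present_section_nr() test.
 */
static bool sketch_range_is_offline(unsigned long long start,
				    unsigned long long size)
{
	/* PFN_DOWN(start) and PFN_UP(start + size) */
	unsigned long long start_pfn = start >> SKETCH_PAGE_SHIFT;
	unsigned long long end_pfn =
		(start + size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	unsigned long long pfn, nr;

	/* The section-sized stride only visits every section if aligned. */
	assert(start_pfn % SKETCH_PAGES_PER_SECTION == 0);

	for (pfn = start_pfn; pfn < end_pfn; pfn += SKETCH_PAGES_PER_SECTION) {
		nr = pfn / SKETCH_PAGES_PER_SECTION;
		if (nr >= SKETCH_NR_SECTIONS ||
		    sketch_section_state[nr] == SKETCH_ABSENT)
			continue;
		if (sketch_section_state[nr] == SKETCH_ONLINE)
			return false;
	}
	return true;
}
```

A single online section anywhere in the range is enough to make the check fail, which is the condition remove_memory() reports with -EAGAIN.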
* [RFC PATCH v5 03/19] memory-hotplug: store the node id in acpi_memory_device
From: Wen Congyang @ 2012-07-27 10:27 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
In-Reply-To: <50126B83.3050201@cn.fujitsu.com>
The memory device has only one node id. Store the node id when
enabling the memory device, so that we can reuse it when removing the
memory device.
CC: David Rientjes <rientjes@google.com>
CC: Jiang Liu <liuj97@gmail.com>
CC: Len Brown <len.brown@intel.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
drivers/acpi/acpi_memhotplug.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 8957ed9..293d718 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -83,6 +83,7 @@ struct acpi_memory_info {
struct acpi_memory_device {
struct acpi_device * device;
unsigned int state; /* State of the memory device */
+ int nid;
struct list_head res_list;
};
@@ -256,6 +257,9 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
info->enabled = 1;
num_enabled++;
}
+
+ mem_device->nid = node;
+
if (!num_enabled) {
printk(KERN_ERR PREFIX "add_memory failed\n");
mem_device->state = MEMORY_INVALID_STATE;
--
1.7.1
* [RFC PATCH v5 02/19] memory-hotplug: implement offline_memory()
From: Wen Congyang @ 2012-07-27 10:26 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
In-Reply-To: <50126B83.3050201@cn.fujitsu.com>
The function offline_memory() will be called when hot removing a
memory device. The memory device may contain more than one memory
block. If the memory block has been offlined, __offline_pages()
will fail. So we should try to offline one memory block at a
time.
If the memory block is offlined in offline_memory(), we also
update its state, and notify userspace that its state has
changed.
The function offline_memory() also checks each memory block's
state. So there is no need to check the memory block's state
before calling offline_memory().
CC: David Rientjes <rientjes@google.com>
CC: Jiang Liu <liuj97@gmail.com>
CC: Len Brown <len.brown@intel.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
CC: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
drivers/base/memory.c | 31 +++++++++++++++++++++++++++----
include/linux/memory_hotplug.h | 2 ++
mm/memory_hotplug.c | 37 ++++++++++++++++++++++++++++++++++++-
3 files changed, 65 insertions(+), 5 deletions(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 44e7de6..86c8821 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -275,13 +275,11 @@ memory_block_action(unsigned long phys_index, unsigned long action)
return ret;
}
-static int memory_block_change_state(struct memory_block *mem,
+static int __memory_block_change_state(struct memory_block *mem,
unsigned long to_state, unsigned long from_state_req)
{
int ret = 0;
- mutex_lock(&mem->state_mutex);
-
if (mem->state != from_state_req) {
ret = -EINVAL;
goto out;
@@ -309,10 +307,20 @@ static int memory_block_change_state(struct memory_block *mem,
break;
}
out:
- mutex_unlock(&mem->state_mutex);
return ret;
}
+static int memory_block_change_state(struct memory_block *mem,
+ unsigned long to_state, unsigned long from_state_req)
+{
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+ ret = __memory_block_change_state(mem, to_state, from_state_req);
+ mutex_unlock(&mem->state_mutex);
+
+ return ret;
+}
static ssize_t
store_mem_state(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
@@ -653,6 +661,21 @@ int unregister_memory_section(struct mem_section *section)
}
/*
+ * offline one memory block. If the memory block has been offlined, do nothing.
+ */
+int offline_memory_block(struct memory_block *mem)
+{
+ int ret = 0;
+
+ mutex_lock(&mem->state_mutex);
+ if (mem->state != MEM_OFFLINE)
+ ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
+ mutex_unlock(&mem->state_mutex);
+
+ return ret;
+}
+
+/*
* Initialize the sysfs support for memory devices...
*/
int __init memory_dev_init(void)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index c183f39..0b040bb 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -10,6 +10,7 @@ struct page;
struct zone;
struct pglist_data;
struct mem_section;
+struct memory_block;
#ifdef CONFIG_MEMORY_HOTPLUG
@@ -234,6 +235,7 @@ extern int mem_online_node(int nid);
extern int add_memory(int nid, u64 start, u64 size);
extern int arch_add_memory(int nid, u64 start, u64 size);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
+extern int offline_memory_block(struct memory_block *mem);
extern int offline_memory(u64 start, u64 size);
extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
int nr_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7a6659f..992454a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -997,7 +997,42 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
int offline_memory(u64 start, u64 size)
{
- return -EINVAL;
+ struct memory_block *mem = NULL;
+ struct mem_section *section;
+ unsigned long start_pfn, end_pfn;
+ unsigned long pfn, section_nr;
+ int ret;
+
+ start_pfn = PFN_DOWN(start);
+ end_pfn = start_pfn + PFN_DOWN(size);
+
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ section_nr = pfn_to_section_nr(pfn);
+ if (!present_section_nr(section_nr))
+ continue;
+
+ section = __nr_to_section(section_nr);
+ /* same memblock? */
+ if (mem)
+ if ((section_nr >= mem->start_section_nr) &&
+ (section_nr <= mem->end_section_nr))
+ continue;
+
+ mem = find_memory_block_hinted(section, mem);
+ if (!mem)
+ continue;
+
+ ret = offline_memory_block(mem);
+ if (ret) {
+ kobject_put(&mem->dev.kobj);
+ return ret;
+ }
+ }
+
+ if (mem)
+ kobject_put(&mem->dev.kobj);
+
+ return 0;
}
#else
int offline_pages(u64 start, u64 size)
--
1.7.1
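The split into a locked helper and a locking wrapper can be illustrated with the following userspace sketch. It is an assumption-laden model, not the driver code: the lock is a plain flag checked with assert() instead of a real mutex, and all sketch_* names are hypothetical. The point it shows is that offline_memory_block() has to test the state and change it under one lock acquisition, which is why it calls the unlocked helper rather than the wrapper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum sketch_mem_state { SKETCH_MEM_ONLINE, SKETCH_MEM_OFFLINE };

struct sketch_block {
	bool locked;                  /* stand-in for state_mutex */
	enum sketch_mem_state state;
};

static void sketch_lock(struct sketch_block *b)
{
	assert(!b->locked);           /* a second lock here would deadlock */
	b->locked = true;
}

static void sketch_unlock(struct sketch_block *b)
{
	assert(b->locked);
	b->locked = false;
}

/* Caller must hold the lock. Fails unless the block is in state 'from'. */
static int sketch_change_state_locked(struct sketch_block *b,
				      enum sketch_mem_state to,
				      enum sketch_mem_state from)
{
	assert(b->locked);
	if (b->state != from)
		return -EINVAL;
	b->state = to;
	return 0;
}

/* Locking wrapper, as used by the sysfs store path. */
static int sketch_change_state(struct sketch_block *b,
			       enum sketch_mem_state to,
			       enum sketch_mem_state from)
{
	int ret;

	sketch_lock(b);
	ret = sketch_change_state_locked(b, to, from);
	sketch_unlock(b);
	return ret;
}

/* Offline one block; an already offline block is not an error. */
static int sketch_offline_block(struct sketch_block *b)
{
	int ret = 0;

	sketch_lock(b);
	if (b->state != SKETCH_MEM_OFFLINE)
		ret = sketch_change_state_locked(b, SKETCH_MEM_OFFLINE,
						 SKETCH_MEM_ONLINE);
	sketch_unlock(b);
	return ret;
}
```

Calling sketch_offline_block() twice succeeds both times, while the plain wrapper reports -EINVAL for a block that is already offline.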
* [RFC PATCH v5 01/19] memory-hotplug: rename remove_memory() to offline_memory()/offline_pages()
From: Wen Congyang @ 2012-07-27 10:25 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
In-Reply-To: <50126B83.3050201@cn.fujitsu.com>
From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
remove_memory() only tries to offline pages. It is called in two cases:
1. hot remove a memory device
2. echo offline >/sys/devices/system/memory/memoryXX/state
In the 1st case, we should also change memory block's state, and notify
the userspace that the memory block's state is changed after offlining
pages.
So rename remove_memory() to offline_memory()/offline_pages(). In
the 1st case, offline_memory() will be used. The function offline_memory()
is not implemented yet and returns -EINVAL. In the 2nd case, offline_pages()
will be used.
CC: David Rientjes <rientjes@google.com>
CC: Jiang Liu <liuj97@gmail.com>
CC: Len Brown <len.brown@intel.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
drivers/acpi/acpi_memhotplug.c | 2 +-
drivers/base/memory.c | 9 +++------
include/linux/memory_hotplug.h | 3 ++-
mm/memory_hotplug.c | 22 ++++++++++++++--------
4 files changed, 20 insertions(+), 16 deletions(-)
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 81a9def..8957ed9 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -318,7 +318,7 @@ static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
*/
list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
if (info->enabled) {
- result = remove_memory(info->start_addr, info->length);
+ result = offline_memory(info->start_addr, info->length);
if (result)
return result;
}
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 7dda4f7..44e7de6 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -248,26 +248,23 @@ static bool pages_correctly_reserved(unsigned long start_pfn,
static int
memory_block_action(unsigned long phys_index, unsigned long action)
{
- unsigned long start_pfn, start_paddr;
+ unsigned long start_pfn;
unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
struct page *first_page;
int ret;
first_page = pfn_to_page(phys_index << PFN_SECTION_SHIFT);
+ start_pfn = page_to_pfn(first_page);
switch (action) {
case MEM_ONLINE:
- start_pfn = page_to_pfn(first_page);
-
if (!pages_correctly_reserved(start_pfn, nr_pages))
return -EBUSY;
ret = online_pages(start_pfn, nr_pages);
break;
case MEM_OFFLINE:
- start_paddr = page_to_pfn(first_page) << PAGE_SHIFT;
- ret = remove_memory(start_paddr,
- nr_pages << PAGE_SHIFT);
+ ret = offline_pages(start_pfn, nr_pages);
break;
default:
WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 910550f..c183f39 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -233,7 +233,8 @@ static inline int is_mem_section_removable(unsigned long pfn,
extern int mem_online_node(int nid);
extern int add_memory(int nid, u64 start, u64 size);
extern int arch_add_memory(int nid, u64 start, u64 size);
-extern int remove_memory(u64 start, u64 size);
+extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
+extern int offline_memory(u64 start, u64 size);
extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
int nr_pages);
extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 427bb29..7a6659f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -865,7 +865,7 @@ check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
return offlined;
}
-static int __ref offline_pages(unsigned long start_pfn,
+static int __ref __offline_pages(unsigned long start_pfn,
unsigned long end_pfn, unsigned long timeout)
{
unsigned long pfn, nr_pages, expire;
@@ -990,18 +990,24 @@ out:
return ret;
}
-int remove_memory(u64 start, u64 size)
+int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
{
- unsigned long start_pfn, end_pfn;
+ return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
+}
- start_pfn = PFN_DOWN(start);
- end_pfn = start_pfn + PFN_DOWN(size);
- return offline_pages(start_pfn, end_pfn, 120 * HZ);
+int offline_memory(u64 start, u64 size)
+{
+ return -EINVAL;
}
#else
-int remove_memory(u64 start, u64 size)
+int offline_pages(u64 start, u64 size)
+{
+ return -EINVAL;
+}
+
+int offline_memory(u64 start, u64 size)
{
return -EINVAL;
}
#endif /* CONFIG_MEMORY_HOTREMOVE */
-EXPORT_SYMBOL_GPL(remove_memory);
+EXPORT_SYMBOL_GPL(offline_memory);
--
1.7.1
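The interface change from a byte range to a pfn range can be checked with a small userspace sketch. It assumes 4 KiB pages, and the sketch_* helpers are hypothetical re-implementations of the PFN_DOWN()/PFN_UP() arithmetic, not the kernel macros themselves.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12   /* assumes 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

struct sketch_pfn_range {
	unsigned long long start_pfn;
	unsigned long long end_pfn;   /* exclusive */
};

/* Round an address down to its page frame number. */
static unsigned long long sketch_pfn_down(unsigned long long addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Round an address up to the next page frame number. */
static unsigned long long sketch_pfn_up(unsigned long long addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Byte-based form, as the old remove_memory(start, size) computed it. */
static struct sketch_pfn_range
sketch_range_from_bytes(unsigned long long start, unsigned long long size)
{
	struct sketch_pfn_range r;

	assert(start + size >= start);   /* reject ranges that wrap */
	r.start_pfn = sketch_pfn_down(start);
	r.end_pfn = r.start_pfn + sketch_pfn_down(size);
	return r;
}

/* pfn-based form, as the new offline_pages(start_pfn, nr_pages) uses it. */
static struct sketch_pfn_range
sketch_range_from_pfns(unsigned long long start_pfn,
		       unsigned long long nr_pages)
{
	struct sketch_pfn_range r;

	r.start_pfn = start_pfn;
	r.end_pfn = start_pfn + nr_pages;
	return r;
}
```

For a page-aligned range both forms describe the same pfns, which is why memory_block_action() can pass start_pfn and nr_pages directly instead of shifting them into byte addresses first.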
* [PATCH 0.5/19] remove memory info from list before freeing it
From: Wen Congyang @ 2012-07-27 10:24 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
In-Reply-To: <50126B83.3050201@cn.fujitsu.com>
We free info, but we forget to remove it from the list. This will cause
unexpected problems the next time we access the list.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
drivers/acpi/acpi_memhotplug.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 8fe0e02..5cafd6b 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -323,6 +323,7 @@ static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
if (result)
return result;
}
+ list_del(&info->list);
kfree(info);
}
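The unlink-before-free rule can be demonstrated in userspace with a minimal circular list. This is only a sketch: the sketch_* list helpers are hypothetical and merely resemble the kernel's list_del() and list_for_each_entry_safe(); malloc()/free() stand in for kmalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list node. */
struct sketch_node {
	struct sketch_node *prev;
	struct sketch_node *next;
};

static void sketch_list_init(struct sketch_node *head)
{
	head->prev = head;
	head->next = head;
}

static void sketch_list_add_tail(struct sketch_node *n,
				 struct sketch_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n so that no neighbour still points at it. */
static void sketch_list_del(struct sketch_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = NULL;
	n->next = NULL;
}

static int sketch_list_empty(const struct sketch_node *head)
{
	return head->next == head;
}

/*
 * Unlink and free every node. The successor is read before the current
 * node is freed, which is what the "_safe" iterator does in the kernel.
 * Returns the number of nodes freed.
 */
static int sketch_free_all(struct sketch_node *head)
{
	struct sketch_node *pos = head->next;
	struct sketch_node *n;
	int freed = 0;

	while (pos != head) {
		n = pos->next;
		sketch_list_del(pos);   /* without this, head keeps a dangling pointer */
		free(pos);
		freed++;
		pos = n;
	}
	return freed;
}
```

After sketch_free_all() the head points back at itself, so a second walk finds nothing to touch. Skipping the unlink step would leave the head pointing at freed memory.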
* [RFC PATCH 0/19] firmware_map : unify argument of firmware_map_add_early/hotplug
From: Wen Congyang @ 2012-07-27 10:22 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
In-Reply-To: <50126B83.3050201@cn.fujitsu.com>
From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
There are two ways to create the /sys/firmware/memmap/X sysfs entries:
- firmware_map_add_early
When the system starts, it is called from e820_reserve_resources()
- firmware_map_add_hotplug
When memory is hot plugged, it is called from add_memory()
But these functions are called with inconsistent values for the end argument:
- end argument of firmware_map_add_early() : start + size - 1
- end argument of firmware_map_add_hotplug() : start + size
The patch unifies them to "start + size". Even after applying the patch,
the content of the /sys/firmware/memmap/X/end file does not change.
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@kernel.org>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Tejun Heo <tj@kernel.org>
CC: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
arch/x86/kernel/e820.c | 2 +-
drivers/firmware/memmap.c | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)
Index: linux-3.5-rc6/arch/x86/kernel/e820.c
===================================================================
--- linux-3.5-rc6.orig/arch/x86/kernel/e820.c 2012-07-18 17:19:38.391365260 +0900
+++ linux-3.5-rc6/arch/x86/kernel/e820.c 2012-07-18 17:19:43.616300222 +0900
@@ -944,7 +944,7 @@ void __init e820_reserve_resources(void)
for (i = 0; i < e820_saved.nr_map; i++) {
struct e820entry *entry = &e820_saved.map[i];
firmware_map_add_early(entry->addr,
- entry->addr + entry->size - 1,
+ entry->addr + entry->size,
e820_type_to_string(entry->type));
}
}
Index: linux-3.5-rc6/drivers/firmware/memmap.c
===================================================================
--- linux-3.5-rc6.orig/drivers/firmware/memmap.c 2012-07-18 17:19:38.388365299 +0900
+++ linux-3.5-rc6/drivers/firmware/memmap.c 2012-07-18 18:30:47.608390251 +0900
@@ -98,7 +98,7 @@ static LIST_HEAD(map_entries);
/**
* firmware_map_add_entry() - Does the real work to add a firmware memmap entry.
* @start: Start of the memory range.
- * @end: End of the memory range (inclusive).
+ * @end: End of the memory range.
* @type: Type of the memory range.
* @entry: Pre-allocated (either kmalloc() or bootmem allocator), uninitialised
* entry.
@@ -113,7 +113,7 @@ static int firmware_map_add_entry(u64 st
BUG_ON(start > end);
entry->start = start;
- entry->end = end;
+ entry->end = end - 1;
entry->type = type;
INIT_LIST_HEAD(&entry->list);
kobject_init(&entry->kobj, &memmap_ktype);
@@ -148,7 +148,7 @@ static int add_sysfs_fw_map_entry(struct
* firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
* memory hotplug.
* @start: Start of the memory range.
- * @end: End of the memory range (inclusive).
+ * @end: End of the memory range.
* @type: Type of the memory range.
*
* Adds a firmware mapping entry. This function is for memory hotplug, it is
@@ -175,7 +175,7 @@ int __meminit firmware_map_add_hotplug(u
/**
* firmware_map_add_early() - Adds a firmware mapping entry.
* @start: Start of the memory range.
- * @end: End of the memory range (inclusive).
+ * @end: End of the memory range.
* @type: Type of the memory range.
*
* Adds a firmware mapping entry. This function uses the bootmem allocator
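The end-argument convention after this change can be sketched in userspace as follows. The struct and function names are hypothetical. The sketch also asserts start < end, which is stricter than the BUG_ON(start > end) kept by the patch, so that subtracting one cannot produce an end below start.

```c
#include <assert.h>

/* Hypothetical stand-in for a firmware map entry. */
struct sketch_fw_entry {
	unsigned long long start;
	unsigned long long end;   /* stored inclusive, as shown in sysfs */
};

/*
 * Callers pass an exclusive end (start + size). The stored value is
 * converted to an inclusive end, so the sysfs content stays unchanged.
 */
static void sketch_fw_entry_init(struct sketch_fw_entry *entry,
				 unsigned long long start,
				 unsigned long long end)
{
	assert(start < end);      /* sketch-only assumption: non-empty range */
	entry->start = start;
	entry->end = end - 1;
}
```

With start 0x1000 and size 0x2000, both the boot-time and the hotplug caller would pass 0x3000, and the stored end is 0x2fff.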
* [RFC PATCH v5 00/19] memory-hotplug: hot-remove physical memory
From: Wen Congyang @ 2012-07-27 10:20 UTC (permalink / raw)
To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi, linux-s390,
linux-sh, linux-ia64, cmetcalf
Cc: len.brown, Yasuaki ISIMATU, paulus, minchan.kim, kosaki.motohiro,
rientjes, cl, akpm, liuj97
This patch series aims to support physical memory hot-remove.
The patches can free/remove the following things:
- acpi_memory_info : [RFC PATCH 4/19]
- /sys/firmware/memmap/X/{end, start, type} : [RFC PATCH 8/19]
- iomem_resource : [RFC PATCH 9/19]
- mem_section and related sysfs files : [RFC PATCH 10-11, 13-16/19]
- page table of removed memory : [RFC PATCH 12/19]
- node and related sysfs files : [RFC PATCH 18-19/19]
If you find that any functionality needed for physical memory hot-remove is
missing, please let me know.
change log of v5:
* merge the patchset to clear page table and the patchset to hot remove
memory(from ishimatsu) to one big patchset.
[RFC PATCH v5 1/19]
* rename remove_memory() to offline_memory()/offline_pages()
[RFC PATCH v5 2/19]
* new patch: implement offline_memory(). This function offlines pages,
updates the memory block's state, and notifies userspace that the memory
block's state has changed.
[RFC PATCH v5 4/19]
* offline and remove memory in acpi_memory_disable_device() too.
[RFC PATCH v5 17/19]
* new patch: add a new function __remove_zone() to revert the things done
in the function __add_zone().
[RFC PATCH v5 18/19]
* flush work before resetting node device.
change log of v4:
* remove "memory-hotplug : unify argument of firmware_map_add_early/hotplug"
from the patch series, since the patch is a bugfix. It is being discussed
on another thread. But for testing the patch series, the patch is needed.
So I added the patch as [PATCH 0/13].
[RFC PATCH v4 2/13]
* check memory is online or not at remove_memory()
* add memory_add_physaddr_to_nid() to acpi_memory_device_remove() for
getting node id
[RFC PATCH v4 3/13]
* create new patch : check memory is online or not at online_pages()
[RFC PATCH v4 4/13]
* add __ref section to remove_memory()
* call firmware_map_remove_entry() before remove_sysfs_fw_map_entry()
[RFC PATCH v4 11/13]
* rewrite register_page_bootmem_memmap() for removing page used as PT/PMD
change log of v3:
* rebase to 3.5.0-rc6
[RFC PATCH v2 2/13]
* remove extra kobject_put()
* The patch was commented by Wen. Wen's comment is
"acpi_memory_device_remove() should ignore a return value of
remove_memory() since caller does not care the return value".
But I did not change it since I think caller should care the
return value. And I am trying to fix it as follows:
https://lkml.org/lkml/2012/7/5/624
[RFC PATCH v2 4/13]
* remove a firmware_memmap_entry allocated by kzalloc()
change log of v2:
[RFC PATCH v2 2/13]
* check whether memory block is offline or not before calling offline_memory()
* check whether section is valid or not in is_memblk_offline()
* call kobject_put() for each memory_block in is_memblk_offline()
[RFC PATCH v2 3/13]
* unify the end argument of firmware_map_add_early/hotplug
[RFC PATCH v2 4/13]
* add release_firmware_map_entry() for freeing firmware_map_entry
[RFC PATCH v2 6/13]
* add release_memory_block() for freeing memory_block
[RFC PATCH v2 11/13]
* fix wrong arguments of free_pages()
Wen Congyang (5):
memory-hotplug: implement offline_memory()
memory-hotplug: store the node id in acpi_memory_device
memory-hotplug: export the function acpi_bus_remove()
memory-hotplug: call acpi_bus_remove() to remove memory device
memory-hotplug: introduce new function arch_remove_memory()
Yasuaki Ishimatsu (14):
memory-hotplug: rename remove_memory() to
offline_memory()/offline_pages()
memory-hotplug: offline and remove memory when removing the memory
device
memory-hotplug: check whether memory is present or not
memory-hotplug: remove /sys/firmware/memmap/X sysfs
memory-hotplug: does not release memory region in PAGES_PER_SECTION
chunks
memory-hotplug: add memory_block_release
memory-hotplug: remove_memory calls __remove_pages
memory-hotplug: check page type in get_page_bootmem
memory-hotplug: move register_page_bootmem_info_node and
put_page_bootmem for sparse-vmemmap
memory-hotplug: implement register_page_bootmem_info_section of
sparse-vmemmap
memory-hotplug: free memmap of sparse-vmemmap
memory_hotplug: clear zone when the memory is removed
memory-hotplug: add node_device_release
memory-hotplug: remove sysfs file of node
arch/ia64/mm/init.c | 16 +
arch/powerpc/mm/mem.c | 14 +
arch/powerpc/platforms/pseries/hotplug-memory.c | 16 +-
arch/s390/mm/init.c | 8 +
arch/sh/mm/init.c | 15 +
arch/tile/mm/init.c | 8 +
arch/x86/include/asm/pgtable_types.h | 1 +
arch/x86/mm/init_32.c | 10 +
arch/x86/mm/init_64.c | 333 ++++++++++++++++++++++
arch/x86/mm/pageattr.c | 47 ++--
drivers/acpi/acpi_memhotplug.c | 51 +++-
drivers/acpi/scan.c | 3 +-
drivers/base/memory.c | 90 ++++++-
drivers/base/node.c | 8 +
drivers/firmware/memmap.c | 78 +++++-
include/acpi/acpi_bus.h | 1 +
include/linux/firmware-map.h | 6 +
include/linux/memory.h | 5 +
include/linux/memory_hotplug.h | 25 +-
include/linux/mm.h | 5 +-
include/linux/mmzone.h | 19 ++
mm/memory_hotplug.c | 337 +++++++++++++++++++++--
mm/sparse.c | 5 +-
23 files changed, 1010 insertions(+), 91 deletions(-)
* RE: [PATCH V3 1/5] powerpc/fsl-pci: Unify pci/pcie initialization code
From: Jia Hongtao-B38951 @ 2012-07-27 10:10 UTC (permalink / raw)
To: Kumar Gala
Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org, Li Yang-R58472
In-Reply-To: <ACC8975F-CA31-4AA0-9824-4C3F6C70CD45@kernel.crashing.org>
Hi Kumar,
I know "duplicate code from pci_process_bridge_OF_ranges()" is
hard to accept, but "refactor the code to have a shared function"
is knotty. That is the reason I didn't do the refactoring.
Here is the situation:
First, pci_process_bridge_OF_ranges() is common code used by
many PCI clients.
Second, the contents of pci_process_bridge_OF_ranges() are twisted
together. It's hard to decouple the part I need for determining
swiotlb from the rest. I tried and found a way to do this,
but the shared function needs so many parameters that it is also
unacceptable.
Third, my function to determine swiotlb needs to know the start
and the end of the PCI MEM space addresses for all the controllers.
Note that the end address is needed to determine where to map
PEXCSRBAR.
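As a rough illustration of the third point, here is a minimal userspace sketch of the decision, assuming the outbound MEM windows have already been parsed into (pci_addr, size) pairs. The names mem_window and need_swiotlb are hypothetical and exist only for this example; csr_size stands in for the PEXCSRBAR (CCSR) size and dram_end for memblock_end_of_DRAM(). The arithmetic follows the quoted patch below, nothing more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-parsed outbound MEM window (not a kernel type). */
struct mem_window {
	uint64_t pci_addr;	/* start of the window in PCI address space */
	uint64_t size;		/* window length in bytes */
};

/*
 * Return 1 if SWIOTLB would be needed, 0 otherwise.
 * csr_size is an assumed PEXCSRBAR (CCSR) size; dram_end is an assumed
 * end-of-DRAM address.
 */
static int need_swiotlb(const struct mem_window *win, size_t n,
			uint64_t csr_size, uint64_t dram_end)
{
	uint64_t lo = UINT64_MAX;	/* lowest window start seen */
	uint64_t hi = 0;		/* highest window end seen */
	uint64_t dma_sz;
	size_t i;

	for (i = 0; i < n; i++) {
		if (win[i].size == 0)
			continue;	/* skip zero-sized regions */
		if (win[i].pci_addr < lo)
			lo = win[i].pci_addr;
		if (win[i].pci_addr + win[i].size > hi)
			hi = win[i].pci_addr + win[i].size;
	}

	/* PEXCSRBAR either fits above the windows or is carved out below */
	if (hi < 0x100000000ULL - csr_size)
		dma_sz = lo;
	else
		dma_sz = lo - csr_size;

	/* DRAM beyond the DMA-capable region needs bounce buffers */
	return dram_end > dma_sz;
}
```

The sketch needs both the lowest start and the highest end across all controllers, which is why a helper shared with pci_process_bridge_OF_ranges() would have to carry that state between calls.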
If you have any ideas on this, please let me know.
Thanks.
-Hongtao.
> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Friday, July 27, 2012 1:53 AM
> To: Jia Hongtao-B38951
> Cc: linuxppc-dev@lists.ozlabs.org; Wood Scott-B07421; Li Yang-R58472
> Subject: Re: [PATCH V3 1/5] powerpc/fsl-pci: Unify pci/pcie
> initialization code
>
>
> On Jul 26, 2012, at 7:30 AM, Jia Hongtao wrote:
>
> > We unified the Freescale pci/pcie initialization by changing the
> fsl_pci
> > to a platform driver. In previous PCI code architecture the
> initialization
> > routine is called at board_setup_arch stage. Now the initialization is
> done
> > in probe function which is architectural better. Also It's convenient
> for
> > adding PM support for PCI controller in later patch.
> >
> > One issue introduced by this architecture is the timing of swiotlb_init.
> > During PCI initialization the need of swiotlb is determined and this
> should
> > be done before swiotlb_init. So a new function to determine swiotlb by
> > parsing pci ranges is made. This function is called at board_setup_arch
> > stage which is earlier than swiotlb_init.
> >
> > Signed-off-by: Jia Hongtao <B38951@freescale.com>
> > Signed-off-by: Li Yang <leoli@freescale.com>
> > ---
> > Changed for V3:
> > - Rebase the patch set on the latest tree
> > - merge PCI unify and swiotlb patch into one
> >
> > arch/powerpc/sysdev/fsl_pci.c | 155 ++++++++++++++++++++++++++++++++--
> -------
> > arch/powerpc/sysdev/fsl_pci.h | 9 +--
> > 2 files changed, 125 insertions(+), 39 deletions(-)
> >
> > diff --git a/arch/powerpc/sysdev/fsl_pci.c
> b/arch/powerpc/sysdev/fsl_pci.c
> > index a7b2a60..5228b6b 100644
> > --- a/arch/powerpc/sysdev/fsl_pci.c
> > +++ b/arch/powerpc/sysdev/fsl_pci.c
> > @@ -823,56 +823,143 @@ static const struct of_device_id pci_ids[] = {
> > {},
> > };
> >
> > -struct device_node *fsl_pci_primary;
> > -
> > -void __devinit fsl_pci_init(void)
> > +#ifdef CONFIG_SWIOTLB
> > +void pci_determine_swiotlb(void)
> > {
> > + const u32 *ranges;
> > + int rlen;
> > + int pna;
> > + int np;
> > struct device_node *node;
> > - struct pci_controller *hose;
> > - dma_addr_t max = 0xffffffff;
> > -
> > - /* Callers can specify the primary bus using other means. */
> > - if (!fsl_pci_primary) {
> > - /* If a PCI host bridge contains an ISA node, it's primary.
> */
> > - node = of_find_node_by_type(NULL, "isa");
> > - while ((fsl_pci_primary = of_get_parent(node))) {
> > - of_node_put(node);
> > - node =3D fsl_pci_primary;
> > -
> > - if (of_match_node(pci_ids, node))
> > - break;
> > - }
> > - }
> > + int memno;
> > + u32 pci_space;
> > + unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size;
> > + unsigned long long pci_addr_lo = ULLONG_MAX;
> > + unsigned long long pci_addr_hi = 0x0;
> > + dma_addr_t pci_dma_sz;
> >
> > - node = NULL;
> > for_each_node_by_type(node, "pci") {
> > if (of_match_node(pci_ids, node)) {
> > - /*
> > - * If there's no PCI host bridge with ISA, arbitrarily
> > - * designate one as primary. This can go away once
> > - * various bugs with primary-less systems are fixed.
> > - */
> > - if (!fsl_pci_primary)
> > - fsl_pci_primary = node;
> > -
> > - fsl_add_bridge(node, fsl_pci_primary == node);
> > - hose = pci_find_hose_for_OF_device(node);
> > - max = min(max, hose->dma_window_base_cur +
> > - hose->dma_window_size);
> > + memno = 0;
> > + pna = of_n_addr_cells(node);
> > + np = pna + 5;
>
> Don't duplicate code from pci_process_bridge_OF_ranges(), refactor the
> code to have a shared function:
>
> > + /* Get ranges property */
> > + ranges = of_get_property(node, "ranges", &rlen);
> > + if (ranges == NULL)
> > + return;
> > +
> > + /* Parse outbound MEM window range */
> > + while ((rlen -= np * 4) >= 0) {
> > + /* Read next ranges element */
> > + pci_space = ranges[0];
> > + if (!((pci_space >> 24) & 0x2)) {
> > + ranges += np;
> > + break;
> > + }
> > + pci_addr = of_read_number(ranges + 1, 2);
> > + cpu_addr = of_translate_address(
> > + node, ranges + 3);
> > + size = of_read_number(ranges + pna + 3, 2);
> > + ranges += np;
> > +
> > + /*
> > + * If we failed translation or got a zero-sized
> > + * region (some FW try to feed us with non
> > + * sensical zero sized regions such as power3
> > + * which look like some kind of attempt at
> > + * exposing the VGA memory hole)
> > + */
> > + if (cpu_addr == OF_BAD_ADDR || size == 0)
> > + continue;
> > +
> > + /*
> > + * Now consume following elements while they
> > + * are contiguous
> > + */
> > + for (; rlen >= np * sizeof(u32);
> > + ranges += np, rlen -= np * 4) {
> > + if (ranges[0] != pci_space)
> > + break;
> > + pci_next = of_read_number(ranges + 1,
> > + 2);
> > + cpu_next = of_translate_address(node,
> > + ranges + 3);
> > + if (pci_next != pci_addr + size ||
> > + cpu_next != cpu_addr + size)
> > + break;
> > + size += of_read_number(
> > + ranges + pna + 3, 2);
> > + }
> > +
> > + /* We support only 3 memory ranges */
> > + if (memno >=3D 3) {
> > + printk(KERN_INFO
> > + " \\--> Skipped (too
> many) !\n");
> > + continue;
> > + }
> > +
> > + pci_addr_lo = min(pci_addr, pci_addr_lo);
> > + pci_addr_hi = max(pci_addr + size, pci_addr_hi);
> > + memno++;
> > + }
> > }
> > }
> >
> > -#ifdef CONFIG_SWIOTLB
> > + /* Get PEXCSRBAR size (equal to CCSR size) */
> > + node = of_find_node_by_type(NULL, "soc");
> > + ranges = of_get_property(node, "ranges", &rlen);
> > + if (ranges == NULL)
> > + return;
> > +
> > + size = of_read_number(ranges + 3, 1);
> > + of_node_put(node);
> > +
> > + if (pci_addr_hi < (0x100000000ull - size))
> > + pci_dma_sz = pci_addr_lo;
> > + else
> > + pci_dma_sz = pci_addr_lo - size;
> > +
> > /*
> > * if we couldn't map all of DRAM via the dma windows
> > * we need SWIOTLB to handle buffers located outside of
> > * dma capable memory region
> > */
> > - if (memblock_end_of_DRAM() - 1 > max) {
> > + if (memblock_end_of_DRAM() > pci_dma_sz) {
> > ppc_swiotlb_enable = 1;
> > set_pci_dma_ops(&swiotlb_dma_ops);
> > - ppc_md.pci_dma_dev_setup = pci_dma_dev_setup_swiotlb;
> > + ppc_md.pci_dma_dev_setup =
> > + pci_dma_dev_setup_swiotlb;
>
> why the line wrap change?
>
> > }
> > +}
> > #endif
> > +
> > +int primary_phb_addr;
> > +static int __devinit fsl_pci_probe(struct platform_device *pdev)
> > +{
> > + struct pci_controller *hose;
> > + bool is_primary;
> > +
> > + if (of_match_node(pci_ids, pdev->dev.of_node)) {
> > + struct resource rsrc;
> > + of_address_to_resource(pdev->dev.of_node, 0, &rsrc);
> > + is_primary = ((rsrc.start & 0xfffff) == primary_phb_addr);
> > + fsl_add_bridge(pdev->dev.of_node, is_primary);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static struct platform_driver fsl_pci_driver = {
> > + .driver = {
> > + .name = "fsl-pci",
> > + .of_match_table = pci_ids,
> > + },
> > + .probe = fsl_pci_probe,
> > +};
> > +
> > +static int __init fsl_pci_init(void)
> > +{
> > + return platform_driver_register(&fsl_pci_driver);
> > }
> > +arch_initcall(fsl_pci_init);
> > #endif
>
>
* [PATCH v4 7/7] fsl-dma: add memcpy self test interface
From: qiang.liu @ 2012-07-27 9:16 UTC (permalink / raw)
To: linux-crypto, linuxppc-dev
Cc: Ira W. Snyder, Vinod Koul, Qiang Liu, herbert, Dan Williams,
davem
From: Qiang Liu <qiang.liu@freescale.com>
Add a memory copy self-test when probing the device; fsl-dma will be
disabled if the self-test fails.
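The idea of the self-test can be sketched in plain userspace C, assuming a hypothetical copy_fn callback in place of the DMA channel: fill a source buffer with a known pattern, let the engine copy it into a zeroed destination, and compare. The names copy_fn, memcpy_self_test, good_engine and broken_engine are made up for this illustration; only the buffer size matches FSL_DMA_TEST_SIZE in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TEST_SIZE 2000	/* same size as FSL_DMA_TEST_SIZE in the patch */

/* Hypothetical copy-engine callback standing in for the DMA channel. */
typedef void (*copy_fn)(void *dst, const void *src, size_t len);

/* Return 0 if the engine copied the pattern correctly, -1 otherwise. */
static int memcpy_self_test(copy_fn engine)
{
	uint8_t *src, *dst;
	int err = 0;
	size_t i;

	src = malloc(TEST_SIZE);
	if (!src)
		return -1;
	dst = calloc(1, TEST_SIZE);	/* zeroed, so stale data cannot match */
	if (!dst) {
		free(src);
		return -1;
	}

	/* Fill the source with a repeating 0..255 byte pattern */
	for (i = 0; i < TEST_SIZE; i++)
		src[i] = (uint8_t)i;

	engine(dst, src, TEST_SIZE);

	if (memcmp(src, dst, TEST_SIZE))
		err = -1;

	free(src);
	free(dst);
	return err;
}

/* A working engine and a deliberately broken one (drops the last byte). */
static void good_engine(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

static void broken_engine(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len - 1);
}
```

Zeroing the destination first matters: a compare against an uninitialized buffer could pass by accident if it happened to hold old data.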
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 83 insertions(+), 0 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 6fc22eb..5e0b162 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -1277,6 +1277,82 @@ out_unwind:
return ret;
}
+/*
+ * Perform a transaction to verify the HW works.
+ */
+#define FSL_DMA_TEST_SIZE 2000
+
+static int __devinit fsl_dma_memcpy_self_test(struct fsldma_device *device)
+{
+ int i;
+ void *src, *dest;
+ dma_addr_t src_dma, dest_dma;
+ struct dma_chan *dma_chan;
+ dma_cookie_t cookie;
+ struct dma_async_tx_descriptor *tx;
+ int err = 0;
+ struct fsldma_chan *chan;
+
+ src = kmalloc(sizeof(u8) * FSL_DMA_TEST_SIZE, GFP_KERNEL);
+ if (!src)
+ return -ENOMEM;
+
+ dest = kzalloc(sizeof(u8) * FSL_DMA_TEST_SIZE, GFP_KERNEL);
+ if (!dest) {
+ kfree(src);
+ return -ENOMEM;
+ }
+
+ /* Fill in src buffer */
+ for (i = 0; i < FSL_DMA_TEST_SIZE; i++)
+ ((u8 *) src)[i] = (u8)i;
+
+ /* Start copy, using first DMA channel */
+ dma_chan = container_of(device->common.channels.next,
+ struct dma_chan, device_node);
+ if (fsl_dma_alloc_chan_resources(dma_chan) < 1) {
+ err = -ENODEV;
+ goto out;
+ }
+
+ chan = to_fsl_chan(dma_chan);
+ dest_dma = dma_map_single(chan->common.device->dev, dest,
+ FSL_DMA_TEST_SIZE, DMA_FROM_DEVICE);
+
+ src_dma = dma_map_single(chan->common.device->dev, src,
+ FSL_DMA_TEST_SIZE, DMA_TO_DEVICE);
+
+ tx = fsl_dma_prep_memcpy(dma_chan, dest_dma, src_dma, FSL_DMA_TEST_SIZE,
+ DMA_COMPL_SKIP_DEST_UNMAP | DMA_COMPL_SRC_UNMAP_SINGLE);
+ cookie = fsl_dma_tx_submit(tx);
+ fsl_dma_memcpy_issue_pending(dma_chan);
+ async_tx_ack(tx);
+ msleep(1);
+
+ if (fsl_tx_status(dma_chan, cookie, NULL) != DMA_SUCCESS) {
+ dev_printk(KERN_ERR, dma_chan->device->dev,
+ "Self-test copy timed out, disabling\n");
+ err = -ENODEV;
+ goto free_resources;
+ }
+
+ dma_sync_single_for_cpu(device->dev, dest_dma,
+ FSL_DMA_TEST_SIZE, DMA_FROM_DEVICE);
+ if (memcmp(src, dest, FSL_DMA_TEST_SIZE)) {
+ dev_printk(KERN_ERR, dma_chan->device->dev,
+ "Self-test copy failed compare, disabling\n");
+ err = -ENODEV;
+ goto free_resources;
+ }
+
+free_resources:
+ fsl_dma_free_chan_resources(dma_chan);
+out:
+ kfree(src);
+ kfree(dest);
+ return err;
+}
+
/*----------------------------------------------------------------------------*/
/* OpenFirmware Subsystem */
/*----------------------------------------------------------------------------*/
@@ -1461,6 +1537,13 @@ static int __devinit fsldma_of_probe(struct platform_device *op)
goto out_free_fdev;
}
+ if (dma_has_cap(DMA_MEMCPY, fdev->common.cap_mask)) {
+ err = fsl_dma_memcpy_self_test(fdev);
+ printk(KERN_INFO "FSL-DMA Channel memcpy self test returned %d\n", err);
+ if (err)
+ goto out_free_fdev;
+ }
+
dma_async_device_register(&fdev->common);
return 0;
--
1.7.5.1
* [PATCH v4 6/7] fsl-dma: fix a warning of unitialized cookie
From: qiang.liu @ 2012-07-27 9:16 UTC (permalink / raw)
To: linux-crypto, linuxppc-dev
Cc: Vinod Koul, Qiang Liu, herbert, Dan Williams, davem
From: Qiang Liu <qiang.liu@freescale.com>
Fix a warning about an uninitialized value when compiling with -Wuninitialized.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
Cc: Kim Phillips <kim.phillips@freescale.com>
---
drivers/dma/fsldma.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index e3814aa..6fc22eb 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -645,7 +645,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
struct fsl_desc_sw *desc = tx_to_fsl_desc(tx);
struct fsl_desc_sw *child;
- dma_cookie_t cookie;
+ dma_cookie_t cookie = 0;
spin_lock_bh(&chan->desc_lock);
--
1.7.5.1
* [PATCH v4 5/7] fsl-dma: use spin_lock_bh to instead of spin_lock_irqsave
From: qiang.liu @ 2012-07-27 9:16 UTC (permalink / raw)
To: linux-crypto, linuxppc-dev
Cc: Vinod Koul, Timur Tabi, Qiang Liu, herbert, Dan Williams, davem
From: Qiang Liu <qiang.liu@freescale.com>
- Using spin_lock_bh() is the right way to use the async_tx API;
dma_run_dependencies() should not be called under spin_lock_irqsave().
- Use spin_lock_bh() instead of spin_lock_irqsave() to improve performance.
The fsl-dma ISR does not access the descriptor queues; only its tasklet does,
so spin_lock_bh() is sufficient here. spin_lock_irqsave() turns interrupts off
and saves the interrupt state, which is unnecessary in this case.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Timur Tabi <timur@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 30 ++++++++++++------------------
1 files changed, 12 insertions(+), 18 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index bb883c0..e3814aa 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -645,10 +645,9 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
struct fsl_desc_sw *desc = tx_to_fsl_desc(tx);
struct fsl_desc_sw *child;
- unsigned long flags;
dma_cookie_t cookie;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
/*
* assign cookies to all of the software descriptors
@@ -661,7 +660,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
/* put this transaction onto the tail of the pending queue */
append_ld_queue(chan, desc);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
return cookie;
}
@@ -770,15 +769,14 @@ static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
- unsigned long flags;
chan_dbg(chan, "free all channel resources\n");
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
fsldma_cleanup_descriptor(chan);
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
fsldma_free_desc_list(chan, &chan->ld_completed);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
dma_pool_destroy(chan->desc_pool);
chan->desc_pool = NULL;
@@ -997,7 +995,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
{
struct dma_slave_config *config;
struct fsldma_chan *chan;
- unsigned long flags;
int size;
if (!dchan)
@@ -1007,7 +1004,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
switch (cmd) {
case DMA_TERMINATE_ALL:
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
/* Halt the DMA engine */
dma_halt(chan);
@@ -1017,7 +1014,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
fsldma_free_desc_list(chan, &chan->ld_running);
chan->idle = true;
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
return 0;
case DMA_SLAVE_CONFIG:
@@ -1059,11 +1056,10 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
static void fsl_dma_memcpy_issue_pending(struct dma_chan *dchan)
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
- unsigned long flags;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
fsl_chan_xfer_ld_queue(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
}
/**
@@ -1076,15 +1072,14 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
enum dma_status ret;
- unsigned long flags;
ret = dma_cookie_status(dchan, cookie, txstate);
if (ret == DMA_SUCCESS)
return ret;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
fsldma_cleanup_descriptor(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
return dma_cookie_status(dchan, cookie, txstate);
}
@@ -1163,11 +1158,10 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
static void dma_do_tasklet(unsigned long data)
{
struct fsldma_chan *chan = (struct fsldma_chan *)data;
- unsigned long flags;
chan_dbg(chan, "tasklet entry\n");
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
/* the hardware is now idle and ready for more */
chan->idle = true;
@@ -1175,7 +1169,7 @@ static void dma_do_tasklet(unsigned long data)
/* Run all cleanup for this descriptor */
fsldma_cleanup_descriptor(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
chan_dbg(chan, "tasklet exit\n");
}
--
1.7.5.1
* [PATCH v4 4/7] fsl-dma: move the function ahead of its invoke function
From: qiang.liu @ 2012-07-27 9:16 UTC (permalink / raw)
To: linux-crypto, linuxppc-dev
Cc: Vinod Koul, Qiang Liu, herbert, Dan Williams, davem
From: Qiang Liu <qiang.liu@freescale.com>
Move the functions fsldma_cleanup_descriptor() and fsl_chan_xfer_ld_queue()
ahead of their callers to avoid redundant forward declarations.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 252 +++++++++++++++++++++++++-------------------------
1 files changed, 124 insertions(+), 128 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 87f52c0..bb883c0 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -400,9 +400,6 @@ out_splice:
list_splice_tail_init(&desc->tx_list, &chan->ld_pending);
}
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan);
-static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan);
-
/**
* fsldma_clean_completed_descriptor - free all descriptors which
* has been completed and acked
@@ -519,6 +516,130 @@ fsldma_clean_running_descriptor(struct fsldma_chan *chan,
return 0;
}
+/**
+ * fsl_chan_xfer_ld_queue - transfer any pending transactions
+ * @chan : Freescale DMA channel
+ *
+ * HARDWARE STATE: idle
+ * LOCKING: must hold chan->desc_lock
+ */
+static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
+{
+ struct fsl_desc_sw *desc;
+
+ /*
+ * If the list of pending descriptors is empty, then we
+ * don't need to do any work at all
+ */
+ if (list_empty(&chan->ld_pending)) {
+ chan_dbg(chan, "no pending LDs\n");
+ return;
+ }
+
+ /*
+ * The DMA controller is not idle, which means that the interrupt
+ * handler will start any queued transactions when it runs after
+ * this transaction finishes
+ */
+ if (!chan->idle) {
+ chan_dbg(chan, "DMA controller still busy\n");
+ return;
+ }
+
+ /*
+ * If there are some link descriptors which have not been
+ * transferred, we need to start the controller
+ */
+
+ /*
+ * Move all elements from the queue of pending transactions
+ * onto the list of running transactions
+ */
+ chan_dbg(chan, "idle, starting controller\n");
+ desc = list_first_entry(&chan->ld_pending, struct fsl_desc_sw, node);
+ list_splice_tail_init(&chan->ld_pending, &chan->ld_running);
+
+ /*
+ * The 85xx DMA controller doesn't clear the channel start bit
+ * automatically at the end of a transfer. Therefore we must clear
+ * it in software before starting the transfer.
+ */
+ if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
+ u32 mode;
+
+ mode = DMA_IN(chan, &chan->regs->mr, 32);
+ mode &= ~FSL_DMA_MR_CS;
+ DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ }
+
+ /*
+ * Program the descriptor's address into the DMA controller,
+ * then start the DMA transaction
+ */
+ set_cdar(chan, desc->async_tx.phys);
+ get_cdar(chan);
+
+ dma_start(chan);
+ chan->idle = false;
+}
+
+/**
+ * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
+ * @chan: Freescale DMA channel
+ * @desc: descriptor to cleanup and free
+ *
+ * This function is used on a descriptor which has been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies, and then
+ * free the descriptor.
+ */
+static void fsldma_cleanup_descriptor(struct fsldma_chan *chan)
+{
+ struct fsl_desc_sw *desc, *_desc;
+ dma_cookie_t cookie = 0;
+ dma_addr_t curr_phys = get_cdar(chan);
+ int idle = dma_is_idle(chan);
+ int seen_current = 0;
+
+ fsldma_clean_completed_descriptor(chan);
+
+ /* Run the callback for each descriptor, in order */
+ list_for_each_entry_safe(desc, _desc, &chan->ld_running, node) {
+ /*
+ * do not advance past the current descriptor loaded into the
+ * hardware channel, subsequent descriptors are either in
+ * process or have not been submitted
+ */
+ if (seen_current)
+ break;
+
+ /*
+ * stop the search if we reach the current descriptor and the
+ * channel is busy
+ */
+ if (desc->async_tx.phys == curr_phys) {
+ seen_current = 1;
+ if (!idle)
+ break;
+ }
+
+ cookie = fsldma_run_tx_complete_actions(desc, chan, cookie);
+
+ if (fsldma_clean_running_descriptor(chan, desc))
+ break;
+ }
+
+ /*
+ * Start any pending transactions automatically
+ *
+ * In the ideal case, we keep the DMA controller busy while we go
+ * ahead and free the descriptors below.
+ */
+ fsl_chan_xfer_ld_queue(chan);
+
+ if (cookie > 0)
+ chan->common.completed_cookie = cookie;
+}
+
static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
{
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
@@ -932,131 +1053,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
}
/**
- * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
- * @chan: Freescale DMA channel
- * @desc: descriptor to cleanup and free
- *
- * This function is used on a descriptor which has been executed by the DMA
- * controller. It will run any callbacks, submit any dependencies, and then
- * free the descriptor.
- */
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan)
-{
- struct fsl_desc_sw *desc, *_desc;
- dma_cookie_t cookie = 0;
- dma_addr_t curr_phys = get_cdar(chan);
- int idle = dma_is_idle(chan);
- int seen_current = 0;
-
- fsldma_clean_completed_descriptor(chan);
-
- /* Run the callback for each descriptor, in order */
- list_for_each_entry_safe(desc, _desc, &chan->ld_running, node) {
- /*
- * do not advance past the current descriptor loaded into the
- * hardware channel, subsequent descriptors are either in
- * process or have not been submitted
- */
- if (seen_current)
- break;
-
- /*
- * stop the search if we reach the current descriptor and the
- * channel is busy
- */
- if (desc->async_tx.phys == curr_phys) {
- seen_current = 1;
- if (!idle)
- break;
- }
-
- cookie = fsldma_run_tx_complete_actions(desc, chan, cookie);
-
- if (fsldma_clean_running_descriptor(chan, desc))
- break;
-
- }
-
- /*
- * Start any pending transactions automatically
- *
- * In the ideal case, we keep the DMA controller busy while we go
- * ahead and free the descriptors below.
- */
- fsl_chan_xfer_ld_queue(chan);
-
- if (cookie > 0)
- chan->common.completed_cookie = cookie;
-}
-
-/**
- * fsl_chan_xfer_ld_queue - transfer any pending transactions
- * @chan : Freescale DMA channel
- *
- * HARDWARE STATE: idle
- * LOCKING: must hold chan->desc_lock
- */
-static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
-{
- struct fsl_desc_sw *desc;
-
- /*
- * If the list of pending descriptors is empty, then we
- * don't need to do any work at all
- */
- if (list_empty(&chan->ld_pending)) {
- chan_dbg(chan, "no pending LDs\n");
- return;
- }
-
- /*
- * The DMA controller is not idle, which means that the interrupt
- * handler will start any queued transactions when it runs after
- * this transaction finishes
- */
- if (!chan->idle) {
- chan_dbg(chan, "DMA controller still busy\n");
- return;
- }
-
- /*
- * If there are some link descriptors which have not been
- * transferred, we need to start the controller
- */
-
- /*
- * Move all elements from the queue of pending transactions
- * onto the list of running transactions
- */
- chan_dbg(chan, "idle, starting controller\n");
- desc = list_first_entry(&chan->ld_pending, struct fsl_desc_sw, node);
- list_splice_tail_init(&chan->ld_pending, &chan->ld_running);
-
- /*
- * The 85xx DMA controller doesn't clear the channel start bit
- * automatically at the end of a transfer. Therefore we must clear
- * it in software before starting the transfer.
- */
- if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
- u32 mode;
-
- mode = DMA_IN(chan, &chan->regs->mr, 32);
- mode &= ~FSL_DMA_MR_CS;
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
- }
-
- /*
- * Program the descriptor's address into the DMA controller,
- * then start the DMA transaction
- */
- set_cdar(chan, desc->async_tx.phys);
- get_cdar(chan);
-
- dma_start(chan);
- chan->idle = false;
-}
-
-/**
* fsl_dma_memcpy_issue_pending - Issue the DMA start command
* @chan : Freescale DMA channel
*/
--
1.7.5.1
* [PATCH v4 3/7] fsl-dma: change release process of dma descriptor for supporting async_tx
From: qiang.liu @ 2012-07-27 9:16 UTC (permalink / raw)
To: linux-crypto, linuxppc-dev
Cc: Ira W. Snyder, Vinod Koul, Qiang Liu, herbert, Dan Williams,
davem
From: Qiang Liu <qiang.liu@freescale.com>
Fix a potential race when both NET_DMA and ASYNC_TX are enabled.
The current release process for DMA descriptors does not support async_tx:
all descriptors are released whether or not they have been acked by async_tx,
so there is a potential race condition when the DMA engine is used by other
clients (e.g. when NET_DMA is enabled to offload TCP).
In our case, the race occurs when both talitos and dmaengine are used to
offload XOR: the NAPI scheduler syncs all pending requests in the DMA
channels, which disturbs the RAID operations because the ack flag is not
checked in fsl-dma. A descriptor that was just submitted and is not yet acked
is freed; when it is then used as a dependent tx, this freed descriptor
triggers BUG_ON(async_tx_test_ack(depend_tx)) in async_tx_submit().
TASK = ee1a94a0[1390] 'md0_raid5' THREAD: ecf40000 CPU: 0
GPR00: 00000001 ecf41ca0 ee1a94a0 0000003f 00000001 c00593e4 00000000 00000001
GPR08: 00000000 a7a7a7a7 00000001 00000002 42028042 100a38d4 ed576d98 00000000
GPR16: ed5a11b0 00000000 2b162000 00000200 00000000 2d555000 ed3015e8 c15a7aa0
GPR24: 00000000 c155fc40 00000000 ecb63220 ecf41d28 ef640bb0 ef640c30 ecf41ca0
NIP [c02b048c] async_tx_submit+0x6c/0x2b4
LR [c02b068c] async_tx_submit+0x26c/0x2b4
Call Trace:
[ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b4 (unreliable)
[ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c
[ecf41d20] [c0421064] async_copy_data+0xa0/0x17c
[ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10
[ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8
[ecf41e90] [c0429080] raid5d+0x3d4/0x5b4
[ecf41f40] [c04329b8] md_thread+0x138/0x16c
[ecf41f90] [c008277c] kthread+0x8c/0x90
[ecf41ff0] [c0011630] kernel_thread+0x4c/0x68
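The intended release rule can be modelled in a few lines of userspace C. This is only a sketch under simplifying assumptions: struct desc and struct chan are hypothetical stand-ins for fsl_desc_sw and fsldma_chan, a singly linked list replaces the kernel list, and a freed flag replaces dma_pool_free(). It shows the one property the fix relies on: a finished descriptor is parked until the client acks it, and is freed only after that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified descriptor; not the driver's fsl_desc_sw. */
struct desc {
	bool acked;		/* client has set the 'ack' flag */
	bool freed;		/* descriptor was returned to the pool */
	struct desc *next;	/* link in the completed list */
};

struct chan {
	struct desc *completed;	/* finished, but still awaiting 'ack' */
};

/* Called when the hardware finished 'd': free only if already acked. */
static void complete_desc(struct chan *c, struct desc *d)
{
	if (d->acked) {
		d->freed = true;
		return;
	}
	d->next = c->completed;	/* park it until the client acks */
	c->completed = d;
}

/* Sweep the completed list and free every descriptor acked meanwhile. */
static void clean_completed(struct chan *c)
{
	struct desc **pp = &c->completed;

	while (*pp) {
		struct desc *d = *pp;

		if (d->acked) {
			*pp = d->next;	/* unlink, then free */
			d->freed = true;
		} else {
			pp = &d->next;	/* keep it, move on */
		}
	}
}
```

In this model a client can still attach a dependent operation to an un-acked descriptor after it completes, because the descriptor stays alive on the completed list until the next sweep that sees the ack.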
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 242 +++++++++++++++++++++++++++++++++++---------------
drivers/dma/fsldma.h | 1 +
2 files changed, 172 insertions(+), 71 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 4f2f212..87f52c0 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -400,6 +400,125 @@ out_splice:
list_splice_tail_init(&desc->tx_list, &chan->ld_pending);
}
+static void fsldma_cleanup_descriptor(struct fsldma_chan *chan);
+static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan);
+
+/**
+ * fsldma_clean_completed_descriptor - free all descriptors which
+ * has been completed and acked
+ * @chan: Freescale DMA channel
+ *
+ * This function is used on all completed and acked descriptors.
+ * All descriptors should only be freed in this function.
+ */
+static int
+fsldma_clean_completed_descriptor(struct fsldma_chan *chan)
+{
+ struct fsl_desc_sw *desc, *_desc;
+
+ /* Run the callback for each descriptor, in order */
+ list_for_each_entry_safe(desc, _desc, &chan->ld_completed, node) {
+
+ if (async_tx_test_ack(&desc->async_tx)) {
+ /* Remove from the list of transactions */
+ list_del(&desc->node);
+#ifdef FSL_DMA_LD_DEBUG
+ chan_dbg(chan, "LD %p free\n", desc);
+#endif
+ dma_pool_free(chan->desc_pool, desc,
+ desc->async_tx.phys);
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * fsldma_run_tx_complete_actions - cleanup and free a single link descriptor
+ * @chan: Freescale DMA channel
+ * @desc: descriptor to cleanup and free
+ * @cookie: Freescale DMA transaction identifier
+ *
+ * This function is used on a descriptor which has been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies.
+ */
+static dma_cookie_t fsldma_run_tx_complete_actions(struct fsl_desc_sw *desc,
+ struct fsldma_chan *chan, dma_cookie_t cookie)
+{
+ struct dma_async_tx_descriptor *txd = &desc->async_tx;
+ struct device *dev = chan->common.device->dev;
+ dma_addr_t src = get_desc_src(chan, desc);
+ dma_addr_t dst = get_desc_dst(chan, desc);
+ u32 len = get_desc_cnt(chan, desc);
+
+ BUG_ON(txd->cookie < 0);
+
+ if (txd->cookie > 0) {
+ cookie = txd->cookie;
+
+ /* Run the link descriptor callback function */
+ if (txd->callback) {
+#ifdef FSL_DMA_LD_DEBUG
+ chan_dbg(chan, "LD %p callback\n", desc);
+#endif
+ txd->callback(txd->callback_param);
+ }
+
+ /* Unmap the dst buffer, if requested */
+ if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
+ if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
+ dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
+ else
+ dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
+ }
+
+ /* Unmap the src buffer, if requested */
+ if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
+ if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
+ dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
+ else
+ dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
+ }
+ }
+
+ /* Run any dependencies */
+ dma_run_dependencies(txd);
+
+ return cookie;
+}
+
+/**
+ * fsldma_clean_running_descriptor - move the completed descriptor from
+ * ld_running to ld_completed
+ * @chan: Freescale DMA channel
+ * @desc: the descriptor which is completed
+ *
+ * Free the descriptor directly if acked by async_tx api, or move it to
+ * queue ld_completed.
+ */
+static int
+fsldma_clean_running_descriptor(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc)
+{
+ /* Remove from the list of transactions */
+ list_del(&desc->node);
+ /*
+ * the client is allowed to attach dependent operations
+ * until 'ack' is set
+ */
+ if (!async_tx_test_ack(&desc->async_tx)) {
+ /*
+ * Move this descriptor to the list of descriptors which is
+ * completed, but still awaiting the 'ack' bit to be set.
+ */
+ list_add_tail(&desc->node, &chan->ld_completed);
+ return 0;
+ }
+
+ dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
+ return 0;
+}
+
static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
{
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
@@ -534,8 +653,10 @@ static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
chan_dbg(chan, "free all channel resources\n");
spin_lock_irqsave(&chan->desc_lock, flags);
+ fsldma_cleanup_descriptor(chan);
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
+ fsldma_free_desc_list(chan, &chan->ld_completed);
spin_unlock_irqrestore(&chan->desc_lock, flags);
dma_pool_destroy(chan->desc_pool);
@@ -819,46 +940,53 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
* controller. It will run any callbacks, submit any dependencies, and then
* free the descriptor.
*/
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
- struct fsl_desc_sw *desc)
+static void fsldma_cleanup_descriptor(struct fsldma_chan *chan)
{
- struct dma_async_tx_descriptor *txd = &desc->async_tx;
- struct device *dev = chan->common.device->dev;
- dma_addr_t src = get_desc_src(chan, desc);
- dma_addr_t dst = get_desc_dst(chan, desc);
- u32 len = get_desc_cnt(chan, desc);
+ struct fsl_desc_sw *desc, *_desc;
+ dma_cookie_t cookie = 0;
+ dma_addr_t curr_phys = get_cdar(chan);
+ int idle = dma_is_idle(chan);
+ int seen_current = 0;
- /* Run the link descriptor callback function */
- if (txd->callback) {
-#ifdef FSL_DMA_LD_DEBUG
- chan_dbg(chan, "LD %p callback\n", desc);
-#endif
- txd->callback(txd->callback_param);
- }
+ fsldma_clean_completed_descriptor(chan);
- /* Run any dependencies */
- dma_run_dependencies(txd);
+ /* Run the callback for each descriptor, in order */
+ list_for_each_entry_safe(desc, _desc, &chan->ld_running, node) {
+ /*
+ * do not advance past the current descriptor loaded into the
+ * hardware channel, subsequent descriptors are either in
+ * process or have not been submitted
+ */
+ if (seen_current)
+ break;
- /* Unmap the dst buffer, if requested */
- if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
- else
- dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
- }
+ /*
+ * stop the search if we reach the current descriptor and the
+ * channel is busy
+ */
+ if (desc->async_tx.phys == curr_phys) {
+ seen_current = 1;
+ if (!idle)
+ break;
+ }
+
+ cookie = fsldma_run_tx_complete_actions(desc, chan, cookie);
+
+ if (fsldma_clean_running_descriptor(chan, desc))
+ break;
- /* Unmap the src buffer, if requested */
- if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
- else
- dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
}
-#ifdef FSL_DMA_LD_DEBUG
- chan_dbg(chan, "LD %p free\n", desc);
-#endif
- dma_pool_free(chan->desc_pool, desc, txd->phys);
+ /*
+ * Start any pending transactions automatically
+ *
+ * In the ideal case, we keep the DMA controller busy while we go
+ * ahead and free the descriptors below.
+ */
+ fsl_chan_xfer_ld_queue(chan);
+
+ if (cookie > 0)
+ chan->common.completed_cookie = cookie;
}
/**
@@ -954,11 +1082,15 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
enum dma_status ret;
unsigned long flags;
- spin_lock_irqsave(&chan->desc_lock, flags);
ret = dma_cookie_status(dchan, cookie, txstate);
+ if (ret == DMA_SUCCESS)
+ return ret;
+
+ spin_lock_irqsave(&chan->desc_lock, flags);
+ fsldma_cleanup_descriptor(chan);
spin_unlock_irqrestore(&chan->desc_lock, flags);
- return ret;
+ return dma_cookie_status(dchan, cookie, txstate);
}
/*----------------------------------------------------------------------------*/
@@ -1035,52 +1167,19 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
static void dma_do_tasklet(unsigned long data)
{
struct fsldma_chan *chan = (struct fsldma_chan *)data;
- struct fsl_desc_sw *desc, *_desc;
- LIST_HEAD(ld_cleanup);
unsigned long flags;
chan_dbg(chan, "tasklet entry\n");
spin_lock_irqsave(&chan->desc_lock, flags);
- /* update the cookie if we have some descriptors to cleanup */
- if (!list_empty(&chan->ld_running)) {
- dma_cookie_t cookie;
-
- desc = to_fsl_desc(chan->ld_running.prev);
- cookie = desc->async_tx.cookie;
- dma_cookie_complete(&desc->async_tx);
-
- chan_dbg(chan, "completed_cookie=%d\n", cookie);
- }
-
- /*
- * move the descriptors to a temporary list so we can drop the lock
- * during the entire cleanup operation
- */
- list_splice_tail_init(&chan->ld_running, &ld_cleanup);
-
/* the hardware is now idle and ready for more */
chan->idle = true;
- /*
- * Start any pending transactions automatically
- *
- * In the ideal case, we keep the DMA controller busy while we go
- * ahead and free the descriptors below.
- */
- fsl_chan_xfer_ld_queue(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
-
- /* Run the callback for each descriptor, in order */
- list_for_each_entry_safe(desc, _desc, &ld_cleanup, node) {
+ /* Run all cleanup for this descriptor */
+ fsldma_cleanup_descriptor(chan);
- /* Remove from the list of transactions */
- list_del(&desc->node);
-
- /* Run all cleanup for this descriptor */
- fsldma_cleanup_descriptor(chan, desc);
- }
+ spin_unlock_irqrestore(&chan->desc_lock, flags);
chan_dbg(chan, "tasklet exit\n");
}
@@ -1262,6 +1361,7 @@ static int __devinit fsl_dma_chan_probe(struct fsldma_device *fdev,
spin_lock_init(&chan->desc_lock);
INIT_LIST_HEAD(&chan->ld_pending);
INIT_LIST_HEAD(&chan->ld_running);
+ INIT_LIST_HEAD(&chan->ld_completed);
chan->idle = true;
chan->common.device = &fdev->common;
diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index f5c3879..7ede908 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -140,6 +140,7 @@ struct fsldma_chan {
spinlock_t desc_lock; /* Descriptor operation lock */
struct list_head ld_pending; /* Link descriptors queue */
struct list_head ld_running; /* Link descriptors queue */
+ struct list_head ld_completed; /* Link descriptors queue */
struct dma_chan common; /* DMA common channel */
struct dma_pool *desc_pool; /* Descriptors pool */
struct device *dev; /* Channel device */
--
1.7.5.1
* [PATCH v4 2/7] fsl-dma: remove attribute DMA_INTERRUPT of dmaengine
From: qiang.liu @ 2012-07-27 9:15 UTC (permalink / raw)
To: linux-crypto, linuxppc-dev
Cc: Vinod Koul, Qiang Liu, herbert, Dan Williams, davem
From: Qiang Liu <qiang.liu@freescale.com>
Remove the DMA_INTERRUPT capability because fsl-dma does not support this
operation; an exception is thrown if talitos is used to offload xor at the
same time.
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Li Yang <leoli@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
Acked-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/dma/fsldma.c | 31 -------------------------------
1 files changed, 0 insertions(+), 31 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 8f84761..4f2f212 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -543,35 +543,6 @@ static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
}
static struct dma_async_tx_descriptor *
-fsl_dma_prep_interrupt(struct dma_chan *dchan, unsigned long flags)
-{
- struct fsldma_chan *chan;
- struct fsl_desc_sw *new;
-
- if (!dchan)
- return NULL;
-
- chan = to_fsl_chan(dchan);
-
- new = fsl_dma_alloc_descriptor(chan);
- if (!new) {
- chan_err(chan, "%s\n", msg_ld_oom);
- return NULL;
- }
-
- new->async_tx.cookie = -EBUSY;
- new->async_tx.flags = flags;
-
- /* Insert the link descriptor to the LD ring */
- list_add_tail(&new->node, &new->tx_list);
-
- /* Set End-of-link to the last link descriptor of new list */
- set_ld_eol(chan, new);
-
- return &new->async_tx;
-}
-
-static struct dma_async_tx_descriptor *
fsl_dma_prep_memcpy(struct dma_chan *dchan,
dma_addr_t dma_dst, dma_addr_t dma_src,
size_t len, unsigned long flags)
@@ -1352,12 +1323,10 @@ static int __devinit fsldma_of_probe(struct platform_device *op)
fdev->irq = irq_of_parse_and_map(op->dev.of_node, 0);
dma_cap_set(DMA_MEMCPY, fdev->common.cap_mask);
- dma_cap_set(DMA_INTERRUPT, fdev->common.cap_mask);
dma_cap_set(DMA_SG, fdev->common.cap_mask);
dma_cap_set(DMA_SLAVE, fdev->common.cap_mask);
fdev->common.device_alloc_chan_resources = fsl_dma_alloc_chan_resources;
fdev->common.device_free_chan_resources = fsl_dma_free_chan_resources;
- fdev->common.device_prep_dma_interrupt = fsl_dma_prep_interrupt;
fdev->common.device_prep_dma_memcpy = fsl_dma_prep_memcpy;
fdev->common.device_prep_dma_sg = fsl_dma_prep_sg;
fdev->common.device_tx_status = fsl_tx_status;
--
1.7.5.1
* [PATCH v4 0/7] Raid: enable talitos xor offload for improving performance
From: qiang.liu @ 2012-07-27 9:15 UTC (permalink / raw)
To: linux-crypto, vinod.koul, dan.j.williams, herbert, linuxppc-dev
Hi,
The following 7 patches enable fsl-dma and talitos to offload raid
operations, improving raid performance and balancing CPU load.
Write performance improves by 25-30%, as measured with iozone.
Replacing spin_lock_irqsave with spin_lock_bh improves write performance
by about 2%.
CPU load is reduced by 8%.
Changes in v4:
- fix an error in talitos when the dest addr is the same as a src addr:
dest must be released only once in that case;
- correct coding style in fsl-dma according to Ira's comments;
- fix a race condition in fsl-dma fsl_tx_status(): remove the separate
interface that freed descriptors in queue ld_completed, because
fsldma_cleanup_descriptor() now covers it; in v3 one call site was
missing spin_lock protection;
- split the original patch 3/4 into 2 patches, 3/7 and 4/7, according to
Li Yang's comments;
- fix a warning about an uninitialized cookie;
- add a memory copy self test in fsl-dma;
- describe in more detail why spin_lock_bh() is used instead of
spin_lock_irqsave(), according to Timur's comments.
Changes in v3:
- change the release process of fsl-dma descriptors to resolve a
potential race condition
- add test results for replacing spin_lock_irqsave with spin_lock_bh
- update the benchmark results according to the latest patch
Changes in v2:
- rebase onto cryptodev tree
- split patch 3/4 into 3 independent patches
- remove patch 4/4; that fix is not for the cryptodev tree
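To make the "release process" change from the notes above easier to follow,
here is a small user-space model of the idea. It is only a sketch under
stated assumptions: struct model_desc, model_cleanup() and
model_reap_completed() are hypothetical names, a plain array stands in for
the driver's list_head queues, and the second stage is a guess at what a
helper such as fsldma_clean_completed_descriptor() would do. It is not the
driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver descriptor; not the kernel struct. */
enum model_state { MODEL_RUNNING, MODEL_COMPLETED, MODEL_FREED };

struct model_desc {
	int cookie;              /* positive transaction id */
	bool acked;              /* client has set the 'ack' bit */
	enum model_state state;  /* which stage the descriptor is in */
};

/*
 * First stage: walk running[0..n) in submission order.
 * hw_current is the index of the descriptor loaded into the hardware,
 * idle says whether the hardware has finished it.
 * Returns the cookie of the last descriptor cleaned, or 0 if none.
 */
static int model_cleanup(struct model_desc *running, size_t n,
			 size_t hw_current, bool idle)
{
	int cookie = 0;

	assert(running != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* The hardware is still working on this one: stop here. */
		if (i == hw_current && !idle)
			break;

		cookie = running[i].cookie;

		/* Acked descriptors are released at once, others are parked. */
		running[i].state = running[i].acked ? MODEL_FREED
						    : MODEL_COMPLETED;

		/* Never advance past the descriptor loaded into the hardware. */
		if (i == hw_current)
			break;
	}
	return cookie;
}

/* Second stage (assumed): release parked descriptors once they are acked. */
static size_t model_reap_completed(struct model_desc *descs, size_t n)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (descs[i].state == MODEL_COMPLETED && descs[i].acked) {
			descs[i].state = MODEL_FREED;
			freed++;
		}
	}
	return freed;
}
```

With four descriptors and a busy channel working on the third, a call such
as model_cleanup(d, 4, 2, false) cleans only the first two; an unacked
descriptor is parked rather than freed, so a client can still attach
dependent operations to it.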
Qiang Liu (4):
Talitos: Support for async_tx XOR offload
fsl-dma: remove attribute DMA_INTERRUPT of dmaengine
fsl-dma: change release process of dma descriptor for supporting async_tx
fsl-dma: use spin_lock_bh to instead of spin_lock_irqsave
Qiang Liu (7):
Talitos: Support for async_tx XOR offload
fsl-dma: remove attribute DMA_INTERRUPT of dmaengine
fsl-dma: change release process of dma descriptor for supporting async_tx
fsl-dma: move the function ahead of its invoke function
fsl-dma: use spin_lock_bh to instead of spin_lock_irqsave
fsl-dma: fix a warning of unitialized cookie
fsl-dma: add memcpy self test interface
drivers/crypto/Kconfig | 9 +
drivers/crypto/talitos.c | 413 ++++++++++++++++++++++++++++++++++
drivers/crypto/talitos.h | 53 +++++
drivers/dma/fsldma.c | 550 +++++++++++++++++++++++++++++-----------------
drivers/dma/fsldma.h | 1 +
5 files changed, 822 insertions(+), 204 deletions(-)
* [PATCH v4 1/7] Talitos: Support for async_tx XOR offload
From: qiang.liu @ 2012-07-27 9:15 UTC (permalink / raw)
To: linux-crypto, herbert, davem, linuxppc-dev
Cc: vinod.koul, Qiang Liu, dan.j.williams
From: Qiang Liu <qiang.liu@freescale.com>
Expose Talitos's XOR functionality to be used for RAID parity
calculation via the Async_tx layer.
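
For readers less familiar with RAID parity, the operation being offloaded
is a plain byte-wise XOR over several source blocks. The following
user-space sketch shows that calculation in software; xor_blocks() is a
hypothetical helper written for illustration and is not part of the driver
or of the async_tx API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * dest = src[0] ^ src[1] ^ ... ^ src[src_cnt - 1], byte by byte.
 * dest may alias src[0] (memmove handles that case), but must not
 * alias any other source.
 */
static void xor_blocks(uint8_t *dest, const uint8_t *const *src,
		       size_t src_cnt, size_t len)
{
	assert(src_cnt > 0);

	/* Start from the first source, then fold in the remaining ones. */
	memmove(dest, src[0], len);
	for (size_t s = 1; s < src_cnt; s++)
		for (size_t i = 0; i < len; i++)
			dest[i] ^= src[s][i];
}
```

Because XOR is its own inverse, the same routine rebuilds a lost data
block: XOR the parity block with the surviving data blocks and the result
is the missing block.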
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Dipen Dudhat <Dipen.Dudhat@freescale.com>
Signed-off-by: Maneesh Gupta <Maneesh.Gupta@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Vishnu Suresh <Vishnu@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/crypto/Kconfig | 9 +
drivers/crypto/talitos.c | 413 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/crypto/talitos.h | 53 ++++++
3 files changed, 475 insertions(+), 0 deletions(-)
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index be6b2ba..f0a7c29 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -222,6 +222,15 @@ config CRYPTO_DEV_TALITOS
To compile this driver as a module, choose M here: the module
will be called talitos.
+config CRYPTO_DEV_TALITOS_RAIDXOR
+ bool "Talitos RAID5 XOR Calculation Offload"
+ default y
+ select DMA_ENGINE
+ depends on CRYPTO_DEV_TALITOS
+ help
+ Say 'Y' here to use the Freescale Security Engine (SEC) to
+ offload RAID XOR parity Calculation
+
config CRYPTO_DEV_IXP4XX
tristate "Driver for IXP4xx crypto hardware acceleration"
depends on ARCH_IXP4XX
diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index efff788..b34264e 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -619,6 +619,399 @@ static void talitos_unregister_rng(struct device *dev)
hwrng_unregister(&priv->rng);
}
+#ifdef CONFIG_CRYPTO_DEV_TALITOS_RAIDXOR
+static void talitos_release_xor(struct device *dev, struct talitos_desc *hwdesc,
+ void *context, int error);
+
+static enum dma_status talitos_is_tx_complete(struct dma_chan *chan,
+ dma_cookie_t cookie,
+ struct dma_tx_state *state)
+{
+ struct talitos_xor_chan *xor_chan;
+ dma_cookie_t last_used;
+ dma_cookie_t last_complete;
+
+ xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+ last_used = chan->cookie;
+ last_complete = xor_chan->completed_cookie;
+
+ if (state->last)
+ state->last = last_complete;
+
+ if (state->used)
+ state->used = last_used;
+
+ return dma_async_is_complete(cookie, last_complete, last_used);
+}
+
+static void talitos_process_pending(struct talitos_xor_chan *xor_chan)
+{
+ struct talitos_xor_desc *desc, *_desc;
+ unsigned long flags;
+ int status;
+ struct talitos_private *priv;
+ int ch;
+
+ priv = dev_get_drvdata(xor_chan->dev);
+ ch = atomic_inc_return(&priv->last_chan) &
+ (priv->num_channels - 1);
+ spin_lock_irqsave(&xor_chan->desc_lock, flags);
+
+ list_for_each_entry_safe(desc, _desc, &xor_chan->pending_q, node) {
+ status = talitos_submit(xor_chan->dev, ch, &desc->hwdesc,
+ talitos_release_xor, desc);
+ if (status != -EINPROGRESS)
+ break;
+
+ list_del(&desc->node);
+ list_add_tail(&desc->node, &xor_chan->in_progress_q);
+ }
+
+ spin_unlock_irqrestore(&xor_chan->desc_lock, flags);
+}
+
+static void talitos_xor_run_tx_complete_actions(struct talitos_xor_desc *desc,
+ struct talitos_xor_chan *xor_chan)
+{
+ struct device *dev = xor_chan->dev;
+ dma_addr_t dest, addr;
+ unsigned int src_cnt = desc->unmap_src_cnt;
+ unsigned int len = desc->unmap_len;
+ enum dma_ctrl_flags flags = desc->async_tx.flags;
+ struct dma_async_tx_descriptor *tx = &desc->async_tx;
+
+ /* unmap dma addresses */
+ dest = desc->hwdesc.ptr[6].ptr;
+ if (likely(!(flags & DMA_COMPL_SKIP_DEST_UNMAP)))
+ dma_unmap_page(dev, dest, len, DMA_BIDIRECTIONAL);
+
+ desc->idx = 6 - src_cnt;
+ if (likely(!(flags & DMA_COMPL_SKIP_SRC_UNMAP))) {
+ while(desc->idx < 6) {
+ addr = desc->hwdesc.ptr[desc->idx++].ptr;
+ if (addr == dest)
+ continue;
+ dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
+ }
+ }
+
+ /* run dependent operations */
+ dma_run_dependencies(tx);
+}
+
+static void talitos_release_xor(struct device *dev, struct talitos_desc *hwdesc,
+ void *context, int error)
+{
+ struct talitos_xor_desc *desc = context;
+ struct talitos_xor_chan *xor_chan;
+ dma_async_tx_callback callback;
+ void *callback_param;
+
+ if (unlikely(error))
+ dev_err(dev, "xor operation: talitos error %d\n", error);
+
+ xor_chan = container_of(desc->async_tx.chan, struct talitos_xor_chan,
+ common);
+ spin_lock_bh(&xor_chan->desc_lock);
+ if (xor_chan->completed_cookie < desc->async_tx.cookie)
+ xor_chan->completed_cookie = desc->async_tx.cookie;
+
+ callback = desc->async_tx.callback;
+ callback_param = desc->async_tx.callback_param;
+
+ if (callback) {
+ spin_unlock_bh(&xor_chan->desc_lock);
+ callback(callback_param);
+ spin_lock_bh(&xor_chan->desc_lock);
+ }
+
+ talitos_xor_run_tx_complete_actions(desc, xor_chan);
+
+ list_del(&desc->node);
+ list_add_tail(&desc->node, &xor_chan->free_desc);
+ spin_unlock_bh(&xor_chan->desc_lock);
+ if (!list_empty(&xor_chan->pending_q))
+ talitos_process_pending(xor_chan);
+}
+
+/**
+ * talitos_issue_pending - move the descriptors in submit
+ * queue to pending queue and submit them for processing
+ * @chan: DMA channel
+ */
+static void talitos_issue_pending(struct dma_chan *chan)
+{
+ struct talitos_xor_chan *xor_chan;
+
+ xor_chan = container_of(chan, struct talitos_xor_chan, common);
+ spin_lock_bh(&xor_chan->desc_lock);
+ list_splice_tail_init(&xor_chan->submit_q,
+ &xor_chan->pending_q);
+ spin_unlock_bh(&xor_chan->desc_lock);
+ talitos_process_pending(xor_chan);
+}
+
+static dma_cookie_t talitos_async_tx_submit(struct dma_async_tx_descriptor *tx)
+{
+ struct talitos_xor_desc *desc;
+ struct talitos_xor_chan *xor_chan;
+ dma_cookie_t cookie;
+
+ desc = container_of(tx, struct talitos_xor_desc, async_tx);
+ xor_chan = container_of(tx->chan, struct talitos_xor_chan, common);
+
+ spin_lock_bh(&xor_chan->desc_lock);
+
+ cookie = xor_chan->common.cookie + 1;
+ if (cookie < 0)
+ cookie = 1;
+
+ desc->async_tx.cookie = cookie;
+ xor_chan->common.cookie = desc->async_tx.cookie;
+
+ list_splice_tail_init(&desc->tx_list,
+ &xor_chan->submit_q);
+
+ spin_unlock_bh(&xor_chan->desc_lock);
+
+ return cookie;
+}
+
+static struct talitos_xor_desc *talitos_xor_alloc_descriptor(
+ struct talitos_xor_chan *xor_chan, gfp_t flags)
+{
+ struct talitos_xor_desc *desc;
+
+ desc = kmalloc(sizeof(*desc), flags);
+ if (desc) {
+ xor_chan->total_desc++;
+ desc->async_tx.tx_submit = talitos_async_tx_submit;
+ }
+
+ return desc;
+}
+
+static void talitos_free_chan_resources(struct dma_chan *chan)
+{
+ struct talitos_xor_chan *xor_chan;
+ struct talitos_xor_desc *desc, *_desc;
+
+ xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+ spin_lock_bh(&xor_chan->desc_lock);
+
+ list_for_each_entry_safe(desc, _desc, &xor_chan->submit_q, node) {
+ list_del(&desc->node);
+ xor_chan->total_desc--;
+ kfree(desc);
+ }
+ list_for_each_entry_safe(desc, _desc, &xor_chan->pending_q, node) {
+ list_del(&desc->node);
+ xor_chan->total_desc--;
+ kfree(desc);
+ }
+ list_for_each_entry_safe(desc, _desc, &xor_chan->in_progress_q, node) {
+ list_del(&desc->node);
+ xor_chan->total_desc--;
+ kfree(desc);
+ }
+ list_for_each_entry_safe(desc, _desc, &xor_chan->free_desc, node) {
+ list_del(&desc->node);
+ xor_chan->total_desc--;
+ kfree(desc);
+ }
+
+ /* Some descriptor not freed? */
+ if (unlikely(xor_chan->total_desc))
+ dev_warn(chan->device->dev, "Failed to free xor channel resource\n");
+
+ spin_unlock_bh(&xor_chan->desc_lock);
+}
+
+static int talitos_alloc_chan_resources(struct dma_chan *chan)
+{
+ struct talitos_xor_chan *xor_chan;
+ struct talitos_xor_desc *desc;
+ LIST_HEAD(tmp_list);
+ int i;
+
+ xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+ if (!list_empty(&xor_chan->free_desc))
+ return xor_chan->total_desc;
+
+ for (i = 0; i < TALITOS_MAX_DESCRIPTOR_NR; i++) {
+ desc = talitos_xor_alloc_descriptor(xor_chan,
+ GFP_KERNEL | GFP_DMA);
+ if (!desc) {
+ dev_err(xor_chan->common.device->dev,
+ "Only %d initial descriptors\n", i);
+ break;
+ }
+ list_add_tail(&desc->node, &tmp_list);
+ }
+
+ if (!i)
+ return -ENOMEM;
+
+ /* At least one desc is allocated */
+ spin_lock_bh(&xor_chan->desc_lock);
+ list_splice_init(&tmp_list, &xor_chan->free_desc);
+ spin_unlock_bh(&xor_chan->desc_lock);
+
+ return xor_chan->total_desc;
+}
+
+static struct dma_async_tx_descriptor *talitos_prep_dma_xor(
+ struct dma_chan *chan, dma_addr_t dest, dma_addr_t *src,
+ unsigned int src_cnt, size_t len, unsigned long flags)
+{
+ struct talitos_xor_chan *xor_chan;
+ struct talitos_xor_desc *new;
+ struct talitos_desc *desc;
+ int i, j;
+
+ BUG_ON(len > TALITOS_MAX_DATA_LEN);
+
+ xor_chan = container_of(chan, struct talitos_xor_chan, common);
+
+ spin_lock_bh(&xor_chan->desc_lock);
+ if (!list_empty(&xor_chan->free_desc)) {
+ new = container_of(xor_chan->free_desc.next,
+ struct talitos_xor_desc, node);
+ list_del(&new->node);
+ } else {
+ new = talitos_xor_alloc_descriptor(xor_chan, GFP_KERNEL | GFP_DMA);
+ }
+ spin_unlock_bh(&xor_chan->desc_lock);
+
+ if (!new) {
+ dev_err(xor_chan->common.device->dev,
+ "No free memory for XOR DMA descriptor\n");
+ return NULL;
+ }
+ dma_async_tx_descriptor_init(&new->async_tx, &xor_chan->common);
+
+ INIT_LIST_HEAD(&new->node);
+ INIT_LIST_HEAD(&new->tx_list);
+
+ desc = &new->hwdesc;
+ /* Set destination: Last pointer pair */
+ to_talitos_ptr(&desc->ptr[6], dest);
+ desc->ptr[6].len = cpu_to_be16(len);
+ desc->ptr[6].j_extent = 0;
+ new->unmap_src_cnt = src_cnt;
+ new->unmap_len = len;
+
+ /* Set Sources: End loading from second-last pointer pair */
+ for (i = 5, j = 0; j < src_cnt && i >= 0; i--, j++) {
+ to_talitos_ptr(&desc->ptr[i], src[j]);
+ desc->ptr[i].len = cpu_to_be16(len);
+ desc->ptr[i].j_extent = 0;
+ }
+
+ /*
+ * documentation states first 0 ptr/len combo marks end of sources
+ * yet device produces scatter boundary error unless all subsequent
+ * sources are zeroed out
+ */
+ for (; i >= 0; i--) {
+ to_talitos_ptr(&desc->ptr[i], 0);
+ desc->ptr[i].len = 0;
+ desc->ptr[i].j_extent = 0;
+ }
+
+ desc->hdr = DESC_HDR_SEL0_AESU | DESC_HDR_MODE0_AESU_XOR |
+ DESC_HDR_TYPE_RAID_XOR;
+
+ new->async_tx.parent = NULL;
+ new->async_tx.next = NULL;
+ new->async_tx.cookie = 0;
+ async_tx_ack(&new->async_tx);
+
+ list_add_tail(&new->node, &new->tx_list);
+
+ new->async_tx.flags = flags;
+ new->async_tx.cookie = -EBUSY;
+
+ return &new->async_tx;
+}
+
+static void talitos_unregister_async_xor(struct device *dev)
+{
+ struct talitos_private *priv = dev_get_drvdata(dev);
+ struct talitos_xor_chan *xor_chan;
+ struct dma_chan *chan, *_chan;
+
+ if (priv->dma_dev_common.chancnt)
+ dma_async_device_unregister(&priv->dma_dev_common);
+
+ list_for_each_entry_safe(chan, _chan, &priv->dma_dev_common.channels,
+ device_node) {
+ xor_chan = container_of(chan, struct talitos_xor_chan,
+ common);
+ list_del(&chan->device_node);
+ priv->dma_dev_common.chancnt--;
+ kfree(xor_chan);
+ }
+}
+
+/**
+ * talitos_register_async_tx - Initialize the Freescale XOR ADMA device
+ * It is registered as a DMA device with the capability to perform
+ * XOR operation with the Async_tx layer.
+ * The various queues and channel resources are also allocated.
+ */
+static int talitos_register_async_tx(struct device *dev, int max_xor_srcs)
+{
+ struct talitos_private *priv = dev_get_drvdata(dev);
+ struct dma_device *dma_dev = &priv->dma_dev_common;
+ struct talitos_xor_chan *xor_chan;
+ int err;
+
+ xor_chan = kzalloc(sizeof(struct talitos_xor_chan), GFP_KERNEL);
+ if (!xor_chan) {
+ dev_err(dev, "unable to allocate xor channel\n");
+ return -ENOMEM;
+ }
+
+ dma_dev->dev = dev;
+ dma_dev->device_alloc_chan_resources = talitos_alloc_chan_resources;
+ dma_dev->device_free_chan_resources = talitos_free_chan_resources;
+ dma_dev->device_prep_dma_xor = talitos_prep_dma_xor;
+ dma_dev->max_xor = max_xor_srcs;
+ dma_dev->device_tx_status = talitos_is_tx_complete;
+ dma_dev->device_issue_pending = talitos_issue_pending;
+ INIT_LIST_HEAD(&dma_dev->channels);
+ dma_cap_set(DMA_XOR, dma_dev->cap_mask);
+
+ xor_chan->dev = dev;
+ xor_chan->common.device = dma_dev;
+ xor_chan->total_desc = 0;
+ INIT_LIST_HEAD(&xor_chan->submit_q);
+ INIT_LIST_HEAD(&xor_chan->pending_q);
+ INIT_LIST_HEAD(&xor_chan->in_progress_q);
+ INIT_LIST_HEAD(&xor_chan->free_desc);
+ spin_lock_init(&xor_chan->desc_lock);
+
+ list_add_tail(&xor_chan->common.device_node, &dma_dev->channels);
+ dma_dev->chancnt++;
+
+ err = dma_async_device_register(dma_dev);
+ if (err) {
+ dev_err(dev, "Unable to register XOR with Async_tx\n");
+ goto err_out;
+ }
+
+ return err;
+
+err_out:
+ talitos_unregister_async_xor(dev);
+ return err;
+}
+#endif
+
/*
* crypto alg
*/
@@ -2891,6 +3284,26 @@ static int talitos_probe(struct platform_device *ofdev)
dev_info(dev, "hwrng\n");
}
+#ifdef CONFIG_CRYPTO_DEV_TALITOS_RAIDXOR
+ /*
+ * register with async_tx xor, if capable
+ * SEC 2.x support up to 3 RAID sources,
+ * SEC 3.x support up to 6
+ */
+ if (hw_supports(dev, DESC_HDR_SEL0_AESU | DESC_HDR_TYPE_RAID_XOR)) {
+ int max_xor_srcs = 3;
+ if (of_device_is_compatible(np, "fsl,sec3.0"))
+ max_xor_srcs = 6;
+ err = talitos_register_async_tx(dev, max_xor_srcs);
+ if (err) {
+ dev_err(dev, "failed to register async_tx xor: %d\n",
+ err);
+ goto err_out;
+ }
+ dev_info(dev, "max_xor_srcs %d\n", max_xor_srcs);
+ }
+#endif
+
/* register crypto algorithms the device supports */
for (i = 0; i < ARRAY_SIZE(driver_algs); i++) {
if (hw_supports(dev, driver_algs[i].desc_hdr_template)) {
diff --git a/drivers/crypto/talitos.h b/drivers/crypto/talitos.h
index 61a1405..fc9d125 100644
--- a/drivers/crypto/talitos.h
+++ b/drivers/crypto/talitos.h
@@ -30,6 +30,7 @@
#define TALITOS_TIMEOUT 100000
#define TALITOS_MAX_DATA_LEN 65535
+#define TALITOS_MAX_DESCRIPTOR_NR 256
#define DESC_TYPE(desc_hdr) ((be32_to_cpu(desc_hdr) >> 3) & 0x1f)
#define PRIMARY_EU(desc_hdr) ((be32_to_cpu(desc_hdr) >> 28) & 0xf)
@@ -131,7 +132,57 @@ struct talitos_private {
/* hwrng device */
struct hwrng rng;
+
+#ifdef CONFIG_CRYPTO_DEV_TALITOS_RAIDXOR
+ /* XOR Device */
+ struct dma_device dma_dev_common;
+#endif
+};
+
+#ifdef CONFIG_CRYPTO_DEV_TALITOS_RAIDXOR
+/**
+ * talitos_xor_chan - context management for the async_tx channel
+ * @completed_cookie: the last completed cookie
+ * @desc_lock: lock for tx queue
+ * @total_desc: number of descriptors allocated
+ * @submit_q: queue of submitted descriptors
+ * @pending_q: queue of pending descriptors
+ * @in_progress_q: queue of descriptors in progress
+ * @free_desc: queue of unused descriptors
+ * @dev: talitos device implementing this channel
+ * @common: the corresponding xor channel in async_tx
+ */
+struct talitos_xor_chan {
+ dma_cookie_t completed_cookie;
+ spinlock_t desc_lock;
+ unsigned int total_desc;
+ struct list_head submit_q;
+ struct list_head pending_q;
+ struct list_head in_progress_q;
+ struct list_head free_desc;
+ struct device *dev;
+ struct dma_chan common;
+};
+
+/**
+ * talitos_xor_desc - software xor descriptor
+ * @async_tx: the referring async_tx descriptor
+ * @node:
+ * @hwdesc: h/w descriptor
+ * @unmap_src_cnt: number of xor sources
+ * @unmap_len: transaction byte count
+ * @idx: index of xor sources
+ */
+struct talitos_xor_desc {
+ struct dma_async_tx_descriptor async_tx;
+ struct list_head tx_list;
+ struct list_head node;
+ struct talitos_desc hwdesc;
+ unsigned int unmap_src_cnt;
+ unsigned int unmap_len;
+ unsigned int idx;
};
+#endif
extern int talitos_submit(struct device *dev, int ch, struct talitos_desc *desc,
void (*callback)(struct device *dev,
@@ -284,6 +335,7 @@ extern int talitos_submit(struct device *dev, int ch, struct talitos_desc *desc,
/* primary execution unit mode (MODE0) and derivatives */
#define DESC_HDR_MODE0_ENCRYPT cpu_to_be32(0x00100000)
#define DESC_HDR_MODE0_AESU_CBC cpu_to_be32(0x00200000)
+#define DESC_HDR_MODE0_AESU_XOR cpu_to_be32(0x0c600000)
#define DESC_HDR_MODE0_DEU_CBC cpu_to_be32(0x00400000)
#define DESC_HDR_MODE0_DEU_3DES cpu_to_be32(0x00200000)
#define DESC_HDR_MODE0_MDEU_CONT cpu_to_be32(0x08000000)
@@ -344,6 +396,7 @@ extern int talitos_submit(struct device *dev, int ch, struct talitos_desc *desc,
#define DESC_HDR_TYPE_IPSEC_ESP cpu_to_be32(1 << 3)
#define DESC_HDR_TYPE_COMMON_NONSNOOP_NO_AFEU cpu_to_be32(2 << 3)
#define DESC_HDR_TYPE_HMAC_SNOOP_NO_AFEU cpu_to_be32(4 << 3)
+#define DESC_HDR_TYPE_RAID_XOR cpu_to_be32(21 << 3)
/* link table extent field bits */
#define DESC_PTR_LNKTBL_JUMP 0x80
--
1.7.5.1
* Re: [PATCH v3 2/2] powerpc: Uprobes port to powerpc
From: Srikar Dronamraju @ 2012-07-27 8:40 UTC (permalink / raw)
To: Ananth N Mavinakayanahalli
Cc: peterz, lkml, oleg, Paul Mackerras, Anton Blanchard, Ingo Molnar,
linuxppc-dev
In-Reply-To: <20120726052029.GB29466@in.ibm.com>
* Ananth N Mavinakayanahalli <ananth@in.ibm.com> [2012-07-26 10:50:29]:
> From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
>
> This is the port of uprobes to powerpc. Usage is similar to x86.
>
> [root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
> Added new event:
> probe_libc:malloc (on 0xb4860)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe_libc:malloc -aR sleep 1
>
> [root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
> [ perf record: Woken up 22 times to write data ]
> [ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
> [root@xxxx ~]# ./bin/perf report --stdio
> ...
>
> # Samples: 83K of event 'probe_libc:malloc'
> # Event count (approx.): 83484
> #
> # Overhead Command Shared Object Symbol
> # ........ ............ ............. ..........
> #
> 69.05% tar libc-2.12.so [.] malloc
> 28.57% rm libc-2.12.so [.] malloc
> 1.32% avahi-daemon libc-2.12.so [.] malloc
> 0.58% bash libc-2.12.so [.] malloc
> 0.28% sshd libc-2.12.so [.] malloc
> 0.08% irqbalance libc-2.12.so [.] malloc
> 0.05% bzip2 libc-2.12.so [.] malloc
> 0.04% sleep libc-2.12.so [.] malloc
> 0.03% multipathd libc-2.12.so [.] malloc
> 0.01% sendmail libc-2.12.so [.] malloc
> 0.01% automount libc-2.12.so [.] malloc
>
> Patch applies on the current master branch of Linus' tree (bdc0077af).
> The trap_nr addition patch is a prereq.
>
> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
* RE: [PATCH V3 1/5] powerpc/fsl-pci: Unify pci/pcie initialization code
From: Jia Hongtao-B38951 @ 2012-07-27 8:35 UTC (permalink / raw)
To: Kumar Gala
Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org, Li Yang-R58472
In-Reply-To: <F5E29818-CC20-444E-A125-249E5E3BA365@kernel.crashing.org>
> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Friday, July 27, 2012 2:15 AM
> To: Jia Hongtao-B38951
> Cc: linuxppc-dev@lists.ozlabs.org; Wood Scott-B07421; Li Yang-R58472
> Subject: Re: [PATCH V3 1/5] powerpc/fsl-pci: Unify pci/pcie
> initialization code
>
>
> On Jul 26, 2012, at 7:30 AM, Jia Hongtao wrote:
>
> > We unified the Freescale pci/pcie initialization by changing the fsl_pci
> > to a platform driver. In previous PCI code architecture the initialization
> > routine is called at board_setup_arch stage. Now the initialization is done
> > in probe function which is architectural better. Also It's convenient for
> > adding PM support for PCI controller in later patch.
> >
> > One issue introduced by this architecture is the timing of swiotlb_init.
> > During PCI initialization the need of swiotlb is determined and this should
> > be done before swiotlb_init. So a new function to determine swiotlb by
> > parsing pci ranges is made. This function is called at board_setup_arch
> > stage which is earlier than swiotlb_init.
> >
> > Signed-off-by: Jia Hongtao <B38951@freescale.com>
> > Signed-off-by: Li Yang <leoli@freescale.com>
> > ---
> > Changed for V3:
> > - Rebase the patch set on the latest tree
> > - merge PCI unify and swiotlb patch into one
> >
> > arch/powerpc/sysdev/fsl_pci.c | 155 ++++++++++++++++++++++++++++++++--
> -------
> > arch/powerpc/sysdev/fsl_pci.h | 9 +--
> > 2 files changed, 125 insertions(+), 39 deletions(-)
>
> I'd like the SWIOTLB refactoring as a separate patch. Additionally, the
> order of patches should be as follows:
>
> 1. refactor PCI node parsing code
> 2. add pci_determine_swiotlb (should rename to fsl_pci_determine_swiotlb)
> 3. Determine primary bus by looking for ISA node
> 4. convert all boards over to fsl_pci_init
> 5. convert fsl pci to platform driver (edac and other fixes should be
> merged in here)
> 6. PM support
>
> - k
Should I convert all boards over to fsl_pci_init first and then convert them
over to the platform driver again, or just convert them directly to the
platform driver?
Thanks.
-Hongtao.
^ permalink raw reply
* Re: [PATCH] scsi/ibmvscsi: /sys/class/scsi_host/hostX/config doesn't show any information
From: James Bottomley @ 2012-07-27 6:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, olaf, Linda Xie, linux-scsi
In-Reply-To: <1343366362.2118.15.camel@pasglop>
On Fri, 2012-07-27 at 15:19 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2012-07-18 at 18:49 +0200, olaf@aepfle.de wrote:
> > From: Linda Xie <lxiep@us.ibm.com>
>
> James, can I assume you're picking up those two ?
If they get acked by the maintainers ...
James
^ permalink raw reply
* [PATCH] powerpc/fsl: mpic timer driver
From: Dongsheng.wang @ 2012-07-27 6:20 UTC (permalink / raw)
To: benh, paulus; +Cc: scottwood, linuxppc-dev, Wang Dongsheng
From: Wang Dongsheng <Dongsheng.Wang@freescale.com>
This adds a driver for the global timers internal to the PIC. The two
independent groups of global timers, group A and group B, are identical
in functionality. Each hardware timer generates an interrupt on every
timer cycle. For example, power management can use a hardware timer to
wake up the machine.
Signed-off-by: Wang Dongsheng <Dongsheng.Wang@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
---
arch/powerpc/include/asm/mpic_timer.h | 15 +
arch/powerpc/platforms/Kconfig | 5 +
arch/powerpc/sysdev/Makefile | 1 +
arch/powerpc/sysdev/mpic_timer.c | 459 +++++++++++++++++++++++++++++++++
4 files changed, 480 insertions(+), 0 deletions(-)
create mode 100644 arch/powerpc/include/asm/mpic_timer.h
create mode 100644 arch/powerpc/sysdev/mpic_timer.c
diff --git a/arch/powerpc/include/asm/mpic_timer.h b/arch/powerpc/include/asm/mpic_timer.h
new file mode 100644
index 0000000..01d58a2
--- /dev/null
+++ b/arch/powerpc/include/asm/mpic_timer.h
@@ -0,0 +1,15 @@
+#ifndef __MPIC_TIMER__
+#define __MPIC_TIMER__
+
+#include <linux/interrupt.h>
+#include <linux/time.h>
+
+struct mpic_timer *mpic_request_timer(irq_handler_t fn, void *dev,
+ const struct timeval *time);
+
+void mpic_start_timer(struct mpic_timer *handle);
+
+void mpic_stop_timer(struct mpic_timer *handle);
+
+void mpic_free_timer(struct mpic_timer *handle);
+#endif
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index f21af8d..3466690 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -87,6 +87,11 @@ config MPIC
bool
default n
+config MPIC_TIMER
+ bool "MPIC Global Timer"
+ depends on MPIC && FSL_SOC
+ default n
+
config PPC_EPAPR_HV_PIC
bool
default n
diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index b0aff6c..3002f28 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -4,6 +4,7 @@ ccflags-$(CONFIG_PPC64) := -mno-minimal-toc
mpic-msi-obj-$(CONFIG_PCI_MSI) += mpic_msi.o mpic_u3msi.o mpic_pasemi_msi.o
obj-$(CONFIG_MPIC) += mpic.o $(mpic-msi-obj-y)
+obj-$(CONFIG_MPIC_TIMER) += mpic_timer.o
obj-$(CONFIG_PPC_EPAPR_HV_PIC) += ehv_pic.o
fsl-msi-obj-$(CONFIG_PCI_MSI) += fsl_msi.o
obj-$(CONFIG_PPC_MSI_BITMAP) += msi_bitmap.o
diff --git a/arch/powerpc/sysdev/mpic_timer.c b/arch/powerpc/sysdev/mpic_timer.c
new file mode 100644
index 0000000..ef0db4d
--- /dev/null
+++ b/arch/powerpc/sysdev/mpic_timer.c
@@ -0,0 +1,459 @@
+/*
+ * Copyright (c) 2012 Freescale Semiconductor, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <asm/io.h>
+#include <linux/mm.h>
+#include <linux/interrupt.h>
+#include <linux/slab.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+
+#include <sysdev/fsl_soc.h>
+#include <asm/mpic_timer.h>
+
+
+#define MPIC_TIMER_TCR_ROVR_OFFSET 24
+#define MPIC_TIMER_TCR_CLKDIV_64 0x00000300
+
+#define MPIC_TIMER_STOP 0x80000000
+#define MPIC_ALL_TIMER 4
+
+#define MAX_TIME (~0U>>1)
+#define MAX_TIME_CASCADE (~0U)
+
+#define TIMER_OFFSET(num) (1 << (MPIC_ALL_TIMER - 1 - num))
+#define ONE_SECOND 1000000
+
+struct timer_regs {
+ u32 gtccr;
+ u32 res0[3];
+ u32 gtbcr;
+ u32 res1[3];
+ u32 gtvpr;
+ u32 res2[3];
+ u32 gtdr;
+ u32 res3[3];
+};
+
+struct mpic_timer {
+ void *dev;
+ struct cascade_priv *cascade_handle;
+ unsigned int num;
+ int irq;
+};
+
+struct cascade_priv {
+ u32 tcr_value; /* TCR register: CASC & ROVR value */
+ unsigned int cascade_map; /* cascade map */
+ unsigned int timer_num; /* cascade control timer */
+};
+
+struct group_priv {
+ struct timer_regs __iomem *regs;
+ struct mpic_timer timer[MPIC_ALL_TIMER];
+ struct list_head node;
+ unsigned int idle;
+ spinlock_t lock;
+ void __iomem *group_tcr;
+};
+
+static struct cascade_priv cascade_timer[] = {
+ /* cascade timer 0 and 1 */
+ {0x1, 0xc, 0x1},
+ /* cascade timer 1 and 2 */
+ {0x2, 0x6, 0x2},
+ /* cascade timer 2 and 3 */
+ {0x4, 0x3, 0x3}
+};
+
+static u32 ccbfreq;
+static u64 max_value; /* prevent u64 overflow */
+static LIST_HEAD(group_list);
+
+/* the time set by the user is converted to "ticks" */
+static int transform_time(const struct timeval *time, int clkdiv, u64 *ticks)
+{
+ u64 tmp = 0;
+ u64 tmp_sec = 0;
+ u64 tmp_ms = 0;
+ u64 tmp_us = 0;
+ u32 div = 0;
+
+ if ((time->tv_sec + time->tv_usec) == 0 ||
+ time->tv_sec < 0 || time->tv_usec < 0)
+ return -EINVAL;
+
+ if (time->tv_usec >= ONE_SECOND)
+ return -EINVAL;
+
+ if (time->tv_sec > max_value ||
+ (time->tv_sec == max_value && time->tv_usec > 0))
+ return -EINVAL;
+
+ div = (1 << (clkdiv >> 8)) * 8;
+
+ tmp_sec = div_u64((u64)time->tv_sec * (u64)ccbfreq, div);
+ tmp += tmp_sec;
+
+ tmp_ms = time->tv_usec / 1000;
+ tmp_ms = div_u64((u64)tmp_ms * (u64)ccbfreq, div * 1000);
+ tmp += tmp_ms;
+
+ tmp_us = time->tv_usec % 1000;
+ tmp_us = div_u64((u64)tmp_us * (u64)ccbfreq, div * 1000000);
+ tmp += tmp_us;
+
+ *ticks = tmp;
+
+ return 0;
+}
+
+/* detect whether there is a cascade timer available */
+struct mpic_timer *detect_idle_cascade_timer(void)
+{
+ struct group_priv *priv;
+ struct cascade_priv *casc_priv;
+ unsigned int tmp;
+ unsigned int array_size = ARRAY_SIZE(cascade_timer);
+ unsigned int num;
+ unsigned int i;
+
+ list_for_each_entry(priv, &group_list, node) {
+ casc_priv = cascade_timer;
+
+ for (i = 0; i < array_size; i++) {
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->lock, flags);
+ tmp = casc_priv->cascade_map & priv->idle;
+ if (tmp == casc_priv->cascade_map) {
+ num = casc_priv->timer_num;
+ priv->timer[num].cascade_handle = casc_priv;
+
+ /* set timer busy */
+ priv->idle &= ~casc_priv->cascade_map;
+ spin_unlock_irqrestore(&priv->lock, flags);
+ return &priv->timer[num];
+ }
+ spin_unlock_irqrestore(&priv->lock, flags);
+ casc_priv++;
+ }
+ }
+
+ return NULL;
+}
+
+static int set_cascade_timer(struct group_priv *priv, u64 ticks,
+ unsigned int num)
+{
+ struct cascade_priv *casc_priv;
+ u32 tmp;
+ u32 tmp_ticks;
+ u32 rem_ticks;
+
+ /* set group tcr reg for cascade */
+ casc_priv = priv->timer[num].cascade_handle;
+ if (!casc_priv)
+ return -EINVAL;
+
+ tmp = casc_priv->tcr_value |
+ (casc_priv->tcr_value << MPIC_TIMER_TCR_ROVR_OFFSET);
+ setbits32(priv->group_tcr, tmp);
+
+ tmp_ticks = div_u64_rem(ticks, MAX_TIME_CASCADE, &rem_ticks);
+
+ out_be32(&priv->regs[num].gtccr, 0);
+ out_be32(&priv->regs[num].gtbcr, tmp_ticks | MPIC_TIMER_STOP);
+
+ out_be32(&priv->regs[num - 1].gtccr, 0);
+ out_be32(&priv->regs[num - 1].gtbcr, rem_ticks);
+
+ return 0;
+}
+
+struct mpic_timer *get_cascade_timer(u64 ticks)
+{
+ struct group_priv *priv = NULL;
+ struct mpic_timer *allocated_timer = NULL;
+
+ /* Two cascade timers: Support the maximum time */
+ const u64 max_ticks = (u64)MAX_TIME * (u64)MAX_TIME_CASCADE;
+ int ret;
+
+ if (ticks > max_ticks)
+ return NULL;
+
+ /* detect idle timer */
+ allocated_timer = detect_idle_cascade_timer();
+ if (!allocated_timer)
+ return NULL;
+
+ priv = container_of(allocated_timer, struct group_priv,
+ timer[allocated_timer->num]);
+
+ /* set ticks to timer */
+ ret = set_cascade_timer(priv, ticks, allocated_timer->num);
+ if (ret < 0)
+ return NULL;
+
+ return allocated_timer;
+}
+
+struct mpic_timer *get_timer(u64 ticks)
+{
+ struct group_priv *priv;
+ unsigned int num;
+ unsigned int i;
+
+ list_for_each_entry(priv, &group_list, node) {
+ for (i = 0; i < MPIC_ALL_TIMER; i++) {
+ unsigned long flags;
+
+ /* one timer: Reverse allocation */
+ num = MPIC_ALL_TIMER - 1 - i;
+
+ spin_lock_irqsave(&priv->lock, flags);
+ if (priv->idle & (1 << i)) {
+ /* set timer busy */
+ priv->idle &= ~(1 << i);
+ /* set ticks & stop timer */
+ out_be32(&priv->regs[num].gtbcr,
+ ticks | MPIC_TIMER_STOP);
+ out_be32(&priv->regs[num].gtccr, 0);
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+ priv->timer[num].cascade_handle = NULL;
+
+ return &priv->timer[num];
+ }
+ spin_unlock_irqrestore(&priv->lock, flags);
+ }
+ }
+
+ return NULL;
+}
+
+/**
+ * mpic_request_timer - get a hardware timer
+ * @fn: interrupt handler function
+ * @dev: data passed to the interrupt handler
+ * @time: time for timer
+ *
+ * This calls request_irq() for the allocated timer. Returns the
+ * timer handle on success, or NULL on failure.
+ */
+struct mpic_timer *mpic_request_timer(irq_handler_t fn, void *dev,
+ const struct timeval *time)
+{
+ struct mpic_timer *allocated_timer = NULL;
+ u64 ticks = 0;
+ int ret = 0;
+
+ if (list_empty(&group_list))
+ return NULL;
+
+ ret = transform_time(time, MPIC_TIMER_TCR_CLKDIV_64, &ticks);
+ if (ret < 0)
+ return NULL;
+
+ if (ticks > MAX_TIME)
+ allocated_timer = get_cascade_timer(ticks);
+ else
+ allocated_timer = get_timer(ticks);
+
+ if (!allocated_timer)
+ return NULL;
+
+ ret = request_irq(allocated_timer->irq, fn, IRQF_TRIGGER_LOW,
+ "mpic-global-timer", dev);
+ if (ret)
+ return NULL;
+
+ allocated_timer->dev = dev;
+
+ return allocated_timer;
+}
+EXPORT_SYMBOL(mpic_request_timer);
+
+/**
+ * mpic_start_timer - start hardware timer
+ * @handle: the timer to be started.
+ *
+ * It will do ->fn(->dev) callback from the hardware interrupt at
+ * the ->timeval point in the future.
+ */
+void mpic_start_timer(struct mpic_timer *handle)
+{
+ struct group_priv *priv = container_of(handle, struct group_priv,
+ timer[handle->num]);
+
+ clrbits32(&priv->regs[handle->num].gtbcr, MPIC_TIMER_STOP);
+}
+EXPORT_SYMBOL(mpic_start_timer);
+
+/**
+ * mpic_stop_timer - stop hardware timer
+ * @handle: the timer to be stopped
+ *
+ * The timer periodically generates an interrupt until the user stops it.
+ */
+void mpic_stop_timer(struct mpic_timer *handle)
+{
+ struct group_priv *priv = container_of(handle, struct group_priv,
+ timer[handle->num]);
+
+ setbits32(&priv->regs[handle->num].gtbcr, MPIC_TIMER_STOP);
+}
+EXPORT_SYMBOL(mpic_stop_timer);
+
+/**
+ * mpic_free_timer - free hardware timer
+ * @handle: the timer to be removed.
+ *
+ * Free the timer.
+ *
+ * Note: can not be used in interrupt context.
+ */
+void mpic_free_timer(struct mpic_timer *handle)
+{
+ struct group_priv *priv = container_of(handle, struct group_priv,
+ timer[handle->num]);
+
+ struct cascade_priv *casc_priv = NULL;
+ unsigned long flags;
+
+ mpic_stop_timer(handle);
+
+ casc_priv = priv->timer[handle->num].cascade_handle;
+
+ free_irq(priv->timer[handle->num].irq, priv->timer[handle->num].dev);
+
+ spin_lock_irqsave(&priv->lock, flags);
+ if (casc_priv) {
+ u32 tmp;
+ tmp = casc_priv->tcr_value | (casc_priv->tcr_value <<
+ MPIC_TIMER_TCR_ROVR_OFFSET);
+ clrbits32(priv->group_tcr, tmp);
+ priv->idle |= casc_priv->cascade_map;
+ priv->timer[handle->num].cascade_handle = NULL;
+ } else {
+ priv->idle |= 1 << (MPIC_ALL_TIMER - 1 - handle->num);
+ }
+ spin_unlock_irqrestore(&priv->lock, flags);
+}
+EXPORT_SYMBOL(mpic_free_timer);
+
+static void group_init(struct device_node *np)
+{
+ struct group_priv *priv = NULL;
+ const u32 all_timer[] = { 0, MPIC_ALL_TIMER };
+ const u32 *p;
+ u32 offset;
+ u32 count;
+
+ unsigned int i = 0;
+ unsigned int j = 0;
+ unsigned int irq_index = 0;
+ int irq = 0;
+ int len = 0;
+
+ priv = kzalloc(sizeof(struct group_priv), GFP_KERNEL);
+ if (!priv) {
+ pr_err("%s: cannot allocate memory for group.\n",
+ np->full_name);
+ return;
+ }
+
+ priv->regs = of_iomap(np, 0);
+ if (!priv->regs) {
+ pr_err("%s: cannot ioremap register address.\n",
+ np->full_name);
+ goto out;
+ }
+
+ priv->group_tcr = of_iomap(np, 1);
+ if (!priv->group_tcr) {
+ pr_err("%s: cannot ioremap tcr address.\n", np->full_name);
+ goto out;
+ }
+
+ /* Get irq numbers from dts */
+ p = of_get_property(np, "fsl,available-ranges", &len);
+ if (p && len % (2 * sizeof(u32)) != 0) {
+ pr_err("%s: malformed fsl,available-ranges property.\n",
+ np->full_name);
+ goto out;
+ }
+
+ if (!p) {
+ p = all_timer;
+ len = sizeof(all_timer);
+ }
+
+ len /= 2 * sizeof(u32);
+
+ for (i = 0; i < len; i++) {
+ offset = p[i * 2];
+ count = p[i * 2 + 1];
+ for (j = 0; j < count; j++) {
+ irq = irq_of_parse_and_map(np, irq_index);
+ if (!irq)
+ break;
+
+ /* Set timer idle */
+ priv->idle |= TIMER_OFFSET((offset + j));
+ priv->timer[offset + j].irq = irq;
+ priv->timer[offset + j].num = offset + j;
+ irq_index++;
+ }
+ }
+
+ /* Init lock */
+ spin_lock_init(&priv->lock);
+
+ /* Init timer hardware */
+ setbits32(priv->group_tcr, MPIC_TIMER_TCR_CLKDIV_64);
+
+ list_add_tail(&priv->node, &group_list);
+
+ return;
+out:
+ if (priv->group_tcr)
+ iounmap(priv->group_tcr);
+
+ if (priv->regs)
+ iounmap(priv->regs);
+
+ kfree(priv);
+}
+
+static int __init mpic_timer_init(void)
+{
+ struct device_node *np = NULL;
+
+ ccbfreq = fsl_get_sys_freq();
+ if (ccbfreq == 0) {
+ pr_err("mpic_timer: No bus frequency "
+ "in device tree.\n");
+ return -ENODEV;
+ }
+
+ max_value = div_u64(ULLONG_MAX, ccbfreq);
+
+ for_each_compatible_node(np, NULL, "fsl,mpic-global-timer")
+ group_init(np);
+
+ if (list_empty(&group_list))
+ return -ENODEV;
+
+ return 0;
+}
+arch_initcall(mpic_timer_init);
--
1.7.5.1
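
As a rough illustration of the arithmetic in transform_time() and
set_cascade_timer() above, here is a minimal userspace sketch. The helper
names (time_to_ticks, split_cascade) are made up for this example and are
not part of the patch; the divider decoding and the quotient/remainder
split are meant to follow the patch, but this is only a sketch under
those assumptions.

```c
#include <stdint.h>

#define CLKDIV_64        0x00000300u  /* same value as MPIC_TIMER_TCR_CLKDIV_64 */
#define MAX_TIME         (~0U >> 1)   /* largest count for a single timer */
#define MAX_TIME_CASCADE (~0U)        /* divisor used to split cascaded ticks */

/*
 * Convert seconds + microseconds to timer ticks, mirroring the three
 * partial sums in transform_time(): whole seconds, whole milliseconds,
 * and the remaining microseconds. Input validation is left out.
 */
static uint64_t time_to_ticks(uint64_t sec, uint64_t usec,
                              uint32_t ccbfreq, uint32_t clkdiv)
{
    uint64_t div = (uint64_t)(1u << (clkdiv >> 8)) * 8;  /* 0x300 -> 64 */
    uint64_t ticks;

    ticks  = (sec * ccbfreq) / div;
    ticks += ((usec / 1000) * ccbfreq) / (div * 1000);
    ticks += ((usec % 1000) * ccbfreq) / (div * 1000000);
    return ticks;
}

/* Nonzero if the tick count does not fit a single timer. */
static int needs_cascade(uint64_t ticks)
{
    return ticks > MAX_TIME;
}

/*
 * Split a tick count across two cascaded timers the way
 * set_cascade_timer() does: the quotient goes to timer "num",
 * the remainder to timer "num - 1".
 */
static void split_cascade(uint64_t ticks, uint32_t *quot, uint32_t *rem)
{
    *quot = (uint32_t)(ticks / MAX_TIME_CASCADE);
    *rem  = (uint32_t)(ticks % MAX_TIME_CASCADE);
}
```

For instance, with an assumed bus frequency of 400000000 Hz (a made-up
value), one second comes to 6250000 ticks. A 1000 second request comes to
6250000000 ticks, which exceeds MAX_TIME and so would take the cascade
path; split_cascade() then yields 1 for timer "num" and 1955032705 for
timer "num - 1".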
^ permalink raw reply related
* Re: [RFC PATCH v4 12/13] memory-hotplug : add node_device_release
From: Wen Congyang @ 2012-07-27 6:17 UTC (permalink / raw)
To: Yasuaki Ishimatsu
Cc: len.brown, linux-acpi, linux-kernel, linux-mm, paulus,
minchan.kim, kosaki.motohiro, rientjes, cl, linuxppc-dev, akpm,
liuj97
In-Reply-To: <50068D41.9090109@jp.fujitsu.com>
At 07/18/2012 06:17 PM, Yasuaki Ishimatsu Wrote:
> When calling unregister_node(), the function shows the following message at
> device_release().
>
> Device 'node2' does not have a release() function, it is broken and must be
> fixed.
>
> So the patch implements node_device_release()
>
> CC: David Rientjes <rientjes@google.com>
> CC: Jiang Liu <liuj97@gmail.com>
> CC: Len Brown <len.brown@intel.com>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Paul Mackerras <paulus@samba.org>
> CC: Christoph Lameter <cl@linux.com>
> Cc: Minchan Kim <minchan.kim@gmail.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> CC: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>
> ---
> drivers/base/node.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> Index: linux-3.5-rc6/drivers/base/node.c
> ===================================================================
> --- linux-3.5-rc6.orig/drivers/base/node.c 2012-07-18 18:24:29.191121066 +0900
> +++ linux-3.5-rc6/drivers/base/node.c 2012-07-18 18:25:47.111146983 +0900
> @@ -252,6 +252,12 @@ static inline void hugetlb_register_node
> static inline void hugetlb_unregister_node(struct node *node) {}
> #endif
>
> +static void node_device_release(struct device *dev)
> +{
> + struct node *node_dev = to_node(dev);
> +
> + memset(node_dev, 0, sizeof(struct node));
This line is wrong. node_dev->work_struct may be queued in a workqueue.
So, it is very dangerous to clear node_dev->work_struct here.
In my test, it caused a kernel panic.
Thanks
Wen Congyang
> +}
>
> /*
> * register_node - Setup a sysfs device for a node.
> @@ -265,6 +271,7 @@ int register_node(struct node *node, int
>
> node->dev.id = num;
> node->dev.bus = &node_subsys;
> + node->dev.release = node_device_release;
> error = device_register(&node->dev);
>
> if (!error){
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
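
The hazard described in the reply above (clearing a structure while an
embedded work item may still be queued) can be modeled in userspace. The
sketch below is a toy: struct toy_work, struct toy_node and the toy_*
helpers are invented for illustration and are not kernel APIs. It only
shows the ordering "drain pending work first, then clear", which is one
possible way to avoid the problem; the thread itself does not settle on
a fix.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a work item; not the kernel's struct work_struct. */
struct toy_work {
    void (*fn)(struct toy_work *w);
    int pending;
};

/* Toy stand-in for struct node, with an embedded work item. */
struct toy_node {
    int id;
    int runs;              /* how many times the work callback ran */
    struct toy_work work;
};

/* Work callback: recover the enclosing node and count the run. */
static void toy_node_work(struct toy_work *w)
{
    struct toy_node *n =
        (struct toy_node *)((char *)w - offsetof(struct toy_node, work));

    n->runs++;
}

/* Mark the work as queued; a real workqueue would run it later. */
static void toy_queue_work(struct toy_work *w)
{
    w->pending = 1;
}

/* Run the work now if it is still pending (toy analogue of a flush). */
static void toy_flush_work(struct toy_work *w)
{
    if (w->pending) {
        w->pending = 0;
        w->fn(w);
    }
}

/*
 * Release that drains pending work before wiping the structure.
 * Returns how many times the work ran before the wipe. Calling
 * memset() first would zero work.fn while the item is still queued.
 */
static int toy_node_release(struct toy_node *n)
{
    int runs;

    toy_flush_work(&n->work);
    runs = n->runs;
    memset(n, 0, sizeof(*n));
    return runs;
}
```

In this model, a node whose work was queued once reports one run from
toy_node_release(), and only afterwards is the callback pointer cleared.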
^ permalink raw reply
* Re: [PATCH] scsi/ibmvscsi: /sys/class/scsi_host/hostX/config doesn't show any information
From: Benjamin Herrenschmidt @ 2012-07-27 5:19 UTC (permalink / raw)
To: olaf; +Cc: linuxppc-dev, Linda Xie, linux-scsi, James E.J. Bottomley
In-Reply-To: <1342630157-16468-1-git-send-email-olaf@aepfle.de>
On Wed, 2012-07-18 at 18:49 +0200, olaf@aepfle.de wrote:
> From: Linda Xie <lxiep@us.ibm.com>
James, can I assume you're picking up those two ?
Cheers,
Ben.
> Expected result:
> It should show something like this:
> x1521p4:~ # cat /sys/class/scsi_host/host1/config
> PARTITIONNAME='x1521p4'
> NWSDNAME='X1521P4'
> HOSTNAME='X1521P4'
> DOMAINNAME='RCHLAND.IBM.COM'
> NAMESERVERS='9.10.244.100 9.10.244.200'
>
> Actual result:
> x1521p4:~ # cat /sys/class/scsi_host/host0/config
> x1521p4:~ #
>
> This patch changes the size of the buffer used for transferring config
> data to 4K. It was tested against 2.6.19-rc2 tree.
>
> Reported by IBM during SLES11 beta testing:
>
> https://bugzilla.novell.com/show_bug.cgi?id=439970
> LTC49349
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
>
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index e580aa4..1513ca8 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -93,6 +93,8 @@ static int max_requests = IBMVSCSI_MAX_REQUESTS_DEFAULT;
> static int max_events = IBMVSCSI_MAX_REQUESTS_DEFAULT + 2;
> static int fast_fail = 1;
> static int client_reserve = 1;
> +/* host data buffer size */
> +#define HOST_BUFFER_SIZE 4096
>
> static struct scsi_transport_template *ibmvscsi_transport_template;
>
> @@ -1666,7 +1668,7 @@ static ssize_t show_host_srp_version(struct device *dev,
> struct ibmvscsi_host_data *hostdata = shost_priv(shost);
> int len;
>
> - len = snprintf(buf, PAGE_SIZE, "%s\n",
> + len = snprintf(buf, HOST_BUFFER_SIZE, "%s\n",
> hostdata->madapter_info.srp_version);
> return len;
> }
> @@ -1687,7 +1689,7 @@ static ssize_t show_host_partition_name(struct device *dev,
> struct ibmvscsi_host_data *hostdata = shost_priv(shost);
> int len;
>
> - len = snprintf(buf, PAGE_SIZE, "%s\n",
> + len = snprintf(buf, HOST_BUFFER_SIZE, "%s\n",
> hostdata->madapter_info.partition_name);
> return len;
> }
> @@ -1708,7 +1710,7 @@ static ssize_t show_host_partition_number(struct device *dev,
> struct ibmvscsi_host_data *hostdata = shost_priv(shost);
> int len;
>
> - len = snprintf(buf, PAGE_SIZE, "%d\n",
> + len = snprintf(buf, HOST_BUFFER_SIZE, "%d\n",
> hostdata->madapter_info.partition_number);
> return len;
> }
> @@ -1728,7 +1730,7 @@ static ssize_t show_host_mad_version(struct device *dev,
> struct ibmvscsi_host_data *hostdata = shost_priv(shost);
> int len;
>
> - len = snprintf(buf, PAGE_SIZE, "%d\n",
> + len = snprintf(buf, HOST_BUFFER_SIZE, "%d\n",
> hostdata->madapter_info.mad_version);
> return len;
> }
> @@ -1748,7 +1750,7 @@ static ssize_t show_host_os_type(struct device *dev,
> struct ibmvscsi_host_data *hostdata = shost_priv(shost);
> int len;
>
> - len = snprintf(buf, PAGE_SIZE, "%d\n", hostdata->madapter_info.os_type);
> + len = snprintf(buf, HOST_BUFFER_SIZE, "%d\n", hostdata->madapter_info.os_type);
> return len;
> }
>
> @@ -1767,7 +1769,7 @@ static ssize_t show_host_config(struct device *dev,
> struct ibmvscsi_host_data *hostdata = shost_priv(shost);
>
> /* returns null-terminated host config data */
> - if (ibmvscsi_do_host_config(hostdata, buf, PAGE_SIZE) == 0)
> + if (ibmvscsi_do_host_config(hostdata, buf, HOST_BUFFER_SIZE) == 0)
> return strlen(buf);
> else
> return 0;
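
The quoted patch does not say why PAGE_SIZE fails where 4096 works. One
possible explanation, which is purely an assumption here and is not
stated anywhere in this thread, is that the size ends up in a 16-bit
length field and PAGE_SIZE is 65536 on the affected configuration. The
sketch below uses a hypothetical struct toy_request to show what that
truncation would look like; it is not the driver's actual structure.

```c
#include <stdint.h>

/* Hypothetical request header with a 16-bit length field (an assumption). */
struct toy_request {
    uint16_t length;
};

/*
 * Store a buffer size in the 16-bit field and return what the field
 * actually holds afterwards. Sizes of 65536 or more wrap modulo 65536.
 */
static unsigned int toy_set_length(struct toy_request *req, unsigned int size)
{
    req->length = (uint16_t)size;
    return req->length;
}
```

Under that assumption, a size of 4096 is stored unchanged, while a size
of 65536 is stored as 0, which would match an empty config output.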
^ permalink raw reply
* [PATCH v2] powerpc: powernv: Always go into nap mode when CPU is offline
From: Paul Mackerras @ 2012-07-27 4:51 UTC (permalink / raw)
To: Benjamin Herrenschmidt, linuxppc-dev; +Cc: Alexander Graf
In-Reply-To: <20120726235347.GA16461@bloggs.ozlabs.ibm.com>
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
v2: rediffed against current Linus tree to cope with minor context changes
arch/powerpc/include/asm/processor.h | 1 +
arch/powerpc/kernel/idle_power7.S | 2 ++
arch/powerpc/platforms/powernv/smp.c | 10 +---------
3 files changed, 4 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 53b6dfa..54b73a2 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -386,6 +386,7 @@ extern unsigned long cpuidle_disable;
enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
extern int powersave_nap; /* set if nap mode can be used in idle loop */
+extern void power7_nap(void);
#ifdef CONFIG_PSERIES_IDLE
extern void update_smt_snooze_delay(int snooze);
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index 7140d83..e11863f 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -28,7 +28,9 @@ _GLOBAL(power7_idle)
lwz r4,ADDROFF(powersave_nap)(r3)
cmpwi 0,r4,0
beqlr
+ /* fall through */
+_GLOBAL(power7_nap)
/* NAP is a state loss, we create a regs frame on the
* stack, fill it up with the state we care about and
* stick a pointer to it in PACAR1. We really only
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 3ef4625..7698b6e 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -106,14 +106,6 @@ static void pnv_smp_cpu_kill_self(void)
{
unsigned int cpu;
- /* If powersave_nap is enabled, use NAP mode, else just
- * spin aimlessly
- */
- if (!powersave_nap) {
- generic_mach_cpu_die();
- return;
- }
-
/* Standard hot unplug procedure */
local_irq_disable();
idle_task_exit();
@@ -128,7 +120,7 @@ static void pnv_smp_cpu_kill_self(void)
*/
mtspr(SPRN_LPCR, mfspr(SPRN_LPCR) & ~(u64)LPCR_PECE1);
while (!generic_check_cpu_restart(cpu)) {
- power7_idle();
+ power7_nap();
if (!generic_check_cpu_restart(cpu)) {
DBG("CPU%d Unexpected exit while offline !\n", cpu);
/* We may be getting an IPI, so we re-enable
--
1.7.10
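
The control flow this patch sets up in idle_power7.S can be modeled in
plain C. This is a toy model only: model_power7_idle(), model_power7_nap()
and nap_count are invented names, and the real routines are assembly that
actually enters nap mode. The sketch shows just the entry-point logic:
the idle entry returns early when powersave_nap is clear, while the new
nap entry skips that check.

```c
/* Models the kernel variable of the same name. */
static int powersave_nap;

/* Counts how often the model "enters nap". */
static int nap_count;

/* New entry point: always naps, regardless of powersave_nap. */
static void model_power7_nap(void)
{
    nap_count++;
}

/* Existing entry point: naps only if powersave_nap is set. */
static void model_power7_idle(void)
{
    if (!powersave_nap)
        return;             /* corresponds to the beqlr */
    model_power7_nap();     /* corresponds to the fall through */
}
```

With powersave_nap clear, calling the idle entry leaves nap_count
unchanged, whereas the nap entry increments it. That is the property the
offline-CPU loop in pnv_smp_cpu_kill_self() relies on after the change.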
^ permalink raw reply related
* [git pull] Please pull powerpc.git merge branch
From: Benjamin Herrenschmidt @ 2012-07-27 4:37 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linuxppc-dev list, Andrew Morton, Linux Kernel list
Hi Linus !
Here's a handful of powerpc patches, a couple of regression fixes
for problems introduced in the main batch in this merge window,
a couple of defconfig updates, and some trivials. The radeonfb
one is something that was long-standing in SLES which I forgot
to pick up earlier.
Cheers,
Ben.
The following changes since commit bdc0077af574800d24318b6945cf2344e8dbb050:
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi (2012-07-24 18:11:22 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge
for you to fetch changes up to bac821a6e3404330d509fd3a245bf7701f210c7c:
powerpc/ftrace: Trace function graph entry before updating index (2012-07-27 11:42:34 +1000)
----------------------------------------------------------------
Alexander Graf (1):
powerpc/kvm/bookehv: Fix build regression
Anton Blanchard (2):
powerpc: Enable pseries hardware RNG and crypto modules
powerpc: Lack of firmware flash support is not an error
Benjamin Herrenschmidt (1):
powerpc: Update g5_defconfig
Steven Rostedt (1):
powerpc/ftrace: Trace function graph entry before updating index
Stuart Yoder (1):
powerpc: Set stack limit properly in crit_transfer_to_handler
Tony Breeds (1):
radeonfb: Add quirk for the graphics adapter in some JSxx
arch/powerpc/configs/g5_defconfig | 103 ++++++++++----------------------
arch/powerpc/configs/ppc64_defconfig | 6 +-
arch/powerpc/configs/pseries_defconfig | 6 +-
arch/powerpc/kernel/entry_32.S | 12 +++-
arch/powerpc/kernel/ftrace.c | 11 ++--
arch/powerpc/kernel/rtas_flash.c | 2 +-
arch/powerpc/kvm/bookehv_interrupts.S | 77 ++++++++++++------------
drivers/video/aty/radeon_monitor.c | 35 +++++++++++
8 files changed, 128 insertions(+), 124 deletions(-)
^ permalink raw reply