* [PATCH 3/7] powerpc/kprobes: Fix ABIv2 issues with kprobe_lookup_name
From: Anton Blanchard @ 2014-04-04 6:09 UTC (permalink / raw)
To: benh, paulus, rusty, ulrich.weigand, amodra, mikey, mjw, rostedt,
philippe.bergheaud
Cc: linuxppc-dev
In-Reply-To: <1396591750-8203-1-git-send-email-anton@samba.org>
Use ppc_function_entry in places where we previously assumed
function descriptors exist.
Signed-off-by: Anton Blanchard <anton@samba.org>
---
arch/powerpc/include/asm/kprobes.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index 7b6feab..af15d4d 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -30,6 +30,7 @@
#include <linux/ptrace.h>
#include <linux/percpu.h>
#include <asm/probes.h>
+#include <asm/code-patching.h>
#define __ARCH_WANT_KPROBES_INSN_SLOT
@@ -56,9 +57,9 @@ typedef ppc_opcode_t kprobe_opcode_t;
if ((colon = strchr(name, ':')) != NULL) { \
colon++; \
if (*colon != '\0' && *colon != '.') \
- addr = *(kprobe_opcode_t **)addr; \
+ addr = (kprobe_opcode_t *)ppc_function_entry(addr); \
} else if (name[0] != '.') \
- addr = *(kprobe_opcode_t **)addr; \
+ addr = (kprobe_opcode_t *)ppc_function_entry(addr); \
} else { \
char dot_name[KSYM_NAME_LEN]; \
dot_name[0] = '.'; \
--
1.8.3.2
^ permalink raw reply related
* [PATCH 2/7] powerpc: ftrace_caller, _mcount is exported to modules so needs _GLOBAL_TOC()
From: Anton Blanchard @ 2014-04-04 6:09 UTC (permalink / raw)
To: benh, paulus, rusty, ulrich.weigand, amodra, mikey, mjw, rostedt,
philippe.bergheaud
Cc: linuxppc-dev
In-Reply-To: <1396591750-8203-1-git-send-email-anton@samba.org>
When testing the ftrace function tracer, I realised that ftrace_caller
and mcount are called from modules and they both call into C, therefore
they need the ABIv2 global entry point to establish r2.
Signed-off-by: Anton Blanchard <anton@samba.org>
---
arch/powerpc/kernel/entry_64.S | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index cf4f6e6..9fde8a1 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -1175,7 +1175,7 @@ _GLOBAL(mcount)
_GLOBAL(_mcount)
blr
-_GLOBAL(ftrace_caller)
+_GLOBAL_TOC(ftrace_caller)
/* Taken from output of objdump from lib64/glibc */
mflr r3
ld r11, 0(r1)
@@ -1199,10 +1199,7 @@ _GLOBAL(ftrace_graph_stub)
_GLOBAL(ftrace_stub)
blr
#else
-_GLOBAL(mcount)
- blr
-
-_GLOBAL(_mcount)
+_GLOBAL_TOC(_mcount)
/* Taken from output of objdump from lib64/glibc */
mflr r3
ld r11, 0(r1)
--
1.8.3.2
^ permalink raw reply related
* [PATCH 1/7] powerpc: Add _GLOBAL_TOC for ABIv2 assembly functions exported to modules
From: Anton Blanchard @ 2014-04-04 6:09 UTC (permalink / raw)
To: benh, paulus, rusty, ulrich.weigand, amodra, mikey, mjw, rostedt,
philippe.bergheaud
Cc: linuxppc-dev
In-Reply-To: <1396591750-8203-1-git-send-email-anton@samba.org>
If an assembly function that calls back into c code is exported to
modules, we need to ensure r2 is setup correctly. There are only
two places crazy enough to do it (two of which are my fault).
Signed-off-by: Anton Blanchard <anton@samba.org>
---
arch/powerpc/include/asm/ppc_asm.h | 12 ++++++++++++
arch/powerpc/lib/copyuser_64.S | 2 +-
arch/powerpc/lib/memcpy_64.S | 2 +-
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 2cc2511..6400f18 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -207,6 +207,16 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
.globl name; \
name:
+#define _GLOBAL_TOC(name) \
+ .section ".text"; \
+ .align 2 ; \
+ .type name,@function; \
+ .globl name; \
+name: \
+0: addis r2,r12,(.TOC.-0b)@ha; \
+ addi r2,r2,(.TOC.-0b)@l; \
+ .localentry name,.-name
+
#define _KPROBE(name) \
.section ".kprobes.text","a"; \
.align 2 ; \
@@ -235,6 +245,8 @@ name: \
.type GLUE(.,name),@function; \
GLUE(.,name):
+#define _GLOBAL_TOC(name) _GLOBAL(name)
+
#define _KPROBE(name) \
.section ".kprobes.text","a"; \
.align 2 ; \
diff --git a/arch/powerpc/lib/copyuser_64.S b/arch/powerpc/lib/copyuser_64.S
index 596a285..0860ee4 100644
--- a/arch/powerpc/lib/copyuser_64.S
+++ b/arch/powerpc/lib/copyuser_64.S
@@ -18,7 +18,7 @@
#endif
.align 7
-_GLOBAL(__copy_tofrom_user)
+_GLOBAL_TOC(__copy_tofrom_user)
BEGIN_FTR_SECTION
nop
FTR_SECTION_ELSE
diff --git a/arch/powerpc/lib/memcpy_64.S b/arch/powerpc/lib/memcpy_64.S
index 9d3960c..bc9a2ca 100644
--- a/arch/powerpc/lib/memcpy_64.S
+++ b/arch/powerpc/lib/memcpy_64.S
@@ -10,7 +10,7 @@
#include <asm/ppc_asm.h>
.align 7
-_GLOBAL(memcpy)
+_GLOBAL_TOC(memcpy)
BEGIN_FTR_SECTION
std r3,-STACKFRAMESIZE+STK_REG(R31)(r1) /* save destination pointer for return value */
FTR_SECTION_ELSE
--
1.8.3.2
^ permalink raw reply related
* [PATCH 0/7] Build ppc64le kernel using ABIv2, supplemental patches
From: Anton Blanchard @ 2014-04-04 6:09 UTC (permalink / raw)
To: benh, paulus, rusty, ulrich.weigand, amodra, mikey, mjw, rostedt,
philippe.bergheaud
Cc: linuxppc-dev
These patches apply against my last series and fix all known
ABIv2 issues.
To stress the module loader and dynamic ftrace code, I built an
allmodconfig kernel and inserted every module I could. I found a bunch
of bugs in the modules themselves, but in the end I managed to
get quite a few modules to load:
# cat /proc/modules | wc -l
3830
It survived a fair bit of poking (enabling/disabling function tracers
etc).
Anton Blanchard (7):
powerpc: Add _GLOBAL_TOC for ABIv2 assembly functions exported to
modules
powerpc: ftrace_caller, _mcount is exported to modules so needs
_GLOBAL_TOC()
powerpc/kprobes: Fix ABIv2 issues with kprobe_lookup_name
powerpc/modules: Create is_module_trampoline()
powerpc/modules: Create module_trampoline_target()
powerpc/ftrace: Use module loader helpers to parse trampolines
powerpc/ftrace: Fix ABIv2 issues with __ftrace_make_call
arch/powerpc/include/asm/kprobes.h | 5 +-
arch/powerpc/include/asm/module.h | 3 +
arch/powerpc/include/asm/ppc_asm.h | 12 ++++
arch/powerpc/kernel/entry_64.S | 7 +-
arch/powerpc/kernel/ftrace.c | 137 +++++++++++--------------------------
arch/powerpc/kernel/module_64.c | 80 ++++++++++++++++++++--
arch/powerpc/lib/copyuser_64.S | 2 +-
arch/powerpc/lib/memcpy_64.S | 2 +-
8 files changed, 136 insertions(+), 112 deletions(-)
--
1.8.3.2
^ permalink raw reply
* [PATCH] powerpc/powernv: Add debugfs file to grab opalv3 trace data
From: Benjamin Herrenschmidt @ 2014-04-04 5:27 UTC (permalink / raw)
To: linuxppc-dev list
From: Rusty Russell <rusty@rustcorp.com.au>
This adds files in debugfs that can be used to retrieve the
OPALv3 firmware "live binary traces" which can then be parsed
using a userspace tool.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
arch/powerpc/platforms/powernv/Makefile | 2 +-
arch/powerpc/platforms/powernv/opal-trace-types.h | 58 +++++++
arch/powerpc/platforms/powernv/opal-trace.c | 183 ++++++++++++++++++++++
3 files changed, 242 insertions(+), 1 deletion(-)
create mode 100644 arch/powerpc/platforms/powernv/opal-trace-types.h
create mode 100644 arch/powerpc/platforms/powernv/opal-trace.c
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index f5d4149..e34a28d 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -2,7 +2,7 @@ obj-y += setup.o opal-takeover.o opal-wrappers.o opal.o opal-async.o
obj-y += opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o opal-sysparam.o
obj-y += rng.o opal-dump.o opal-elog.o opal-sensor.o opal-msglog.o
obj-y += subcore.o subcore-asm.o
-
+obj-$(CONFIG_DEBUG_FS) += opal-trace.o
obj-$(CONFIG_SMP) += smp.o
obj-$(CONFIG_PCI) += pci.o pci-p5ioc2.o pci-ioda.o
obj-$(CONFIG_EEH) += eeh-ioda.o eeh-powernv.o
diff --git a/arch/powerpc/platforms/powernv/opal-trace-types.h b/arch/powerpc/platforms/powernv/opal-trace-types.h
new file mode 100644
index 0000000..e9816d4
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/opal-trace-types.h
@@ -0,0 +1,58 @@
+/* API for kernel to read trace buffer. */
+#ifndef __OPAL_TRACE_TYPES_H
+#define __OPAL_TRACE_TYPES_H
+
+#define TRACE_REPEAT 1
+#define TRACE_OVERFLOW 2
+#define TRACE_OPAL 3
+#define TRACE_FSP 4
+
+/* One per cpu, plus one for NMIs */
+struct tracebuf {
+ /* Mask to apply to get buffer offset. */
+ u64 mask;
+ /* This where the buffer starts. */
+ u64 start;
+ /* This is where writer has written to. */
+ u64 end;
+ /* This is where the writer wrote to previously. */
+ u64 last;
+ /* This is where the reader is up to. */
+ u64 rpos;
+ /* If the last one we read was a repeat, this shows how many. */
+ u32 last_repeat;
+ /* Maximum possible size of a record. */
+ u32 max_size;
+
+ char buf[/* TBUF_SZ + max_size */];
+};
+
+/* Common header for all trace entries. */
+struct trace_hdr {
+ u64 timestamp;
+ u8 type;
+ u8 len_div_8;
+ u16 cpu;
+ u8 unused[4];
+};
+
+/* Note: all other entries must be at least as large as this! */
+struct trace_repeat {
+ u64 timestamp; /* Last repeat happened at this timestamp */
+ u8 type; /* == TRACE_REPEAT */
+ u8 len_div_8;
+ u16 cpu;
+ u16 prev_len;
+ u16 num; /* Starts at 1, ie. 1 repeat, or two traces. */
+ /* Note that the count can be one short, if read races a repeat. */
+};
+
+struct trace_overflow {
+ u64 unused64; /* Timestamp is unused */
+ u8 type; /* == TRACE_OVERFLOW */
+ u8 len_div_8;
+ u8 unused[6]; /* ie. hdr.cpu is indeterminate */
+ u64 bytes_missed;
+};
+
+#endif /* __OPAL_TRACE_TYPES_H */
diff --git a/arch/powerpc/platforms/powernv/opal-trace.c b/arch/powerpc/platforms/powernv/opal-trace.c
new file mode 100644
index 0000000..e445528
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/opal-trace.c
@@ -0,0 +1,183 @@
+/*
+ * Copyright (C) 2013 Rusty Russell, IBM Corporation
+ *
+ * Simple debugfs file firmware_trace to read out OPALv3 trace
+ * ringbuffers.
+ */
+#include <linux/mutex.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+#include <linux/of.h>
+#include <linux/slab.h>
+#include <asm/debug.h>
+
+#include "opal-trace-types.h"
+
+static DEFINE_MUTEX(tracelock);
+static struct tracebuf **tb;
+static size_t num_tb;
+
+/* Maximum possible size of record (since len is 8 bits). */
+union max_trace {
+ struct trace_hdr hdr;
+ struct trace_overflow overflow;
+ struct trace_repeat repeat;
+ char buf[255 * 8];
+};
+static union max_trace trace;
+
+static bool trace_empty(const struct tracebuf *tb)
+{
+ const struct trace_repeat *rep;
+
+ if (tb->rpos == tb->end)
+ return true;
+
+ /*
+ * If we have a single element only, and it's a repeat buffer
+ * we've already seen every repeat for (yet which may be
+ * incremented in future), we're also empty.
+ */
+ rep = (void *)tb->buf + (tb->rpos & tb->mask);
+ if (tb->end != tb->rpos + sizeof(*rep))
+ return false;
+
+ if (rep->type != TRACE_REPEAT)
+ return false;
+
+ if (rep->num != tb->last_repeat)
+ return false;
+
+ return true;
+}
+
+/* You can't read in parallel, so some locking required in caller. */
+static bool trace_get(union max_trace *t, struct tracebuf *tb)
+{
+ u64 start;
+
+ if (trace_empty(tb))
+ return false;
+
+again:
+ /*
+ * The actual buffer is slightly larger than tbsize, so this
+ * memcpy is always valid.
+ */
+ memcpy(t, tb->buf + (tb->rpos & tb->mask), tb->max_size);
+
+ rmb(); /* read barrier, so we read tb->start after copying record. */
+
+ start = tb->start;
+
+ /* Now, was that overwritten? */
+ if (tb->rpos < start) {
+ /* Create overflow record. */
+ t->overflow.unused64 = 0;
+ t->overflow.type = TRACE_OVERFLOW;
+ t->overflow.len_div_8 = sizeof(t->overflow) / 8;
+ t->overflow.bytes_missed = start - tb->rpos;
+ tb->rpos += t->overflow.bytes_missed;
+ return true;
+ }
+
+ /* Repeat entries need special handling */
+ if (t->hdr.type == TRACE_REPEAT) {
+ u32 num = t->repeat.num;
+
+ /* In case we've read some already... */
+ t->repeat.num -= tb->last_repeat;
+
+ /* Record how many repeats we saw this time. */
+ tb->last_repeat = num;
+
+ /* Don't report an empty repeat buffer. */
+ if (t->repeat.num == 0) {
+ /*
+ * This can't be the last buffer, otherwise
+ * trace_empty would have returned true.
+ */
+ BUG_ON(tb->end <= tb->rpos + t->hdr.len_div_8 * 8);
+ /* Skip to next entry. */
+ tb->rpos += t->hdr.len_div_8 * 8;
+ goto again;
+ }
+ } else {
+ tb->last_repeat = 0;
+ tb->rpos += t->hdr.len_div_8 * 8;
+ }
+
+ return true;
+}
+
+/* Horrible polling interface, designed for dumping. */
+static ssize_t read_opal_trace(struct file *file, char __user *ubuf,
+ size_t count, loff_t *ppos)
+{
+ ssize_t err;
+ unsigned int i;
+
+ err = mutex_lock_interruptible(&tracelock);
+ if (err)
+ return err;
+
+ for (i = 0; i < num_tb; i++) {
+ if (trace_get(&trace, tb[i])) {
+ size_t len = trace.hdr.len_div_8 * 8;
+ if (len > count)
+ len = count;
+ if (copy_to_user(ubuf, &trace, len) != 0)
+ err = -EFAULT;
+ else
+ err = len;
+ break;
+ }
+ }
+
+ mutex_unlock(&tracelock);
+ return err;
+}
+
+static const struct file_operations opal_trace_fops = {
+ .read = read_opal_trace,
+ .open = simple_open,
+};
+
+static int opal_trace_init(void)
+{
+ struct device_node *dn;
+ const u64 *reg;
+ int len, i;
+
+ dn = of_find_node_by_name(NULL, "ibm,trace");
+ if (!dn)
+ return -ENODEV;
+
+ reg = of_get_property(dn, "reg", &len);
+ if (!reg) {
+ pr_warning("%s: OF node property %s::reg not found\n",
+ __func__, dn->full_name);
+ goto fail;
+ }
+
+ num_tb = len / (sizeof(u64) * 2);
+ if (!num_tb) {
+ pr_warning("%s: OF node property %s::reg invalid length %i\n",
+ __func__, dn->full_name, len);
+ goto fail;
+ }
+ tb = kmalloc(sizeof(*tb) * num_tb, GFP_KERNEL);
+ for (i = 0; i < num_tb; i++)
+ tb[i] = __va(be64_to_cpu(reg[i*2]));
+
+ debugfs_create_file("opal-trace", 0400, powerpc_debugfs_root,
+ NULL, &opal_trace_fops);
+ of_node_put(dn);
+ return 0;
+
+fail:
+ of_node_put(dn);
+ return -EINVAL;
+}
+module_init(opal_trace_init);
+
^ permalink raw reply related
* Re: [PATCH] power, sched: stop updating inside arch_update_cpu_topology() when nothing to be update
From: Michael wang @ 2014-04-04 3:48 UTC (permalink / raw)
To: Srivatsa S. Bhat
Cc: sfr, Srikar Dronamraju, LKML, jlarrew, paulus, alistair, nfont,
Andrew Morton, rcj, linuxppc-dev
In-Reply-To: <533D20E1.4000008@linux.vnet.ibm.com>
Hi, Srivatsa
Thanks for your reply :)
On 04/03/2014 04:50 PM, Srivatsa S. Bhat wrote:
[snip]
>
> Now, the interesting thing to note here is that, if CPU0's node was already
> set as node0, *nothing* should go wrong, since its just a redundant update.
> However, if CPU0's original node mapping was something different, or if
> node0 doesn't even exist in the machine, then the system can crash.
By printk I confirmed all cpus was belong to node 1 at very beginning,
and things become magically after the wrong updating...
>
> Have you verified that CPU0's node mapping is different from node 0?
> That is, boot the kernel with "numa=debug" in the kernel command line and
> it will print out the cpu-to-node associativity during boot. That way you
> can figure out what was the original associativity that was set. This will
> confirm the theory that the hypervisor sent a redundant update, but because
> of the weird pre-allocation using kzalloc that we do inside
> arch_update_cpu_topology(), we wrongly updated CPU0's mapping as CPU0 <-> Node0.
Associativity should changes, otherwise we won't continue the updating,
and empty updates[] was confirmed to show up inside
arch_update_cpu_topology().
What I can't make sure is whether this is legal, notify changes but no
changes happen sounds weird...however, even if it's legal, a check in
here still make sense IMHO.
Regards,
Michael Wang
>
>
> Regards,
> Srivatsa S. Bhat
>
>> Thus we should stop the updating in such cases, this patch will achieve
>> this and fix the issue.
>>
>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> CC: Paul Mackerras <paulus@samba.org>
>> CC: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>> CC: Stephen Rothwell <sfr@canb.auug.org.au>
>> CC: Andrew Morton <akpm@linux-foundation.org>
>> CC: Robert Jennings <rcj@linux.vnet.ibm.com>
>> CC: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
>> CC: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
>> CC: Alistair Popple <alistair@popple.id.au>
>> Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/mm/numa.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index 30a42e2..6757690 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -1591,6 +1591,14 @@ int arch_update_cpu_topology(void)
>> cpu = cpu_last_thread_sibling(cpu);
>> }
>>
>> + /*
>> + * The 'cpu_associativity_changes_mask' could be cleared if
>> + * all the cpus it indicates won't change their node, in
>> + * which case the 'updated_cpus' will be empty.
>> + */
>> + if (!cpumask_weight(&updated_cpus))
>> + goto out;
>> +
>> stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
>>
>> /*
>> @@ -1612,6 +1620,7 @@ int arch_update_cpu_topology(void)
>> changed = 1;
>> }
>>
>> +out:
>> kfree(updates);
>> return changed;
>> }
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply
* [PATCH v2 8/8] DMA: Freescale: add suspend resume functions for DMA driver
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
This patch adds suspend resume functions for Freescale DMA driver.
.prepare callback is used to stop further descriptors from being added into the
pending queue, and also issue pending queues into execution if there is any.
.suspend callback makes sure all the pending jobs are cleaned up and all the
channels are idle, and save the mode registers.
.resume callback re-initializes the channels by restore the mode registers.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
---
drivers/dma/fsldma.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++
drivers/dma/fsldma.h | 16 ++++++++
2 files changed, 115 insertions(+)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index c9bf54a..91482d2 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -400,6 +400,14 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
spin_lock_bh(&chan->desc_lock);
+#ifdef CONFIG_PM
+ if (unlikely(chan->pm_state != RUNNING)) {
+ chan_dbg(chan, "cannot submit due to suspend\n");
+ spin_unlock_bh(&chan->desc_lock);
+ return -1;
+ }
+#endif
+
/*
* assign cookies to all of the software descriptors
* that make up this transaction
@@ -1311,6 +1319,9 @@ static int fsl_dma_chan_probe(struct fsldma_device *fdev,
INIT_LIST_HEAD(&chan->ld_running);
INIT_LIST_HEAD(&chan->ld_completed);
chan->idle = true;
+#ifdef CONFIG_PM
+ chan->pm_state = RUNNING;
+#endif
chan->common.device = &fdev->common;
dma_cookie_init(&chan->common);
@@ -1450,6 +1461,91 @@ static int fsldma_of_remove(struct platform_device *op)
return 0;
}
+#ifdef CONFIG_PM
+static int fsldma_prepare(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct fsldma_device *fdev = platform_get_drvdata(pdev);
+ struct fsldma_chan *chan;
+ int i;
+
+ for (i = 0; i < FSL_DMA_MAX_CHANS_PER_DEVICE; i++) {
+ chan = fdev->chan[i];
+ if (!chan)
+ continue;
+
+ spin_lock_bh(&chan->desc_lock);
+ chan->pm_state = SUSPENDING;
+ if (!list_empty(&chan->ld_pending))
+ fsl_chan_xfer_ld_queue(chan);
+ spin_unlock_bh(&chan->desc_lock);
+ }
+
+ return 0;
+}
+
+static int fsldma_suspend(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct fsldma_device *fdev = platform_get_drvdata(pdev);
+ struct fsldma_chan *chan;
+ int i;
+
+ for (i = 0; i < FSL_DMA_MAX_CHANS_PER_DEVICE; i++) {
+ chan = fdev->chan[i];
+ if (!chan)
+ continue;
+
+ spin_lock_bh(&chan->desc_lock);
+ if (!chan->idle)
+ goto out;
+ chan->regs_save.mr = DMA_IN(chan, &chan->regs->mr, 32);
+ chan->pm_state = SUSPENDED;
+ spin_unlock_bh(&chan->desc_lock);
+ }
+ return 0;
+
+out:
+ for (; i >= 0; i--) {
+ chan = fdev->chan[i];
+ if (!chan)
+ continue;
+ spin_unlock_bh(&chan->desc_lock);
+ }
+ return -EBUSY;
+}
+
+static int fsldma_resume(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct fsldma_device *fdev = platform_get_drvdata(pdev);
+ struct fsldma_chan *chan;
+ u32 mode;
+ int i;
+
+ for (i = 0; i < FSL_DMA_MAX_CHANS_PER_DEVICE; i++) {
+ chan = fdev->chan[i];
+ if (!chan)
+ continue;
+
+ spin_lock_bh(&chan->desc_lock);
+ mode = chan->regs_save.mr
+ & ~FSL_DMA_MR_CS & ~FSL_DMA_MR_CC & ~FSL_DMA_MR_CA;
+ DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ chan->pm_state = RUNNING;
+ spin_unlock_bh(&chan->desc_lock);
+ }
+
+ return 0;
+}
+
+static const struct dev_pm_ops fsldma_pm_ops = {
+ .prepare = fsldma_prepare,
+ .suspend = fsldma_suspend,
+ .resume = fsldma_resume,
+};
+#endif
+
static const struct of_device_id fsldma_of_ids[] = {
{ .compatible = "fsl,elo3-dma", },
{ .compatible = "fsl,eloplus-dma", },
@@ -1462,6 +1558,9 @@ static struct platform_driver fsldma_of_driver = {
.name = "fsl-elo-dma",
.owner = THIS_MODULE,
.of_match_table = fsldma_of_ids,
+#ifdef CONFIG_PM
+ .pm = &fsldma_pm_ops,
+#endif
},
.probe = fsldma_of_probe,
.remove = fsldma_of_remove,
diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index ec19517..eecaf9e 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -134,6 +134,18 @@ struct fsldma_device {
#define FSL_DMA_CHAN_PAUSE_EXT 0x00001000
#define FSL_DMA_CHAN_START_EXT 0x00002000
+#ifdef CONFIG_PM
+struct fsldma_chan_regs_save {
+ u32 mr;
+};
+
+enum fsldma_pm_state {
+ RUNNING = 0,
+ SUSPENDING,
+ SUSPENDED,
+};
+#endif
+
struct fsldma_chan {
char name[8]; /* Channel name */
struct fsldma_chan_regs __iomem *regs;
@@ -161,6 +173,10 @@ struct fsldma_chan {
struct tasklet_struct tasklet;
u32 feature;
bool idle; /* DMA controller is idle */
+#ifdef CONFIG_PM
+ struct fsldma_chan_regs_save regs_save;
+ enum fsldma_pm_state pm_state;
+#endif
void (*toggle_ext_pause)(struct fsldma_chan *fsl_chan, int enable);
void (*toggle_ext_start)(struct fsldma_chan *fsl_chan, int enable);
--
1.7.9.5
^ permalink raw reply related
* [PATCH v2 7/8] DMA: Freescale: use spin_lock_bh instead of spin_lock_irqsave
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
The usage of spin_lock_irqsave() is a stronger locking mechanism than is
required throughout the driver. The minimum locking required should be used
instead. Interrupts will be turned off and context will be saved, it is
unnecessary to use irqsave.
This patch changes all instances of spin_lock_irqsave() to spin_lock_bh(). All
manipulation of protected fields is done using tasklet context or weaker, which
makes spin_lock_bh() the correct choice.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 25 ++++++++++---------------
1 file changed, 10 insertions(+), 15 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index f8eee60..c9bf54a 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -396,10 +396,9 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
struct fsldma_chan *chan = to_fsl_chan(tx->chan);
struct fsl_desc_sw *desc = tx_to_fsl_desc(tx);
struct fsl_desc_sw *child;
- unsigned long flags;
dma_cookie_t cookie = -EINVAL;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
/*
* assign cookies to all of the software descriptors
@@ -412,7 +411,7 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
/* put this transaction onto the tail of the pending queue */
append_ld_queue(chan, desc);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
return cookie;
}
@@ -725,15 +724,14 @@ static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
- unsigned long flags;
chan_dbg(chan, "free all channel resources\n");
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
fsldma_cleanup_descriptors(chan);
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
fsldma_free_desc_list(chan, &chan->ld_completed);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
dma_pool_destroy(chan->desc_pool);
chan->desc_pool = NULL;
@@ -952,7 +950,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
{
struct dma_slave_config *config;
struct fsldma_chan *chan;
- unsigned long flags;
int size;
if (!dchan)
@@ -962,7 +959,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
switch (cmd) {
case DMA_TERMINATE_ALL:
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
/* Halt the DMA engine */
dma_halt(chan);
@@ -973,7 +970,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
fsldma_free_desc_list(chan, &chan->ld_completed);
chan->idle = true;
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
return 0;
case DMA_SLAVE_CONFIG:
@@ -1015,11 +1012,10 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
static void fsl_dma_memcpy_issue_pending(struct dma_chan *dchan)
{
struct fsldma_chan *chan = to_fsl_chan(dchan);
- unsigned long flags;
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
fsl_chan_xfer_ld_queue(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
}
/**
@@ -1118,11 +1114,10 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
static void dma_do_tasklet(unsigned long data)
{
struct fsldma_chan *chan = (struct fsldma_chan *)data;
- unsigned long flags;
chan_dbg(chan, "tasklet entry\n");
- spin_lock_irqsave(&chan->desc_lock, flags);
+ spin_lock_bh(&chan->desc_lock);
/* the hardware is now idle and ready for more */
chan->idle = true;
@@ -1130,7 +1125,7 @@ static void dma_do_tasklet(unsigned long data)
/* Run all cleanup for descriptors which have been completed */
fsldma_cleanup_descriptors(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ spin_unlock_bh(&chan->desc_lock);
chan_dbg(chan, "tasklet exit\n");
}
--
1.7.9.5
^ permalink raw reply related
* [PATCH v2 6/8] DMA: Freescale: change descriptor release process for supporting async_tx
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
Fix the potential risk when enable config NET_DMA and ASYNC_TX. Async_tx is
lack of support in current release process of dma descriptor, all descriptors
will be released whatever is acked or no-acked by async_tx, so there is a
potential race condition when dma engine is uesd by others clients (e.g. when
enable NET_DMA to offload TCP).
In our case, a race condition which is raised when use both of talitos and
dmaengine to offload xor is because napi scheduler will sync all pending
requests in dma channels, it affects the process of raid operations due to
ack_tx is not checked in fsl dma. The no-acked descriptor is freed which is
submitted just now, as a dependent tx, this freed descriptor trigger
BUG_ON(async_tx_test_ack(depend_tx)) in async_tx_submit().
TASK = ee1a94a0[1390] 'md0_raid5' THREAD: ecf40000 CPU: 0
GPR00: 00000001 ecf41ca0 ee44/921a94a0 0000003f 00000001 c00593e4 00000000 00000001
GPR08: 00000000 a7a7a7a7 00000001 045/920000002 42028042 100a38d4 ed576d98 00000000
GPR16: ed5a11b0 00000000 2b162000 00000200 046/920000000 2d555000 ed3015e8 c15a7aa0
GPR24: 00000000 c155fc40 00000000 ecb63220 ecf41d28 e47/92f640bb0 ef640c30 ecf41ca0
NIP [c02b048c] async_tx_submit+0x6c/0x2b4
LR [c02b068c] async_tx_submit+0x26c/0x2b4
Call Trace:
[ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b448/92 (unreliable)
[ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c
[ecf41d20] [c0421064] async_copy_data+0xa0/0x17c
[ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10
[ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8
[ecf41e90] [c0429080] raid5d+0x3d4/0x5b4
[ecf41f40] [c04329b8] md_thread+0x138/0x16c
[ecf41f90] [c008277c] kthread+0x8c/0x90
[ecf41ff0] [c0011630] kernel_thread+0x4c/0x68
Another modification in this patch is the change of completed descriptors,
there is a potential risk which caused by exception interrupt, all descriptors
in ld_running list are seemed completed when an interrupt raised, it works fine
under normal condition, but if there is an exception occured, it cannot work as
our excepted. Hardware should not be depend on s/w list, the right way is to
read current descriptor address register to find the last completed descriptor.
If an interrupt is raised by an error, all descriptors in ld_running should not
be seemed finished, or these unfinished descriptors in ld_running will be
released wrongly.
A simple way to reproduce:
Enable dmatest first, then insert some bad descriptors which can trigger
Programming Error interrupts before the good descriptors. Last, the good
descriptors will be freed before they are processsed because of the exception
intrerrupt.
Note: the bad descriptors are only for simulating an exception interrupt. This
case can illustrate the potential risk in current fsl-dma very well.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
---
drivers/dma/fsldma.c | 195 ++++++++++++++++++++++++++++++++++++--------------
drivers/dma/fsldma.h | 17 ++++-
2 files changed, 158 insertions(+), 54 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 968877f..f8eee60 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -459,6 +459,87 @@ static struct fsl_desc_sw *fsl_dma_alloc_descriptor(struct fsldma_chan *chan)
}
/**
+ * fsldma_clean_completed_descriptor - free all descriptors which
+ * has been completed and acked
+ * @chan: Freescale DMA channel
+ *
+ * This function is used on all completed and acked descriptors.
+ * All descriptors should only be freed in this function.
+ */
+static void fsldma_clean_completed_descriptor(struct fsldma_chan *chan)
+{
+ struct fsl_desc_sw *desc, *_desc;
+
+ /* Run the callback for each descriptor, in order */
+ list_for_each_entry_safe(desc, _desc, &chan->ld_completed, node)
+ if (async_tx_test_ack(&desc->async_tx))
+ fsl_dma_free_descriptor(chan, desc);
+}
+
+/**
+ * fsldma_run_tx_complete_actions - cleanup a single link descriptor
+ * @chan: Freescale DMA channel
+ * @desc: descriptor to cleanup and free
+ * @cookie: Freescale DMA transaction identifier
+ *
+ * This function is used on a descriptor which has been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies.
+ */
+static dma_cookie_t fsldma_run_tx_complete_actions(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc, dma_cookie_t cookie)
+{
+ struct dma_async_tx_descriptor *txd = &desc->async_tx;
+
+ BUG_ON(txd->cookie < 0);
+
+ if (txd->cookie > 0) {
+ cookie = txd->cookie;
+
+ /* Run the link descriptor callback function */
+ if (txd->callback) {
+ chan_dbg(chan, "LD %p callback\n", desc);
+ txd->callback(txd->callback_param);
+ }
+ }
+
+ /* Run any dependencies */
+ dma_run_dependencies(txd);
+
+ return cookie;
+}
+
+/**
+ * fsldma_clean_running_descriptor - move the completed descriptor from
+ * ld_running to ld_completed
+ * @chan: Freescale DMA channel
+ * @desc: the descriptor which is completed
+ *
+ * Free the descriptor directly if acked by async_tx api, or move it to
+ * queue ld_completed.
+ */
+static void fsldma_clean_running_descriptor(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc)
+{
+ /* Remove from the list of transactions */
+ list_del(&desc->node);
+
+ /*
+ * the client is allowed to attach dependent operations
+ * until 'ack' is set
+ */
+ if (!async_tx_test_ack(&desc->async_tx)) {
+ /*
+ * Move this descriptor to the list of descriptors which is
+ * completed, but still awaiting the 'ack' bit to be set.
+ */
+ list_add_tail(&desc->node, &chan->ld_completed);
+ return;
+ }
+
+ dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
+}
+
+/**
* fsl_chan_xfer_ld_queue - transfer any pending transactions
* @chan : Freescale DMA channel
*
@@ -526,30 +607,58 @@ static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
}
/**
- * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
+ * fsldma_cleanup_descriptors - cleanup link descriptors which are completed
+ * and move them to ld_completed to free until flag 'ack' is set
* @chan: Freescale DMA channel
- * @desc: descriptor to cleanup and free
*
- * This function is used on a descriptor which has been executed by the DMA
- * controller. It will run any callbacks, submit any dependencies, and then
- * free the descriptor.
+ * This function is used on descriptors which have been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies, then
+ * free these descriptors if flag 'ack' is set.
*/
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
- struct fsl_desc_sw *desc)
+static void fsldma_cleanup_descriptors(struct fsldma_chan *chan)
{
- struct dma_async_tx_descriptor *txd = &desc->async_tx;
+ struct fsl_desc_sw *desc, *_desc;
+ dma_cookie_t cookie = 0;
+ dma_addr_t curr_phys = get_cdar(chan);
+ int seen_current = 0;
+
+ fsldma_clean_completed_descriptor(chan);
+
+ /* Run the callback for each descriptor, in order */
+ list_for_each_entry_safe(desc, _desc, &chan->ld_running, node) {
+ /*
+ * do not advance past the current descriptor loaded into the
+ * hardware channel, subsequent descriptors are either in
+ * process or have not been submitted
+ */
+ if (seen_current)
+ break;
+
+ /*
+ * stop the search if we reach the current descriptor and the
+ * channel is busy
+ */
+ if (desc->async_tx.phys == curr_phys) {
+ seen_current = 1;
+ if (!dma_is_idle(chan))
+ break;
+ }
+
+ cookie = fsldma_run_tx_complete_actions(chan, desc, cookie);
- /* Run the link descriptor callback function */
- if (txd->callback) {
- chan_dbg(chan, "LD %p callback\n", desc);
- txd->callback(txd->callback_param);
+ fsldma_clean_running_descriptor(chan, desc);
}
- /* Run any dependencies */
- dma_run_dependencies(txd);
+ /*
+ * Start any pending transactions automatically
+ *
+ * In the ideal case, we keep the DMA controller busy while we go
+ * ahead and free the descriptors below.
+ */
+ fsl_chan_xfer_ld_queue(chan);
- dma_descriptor_unmap(txd);
- fsl_dma_free_descriptor(chan, desc);
+ if (cookie > 0)
+ chan->common.completed_cookie = cookie;
}
/**
@@ -620,8 +729,10 @@ static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
chan_dbg(chan, "free all channel resources\n");
spin_lock_irqsave(&chan->desc_lock, flags);
+ fsldma_cleanup_descriptors(chan);
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
+ fsldma_free_desc_list(chan, &chan->ld_completed);
spin_unlock_irqrestore(&chan->desc_lock, flags);
dma_pool_destroy(chan->desc_pool);
@@ -859,6 +970,7 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
/* Remove and free all of the descriptors in the LD queue */
fsldma_free_desc_list(chan, &chan->ld_pending);
fsldma_free_desc_list(chan, &chan->ld_running);
+ fsldma_free_desc_list(chan, &chan->ld_completed);
chan->idle = true;
spin_unlock_irqrestore(&chan->desc_lock, flags);
@@ -918,6 +1030,17 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
dma_cookie_t cookie,
struct dma_tx_state *txstate)
{
+ struct fsldma_chan *chan = to_fsl_chan(dchan);
+ enum dma_status ret;
+
+ ret = dma_cookie_status(dchan, cookie, txstate);
+ if (ret == DMA_COMPLETE)
+ return ret;
+
+ spin_lock_bh(&chan->desc_lock);
+ fsldma_cleanup_descriptors(chan);
+ spin_unlock_bh(&chan->desc_lock);
+
return dma_cookie_status(dchan, cookie, txstate);
}
@@ -995,52 +1118,19 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
static void dma_do_tasklet(unsigned long data)
{
struct fsldma_chan *chan = (struct fsldma_chan *)data;
- struct fsl_desc_sw *desc, *_desc;
- LIST_HEAD(ld_cleanup);
unsigned long flags;
chan_dbg(chan, "tasklet entry\n");
spin_lock_irqsave(&chan->desc_lock, flags);
- /* update the cookie if we have some descriptors to cleanup */
- if (!list_empty(&chan->ld_running)) {
- dma_cookie_t cookie;
-
- desc = to_fsl_desc(chan->ld_running.prev);
- cookie = desc->async_tx.cookie;
- dma_cookie_complete(&desc->async_tx);
-
- chan_dbg(chan, "completed_cookie=%d\n", cookie);
- }
-
- /*
- * move the descriptors to a temporary list so we can drop the lock
- * during the entire cleanup operation
- */
- list_splice_tail_init(&chan->ld_running, &ld_cleanup);
-
/* the hardware is now idle and ready for more */
chan->idle = true;
- /*
- * Start any pending transactions automatically
- *
- * In the ideal case, we keep the DMA controller busy while we go
- * ahead and free the descriptors below.
- */
- fsl_chan_xfer_ld_queue(chan);
- spin_unlock_irqrestore(&chan->desc_lock, flags);
+ /* Run all cleanup for descriptors which have been completed */
+ fsldma_cleanup_descriptors(chan);
- /* Run the callback for each descriptor, in order */
- list_for_each_entry_safe(desc, _desc, &ld_cleanup, node) {
-
- /* Remove from the list of transactions */
- list_del(&desc->node);
-
- /* Run all cleanup for this descriptor */
- fsldma_cleanup_descriptor(chan, desc);
- }
+ spin_unlock_irqrestore(&chan->desc_lock, flags);
chan_dbg(chan, "tasklet exit\n");
}
@@ -1224,6 +1314,7 @@ static int fsl_dma_chan_probe(struct fsldma_device *fdev,
spin_lock_init(&chan->desc_lock);
INIT_LIST_HEAD(&chan->ld_pending);
INIT_LIST_HEAD(&chan->ld_running);
+ INIT_LIST_HEAD(&chan->ld_completed);
chan->idle = true;
chan->common.device = &fdev->common;
diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index d56e835..ec19517 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -138,8 +138,21 @@ struct fsldma_chan {
char name[8]; /* Channel name */
struct fsldma_chan_regs __iomem *regs;
spinlock_t desc_lock; /* Descriptor operation lock */
- struct list_head ld_pending; /* Link descriptors queue */
- struct list_head ld_running; /* Link descriptors queue */
+ /*
+ * Descriptors which are queued to run, but have not yet been
+ * submitted to the hardware for execution
+ */
+ struct list_head ld_pending;
+ /*
+ * Descriptors which are currently being executed by the hardware
+ */
+ struct list_head ld_running;
+ /*
+ * Descriptors which have finished execution by the hardware. These
+ * descriptors have already had their cleanup actions run. They are
+ * waiting for the ACK bit to be set by the async_tx API.
+ */
+ struct list_head ld_completed; /* Link descriptors queue */
struct dma_chan common; /* DMA common channel */
struct dma_pool *desc_pool; /* Descriptors pool */
struct device *dev; /* Channel device */
--
1.7.9.5
^ permalink raw reply related
* [PATCH v2 5/8] DMA: Freescale: move functions to avoid forward declarations
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
These functions will be modified in the next patch in the series. By moving the
function in a patch separate from the changes, it will make review easier.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 188 +++++++++++++++++++++++++-------------------------
1 file changed, 94 insertions(+), 94 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index b5a0ffa..968877f 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -459,6 +459,100 @@ static struct fsl_desc_sw *fsl_dma_alloc_descriptor(struct fsldma_chan *chan)
}
/**
+ * fsl_chan_xfer_ld_queue - transfer any pending transactions
+ * @chan : Freescale DMA channel
+ *
+ * HARDWARE STATE: idle
+ * LOCKING: must hold chan->desc_lock
+ */
+static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
+{
+ struct fsl_desc_sw *desc;
+
+ /*
+ * If the list of pending descriptors is empty, then we
+ * don't need to do any work at all
+ */
+ if (list_empty(&chan->ld_pending)) {
+ chan_dbg(chan, "no pending LDs\n");
+ return;
+ }
+
+ /*
+ * The DMA controller is not idle, which means that the interrupt
+ * handler will start any queued transactions when it runs after
+ * this transaction finishes
+ */
+ if (!chan->idle) {
+ chan_dbg(chan, "DMA controller still busy\n");
+ return;
+ }
+
+ /*
+ * If there are some link descriptors which have not been
+ * transferred, we need to start the controller
+ */
+
+ /*
+ * Move all elements from the queue of pending transactions
+ * onto the list of running transactions
+ */
+ chan_dbg(chan, "idle, starting controller\n");
+ desc = list_first_entry(&chan->ld_pending, struct fsl_desc_sw, node);
+ list_splice_tail_init(&chan->ld_pending, &chan->ld_running);
+
+ /*
+ * The 85xx DMA controller doesn't clear the channel start bit
+ * automatically at the end of a transfer. Therefore we must clear
+ * it in software before starting the transfer.
+ */
+ if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
+ u32 mode;
+
+ mode = get_mr(chan);
+ mode &= ~FSL_DMA_MR_CS;
+ set_mr(chan, mode);
+ }
+
+ /*
+ * Program the descriptor's address into the DMA controller,
+ * then start the DMA transaction
+ */
+ set_cdar(chan, desc->async_tx.phys);
+ get_cdar(chan);
+
+ dma_start(chan);
+ chan->idle = false;
+}
+
+/**
+ * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
+ * @chan: Freescale DMA channel
+ * @desc: descriptor to cleanup and free
+ *
+ * This function is used on a descriptor which has been executed by the DMA
+ * controller. It will run any callbacks, submit any dependencies, and then
+ * free the descriptor.
+ */
+static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc)
+{
+ struct dma_async_tx_descriptor *txd = &desc->async_tx;
+
+ /* Run the link descriptor callback function */
+ if (txd->callback) {
+ chan_dbg(chan, "LD %p callback\n", desc);
+ txd->callback(txd->callback_param);
+ }
+
+ /* Run any dependencies */
+ dma_run_dependencies(txd);
+
+ dma_descriptor_unmap(txd);
+ fsl_dma_free_descriptor(chan, desc);
+}
+
+/**
* fsl_dma_alloc_chan_resources - Allocate resources for DMA channel.
* @chan : Freescale DMA channel
*
@@ -803,100 +897,6 @@ static int fsl_dma_device_control(struct dma_chan *dchan,
}
/**
- * fsldma_cleanup_descriptor - cleanup and free a single link descriptor
- * @chan: Freescale DMA channel
- * @desc: descriptor to cleanup and free
- *
- * This function is used on a descriptor which has been executed by the DMA
- * controller. It will run any callbacks, submit any dependencies, and then
- * free the descriptor.
- */
-static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
- struct fsl_desc_sw *desc)
-{
- struct dma_async_tx_descriptor *txd = &desc->async_tx;
-
- /* Run the link descriptor callback function */
- if (txd->callback) {
- chan_dbg(chan, "LD %p callback\n", desc);
- txd->callback(txd->callback_param);
- }
-
- /* Run any dependencies */
- dma_run_dependencies(txd);
-
- dma_descriptor_unmap(txd);
- fsl_dma_free_descriptor(chan, desc);
-}
-
-/**
- * fsl_chan_xfer_ld_queue - transfer any pending transactions
- * @chan : Freescale DMA channel
- *
- * HARDWARE STATE: idle
- * LOCKING: must hold chan->desc_lock
- */
-static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
-{
- struct fsl_desc_sw *desc;
-
- /*
- * If the list of pending descriptors is empty, then we
- * don't need to do any work at all
- */
- if (list_empty(&chan->ld_pending)) {
- chan_dbg(chan, "no pending LDs\n");
- return;
- }
-
- /*
- * The DMA controller is not idle, which means that the interrupt
- * handler will start any queued transactions when it runs after
- * this transaction finishes
- */
- if (!chan->idle) {
- chan_dbg(chan, "DMA controller still busy\n");
- return;
- }
-
- /*
- * If there are some link descriptors which have not been
- * transferred, we need to start the controller
- */
-
- /*
- * Move all elements from the queue of pending transactions
- * onto the list of running transactions
- */
- chan_dbg(chan, "idle, starting controller\n");
- desc = list_first_entry(&chan->ld_pending, struct fsl_desc_sw, node);
- list_splice_tail_init(&chan->ld_pending, &chan->ld_running);
-
- /*
- * The 85xx DMA controller doesn't clear the channel start bit
- * automatically at the end of a transfer. Therefore we must clear
- * it in software before starting the transfer.
- */
- if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
- u32 mode;
-
- mode = get_mr(chan);
- mode &= ~FSL_DMA_MR_CS;
- set_mr(chan, mode);
- }
-
- /*
- * Program the descriptor's address into the DMA controller,
- * then start the DMA transaction
- */
- set_cdar(chan, desc->async_tx.phys);
- get_cdar(chan);
-
- dma_start(chan);
- chan->idle = false;
-}
-
-/**
* fsl_dma_memcpy_issue_pending - Issue the DMA start command
* @chan : Freescale DMA channel
*/
--
1.7.9.5
^ permalink raw reply related
* [PATCH v2 4/8] DMA: Freescale: add fsl_dma_free_descriptor() to reduce code duplication
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
There are several places where descriptors are freed using identical code.
This patch puts this code into a function to reduce code duplication.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index b71cc04..b5a0ffa 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -418,6 +418,19 @@ static dma_cookie_t fsl_dma_tx_submit(struct dma_async_tx_descriptor *tx)
}
/**
+ * fsl_dma_free_descriptor - Free descriptor from channel's DMA pool.
+ * @chan : Freescale DMA channel
+ * @desc: descriptor to be freed
+ */
+static void fsl_dma_free_descriptor(struct fsldma_chan *chan,
+ struct fsl_desc_sw *desc)
+{
+ list_del(&desc->node);
+ chan_dbg(chan, "LD %p free\n", desc);
+ dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
+}
+
+/**
* fsl_dma_alloc_descriptor - Allocate descriptor from channel's DMA pool.
* @chan : Freescale DMA channel
*
@@ -489,11 +502,8 @@ static void fsldma_free_desc_list(struct fsldma_chan *chan,
{
struct fsl_desc_sw *desc, *_desc;
- list_for_each_entry_safe(desc, _desc, list, node) {
- list_del(&desc->node);
- chan_dbg(chan, "LD %p free\n", desc);
- dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
- }
+ list_for_each_entry_safe(desc, _desc, list, node)
+ fsl_dma_free_descriptor(chan, desc);
}
static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
@@ -501,11 +511,8 @@ static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
{
struct fsl_desc_sw *desc, *_desc;
- list_for_each_entry_safe_reverse(desc, _desc, list, node) {
- list_del(&desc->node);
- chan_dbg(chan, "LD %p free\n", desc);
- dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
- }
+ list_for_each_entry_safe_reverse(desc, _desc, list, node)
+ fsl_dma_free_descriptor(chan, desc);
}
/**
@@ -819,8 +826,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
dma_run_dependencies(txd);
dma_descriptor_unmap(txd);
- chan_dbg(chan, "LD %p free\n", desc);
- dma_pool_free(chan->desc_pool, desc, txd->phys);
+ fsl_dma_free_descriptor(chan, desc);
}
/**
--
1.7.9.5
^ permalink raw reply related
* [PATCH v2 3/8] DMA: Freescale: remove attribute DMA_INTERRUPT of dmaengine
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
Delete attribute DMA_INTERRUPT because fsldma doesn't support this function,
exception will be thrown if talitos is used to offload xor at the same time.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
Signed-off-by: Qiang Liu <qiang.liu@freescale.com>
---
drivers/dma/fsldma.c | 31 -------------------------------
1 file changed, 31 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 5f32cb8..b71cc04 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -528,35 +528,6 @@ static void fsl_dma_free_chan_resources(struct dma_chan *dchan)
}
static struct dma_async_tx_descriptor *
-fsl_dma_prep_interrupt(struct dma_chan *dchan, unsigned long flags)
-{
- struct fsldma_chan *chan;
- struct fsl_desc_sw *new;
-
- if (!dchan)
- return NULL;
-
- chan = to_fsl_chan(dchan);
-
- new = fsl_dma_alloc_descriptor(chan);
- if (!new) {
- chan_err(chan, "%s\n", msg_ld_oom);
- return NULL;
- }
-
- new->async_tx.cookie = -EBUSY;
- new->async_tx.flags = flags;
-
- /* Insert the link descriptor to the LD ring */
- list_add_tail(&new->node, &new->tx_list);
-
- /* Set End-of-link to the last link descriptor of new list */
- set_ld_eol(chan, new);
-
- return &new->async_tx;
-}
-
-static struct dma_async_tx_descriptor *
fsl_dma_prep_memcpy(struct dma_chan *dchan,
dma_addr_t dma_dst, dma_addr_t dma_src,
size_t len, unsigned long flags)
@@ -1308,12 +1279,10 @@ static int fsldma_of_probe(struct platform_device *op)
fdev->irq = irq_of_parse_and_map(op->dev.of_node, 0);
dma_cap_set(DMA_MEMCPY, fdev->common.cap_mask);
- dma_cap_set(DMA_INTERRUPT, fdev->common.cap_mask);
dma_cap_set(DMA_SG, fdev->common.cap_mask);
dma_cap_set(DMA_SLAVE, fdev->common.cap_mask);
fdev->common.device_alloc_chan_resources = fsl_dma_alloc_chan_resources;
fdev->common.device_free_chan_resources = fsl_dma_free_chan_resources;
- fdev->common.device_prep_dma_interrupt = fsl_dma_prep_interrupt;
fdev->common.device_prep_dma_memcpy = fsl_dma_prep_memcpy;
fdev->common.device_prep_dma_sg = fsl_dma_prep_sg;
fdev->common.device_tx_status = fsl_tx_status;
--
1.7.9.5
^ permalink raw reply related
* [PATCH v2 0/8] DMA: Freescale: driver cleanups and enhancements
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
From: Hongbo Zhang <hongbo.zhang@freescale.com>
Hi Vinod Koul,
Please have a look at the v2 patch set.
v1 -> v2 change:
The only one change is introducing a new patch[1/7] to remove the unnecessary
macro FSL_DMA_LD_DEBUG, thus the total patches number is 8 now (was 7)
Hongbo Zhang (8):
DMA: Freescale: remove the unnecessary FSL_DMA_LD_DEBUG
DMA: Freescale: unify register access methods
DMA: Freescale: remove attribute DMA_INTERRUPT of dmaengine
DMA: Freescale: add fsl_dma_free_descriptor() to reduce code
duplication
DMA: Freescale: move functions to avoid forward declarations
DMA: Freescale: change descriptor release process for supporting
async_tx
DMA: Freescale: use spin_lock_bh instead of spin_lock_irqsave
DMA: Freescale: add suspend resume functions for DMA driver
drivers/dma/fsldma.c | 590 ++++++++++++++++++++++++++++++++------------------
drivers/dma/fsldma.h | 33 ++-
2 files changed, 408 insertions(+), 215 deletions(-)
--
1.7.9.5
^ permalink raw reply
* [PATCH v2 2/8] DMA: Freescale: unify register access methods
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
Methods of accessing DMA contorller registers are inconsistent, some registers
are accessed by DMA_IN/OUT directly, while others are accessed by functions
get/set_* which are wrappers of DMA_IN/OUT, and even for the BCR register, it
is read by get_bcr but written by DMA_OUT.
This patch unifies the inconsistent methods, all registers are accessed by
get/set_* now.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
---
drivers/dma/fsldma.c | 52 ++++++++++++++++++++++++++++++++------------------
1 file changed, 33 insertions(+), 19 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index ec50420..5f32cb8 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -61,6 +61,16 @@ static u32 get_sr(struct fsldma_chan *chan)
return DMA_IN(chan, &chan->regs->sr, 32);
}
+static void set_mr(struct fsldma_chan *chan, u32 val)
+{
+ DMA_OUT(chan, &chan->regs->mr, val, 32);
+}
+
+static u32 get_mr(struct fsldma_chan *chan)
+{
+ return DMA_IN(chan, &chan->regs->mr, 32);
+}
+
static void set_cdar(struct fsldma_chan *chan, dma_addr_t addr)
{
DMA_OUT(chan, &chan->regs->cdar, addr | FSL_DMA_SNEN, 64);
@@ -71,6 +81,11 @@ static dma_addr_t get_cdar(struct fsldma_chan *chan)
return DMA_IN(chan, &chan->regs->cdar, 64) & ~FSL_DMA_SNEN;
}
+static void set_bcr(struct fsldma_chan *chan, u32 val)
+{
+ DMA_OUT(chan, &chan->regs->bcr, val, 32);
+}
+
static u32 get_bcr(struct fsldma_chan *chan)
{
return DMA_IN(chan, &chan->regs->bcr, 32);
@@ -135,7 +150,7 @@ static void set_ld_eol(struct fsldma_chan *chan, struct fsl_desc_sw *desc)
static void dma_init(struct fsldma_chan *chan)
{
/* Reset the channel */
- DMA_OUT(chan, &chan->regs->mr, 0, 32);
+ set_mr(chan, 0);
switch (chan->feature & FSL_DMA_IP_MASK) {
case FSL_DMA_IP_85XX:
@@ -144,16 +159,15 @@ static void dma_init(struct fsldma_chan *chan)
* EOLNIE - End of links interrupt enable
* BWC - Bandwidth sharing among channels
*/
- DMA_OUT(chan, &chan->regs->mr, FSL_DMA_MR_BWC
- | FSL_DMA_MR_EIE | FSL_DMA_MR_EOLNIE, 32);
+ set_mr(chan, FSL_DMA_MR_BWC | FSL_DMA_MR_EIE
+ | FSL_DMA_MR_EOLNIE);
break;
case FSL_DMA_IP_83XX:
/* Set the channel to below modes:
* EOTIE - End-of-transfer interrupt enable
* PRC_RM - PCI read multiple
*/
- DMA_OUT(chan, &chan->regs->mr, FSL_DMA_MR_EOTIE
- | FSL_DMA_MR_PRC_RM, 32);
+ set_mr(chan, FSL_DMA_MR_EOTIE | FSL_DMA_MR_PRC_RM);
break;
}
}
@@ -175,10 +189,10 @@ static void dma_start(struct fsldma_chan *chan)
{
u32 mode;
- mode = DMA_IN(chan, &chan->regs->mr, 32);
+ mode = get_mr(chan);
if (chan->feature & FSL_DMA_CHAN_PAUSE_EXT) {
- DMA_OUT(chan, &chan->regs->bcr, 0, 32);
+ set_bcr(chan, 0);
mode |= FSL_DMA_MR_EMP_EN;
} else {
mode &= ~FSL_DMA_MR_EMP_EN;
@@ -191,7 +205,7 @@ static void dma_start(struct fsldma_chan *chan)
mode |= FSL_DMA_MR_CS;
}
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ set_mr(chan, mode);
}
static void dma_halt(struct fsldma_chan *chan)
@@ -200,7 +214,7 @@ static void dma_halt(struct fsldma_chan *chan)
int i;
/* read the mode register */
- mode = DMA_IN(chan, &chan->regs->mr, 32);
+ mode = get_mr(chan);
/*
* The 85xx controller supports channel abort, which will stop
@@ -209,14 +223,14 @@ static void dma_halt(struct fsldma_chan *chan)
*/
if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
mode |= FSL_DMA_MR_CA;
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ set_mr(chan, mode);
mode &= ~FSL_DMA_MR_CA;
}
/* stop the DMA controller */
mode &= ~(FSL_DMA_MR_CS | FSL_DMA_MR_EMS_EN);
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ set_mr(chan, mode);
/* wait for the DMA controller to become idle */
for (i = 0; i < 100; i++) {
@@ -245,7 +259,7 @@ static void fsl_chan_set_src_loop_size(struct fsldma_chan *chan, int size)
{
u32 mode;
- mode = DMA_IN(chan, &chan->regs->mr, 32);
+ mode = get_mr(chan);
switch (size) {
case 0:
@@ -259,7 +273,7 @@ static void fsl_chan_set_src_loop_size(struct fsldma_chan *chan, int size)
break;
}
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ set_mr(chan, mode);
}
/**
@@ -277,7 +291,7 @@ static void fsl_chan_set_dst_loop_size(struct fsldma_chan *chan, int size)
{
u32 mode;
- mode = DMA_IN(chan, &chan->regs->mr, 32);
+ mode = get_mr(chan);
switch (size) {
case 0:
@@ -291,7 +305,7 @@ static void fsl_chan_set_dst_loop_size(struct fsldma_chan *chan, int size)
break;
}
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ set_mr(chan, mode);
}
/**
@@ -312,10 +326,10 @@ static void fsl_chan_set_request_count(struct fsldma_chan *chan, int size)
BUG_ON(size > 1024);
- mode = DMA_IN(chan, &chan->regs->mr, 32);
+ mode = get_mr(chan);
mode |= (__ilog2(size) << 24) & 0x0f000000;
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ set_mr(chan, mode);
}
/**
@@ -889,9 +903,9 @@ static void fsl_chan_xfer_ld_queue(struct fsldma_chan *chan)
if ((chan->feature & FSL_DMA_IP_MASK) == FSL_DMA_IP_85XX) {
u32 mode;
- mode = DMA_IN(chan, &chan->regs->mr, 32);
+ mode = get_mr(chan);
mode &= ~FSL_DMA_MR_CS;
- DMA_OUT(chan, &chan->regs->mr, mode, 32);
+ set_mr(chan, mode);
}
/*
--
1.7.9.5
^ permalink raw reply related
* [PATCH v2 1/8] DMA: Freescale: remove the unnecessary FSL_DMA_LD_DEBUG
From: hongbo.zhang @ 2014-04-04 3:27 UTC (permalink / raw)
To: vkoul, dan.j.williams, dmaengine
Cc: scottwood, Hongbo Zhang, linuxppc-dev, linux-kernel
In-Reply-To: <1396582037-23065-1-git-send-email-hongbo.zhang@freescale.com>
From: Hongbo Zhang <hongbo.zhang@freescale.com>
Some codes are calling chan_dbg with FSL_DMA_LD_DEBUG surrounded, it is really
unnecessary to use such a macro because chan_dbg is a wrapper of dev_dbg, we do
have corresponding DEBUG macro to switch on/off dev_dbg, and most of the other
codes are also calling chan_dbg directly without using FSL_DMA_LD_DEBUG.
Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
---
drivers/dma/fsldma.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index f157c6f..ec50420 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -426,9 +426,7 @@ static struct fsl_desc_sw *fsl_dma_alloc_descriptor(struct fsldma_chan *chan)
desc->async_tx.tx_submit = fsl_dma_tx_submit;
desc->async_tx.phys = pdesc;
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p allocated\n", desc);
-#endif
return desc;
}
@@ -479,9 +477,7 @@ static void fsldma_free_desc_list(struct fsldma_chan *chan,
list_for_each_entry_safe(desc, _desc, list, node) {
list_del(&desc->node);
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc);
-#endif
dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
}
}
@@ -493,9 +489,7 @@ static void fsldma_free_desc_list_reverse(struct fsldma_chan *chan,
list_for_each_entry_safe_reverse(desc, _desc, list, node) {
list_del(&desc->node);
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc);
-#endif
dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
}
}
@@ -832,9 +826,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
/* Run the link descriptor callback function */
if (txd->callback) {
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p callback\n", desc);
-#endif
txd->callback(txd->callback_param);
}
@@ -842,9 +834,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
dma_run_dependencies(txd);
dma_descriptor_unmap(txd);
-#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc);
-#endif
dma_pool_free(chan->desc_pool, desc, txd->phys);
}
--
1.7.9.5
^ permalink raw reply related
* Re: hugetlb: ensure hugepage access is denied if hugepages are not supported
From: Aneesh Kumar K.V @ 2014-04-04 0:42 UTC (permalink / raw)
To: Nishanth Aravamudan, linux-mm; +Cc: paulus, anton, nyc, akpm, linuxppc-dev
In-Reply-To: <20140403231413.GB17412@linux.vnet.ibm.com>
Nishanth Aravamudan <nacc@linux.vnet.ibm.com> writes:
> In KVM guests on Power, in a guest not backed by hugepages, we see the
> following:
>
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 64 kB
>
> HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
> are not supported at boot-time, but this is only checked in
> hugetlb_init(). Extract the check to a helper function, and use it in a
> few relevant places.
>
> This does make hugetlbfs not supported (not registered at all) in this
> environment. I believe this is fine, as there are no valid hugepages and
> that won't change at runtime.
>
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index d19b30a..cc8fcc7 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1017,6 +1017,11 @@ static int __init init_hugetlbfs_fs(void)
> int error;
> int i;
>
> + if (!hugepages_supported()) {
> + printk(KERN_ERR "hugetlbfs: Disabling because there are no supported hugepage sizes\n");
> + return -ENOTSUPP;
> + }
> +
> error = bdi_init(&hugetlbfs_backing_dev_info);
> if (error)
> return error;
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 8c43cc4..0aea8de 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -450,4 +450,14 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
> return ptl;
> }
>
> +static inline bool hugepages_supported(void)
> +{
> + /*
> + * Some platform decide whether they support huge pages at boot
> + * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> + * there is no such support
> + */
> + return HPAGE_SHIFT != 0;
> +}
> +
> #endif /* _LINUX_HUGETLB_H */
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c01cb9f..1c99585 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1949,11 +1949,7 @@ module_exit(hugetlb_exit);
>
> static int __init hugetlb_init(void)
> {
> - /* Some platform decide whether they support huge pages at boot
> - * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> - * there is no such support
> - */
> - if (HPAGE_SHIFT == 0)
> + if (!hugepages_supported())
> return 0;
>
> if (!size_to_hstate(default_hstate_size)) {
> @@ -2069,6 +2065,9 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->max_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2122,6 +2121,9 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->nr_overcommit_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2147,6 +2149,8 @@ out:
> void hugetlb_report_meminfo(struct seq_file *m)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return;
> seq_printf(m,
> "HugePages_Total: %5lu\n"
> "HugePages_Free: %5lu\n"
> @@ -2163,6 +2167,8 @@ void hugetlb_report_meminfo(struct seq_file *m)
> int hugetlb_report_node_meminfo(int nid, char *buf)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return 0;
> return sprintf(buf,
> "Node %d HugePages_Total: %5u\n"
> "Node %d HugePages_Free: %5u\n"
> @@ -2177,6 +2183,9 @@ void hugetlb_show_meminfo(void)
> struct hstate *h;
> int nid;
>
> + if (!hugepages_supported())
> + return;
> +
> for_each_node_state(nid, N_MEMORY)
> for_each_hstate(h)
> pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",
^ permalink raw reply
* hugetlb: ensure hugepage access is denied if hugepages are not supported
From: Nishanth Aravamudan @ 2014-04-03 23:14 UTC (permalink / raw)
To: linux-mm; +Cc: paulus, anton, nyc, akpm, linuxppc-dev, aneesh.kumar
In KVM guests on Power, in a guest not backed by hugepages, we see the
following:
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 64 kB
HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
are not supported at boot-time, but this is only checked in
hugetlb_init(). Extract the check to a helper function, and use it in a
few relevant places.
This does make hugetlbfs not supported (not registered at all) in this
environment. I believe this is fine, as there are no valid hugepages and
that won't change at runtime.
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index d19b30a..cc8fcc7 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1017,6 +1017,11 @@ static int __init init_hugetlbfs_fs(void)
int error;
int i;
+ if (!hugepages_supported()) {
+ printk(KERN_ERR "hugetlbfs: Disabling because there are no supported hugepage sizes\n");
+ return -ENOTSUPP;
+ }
+
error = bdi_init(&hugetlbfs_backing_dev_info);
if (error)
return error;
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 8c43cc4..0aea8de 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -450,4 +450,14 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
return ptl;
}
+static inline bool hugepages_supported(void)
+{
+ /*
+ * Some platform decide whether they support huge pages at boot
+ * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
+ * there is no such support
+ */
+ return HPAGE_SHIFT != 0;
+}
+
#endif /* _LINUX_HUGETLB_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c01cb9f..1c99585 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1949,11 +1949,7 @@ module_exit(hugetlb_exit);
static int __init hugetlb_init(void)
{
- /* Some platform decide whether they support huge pages at boot
- * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
- * there is no such support
- */
- if (HPAGE_SHIFT == 0)
+ if (!hugepages_supported())
return 0;
if (!size_to_hstate(default_hstate_size)) {
@@ -2069,6 +2065,9 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
unsigned long tmp;
int ret;
+ if (!hugepages_supported())
+ return -ENOTSUPP;
+
tmp = h->max_huge_pages;
if (write && h->order >= MAX_ORDER)
@@ -2122,6 +2121,9 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
unsigned long tmp;
int ret;
+ if (!hugepages_supported())
+ return -ENOTSUPP;
+
tmp = h->nr_overcommit_huge_pages;
if (write && h->order >= MAX_ORDER)
@@ -2147,6 +2149,8 @@ out:
void hugetlb_report_meminfo(struct seq_file *m)
{
struct hstate *h = &default_hstate;
+ if (!hugepages_supported())
+ return;
seq_printf(m,
"HugePages_Total: %5lu\n"
"HugePages_Free: %5lu\n"
@@ -2163,6 +2167,8 @@ void hugetlb_report_meminfo(struct seq_file *m)
int hugetlb_report_node_meminfo(int nid, char *buf)
{
struct hstate *h = &default_hstate;
+ if (!hugepages_supported())
+ return 0;
return sprintf(buf,
"Node %d HugePages_Total: %5u\n"
"Node %d HugePages_Free: %5u\n"
@@ -2177,6 +2183,9 @@ void hugetlb_show_meminfo(void)
struct hstate *h;
int nid;
+ if (!hugepages_supported())
+ return;
+
for_each_node_state(nid, N_MEMORY)
for_each_hstate(h)
pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",
^ permalink raw reply related
* Re: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
From: Nishanth Aravamudan @ 2014-04-03 23:12 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linux-mm, paulus, linuxppc-dev, nyc, anton
In-Reply-To: <87ioqqo065.fsf@linux.vnet.ibm.com>
On 03.04.2014 [21:49:46 +0530], Aneesh Kumar K.V wrote:
> Nishanth Aravamudan <nacc@linux.vnet.ibm.com> writes:
>
> > On 24.03.2014 [16:02:56 -0700], Nishanth Aravamudan wrote:
> >> In KVM guests on Power, if the guest is not backed by hugepages, we see
> >> the following in the guest:
> >>
> >> AnonHugePages: 0 kB
> >> HugePages_Total: 0
> >> HugePages_Free: 0
> >> HugePages_Rsvd: 0
> >> HugePages_Surp: 0
> >> Hugepagesize: 64 kB
> >>
> >> This seems like a configuration issue -- why is a hstate of 64k being
> >> registered?
> >>
> >> I did some debugging and found that the following does trigger,
> >> mm/hugetlb.c::hugetlb_init():
> >>
> >> /* Some platform decide whether they support huge pages at boot
> >> * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> >> * there is no such support
> >> */
> >> if (HPAGE_SHIFT == 0)
> >> return 0;
> >>
> >> That check is only during init-time. So we don't support hugepages, but
> >> none of the hugetlb APIs actually check this condition (HPAGE_SHIFT ==
> >> 0), so /proc/meminfo above falsely indicates there is a valid hstate (at
> >> least one). But note that there is no /sys/kernel/mm/hugepages meaning
> >> no hstate was actually registered.
> >>
> >> Further, it turns out that huge_page_order(default_hstate) is 0, so
> >> hugetlb_report_meminfo is doing:
> >>
> >> 1UL << (huge_page_order(h) + PAGE_SHIFT - 10)
> >>
> >> which ends up just doing 1 << (PAGE_SHIFT - 10) and since the base page
> >> size is 64k, we report a hugepage size of 64k... And allow the user to
> >> allocate hugepages via the sysctl, etc.
> >>
> >> What's the right thing to do here?
> >>
> >> 1) Should we add checks for HPAGE_SHIFT == 0 to all the hugetlb APIs? It
> >> seems like HPAGE_SHIFT == 0 should be the equivalent, functionally, of
> >> the config options being off. This seems like a lot of overhead, though,
> >> to put everywhere, so maybe I can do it in an arch-specific macro, that
> >> in asm-generic defaults to 0 (and so will hopefully be compiled out?).
> >>
> >> 2) What should hugetlbfs do when HPAGE_SHIFT == 0? Should it be
> >> mountable? Obviously if it's mountable, we can't great files there
> >> (since the fs will report insufficient space). [1]
> >
> > Here is my solution to this. Comments appreciated!
> >
> > In KVM guests on Power, in a guest not backed by hugepages, we see the
> > following:
> >
> > AnonHugePages: 0 kB
> > HugePages_Total: 0
> > HugePages_Free: 0
> > HugePages_Rsvd: 0
> > HugePages_Surp: 0
> > Hugepagesize: 64 kB
> >
> > HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
> > are not supported at boot-time, but this is only checked in
> > hugetlb_init(). Extract the check to a helper function, and use it in a
> > few relevant places.
> >
> > This does make hugetlbfs not supported in this environment. I believe
> > this is fine, as there are no valid hugepages and that won't change at
> > runtime.
> >
> > Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
>
>
> Looks good. Can you resubmit it as a proper patch ?
Will Cc you on that.
> You may also want to capture in commit message saying hugetlbfs file
> system also will not be registered.
I did that already:
> > This does make hugetlbfs not supported in this environment. I
> > believe this is fine, as there are no valid hugepages and that won't
> > change at runtime.
Thanks,
Nish
^ permalink raw reply
* [PATCH 20/20] of: push struct boot_param_header and defines into powerpc
From: Rob Herring @ 2014-04-03 22:17 UTC (permalink / raw)
To: linux-kernel; +Cc: Grant Likely, Rob Herring, Paul Mackerras, linuxppc-dev
In-Reply-To: <1396563423-30893-1-git-send-email-robherring2@gmail.com>
From: Rob Herring <robh@kernel.org>
Now powerpc is the only user of struct boot_param_header and FDT defines,
so they can be moved into the powerpc architecture code.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
---
arch/powerpc/include/asm/prom.h | 39 +++++++++++++++++++++++++++++++++++++++
include/linux/of_fdt.h | 37 -------------------------------------
2 files changed, 39 insertions(+), 37 deletions(-)
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index d977b9b..74b79f0 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -26,6 +26,45 @@
#include <linux/of_irq.h>
#include <linux/platform_device.h>
+#define OF_DT_BEGIN_NODE 0x1 /* Start of node, full name */
+#define OF_DT_END_NODE 0x2 /* End node */
+#define OF_DT_PROP 0x3 /* Property: name off, size,
+ * content */
+#define OF_DT_NOP 0x4 /* nop */
+#define OF_DT_END 0x9
+
+#define OF_DT_VERSION 0x10
+
+/*
+ * This is what gets passed to the kernel by prom_init or kexec
+ *
+ * The dt struct contains the device tree structure, full pathes and
+ * property contents. The dt strings contain a separate block with just
+ * the strings for the property names, and is fully page aligned and
+ * self contained in a page, so that it can be kept around by the kernel,
+ * each property name appears only once in this page (cheap compression)
+ *
+ * the mem_rsvmap contains a map of reserved ranges of physical memory,
+ * passing it here instead of in the device-tree itself greatly simplifies
+ * the job of everybody. It's just a list of u64 pairs (base/size) that
+ * ends when size is 0
+ */
+struct boot_param_header {
+ __be32 magic; /* magic word OF_DT_HEADER */
+ __be32 totalsize; /* total size of DT block */
+ __be32 off_dt_struct; /* offset to structure */
+ __be32 off_dt_strings; /* offset to strings */
+ __be32 off_mem_rsvmap; /* offset to memory reserve map */
+ __be32 version; /* format version */
+ __be32 last_comp_version; /* last compatible version */
+ /* version 2 fields below */
+ __be32 boot_cpuid_phys; /* Physical CPU id we're booting on */
+ /* version 3 fields below */
+ __be32 dt_strings_size; /* size of the DT strings block */
+ /* version 17 fields below */
+ __be32 dt_struct_size; /* size of the DT structure block */
+};
+
/*
* OF address retreival & translation
*/
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index aa7fb87..7f67790 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -17,47 +17,10 @@
/* Definitions used by the flattened device tree */
#define OF_DT_HEADER 0xd00dfeed /* marker */
-#define OF_DT_BEGIN_NODE 0x1 /* Start of node, full name */
-#define OF_DT_END_NODE 0x2 /* End node */
-#define OF_DT_PROP 0x3 /* Property: name off, size,
- * content */
-#define OF_DT_NOP 0x4 /* nop */
-#define OF_DT_END 0x9
-
-#define OF_DT_VERSION 0x10
#ifndef __ASSEMBLY__
#include <linux/libfdt.h>
-/*
- * This is what gets passed to the kernel by prom_init or kexec
- *
- * The dt struct contains the device tree structure, full pathes and
- * property contents. The dt strings contain a separate block with just
- * the strings for the property names, and is fully page aligned and
- * self contained in a page, so that it can be kept around by the kernel,
- * each property name appears only once in this page (cheap compression)
- *
- * the mem_rsvmap contains a map of reserved ranges of physical memory,
- * passing it here instead of in the device-tree itself greatly simplifies
- * the job of everybody. It's just a list of u64 pairs (base/size) that
- * ends when size is 0
- */
-struct boot_param_header {
- __be32 magic; /* magic word OF_DT_HEADER */
- __be32 totalsize; /* total size of DT block */
- __be32 off_dt_struct; /* offset to structure */
- __be32 off_dt_strings; /* offset to strings */
- __be32 off_mem_rsvmap; /* offset to memory reserve map */
- __be32 version; /* format version */
- __be32 last_comp_version; /* last compatible version */
- /* version 2 fields below */
- __be32 boot_cpuid_phys; /* Physical CPU id we're booting on */
- /* version 3 fields below */
- __be32 dt_strings_size; /* size of the DT strings block */
- /* version 17 fields below */
- __be32 dt_struct_size; /* size of the DT structure block */
-};
#if defined(CONFIG_OF_FLATTREE)
--
1.8.3.2
^ permalink raw reply related
* [PATCH 09/20] of/fdt: create common debugfs
From: Rob Herring @ 2014-04-03 22:16 UTC (permalink / raw)
To: linux-kernel
Cc: Rob Herring, Michal Simek, Paul Mackerras, Grant Likely,
linuxppc-dev
In-Reply-To: <1396563423-30893-1-git-send-email-robherring2@gmail.com>
From: Rob Herring <robh@kernel.org>
Both powerpc and microblaze have the same FDT blob in debugfs feature.
Move this to common location and remove the powerpc and microblaze
implementations. This feature could become more useful when FDT
overlay support is added.
This changes the path of the blob from "$arch/flat-device-tree" to
"device-tree/flat-device-tree".
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
---
arch/microblaze/kernel/prom.c | 31 -------------------------------
arch/powerpc/kernel/prom.c | 21 ---------------------
drivers/of/fdt.c | 24 ++++++++++++++++++++++++
3 files changed, 24 insertions(+), 52 deletions(-)
diff --git a/arch/microblaze/kernel/prom.c b/arch/microblaze/kernel/prom.c
index abdfb10..1312cd2 100644
--- a/arch/microblaze/kernel/prom.c
+++ b/arch/microblaze/kernel/prom.c
@@ -114,34 +114,3 @@ void __init early_init_devtree(void *params)
pr_debug(" <- early_init_devtree()\n");
}
-
-/*******
- *
- * New implementation of the OF "find" APIs, return a refcounted
- * object, call of_node_put() when done. The device tree and list
- * are protected by a rw_lock.
- *
- * Note that property management will need some locking as well,
- * this isn't dealt with yet.
- *
- *******/
-
-#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
-static struct debugfs_blob_wrapper flat_dt_blob;
-
-static int __init export_flat_device_tree(void)
-{
- struct dentry *d;
-
- flat_dt_blob.data = initial_boot_params;
- flat_dt_blob.size = initial_boot_params->totalsize;
-
- d = debugfs_create_blob("flat-device-tree", S_IFREG | S_IRUSR,
- of_debugfs_root, &flat_dt_blob);
- if (!d)
- return 1;
-
- return 0;
-}
-device_initcall(export_flat_device_tree);
-#endif
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index dd72beb..7c2f90c 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -29,7 +29,6 @@
#include <linux/bitops.h>
#include <linux/export.h>
#include <linux/kexec.h>
-#include <linux/debugfs.h>
#include <linux/irq.h>
#include <linux/memblock.h>
#include <linux/of.h>
@@ -918,23 +917,3 @@ bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
{
return (int)phys_id == get_hard_smp_processor_id(cpu);
}
-
-#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
-static struct debugfs_blob_wrapper flat_dt_blob;
-
-static int __init export_flat_device_tree(void)
-{
- struct dentry *d;
-
- flat_dt_blob.data = initial_boot_params;
- flat_dt_blob.size = be32_to_cpu(initial_boot_params->totalsize);
-
- d = debugfs_create_blob("flat-device-tree", S_IFREG | S_IRUSR,
- powerpc_debugfs_root, &flat_dt_blob);
- if (!d)
- return 1;
-
- return 0;
-}
-__initcall(export_flat_device_tree);
-#endif
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index fa16a91..2085d47 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -20,6 +20,7 @@
#include <linux/string.h>
#include <linux/errno.h>
#include <linux/slab.h>
+#include <linux/debugfs.h>
#include <asm/setup.h> /* for COMMAND_LINE_SIZE */
#ifdef CONFIG_PPC
@@ -1084,4 +1085,27 @@ void __init unflatten_and_copy_device_tree(void)
unflatten_device_tree();
}
+#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
+static struct debugfs_blob_wrapper flat_dt_blob;
+
+static int __init of_flat_dt_debugfs_export_fdt(void)
+{
+ struct dentry *d = debugfs_create_dir("device-tree", NULL);
+
+ if (!d)
+ return -ENOENT;
+
+ flat_dt_blob.data = initial_boot_params;
+ flat_dt_blob.size = fdt_totalsize(initial_boot_params);
+
+ d = debugfs_create_blob("flat-device-tree", S_IFREG | S_IRUSR,
+ d, &flat_dt_blob);
+ if (!d)
+ return -ENOENT;
+
+ return 0;
+}
+module_init(of_flat_dt_debugfs_export_fdt);
+#endif
+
#endif /* CONFIG_OF_EARLY_FLATTREE */
--
1.8.3.2
^ permalink raw reply related
* [PATCH 00/20] FDT clean-ups and libfdt support
From: Rob Herring @ 2014-04-03 22:16 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mips, Aurelien Jacquiot, H. Peter Anvin, Max Filippov,
Paul Mackerras, linux, Jonas Bonn, Rob Herring, Russell King,
linux-c6x-dev, x86, Ingo Molnar, Mark Salter, Grant Likely,
linux-xtensa, James Hogan, Thomas Gleixner, linux-metag,
linux-arm-kernel, Chris Zankel, Michal Simek, Vineet Gupta,
Ralf Baechle, linuxppc-dev
From: Rob Herring <robh@kernel.org>
This is a series of clean-ups of architecture FDT code and converts the
core FDT code over to using libfdt functions. This is in preparation
to add FDT based address translation parsing functions for early
console support.
The current MIPS lantiq and xlp DT code is buggy as built-in DTBs need
to be copied out of init section. Patches 2 and 3 should be applied to
3.15.
A branch is available here[1]. Please test! I've compiled on arm,
arm64, mips, microblaze, and powerpc and booted on arm and arm64.
Rob
[1] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git libfdt
Rob Herring (20):
mips: octeon: convert to use unflatten_and_copy_device_tree
mips: lantiq: copy built-in DTB out of init section
mips: xlp: copy built-in DTB out of init section
mips: ralink: convert to use unflatten_and_copy_device_tree
ARM: dt: use default early_init_dt_alloc_memory_arch
c6x: convert fdt pointers to opaque pointers
mips: convert fdt pointers to opaque pointers
of/fdt: consolidate built-in dtb section variables
of/fdt: create common debugfs
of/fdt: remove some unneeded includes
of/fdt: remove unused of_scan_flat_dt_by_path
of/fdt: update of_get_flat_dt_prop in prep for libfdt
of/fdt: Convert FDT functions to use libfdt
of/fdt: use libfdt accessors for header data
of/fdt: move memreserve and dtb memory reservations into core
build: add libfdt include path globally
powerpc: use libfdt accessors for header data
x86: use libfdt accessors for header data
of/fdt: convert initial_boot_params to opaque pointer
of: push struct boot_param_header and defines into powerpc
Makefile | 5 +
arch/arc/include/asm/sections.h | 1 -
arch/arc/kernel/devtree.c | 2 +-
arch/arm/include/asm/prom.h | 2 -
arch/arm/kernel/devtree.c | 34 +--
arch/arm/mach-exynos/exynos.c | 2 +-
arch/arm/mach-vexpress/platsmp.c | 2 +-
arch/arm/mm/init.c | 1 -
arch/arm/plat-samsung/s5p-dev-mfc.c | 4 +-
arch/arm64/mm/init.c | 21 --
arch/c6x/kernel/setup.c | 4 +-
arch/metag/kernel/setup.c | 4 -
arch/microblaze/kernel/prom.c | 37 +--
arch/mips/cavium-octeon/Makefile | 3 -
arch/mips/cavium-octeon/setup.c | 18 +-
arch/mips/include/asm/mips-boards/generic.h | 4 -
arch/mips/include/asm/prom.h | 6 +-
arch/mips/kernel/prom.c | 2 +-
arch/mips/lantiq/prom.c | 15 +-
arch/mips/lantiq/prom.h | 2 -
arch/mips/mti-sead3/Makefile | 2 -
arch/mips/mti-sead3/sead3-setup.c | 8 +-
arch/mips/netlogic/xlp/dt.c | 17 +-
arch/mips/ralink/of.c | 29 +--
arch/openrisc/kernel/vmlinux.h | 2 -
arch/powerpc/include/asm/prom.h | 39 +++
arch/powerpc/kernel/epapr_paravirt.c | 2 +-
arch/powerpc/kernel/fadump.c | 4 +-
arch/powerpc/kernel/prom.c | 77 ++----
arch/powerpc/kernel/rtas.c | 2 +-
arch/powerpc/mm/hash_utils_64.c | 22 +-
arch/powerpc/platforms/52xx/efika.c | 4 +-
arch/powerpc/platforms/chrp/setup.c | 4 +-
arch/powerpc/platforms/powernv/opal.c | 12 +-
arch/powerpc/platforms/pseries/setup.c | 4 +-
arch/x86/kernel/devicetree.c | 7 +-
arch/xtensa/kernel/setup.c | 3 +-
drivers/of/Kconfig | 1 +
drivers/of/fdt.c | 379 +++++++++-------------------
drivers/of/of_reserved_mem.c | 4 +-
include/linux/of_fdt.h | 64 +----
lib/Makefile | 2 -
42 files changed, 261 insertions(+), 596 deletions(-)
--
1.8.3.2
^ permalink raw reply
* Re: Bug in reclaim logic with exhausted nodes?
From: Christoph Lameter @ 2014-04-03 16:41 UTC (permalink / raw)
To: Nishanth Aravamudan; +Cc: linux-mm, mgorman, linuxppc-dev, anton, rientjes
In-Reply-To: <20140401013346.GD5144@linux.vnet.ibm.com>
On Mon, 31 Mar 2014, Nishanth Aravamudan wrote:
> Yep. The node exists, it's just fully exhausted at boot (due to the
> presence of 16GB pages reserved at boot-time).
Well if you want us to support that then I guess you need to propose
patches to address this issue.
> I'd appreciate a bit more guidance? I'm suggesting that in this case the
> node functionally has no memory. So the page allocator should not allow
> allocations from it -- except (I need to investigate this still)
> userspace accessing the 16GB pages on that node, but that, I believe,
> doesn't go through the page allocator at all, it's all from hugetlb
> interfaces. It seems to me there is a bug in SLUB that we are noting
> that we have a useless per-node structure for a given nid, but not
> actually preventing requests to that node or reclaim because of those
> allocations.
Well if you can address that without impacting the fastpath then we could
do this. Otherwise we would need a fake structure here to avoid adding
checks to the fastpath
> I think there is a logical bug (even if it only occurs in this
> particular corner case) where if reclaim progresses for a THISNODE
> allocation, we don't check *where* the reclaim is progressing, and thus
> may falsely be indicating that we have done some progress when in fact
> the allocation that is causing reclaim will not possibly make any more
> progress.
Ok maybe we could address this corner case. How would you do this?
^ permalink raw reply
* Re: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages are not supported
From: Aneesh Kumar K.V @ 2014-04-03 16:19 UTC (permalink / raw)
To: Nishanth Aravamudan, linux-mm; +Cc: linuxppc-dev, paulus, anton, nyc
In-Reply-To: <20140326155815.GB15234@linux.vnet.ibm.com>
Nishanth Aravamudan <nacc@linux.vnet.ibm.com> writes:
> On 24.03.2014 [16:02:56 -0700], Nishanth Aravamudan wrote:
>> In KVM guests on Power, if the guest is not backed by hugepages, we see
>> the following in the guest:
>>
>> AnonHugePages: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 64 kB
>>
>> This seems like a configuration issue -- why is a hstate of 64k being
>> registered?
>>
>> I did some debugging and found that the following does trigger,
>> mm/hugetlb.c::hugetlb_init():
>>
>> /* Some platform decide whether they support huge pages at boot
>> * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
>> * there is no such support
>> */
>> if (HPAGE_SHIFT == 0)
>> return 0;
>>
>> That check is only during init-time. So we don't support hugepages, but
>> none of the hugetlb APIs actually check this condition (HPAGE_SHIFT ==
>> 0), so /proc/meminfo above falsely indicates there is a valid hstate (at
>> least one). But note that there is no /sys/kernel/mm/hugepages meaning
>> no hstate was actually registered.
>>
>> Further, it turns out that huge_page_order(default_hstate) is 0, so
>> hugetlb_report_meminfo is doing:
>>
>> 1UL << (huge_page_order(h) + PAGE_SHIFT - 10)
>>
>> which ends up just doing 1 << (PAGE_SHIFT - 10) and since the base page
>> size is 64k, we report a hugepage size of 64k... And allow the user to
>> allocate hugepages via the sysctl, etc.
>>
>> What's the right thing to do here?
>>
>> 1) Should we add checks for HPAGE_SHIFT == 0 to all the hugetlb APIs? It
>> seems like HPAGE_SHIFT == 0 should be the equivalent, functionally, of
>> the config options being off. This seems like a lot of overhead, though,
>> to put everywhere, so maybe I can do it in an arch-specific macro, that
>> in asm-generic defaults to 0 (and so will hopefully be compiled out?).
>>
>> 2) What should hugetlbfs do when HPAGE_SHIFT == 0? Should it be
>> mountable? Obviously if it's mountable, we can't great files there
>> (since the fs will report insufficient space). [1]
>
> Here is my solution to this. Comments appreciated!
>
> In KVM guests on Power, in a guest not backed by hugepages, we see the
> following:
>
> AnonHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 64 kB
>
> HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
> are not supported at boot-time, but this is only checked in
> hugetlb_init(). Extract the check to a helper function, and use it in a
> few relevant places.
>
> This does make hugetlbfs not supported in this environment. I believe
> this is fine, as there are no valid hugepages and that won't change at
> runtime.
>
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Looks good. Can you resubmit it as a proper patch ? You may also want to
capture in commit message saying hugetlbfs file system also will not be
registered.
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index d19b30a..c7aa477 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1017,6 +1017,11 @@ static int __init init_hugetlbfs_fs(void)
> int error;
> int i;
>
> + if (!hugepages_supported()) {
> + printk(KERN_ERR "hugetlbfs: Disabling because there are no supported page sizes\n");
> + return -ENOTSUPP;
> + }
> +
> error = bdi_init(&hugetlbfs_backing_dev_info);
> if (error)
> return error;
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 8c43cc4..0aea8de 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -450,4 +450,14 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
> return ptl;
> }
>
> +static inline bool hugepages_supported(void)
> +{
> + /*
> + * Some platform decide whether they support huge pages at boot
> + * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> + * there is no such support
> + */
> + return HPAGE_SHIFT != 0;
> +}
> +
> #endif /* _LINUX_HUGETLB_H */
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c01cb9f..1c99585 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1949,11 +1949,7 @@ module_exit(hugetlb_exit);
>
> static int __init hugetlb_init(void)
> {
> - /* Some platform decide whether they support huge pages at boot
> - * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
> - * there is no such support
> - */
> - if (HPAGE_SHIFT == 0)
> + if (!hugepages_supported())
> return 0;
>
> if (!size_to_hstate(default_hstate_size)) {
> @@ -2069,6 +2065,9 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->max_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2122,6 +2121,9 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
> unsigned long tmp;
> int ret;
>
> + if (!hugepages_supported())
> + return -ENOTSUPP;
> +
> tmp = h->nr_overcommit_huge_pages;
>
> if (write && h->order >= MAX_ORDER)
> @@ -2147,6 +2149,8 @@ out:
> void hugetlb_report_meminfo(struct seq_file *m)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return;
> seq_printf(m,
> "HugePages_Total: %5lu\n"
> "HugePages_Free: %5lu\n"
> @@ -2163,6 +2167,8 @@ void hugetlb_report_meminfo(struct seq_file *m)
> int hugetlb_report_node_meminfo(int nid, char *buf)
> {
> struct hstate *h = &default_hstate;
> + if (!hugepages_supported())
> + return 0;
> return sprintf(buf,
> "Node %d HugePages_Total: %5u\n"
> "Node %d HugePages_Free: %5u\n"
> @@ -2177,6 +2183,9 @@ void hugetlb_show_meminfo(void)
> struct hstate *h;
> int nid;
>
> + if (!hugepages_supported())
> + return;
> +
> for_each_node_state(nid, N_MEMORY)
> for_each_hstate(h)
> pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
^ permalink raw reply
* cscope: issue with symlinks in tools/testing/selftests/powerpc/copyloops/
From: Yann Droneaud @ 2014-04-03 13:16 UTC (permalink / raw)
To: Michael Ellerman, Anton Blanchard, Benjamin Herrenschmidt,
Hans-Bernhard Bröker, Hans-Bernhard Broeker, Neil Horman,
Neil Horman
Cc: cscope-devel, linuxppc-dev, linux-kernel
Hi,
I'm using cscope to browse kernel sources, but I'm facing warnings from
the tool since following commit:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=22d651dcef536c75f75537290bf3da5038e68b6b
commit 22d651dcef536c75f75537290bf3da5038e68b6b
Author: Michael Ellerman <mpe@ellerman.id.au>
Date: Tue Jan 21 15:22:17 2014 +1100
selftests/powerpc: Import Anton's memcpy / copy_tofrom_user tests
Turn Anton's memcpy / copy_tofrom_user test into something that can
live in tools/testing/selftests.
It requires one turd in arch/powerpc/lib/memcpy_64.S, but it's
pretty harmless IMHO.
We are sailing very close to the wind with the feature macros. We
define them to nothing, which currently means we get a few extra
nops and include the unaligned calls.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
cscope reports error when generating the cross-reference database:
$ make ALLSOURCE_ARCHS=all O=./obj-cscope/ cscope
GEN cscope
cscope: cannot find
file /home/ydroneaud/src/linux/tools/testing/selftests/powerpc/copyloops/copyuser_power7.S
cscope: cannot find
file /home/ydroneaud/src/linux/tools/testing/selftests/powerpc/copyloops/memcpy_64.S
cscope: cannot find
file /home/ydroneaud/src/linux/tools/testing/selftests/powerpc/copyloops/memcpy_power7.S
cscope: cannot find
file /home/ydroneaud/src/linux/tools/testing/selftests/powerpc/copyloops/copyuser_64.S
And when calling cscope from ./obj-cscope/ directory, it reports errors
too.
Hopefully it doesn't stop it from working, so I'm still able to use
cscope to browse kernel sources.
It's a rather uncommon side effect of having (for the first time ?)
sources files as symlinks: looking for symlinks in the kernel sources
returns only:
$ find . -type l
./arch/mips/boot/dts/include/dt-bindings
./arch/microblaze/boot/dts/system.dts
./arch/powerpc/boot/dts/include/dt-bindings
./arch/metag/boot/dts/include/dt-bindings
./arch/arm/boot/dts/include/dt-bindings
./tools/testing/selftests/powerpc/copyloops/copyuser_power7.S
./tools/testing/selftests/powerpc/copyloops/memcpy_64.S
./tools/testing/selftests/powerpc/copyloops/memcpy_power7.S
./tools/testing/selftests/powerpc/copyloops/copyuser_64.S
./obj-cscope/source
./Documentation/DocBook/vidioc-g-sliced-vbi-cap.xml
./Documentation/DocBook/vidioc-decoder-cmd.xml
...
./Documentation/DocBook/media-func-ioctl.xml
./Documentation/DocBook/vidioc-enumoutput.xml
So one can wonder if having symlinked sources files is an expected
supported feature for kbuild and all the various kernel
tools/infrastructure ?
Regarding cscope specifically, it does not support symlink, and it's the
expected behavior according to the bug reports I was able to find:
#214 cscope ignores symlinks to files
http://sourceforge.net/p/cscope/bugs/214/
#229 -I options doesn't handle symbolic link
http://sourceforge.net/p/cscope/bugs/229/
#247 cscope: cannot find file
http://sourceforge.net/p/cscope/bugs/247/
#252 cscope: cannot find file ***
http://sourceforge.net/p/cscope/bugs/252/
#261 Regression - version 15.7a does not follow symbolic links
http://sourceforge.net/p/cscope/bugs/261/
Regards.
--
Yann Droneaud
OPTEYA
^ permalink raw reply
* Re: [PATCH v2] powernv: kvm: make _PAGE_NUMA take effect
From: Alexander Graf @ 2014-04-03 11:43 UTC (permalink / raw)
To: Liu ping fan, Alexander Graf
Cc: kvm-devel, kvm-ppc, Paul Mackerras, Aneesh Kumar K.V,
linuxppc-dev
In-Reply-To: <533D4834.7050306@suse.com>
On 03.04.14 13:38, Alexander Graf wrote:
>
> On 03.04.14 13:36, Alexander Graf wrote:
>>
>> On 03.04.14 04:36, Liu ping fan wrote:
>>> Hi Alex, could you help to pick up this patch? since v3.14 kernel can
>>> enable numa fault for powerpc.
>>
>> What bad happens without this patch? We map a page even though it was
>> declared to get NUMA migrated? What happens next?
>>
>> I'm trying to figure out whether I need to mark this with a stable
>> tag for 3.14.
>
> Also, what about the other uses of pre_present()?
Oh how I love to reply to myself. The patch doesn't apply to the latest
code anymore either. Also please rework the wording of your patch
subject and patch description to explain the actual problem at hand.
Alex
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox