* [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support
@ 2016-05-20 10:54 Sricharan R
2016-05-20 10:54 ` [PATCH V5 1/7] iommu/msm: Add DT adaptation Sricharan R
` (7 more replies)
0 siblings, 8 replies; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, joro-zLv9SwRftAIdnm+yROfE0A,
robdclark-Re5JQEeQqe8AvxtiuMwx3w,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ, architt-sgV2jX0FEOL9JmXXK+q4OQ,
arnd-r2nGTMty4D4
Cc: sricharan-sgV2jX0FEOL9JmXXK+q4OQ
The msm_iommu.c driver currently works based on platform data.
A single master device can be connected to more than one iommu, and to multiple
contexts in each of those iommus. This association between masters and iommus
was represented in platform data using parent/child devices. The master drivers
were responsible for attaching all of the iommus/contexts to a domain. Now the
platform data support is removed and DT support is added. The masters/iommus are
described through the generic iommu bindings.
This is essentially a rework of the patch posted earlier by
Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>. This series folds the changes into the
existing driver, with the addition of the generic bindings.
http://www.spinics.net/lists/linux-arm-msm/msg10077.html
Tested this series on an ifc6410 board.
[V5] Changed the compatible binding name as per comments, added comments
for usage of barriers in patch 6.
[V4] Addressed comments for making the iommu compatible binding more soc
specific and updated the documentation for the iommu clocks.
[V3] Addressed comments: corrected the usage
of the #iommu-cells binding, improved the flush_iotlb_range function,
added a new patch to use writel_relaxed for register access, and split
up the documentation patch.
[V2] Adapted the driver to use generic ARMV7S short descriptor pagetable ops
and addressed comments.
[V1]
https://lists.linuxfoundation.org/pipermail/iommu/2015-August/014074.html
Sricharan R (7):
iommu/msm: Add DT adaptation
documentation: iommu: Add bindings for msm,iommu-v0 ip
iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.c
iommu/msm: Add support for generic master bindings
iommu/msm: use generic ARMV7S short descriptor pagetable ops
iommu/msm: Use writel_relaxed and add a barrier
iommu/msm: Remove driver BROKEN
.../devicetree/bindings/iommu/msm,iommu-v0.txt | 64 ++
drivers/iommu/Kconfig | 2 +-
drivers/iommu/Makefile | 2 +-
drivers/iommu/msm_iommu.c | 885 +++++++++++----------
drivers/iommu/msm_iommu.h | 73 +-
drivers/iommu/msm_iommu_dev.c | 381 ---------
drivers/iommu/msm_iommu_hw-8xxx.h | 109 +--
7 files changed, 636 insertions(+), 880 deletions(-)
create mode 100644 Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
delete mode 100644 drivers/iommu/msm_iommu_dev.c
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
* [PATCH V5 1/7] iommu/msm: Add DT adaptation
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
@ 2016-05-20 10:54 ` Sricharan R
2016-05-20 10:54 ` [PATCH V5 2/7] documentation: iommu: Add bindings for msm,iommu-v0 ip Sricharan R
` (6 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree, linux-arm-msm, joro, robdclark, iommu,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm, architt, arnd
Cc: sricharan
The driver currently works based on platform data. Remove this
and add support for DT. A single master can have multiple ports
connected to more than one iommu.
                  master
                    |
                    |
                    |
         ------------------------
         |                      |
       IOMMU0                 IOMMU1
       |    |                 |    |
     ctx0  ctx1             ctx0  ctx1
This association of master and iommus/contexts was previously
represented by platform data parent/child device details. The client
drivers were responsible for programming all of the iommus/contexts
for the device. Now, while adapting to the generic DT bindings, we maintain
the list of iommus and contexts that each master domain is connected to,
and program all of them on attach/detach.
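Patch 1 replaces the fixed platform-data context numbering with a small bitmap
allocator (msm_iommu_alloc_ctx()/msm_iommu_free_ctx() in the diff below). A
simplified user-space C sketch of the same scheme follows; it uses plain bit
arithmetic in place of the kernel's find_next_zero_bit()/test_and_set_bit()
helpers, so unlike the original it is not safe against concurrent allocators:

```c
#include <assert.h>
#include <limits.h>

#define IOMMU_MAX_CBS	128
#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)

static unsigned long context_map[IOMMU_MAX_CBS / BITS_PER_ULONG];

/* Find and claim the first free context bank in [start, end); -1 if full
 * (the kernel version returns -ENOSPC instead). */
static int alloc_ctx(unsigned long *map, int start, int end)
{
	int idx;

	for (idx = start; idx < end; idx++) {
		unsigned long bit = 1UL << (idx % BITS_PER_ULONG);

		if (!(map[idx / BITS_PER_ULONG] & bit)) {
			map[idx / BITS_PER_ULONG] |= bit;	/* claim it */
			return idx;
		}
	}
	return -1;
}

/* Release a previously claimed context bank. */
static void free_ctx(unsigned long *map, int idx)
{
	map[idx / BITS_PER_ULONG] &= ~(1UL << (idx % BITS_PER_ULONG));
}
```

On attach, each master on an IOMMU gets the next free bank this way; on detach
the banks go back into the map, mirroring the attach/detach paths in the patch.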
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
drivers/iommu/msm_iommu.c | 252 ++++++++++++++++++---------------
drivers/iommu/msm_iommu.h | 73 ++++------
drivers/iommu/msm_iommu_dev.c | 315 ++++++++++--------------------------------
3 files changed, 237 insertions(+), 403 deletions(-)
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index e321fa5..bc1a4e3 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -48,6 +48,7 @@ __asm__ __volatile__ ( \
static int msm_iommu_tex_class[4];
DEFINE_SPINLOCK(msm_iommu_lock);
+static LIST_HEAD(qcom_iommu_devices);
struct msm_priv {
unsigned long *pgtable;
@@ -60,35 +61,37 @@ static struct msm_priv *to_msm_priv(struct iommu_domain *dom)
return container_of(dom, struct msm_priv, domain);
}
-static int __enable_clocks(struct msm_iommu_drvdata *drvdata)
+static int __enable_clocks(struct msm_iommu_dev *iommu)
{
int ret;
- ret = clk_enable(drvdata->pclk);
+ ret = clk_enable(iommu->pclk);
if (ret)
goto fail;
- if (drvdata->clk) {
- ret = clk_enable(drvdata->clk);
+ if (iommu->clk) {
+ ret = clk_enable(iommu->clk);
if (ret)
- clk_disable(drvdata->pclk);
+ clk_disable(iommu->pclk);
}
fail:
return ret;
}
-static void __disable_clocks(struct msm_iommu_drvdata *drvdata)
+static void __disable_clocks(struct msm_iommu_dev *iommu)
{
- clk_disable(drvdata->clk);
- clk_disable(drvdata->pclk);
+ if (iommu->clk)
+ clk_disable(iommu->clk);
+ clk_disable(iommu->pclk);
}
static int __flush_iotlb(struct iommu_domain *domain)
{
struct msm_priv *priv = to_msm_priv(domain);
- struct msm_iommu_drvdata *iommu_drvdata;
- struct msm_iommu_ctx_drvdata *ctx_drvdata;
+ struct msm_iommu_dev *iommu = NULL;
+ struct msm_iommu_ctx_dev *master;
int ret = 0;
+
#ifndef CONFIG_IOMMU_PGTABLES_L2
unsigned long *fl_table = priv->pgtable;
int i;
@@ -105,24 +108,67 @@ static int __flush_iotlb(struct iommu_domain *domain)
}
#endif
- list_for_each_entry(ctx_drvdata, &priv->list_attached, attached_elm) {
-
- BUG_ON(!ctx_drvdata->pdev || !ctx_drvdata->pdev->dev.parent);
-
- iommu_drvdata = dev_get_drvdata(ctx_drvdata->pdev->dev.parent);
- BUG_ON(!iommu_drvdata);
-
- ret = __enable_clocks(iommu_drvdata);
+ list_for_each_entry(iommu, &priv->list_attached, dom_node) {
+ ret = __enable_clocks(iommu);
if (ret)
goto fail;
- SET_CTX_TLBIALL(iommu_drvdata->base, ctx_drvdata->num, 0);
- __disable_clocks(iommu_drvdata);
+ list_for_each_entry(master, &iommu->ctx_list, list)
+ SET_CTX_TLBIALL(iommu->base, master->num, 0);
+
+ __disable_clocks(iommu);
}
fail:
return ret;
}
+static int msm_iommu_alloc_ctx(unsigned long *map, int start, int end)
+{
+ int idx;
+
+ do {
+ idx = find_next_zero_bit(map, end, start);
+ if (idx == end)
+ return -ENOSPC;
+ } while (test_and_set_bit(idx, map));
+
+ return idx;
+}
+
+static void msm_iommu_free_ctx(unsigned long *map, int idx)
+{
+ clear_bit(idx, map);
+}
+
+static void config_mids(struct msm_iommu_dev *iommu,
+ struct msm_iommu_ctx_dev *master)
+{
+ int mid, ctx, i;
+
+ for (i = 0; i < master->num_mids; i++) {
+ mid = master->mids[i];
+ ctx = master->num;
+
+ SET_M2VCBR_N(iommu->base, mid, 0);
+ SET_CBACR_N(iommu->base, ctx, 0);
+
+ /* Set VMID = 0 */
+ SET_VMID(iommu->base, mid, 0);
+
+ /* Set the context number for that MID to this context */
+ SET_CBNDX(iommu->base, mid, ctx);
+
+ /* Set MID associated with this context bank to 0*/
+ SET_CBVMID(iommu->base, ctx, 0);
+
+ /* Set the ASID for TLB tagging for this context */
+ SET_CONTEXTIDR_ASID(iommu->base, ctx, ctx);
+
+ /* Set security bit override to be Non-secure */
+ SET_NSCFG(iommu->base, mid, 3);
+ }
+}
+
static void __reset_context(void __iomem *base, int ctx)
{
SET_BPRCOSH(base, ctx, 0);
@@ -272,94 +318,76 @@ static void msm_iommu_domain_free(struct iommu_domain *domain)
static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
{
- struct msm_priv *priv;
- struct msm_iommu_ctx_dev *ctx_dev;
- struct msm_iommu_drvdata *iommu_drvdata;
- struct msm_iommu_ctx_drvdata *ctx_drvdata;
- struct msm_iommu_ctx_drvdata *tmp_drvdata;
int ret = 0;
unsigned long flags;
+ struct msm_iommu_dev *iommu;
+ struct msm_priv *priv = to_msm_priv(domain);
+ struct msm_iommu_ctx_dev *master;
spin_lock_irqsave(&msm_iommu_lock, flags);
-
- priv = to_msm_priv(domain);
-
- if (!dev) {
- ret = -EINVAL;
- goto fail;
- }
-
- iommu_drvdata = dev_get_drvdata(dev->parent);
- ctx_drvdata = dev_get_drvdata(dev);
- ctx_dev = dev->platform_data;
-
- if (!iommu_drvdata || !ctx_drvdata || !ctx_dev) {
- ret = -EINVAL;
- goto fail;
- }
-
- if (!list_empty(&ctx_drvdata->attached_elm)) {
- ret = -EBUSY;
- goto fail;
- }
-
- list_for_each_entry(tmp_drvdata, &priv->list_attached, attached_elm)
- if (tmp_drvdata == ctx_drvdata) {
- ret = -EBUSY;
- goto fail;
+ list_for_each_entry(iommu, &qcom_iommu_devices, dev_node) {
+ master = list_first_entry(&iommu->ctx_list,
+ struct msm_iommu_ctx_dev,
+ list);
+ if (master->of_node == dev->of_node) {
+ ret = __enable_clocks(iommu);
+ if (ret)
+ goto fail;
+
+ list_for_each_entry(master, &iommu->ctx_list, list) {
+ if (master->num) {
+ dev_err(dev, "domain already attached");
+ ret = -EEXIST;
+ goto fail;
+ }
+ master->num =
+ msm_iommu_alloc_ctx(iommu->context_map,
+ 0, iommu->ncb);
+ if (IS_ERR_VALUE(master->num)) {
+ ret = -ENODEV;
+ goto fail;
+ }
+ config_mids(iommu, master);
+ __program_context(iommu->base, master->num,
+ __pa(priv->pgtable));
+ }
+ __disable_clocks(iommu);
+ list_add(&iommu->dom_node, &priv->list_attached);
}
+ }
- ret = __enable_clocks(iommu_drvdata);
- if (ret)
- goto fail;
-
- __program_context(iommu_drvdata->base, ctx_dev->num,
- __pa(priv->pgtable));
-
- __disable_clocks(iommu_drvdata);
- list_add(&(ctx_drvdata->attached_elm), &priv->list_attached);
ret = __flush_iotlb(domain);
-
fail:
spin_unlock_irqrestore(&msm_iommu_lock, flags);
+
return ret;
}
static void msm_iommu_detach_dev(struct iommu_domain *domain,
struct device *dev)
{
- struct msm_priv *priv;
- struct msm_iommu_ctx_dev *ctx_dev;
- struct msm_iommu_drvdata *iommu_drvdata;
- struct msm_iommu_ctx_drvdata *ctx_drvdata;
+ struct msm_priv *priv = to_msm_priv(domain);
unsigned long flags;
+ struct msm_iommu_dev *iommu;
+ struct msm_iommu_ctx_dev *master;
int ret;
spin_lock_irqsave(&msm_iommu_lock, flags);
- priv = to_msm_priv(domain);
-
- if (!dev)
- goto fail;
-
- iommu_drvdata = dev_get_drvdata(dev->parent);
- ctx_drvdata = dev_get_drvdata(dev);
- ctx_dev = dev->platform_data;
-
- if (!iommu_drvdata || !ctx_drvdata || !ctx_dev)
- goto fail;
-
ret = __flush_iotlb(domain);
if (ret)
goto fail;
- ret = __enable_clocks(iommu_drvdata);
- if (ret)
- goto fail;
-
- __reset_context(iommu_drvdata->base, ctx_dev->num);
- __disable_clocks(iommu_drvdata);
- list_del_init(&ctx_drvdata->attached_elm);
+ list_for_each_entry(iommu, &priv->list_attached, dom_node) {
+ ret = __enable_clocks(iommu);
+ if (ret)
+ goto fail;
+ list_for_each_entry(master, &iommu->ctx_list, list) {
+ msm_iommu_free_ctx(iommu->context_map, master->num);
+ __reset_context(iommu->base, master->num);
+ }
+ __disable_clocks(iommu);
+ }
fail:
spin_unlock_irqrestore(&msm_iommu_lock, flags);
}
@@ -555,47 +583,46 @@ static phys_addr_t msm_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t va)
{
struct msm_priv *priv;
- struct msm_iommu_drvdata *iommu_drvdata;
- struct msm_iommu_ctx_drvdata *ctx_drvdata;
+ struct msm_iommu_dev *iommu;
+ struct msm_iommu_ctx_dev *master;
unsigned int par;
unsigned long flags;
- void __iomem *base;
phys_addr_t ret = 0;
- int ctx;
spin_lock_irqsave(&msm_iommu_lock, flags);
priv = to_msm_priv(domain);
- if (list_empty(&priv->list_attached))
- goto fail;
+ iommu = list_first_entry(&priv->list_attached,
+ struct msm_iommu_dev, dom_node);
- ctx_drvdata = list_entry(priv->list_attached.next,
- struct msm_iommu_ctx_drvdata, attached_elm);
- iommu_drvdata = dev_get_drvdata(ctx_drvdata->pdev->dev.parent);
+ if (list_empty(&iommu->ctx_list))
+ goto fail;
- base = iommu_drvdata->base;
- ctx = ctx_drvdata->num;
+ master = list_first_entry(&iommu->ctx_list,
+ struct msm_iommu_ctx_dev, list);
+ if (!master)
+ goto fail;
- ret = __enable_clocks(iommu_drvdata);
+ ret = __enable_clocks(iommu);
if (ret)
goto fail;
/* Invalidate context TLB */
- SET_CTX_TLBIALL(base, ctx, 0);
- SET_V2PPR(base, ctx, va & V2Pxx_VA);
+ SET_CTX_TLBIALL(iommu->base, master->num, 0);
+ SET_V2PPR(iommu->base, master->num, va & V2Pxx_VA);
- par = GET_PAR(base, ctx);
+ par = GET_PAR(iommu->base, master->num);
/* We are dealing with a supersection */
- if (GET_NOFAULT_SS(base, ctx))
+ if (GET_NOFAULT_SS(iommu->base, master->num))
ret = (par & 0xFF000000) | (va & 0x00FFFFFF);
else /* Upper 20 bits from PAR, lower 12 from VA */
ret = (par & 0xFFFFF000) | (va & 0x00000FFF);
- if (GET_FAULT(base, ctx))
+ if (GET_FAULT(iommu->base, master->num))
ret = 0;
- __disable_clocks(iommu_drvdata);
+ __disable_clocks(iommu);
fail:
spin_unlock_irqrestore(&msm_iommu_lock, flags);
return ret;
@@ -635,37 +662,34 @@ static void print_ctx_regs(void __iomem *base, int ctx)
irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id)
{
- struct msm_iommu_drvdata *drvdata = dev_id;
- void __iomem *base;
+ struct msm_iommu_dev *iommu = dev_id;
unsigned int fsr;
int i, ret;
spin_lock(&msm_iommu_lock);
- if (!drvdata) {
+ if (!iommu) {
pr_err("Invalid device ID in context interrupt handler\n");
goto fail;
}
- base = drvdata->base;
-
pr_err("Unexpected IOMMU page fault!\n");
- pr_err("base = %08x\n", (unsigned int) base);
+ pr_err("base = %08x\n", (unsigned int)iommu->base);
- ret = __enable_clocks(drvdata);
+ ret = __enable_clocks(iommu);
if (ret)
goto fail;
- for (i = 0; i < drvdata->ncb; i++) {
- fsr = GET_FSR(base, i);
+ for (i = 0; i < iommu->ncb; i++) {
+ fsr = GET_FSR(iommu->base, i);
if (fsr) {
pr_err("Fault occurred in context %d.\n", i);
pr_err("Interesting registers:\n");
- print_ctx_regs(base, i);
- SET_FSR(base, i, 0x4000000F);
+ print_ctx_regs(iommu->base, i);
+ SET_FSR(iommu->base, i, 0x4000000F);
}
}
- __disable_clocks(drvdata);
+ __disable_clocks(iommu);
fail:
spin_unlock(&msm_iommu_lock);
return 0;
diff --git a/drivers/iommu/msm_iommu.h b/drivers/iommu/msm_iommu.h
index 5c7c955..4ca25d5 100644
--- a/drivers/iommu/msm_iommu.h
+++ b/drivers/iommu/msm_iommu.h
@@ -42,74 +42,53 @@
*/
#define MAX_NUM_MIDS 32
+/* Maximum number of context banks that can be present in IOMMU */
+#define IOMMU_MAX_CBS 128
+
/**
* struct msm_iommu_dev - a single IOMMU hardware instance
- * name Human-readable name given to this IOMMU HW instance
* ncb Number of context banks present on this IOMMU HW instance
+ * dev: IOMMU device
+ * irq: Interrupt number
+ * clk: The bus clock for this IOMMU hardware instance
+ * pclk: The clock for the IOMMU bus interconnect
+ * dev_node: list head in qcom_iommu_device_list
+ * dom_node: list head for domain
+ * ctx_list: list of 'struct msm_iommu_ctx_dev'
+ * context_map: Bitmap to track allocated context banks
*/
struct msm_iommu_dev {
- const char *name;
+ void __iomem *base;
int ncb;
+ struct device *dev;
+ int irq;
+ struct clk *clk;
+ struct clk *pclk;
+ struct list_head dev_node;
+ struct list_head dom_node;
+ struct list_head ctx_list;
+ DECLARE_BITMAP(context_map, IOMMU_MAX_CBS);
};
/**
* struct msm_iommu_ctx_dev - an IOMMU context bank instance
- * name Human-readable name given to this context bank
+ * of_node node ptr of client device
* num Index of this context bank within the hardware
* mids List of Machine IDs that are to be mapped into this context
* bank, terminated by -1. The MID is a set of signals on the
* AXI bus that identifies the function associated with a specific
* memory request. (See ARM spec).
+ * num_mids Total number of mids
+ * node list head in ctx_list
*/
struct msm_iommu_ctx_dev {
- const char *name;
+ struct device_node *of_node;
int num;
int mids[MAX_NUM_MIDS];
+ int num_mids;
+ struct list_head list;
};
-
-/**
- * struct msm_iommu_drvdata - A single IOMMU hardware instance
- * @base: IOMMU config port base address (VA)
- * @ncb The number of contexts on this IOMMU
- * @irq: Interrupt number
- * @clk: The bus clock for this IOMMU hardware instance
- * @pclk: The clock for the IOMMU bus interconnect
- *
- * A msm_iommu_drvdata holds the global driver data about a single piece
- * of an IOMMU hardware instance.
- */
-struct msm_iommu_drvdata {
- void __iomem *base;
- int irq;
- int ncb;
- struct clk *clk;
- struct clk *pclk;
-};
-
-/**
- * struct msm_iommu_ctx_drvdata - an IOMMU context bank instance
- * @num: Hardware context number of this context
- * @pdev: Platform device associated wit this HW instance
- * @attached_elm: List element for domains to track which devices are
- * attached to them
- *
- * A msm_iommu_ctx_drvdata holds the driver data for a single context bank
- * within each IOMMU hardware instance
- */
-struct msm_iommu_ctx_drvdata {
- int num;
- struct platform_device *pdev;
- struct list_head attached_elm;
-};
-
-/*
- * Look up an IOMMU context device by its context name. NULL if none found.
- * Useful for testing and drivers that do not yet fully have IOMMU stuff in
- * their platform devices.
- */
-struct device *msm_iommu_get_ctx(const char *ctx_name);
-
/*
* Interrupt handler for the IOMMU context fault interrupt. Hooking the
* interrupt is not supported in the API yet, but this will print an error
diff --git a/drivers/iommu/msm_iommu_dev.c b/drivers/iommu/msm_iommu_dev.c
index 4b09e81..be01cc4 100644
--- a/drivers/iommu/msm_iommu_dev.c
+++ b/drivers/iommu/msm_iommu_dev.c
@@ -30,60 +30,6 @@
#include "msm_iommu_hw-8xxx.h"
#include "msm_iommu.h"
-struct iommu_ctx_iter_data {
- /* input */
- const char *name;
-
- /* output */
- struct device *dev;
-};
-
-static struct platform_device *msm_iommu_root_dev;
-
-static int each_iommu_ctx(struct device *dev, void *data)
-{
- struct iommu_ctx_iter_data *res = data;
- struct msm_iommu_ctx_dev *c = dev->platform_data;
-
- if (!res || !c || !c->name || !res->name)
- return -EINVAL;
-
- if (!strcmp(res->name, c->name)) {
- res->dev = dev;
- return 1;
- }
- return 0;
-}
-
-static int each_iommu(struct device *dev, void *data)
-{
- return device_for_each_child(dev, data, each_iommu_ctx);
-}
-
-struct device *msm_iommu_get_ctx(const char *ctx_name)
-{
- struct iommu_ctx_iter_data r;
- int found;
-
- if (!msm_iommu_root_dev) {
- pr_err("No root IOMMU device.\n");
- goto fail;
- }
-
- r.name = ctx_name;
- found = device_for_each_child(&msm_iommu_root_dev->dev, &r, each_iommu);
-
- if (!found) {
- pr_err("Could not find context <%s>\n", ctx_name);
- goto fail;
- }
-
- return r.dev;
-fail:
- return NULL;
-}
-EXPORT_SYMBOL(msm_iommu_get_ctx);
-
static void msm_iommu_reset(void __iomem *base, int ncb)
{
int ctx;
@@ -128,237 +74,122 @@ static void msm_iommu_reset(void __iomem *base, int ncb)
static int msm_iommu_probe(struct platform_device *pdev)
{
struct resource *r;
- struct clk *iommu_clk;
- struct clk *iommu_pclk;
- struct msm_iommu_drvdata *drvdata;
- struct msm_iommu_dev *iommu_dev = dev_get_platdata(&pdev->dev);
- void __iomem *regs_base;
- int ret, irq, par;
+ struct msm_iommu_dev *iommu;
+ int ret, par, val;
- if (pdev->id == -1) {
- msm_iommu_root_dev = pdev;
- return 0;
- }
+ iommu = devm_kzalloc(&pdev->dev, sizeof(*iommu), GFP_KERNEL);
+ if (!iommu)
+ return -ENODEV;
- drvdata = kzalloc(sizeof(*drvdata), GFP_KERNEL);
+ iommu->dev = &pdev->dev;
+ INIT_LIST_HEAD(&iommu->ctx_list);
- if (!drvdata) {
- ret = -ENOMEM;
- goto fail;
+ iommu->pclk = devm_clk_get(iommu->dev, "smmu_pclk");
+ if (IS_ERR(iommu->pclk)) {
+ dev_err(iommu->dev, "could not get smmu_pclk\n");
+ return PTR_ERR(iommu->pclk);
}
- if (!iommu_dev) {
- ret = -ENODEV;
- goto fail;
+ ret = clk_prepare(iommu->pclk);
+ if (ret) {
+ dev_err(iommu->dev, "could not prepare smmu_pclk\n");
+ return ret;
}
- iommu_pclk = clk_get(NULL, "smmu_pclk");
- if (IS_ERR(iommu_pclk)) {
- ret = -ENODEV;
- goto fail;
+ iommu->clk = devm_clk_get(iommu->dev, "iommu_clk");
+ if (IS_ERR(iommu->clk)) {
+ dev_err(iommu->dev, "could not get iommu_clk\n");
+ clk_unprepare(iommu->pclk);
+ return PTR_ERR(iommu->clk);
}
- ret = clk_prepare_enable(iommu_pclk);
- if (ret)
- goto fail_enable;
-
- iommu_clk = clk_get(&pdev->dev, "iommu_clk");
-
- if (!IS_ERR(iommu_clk)) {
- if (clk_get_rate(iommu_clk) == 0)
- clk_set_rate(iommu_clk, 1);
-
- ret = clk_prepare_enable(iommu_clk);
- if (ret) {
- clk_put(iommu_clk);
- goto fail_pclk;
- }
- } else
- iommu_clk = NULL;
+ ret = clk_prepare(iommu->clk);
+ if (ret) {
+ dev_err(iommu->dev, "could not prepare iommu_clk\n");
+ clk_unprepare(iommu->pclk);
+ return ret;
+ }
- r = platform_get_resource_byname(pdev, IORESOURCE_MEM, "physbase");
- regs_base = devm_ioremap_resource(&pdev->dev, r);
- if (IS_ERR(regs_base)) {
- ret = PTR_ERR(regs_base);
- goto fail_clk;
+ r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ iommu->base = devm_ioremap_resource(iommu->dev, r);
+ if (IS_ERR(iommu->base)) {
+ dev_err(iommu->dev, "could not get iommu base\n");
+ ret = PTR_ERR(iommu->base);
+ goto fail;
}
- irq = platform_get_irq_byname(pdev, "secure_irq");
- if (irq < 0) {
+ iommu->irq = platform_get_irq(pdev, 0);
+ if (iommu->irq < 0) {
+ dev_err(iommu->dev, "could not get iommu irq\n");
ret = -ENODEV;
- goto fail_clk;
+ goto fail;
}
- msm_iommu_reset(regs_base, iommu_dev->ncb);
+ ret = of_property_read_u32(iommu->dev->of_node, "ncb", &val);
+ if (ret) {
+ dev_err(iommu->dev, "could not get ncb\n");
+ goto fail;
+ }
+ iommu->ncb = val;
- SET_M(regs_base, 0, 1);
- SET_PAR(regs_base, 0, 0);
- SET_V2PCFG(regs_base, 0, 1);
- SET_V2PPR(regs_base, 0, 0);
- par = GET_PAR(regs_base, 0);
- SET_V2PCFG(regs_base, 0, 0);
- SET_M(regs_base, 0, 0);
+ msm_iommu_reset(iommu->base, iommu->ncb);
+ SET_M(iommu->base, 0, 1);
+ SET_PAR(iommu->base, 0, 0);
+ SET_V2PCFG(iommu->base, 0, 1);
+ SET_V2PPR(iommu->base, 0, 0);
+ par = GET_PAR(iommu->base, 0);
+ SET_V2PCFG(iommu->base, 0, 0);
+ SET_M(iommu->base, 0, 0);
if (!par) {
- pr_err("%s: Invalid PAR value detected\n", iommu_dev->name);
+ pr_err("Invalid PAR value detected\n");
ret = -ENODEV;
- goto fail_clk;
+ goto fail;
}
- ret = request_irq(irq, msm_iommu_fault_handler, 0,
- "msm_iommu_secure_irpt_handler", drvdata);
+ ret = devm_request_threaded_irq(iommu->dev, iommu->irq, NULL,
+ msm_iommu_fault_handler,
+ IRQF_ONESHOT | IRQF_SHARED,
+ "msm_iommu_secure_irpt_handler",
+ iommu);
if (ret) {
- pr_err("Request IRQ %d failed with ret=%d\n", irq, ret);
- goto fail_clk;
+ pr_err("Request IRQ %d failed with ret=%d\n", iommu->irq, ret);
+ goto fail;
}
+ list_add(&iommu->dev_node, &qcom_iommu_devices);
- drvdata->pclk = iommu_pclk;
- drvdata->clk = iommu_clk;
- drvdata->base = regs_base;
- drvdata->irq = irq;
- drvdata->ncb = iommu_dev->ncb;
-
- pr_info("device %s mapped at %p, irq %d with %d ctx banks\n",
- iommu_dev->name, regs_base, irq, iommu_dev->ncb);
-
- platform_set_drvdata(pdev, drvdata);
-
- clk_disable(iommu_clk);
-
- clk_disable(iommu_pclk);
-
- return 0;
-fail_clk:
- if (iommu_clk) {
- clk_disable(iommu_clk);
- clk_put(iommu_clk);
- }
-fail_pclk:
- clk_disable_unprepare(iommu_pclk);
-fail_enable:
- clk_put(iommu_pclk);
+ pr_info("device mapped at %p, irq %d with %d ctx banks\n",
+ iommu->base, iommu->irq, iommu->ncb);
fail:
- kfree(drvdata);
+ clk_unprepare(iommu->clk);
+ clk_unprepare(iommu->pclk);
return ret;
}
+static const struct of_device_id msm_iommu_dt_match[] = {
+ { .compatible = "qcom,apq8064-iommu" },
+ {}
+};
+
static int msm_iommu_remove(struct platform_device *pdev)
{
- struct msm_iommu_drvdata *drv = NULL;
+ struct msm_iommu_dev *iommu = platform_get_drvdata(pdev);
- drv = platform_get_drvdata(pdev);
- if (drv) {
- if (drv->clk) {
- clk_unprepare(drv->clk);
- clk_put(drv->clk);
- }
- clk_unprepare(drv->pclk);
- clk_put(drv->pclk);
- memset(drv, 0, sizeof(*drv));
- kfree(drv);
- }
- return 0;
-}
-
-static int msm_iommu_ctx_probe(struct platform_device *pdev)
-{
- struct msm_iommu_ctx_dev *c = dev_get_platdata(&pdev->dev);
- struct msm_iommu_drvdata *drvdata;
- struct msm_iommu_ctx_drvdata *ctx_drvdata;
- int i, ret;
-
- if (!c || !pdev->dev.parent)
- return -EINVAL;
-
- drvdata = dev_get_drvdata(pdev->dev.parent);
- if (!drvdata)
- return -ENODEV;
-
- ctx_drvdata = kzalloc(sizeof(*ctx_drvdata), GFP_KERNEL);
- if (!ctx_drvdata)
- return -ENOMEM;
-
- ctx_drvdata->num = c->num;
- ctx_drvdata->pdev = pdev;
-
- INIT_LIST_HEAD(&ctx_drvdata->attached_elm);
- platform_set_drvdata(pdev, ctx_drvdata);
-
- ret = clk_prepare_enable(drvdata->pclk);
- if (ret)
- goto fail;
-
- if (drvdata->clk) {
- ret = clk_prepare_enable(drvdata->clk);
- if (ret) {
- clk_disable_unprepare(drvdata->pclk);
- goto fail;
- }
- }
-
- /* Program the M2V tables for this context */
- for (i = 0; i < MAX_NUM_MIDS; i++) {
- int mid = c->mids[i];
- if (mid == -1)
- break;
-
- SET_M2VCBR_N(drvdata->base, mid, 0);
- SET_CBACR_N(drvdata->base, c->num, 0);
-
- /* Set VMID = 0 */
- SET_VMID(drvdata->base, mid, 0);
-
- /* Set the context number for that MID to this context */
- SET_CBNDX(drvdata->base, mid, c->num);
-
- /* Set MID associated with this context bank to 0*/
- SET_CBVMID(drvdata->base, c->num, 0);
-
- /* Set the ASID for TLB tagging for this context */
- SET_CONTEXTIDR_ASID(drvdata->base, c->num, c->num);
-
- /* Set security bit override to be Non-secure */
- SET_NSCFG(drvdata->base, mid, 3);
- }
-
- clk_disable(drvdata->clk);
- clk_disable(drvdata->pclk);
-
- dev_info(&pdev->dev, "context %s using bank %d\n", c->name, c->num);
- return 0;
-fail:
- kfree(ctx_drvdata);
- return ret;
-}
-
-static int msm_iommu_ctx_remove(struct platform_device *pdev)
-{
- struct msm_iommu_ctx_drvdata *drv = NULL;
- drv = platform_get_drvdata(pdev);
- if (drv) {
- memset(drv, 0, sizeof(struct msm_iommu_ctx_drvdata));
- kfree(drv);
- }
+ clk_unprepare(iommu->clk);
+ clk_unprepare(iommu->pclk);
return 0;
}
static struct platform_driver msm_iommu_driver = {
.driver = {
.name = "msm_iommu",
+ .of_match_table = msm_iommu_dt_match,
},
.probe = msm_iommu_probe,
.remove = msm_iommu_remove,
};
-static struct platform_driver msm_iommu_ctx_driver = {
- .driver = {
- .name = "msm_iommu_ctx",
- },
- .probe = msm_iommu_ctx_probe,
- .remove = msm_iommu_ctx_remove,
-};
-
static struct platform_driver * const drivers[] = {
&msm_iommu_driver,
&msm_iommu_ctx_driver,
--
* [PATCH V5 2/7] documentation: iommu: Add bindings for msm,iommu-v0 ip
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
2016-05-20 10:54 ` [PATCH V5 1/7] iommu/msm: Add DT adaptation Sricharan R
@ 2016-05-20 10:54 ` Sricharan R
[not found] ` <1463741694-1735-3-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-05-20 10:54 ` [PATCH V5 4/7] iommu/msm: Add support for generic master bindings Sricharan R
` (5 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree, linux-arm-msm, joro, robdclark, iommu,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm, architt, arnd
Cc: sricharan
The MSM IOMMU is an implementation compatible with the ARM VMSA short
descriptor page tables. It provides address translation for bus masters outside
of the CPU, each connected to the IOMMU through a port called micro-TLB.
Add the DT bindings for this IP block.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
.../devicetree/bindings/iommu/msm,iommu-v0.txt | 64 ++++++++++++++++++++++
1 file changed, 64 insertions(+)
create mode 100644 Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
diff --git a/Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt b/Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
new file mode 100644
index 0000000..2023638
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
@@ -0,0 +1,64 @@
+* QCOM IOMMU
+
+The MSM IOMMU is an implementation compatible with the ARM VMSA short
+descriptor page tables. It provides address translation for bus masters outside
+of the CPU, each connected to the IOMMU through a port called micro-TLB.
+
+Required Properties:
+
+ - compatible: Must contain "qcom,apq8064-iommu".
+ - reg: Base address and size of the IOMMU registers.
+ - interrupts: Specifiers for the MMU fault interrupts. For instances that
+ support secure mode two interrupts must be specified, for non-secure and
+ secure mode, in that order. For instances that don't support secure mode a
+ single interrupt must be specified.
+ - #iommu-cells: The number of cells needed to specify the stream id. This
+ is always 1.
+ - qcom,ncb: The total number of context banks in the IOMMU.
+ - clocks : List of clocks to be used during SMMU register access. See
+ Documentation/devicetree/bindings/clock/clock-bindings.txt
+ for information about the format. For each clock specified
+ here, there must be a corresponding entry in clock-names
+ (see below).
+
+ - clock-names : List of clock names corresponding to the clocks specified in
+ the "clocks" property (above).
+ Should be "smmu_pclk" for specifying the interface clock
+ required for iommu's register accesses.
+ Should be "smmu_clk" for specifying the functional clock
+ required by iommu for bus accesses.
+
+Each bus master connected to an IOMMU must reference the IOMMU in its device
+node with the following property:
+
+ - iommus: A reference to the IOMMU in multiple cells. The first cell is a
+ phandle to the IOMMU and the second cell is the stream id.
+ A single master device can be connected to more than one iommu
+ and multiple contexts in each of the iommu. So multiple entries
+ are required to list all the iommus and the stream ids that the
+ master is connected to.
+
+Example: mdp iommu and its bus master
+
+ mdp_port0: iommu@7500000 {
+ compatible = "qcom,apq8064-iommu";
+ #iommu-cells = <1>;
+ clock-names =
+ "smmu_pclk",
+ "smmu_clk";
+ clocks =
+ <&mmcc SMMU_AHB_CLK>,
+ <&mmcc MDP_AXI_CLK>;
+ reg = <0x07500000 0x100000>;
+ interrupts =
+ <GIC_SPI 63 0>,
+ <GIC_SPI 64 0>;
+ qcom,ncb = <2>;
+ };
+
+ mdp: qcom,mdp@5100000 {
+ compatible = "qcom,mdp";
+ ...
+ iommus = <&mdp_port0 0
+ &mdp_port0 2>;
+ };
--
* [PATCH V5 3/7] iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.c
[not found] ` <1463741694-1735-1-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2016-05-20 10:54 ` Sricharan R
2016-05-23 8:10 ` [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Srinivas Kandagatla
1 sibling, 0 replies; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, joro-zLv9SwRftAIdnm+yROfE0A,
robdclark-Re5JQEeQqe8AvxtiuMwx3w,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ, architt-sgV2jX0FEOL9JmXXK+q4OQ,
arnd-r2nGTMty4D4
Cc: sricharan-sgV2jX0FEOL9JmXXK+q4OQ
There are only two functions left in msm_iommu_dev.c. Move them to
msm_iommu.c and delete the file.
Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
drivers/iommu/Makefile | 2 +-
drivers/iommu/msm_iommu.c | 182 ++++++++++++++++++++++++++++++++++++
drivers/iommu/msm_iommu_dev.c | 212 ------------------------------------------
3 files changed, 183 insertions(+), 213 deletions(-)
delete mode 100644 drivers/iommu/msm_iommu_dev.c
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index f7134f7..8695ead 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -7,7 +7,7 @@ obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
obj-$(CONFIG_IOMMU_IOVA) += iova.o
obj-$(CONFIG_OF_IOMMU) += of_iommu.o
-obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o msm_iommu_dev.o
+obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
obj-$(CONFIG_QCOM_IOMMU_V1) += qcom/
obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o
obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index bc1a4e3..792b352 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -27,6 +27,7 @@
#include <linux/slab.h>
#include <linux/iommu.h>
#include <linux/clk.h>
+#include <linux/err.h>
#include <asm/cacheflush.h>
#include <asm/sizes.h>
@@ -85,6 +86,47 @@ static void __disable_clocks(struct msm_iommu_dev *iommu)
clk_disable(iommu->pclk);
}
+static void msm_iommu_reset(void __iomem *base, int ncb)
+{
+ int ctx;
+
+ SET_RPUE(base, 0);
+ SET_RPUEIE(base, 0);
+ SET_ESRRESTORE(base, 0);
+ SET_TBE(base, 0);
+ SET_CR(base, 0);
+ SET_SPDMBE(base, 0);
+ SET_TESTBUSCR(base, 0);
+ SET_TLBRSW(base, 0);
+ SET_GLOBAL_TLBIALL(base, 0);
+ SET_RPU_ACR(base, 0);
+ SET_TLBLKCRWE(base, 1);
+
+ for (ctx = 0; ctx < ncb; ctx++) {
+ SET_BPRCOSH(base, ctx, 0);
+ SET_BPRCISH(base, ctx, 0);
+ SET_BPRCNSH(base, ctx, 0);
+ SET_BPSHCFG(base, ctx, 0);
+ SET_BPMTCFG(base, ctx, 0);
+ SET_ACTLR(base, ctx, 0);
+ SET_SCTLR(base, ctx, 0);
+ SET_FSRRESTORE(base, ctx, 0);
+ SET_TTBR0(base, ctx, 0);
+ SET_TTBR1(base, ctx, 0);
+ SET_TTBCR(base, ctx, 0);
+ SET_BFBCR(base, ctx, 0);
+ SET_PAR(base, ctx, 0);
+ SET_FAR(base, ctx, 0);
+ SET_CTX_TLBIALL(base, ctx, 0);
+ SET_TLBFLPTER(base, ctx, 0);
+ SET_TLBSLPTER(base, ctx, 0);
+ SET_TLBLKCR(base, ctx, 0);
+ SET_PRRR(base, ctx, 0);
+ SET_NMRR(base, ctx, 0);
+ SET_CONTEXTIDR(base, ctx, 0);
+ }
+}
+
static int __flush_iotlb(struct iommu_domain *domain)
{
struct msm_priv *priv = to_msm_priv(domain);
@@ -708,6 +750,146 @@ static const struct iommu_ops msm_iommu_ops = {
.pgsize_bitmap = MSM_IOMMU_PGSIZES,
};
+static int msm_iommu_probe(struct platform_device *pdev)
+{
+ struct resource *r;
+ struct msm_iommu_dev *iommu;
+ int ret, par, val;
+
+ iommu = devm_kzalloc(&pdev->dev, sizeof(*iommu), GFP_KERNEL);
+ if (!iommu)
+ return -ENODEV;
+
+ iommu->dev = &pdev->dev;
+ INIT_LIST_HEAD(&iommu->ctx_list);
+
+ iommu->pclk = devm_clk_get(iommu->dev, "smmu_pclk");
+ if (IS_ERR(iommu->pclk)) {
+ dev_err(iommu->dev, "could not get smmu_pclk\n");
+ return PTR_ERR(iommu->pclk);
+ }
+
+ ret = clk_prepare(iommu->pclk);
+ if (ret) {
+ dev_err(iommu->dev, "could not prepare smmu_pclk\n");
+ return ret;
+ }
+
+ iommu->clk = devm_clk_get(iommu->dev, "iommu_clk");
+ if (IS_ERR(iommu->clk)) {
+ dev_err(iommu->dev, "could not get iommu_clk\n");
+ clk_unprepare(iommu->pclk);
+ return PTR_ERR(iommu->clk);
+ }
+
+ ret = clk_prepare(iommu->clk);
+ if (ret) {
+ dev_err(iommu->dev, "could not prepare iommu_clk\n");
+ clk_unprepare(iommu->pclk);
+ return ret;
+ }
+
+ r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ iommu->base = devm_ioremap_resource(iommu->dev, r);
+ if (IS_ERR(iommu->base)) {
+ dev_err(iommu->dev, "could not get iommu base\n");
+ ret = PTR_ERR(iommu->base);
+ goto fail;
+ }
+
+ iommu->irq = platform_get_irq(pdev, 0);
+ if (iommu->irq < 0) {
+ dev_err(iommu->dev, "could not get iommu irq\n");
+ ret = -ENODEV;
+ goto fail;
+ }
+
+ ret = of_property_read_u32(iommu->dev->of_node, "qcom,ncb", &val);
+ if (ret) {
+ dev_err(iommu->dev, "could not get ncb\n");
+ goto fail;
+ }
+ iommu->ncb = val;
+
+ msm_iommu_reset(iommu->base, iommu->ncb);
+ SET_M(iommu->base, 0, 1);
+ SET_PAR(iommu->base, 0, 0);
+ SET_V2PCFG(iommu->base, 0, 1);
+ SET_V2PPR(iommu->base, 0, 0);
+ par = GET_PAR(iommu->base, 0);
+ SET_V2PCFG(iommu->base, 0, 0);
+ SET_M(iommu->base, 0, 0);
+
+ if (!par) {
+ pr_err("Invalid PAR value detected\n");
+ ret = -ENODEV;
+ goto fail;
+ }
+
+ ret = devm_request_threaded_irq(iommu->dev, iommu->irq, NULL,
+ msm_iommu_fault_handler,
+ IRQF_ONESHOT | IRQF_SHARED,
+ "msm_iommu_secure_irpt_handler",
+ iommu);
+ if (ret) {
+ pr_err("Request IRQ %d failed with ret=%d\n", iommu->irq, ret);
+ goto fail;
+ }
+
+ list_add(&iommu->dev_node, &qcom_iommu_devices);
+
+ pr_info("device mapped at %p, irq %d with %d ctx banks\n",
+ iommu->base, iommu->irq, iommu->ncb);
+
+ return ret;
+fail:
+ clk_unprepare(iommu->clk);
+ clk_unprepare(iommu->pclk);
+ return ret;
+}
+
+static const struct of_device_id msm_iommu_dt_match[] = {
+ { .compatible = "qcom,apq8064-iommu" },
+ {}
+};
+
+static int msm_iommu_remove(struct platform_device *pdev)
+{
+ struct msm_iommu_dev *iommu = platform_get_drvdata(pdev);
+
+ clk_unprepare(iommu->clk);
+ clk_unprepare(iommu->pclk);
+ return 0;
+}
+
+static struct platform_driver msm_iommu_driver = {
+ .driver = {
+ .name = "msm_iommu",
+ .of_match_table = msm_iommu_dt_match,
+ },
+ .probe = msm_iommu_probe,
+ .remove = msm_iommu_remove,
+};
+
+static int __init msm_iommu_driver_init(void)
+{
+ int ret;
+
+ ret = platform_driver_register(&msm_iommu_driver);
+ if (ret != 0)
+ pr_err("Failed to register IOMMU driver\n");
+
+ return ret;
+}
+
+static void __exit msm_iommu_driver_exit(void)
+{
+ platform_driver_unregister(&msm_iommu_driver);
+}
+
+subsys_initcall(msm_iommu_driver_init);
+module_exit(msm_iommu_driver_exit);
+
static int __init get_tex_class(int icp, int ocp, int mt, int nos)
{
int i = 0;
diff --git a/drivers/iommu/msm_iommu_dev.c b/drivers/iommu/msm_iommu_dev.c
deleted file mode 100644
index be01cc4..0000000
--- a/drivers/iommu/msm_iommu_dev.c
+++ /dev/null
@@ -1,212 +0,0 @@
-/* Copyright (c) 2010-2011, Code Aurora Forum. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 and
- * only version 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
- * 02110-1301, USA.
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/platform_device.h>
-#include <linux/io.h>
-#include <linux/clk.h>
-#include <linux/iommu.h>
-#include <linux/interrupt.h>
-#include <linux/err.h>
-#include <linux/slab.h>
-
-#include "msm_iommu_hw-8xxx.h"
-#include "msm_iommu.h"
-
-static void msm_iommu_reset(void __iomem *base, int ncb)
-{
- int ctx;
-
- SET_RPUE(base, 0);
- SET_RPUEIE(base, 0);
- SET_ESRRESTORE(base, 0);
- SET_TBE(base, 0);
- SET_CR(base, 0);
- SET_SPDMBE(base, 0);
- SET_TESTBUSCR(base, 0);
- SET_TLBRSW(base, 0);
- SET_GLOBAL_TLBIALL(base, 0);
- SET_RPU_ACR(base, 0);
- SET_TLBLKCRWE(base, 1);
-
- for (ctx = 0; ctx < ncb; ctx++) {
- SET_BPRCOSH(base, ctx, 0);
- SET_BPRCISH(base, ctx, 0);
- SET_BPRCNSH(base, ctx, 0);
- SET_BPSHCFG(base, ctx, 0);
- SET_BPMTCFG(base, ctx, 0);
- SET_ACTLR(base, ctx, 0);
- SET_SCTLR(base, ctx, 0);
- SET_FSRRESTORE(base, ctx, 0);
- SET_TTBR0(base, ctx, 0);
- SET_TTBR1(base, ctx, 0);
- SET_TTBCR(base, ctx, 0);
- SET_BFBCR(base, ctx, 0);
- SET_PAR(base, ctx, 0);
- SET_FAR(base, ctx, 0);
- SET_CTX_TLBIALL(base, ctx, 0);
- SET_TLBFLPTER(base, ctx, 0);
- SET_TLBSLPTER(base, ctx, 0);
- SET_TLBLKCR(base, ctx, 0);
- SET_PRRR(base, ctx, 0);
- SET_NMRR(base, ctx, 0);
- SET_CONTEXTIDR(base, ctx, 0);
- }
-}
-
-static int msm_iommu_probe(struct platform_device *pdev)
-{
- struct resource *r;
- struct msm_iommu_dev *iommu;
- int ret, par, val;
-
- iommu = devm_kzalloc(&pdev->dev, sizeof(*iommu), GFP_KERNEL);
- if (!iommu)
- return -ENODEV;
-
- iommu->dev = &pdev->dev;
- INIT_LIST_HEAD(&iommu->ctx_list);
-
- iommu->pclk = devm_clk_get(iommu->dev, "smmu_pclk");
- if (IS_ERR(iommu->pclk)) {
- dev_err(iommu->dev, "could not get smmu_pclk\n");
- return PTR_ERR(iommu->pclk);
- }
-
- ret = clk_prepare(iommu->pclk);
- if (ret) {
- dev_err(iommu->dev, "could not prepare smmu_pclk\n");
- return ret;
- }
-
- iommu->clk = devm_clk_get(iommu->dev, "iommu_clk");
- if (IS_ERR(iommu->clk)) {
- dev_err(iommu->dev, "could not get iommu_clk\n");
- clk_unprepare(iommu->pclk);
- return PTR_ERR(iommu->clk);
- }
-
- ret = clk_prepare(iommu->clk);
- if (ret) {
- dev_err(iommu->dev, "could not prepare iommu_clk\n");
- clk_unprepare(iommu->pclk);
- return ret;
- }
-
- r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- iommu->base = devm_ioremap_resource(iommu->dev, r);
- if (IS_ERR(iommu->base)) {
- dev_err(iommu->dev, "could not get iommu base\n");
- ret = PTR_ERR(iommu->base);
- goto fail;
- }
-
- iommu->irq = platform_get_irq(pdev, 0);
- if (iommu->irq < 0) {
- dev_err(iommu->dev, "could not get iommu irq\n");
- ret = -ENODEV;
- goto fail;
- }
-
- ret = of_property_read_u32(iommu->dev->of_node, "ncb", &val);
- if (ret) {
- dev_err(iommu->dev, "could not get ncb\n");
- goto fail;
- }
- iommu->ncb = val;
-
- msm_iommu_reset(iommu->base, iommu->ncb);
- SET_M(iommu->base, 0, 1);
- SET_PAR(iommu->base, 0, 0);
- SET_V2PCFG(iommu->base, 0, 1);
- SET_V2PPR(iommu->base, 0, 0);
- par = GET_PAR(iommu->base, 0);
- SET_V2PCFG(iommu->base, 0, 0);
- SET_M(iommu->base, 0, 0);
-
- if (!par) {
- pr_err("Invalid PAR value detected\n");
- ret = -ENODEV;
- goto fail;
- }
-
- ret = devm_request_threaded_irq(iommu->dev, iommu->irq, NULL,
- msm_iommu_fault_handler,
- IRQF_ONESHOT | IRQF_SHARED,
- "msm_iommu_secure_irpt_handler",
- iommu);
- if (ret) {
- pr_err("Request IRQ %d failed with ret=%d\n", iommu->irq, ret);
- goto fail;
- }
-
- list_add(&iommu->dev_node, &qcom_iommu_devices);
-
- pr_info("device mapped at %p, irq %d with %d ctx banks\n",
- iommu->base, iommu->irq, iommu->ncb);
-fail:
- clk_unprepare(iommu->clk);
- clk_unprepare(iommu->pclk);
- return ret;
-}
-
-static const struct of_device_id msm_iommu_dt_match[] = {
- { .compatible = "qcom,apq8064-iommu" },
- {}
-};
-
-static int msm_iommu_remove(struct platform_device *pdev)
-{
- struct msm_iommu_dev *iommu = platform_get_drvdata(pdev);
-
- clk_unprepare(iommu->clk);
- clk_unprepare(iommu->pclk);
- return 0;
-}
-
-static struct platform_driver msm_iommu_driver = {
- .driver = {
- .name = "msm_iommu",
- .of_match_table = msm_iommu_dt_match,
- },
- .probe = msm_iommu_probe,
- .remove = msm_iommu_remove,
-};
-
-static struct platform_driver * const drivers[] = {
- &msm_iommu_driver,
- &msm_iommu_ctx_driver,
-};
-
-static int __init msm_iommu_driver_init(void)
-{
- return platform_register_drivers(drivers, ARRAY_SIZE(drivers));
-}
-
-static void __exit msm_iommu_driver_exit(void)
-{
- platform_unregister_drivers(drivers, ARRAY_SIZE(drivers));
-}
-
-subsys_initcall(msm_iommu_driver_init);
-module_exit(msm_iommu_driver_exit);
-
-MODULE_LICENSE("GPL v2");
-MODULE_AUTHOR("Stepan Moskovchenko <stepanm-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>");
* [PATCH V5 4/7] iommu/msm: Add support for generic master bindings
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
2016-05-20 10:54 ` [PATCH V5 1/7] iommu/msm: Add DT adaptation Sricharan R
2016-05-20 10:54 ` [PATCH V5 2/7] documentation: iommu: Add bindings for msm,iommu-v0 ip Sricharan R
@ 2016-05-20 10:54 ` Sricharan R
2016-05-20 10:54 ` [PATCH V5 5/7] iommu/msm: use generic ARMV7S short descriptor pagetable ops Sricharan R
` (4 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree, linux-arm-msm, joro, robdclark, iommu,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm, architt, arnd
Cc: sricharan
This adds the of_xlate callback, which gets invoked during
device registration from DT. The master devices get added
through this callback.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
drivers/iommu/msm_iommu.c | 61 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 59 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 792b352..8ab0643 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -28,6 +28,7 @@
#include <linux/iommu.h>
#include <linux/clk.h>
#include <linux/err.h>
+#include <linux/of_iommu.h>
#include <asm/cacheflush.h>
#include <asm/sizes.h>
@@ -702,6 +703,54 @@ static void print_ctx_regs(void __iomem *base, int ctx)
GET_PRRR(base, ctx), GET_NMRR(base, ctx));
}
+static void insert_iommu_master(struct device *dev,
+ struct msm_iommu_dev **iommu,
+ struct of_phandle_args *spec)
+{
+ struct msm_iommu_ctx_dev *master = dev->archdata.iommu;
+ int sid;
+
+ if (list_empty(&(*iommu)->ctx_list)) {
+ master = kzalloc(sizeof(*master), GFP_ATOMIC);
+ master->of_node = dev->of_node;
+ list_add(&master->list, &(*iommu)->ctx_list);
+ dev->archdata.iommu = master;
+ }
+
+ for (sid = 0; sid < master->num_mids; sid++)
+ if (master->mids[sid] == spec->args[0]) {
+ dev_warn(dev, "Stream ID 0x%x repeated; ignoring\n",
+ spec->args[0]);
+ return;
+ }
+
+ master->mids[master->num_mids++] = spec->args[0];
+}
+
+static int qcom_iommu_of_xlate(struct device *dev,
+ struct of_phandle_args *spec)
+{
+ struct msm_iommu_dev *iommu;
+ unsigned long flags;
+ int ret = 0;
+
+ spin_lock_irqsave(&msm_iommu_lock, flags);
+ list_for_each_entry(iommu, &qcom_iommu_devices, dev_node)
+ if (iommu->dev->of_node == spec->np)
+ break;
+
+ if (!iommu || iommu->dev->of_node != spec->np) {
+ ret = -ENODEV;
+ goto fail;
+ }
+
+ insert_iommu_master(dev, &iommu, spec);
+fail:
+ spin_unlock_irqrestore(&msm_iommu_lock, flags);
+
+ return ret;
+}
+
irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id)
{
struct msm_iommu_dev *iommu = dev_id;
@@ -737,7 +786,7 @@ fail:
return 0;
}
-static const struct iommu_ops msm_iommu_ops = {
+static struct iommu_ops msm_iommu_ops = {
.capable = msm_iommu_capable,
.domain_alloc = msm_iommu_domain_alloc,
.domain_free = msm_iommu_domain_free,
@@ -748,6 +797,7 @@ static const struct iommu_ops msm_iommu_ops = {
.map_sg = default_iommu_map_sg,
.iova_to_phys = msm_iommu_iova_to_phys,
.pgsize_bitmap = MSM_IOMMU_PGSIZES,
+ .of_xlate = qcom_iommu_of_xlate,
};
static int msm_iommu_probe(struct platform_device *pdev)
@@ -837,6 +887,7 @@ static int msm_iommu_probe(struct platform_device *pdev)
}
list_add(&iommu->dev_node, &qcom_iommu_devices);
+ of_iommu_set_ops(pdev->dev.of_node, &msm_iommu_ops);
pr_info("device mapped at %p, irq %d with %d ctx banks\n",
iommu->base, iommu->irq, iommu->ncb);
@@ -935,7 +986,13 @@ static int __init msm_iommu_init(void)
return 0;
}
-subsys_initcall(msm_iommu_init);
+static int __init msm_iommu_of_setup(struct device_node *np)
+{
+ msm_iommu_init();
+ return 0;
+}
+
+IOMMU_OF_DECLARE(msm_iommu_of, "qcom,apq8064-iommu", msm_iommu_of_setup);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Stepan Moskovchenko <stepanm@codeaurora.org>");
* [PATCH V5 5/7] iommu/msm: use generic ARMV7S short descriptor pagetable ops
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
` (2 preceding siblings ...)
2016-05-20 10:54 ` [PATCH V5 4/7] iommu/msm: Add support for generic master bindings Sricharan R
@ 2016-05-20 10:54 ` Sricharan R
2016-05-20 10:54 ` [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier Sricharan R
` (3 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree, linux-arm-msm, joro, robdclark, iommu,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm, architt, arnd
Cc: sricharan
This IOMMU uses the ARMv7 short-descriptor format, so use the
generic ARMV7S pagetable ops instead of reimplementing the same
logic in the driver.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
drivers/iommu/Kconfig | 1 +
drivers/iommu/msm_iommu.c | 400 ++++++++++++----------------------------------
2 files changed, 104 insertions(+), 297 deletions(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 232a54a..761be35 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -92,6 +92,7 @@ config MSM_IOMMU
depends on ARCH_MSM8X60 || ARCH_MSM8960 || COMPILE_TEST
depends on BROKEN
select IOMMU_API
+ select IOMMU_IO_PGTABLE_ARMV7S
help
Support for the IOMMUs found on certain Qualcomm SOCs.
These IOMMUs allow virtualization of the address space used by most
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 8ab0643..0299a37 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -35,27 +35,27 @@
#include "msm_iommu_hw-8xxx.h"
#include "msm_iommu.h"
+#include "io-pgtable.h"
#define MRC(reg, processor, op1, crn, crm, op2) \
__asm__ __volatile__ ( \
" mrc " #processor "," #op1 ", %0," #crn "," #crm "," #op2 "\n" \
: "=r" (reg))
-#define RCP15_PRRR(reg) MRC(reg, p15, 0, c10, c2, 0)
-#define RCP15_NMRR(reg) MRC(reg, p15, 0, c10, c2, 1)
-
/* bitmap of the page sizes currently supported */
#define MSM_IOMMU_PGSIZES (SZ_4K | SZ_64K | SZ_1M | SZ_16M)
-static int msm_iommu_tex_class[4];
-
DEFINE_SPINLOCK(msm_iommu_lock);
static LIST_HEAD(qcom_iommu_devices);
+static struct iommu_ops msm_iommu_ops;
struct msm_priv {
- unsigned long *pgtable;
struct list_head list_attached;
struct iommu_domain domain;
+ struct io_pgtable_cfg cfg;
+ struct io_pgtable_ops *iop;
+ struct device *dev;
+ spinlock_t pgtlock; /* pagetable lock */
};
static struct msm_priv *to_msm_priv(struct iommu_domain *dom)
@@ -122,49 +122,74 @@ static void msm_iommu_reset(void __iomem *base, int ncb)
SET_TLBFLPTER(base, ctx, 0);
SET_TLBSLPTER(base, ctx, 0);
SET_TLBLKCR(base, ctx, 0);
- SET_PRRR(base, ctx, 0);
- SET_NMRR(base, ctx, 0);
SET_CONTEXTIDR(base, ctx, 0);
}
}
-static int __flush_iotlb(struct iommu_domain *domain)
+static void __flush_iotlb(void *cookie)
{
- struct msm_priv *priv = to_msm_priv(domain);
+ struct msm_priv *priv = cookie;
struct msm_iommu_dev *iommu = NULL;
struct msm_iommu_ctx_dev *master;
int ret = 0;
-#ifndef CONFIG_IOMMU_PGTABLES_L2
- unsigned long *fl_table = priv->pgtable;
- int i;
+ list_for_each_entry(iommu, &priv->list_attached, dom_node) {
+ ret = __enable_clocks(iommu);
+ if (ret)
+ goto fail;
- if (!list_empty(&priv->list_attached)) {
- dmac_flush_range(fl_table, fl_table + SZ_16K);
+ list_for_each_entry(master, &iommu->ctx_list, list)
+ SET_CTX_TLBIALL(iommu->base, master->num, 0);
- for (i = 0; i < NUM_FL_PTE; i++)
- if ((fl_table[i] & 0x03) == FL_TYPE_TABLE) {
- void *sl_table = __va(fl_table[i] &
- FL_BASE_MASK);
- dmac_flush_range(sl_table, sl_table + SZ_4K);
- }
+ __disable_clocks(iommu);
}
-#endif
+fail:
+ return;
+}
+
+static void __flush_iotlb_range(unsigned long iova, size_t size,
+ size_t granule, bool leaf, void *cookie)
+{
+ struct msm_priv *priv = cookie;
+ struct msm_iommu_dev *iommu = NULL;
+ struct msm_iommu_ctx_dev *master;
+ int ret = 0;
+ int temp_size;
list_for_each_entry(iommu, &priv->list_attached, dom_node) {
ret = __enable_clocks(iommu);
if (ret)
goto fail;
- list_for_each_entry(master, &iommu->ctx_list, list)
- SET_CTX_TLBIALL(iommu->base, master->num, 0);
+ list_for_each_entry(master, &iommu->ctx_list, list) {
+ temp_size = size;
+ do {
+ iova &= TLBIVA_VA;
+ iova |= GET_CONTEXTIDR_ASID(iommu->base,
+ master->num);
+ SET_TLBIVA(iommu->base, master->num, iova);
+ iova += granule;
+ } while (temp_size -= granule);
+ }
__disable_clocks(iommu);
}
+
fail:
- return ret;
+ return;
+}
+
+static void __flush_iotlb_sync(void *cookie)
+{
+ /* To avoid a null function pointer */
}
+static const struct iommu_gather_ops msm_iommu_gather_ops = {
+ .tlb_flush_all = __flush_iotlb,
+ .tlb_add_flush = __flush_iotlb_range,
+ .tlb_sync = __flush_iotlb_sync,
+};
+
static int msm_iommu_alloc_ctx(unsigned long *map, int start, int end)
{
int idx;
@@ -232,15 +257,17 @@ static void __reset_context(void __iomem *base, int ctx)
SET_TLBFLPTER(base, ctx, 0);
SET_TLBSLPTER(base, ctx, 0);
SET_TLBLKCR(base, ctx, 0);
- SET_PRRR(base, ctx, 0);
- SET_NMRR(base, ctx, 0);
}
-static void __program_context(void __iomem *base, int ctx, phys_addr_t pgtable)
+static void __program_context(void __iomem *base, int ctx,
+ struct msm_priv *priv)
{
- unsigned int prrr, nmrr;
__reset_context(base, ctx);
+ /* Turn on TEX Remap */
+ SET_TRE(base, ctx, 1);
+ SET_AFE(base, ctx, 1);
+
/* Set up HTW mode */
/* TLB miss configuration: perform HTW on miss */
SET_TLBMCFG(base, ctx, 0x3);
@@ -248,8 +275,13 @@ static void __program_context(void __iomem *base, int ctx, phys_addr_t pgtable)
/* V2P configuration: HTW for access */
SET_V2PCFG(base, ctx, 0x3);
- SET_TTBCR(base, ctx, 0);
- SET_TTBR0_PA(base, ctx, (pgtable >> 14));
+ SET_TTBCR(base, ctx, priv->cfg.arm_v7s_cfg.tcr);
+ SET_TTBR0(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[0]);
+ SET_TTBR1(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[1]);
+
+ /* Set prrr and nmrr */
+ SET_PRRR(base, ctx, priv->cfg.arm_v7s_cfg.prrr);
+ SET_NMRR(base, ctx, priv->cfg.arm_v7s_cfg.nmrr);
/* Invalidate the TLB for this context */
SET_CTX_TLBIALL(base, ctx, 0);
@@ -268,38 +300,9 @@ static void __program_context(void __iomem *base, int ctx, phys_addr_t pgtable)
SET_RCOSH(base, ctx, 1);
SET_RCNSH(base, ctx, 1);
- /* Turn on TEX Remap */
- SET_TRE(base, ctx, 1);
-
- /* Set TEX remap attributes */
- RCP15_PRRR(prrr);
- RCP15_NMRR(nmrr);
- SET_PRRR(base, ctx, prrr);
- SET_NMRR(base, ctx, nmrr);
-
/* Turn on BFB prefetch */
SET_BFBDFE(base, ctx, 1);
-#ifdef CONFIG_IOMMU_PGTABLES_L2
- /* Configure page tables as inner-cacheable and shareable to reduce
- * the TLB miss penalty.
- */
- SET_TTBR0_SH(base, ctx, 1);
- SET_TTBR1_SH(base, ctx, 1);
-
- SET_TTBR0_NOS(base, ctx, 1);
- SET_TTBR1_NOS(base, ctx, 1);
-
- SET_TTBR0_IRGNH(base, ctx, 0); /* WB, WA */
- SET_TTBR0_IRGNL(base, ctx, 1);
-
- SET_TTBR1_IRGNH(base, ctx, 0); /* WB, WA */
- SET_TTBR1_IRGNL(base, ctx, 1);
-
- SET_TTBR0_ORGN(base, ctx, 1); /* WB, WA */
- SET_TTBR1_ORGN(base, ctx, 1); /* WB, WA */
-#endif
-
/* Enable the MMU */
SET_M(base, ctx, 1);
}
@@ -316,13 +319,6 @@ static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
goto fail_nomem;
INIT_LIST_HEAD(&priv->list_attached);
- priv->pgtable = (unsigned long *)__get_free_pages(GFP_KERNEL,
- get_order(SZ_16K));
-
- if (!priv->pgtable)
- goto fail_nomem;
-
- memset(priv->pgtable, 0, SZ_16K);
priv->domain.geometry.aperture_start = 0;
priv->domain.geometry.aperture_end = (1ULL << 32) - 1;
@@ -339,24 +335,35 @@ static void msm_iommu_domain_free(struct iommu_domain *domain)
{
struct msm_priv *priv;
unsigned long flags;
- unsigned long *fl_table;
- int i;
spin_lock_irqsave(&msm_iommu_lock, flags);
priv = to_msm_priv(domain);
+ kfree(priv);
+ spin_unlock_irqrestore(&msm_iommu_lock, flags);
+}
- fl_table = priv->pgtable;
+static int msm_iommu_domain_config(struct msm_priv *priv)
+{
+ spin_lock_init(&priv->pgtlock);
- for (i = 0; i < NUM_FL_PTE; i++)
- if ((fl_table[i] & 0x03) == FL_TYPE_TABLE)
- free_page((unsigned long) __va(((fl_table[i]) &
- FL_BASE_MASK)));
+ priv->cfg = (struct io_pgtable_cfg) {
+ .quirks = IO_PGTABLE_QUIRK_TLBI_ON_MAP,
+ .pgsize_bitmap = msm_iommu_ops.pgsize_bitmap,
+ .ias = 32,
+ .oas = 32,
+ .tlb = &msm_iommu_gather_ops,
+ .iommu_dev = priv->dev,
+ };
- free_pages((unsigned long)priv->pgtable, get_order(SZ_16K));
- priv->pgtable = NULL;
+ priv->iop = alloc_io_pgtable_ops(ARM_V7S, &priv->cfg, priv);
+ if (!priv->iop) {
+ dev_err(priv->dev, "Failed to allocate pgtable\n");
+ return -EINVAL;
+ }
- kfree(priv);
- spin_unlock_irqrestore(&msm_iommu_lock, flags);
+ msm_iommu_ops.pgsize_bitmap = priv->cfg.pgsize_bitmap;
+
+ return 0;
}
static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -367,6 +374,9 @@ static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
struct msm_priv *priv = to_msm_priv(domain);
struct msm_iommu_ctx_dev *master;
+ priv->dev = dev;
+ msm_iommu_domain_config(priv);
+
spin_lock_irqsave(&msm_iommu_lock, flags);
list_for_each_entry(iommu, &qcom_iommu_devices, dev_node) {
master = list_first_entry(&iommu->ctx_list,
@@ -392,14 +402,13 @@ static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
config_mids(iommu, master);
__program_context(iommu->base, master->num,
- __pa(priv->pgtable));
+ priv);
}
__disable_clocks(iommu);
list_add(&iommu->dom_node, &priv->list_attached);
}
}
- ret = __flush_iotlb(domain);
fail:
spin_unlock_irqrestore(&msm_iommu_lock, flags);
@@ -415,11 +424,9 @@ static void msm_iommu_detach_dev(struct iommu_domain *domain,
struct msm_iommu_ctx_dev *master;
int ret;
- spin_lock_irqsave(&msm_iommu_lock, flags);
- ret = __flush_iotlb(domain);
- if (ret)
- goto fail;
+ free_io_pgtable_ops(priv->iop);
+ spin_lock_irqsave(&msm_iommu_lock, flags);
list_for_each_entry(iommu, &priv->list_attached, dom_node) {
ret = __enable_clocks(iommu);
if (ret)
@@ -435,190 +442,30 @@ fail:
spin_unlock_irqrestore(&msm_iommu_lock, flags);
}
-static int msm_iommu_map(struct iommu_domain *domain, unsigned long va,
+static int msm_iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t pa, size_t len, int prot)
{
- struct msm_priv *priv;
+ struct msm_priv *priv = to_msm_priv(domain);
unsigned long flags;
- unsigned long *fl_table;
- unsigned long *fl_pte;
- unsigned long fl_offset;
- unsigned long *sl_table;
- unsigned long *sl_pte;
- unsigned long sl_offset;
- unsigned int pgprot;
- int ret = 0, tex, sh;
-
- spin_lock_irqsave(&msm_iommu_lock, flags);
-
- sh = (prot & MSM_IOMMU_ATTR_SH) ? 1 : 0;
- tex = msm_iommu_tex_class[prot & MSM_IOMMU_CP_MASK];
-
- if (tex < 0 || tex > NUM_TEX_CLASS - 1) {
- ret = -EINVAL;
- goto fail;
- }
-
- priv = to_msm_priv(domain);
-
- fl_table = priv->pgtable;
-
- if (len != SZ_16M && len != SZ_1M &&
- len != SZ_64K && len != SZ_4K) {
- pr_debug("Bad size: %d\n", len);
- ret = -EINVAL;
- goto fail;
- }
-
- if (!fl_table) {
- pr_debug("Null page table\n");
- ret = -EINVAL;
- goto fail;
- }
-
- if (len == SZ_16M || len == SZ_1M) {
- pgprot = sh ? FL_SHARED : 0;
- pgprot |= tex & 0x01 ? FL_BUFFERABLE : 0;
- pgprot |= tex & 0x02 ? FL_CACHEABLE : 0;
- pgprot |= tex & 0x04 ? FL_TEX0 : 0;
- } else {
- pgprot = sh ? SL_SHARED : 0;
- pgprot |= tex & 0x01 ? SL_BUFFERABLE : 0;
- pgprot |= tex & 0x02 ? SL_CACHEABLE : 0;
- pgprot |= tex & 0x04 ? SL_TEX0 : 0;
- }
-
- fl_offset = FL_OFFSET(va); /* Upper 12 bits */
- fl_pte = fl_table + fl_offset; /* int pointers, 4 bytes */
-
- if (len == SZ_16M) {
- int i = 0;
- for (i = 0; i < 16; i++)
- *(fl_pte+i) = (pa & 0xFF000000) | FL_SUPERSECTION |
- FL_AP_READ | FL_AP_WRITE | FL_TYPE_SECT |
- FL_SHARED | FL_NG | pgprot;
- }
-
- if (len == SZ_1M)
- *fl_pte = (pa & 0xFFF00000) | FL_AP_READ | FL_AP_WRITE | FL_NG |
- FL_TYPE_SECT | FL_SHARED | pgprot;
-
- /* Need a 2nd level table */
- if ((len == SZ_4K || len == SZ_64K) && (*fl_pte) == 0) {
- unsigned long *sl;
- sl = (unsigned long *) __get_free_pages(GFP_ATOMIC,
- get_order(SZ_4K));
-
- if (!sl) {
- pr_debug("Could not allocate second level table\n");
- ret = -ENOMEM;
- goto fail;
- }
-
- memset(sl, 0, SZ_4K);
- *fl_pte = ((((int)__pa(sl)) & FL_BASE_MASK) | FL_TYPE_TABLE);
- }
-
- sl_table = (unsigned long *) __va(((*fl_pte) & FL_BASE_MASK));
- sl_offset = SL_OFFSET(va);
- sl_pte = sl_table + sl_offset;
-
-
- if (len == SZ_4K)
- *sl_pte = (pa & SL_BASE_MASK_SMALL) | SL_AP0 | SL_AP1 | SL_NG |
- SL_SHARED | SL_TYPE_SMALL | pgprot;
-
- if (len == SZ_64K) {
- int i;
+ int ret;
- for (i = 0; i < 16; i++)
- *(sl_pte+i) = (pa & SL_BASE_MASK_LARGE) | SL_AP0 |
- SL_NG | SL_AP1 | SL_SHARED | SL_TYPE_LARGE | pgprot;
- }
+ spin_lock_irqsave(&priv->pgtlock, flags);
+ ret = priv->iop->map(priv->iop, iova, pa, len, prot);
+ spin_unlock_irqrestore(&priv->pgtlock, flags);
- ret = __flush_iotlb(domain);
-fail:
- spin_unlock_irqrestore(&msm_iommu_lock, flags);
return ret;
}
-static size_t msm_iommu_unmap(struct iommu_domain *domain, unsigned long va,
- size_t len)
+static size_t msm_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+ size_t len)
{
- struct msm_priv *priv;
+ struct msm_priv *priv = to_msm_priv(domain);
unsigned long flags;
- unsigned long *fl_table;
- unsigned long *fl_pte;
- unsigned long fl_offset;
- unsigned long *sl_table;
- unsigned long *sl_pte;
- unsigned long sl_offset;
- int i, ret = 0;
-
- spin_lock_irqsave(&msm_iommu_lock, flags);
-
- priv = to_msm_priv(domain);
- fl_table = priv->pgtable;
+ spin_lock_irqsave(&priv->pgtlock, flags);
+ len = priv->iop->unmap(priv->iop, iova, len);
+ spin_unlock_irqrestore(&priv->pgtlock, flags);
- if (len != SZ_16M && len != SZ_1M &&
- len != SZ_64K && len != SZ_4K) {
- pr_debug("Bad length: %d\n", len);
- goto fail;
- }
-
- if (!fl_table) {
- pr_debug("Null page table\n");
- goto fail;
- }
-
- fl_offset = FL_OFFSET(va); /* Upper 12 bits */
- fl_pte = fl_table + fl_offset; /* int pointers, 4 bytes */
-
- if (*fl_pte == 0) {
- pr_debug("First level PTE is 0\n");
- goto fail;
- }
-
- /* Unmap supersection */
- if (len == SZ_16M)
- for (i = 0; i < 16; i++)
- *(fl_pte+i) = 0;
-
- if (len == SZ_1M)
- *fl_pte = 0;
-
- sl_table = (unsigned long *) __va(((*fl_pte) & FL_BASE_MASK));
- sl_offset = SL_OFFSET(va);
- sl_pte = sl_table + sl_offset;
-
- if (len == SZ_64K) {
- for (i = 0; i < 16; i++)
- *(sl_pte+i) = 0;
- }
-
- if (len == SZ_4K)
- *sl_pte = 0;
-
- if (len == SZ_4K || len == SZ_64K) {
- int used = 0;
-
- for (i = 0; i < NUM_SL_PTE; i++)
- if (sl_table[i])
- used = 1;
- if (!used) {
- free_page((unsigned long)sl_table);
- *fl_pte = 0;
- }
- }
-
- ret = __flush_iotlb(domain);
-
-fail:
- spin_unlock_irqrestore(&msm_iommu_lock, flags);
-
- /* the IOMMU API requires us to return how many bytes were unmapped */
- len = ret ? 0 : len;
return len;
}
@@ -699,8 +546,6 @@ static void print_ctx_regs(void __iomem *base, int ctx)
GET_TTBR0(base, ctx), GET_TTBR1(base, ctx));
pr_err("SCTLR = %08x ACTLR = %08x\n",
GET_SCTLR(base, ctx), GET_ACTLR(base, ctx));
- pr_err("PRRR = %08x NMRR = %08x\n",
- GET_PRRR(base, ctx), GET_NMRR(base, ctx));
}
static void insert_iommu_master(struct device *dev,
@@ -941,47 +786,8 @@ static void __exit msm_iommu_driver_exit(void)
subsys_initcall(msm_iommu_driver_init);
module_exit(msm_iommu_driver_exit);
-static int __init get_tex_class(int icp, int ocp, int mt, int nos)
-{
- int i = 0;
- unsigned int prrr = 0;
- unsigned int nmrr = 0;
- int c_icp, c_ocp, c_mt, c_nos;
-
- RCP15_PRRR(prrr);
- RCP15_NMRR(nmrr);
-
- for (i = 0; i < NUM_TEX_CLASS; i++) {
- c_nos = PRRR_NOS(prrr, i);
- c_mt = PRRR_MT(prrr, i);
- c_icp = NMRR_ICP(nmrr, i);
- c_ocp = NMRR_OCP(nmrr, i);
-
- if (icp == c_icp && ocp == c_ocp && c_mt == mt && c_nos == nos)
- return i;
- }
-
- return -ENODEV;
-}
-
-static void __init setup_iommu_tex_classes(void)
-{
- msm_iommu_tex_class[MSM_IOMMU_ATTR_NONCACHED] =
- get_tex_class(CP_NONCACHED, CP_NONCACHED, MT_NORMAL, 1);
-
- msm_iommu_tex_class[MSM_IOMMU_ATTR_CACHED_WB_WA] =
- get_tex_class(CP_WB_WA, CP_WB_WA, MT_NORMAL, 1);
-
- msm_iommu_tex_class[MSM_IOMMU_ATTR_CACHED_WB_NWA] =
- get_tex_class(CP_WB_NWA, CP_WB_NWA, MT_NORMAL, 1);
-
- msm_iommu_tex_class[MSM_IOMMU_ATTR_CACHED_WT] =
- get_tex_class(CP_WT, CP_WT, MT_NORMAL, 1);
-}
-
static int __init msm_iommu_init(void)
{
- setup_iommu_tex_classes();
bus_set_iommu(&platform_bus_type, &msm_iommu_ops);
return 0;
}
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
` (3 preceding siblings ...)
2016-05-20 10:54 ` [PATCH V5 5/7] iommu/msm: use generic ARMV7S short descriptor pagetable ops Sricharan R
@ 2016-05-20 10:54 ` Sricharan R
[not found] ` <1463741694-1735-7-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-05-20 10:54 ` [PATCH V5 7/7] iommu/msm: Remove driver BROKEN Sricharan R
` (2 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree, linux-arm-msm, joro, robdclark, iommu,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm, architt, arnd
Cc: sricharan
When using the generic pagetable ops, the TLB maintenance
operation is completed in the sync callback. So use writel_relaxed()
for all register accesses and add an mb() at the appropriate places.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
drivers/iommu/msm_iommu.c | 24 +++++++--
drivers/iommu/msm_iommu_hw-8xxx.h | 109 ++++++++++++++++++++------------------
2 files changed, 79 insertions(+), 54 deletions(-)
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 0299a37..dfcaeef 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -124,6 +124,9 @@ static void msm_iommu_reset(void __iomem *base, int ncb)
SET_TLBLKCR(base, ctx, 0);
SET_CONTEXTIDR(base, ctx, 0);
}
+
+ /* Ensure completion of relaxed writes from the above SET macros */
+ mb();
}
static void __flush_iotlb(void *cookie)
@@ -141,6 +144,9 @@ static void __flush_iotlb(void *cookie)
list_for_each_entry(master, &iommu->ctx_list, list)
SET_CTX_TLBIALL(iommu->base, master->num, 0);
+ /* To ensure completion of TLBIALL above */
+ mb();
+
__disable_clocks(iommu);
}
fail:
@@ -181,7 +187,8 @@ fail:
static void __flush_iotlb_sync(void *cookie)
{
- /* To avoid a null function pointer */
+ /* To ensure completion of the TLBIVA in __flush_iotlb_range */
+ mb();
}
static const struct iommu_gather_ops msm_iommu_gather_ops = {
@@ -235,6 +242,9 @@ static void config_mids(struct msm_iommu_dev *iommu,
/* Set security bit override to be Non-secure */
SET_NSCFG(iommu->base, mid, 3);
}
+
+ /* Ensure completion of relaxed writes from the above SET macros */
+ mb();
}
static void __reset_context(void __iomem *base, int ctx)
@@ -257,6 +267,9 @@ static void __reset_context(void __iomem *base, int ctx)
SET_TLBFLPTER(base, ctx, 0);
SET_TLBSLPTER(base, ctx, 0);
SET_TLBLKCR(base, ctx, 0);
+
+ /* Ensure completion of relaxed writes from the above SET macros */
+ mb();
}
static void __program_context(void __iomem *base, int ctx,
@@ -305,6 +318,9 @@ static void __program_context(void __iomem *base, int ctx,
/* Enable the MMU */
SET_M(base, ctx, 1);
+
+ /* Ensure completion of relaxed writes from the above SET macros */
+ mb();
}
static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
@@ -500,7 +516,8 @@ static phys_addr_t msm_iommu_iova_to_phys(struct iommu_domain *domain,
/* Invalidate context TLB */
SET_CTX_TLBIALL(iommu->base, master->num, 0);
SET_V2PPR(iommu->base, master->num, va & V2Pxx_VA);
-
+ /* Ensure completion of relaxed writes from the above SET macros */
+ mb();
par = GET_PAR(iommu->base, master->num);
/* We are dealing with a supersection */
@@ -714,7 +731,8 @@ static int msm_iommu_probe(struct platform_device *pdev)
par = GET_PAR(iommu->base, 0);
SET_V2PCFG(iommu->base, 0, 0);
SET_M(iommu->base, 0, 0);
-
+ /* Ensure completion of relaxed writes from the above SET macros */
+ mb();
if (!par) {
pr_err("Invalid PAR value detected\n");
ret = -ENODEV;
diff --git a/drivers/iommu/msm_iommu_hw-8xxx.h b/drivers/iommu/msm_iommu_hw-8xxx.h
index fc16010..fe2c5ca 100644
--- a/drivers/iommu/msm_iommu_hw-8xxx.h
+++ b/drivers/iommu/msm_iommu_hw-8xxx.h
@@ -24,13 +24,19 @@
#define GET_CTX_REG(reg, base, ctx) \
(readl((base) + (reg) + ((ctx) << CTX_SHIFT)))
-#define SET_GLOBAL_REG(reg, base, val) writel((val), ((base) + (reg)))
+/*
+ * The writes to the global/context registers need to be synced only after
+ * all the configuration writes are done. So use relaxed accessors and
+ * a barrier at the end.
+ */
+#define SET_GLOBAL_REG_RELAXED(reg, base, val) \
+ writel_relaxed((val), ((base) + (reg)))
-#define SET_CTX_REG(reg, base, ctx, val) \
- writel((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
+#define SET_CTX_REG_RELAXED(reg, base, ctx, val) \
+ writel_relaxed((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
-/* Wrappers for numbered registers */
-#define SET_GLOBAL_REG_N(b, n, r, v) SET_GLOBAL_REG(b, ((r) + (n << 2)), (v))
+/* Wrappers for numbered registers */
+#define SET_GLOBAL_REG_N(b, n, r, v) SET_GLOBAL_REG_RELAXED(b, ((r) + (n << 2)), (v))
#define GET_GLOBAL_REG_N(b, n, r) GET_GLOBAL_REG(b, ((r) + (n << 2)))
/* Field wrappers */
@@ -39,16 +45,17 @@
GET_FIELD(((b) + (r) + ((c) << CTX_SHIFT)), F##_MASK, F##_SHIFT)
#define SET_GLOBAL_FIELD(b, r, F, v) \
- SET_FIELD(((b) + (r)), F##_MASK, F##_SHIFT, (v))
+ SET_FIELD_RELAXED(((b) + (r)), F##_MASK, F##_SHIFT, (v))
#define SET_CONTEXT_FIELD(b, c, r, F, v) \
- SET_FIELD(((b) + (r) + ((c) << CTX_SHIFT)), F##_MASK, F##_SHIFT, (v))
+ SET_FIELD_RELAXED(((b) + (r) + ((c) << CTX_SHIFT)), F##_MASK, F##_SHIFT, (v))
#define GET_FIELD(addr, mask, shift) ((readl(addr) >> (shift)) & (mask))
-#define SET_FIELD(addr, mask, shift, v) \
+#define SET_FIELD_RELAXED(addr, mask, shift, v) \
do { \
int t = readl(addr); \
- writel((t & ~((mask) << (shift))) + (((v) & (mask)) << (shift)), addr);\
+ writel_relaxed((t & ~((mask) << (shift))) + \
+ (((v) & (mask)) << (shift)), addr);\
} while (0)
@@ -96,20 +103,20 @@ do { \
/* Global register setters / getters */
#define SET_M2VCBR_N(b, N, v) SET_GLOBAL_REG_N(M2VCBR_N, N, (b), (v))
#define SET_CBACR_N(b, N, v) SET_GLOBAL_REG_N(CBACR_N, N, (b), (v))
-#define SET_TLBRSW(b, v) SET_GLOBAL_REG(TLBRSW, (b), (v))
-#define SET_TLBTR0(b, v) SET_GLOBAL_REG(TLBTR0, (b), (v))
-#define SET_TLBTR1(b, v) SET_GLOBAL_REG(TLBTR1, (b), (v))
-#define SET_TLBTR2(b, v) SET_GLOBAL_REG(TLBTR2, (b), (v))
-#define SET_TESTBUSCR(b, v) SET_GLOBAL_REG(TESTBUSCR, (b), (v))
-#define SET_GLOBAL_TLBIALL(b, v) SET_GLOBAL_REG(GLOBAL_TLBIALL, (b), (v))
-#define SET_TLBIVMID(b, v) SET_GLOBAL_REG(TLBIVMID, (b), (v))
-#define SET_CR(b, v) SET_GLOBAL_REG(CR, (b), (v))
-#define SET_EAR(b, v) SET_GLOBAL_REG(EAR, (b), (v))
-#define SET_ESR(b, v) SET_GLOBAL_REG(ESR, (b), (v))
-#define SET_ESRRESTORE(b, v) SET_GLOBAL_REG(ESRRESTORE, (b), (v))
-#define SET_ESYNR0(b, v) SET_GLOBAL_REG(ESYNR0, (b), (v))
-#define SET_ESYNR1(b, v) SET_GLOBAL_REG(ESYNR1, (b), (v))
-#define SET_RPU_ACR(b, v) SET_GLOBAL_REG(RPU_ACR, (b), (v))
+#define SET_TLBRSW(b, v) SET_GLOBAL_REG_RELAXED(TLBRSW, (b), (v))
+#define SET_TLBTR0(b, v) SET_GLOBAL_REG_RELAXED(TLBTR0, (b), (v))
+#define SET_TLBTR1(b, v) SET_GLOBAL_REG_RELAXED(TLBTR1, (b), (v))
+#define SET_TLBTR2(b, v) SET_GLOBAL_REG_RELAXED(TLBTR2, (b), (v))
+#define SET_TESTBUSCR(b, v) SET_GLOBAL_REG_RELAXED(TESTBUSCR, (b), (v))
+#define SET_GLOBAL_TLBIALL(b, v) SET_GLOBAL_REG_RELAXED(GLOBAL_TLBIALL, (b), (v))
+#define SET_TLBIVMID(b, v) SET_GLOBAL_REG_RELAXED(TLBIVMID, (b), (v))
+#define SET_CR(b, v) SET_GLOBAL_REG_RELAXED(CR, (b), (v))
+#define SET_EAR(b, v) SET_GLOBAL_REG_RELAXED(EAR, (b), (v))
+#define SET_ESR(b, v) SET_GLOBAL_REG_RELAXED(ESR, (b), (v))
+#define SET_ESRRESTORE(b, v) SET_GLOBAL_REG_RELAXED(ESRRESTORE, (b), (v))
+#define SET_ESYNR0(b, v) SET_GLOBAL_REG_RELAXED(ESYNR0, (b), (v))
+#define SET_ESYNR1(b, v) SET_GLOBAL_REG_RELAXED(ESYNR1, (b), (v))
+#define SET_RPU_ACR(b, v) SET_GLOBAL_REG_RELAXED(RPU_ACR, (b), (v))
#define GET_M2VCBR_N(b, N) GET_GLOBAL_REG_N(M2VCBR_N, N, (b))
#define GET_CBACR_N(b, N) GET_GLOBAL_REG_N(CBACR_N, N, (b))
@@ -131,34 +138,34 @@ do { \
/* Context register setters/getters */
-#define SET_SCTLR(b, c, v) SET_CTX_REG(SCTLR, (b), (c), (v))
-#define SET_ACTLR(b, c, v) SET_CTX_REG(ACTLR, (b), (c), (v))
-#define SET_CONTEXTIDR(b, c, v) SET_CTX_REG(CONTEXTIDR, (b), (c), (v))
-#define SET_TTBR0(b, c, v) SET_CTX_REG(TTBR0, (b), (c), (v))
-#define SET_TTBR1(b, c, v) SET_CTX_REG(TTBR1, (b), (c), (v))
-#define SET_TTBCR(b, c, v) SET_CTX_REG(TTBCR, (b), (c), (v))
-#define SET_PAR(b, c, v) SET_CTX_REG(PAR, (b), (c), (v))
-#define SET_FSR(b, c, v) SET_CTX_REG(FSR, (b), (c), (v))
-#define SET_FSRRESTORE(b, c, v) SET_CTX_REG(FSRRESTORE, (b), (c), (v))
-#define SET_FAR(b, c, v) SET_CTX_REG(FAR, (b), (c), (v))
-#define SET_FSYNR0(b, c, v) SET_CTX_REG(FSYNR0, (b), (c), (v))
-#define SET_FSYNR1(b, c, v) SET_CTX_REG(FSYNR1, (b), (c), (v))
-#define SET_PRRR(b, c, v) SET_CTX_REG(PRRR, (b), (c), (v))
-#define SET_NMRR(b, c, v) SET_CTX_REG(NMRR, (b), (c), (v))
-#define SET_TLBLKCR(b, c, v) SET_CTX_REG(TLBLCKR, (b), (c), (v))
-#define SET_V2PSR(b, c, v) SET_CTX_REG(V2PSR, (b), (c), (v))
-#define SET_TLBFLPTER(b, c, v) SET_CTX_REG(TLBFLPTER, (b), (c), (v))
-#define SET_TLBSLPTER(b, c, v) SET_CTX_REG(TLBSLPTER, (b), (c), (v))
-#define SET_BFBCR(b, c, v) SET_CTX_REG(BFBCR, (b), (c), (v))
-#define SET_CTX_TLBIALL(b, c, v) SET_CTX_REG(CTX_TLBIALL, (b), (c), (v))
-#define SET_TLBIASID(b, c, v) SET_CTX_REG(TLBIASID, (b), (c), (v))
-#define SET_TLBIVA(b, c, v) SET_CTX_REG(TLBIVA, (b), (c), (v))
-#define SET_TLBIVAA(b, c, v) SET_CTX_REG(TLBIVAA, (b), (c), (v))
-#define SET_V2PPR(b, c, v) SET_CTX_REG(V2PPR, (b), (c), (v))
-#define SET_V2PPW(b, c, v) SET_CTX_REG(V2PPW, (b), (c), (v))
-#define SET_V2PUR(b, c, v) SET_CTX_REG(V2PUR, (b), (c), (v))
-#define SET_V2PUW(b, c, v) SET_CTX_REG(V2PUW, (b), (c), (v))
-#define SET_RESUME(b, c, v) SET_CTX_REG(RESUME, (b), (c), (v))
+#define SET_SCTLR(b, c, v) SET_CTX_REG_RELAXED(SCTLR, (b), (c), (v))
+#define SET_ACTLR(b, c, v) SET_CTX_REG_RELAXED(ACTLR, (b), (c), (v))
+#define SET_CONTEXTIDR(b, c, v) SET_CTX_REG_RELAXED(CONTEXTIDR, (b), (c), (v))
+#define SET_TTBR0(b, c, v) SET_CTX_REG_RELAXED(TTBR0, (b), (c), (v))
+#define SET_TTBR1(b, c, v) SET_CTX_REG_RELAXED(TTBR1, (b), (c), (v))
+#define SET_TTBCR(b, c, v) SET_CTX_REG_RELAXED(TTBCR, (b), (c), (v))
+#define SET_PAR(b, c, v) SET_CTX_REG_RELAXED(PAR, (b), (c), (v))
+#define SET_FSR(b, c, v) SET_CTX_REG_RELAXED(FSR, (b), (c), (v))
+#define SET_FSRRESTORE(b, c, v) SET_CTX_REG_RELAXED(FSRRESTORE, (b), (c), (v))
+#define SET_FAR(b, c, v) SET_CTX_REG_RELAXED(FAR, (b), (c), (v))
+#define SET_FSYNR0(b, c, v) SET_CTX_REG_RELAXED(FSYNR0, (b), (c), (v))
+#define SET_FSYNR1(b, c, v) SET_CTX_REG_RELAXED(FSYNR1, (b), (c), (v))
+#define SET_PRRR(b, c, v) SET_CTX_REG_RELAXED(PRRR, (b), (c), (v))
+#define SET_NMRR(b, c, v) SET_CTX_REG_RELAXED(NMRR, (b), (c), (v))
+#define SET_TLBLKCR(b, c, v) SET_CTX_REG_RELAXED(TLBLCKR, (b), (c), (v))
+#define SET_V2PSR(b, c, v) SET_CTX_REG_RELAXED(V2PSR, (b), (c), (v))
+#define SET_TLBFLPTER(b, c, v) SET_CTX_REG_RELAXED(TLBFLPTER, (b), (c), (v))
+#define SET_TLBSLPTER(b, c, v) SET_CTX_REG_RELAXED(TLBSLPTER, (b), (c), (v))
+#define SET_BFBCR(b, c, v) SET_CTX_REG_RELAXED(BFBCR, (b), (c), (v))
+#define SET_CTX_TLBIALL(b, c, v) SET_CTX_REG_RELAXED(CTX_TLBIALL, (b), (c), (v))
+#define SET_TLBIASID(b, c, v) SET_CTX_REG_RELAXED(TLBIASID, (b), (c), (v))
+#define SET_TLBIVA(b, c, v) SET_CTX_REG_RELAXED(TLBIVA, (b), (c), (v))
+#define SET_TLBIVAA(b, c, v) SET_CTX_REG_RELAXED(TLBIVAA, (b), (c), (v))
+#define SET_V2PPR(b, c, v) SET_CTX_REG_RELAXED(V2PPR, (b), (c), (v))
+#define SET_V2PPW(b, c, v) SET_CTX_REG_RELAXED(V2PPW, (b), (c), (v))
+#define SET_V2PUR(b, c, v) SET_CTX_REG_RELAXED(V2PUR, (b), (c), (v))
+#define SET_V2PUW(b, c, v) SET_CTX_REG_RELAXED(V2PUW, (b), (c), (v))
+#define SET_RESUME(b, c, v) SET_CTX_REG_RELAXED(RESUME, (b), (c), (v))
#define GET_SCTLR(b, c) GET_CTX_REG(SCTLR, (b), (c))
#define GET_ACTLR(b, c) GET_CTX_REG(ACTLR, (b), (c))
* [PATCH V5 7/7] iommu/msm: Remove driver BROKEN
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
` (4 preceding siblings ...)
2016-05-20 10:54 ` [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier Sricharan R
@ 2016-05-20 10:54 ` Sricharan R
2016-05-23 2:53 ` [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Archit Taneja
[not found] ` <1463741694-1735-1-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
7 siblings, 0 replies; 20+ messages in thread
From: Sricharan R @ 2016-05-20 10:54 UTC (permalink / raw)
To: devicetree, linux-arm-msm, joro, robdclark, iommu,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm, architt, arnd
Cc: sricharan
Now that the driver is DT adapted, bus_set_iommu gets called only
on compatible matching, so the driver should no longer break
multiplatform builds. Remove the BROKEN dependency.
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
drivers/iommu/Kconfig | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 761be35..f3d9fd0 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -90,7 +90,6 @@ config MSM_IOMMU
bool "MSM IOMMU Support"
depends on ARM
depends on ARCH_MSM8X60 || ARCH_MSM8960 || COMPILE_TEST
- depends on BROKEN
select IOMMU_API
select IOMMU_IO_PGTABLE_ARMV7S
help
* Re: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
[not found] ` <1463741694-1735-7-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2016-05-20 11:44 ` Arnd Bergmann
2016-05-20 12:20 ` Arnd Bergmann
2016-05-23 6:05 ` Sricharan
0 siblings, 2 replies; 20+ messages in thread
From: Arnd Bergmann @ 2016-05-20 11:44 UTC (permalink / raw)
To: Sricharan R
Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, joro-zLv9SwRftAIdnm+yROfE0A,
robdclark-Re5JQEeQqe8AvxtiuMwx3w,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ, architt-sgV2jX0FEOL9JmXXK+q4OQ
On Friday 20 May 2016 16:24:53 Sricharan R wrote:
> While using the generic pagetable ops the tlb maintenance
> operation gets completed in the sync callback. So use writel_relaxed
> for all register access and add a mb() at appropriate places.
>
> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
> ---
> drivers/iommu/msm_iommu.c | 24 +++++++--
> drivers/iommu/msm_iommu_hw-8xxx.h | 109 ++++++++++++++++++++------------------
> 2 files changed, 79 insertions(+), 54 deletions(-)
>
> diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
> index 0299a37..dfcaeef 100644
> --- a/drivers/iommu/msm_iommu.c
> +++ b/drivers/iommu/msm_iommu.c
> @@ -124,6 +124,9 @@ static void msm_iommu_reset(void __iomem *base, int ncb)
> SET_TLBLKCR(base, ctx, 0);
> SET_CONTEXTIDR(base, ctx, 0);
> }
> +
> + /* Ensure completion of relaxed writes from the above SET macros */
> + mb();
> }
Why do the above use the relaxed writes? Do you have any performance
numbers showing that skipping the sync for the reset makes a measurable
difference?
How did you prove that skipping the sync before the write is safe?
How about resetting the iommu less often instead?
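The tradeoff being questioned here — one implicit barrier per writel() versus relaxed writes followed by a single trailing mb() — can be modelled in plain userspace C. The stub accessors and the barrier counter below are stand-ins for illustration only, not the real MMIO helpers:

```c
#include <stdint.h>

/*
 * Userspace model: fake_writel() behaves like writel() (a barrier per
 * store), fake_writel_relaxed() like writel_relaxed() (no barrier).
 * fake_regs[] and the barriers counter are stand-ins for hardware.
 */
static uint32_t fake_regs[16];
static int barriers;

static void fake_mb(void) { barriers++; }

static void fake_writel(uint32_t val, int reg)
{
	fake_regs[reg] = val;
	fake_mb();		/* writel() orders every single store */
}

static void fake_writel_relaxed(uint32_t val, int reg)
{
	fake_regs[reg] = val;	/* no barrier here */
}

/* msm_iommu_reset()-style loop with the non-relaxed accessor */
static int reset_with_writel(int nregs)
{
	barriers = 0;
	for (int i = 0; i < nregs; i++)
		fake_writel(0, i);
	return barriers;	/* one barrier per register write */
}

/* the same loop converted to relaxed writes plus one trailing mb() */
static int reset_with_relaxed(int nregs)
{
	barriers = 0;
	for (int i = 0; i < nregs; i++)
		fake_writel_relaxed(0, i);
	fake_mb();		/* single sync after all writes are issued */
	return barriers;
}
```

The model makes the cost asymmetry concrete: the barrier count drops from N to 1, which is the entire performance argument for the conversion — and also why it only pays off on hot paths.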
> static void __flush_iotlb(void *cookie)
> @@ -141,6 +144,9 @@ static void __flush_iotlb(void *cookie)
> list_for_each_entry(master, &iommu->ctx_list, list)
> SET_CTX_TLBIALL(iommu->base, master->num, 0);
>
> + /* To ensure completion of TLBIALL above */
> + mb();
> +
> __disable_clocks(iommu);
> }
> fail:
__disable_clocks() takes spinlocks and accesses the hardware, which
in turn will also involve multiple syncs, so this seems like
a premature optimization.
Have you tried just leaving the clocks enabled? I'd suspect you could
gain more performance that way.
> @@ -181,7 +187,8 @@ fail:
>
> static void __flush_iotlb_sync(void *cookie)
> {
> - /* To avoid a null function pointer */
> + /* To ensure completion of the TLBIVA in __flush_iotlb_range */
> + mb();
> }
I don't understand the specific race from the comment.
What operation comes after this that relies on __flush_iotlb_range
having completed, and how does an mb() guarantee this?
This seems to be a bug fix that is unrelated to the change to
use writel_relaxed(), so better split it out into a separate
patch, with a longer changeset description. Did you spot this
race by running into incorrect data, or by inspection?
> static const struct iommu_gather_ops msm_iommu_gather_ops = {
> @@ -235,6 +242,9 @@ static void config_mids(struct msm_iommu_dev *iommu,
> /* Set security bit override to be Non-secure */
> SET_NSCFG(iommu->base, mid, 3);
> }
> +
> + /* Ensure completion of relaxed writes from the above SET macros */
> + mb();
> }
>
This looks similar to the msm_iommu_reset() case. Maybe put those
two into one patch, or find a way to run config_mids() less often so
it doesn't show up in the profiles any more.
If you hit this often enough that it causes performance problems,
it's more likely that there is a bug in your code than that changing
to relaxed accessors is a good idea.
> static void __reset_context(void __iomem *base, int ctx)
> @@ -257,6 +267,9 @@ static void __reset_context(void __iomem *base, int ctx)
> SET_TLBFLPTER(base, ctx, 0);
> SET_TLBSLPTER(base, ctx, 0);
> SET_TLBLKCR(base, ctx, 0);
> +
> + /* Ensure completion of relaxed writes from the above SET macros */
> + mb();
> }
>
> static void __program_context(void __iomem *base, int ctx,
> @@ -305,6 +318,9 @@ static void __program_context(void __iomem *base, int ctx,
>
> /* Enable the MMU */
> SET_M(base, ctx, 1);
> +
> + /* Ensure completion of relaxed writes from the above SET macros */
> + mb();
> }
>
> static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
Maybe one more patch for these? I can see that
__reset_context/__program_context would be run often enough to make
a difference, though maybe not as much as the flushes.
Also, split this in two patches: one that adds a comment describing
how you proved that you don't need barriers before the writes, and
another one adding the barrier afterwards, with a description of
the failure scenario.
> @@ -500,7 +516,8 @@ static phys_addr_t msm_iommu_iova_to_phys(struct iommu_domain *domain,
> /* Invalidate context TLB */
> SET_CTX_TLBIALL(iommu->base, master->num, 0);
> SET_V2PPR(iommu->base, master->num, va & V2Pxx_VA);
> -
> + /* Ensure completion of relaxed writes from the above SET macros */
> + mb();
> par = GET_PAR(iommu->base, master->num);
>
> /* We are dealing with a supersection */
In this case, I'd say it's better to rewrite the function to avoid the
read: iova_to_phys() should be fast, and not require hardware access.
Getting rid of the hardware access by using an in-memory cache for
this should gain more than just removing the barriers, as an MMIO read
is always slow.
> @@ -714,7 +731,8 @@ static int msm_iommu_probe(struct platform_device *pdev)
> par = GET_PAR(iommu->base, 0);
> SET_V2PCFG(iommu->base, 0, 0);
> SET_M(iommu->base, 0, 0);
> -
> + /* Ensure completion of relaxed writes from the above SET macros */
> + mb();
> if (!par) {
Probe() clearly isn't performance sensitive, so please back this out
and use the non-relaxed accessors.
> diff --git a/drivers/iommu/msm_iommu_hw-8xxx.h b/drivers/iommu/msm_iommu_hw-8xxx.h
> index fc16010..fe2c5ca 100644
> --- a/drivers/iommu/msm_iommu_hw-8xxx.h
> +++ b/drivers/iommu/msm_iommu_hw-8xxx.h
> @@ -24,13 +24,19 @@
> #define GET_CTX_REG(reg, base, ctx) \
> (readl((base) + (reg) + ((ctx) << CTX_SHIFT)))
>
> -#define SET_GLOBAL_REG(reg, base, val) writel((val), ((base) + (reg)))
> +/*
> + * The writes to the global/context registers needs to be synced only after
> + * all the configuration writes are done. So use relaxed accessors and
> + * a barrier at the end.
> + */
> +#define SET_GLOBAL_REG_RELAXED(reg, base, val) \
> + writel_relaxed((val), ((base) + (reg)))
>
> -#define SET_CTX_REG(reg, base, ctx, val) \
> - writel((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
> +#define SET_CTX_REG_RELAXED(reg, base, ctx, val) \
> + writel_relaxed((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
Please add the relaxed accessors first in a separate patch, and then
change over one user at a time before you remove the non-relaxed ones.
It's hard to believe that all users are actually performance critical,
so start with the ones that gain the most performance. As commented above,
some of the conversions seem particularly fishy and it's likely that
a slow reset() function should not be fixed by making reset() faster
but by calling it less often.
> /* Context register setters/getters */
> -#define SET_SCTLR(b, c, v) SET_CTX_REG(SCTLR, (b), (c), (v))
> -#define SET_ACTLR(b, c, v) SET_CTX_REG(ACTLR, (b), (c), (v))
> -#define SET_CONTEXTIDR(b, c, v) SET_CTX_REG(CONTEXTIDR, (b), (c), (v))
> -#define SET_TTBR0(b, c, v) SET_CTX_REG(TTBR0, (b), (c), (v))
> -#define SET_TTBR1(b, c, v) SET_CTX_REG(TTBR1, (b), (c), (v))
> +#define SET_SCTLR(b, c, v) SET_CTX_REG_RELAXED(SCTLR, (b), (c), (v))
> +#define SET_ACTLR(b, c, v) SET_CTX_REG_RELAXED(ACTLR, (b), (c), (v))
> +#define SET_CONTEXTIDR(b, c, v) SET_CTX_REG_RELAXED(CONTEXTIDR, (b), (c), (v))
> +#define SET_TTBR0(b, c, v) SET_CTX_REG_RELAXED(TTBR0, (b), (c), (v))
> +#define SET_TTBR1(b, c, v) SET_CTX_REG_RELAXED(TTBR1, (b), (c), (v))
> +#define SET_TTBCR(b, c, v) SET_CTX_REG_RELAXED(TTBCR, (b), (c), (v))
> +#define SET_PAR(b, c, v) SET_CTX_REG_RELAXED(PAR, (b), (c), (v))
Can you just remove those macros, and open-code the function calls?
This is an unnecessary obfuscation that now also hides the fact that you
are using relaxed accessors.
Arnd
* Re: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-20 11:44 ` Arnd Bergmann
@ 2016-05-20 12:20 ` Arnd Bergmann
2016-05-23 6:05 ` Sricharan
1 sibling, 0 replies; 20+ messages in thread
From: Arnd Bergmann @ 2016-05-20 12:20 UTC (permalink / raw)
To: Sricharan R
Cc: devicetree, architt, linux-arm-msm, joro, iommu, robdclark,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm
On Friday 20 May 2016 13:44:10 Arnd Bergmann wrote:
> > #define GET_CTX_REG(reg, base, ctx) \
> > (readl((base) + (reg) + ((ctx) << CTX_SHIFT)))
> >
> > -#define SET_GLOBAL_REG(reg, base, val) writel((val), ((base) + (reg)))
> > +/*
> > + * The writes to the global/context registers needs to be synced only after
> > + * all the configuration writes are done. So use relaxed accessors and
> > + * a barrier at the end.
> > + */
> > +#define SET_GLOBAL_REG_RELAXED(reg, base, val) \
> > + writel_relaxed((val), ((base) + (reg)))
> >
> > -#define SET_CTX_REG(reg, base, ctx, val) \
> > - writel((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
> > +#define SET_CTX_REG_RELAXED(reg, base, ctx, val) \
> > + writel_relaxed((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
>
> Please add the relaxed accessors first in a separate patch, and then
> change over one user at a time before you remove the non-relaxed ones.
>
> It's hard to believe that all users are actually performance critical,
> so start with the ones that gain the most performance. As commented above,
> some of the conversions seem particularly fishy and it's likely that
> a slow reset() function should not be fixed by making reset() faster
> but by calling it less often.
One more thing, please also convert them into inline functions when you modify
them, and use normal names such as
static inline void msm_iommu_write(struct msm_iommu_dev *iommu, unsigned int reg, u32 val)
{
writel(val, iommu->base + reg);
}
static inline void msm_iommu_context_write(struct msm_iommu_ctx_dev *ctx, unsigned int reg, u32 val)
{
writel(val, ctx->iommu->base + (ctx->num << CTX_SHIFT) + reg);
}
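Filled out with stub structures so it compiles and runs in userspace — the field names, the CTX_SHIFT value, and fake_writel() are assumptions standing in for the driver's real types and MMIO accessor — the suggested inline accessors look like:

```c
#include <stdint.h>

#define CTX_SHIFT 12	/* per-context register stride; value assumed here */

struct msm_iommu_dev { uint8_t *base; };
struct msm_iommu_ctx_dev { struct msm_iommu_dev *iommu; int num; };

/* stand-in for writel(): stores to a plain buffer instead of MMIO */
static void fake_writel(uint32_t val, uint8_t *addr)
{
	*(uint32_t *)addr = val;
}

static inline void msm_iommu_write(struct msm_iommu_dev *iommu,
				   unsigned int reg, uint32_t val)
{
	fake_writel(val, iommu->base + reg);
}

static inline void msm_iommu_context_write(struct msm_iommu_ctx_dev *ctx,
					   unsigned int reg, uint32_t val)
{
	/* parenthesize the shift: '+' binds tighter than '<<' in C */
	fake_writel(val, ctx->iommu->base + (ctx->num << CTX_SHIFT) + reg);
}
```

Compared with the SET_* macro layer, the inline functions type-check their arguments and keep the (relaxed or not) accessor visible at the call site, which is the point of the suggestion.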
Arnd
* Re: [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
` (5 preceding siblings ...)
2016-05-20 10:54 ` [PATCH V5 7/7] iommu/msm: Remove driver BROKEN Sricharan R
@ 2016-05-23 2:53 ` Archit Taneja
[not found] ` <1463741694-1735-1-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
7 siblings, 0 replies; 20+ messages in thread
From: Archit Taneja @ 2016-05-23 2:53 UTC (permalink / raw)
To: Sricharan R
Cc: devicetree, linux-arm-msm, joro, robdclark, iommu,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm, arnd
On 05/20/2016 04:24 PM, Sricharan R wrote:
> The msm_iommu.c driver currently works based on platform data.
> A single master device can be connected to more than one iommu and multiple
> contexts in each of the iommu. This association between master and iommus was
> represented from platform data using parent/child devices. The master drivers
> were responsible for attaching all of the iommus/context to a domain. Now the
> platform data support is removed and DT support is added. The master/iommus are
> added through generic iommu bindings.
>
> This is essentially rework of the patch posted earlier by
> Rob Clark <robdclark@gmail.com>. This series folds the changes in to the
> existing driver with the addition of generic bindings.
>
> http://www.spinics.net/lists/linux-arm-msm/msg10077.html
>
> Tested this series on ifc6410 board.
>
> [V5] Changed the compatible binding name as per comments, added comments
> for usage of barriers in patch 6.
>
> [V4] Addressed comments for making the iommu compatible binding more soc
> specific and updated the documentation for the iommu clocks.
>
> [V3] Addressed comments to correct the usage
> of the #iommu-cells binding, improve the flush_iotlb_range function,
> added a new patch to use writel_relaxed for register access and split
> up the documentation patch.
>
> [V2] Adapted the driver to use generic ARMV7S short descriptor pagetable ops
> and addressed comments.
>
> [V1]
> https://lists.linuxfoundation.org/pipermail/iommu/2015-August/014074.html
Tested with the drm/msm driver as an iommu client on IFC6410 board
(APQ8064).
Tested-by: Archit Taneja <architt@codeaurora.org>
>
> Sricharan R (7):
> iommu/msm: Add DT adaptation
> documentation: iommu: Add bindings for msm,iommu-v0 ip
> iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.c
> iommu/msm: Add support for generic master bindings
> iommu/msm: use generic ARMV7S short descriptor pagetable ops
> iommu/msm: Use writel_relaxed and add a barrier
> iommu/msm: Remove driver BROKEN
>
> .../devicetree/bindings/iommu/msm,iommu-v0.txt | 64 ++
> drivers/iommu/Kconfig | 2 +-
> drivers/iommu/Makefile | 2 +-
> drivers/iommu/msm_iommu.c | 885 +++++++++++----------
> drivers/iommu/msm_iommu.h | 73 +-
> drivers/iommu/msm_iommu_dev.c | 381 ---------
> drivers/iommu/msm_iommu_hw-8xxx.h | 109 +--
> 7 files changed, 636 insertions(+), 880 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
> delete mode 100644 drivers/iommu/msm_iommu_dev.c
>
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, hosted by The Linux Foundation
* RE: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-20 11:44 ` Arnd Bergmann
2016-05-20 12:20 ` Arnd Bergmann
@ 2016-05-23 6:05 ` Sricharan
2016-05-24 14:00 ` Arnd Bergmann
1 sibling, 1 reply; 20+ messages in thread
From: Sricharan @ 2016-05-23 6:05 UTC (permalink / raw)
To: 'Arnd Bergmann'
Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, architt-sgV2jX0FEOL9JmXXK+q4OQ,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ
Hi Arnd,
>> @@ -124,6 +124,9 @@ static void msm_iommu_reset(void __iomem *base, int ncb)
>> SET_TLBLKCR(base, ctx, 0);
>> SET_CONTEXTIDR(base, ctx, 0);
>> }
>> +
>> + /* Ensure completion of relaxed writes from the above SET macros */
>> + mb();
>> }
>
>Why do the above use the relaxed writes? Do you have any performance
>numbers showing that skipping the sync for the reset makes a measurable
>difference?
>
>How did you prove that skipping the sync before the write is safe?
>
>How about resetting the iommu less often instead?
>
I measured the numbers only for the full usecase path, not for the
reset path alone, and saw an improvement of about 5% overall.
As you said, the reset path is called less often
and might not bring a measurable change. I did not see a difference in behavior
when changing the sync to happen after the writes, but my understanding was that
a sync after the writes is required to ensure write completion.
I should have made smaller patches for this change.
The only patch relevant for this series is the one that changes the writes in the _iotlb_range
function; the rest of the changes should be added one by one in a separate series.
>> static void __flush_iotlb(void *cookie)
>> @@ -141,6 +144,9 @@ static void __flush_iotlb(void *cookie)
>> list_for_each_entry(master, &iommu->ctx_list, list)
>> SET_CTX_TLBIALL(iommu->base, master->num, 0);
>>
>> + /* To ensure completion of TLBIALL above */
>> + mb();
>> +
>> __disable_clocks(iommu);
>> }
>> fail:
>
>__disable_clocks() takes spinlocks and accesses the hardware, which
>in turn will also involve multiple syncs, so this seems like
>a premature optimization.
>
>Have you tried just leaving the clocks enabled? I'd suspect you could
>gain more performance that way.
I have not tried keeping the clocks enabled always. The intention here was
to have more granular clock control. But I agree that the barrier might not
be needed in this case, as it is indirectly taken care of. I will remove it here
and add an explanation.
>
>> @@ -181,7 +187,8 @@ fail:
>>
>> static void __flush_iotlb_sync(void *cookie)
>> {
>> - /* To avoid a null function pointer */
>> + /* To ensure completion of the TLBIVA in __flush_iotlb_range */
>> + mb();
>> }
>
>I don't understand the specific race from the comment.
>
>What operation comes after this that relies on __flush_iotlb_range
>having completed, and how does an mb() guarantee this?
>
The flush_iotlb_range operation invalidates the TLB for writes to the
pagetable, and finally the sync operation is called to ensure completion
of the flush before returning to the client
of the iommu. In the case of this iommu, only a barrier is required to
ensure completion of the invalidate operation.
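That contract — invalidates issued in the range callback, completion guaranteed only by the sync callback — can be sketched with stub counters. This is a userspace model for illustration, not the driver's actual SET_TLBIVA()/mb() code:

```c
/*
 * Minimal model of the io-pgtable TLB split described above: the range
 * callback issues relaxed invalidates and returns without ordering them;
 * only the sync callback guarantees their completion.  The counters
 * stand in for hardware state.
 */
static int pending_invalidates;
static int completed_invalidates;

static void flush_iotlb_range_model(int npages)
{
	/* one TLBIVA per page via writel_relaxed(), no barrier between them */
	for (int i = 0; i < npages; i++)
		pending_invalidates++;
}

static void flush_iotlb_sync_model(void)
{
	/* the single mb() in __flush_iotlb_sync() plays this role */
	completed_invalidates += pending_invalidates;
	pending_invalidates = 0;
}
```

The model shows why the empty sync callback was a real gap: until the sync runs, nothing has forced the invalidates to complete, so a client that remaps and reuses an IOVA immediately after unmap could race with a stale TLB entry.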
>This seems to be a bug fix that is unrelated to the change to
>use writel_relaxed(), so better split it out into a separate
>patch, with a longer changeset description. Did you spot this
>race by running into incorrect data, or by inspection?
>
No, I did not see a data corruption issue without the mb(),
but it could have been hidden in some other way as well.
Another difference is that the sync was previously done before
the write and is now moved after it. As I understand it,
a sync after the write is correct. So I will rework this patch with a fuller
description and move the rest of the changes out.
>> static const struct iommu_gather_ops msm_iommu_gather_ops = {
>> @@ -235,6 +242,9 @@ static void config_mids(struct msm_iommu_dev *iommu,
>> /* Set security bit override to be Non-secure */
>> SET_NSCFG(iommu->base, mid, 3);
>> }
>> +
>> + /* Ensure completion of relaxed writes from the above SET macros */
>> + mb();
>> }
>>
>
>This looks similar to the msm_iommu_reset() case. Maybe put those
>two into one patch, or find a way to run config_mids() less often so
>it doesn't show up in the profiles any more.
>
>If you hit this often enough that it causes performance problems,
>it's more likely that there is a bug in your code than that changing
>to relaxed accessors is a good idea.
>
Ok, similar to reset, this is called less often compared to the flushes and
may not show a difference in performance. I will move this out.
>> static void __reset_context(void __iomem *base, int ctx)
>> @@ -257,6 +267,9 @@ static void __reset_context(void __iomem *base, int ctx)
>> SET_TLBFLPTER(base, ctx, 0);
>> SET_TLBSLPTER(base, ctx, 0);
>> SET_TLBLKCR(base, ctx, 0);
>> +
>> + /* Ensure completion of relaxed writes from the above SET macros */
>> + mb();
>> }
>>
>> static void __program_context(void __iomem *base, int ctx,
>> @@ -305,6 +318,9 @@ static void __program_context(void __iomem *base, int ctx,
>>
>> /* Enable the MMU */
>> SET_M(base, ctx, 1);
>> +
>> + /* Ensure completion of relaxed writes from the above SET macros */
>> + mb();
>> }
>>
>> static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
>
>Maybe one more patch for these? I can see that
>__reset_context/__program_context would be run often enough to make
>a difference, though maybe not as much as the flushes.
>
>Also, split this in two patches: one that adds a comment describing
>how you proved that you don't need barriers before the writes, and
>another one adding the barrier afterwards, with a description of
>the failure scenario.
Ok, I will split it. I did not explicitly see a failure, though; it is based on
the understanding that a sync after the write is more appropriate
than one before it.
>
>> @@ -500,7 +516,8 @@ static phys_addr_t msm_iommu_iova_to_phys(struct iommu_domain *domain,
>> /* Invalidate context TLB */
>> SET_CTX_TLBIALL(iommu->base, master->num, 0);
>> SET_V2PPR(iommu->base, master->num, va & V2Pxx_VA);
>> -
>> + /* Ensure completion of relaxed writes from the above SET macros */
>> + mb();
>> par = GET_PAR(iommu->base, master->num);
>>
>> /* We are dealing with a supersection */
>
>In this case, I'd say it's better to rewrite the function to avoid the
>read: iova_to_phys() should be fast, and not require hardware access.
>Getting rid of the hardware access by using an in-memory cache for
>this should gain more than just removing the barriers, as an MMIO read
>is always slow
Ok, you mean using a software walk-through? I will check this to measure
the latency difference. If that is faster, then the io-pgtable ops itself provides a
function for iova_to_phys conversion, so that can be used.
>
>> @@ -714,7 +731,8 @@ static int msm_iommu_probe(struct platform_device *pdev)
>> par = GET_PAR(iommu->base, 0);
>> SET_V2PCFG(iommu->base, 0, 0);
>> SET_M(iommu->base, 0, 0);
>> -
>> + /* Ensure completion of relaxed writes from the above SET macros */
>> + mb();
>> if (!par) {
>
>Probe() clearly isn't performance sensitive, so please back this out
>and use the non-relaxed accessors.
>
Ok, will change this back.
>> diff --git a/drivers/iommu/msm_iommu_hw-8xxx.h b/drivers/iommu/msm_iommu_hw-8xxx.h
>> index fc16010..fe2c5ca 100644
>> --- a/drivers/iommu/msm_iommu_hw-8xxx.h
>> +++ b/drivers/iommu/msm_iommu_hw-8xxx.h
>> @@ -24,13 +24,19 @@
>> #define GET_CTX_REG(reg, base, ctx) \
>> (readl((base) + (reg) + ((ctx) << CTX_SHIFT)))
>>
>> -#define SET_GLOBAL_REG(reg, base, val) writel((val), ((base) + (reg)))
>> +/*
>> + * The writes to the global/context registers needs to be synced only after
>> + * all the configuration writes are done. So use relaxed accessors and
>> + * a barrier at the end.
>> + */
>> +#define SET_GLOBAL_REG_RELAXED(reg, base, val) \
>> + writel_relaxed((val), ((base) + (reg)))
>>
>> -#define SET_CTX_REG(reg, base, ctx, val) \
>> - writel((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
>> +#define SET_CTX_REG_RELAXED(reg, base, ctx, val) \
>> + writel_relaxed((val), ((base) + (reg) + ((ctx) << CTX_SHIFT)))
>
>Please add the relaxed accessors first in a separate patch, and then
>change over one user at a time before you remove the non-relaxed ones.
>
ok
>It's hard to believe that all users are actually performance critical,
>so start with the ones that gain the most performance. As commented above,
>some of the conversions seem particularly fishy and it's likely that
>a slow reset() function should not be fixed by making reset() faster
>but by calling it less often.
Ok, will change this. I will split this into separate series: one patch modifying
the relevant flush_iotlb_range function in this series, and the rest of the changes
as driver improvements in another.
>Can you just remove those macros, and open-code the function calls?
>
>This is an unnecessary obfuscation that now also hides the fact that you
>are using relaxed accessors.
Ok, will make these into inline functions to make it obvious.
Regards,
Sricharan
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support
[not found] ` <1463741694-1735-1-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-05-20 10:54 ` [PATCH V5 3/7] iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.c Sricharan R
@ 2016-05-23 8:10 ` Srinivas Kandagatla
1 sibling, 0 replies; 20+ messages in thread
From: Srinivas Kandagatla @ 2016-05-23 8:10 UTC (permalink / raw)
To: Sricharan R, devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, joro-zLv9SwRftAIdnm+yROfE0A,
robdclark-Re5JQEeQqe8AvxtiuMwx3w,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ, architt-sgV2jX0FEOL9JmXXK+q4OQ,
arnd-r2nGTMty4D4
Hi Sricharan,
Thanks for the patchset,
On 20/05/16 11:54, Sricharan R wrote:
> The msm_iommu.c driver currently works based on platform data.
> A single master device can be connected to more than one iommu and multiple
> contexts in each of the iommu. This association between master and iommus was
> represented from platform data using parent/child devices. The master drivers
> were responsible for attaching all of the iommus/context to a domain. Now the
> platform data support is removed and DT support is added. The master/iommus are
> added through generic iommu bindings.
...
> [V1]
> https://lists.linuxfoundation.org/pipermail/iommu/2015-August/014074.html
>
> Sricharan R (7):
> iommu/msm: Add DT adaptation
> documentation: iommu: Add bindings for msm,iommu-v0 ip
> iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.c
> iommu/msm: Add support for generic master bindings
> iommu/msm: use generic ARMV7S short descriptor pagetable ops
> iommu/msm: Use writel_relaxed and add a barrier
> iommu/msm: Remove driver BROKEN
>
> .../devicetree/bindings/iommu/msm,iommu-v0.txt | 64 ++
> drivers/iommu/Kconfig | 2 +-
> drivers/iommu/Makefile | 2 +-
> drivers/iommu/msm_iommu.c | 885 +++++++++++----------
> drivers/iommu/msm_iommu.h | 73 +-
> drivers/iommu/msm_iommu_dev.c | 381 ---------
> drivers/iommu/msm_iommu_hw-8xxx.h | 109 +--
> 7 files changed, 636 insertions(+), 880 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
> delete mode 100644 drivers/iommu/msm_iommu_dev.c
Tested it on IFC6410 and DB600c.
Tested-by: Srinivas Kandagatla <srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
--srini
* Re: [PATCH V5 2/7] documentation: iommu: Add bindings for msm,iommu-v0 ip
[not found] ` <1463741694-1735-3-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2016-05-23 21:23 ` Rob Herring
0 siblings, 0 replies; 20+ messages in thread
From: Rob Herring @ 2016-05-23 21:23 UTC (permalink / raw)
To: Sricharan R
Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, joro-zLv9SwRftAIdnm+yROfE0A,
robdclark-Re5JQEeQqe8AvxtiuMwx3w,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ, architt-sgV2jX0FEOL9JmXXK+q4OQ,
arnd-r2nGTMty4D4
On Fri, May 20, 2016 at 04:24:49PM +0530, Sricharan R wrote:
> The MSM IOMMU is an implementation compatible with the ARM VMSA short
> descriptor page tables. It provides address translation for bus masters outside
> of the CPU, each connected to the IOMMU through a port called micro-TLB.
> Adding the DT bindings for the same.
>
> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
> ---
> .../devicetree/bindings/iommu/msm,iommu-v0.txt | 64 ++++++++++++++++++++++
> 1 file changed, 64 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
Acked-by: Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-23 6:05 ` Sricharan
@ 2016-05-24 14:00 ` Arnd Bergmann
2016-05-25 10:45 ` Sricharan
0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2016-05-24 14:00 UTC (permalink / raw)
To: Sricharan
Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, architt-sgV2jX0FEOL9JmXXK+q4OQ,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, joro-zLv9SwRftAIdnm+yROfE0A,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
robdclark-Re5JQEeQqe8AvxtiuMwx3w,
srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA, robin.murphy-5wv7dgnIgG8,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ
On Monday, May 23, 2016 11:35:04 AM CEST Sricharan wrote:
> Hi Arnd,
>
> >> @@ -124,6 +124,9 @@ static void msm_iommu_reset(void __iomem *base, int ncb)
> >> SET_TLBLKCR(base, ctx, 0);
> >> SET_CONTEXTIDR(base, ctx, 0);
> >> }
> >> +
> >> + /* Ensure completion of relaxed writes from the above SET macros */
> >> + mb();
> >> }
> >
> >Why do the above use the relaxed writes? Do you have any performance
> >numbers showing that skipping the sync for the reset makes a measurable
> >difference?
> >
> >How did you prove that skipping the sync before the write is safe?
> >
> >How about resetting the iommu less often instead?
> >
>
> I had measured the numbers only for the full usecase path, not for the
> reset path alone. I saw improvement of about 5% on full numbers.
> As you said, the reset path would be called only less often
> and might not bring a measurable change. I did not see a difference in behavior
> when changing the sync to happen after the writes.
Ok, then better not change it.
> But my understanding was that
> the sync after the writes was required to ensure write completion.
Can you cite the relevant documentation on this? Is this specific
to the Qualcomm CPU implementation or the IOMMU? I don't think
the ARM architecture requires anything like this in general.
> I should have made smaller patches to do this change.
> The only patch relevant for this series is the one that changes the write in _iotlb_range
> function. Rest of the changes, should be added one by one in a separate series.
If you see the same 5% performance improvement with a simpler change, then
better do only that. The IOMMU infrastructure is rather sensitive to
having correct barriers everywhere, so this minimizes the risk of getting
it wrong somewhere.
> >> @@ -181,7 +187,8 @@ fail:
> >>
> >> static void __flush_iotlb_sync(void *cookie)
> >> {
> >> - /* To avoid a null function pointer */
> >> + /* To ensure completion of the TLBIVA in __flush_iotlb_range */
> >> + mb();
> >> }
> >
> >I don't understand the specific race from the comment.
> >
> >What operation comes after this that relies on __flush_iotlb_range
> >having completed, and how does an mb() guarantee this?
> >
>
> The flush_iotlb_range operation invalidates the tlb for writes to
> pagetable and the finally calls the sync operation to ensure completion
> of the flush and this is required before returning back to the client
> of the iommu. In the case of this iommu, only a barrier is required to
> ensure completion of the invalidate operation.
This doesn't answer my question: What operation would a client do
that requires the flush to be completed here? A barrier is always
defined in terms of things that come before it in combination with
things that come after it.
Any operation that could trigger a DMA from a device is required
to have a barrier preceding it (usually wmb() one implied by writel()),
so this is clearly not about a driver that installs a DMA mapping
before starting a DMA, but I don't see what else it would be.
> >This seems to be a bug fix that is unrelated to the change to
> >use writel_relaxed(), so better split it out into a separate
> >patch, with a longer changeset description. Did you spot this
> >race by running into incorrect data, or by inspection?
> >
>
> No i did not see a data corruption issue without the mb(),
> but that it would have been hidden in someother way as well.
> Another difference was the sync was done before the write
> previously and now its moved after the write. As i understand
> sync after the write is correct. So i will change this patch with more
> description and move rest of that changes out.
Ok.
> >> @@ -500,7 +516,8 @@ static phys_addr_t msm_iommu_iova_to_phys(struct iommu_domain *domain,
> >> /* Invalidate context TLB */
> >> SET_CTX_TLBIALL(iommu->base, master->num, 0);
> >> SET_V2PPR(iommu->base, master->num, va & V2Pxx_VA);
> >> -
> >> + /* Ensure completion of relaxed writes from the above SET macros */
> >> + mb();
> >> par = GET_PAR(iommu->base, master->num);
> >>
> >> /* We are dealing with a supersection */
> >
> >In this case, I'd say it's better to rewrite the function to avoid the
> >read: iova_to_phys() should be fast, and not require hardware access.
> >Getting rid of the hardware access by using an in-memory cache for
> >this should gain more than just removing the barriers, as an MMIO read
> >is always slow
>
> Ok, you mean using the software walk through ? I will check on this to measure
> the latency difference. If thats true, then the iopgtable ops itself provides a
> function for iova_to_phys conversion, so that can be used.
I hadn't realized that this is a lookup in the hardware, rather than
reading a static register. It's probably a good idea to check this
anyway.
Arnd
* RE: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-24 14:00 ` Arnd Bergmann
@ 2016-05-25 10:45 ` Sricharan
2016-05-25 12:18 ` Arnd Bergmann
0 siblings, 1 reply; 20+ messages in thread
From: Sricharan @ 2016-05-25 10:45 UTC (permalink / raw)
To: 'Arnd Bergmann'
Cc: devicetree, architt, linux-arm-msm, joro, iommu, robdclark,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm
Hi Arnd,
<snip..>
>> I had measured the numbers only for the full usecase path, not for the
>> reset path alone. I saw improvement of about 5% on full numbers.
>> As you said, the reset path would be called only less often
>> and might not bring a measurable change. I did not see a difference in behavior
>> when changing the sync to happen after the writes.
>
>Ok, then better not change it.
>
Ok.
>> But my understanding was that
>> the sync after the writes was required to ensure write completion.
>
>Can you cite the relevant documentation on this? Is this specific
>to the Qualcomm CPU implementation or the IOMMU? I don't think
>the ARM architecture requires anything like this in general.
>
My explanation was not correct. The whole reason for doing the
sync here was to make sure that the client observes the effect
of the TLB invalidation before starting any DMA operation.
But as you have explained below, if it is implicit that any operation
from a device that could trigger a DMA should have a barrier preceding
it, then the sync here may not be needed.
>
>> >> @@ -181,7 +187,8 @@ fail:
>> >>
>> >> static void __flush_iotlb_sync(void *cookie)
>> >> {
>> >> - /* To avoid a null function pointer */
>> >> + /* To ensure completion of the TLBIVA in __flush_iotlb_range */
>> >> + mb();
>> >> }
>> >
>> >I don't understand the specific race from the comment.
>> >
>> >What operation comes after this that relies on __flush_iotlb_range
>> >having completed, and how does an mb() guarantee this?
>> >
>>
>> The flush_iotlb_range operation invalidates the tlb for writes to
>> pagetable and the finally calls the sync operation to ensure completion
>> of the flush and this is required before returning back to the client
>> of the iommu. In the case of this iommu, only a barrier is required to
>> ensure completion of the invalidate operation.
>
>This doesn't answer my question: What operation would a client do
>that requires the flush to be completed here? A barrier is always
>defined in terms of things that come before it in combination with
>things that come after it.
>
>Any operation that could trigger a DMA from a device is required
>to have a barrier preceding it (usually wmb() one implied by writel()),
>so this is clearly not about a driver that installs a DMA mapping
>before starting a DMA, but I don't see what else it would be.
>
Ok, so I was doing this from the idea that other iommu drivers
were polling on a status bit in their sync call to ensure completion
of pending TLB invalidations. But in this case there is no status bit,
so I added a barrier to avoid ordering issues before the client
triggers the DMA operation. But if, as you say above, it is implicit that
the device would have a barrier before starting the trigger, then the
barrier here becomes redundant.
<snip..>
>> >In this case, I'd say it's better to rewrite the function to avoid the
>> >read: iova_to_phys() should be fast, and not require hardware access.
>> >Getting rid of the hardware access by using an in-memory cache for
>> >this should gain more than just removing the barriers, as an MMIO read
>> >is always slow
>>
>> Ok, you mean using the software walk through ? I will check on this to measure
>> the latency difference. If thats true, then the iopgtable ops itself provides a
>> function for iova_to_phys conversion, so that can be used.
>
>I hadn't realized that this is a lookup in the hardware, rather than
>reading a static register. It's probably a good idea to check this
>anyway.
Ok, I will measure the difference and keep the better one.
Regards,
Sricharan
* Re: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-25 10:45 ` Sricharan
@ 2016-05-25 12:18 ` Arnd Bergmann
2016-05-25 13:19 ` Sricharan
0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2016-05-25 12:18 UTC (permalink / raw)
To: Sricharan
Cc: devicetree, architt, linux-arm-msm, joro, iommu, robdclark,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm
On Wednesday, May 25, 2016 4:15:31 PM CEST Sricharan wrote:
> >
> >Any operation that could trigger a DMA from a device is required
> >to have a barrier preceding it (usually wmb() one implied by writel()),
> >so this is clearly not about a driver that installs a DMA mapping
> >before starting a DMA, but I don't see what else it would be.
> >
>
> Ok, so i was doing this from the idea that, other iommu drivers
> where polling on a status bit in their sync call to ensure completion
> of pending TLB invalidations. But in this case, there is no status bit.
> So added a barrier to have no ordering issues before the client
> triggers the dma operation. But as you say above that it is implicit that
> the device would have a barrier before starting the trigger, then the
> barrier here becomes redundant.
Ok. There are two more things to note here:
* On other platforms, polling the register is likely required because
an MMIO write is "posted", meaning that a sync after writel() will
only ensure that it has left the CPU write queue, but it may still be
on the bus fabric and whatever side-effects are triggered by the
write are normally not guaranteed to be completed even after the
'sync'. You need to check the datasheet for your IOMMU to find out
whether the 'dsb' instruction actually has any effect on the IOMMU.
If not, then neither the barrier that you add here nor the barrier
in the following writel() is sufficient.
* The one barrier that is normally required in an IOMMU is between
updating the in-memory page tables and the following MMIO store
that triggers the TLB flush for that entry. This barrier is
implied by writel() but not writel_relaxed(). If you don't have
a hardware page table walker in your IOMMU, you don't need to worry
about this.
Arnd
* RE: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-25 12:18 ` Arnd Bergmann
@ 2016-05-25 13:19 ` Sricharan
2016-05-25 14:15 ` Arnd Bergmann
0 siblings, 1 reply; 20+ messages in thread
From: Sricharan @ 2016-05-25 13:19 UTC (permalink / raw)
To: 'Arnd Bergmann'
Cc: devicetree, architt, linux-arm-msm, joro, iommu, robdclark,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm
Hi Arnd,
>> Ok, so i was doing this from the idea that, other iommu drivers
>> where polling on a status bit in their sync call to ensure completion
>> of pending TLB invalidations. But in this case, there is no status bit.
>> So added a barrier to have no ordering issues before the client
>> triggers the dma operation. But as you say above that it is implicit that
>> the device would have a barrier before starting the trigger, then the
>> barrier here becomes redundant.
>
>Ok. There are two more things to note here:
>
>* On other platforms, polling the register is likely required because
> an MMIO write is "posted", meaning that a sync after writel() will
> only ensure that it has left the CPU write queue, but it may still be
> on the bus fabric and whatever side-effects are triggered by the
> write are normally not guaranteed to be completed even after the
> 'sync'. You need to check the datasheet for your IOMMU to find out
> whether the 'dsb' instruction actually has any effect on the IOMMU.
> If not, then neither the barrier that you add here nor the barrier
> in the following writel() is sufficient.
>
Thanks for the detailed explanation; I will check this. So with this,
I think that if the iommu does not support polling for its status,
then it should honor 'dsb', otherwise there will be no way to make it
coherent?
>* The one barrier that is normally required in an IOMMU is between
> updating the in-memory page tables and the following MMIO store
> that triggers the TLB flush for that entry. This barrier is
> implied by writel() but not writel_relaxed(). If you don't have
> a hardware page table walker in your IOMMU, you don't need to worry
> about this.
>
To get my understanding correct here: is the barrier required because
of a speculative fetch?
Regards,
Sricharan
* Re: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-25 13:19 ` Sricharan
@ 2016-05-25 14:15 ` Arnd Bergmann
2016-05-25 16:49 ` Sricharan
0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2016-05-25 14:15 UTC (permalink / raw)
To: Sricharan
Cc: devicetree, architt, linux-arm-msm, joro, iommu, robdclark,
srinivas.kandagatla, laurent.pinchart, treding, robin.murphy,
linux-arm-kernel, stepanm
On Wednesday, May 25, 2016 6:49:18 PM CEST Sricharan wrote:
> Hi Arnd,
>
> >> Ok, so i was doing this from the idea that, other iommu drivers
> >> where polling on a status bit in their sync call to ensure completion
> >> of pending TLB invalidations. But in this case, there is no status bit.
> >> So added a barrier to have no ordering issues before the client
> >> triggers the dma operation. But as you say above that it is implicit that
> >> the device would have a barrier before starting the trigger, then the
> >> barrier here becomes redundant.
> >
> >Ok. There are two more things to note here:
> >
> >* On other platforms, polling the register is likely required because
> > an MMIO write is "posted", meaning that a sync after writel() will
> > only ensure that it has left the CPU write queue, but it may still be
> > on the bus fabric and whatever side-effects are triggered by the
> > write are normally not guaranteed to be completed even after the
> > 'sync'. You need to check the datasheet for your IOMMU to find out
> > whether the 'dsb' instruction actually has any effect on the IOMMU.
> > If not, then neither the barrier that you add here nor the barrier
> > in the following writel() is sufficient.
> >
> Thanks for the detailed explanation.
> i will check this. So with this, i think that if the iommu does not
> support polling for its status, then it should listen to 'dsb' otherwise
> there will no be no way of make it coherent ?
It's more likely that you have to do a readl() after the writel() to
any address inside of the device in order to ensure the writel()
has completed, but that is something that the hardware manual for
the iommu should describe.
> >* The one barrier that is normally required in an IOMMU is between
> > updating the in-memory page tables and the following MMIO store
> > that triggers the TLB flush for that entry. This barrier is
> > implied by writel() but not writel_relaxed(). If you don't have
> > a hardware page table walker in your IOMMU, you don't need to worry
> > about this.
> >
> To get my understanding correct here, is the barrier required here because
> of speculative fetch ?
It doesn't have to be a speculative fetch, it could also be a normal
fetch triggered by a DMA. The point is that the CPU is free to reorder
the store to (io page table) memory relative to the store to the
IOMMU flush register, and when you directly follow up with a store to
the DMA master that triggers a transfer, this transfer can hit the
IOMMU and cause the stale page table entry to be read.
I guess in practice the barrier that comes before the MMIO write to the
DMA master would ensure ordering of the DMA after both the IOPTE
update and the MMIO flush (unless there was the speculative fetch you
mentioned), but this is where I'm no longer confident enough in my
own understanding of the ordering rules to rely on that. Having the
barrier between the IOPTE update and the flush certainly leaves
no room for ambiguity.
Arnd
* RE: [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier
2016-05-25 14:15 ` Arnd Bergmann
@ 2016-05-25 16:49 ` Sricharan
0 siblings, 0 replies; 20+ messages in thread
From: Sricharan @ 2016-05-25 16:49 UTC (permalink / raw)
To: 'Arnd Bergmann'
Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, architt-sgV2jX0FEOL9JmXXK+q4OQ,
linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A,
laurent.pinchart-ryLnwIuWjnjg/C1BVhZhaw,
treding-DDmLM1+adcrQT0dZR+AlfA,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
stepanm-sgV2jX0FEOL9JmXXK+q4OQ
Hi Arnd,
>> >* On other platforms, polling the register is likely required because
>> > an MMIO write is "posted", meaning that a sync after writel() will
>> > only ensure that it has left the CPU write queue, but it may still be
>> > on the bus fabric and whatever side-effects are triggered by the
>> > write are normally not guaranteed to be completed even after the
>> > 'sync'. You need to check the datasheet for your IOMMU to find out
>> > whether the 'dsb' instruction actually has any effect on the IOMMU.
>> > If not, then neither the barrier that you add here nor the barrier
>> > in the following writel() is sufficient.
>> >
>> Thanks for the detailed explanation.
>> i will check this. So with this, i think that if the iommu does not
>> support polling for its status, then it should listen to 'dsb' otherwise
>> there will no be no way of make it coherent ?
>
>It's more likely that you have to do a readl() after the writel() to
>any address inside of the device in order to ensure the writel()
>has completed, but that is something that the hardware manual for
>the iommu should describe.
>
Ok, got it. I will do it as required here.
>> >* The one barrier that is normally required in an IOMMU is between
>> > updating the in-memory page tables and the following MMIO store
>> > that triggers the TLB flush for that entry. This barrier is
>> > implied by writel() but not writel_relaxed(). If you don't have
>> > a hardware page table walker in your IOMMU, you don't need to worry
>> > about this.
>> >
>> To get my understanding correct here, is the barrier required here because
>> of speculative fetch ?
>
>It doesn't have to be a speculative fetch, it could also be a normal
>fetch triggered by a DMA. The point is that the CPU is free to reorder
>the store to (io page table) memory relative to the store to the
>IOMMU flush register, and when you directly follow up with a store to
>the DMA master that triggers a transfer, this transfer can hit the
>IOMMU and cause the stale page table entry to be read.
>
>I guess in practice the barrier that comes before the MMIO write to the
>DMA master would ensure ordering of the DMA after both the IOPTE
>update and the MMIO flush (unless there was the speculative fetch you
>mentioned), but this is where I'm no longer confident enough in my
>pwn understanding of th ordering rules to rely on that. Having the
>barrier between the IOPTE update and the flush certainly leaves
>no room for ambiguity.
Ok, understood. Also, I think any speculative access should
go away if clients unmap first and then remap.
Regards,
Sricharan
end of thread, other threads:[~2016-05-25 16:49 UTC | newest]
Thread overview: 20+ messages
-- links below jump to the message on this page --
2016-05-20 10:54 [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Sricharan R
2016-05-20 10:54 ` [PATCH V5 1/7] iommu/msm: Add DT adaptation Sricharan R
2016-05-20 10:54 ` [PATCH V5 2/7] documentation: iommu: Add bindings for msm,iommu-v0 ip Sricharan R
[not found] ` <1463741694-1735-3-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-05-23 21:23 ` Rob Herring
2016-05-20 10:54 ` [PATCH V5 4/7] iommu/msm: Add support for generic master bindings Sricharan R
2016-05-20 10:54 ` [PATCH V5 5/7] iommu/msm: use generic ARMV7S short descriptor pagetable ops Sricharan R
2016-05-20 10:54 ` [PATCH V5 6/7] iommu/msm: Use writel_relaxed and add a barrier Sricharan R
[not found] ` <1463741694-1735-7-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-05-20 11:44 ` Arnd Bergmann
2016-05-20 12:20 ` Arnd Bergmann
2016-05-23 6:05 ` Sricharan
2016-05-24 14:00 ` Arnd Bergmann
2016-05-25 10:45 ` Sricharan
2016-05-25 12:18 ` Arnd Bergmann
2016-05-25 13:19 ` Sricharan
2016-05-25 14:15 ` Arnd Bergmann
2016-05-25 16:49 ` Sricharan
2016-05-20 10:54 ` [PATCH V5 7/7] iommu/msm: Remove driver BROKEN Sricharan R
2016-05-23 2:53 ` [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Archit Taneja
[not found] ` <1463741694-1735-1-git-send-email-sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-05-20 10:54 ` [PATCH V5 3/7] iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.c Sricharan R
2016-05-23 8:10 ` [PATCH V5 0/7] iommu/msm: Add DT adaptation and generic bindings support Srinivas Kandagatla