linux-tegra.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/22] Tegra DRM fixes
@ 2017-06-13 23:15 Dmitry Osipenko
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The major changes of the V2 are:

  - Dropped "drm/tegra: Check whether page belongs to BO in tegra_bo_kmap()"
    patch as it is not needed with the checks performed by 'submit' IOCTL.

  - Dropped "drm/tegra: Remove module ownership from the tegra_fb_ops",
    turned out it is not needed. Thanks to Thierry for the clarification.

  - Added new patch "gpu: host1x: At first try a non-blocking allocation
    for the gather copy", which is a trivial fix for the Host1x firewall
    performance.

  - Reworked "drm/tegra: dc: Drop the reset asserts to workaround a bug"
    patch as per Thierry's suggestion to avoid reset only on Tegra20.

  - Fixed "drm/tegra: Don't use IOMMU on Tegra20" patch compilation in case
    of modular Tegra DRM. Thanks to Mikko and Nicolas for the suggestions.

  - The "Forbid relocation address shifting in the firewall" patch has been
    reworked to not break newer Tegra's.

Thanks to Erik, Mikko, Nicolas and Thierry for the reviews and suggestions.

Dmitry Osipenko (21):
  drm/tegra: Fix lockup on a use of staging API
  drm/tegra: Correct idr_alloc() minimum id
  drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL
  drm/tegra: Correct copying of waitchecks and disable them in the
    'submit' IOCTL
  drm/tegra: Check syncpoint ID in the 'submit' IOCTL
  drm/tegra: dc: Avoid reset asserts on Tegra20
  drm/tegra: dc: Apply clipping to the plane
  drm/tegra: dc: Disable plane if it is invisible
  drm/tegra: Don't use IOMMU on Tegra20
  Revert "iommu/tegra: gart: Do not register with bus"
  gpu: host1x: Initialize firewall class to the jobs one
  gpu: host1x: Correct host1x_job_pin() error handling
  gpu: host1x: Do not leak BO's phys address to userspace
  gpu: host1x: Forbid relocation address shifting in the firewall
  gpu: host1x: Forbid RESTART opcode in the firewall
  gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall
  gpu: host1x: Correct swapped arguments in the is_addr_reg() definition
  gpu: host1x: Check waits in the firewall
  gpu: host1x: Remove unused 'struct host1x_cmdbuf'
  gpu: host1x: Remove unused host1x_cdma_stop() definition
  gpu: host1x: At first try a non-blocking allocation for the gather
    copy

Mikko Perttunen (1):
  gpu: host1x: Refactor channel allocation code

 drivers/gpu/drm/tegra/dc.c         |  92 +++++++++++++++--------
 drivers/gpu/drm/tegra/drm.c        | 144 ++++++++++++++++++++++++++++++------
 drivers/gpu/drm/tegra/drm.h        |   1 +
 drivers/gpu/drm/tegra/gem.c        |   5 --
 drivers/gpu/drm/tegra/gem.h        |   5 ++
 drivers/gpu/drm/tegra/gr2d.c       |  11 ++-
 drivers/gpu/drm/tegra/gr3d.c       |   4 +-
 drivers/gpu/drm/tegra/vic.c        |   4 +-
 drivers/gpu/host1x/cdma.h          |   1 -
 drivers/gpu/host1x/channel.c       | 147 +++++++++++++++++++++++--------------
 drivers/gpu/host1x/channel.h       |  21 ++++--
 drivers/gpu/host1x/debug.c         |  47 +++++-------
 drivers/gpu/host1x/dev.c           |  14 +++-
 drivers/gpu/host1x/dev.h           |   7 +-
 drivers/gpu/host1x/hw/channel_hw.c |   4 -
 drivers/gpu/host1x/job.c           | 144 +++++++++++++++++++++++++++++-------
 drivers/gpu/host1x/job.h           |  14 ----
 drivers/iommu/tegra-gart.c         |   2 +-
 include/linux/host1x.h             |  13 +++-
 19 files changed, 470 insertions(+), 210 deletions(-)

-- 
2.13.0

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v2 01/22] drm/tegra: Fix lockup on a use of staging API
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-13 23:15   ` Dmitry Osipenko
       [not found]     ` <d2616b82a1728d0953e84489cde40364999719f0.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 02/22] drm/tegra: Correct idr_alloc() minimum id Dmitry Osipenko
                     ` (20 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

Commit bdd2f9cd ("Don't leak kernel pointer to userspace") added a mutex
around staging IOCTL's, some of those mutexes are taken twice.

Fixes: bdd2f9cd10eb ("drm/tegra: Don't leak kernel pointer to userspace")
Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/tegra/drm.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 51c48a8e00ec..b49d32a89f40 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -451,18 +451,6 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 
 
 #ifdef CONFIG_DRM_TEGRA_STAGING
-static struct tegra_drm_context *
-tegra_drm_file_get_context(struct tegra_drm_file *file, u32 id)
-{
-	struct tegra_drm_context *context;
-
-	mutex_lock(&file->lock);
-	context = idr_find(&file->contexts, id);
-	mutex_unlock(&file->lock);
-
-	return context;
-}
-
 static int tegra_gem_create(struct drm_device *drm, void *data,
 			    struct drm_file *file)
 {
@@ -606,7 +594,7 @@ static int tegra_close_channel(struct drm_device *drm, void *data,
 
 	mutex_lock(&fpriv->lock);
 
-	context = tegra_drm_file_get_context(fpriv, args->context);
+	context = idr_find(&fpriv->contexts, args->context);
 	if (!context) {
 		err = -EINVAL;
 		goto unlock;
@@ -631,7 +619,7 @@ static int tegra_get_syncpt(struct drm_device *drm, void *data,
 
 	mutex_lock(&fpriv->lock);
 
-	context = tegra_drm_file_get_context(fpriv, args->context);
+	context = idr_find(&fpriv->contexts, args->context);
 	if (!context) {
 		err = -ENODEV;
 		goto unlock;
@@ -660,7 +648,7 @@ static int tegra_submit(struct drm_device *drm, void *data,
 
 	mutex_lock(&fpriv->lock);
 
-	context = tegra_drm_file_get_context(fpriv, args->context);
+	context = idr_find(&fpriv->contexts, args->context);
 	if (!context) {
 		err = -ENODEV;
 		goto unlock;
@@ -685,7 +673,7 @@ static int tegra_get_syncpt_base(struct drm_device *drm, void *data,
 
 	mutex_lock(&fpriv->lock);
 
-	context = tegra_drm_file_get_context(fpriv, args->context);
+	context = idr_find(&fpriv->contexts, args->context);
 	if (!context) {
 		err = -ENODEV;
 		goto unlock;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 02/22] drm/tegra: Correct idr_alloc() minimum id
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 01/22] drm/tegra: Fix lockup on a use of staging API Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 03/22] drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL Dmitry Osipenko
                     ` (19 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The client ID 0 is reserved by the host1x/cdma to mark the timeout timer
work as already been scheduled and context ID is used as the clients one.
This fixes spurious CDMA timeouts.

Fixes: bdd2f9cd10eb ("drm/tegra: Don't leak kernel pointer to userspace")
Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/tegra/drm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index b49d32a89f40..f9282da94a1f 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -539,7 +539,7 @@ static int tegra_client_open(struct tegra_drm_file *fpriv,
 	if (err < 0)
 		return err;
 
-	err = idr_alloc(&fpriv->contexts, context, 0, 0, GFP_KERNEL);
+	err = idr_alloc(&fpriv->contexts, context, 1, 0, GFP_KERNEL);
 	if (err < 0) {
 		client->ops->close_channel(context);
 		return err;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 03/22] drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 01/22] drm/tegra: Fix lockup on a use of staging API Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 02/22] drm/tegra: Correct idr_alloc() minimum id Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 04/22] drm/tegra: Correct copying of waitchecks and disable them " Dmitry Osipenko
                     ` (18 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

If commands buffer claims a number of words that is higher than its BO can
fit, a kernel OOPS will be fired on the out-of-bounds BO access. This was
triggered by an opentegra Xorg driver that erroneously pushed too many
commands to the pushbuf.

The CDMA commands buffer address is 4 bytes aligned, so check its alignment.

The maximum number of the CDMA gather fetches is 16383, add a check for it.

Add a sanity check for the relocations in a same way.

[   46.829393] Unable to handle kernel paging request at virtual address f09b2000
...
[<c04a3ba4>] (host1x_job_pin) from [<c04dfcd0>] (tegra_drm_submit+0x474/0x510)
[<c04dfcd0>] (tegra_drm_submit) from [<c04deea0>] (tegra_submit+0x50/0x6c)
[<c04deea0>] (tegra_submit) from [<c04c07c0>] (drm_ioctl+0x1e4/0x3ec)
[<c04c07c0>] (drm_ioctl) from [<c02541a0>] (do_vfs_ioctl+0x9c/0x8e4)
[<c02541a0>] (do_vfs_ioctl) from [<c0254a1c>] (SyS_ioctl+0x34/0x5c)
[<c0254a1c>] (SyS_ioctl) from [<c0107640>] (ret_fast_syscall+0x0/0x3c)

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/tegra/drm.c | 49 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/tegra/gem.c |  5 -----
 drivers/gpu/drm/tegra/gem.h |  5 +++++
 3 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index f9282da94a1f..c9246405fc7b 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -26,6 +26,7 @@
 #define DRIVER_PATCHLEVEL 0
 
 #define CARVEOUT_SZ SZ_64M
+#define CDMA_GATHER_FETCHES_MAX_NB 16383
 
 struct tegra_drm_file {
 	struct idr contexts;
@@ -383,18 +384,42 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	while (num_cmdbufs) {
 		struct drm_tegra_cmdbuf cmdbuf;
 		struct host1x_bo *bo;
+		struct tegra_bo *obj;
+		u64 offset;
 
 		if (copy_from_user(&cmdbuf, cmdbufs, sizeof(cmdbuf))) {
 			err = -EFAULT;
 			goto fail;
 		}
 
+		/*
+		 * The maximum number of CDMA gather fetches is 16383, a higher
+		 * value means the words count is malformed.
+		 */
+		if (cmdbuf.words > CDMA_GATHER_FETCHES_MAX_NB) {
+			err = -EINVAL;
+			goto fail;
+		}
+
 		bo = host1x_bo_lookup(file, cmdbuf.handle);
 		if (!bo) {
 			err = -ENOENT;
 			goto fail;
 		}
 
+		offset = (u64)cmdbuf.offset + (u64)cmdbuf.words * sizeof(u32);
+		obj = host1x_to_tegra_bo(bo);
+
+		/*
+		 * Gather buffer base address must be 4-bytes aligned,
+		 * unaligned offset is malformed and cause commands stream
+		 * corruption on the buffer address relocation.
+		 */
+		if (offset & 3 || offset >= obj->gem.size) {
+			err = -EINVAL;
+			goto fail;
+		}
+
 		host1x_job_add_gather(job, bo, cmdbuf.words, cmdbuf.offset);
 		num_cmdbufs--;
 		cmdbufs++;
@@ -402,11 +427,35 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 
 	/* copy and resolve relocations from submit */
 	while (num_relocs--) {
+		struct host1x_reloc *reloc;
+		struct tegra_bo *obj;
+
 		err = host1x_reloc_copy_from_user(&job->relocarray[num_relocs],
 						  &relocs[num_relocs], drm,
 						  file);
 		if (err < 0)
 			goto fail;
+
+		reloc = &job->relocarray[num_relocs];
+		obj = host1x_to_tegra_bo(reloc->cmdbuf.bo);
+
+		/*
+		 * The unaligned cmdbuf offset will cause an unaligned write
+		 * during of the relocations patching, corrupting the commands
+		 * stream.
+		 */
+		if (reloc->cmdbuf.offset & 3 ||
+		    reloc->cmdbuf.offset >= obj->gem.size) {
+			err = -EINVAL;
+			goto fail;
+		}
+
+		obj = host1x_to_tegra_bo(reloc->target.bo);
+
+		if (reloc->target.offset >= obj->gem.size) {
+			err = -EINVAL;
+			goto fail;
+		}
 	}
 
 	if (copy_from_user(job->waitchk, waitchks,
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 424569b53e57..7a39a355678a 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -20,11 +20,6 @@
 #include "drm.h"
 #include "gem.h"
 
-static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
-{
-	return container_of(bo, struct tegra_bo, base);
-}
-
 static void tegra_bo_put(struct host1x_bo *bo)
 {
 	struct tegra_bo *obj = host1x_to_tegra_bo(bo);
diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
index 6c5f12ac0087..8b32a6fd586d 100644
--- a/drivers/gpu/drm/tegra/gem.h
+++ b/drivers/gpu/drm/tegra/gem.h
@@ -52,6 +52,11 @@ static inline struct tegra_bo *to_tegra_bo(struct drm_gem_object *gem)
 	return container_of(gem, struct tegra_bo, gem);
 }
 
+static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
+{
+	return container_of(bo, struct tegra_bo, base);
+}
+
 struct tegra_bo *tegra_bo_create(struct drm_device *drm, size_t size,
 				 unsigned long flags);
 struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file,
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 04/22] drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 03/22] drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 05/22] drm/tegra: Check syncpoint ID " Dmitry Osipenko
                     ` (17 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The waitchecks along with multiple syncpoints per submit are not ready for
use yet, let's forbid them for now.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/tegra/drm.c | 60 ++++++++++++++++++++++++++++++++++++++++++---
 drivers/gpu/host1x/job.h    |  7 ------
 include/linux/host1x.h      |  7 ++++++
 3 files changed, 63 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index c9246405fc7b..b90ed2cd32ce 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -349,6 +349,36 @@ static int host1x_reloc_copy_from_user(struct host1x_reloc *dest,
 	return 0;
 }
 
+static int host1x_waitchk_copy_from_user(struct host1x_waitchk *dest,
+					 struct drm_tegra_waitchk __user *src,
+					 struct drm_file *file)
+{
+	u32 cmdbuf;
+	int err;
+
+	err = get_user(cmdbuf, &src->handle);
+	if (err < 0)
+		return err;
+
+	err = get_user(dest->offset, &src->offset);
+	if (err < 0)
+		return err;
+
+	err = get_user(dest->syncpt_id, &src->syncpt);
+	if (err < 0)
+		return err;
+
+	err = get_user(dest->thresh, &src->thresh);
+	if (err < 0)
+		return err;
+
+	dest->bo = host1x_bo_lookup(file, cmdbuf);
+	if (!dest->bo)
+		return -ENOENT;
+
+	return 0;
+}
+
 int tegra_drm_submit(struct tegra_drm_context *context,
 		     struct drm_tegra_submit *args, struct drm_device *drm,
 		     struct drm_file *file)
@@ -370,6 +400,10 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	if (args->num_syncpts != 1)
 		return -EINVAL;
 
+	/* We don't yet support waitchks */
+	if (args->num_waitchks != 0)
+		return -EINVAL;
+
 	job = host1x_job_alloc(context->channel, args->num_cmdbufs,
 			       args->num_relocs, args->num_waitchks);
 	if (!job)
@@ -458,10 +492,28 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 		}
 	}
 
-	if (copy_from_user(job->waitchk, waitchks,
-			   sizeof(*waitchks) * num_waitchks)) {
-		err = -EFAULT;
-		goto fail;
+	/* copy and resolve waitchks from submit */
+	while (num_waitchks--) {
+		struct host1x_waitchk *wait = &job->waitchk[num_waitchks];
+		struct tegra_bo *obj;
+
+		err = host1x_waitchk_copy_from_user(wait,
+						    &waitchks[num_waitchks],
+						    file);
+		if (err < 0)
+			goto fail;
+
+		obj = host1x_to_tegra_bo(wait->bo);
+
+		/*
+		 * The unaligned offset will cause an unaligned write during
+		 * of the waitchks patching, corrupting the commands stream.
+		 */
+		if (wait->offset & 3 ||
+		    wait->offset >= obj->gem.size) {
+			err = -EINVAL;
+			goto fail;
+		}
 	}
 
 	if (copy_from_user(&syncpt, (void __user *)(uintptr_t)args->syncpts,
diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h
index 878239c476d2..0debd93a1849 100644
--- a/drivers/gpu/host1x/job.h
+++ b/drivers/gpu/host1x/job.h
@@ -34,13 +34,6 @@ struct host1x_cmdbuf {
 	u32 pad;
 };
 
-struct host1x_waitchk {
-	struct host1x_bo *bo;
-	u32 offset;
-	u32 syncpt_id;
-	u32 thresh;
-};
-
 struct host1x_job_unpin_data {
 	struct host1x_bo *bo;
 	struct sg_table *sgt;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 3d04aa1dc83e..aa323e43ae4e 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -177,6 +177,13 @@ struct host1x_reloc {
 	unsigned long shift;
 };
 
+struct host1x_waitchk {
+	struct host1x_bo *bo;
+	u32 offset;
+	u32 syncpt_id;
+	u32 thresh;
+};
+
 struct host1x_job {
 	/* When refcount goes to zero, job can be freed */
 	struct kref ref;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 05/22] drm/tegra: Check syncpoint ID in the 'submit' IOCTL
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 04/22] drm/tegra: Correct copying of waitchecks and disable them " Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 06/22] drm/tegra: dc: Avoid reset asserts on Tegra20 Dmitry Osipenko
                     ` (16 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

In case of invalid syncpoint ID, the host1x_syncpt_get() returns NULL and
none of its users perform a check of the returned pointer later. Let's bail
out until it's too late.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/tegra/drm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index b90ed2cd32ce..e999391aedc9 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -393,6 +393,8 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	struct drm_tegra_waitchk __user *waitchks =
 		(void __user *)(uintptr_t)args->waitchks;
 	struct drm_tegra_syncpt syncpt;
+	struct host1x *host1x = dev_get_drvdata(drm->dev->parent);
+	struct host1x_syncpt *sp;
 	struct host1x_job *job;
 	int err;
 
@@ -522,6 +524,13 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 		goto fail;
 	}
 
+	/* check whether syncpoint ID is valid */
+	sp = host1x_syncpt_get(host1x, syncpt.id);
+	if (!sp) {
+		err = -ENOENT;
+		goto fail;
+	}
+
 	job->is_addr_reg = context->client->ops->is_addr_reg;
 	job->syncpt_incrs = syncpt.incrs;
 	job->syncpt_id = syncpt.id;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 06/22] drm/tegra: dc: Avoid reset asserts on Tegra20
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 05/22] drm/tegra: Check syncpoint ID " Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
       [not found]     ` <1c7b0c7bf713401073d648f9f228fa152c094a5a.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 07/22] drm/tegra: dc: Apply clipping to the plane Dmitry Osipenko
                     ` (15 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

Commit 33a8eb8 ("Implement runtime PM") introduced HW reset control. It
causes a hang on Tegra20 if both display controllers are utilized (RGB
panel and HDMI). The TRM suggests that each display controller has its own
reset control, apparently it is not correct.

Fixes: 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM")
Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/gpu/drm/tegra/dc.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 95b373f739f2..98ee6abb056c 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -30,6 +30,7 @@ struct tegra_dc_soc_info {
 	bool supports_block_linear;
 	unsigned int pitch_align;
 	bool has_powergate;
+	bool broken_reset;
 };
 
 struct tegra_plane {
@@ -1856,6 +1857,7 @@ static const struct tegra_dc_soc_info tegra20_dc_soc_info = {
 	.supports_block_linear = false,
 	.pitch_align = 8,
 	.has_powergate = false,
+	.broken_reset = true,
 };
 
 static const struct tegra_dc_soc_info tegra30_dc_soc_info = {
@@ -1865,6 +1867,7 @@ static const struct tegra_dc_soc_info tegra30_dc_soc_info = {
 	.supports_block_linear = false,
 	.pitch_align = 8,
 	.has_powergate = false,
+	.broken_reset = false,
 };
 
 static const struct tegra_dc_soc_info tegra114_dc_soc_info = {
@@ -1874,6 +1877,7 @@ static const struct tegra_dc_soc_info tegra114_dc_soc_info = {
 	.supports_block_linear = false,
 	.pitch_align = 64,
 	.has_powergate = true,
+	.broken_reset = false,
 };
 
 static const struct tegra_dc_soc_info tegra124_dc_soc_info = {
@@ -1883,6 +1887,7 @@ static const struct tegra_dc_soc_info tegra124_dc_soc_info = {
 	.supports_block_linear = true,
 	.pitch_align = 64,
 	.has_powergate = true,
+	.broken_reset = false,
 };
 
 static const struct tegra_dc_soc_info tegra210_dc_soc_info = {
@@ -1892,6 +1897,7 @@ static const struct tegra_dc_soc_info tegra210_dc_soc_info = {
 	.supports_block_linear = true,
 	.pitch_align = 64,
 	.has_powergate = true,
+	.broken_reset = false,
 };
 
 static const struct of_device_id tegra_dc_of_match[] = {
@@ -1989,7 +1995,8 @@ static int tegra_dc_probe(struct platform_device *pdev)
 		return PTR_ERR(dc->rst);
 	}
 
-	reset_control_assert(dc->rst);
+	if (!dc->soc->broken_reset)
+		reset_control_assert(dc->rst);
 
 	if (dc->soc->has_powergate) {
 		if (dc->pipe == 0)
@@ -2063,10 +2070,12 @@ static int tegra_dc_suspend(struct device *dev)
 	struct tegra_dc *dc = dev_get_drvdata(dev);
 	int err;
 
-	err = reset_control_assert(dc->rst);
-	if (err < 0) {
-		dev_err(dev, "failed to assert reset: %d\n", err);
-		return err;
+	if (!dc->soc->broken_reset) {
+		err = reset_control_assert(dc->rst);
+		if (err < 0) {
+			dev_err(dev, "failed to assert reset: %d\n", err);
+			return err;
+		}
 	}
 
 	if (dc->soc->has_powergate)
@@ -2096,10 +2105,13 @@ static int tegra_dc_resume(struct device *dev)
 			return err;
 		}
 
-		err = reset_control_deassert(dc->rst);
-		if (err < 0) {
-			dev_err(dev, "failed to deassert reset: %d\n", err);
-			return err;
+		if (!dc->soc->broken_reset) {
+			err = reset_control_deassert(dc->rst);
+			if (err < 0) {
+				dev_err(dev,
+					"failed to deassert reset: %d\n", err);
+				return err;
+			}
 		}
 	}
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 07/22] drm/tegra: dc: Apply clipping to the plane
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 06/22] drm/tegra: dc: Avoid reset asserts on Tegra20 Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
       [not found]     ` <7adbe0c6e67115fa4fd201026533b7cb97fc44d2.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 08/22] drm/tegra: dc: Disable plane if it is invisible Dmitry Osipenko
                     ` (14 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

On Tegra20 an overlay plane should be clipped, otherwise its output is
distorted once plane crosses display boundary.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/gpu/drm/tegra/dc.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 98ee6abb056c..a7a7cce1afd0 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -486,12 +486,25 @@ static int tegra_plane_state_add(struct tegra_plane *plane,
 {
 	struct drm_crtc_state *crtc_state;
 	struct tegra_dc_state *tegra;
+	struct drm_rect clip;
+	int err;
 
 	/* Propagate errors from allocation or locking failures. */
 	crtc_state = drm_atomic_get_crtc_state(state->state, state->crtc);
 	if (IS_ERR(crtc_state))
 		return PTR_ERR(crtc_state);
 
+	clip.x1 = 0;
+	clip.y1 = 0;
+	clip.x2 = crtc_state->mode.hdisplay;
+	clip.y2 = crtc_state->mode.vdisplay;
+
+	/* Check plane state for visibility and calculate clipping bounds */
+	err = drm_plane_helper_check_state(state, &clip, 0, INT_MAX,
+					   true, true);
+	if (err < 0)
+		return err;
+
 	tegra = to_dc_state(crtc_state);
 
 	tegra->planes |= WIN_A_ACT_REQ << plane->index;
@@ -561,14 +574,14 @@ static void tegra_plane_atomic_update(struct drm_plane *plane,
 		return;
 
 	memset(&window, 0, sizeof(window));
-	window.src.x = plane->state->src_x >> 16;
-	window.src.y = plane->state->src_y >> 16;
-	window.src.w = plane->state->src_w >> 16;
-	window.src.h = plane->state->src_h >> 16;
-	window.dst.x = plane->state->crtc_x;
-	window.dst.y = plane->state->crtc_y;
-	window.dst.w = plane->state->crtc_w;
-	window.dst.h = plane->state->crtc_h;
+	window.src.x = plane->state->src.x1 >> 16;
+	window.src.y = plane->state->src.y1 >> 16;
+	window.src.w = drm_rect_width(&plane->state->src) >> 16;
+	window.src.h = drm_rect_height(&plane->state->src) >> 16;
+	window.dst.x = plane->state->dst.x1;
+	window.dst.y = plane->state->dst.y1;
+	window.dst.w = drm_rect_width(&plane->state->dst);
+	window.dst.h = drm_rect_height(&plane->state->dst);
 	window.bits_per_pixel = fb->format->cpp[0] * 8;
 	window.bottom_up = tegra_fb_is_bottom_up(fb);
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 08/22] drm/tegra: dc: Disable plane if it is invisible
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 07/22] drm/tegra: dc: Apply clipping to the plane Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-14  7:21     ` Erik Faye-Lund
  2017-06-13 23:15   ` [PATCH v2 09/22] drm/tegra: Don't use IOMMU on Tegra20 Dmitry Osipenko
                     ` (13 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

On Tegra20 if plane has width or height equal to 0, it will be infinitely
wide or tall. Let's disable the plane if it is invisible on atomic state
committing to fix the issue. The Rockchip DRM driver does the same.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/gpu/drm/tegra/dc.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index a7a7cce1afd0..c875f11786b9 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -559,6 +559,23 @@ static int tegra_plane_atomic_check(struct drm_plane *plane,
 	return 0;
 }
 
+static void tegra_dc_disable_window(struct tegra_dc *dc, int index)
+{
+	unsigned long flags;
+	u32 value;
+
+	spin_lock_irqsave(&dc->lock, flags);
+
+	value = WINDOW_A_SELECT << index;
+	tegra_dc_writel(dc, value, DC_CMD_DISPLAY_WINDOW_HEADER);
+
+	value = tegra_dc_readl(dc, DC_WIN_WIN_OPTIONS);
+	value &= ~WIN_ENABLE;
+	tegra_dc_writel(dc, value, DC_WIN_WIN_OPTIONS);
+
+	spin_unlock_irqrestore(&dc->lock, flags);
+}
+
 static void tegra_plane_atomic_update(struct drm_plane *plane,
 				      struct drm_plane_state *old_state)
 {
@@ -573,6 +590,9 @@ static void tegra_plane_atomic_update(struct drm_plane *plane,
 	if (!plane->state->crtc || !plane->state->fb)
 		return;
 
+	if (!plane->state->visible)
+		return tegra_dc_disable_window(dc, p->index);
+
 	memset(&window, 0, sizeof(window));
 	window.src.x = plane->state->src.x1 >> 16;
 	window.src.y = plane->state->src.y1 >> 16;
@@ -612,8 +632,6 @@ static void tegra_plane_atomic_disable(struct drm_plane *plane,
 {
 	struct tegra_plane *p = to_tegra_plane(plane);
 	struct tegra_dc *dc;
-	unsigned long flags;
-	u32 value;
 
 	/* rien ne va plus */
 	if (!old_state || !old_state->crtc)
@@ -621,16 +639,7 @@ static void tegra_plane_atomic_disable(struct drm_plane *plane,
 
 	dc = to_tegra_dc(old_state->crtc);
 
-	spin_lock_irqsave(&dc->lock, flags);
-
-	value = WINDOW_A_SELECT << p->index;
-	tegra_dc_writel(dc, value, DC_CMD_DISPLAY_WINDOW_HEADER);
-
-	value = tegra_dc_readl(dc, DC_WIN_WIN_OPTIONS);
-	value &= ~WIN_ENABLE;
-	tegra_dc_writel(dc, value, DC_WIN_WIN_OPTIONS);
-
-	spin_unlock_irqrestore(&dc->lock, flags);
+	tegra_dc_disable_window(dc, p->index);
 }
 
 static const struct drm_plane_helper_funcs tegra_primary_plane_helper_funcs = {
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 09/22] drm/tegra: Don't use IOMMU on Tegra20
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 08/22] drm/tegra: dc: Disable plane if it is invisible Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
       [not found]     ` <d707bd34e268f9d8fb29bf510a9f17e5f943a635.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 10/22] Revert "iommu/tegra: gart: Do not register with bus" Dmitry Osipenko
                     ` (12 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU
provider.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Acked-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/gpu/drm/tegra/drm.c | 3 ++-
 drivers/gpu/host1x/dev.c    | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index e999391aedc9..aa7988dcc28f 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -131,7 +131,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 	if (!tegra)
 		return -ENOMEM;
 
-	if (iommu_present(&platform_bus_type)) {
+	if (iommu_present(&platform_bus_type) &&
+	    !of_machine_is_compatible("nvidia,tegra20")) {
 		u64 carveout_start, carveout_end, gem_start, gem_end;
 		struct iommu_domain_geometry *geometry;
 		unsigned long order;
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index f05ebb14fa63..6a805ed908c3 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -177,7 +177,8 @@ static int host1x_probe(struct platform_device *pdev)
 		return err;
 	}
 
-	if (iommu_present(&platform_bus_type)) {
+	if (iommu_present(&platform_bus_type) &&
+	    !of_machine_is_compatible("nvidia,tegra20")) {
 		struct iommu_domain_geometry *geometry;
 		unsigned long order;
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 10/22] Revert "iommu/tegra: gart: Do not register with bus"
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 09/22] drm/tegra: Don't use IOMMU on Tegra20 Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 11/22] gpu: host1x: Initialize firewall class to the jobs one Dmitry Osipenko
                     ` (11 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The GART driver was disabled because it was picked by as an IOMMU provider
for the DRM driver on Tegra20, which is not the purpose of the GART. Now
DRM driver avoids to use IOMMU on Tegra20, so the GART driver can be
re-enabled. Potentially there are interesting use cases of the GART for
Tegra20, like mmap'ing of a fallback memory allocation and using its GART
aperture for an accelerated graphics operation.

This reverts commit c7e3ca515e784a82223f447b79eb2090bdb91378.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Acked-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
 drivers/iommu/tegra-gart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 37e708fdbb5a..430d13bf4f95 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -422,7 +422,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
 	do_gart_setup(gart, NULL);
 
 	gart_handle = gart;
-
+	bus_set_iommu(&platform_bus_type, &gart_iommu_ops);
 	return 0;
 }
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 11/22] gpu: host1x: Initialize firewall class to the jobs one
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (9 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 10/22] Revert "iommu/tegra: gart: Do not register with bus" Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
       [not found]     ` <8c0a2417584ce51b716f2c286027bcb18fceda56.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 12/22] gpu: host1x: Correct host1x_job_pin() error handling Dmitry Osipenko
                     ` (10 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The commands stream is prepended by the jobs class on the CDMA submission,
so that explicitly setting a module class in the commands stream isn't
necessary. The firewall initializes its class to 0 and the command stream
that doesn't explicitly specify the class effectively bypasses the firewall.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/host1x/job.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 5f5f8ee6143d..d9933828fe87 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -504,7 +504,7 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev)
 	fw.dev = dev;
 	fw.reloc = job->relocarray;
 	fw.num_relocs = job->num_relocs;
-	fw.class = 0;
+	fw.class = job->class;
 
 	for (i = 0; i < job->num_gathers; i++) {
 		struct host1x_job_gather *g = &job->gathers[i];
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 12/22] gpu: host1x: Correct host1x_job_pin() error handling
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (10 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 11/22] gpu: host1x: Initialize firewall class to the jobs one Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 13/22] gpu: host1x: Do not leak BO's phys address to userspace Dmitry Osipenko
                     ` (9 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

In case of relocations / waitchecks patching failure the jobs pins stay
referenced till DRM file get closed, wasting memory. Add the missed
unpinning.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/host1x/job.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index d9933828fe87..14f3f957ffab 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -592,22 +592,20 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)
 
 		err = do_relocs(job, g->bo);
 		if (err)
-			break;
+			goto out;
 
 		err = do_waitchks(job, host, g->bo);
 		if (err)
-			break;
+			goto out;
 	}
 
-	if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && !err) {
-		err = copy_gathers(job, dev);
-		if (err) {
-			host1x_job_unpin(job);
-			return err;
-		}
-	}
+	if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
+		goto out;
 
+	err = copy_gathers(job, dev);
 out:
+	if (err)
+		host1x_job_unpin(job);
 	wmb();
 
 	return err;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 13/22] gpu: host1x: Do not leak BO's phys address to userspace
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (11 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 12/22] gpu: host1x: Correct host1x_job_pin() error handling Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall Dmitry Osipenko
                     ` (8 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

Perform gathers coping before patching them, so that original gathers are
left untouched. That's not as bad as leaking kernel addresses, but still
doesn't feel right.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/host1x/job.c | 39 +++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 14f3f957ffab..4208329ca2af 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -137,8 +137,9 @@ static void host1x_syncpt_patch_offset(struct host1x_syncpt *sp,
  * avoid a wrap condition in the HW).
  */
 static int do_waitchks(struct host1x_job *job, struct host1x *host,
-		       struct host1x_bo *patch)
+		       struct host1x_job_gather *g)
 {
+	struct host1x_bo *patch = g->bo;
 	int i;
 
 	/* compare syncpt vs wait threshold */
@@ -165,7 +166,8 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host,
 				wait->syncpt_id, sp->name, wait->thresh,
 				host1x_syncpt_read_min(sp));
 
-			host1x_syncpt_patch_offset(sp, patch, wait->offset);
+			host1x_syncpt_patch_offset(sp, patch,
+						   g->offset + wait->offset);
 		}
 
 		wait->bo = NULL;
@@ -269,11 +271,12 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job)
 	return err;
 }
 
-static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf)
+static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g)
 {
 	int i = 0;
 	u32 last_page = ~0;
 	void *cmdbuf_page_addr = NULL;
+	struct host1x_bo *cmdbuf = g->bo;
 
 	/* pin & patch the relocs for one gather */
 	for (i = 0; i < job->num_relocs; i++) {
@@ -286,6 +289,13 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf)
 		if (cmdbuf != reloc->cmdbuf.bo)
 			continue;
 
+		if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
+			target = (u32 *)job->gather_copy_mapped +
+					reloc->cmdbuf.offset / sizeof(u32) +
+						g->offset / sizeof(u32);
+			goto patch_reloc;
+		}
+
 		if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
 			if (cmdbuf_page_addr)
 				host1x_bo_kunmap(cmdbuf, last_page,
@@ -302,6 +312,7 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf)
 		}
 
 		target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK);
+patch_reloc:
 		*target = reloc_addr;
 	}
 
@@ -573,6 +584,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)
 	if (err)
 		goto out;
 
+	if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
+		err = copy_gathers(job, dev);
+		if (err)
+			goto out;
+	}
+
 	/* patch gathers */
 	for (i = 0; i < job->num_gathers; i++) {
 		struct host1x_job_gather *g = &job->gathers[i];
@@ -581,7 +598,9 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)
 		if (g->handled)
 			continue;
 
-		g->base = job->gather_addr_phys[i];
+		/* copy_gathers() sets gathers base if firewall is enabled */
+		if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
+			g->base = job->gather_addr_phys[i];
 
 		for (j = i + 1; j < job->num_gathers; j++) {
 			if (job->gathers[j].bo == g->bo) {
@@ -590,19 +609,15 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)
 			}
 		}
 
-		err = do_relocs(job, g->bo);
+		err = do_relocs(job, g);
 		if (err)
-			goto out;
+			break;
 
-		err = do_waitchks(job, host, g->bo);
+		err = do_waitchks(job, host, g);
 		if (err)
-			goto out;
+			break;
 	}
 
-	if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
-		goto out;
-
-	err = copy_gathers(job, dev);
 out:
 	if (err)
 		host1x_job_unpin(job);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (12 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 13/22] gpu: host1x: Do not leak BO's phys address to userspace Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
       [not found]     ` <5e6f06d3e57db40c1c4947282ef7336369c3360f.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 15/22] gpu: host1x: Forbid RESTART opcode " Dmitry Osipenko
                     ` (7 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

Incorrectly shifted relocation address will cause a lower memory corruption
and likely a hang on a write or a read of an arbitrary data in case of IOMMU
absent. As of now there is no use for the address shifting (at least on
Tegra20) and adding a proper shifting / sizes validation is much more work.
Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/gpu/host1x/dev.c |  4 ++++
 drivers/gpu/host1x/dev.h |  1 +
 drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index 6a805ed908c3..21da331ce092 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -72,6 +72,7 @@ static const struct host1x_info host1x01_info = {
 	.init = host1x01_init,
 	.sync_offset = 0x3000,
 	.dma_mask = DMA_BIT_MASK(32),
+	.version = 1,
 };
 
 static const struct host1x_info host1x02_info = {
@@ -82,6 +83,7 @@ static const struct host1x_info host1x02_info = {
 	.init = host1x02_init,
 	.sync_offset = 0x3000,
 	.dma_mask = DMA_BIT_MASK(32),
+	.version = 2,
 };
 
 static const struct host1x_info host1x04_info = {
@@ -92,6 +94,7 @@ static const struct host1x_info host1x04_info = {
 	.init = host1x04_init,
 	.sync_offset = 0x2100,
 	.dma_mask = DMA_BIT_MASK(34),
+	.version = 4,
 };
 
 static const struct host1x_info host1x05_info = {
@@ -102,6 +105,7 @@ static const struct host1x_info host1x05_info = {
 	.init = host1x05_init,
 	.sync_offset = 0x2100,
 	.dma_mask = DMA_BIT_MASK(34),
+	.version = 5,
 };
 
 static const struct of_device_id host1x_of_match[] = {
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index 229d08b6a45e..c6997651b336 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -100,6 +100,7 @@ struct host1x_info {
 	int (*init)(struct host1x *host1x); /* initialize per SoC ops */
 	unsigned int sync_offset; /* offset of syncpoint registers */
 	u64 dma_mask; /* mask of addressable memory */
+	unsigned int version; /* host1x's version */
 };
 
 struct host1x {
diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 4208329ca2af..7825643da324 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -333,6 +333,27 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf,
 	return true;
 }
 
+static bool check_reloc_shift(struct device *dev, struct host1x_reloc *reloc)
+{
+	struct host1x *host = dev_get_drvdata(dev->parent);
+
+	/* skip newer Tegra's since IOMMU is supposed to be used by
+	 * them and not all address registers with their shifts are
+	 * publicly documented */
+	if (host->info->version > 1)
+		return true;
+
+	/* skip Tegra30 with IOMMU enabled */
+	if (host->domain)
+		return true;
+
+	/* relocation shift value validation isn't implemented yet */
+	if (reloc->shift)
+		return false;
+
+	return true;
+}
+
 struct host1x_firewall {
 	struct host1x_job *job;
 	struct device *dev;
@@ -359,6 +380,9 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset)
 		if (!check_reloc(fw->reloc, fw->cmdbuf, fw->offset))
 			return -EINVAL;
 
+		if (!check_reloc_shift(fw->dev, fw->reloc))
+			return -EINVAL;
+
 		fw->num_relocs--;
 		fw->reloc++;
 	}
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 15/22] gpu: host1x: Forbid RESTART opcode in the firewall
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (13 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 16/22] gpu: host1x: Forbid unrelated SETCLASS " Dmitry Osipenko
                     ` (6 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The RESTART opcode terminates the gather and restarts the CDMA fetching from
a specified word << 2 relative to the CDMA start address. That shouldn't be
allowed to be done by userspace.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/host1x/job.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 7825643da324..3680b4b7dce9 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -515,7 +515,6 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g)
 				goto out;
 			break;
 		case 4:
-		case 5:
 		case 14:
 			break;
 		default:
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (14 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 15/22] gpu: host1x: Forbid RESTART opcode " Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 17/22] gpu: host1x: Correct swapped arguments in the is_addr_reg() definition Dmitry Osipenko
                     ` (5 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

Several channels could be made to write the same unit concurrently via the
SETCLASS opcode, trusting userspace is a bad idea. It should be possible to
drop the per-client channel reservation and add a per-unit locking by
inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but
it will be much more work. Let's forbid the unit-unrelated class changes for
now.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/tegra/drm.c  |  1 +
 drivers/gpu/drm/tegra/drm.h  |  1 +
 drivers/gpu/drm/tegra/gr2d.c |  7 +++++++
 drivers/gpu/host1x/job.c     | 24 ++++++++++++++++++++----
 include/linux/host1x.h       |  3 +++
 5 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index aa7988dcc28f..971cdead6436 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -533,6 +533,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	}
 
 	job->is_addr_reg = context->client->ops->is_addr_reg;
+	job->is_valid_class = context->client->ops->is_valid_class;
 	job->syncpt_incrs = syncpt.incrs;
 	job->syncpt_id = syncpt.id;
 	job->timeout = 10000;
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 85aa2e3d9d4e..6d6da01282f3 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -83,6 +83,7 @@ struct tegra_drm_client_ops {
 			    struct tegra_drm_context *context);
 	void (*close_channel)(struct tegra_drm_context *context);
 	int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);
+	int (*is_valid_class)(u32 class);
 	int (*submit)(struct tegra_drm_context *context,
 		      struct drm_tegra_submit *args, struct drm_device *drm,
 		      struct drm_file *file);
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
index 02cd3e37a6ec..fbe0b8b25b42 100644
--- a/drivers/gpu/drm/tegra/gr2d.c
+++ b/drivers/gpu/drm/tegra/gr2d.c
@@ -109,10 +109,17 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset)
 	return 0;
 }
 
+static int gr2d_is_valid_class(u32 class)
+{
+	return (class == HOST1X_CLASS_GR2D ||
+		class == HOST1X_CLASS_GR2D_SB);
+}
+
 static const struct tegra_drm_client_ops gr2d_ops = {
 	.open_channel = gr2d_open_channel,
 	.close_channel = gr2d_close_channel,
 	.is_addr_reg = gr2d_is_addr_reg,
+	.is_valid_class = gr2d_is_valid_class,
 	.submit = tegra_drm_submit,
 };
 
diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 3680b4b7dce9..32d3300bcf99 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -373,6 +373,9 @@ struct host1x_firewall {
 
 static int check_register(struct host1x_firewall *fw, unsigned long offset)
 {
+	if (!fw->job->is_addr_reg)
+		return 0;
+
 	if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) {
 		if (!fw->num_relocs)
 			return -EINVAL;
@@ -390,6 +393,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset)
 	return 0;
 }
 
+static int check_class(struct host1x_firewall *fw, u32 class)
+{
+	if (!fw->job->is_valid_class) {
+		if (fw->class != class)
+			return -EINVAL;
+	} else {
+		if (!fw->job->is_valid_class(fw->class))
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int check_mask(struct host1x_firewall *fw)
 {
 	u32 mask = fw->mask;
@@ -463,11 +479,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g)
 {
 	u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped +
 		(g->offset / sizeof(u32));
+	u32 job_class = fw->class;
 	int err = 0;
 
-	if (!fw->job->is_addr_reg)
-		return 0;
-
 	fw->words = g->words;
 	fw->cmdbuf = g->bo;
 	fw->offset = 0;
@@ -487,7 +501,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g)
 			fw->class = word >> 6 & 0x3ff;
 			fw->mask = word & 0x3f;
 			fw->reg = word >> 16 & 0xfff;
-			err = check_mask(fw);
+			err = check_class(fw, job_class);
+			if (!err)
+				err = check_mask(fw);
 			if (err)
 				goto out;
 			break;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index aa323e43ae4e..71a34774b56a 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -235,6 +235,9 @@ struct host1x_job {
 	/* Check if register is marked as an address reg */
 	int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);
 
+	/* Check if class belongs to the unit */
+	int (*is_valid_class)(u32 class);
+
 	/* Request a SETCLASS to this class */
 	u32 class;
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 17/22] gpu: host1x: Correct swapped arguments in the is_addr_reg() definition
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (15 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 16/22] gpu: host1x: Forbid unrelated SETCLASS " Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 18/22] gpu: host1x: Check waits in the firewall Dmitry Osipenko
                     ` (4 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

Arguments of the .is_addr_reg() are swapped in the definition of the
function, that is quite confusing.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 include/linux/host1x.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 71a34774b56a..561d6bb6580d 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -233,7 +233,7 @@ struct host1x_job {
 	u8 *gather_copy_mapped;
 
 	/* Check if register is marked as an address reg */
-	int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);
+	int (*is_addr_reg)(struct device *dev, u32 class, u32 reg);
 
 	/* Check if class belongs to the unit */
 	int (*is_valid_class)(u32 class);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 18/22] gpu: host1x: Check waits in the firewall
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (16 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 17/22] gpu: host1x: Correct swapped arguments in the is_addr_reg() definition Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
       [not found]     ` <bd57829e74c6554c46ab761d013e6d9d3c39b693.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-06-13 23:15   ` [PATCH v2 19/22] gpu: host1x: Remove unused 'struct host1x_cmdbuf' Dmitry Osipenko
                     ` (3 subsequent siblings)
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

Check waits in the firewall in a way it is done for relocations.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/host1x/job.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 32d3300bcf99..fc194c676d91 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -31,6 +31,8 @@
 #include "job.h"
 #include "syncpt.h"
 
+#define HOST1X_WAIT_SYNCPT_OFFSET 0x8
+
 struct host1x_job *host1x_job_alloc(struct host1x_channel *ch,
 				    u32 num_cmdbufs, u32 num_relocs,
 				    u32 num_waitchks)
@@ -354,6 +356,17 @@ static bool check_reloc_shift(struct device *dev, struct host1x_reloc *reloc)
 	return true;
 }
 
+static bool check_wait(struct host1x_waitchk *wait, struct host1x_bo *cmdbuf,
+		       unsigned int offset)
+{
+	offset *= sizeof(u32);
+
+	if (wait->bo != cmdbuf || wait->offset != offset)
+		return false;
+
+	return true;
+}
+
 struct host1x_firewall {
 	struct host1x_job *job;
 	struct device *dev;
@@ -361,6 +374,9 @@ struct host1x_firewall {
 	unsigned int num_relocs;
 	struct host1x_reloc *reloc;
 
+	unsigned int num_waitchks;
+	struct host1x_waitchk *waitchk;
+
 	struct host1x_bo *cmdbuf;
 	unsigned int offset;
 
@@ -390,6 +406,20 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset)
 		fw->reloc++;
 	}
 
+	if (offset == HOST1X_WAIT_SYNCPT_OFFSET) {
+		if (fw->class != HOST1X_CLASS_HOST1X)
+			return -EINVAL;
+
+		if (!fw->num_waitchks)
+			return -EINVAL;
+
+		if (!check_wait(fw->waitchk, fw->cmdbuf, fw->offset))
+			return -EINVAL;
+
+		fw->num_waitchks--;
+		fw->waitchk++;
+	}
+
 	return 0;
 }
 
@@ -554,6 +584,8 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev)
 	fw.dev = dev;
 	fw.reloc = job->relocarray;
 	fw.num_relocs = job->num_relocs;
+	fw.waitchk = job->waitchk;
+	fw.num_waitchks = job->num_waitchk;
 	fw.class = job->class;
 
 	for (i = 0; i < job->num_gathers; i++) {
@@ -592,8 +624,8 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev)
 		offset += g->words * sizeof(u32);
 	}
 
-	/* No relocs should remain at this point */
-	if (fw.num_relocs)
+	/* No relocs and waitchks should remain at this point */
+	if (fw.num_relocs || fw.num_waitchks)
 		return -EINVAL;
 
 	return 0;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 19/22] gpu: host1x: Remove unused 'struct host1x_cmdbuf'
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (17 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 18/22] gpu: host1x: Check waits in the firewall Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:15   ` [PATCH v2 20/22] gpu: host1x: Remove unused host1x_cdma_stop() definition Dmitry Osipenko
                     ` (2 subsequent siblings)
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The struct host1x_cmdbuf is unused, let's remove it.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/host1x/job.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h
index 0debd93a1849..4bda51d503ec 100644
--- a/drivers/gpu/host1x/job.h
+++ b/drivers/gpu/host1x/job.h
@@ -27,13 +27,6 @@ struct host1x_job_gather {
 	bool handled;
 };
 
-struct host1x_cmdbuf {
-	u32 handle;
-	u32 offset;
-	u32 words;
-	u32 pad;
-};
-
 struct host1x_job_unpin_data {
 	struct host1x_bo *bo;
 	struct sg_table *sgt;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 20/22] gpu: host1x: Remove unused host1x_cdma_stop() definition
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (18 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 19/22] gpu: host1x: Remove unused 'struct host1x_cmdbuf' Dmitry Osipenko
@ 2017-06-13 23:15   ` Dmitry Osipenko
  2017-06-13 23:16   ` [PATCH v2 21/22] gpu: host1x: Refactor channel allocation code Dmitry Osipenko
  2017-06-13 23:16   ` [PATCH v2 22/22] gpu: host1x: At first try a non-blocking allocation for the gather copy Dmitry Osipenko
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:15 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

There is no host1x_cdma_stop() in the code, let's remove its definition
from the header file.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/host1x/cdma.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h
index ec170a78f4e1..286d49386be9 100644
--- a/drivers/gpu/host1x/cdma.h
+++ b/drivers/gpu/host1x/cdma.h
@@ -88,7 +88,6 @@ struct host1x_cdma {
 
 int host1x_cdma_init(struct host1x_cdma *cdma);
 int host1x_cdma_deinit(struct host1x_cdma *cdma);
-void host1x_cdma_stop(struct host1x_cdma *cdma);
 int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job);
 void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2);
 void host1x_cdma_end(struct host1x_cdma *cdma, struct host1x_job *job);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 21/22] gpu: host1x: Refactor channel allocation code
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (19 preceding siblings ...)
  2017-06-13 23:15   ` [PATCH v2 20/22] gpu: host1x: Remove unused host1x_cdma_stop() definition Dmitry Osipenko
@ 2017-06-13 23:16   ` Dmitry Osipenko
  2017-06-13 23:16   ` [PATCH v2 22/22] gpu: host1x: At first try a non-blocking allocation for the gather copy Dmitry Osipenko
  21 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:16 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

From: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:

- The previous code could deadlock due to an interaction
  between the 'reflock' mutex and CDMA timeout handling.
  This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
  of channel ownership handling into channel.c

Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Tested-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/gpu/drm/tegra/gr2d.c       |   4 +-
 drivers/gpu/drm/tegra/gr3d.c       |   4 +-
 drivers/gpu/drm/tegra/vic.c        |   4 +-
 drivers/gpu/host1x/channel.c       | 147 +++++++++++++++++++++++--------------
 drivers/gpu/host1x/channel.h       |  21 ++++--
 drivers/gpu/host1x/debug.c         |  47 +++++-------
 drivers/gpu/host1x/dev.c           |   7 +-
 drivers/gpu/host1x/dev.h           |   6 +-
 drivers/gpu/host1x/hw/channel_hw.c |   4 -
 include/linux/host1x.h             |   1 -
 10 files changed, 135 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
index fbe0b8b25b42..6ea070da7718 100644
--- a/drivers/gpu/drm/tegra/gr2d.c
+++ b/drivers/gpu/drm/tegra/gr2d.c
@@ -38,7 +38,7 @@ static int gr2d_init(struct host1x_client *client)
 
 	client->syncpts[0] = host1x_syncpt_request(client->dev, flags);
 	if (!client->syncpts[0]) {
-		host1x_channel_free(gr2d->channel);
+		host1x_channel_put(gr2d->channel);
 		return -ENOMEM;
 	}
 
@@ -57,7 +57,7 @@ static int gr2d_exit(struct host1x_client *client)
 		return err;
 
 	host1x_syncpt_free(client->syncpts[0]);
-	host1x_channel_free(gr2d->channel);
+	host1x_channel_put(gr2d->channel);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c
index 13f0d1b7cd98..cee2ab645cde 100644
--- a/drivers/gpu/drm/tegra/gr3d.c
+++ b/drivers/gpu/drm/tegra/gr3d.c
@@ -48,7 +48,7 @@ static int gr3d_init(struct host1x_client *client)
 
 	client->syncpts[0] = host1x_syncpt_request(client->dev, flags);
 	if (!client->syncpts[0]) {
-		host1x_channel_free(gr3d->channel);
+		host1x_channel_put(gr3d->channel);
 		return -ENOMEM;
 	}
 
@@ -67,7 +67,7 @@ static int gr3d_exit(struct host1x_client *client)
 		return err;
 
 	host1x_syncpt_free(client->syncpts[0]);
-	host1x_channel_free(gr3d->channel);
+	host1x_channel_put(gr3d->channel);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index cd804e404a11..47cb1aaa58b1 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -182,7 +182,7 @@ static int vic_init(struct host1x_client *client)
 free_syncpt:
 	host1x_syncpt_free(client->syncpts[0]);
 free_channel:
-	host1x_channel_free(vic->channel);
+	host1x_channel_put(vic->channel);
 detach_device:
 	if (tegra->domain)
 		iommu_detach_device(tegra->domain, vic->dev);
@@ -203,7 +203,7 @@ static int vic_exit(struct host1x_client *client)
 		return err;
 
 	host1x_syncpt_free(client->syncpts[0]);
-	host1x_channel_free(vic->channel);
+	host1x_channel_put(vic->channel);
 
 	if (vic->domain) {
 		iommu_detach_device(vic->domain, vic->dev);
diff --git a/drivers/gpu/host1x/channel.c b/drivers/gpu/host1x/channel.c
index 8f437d924c10..db9b91d1384c 100644
--- a/drivers/gpu/host1x/channel.c
+++ b/drivers/gpu/host1x/channel.c
@@ -24,19 +24,33 @@
 #include "job.h"
 
 /* Constructor for the host1x device list */
-int host1x_channel_list_init(struct host1x *host)
+int host1x_channel_list_init(struct host1x_channel_list *chlist,
+			     unsigned int num_channels)
 {
-	INIT_LIST_HEAD(&host->chlist.list);
-	mutex_init(&host->chlist_mutex);
-
-	if (host->info->nb_channels > BITS_PER_LONG) {
-		WARN(1, "host1x hardware has more channels than supported by the driver\n");
-		return -ENOSYS;
+	chlist->channels = kcalloc(num_channels, sizeof(struct host1x_channel),
+				   GFP_KERNEL);
+	if (!chlist->channels)
+		return -ENOMEM;
+
+	chlist->allocated_channels =
+		kcalloc(BITS_TO_LONGS(num_channels), sizeof(unsigned long),
+			GFP_KERNEL);
+	if (!chlist->allocated_channels) {
+		kfree(chlist->channels);
+		return -ENOMEM;
 	}
 
+	bitmap_zero(chlist->allocated_channels, num_channels);
+
 	return 0;
 }
 
+void host1x_channel_list_free(struct host1x_channel_list *chlist)
+{
+	kfree(chlist->allocated_channels);
+	kfree(chlist->channels);
+}
+
 int host1x_job_submit(struct host1x_job *job)
 {
 	struct host1x *host = dev_get_drvdata(job->channel->dev->parent);
@@ -47,86 +61,107 @@ EXPORT_SYMBOL(host1x_job_submit);
 
 struct host1x_channel *host1x_channel_get(struct host1x_channel *channel)
 {
-	int err = 0;
+	kref_get(&channel->refcount);
 
-	mutex_lock(&channel->reflock);
+	return channel;
+}
+EXPORT_SYMBOL(host1x_channel_get);
 
-	if (channel->refcount == 0)
-		err = host1x_cdma_init(&channel->cdma);
+/**
+ * host1x_channel_get_index() - Attempt to get channel reference by index
+ * @host: Host1x device object
+ * @index: Index of channel
+ *
+ * If channel number @index is currently allocated, increase its refcount
+ * and return a pointer to it. Otherwise, return NULL.
+ */
+struct host1x_channel *host1x_channel_get_index(struct host1x *host,
+						unsigned int index)
+{
+	struct host1x_channel *ch = &host->channel_list.channels[index];
 
-	if (!err)
-		channel->refcount++;
+	if (!kref_get_unless_zero(&ch->refcount))
+		return NULL;
 
-	mutex_unlock(&channel->reflock);
+	return ch;
+}
+
+static void release_channel(struct kref *kref)
+{
+	struct host1x_channel *channel =
+		container_of(kref, struct host1x_channel, refcount);
+	struct host1x *host = dev_get_drvdata(channel->dev->parent);
+	struct host1x_channel_list *chlist = &host->channel_list;
+
+	host1x_hw_cdma_stop(host, &channel->cdma);
+	host1x_cdma_deinit(&channel->cdma);
 
-	return err ? NULL : channel;
+	clear_bit(channel->id, chlist->allocated_channels);
 }
-EXPORT_SYMBOL(host1x_channel_get);
 
 void host1x_channel_put(struct host1x_channel *channel)
 {
-	mutex_lock(&channel->reflock);
+	kref_put(&channel->refcount, release_channel);
+}
+EXPORT_SYMBOL(host1x_channel_put);
 
-	if (channel->refcount == 1) {
-		struct host1x *host = dev_get_drvdata(channel->dev->parent);
+static struct host1x_channel *acquire_unused_channel(struct host1x *host)
+{
+	struct host1x_channel_list *chlist = &host->channel_list;
+	unsigned int max_channels = host->info->nb_channels;
+	unsigned int index;
 
-		host1x_hw_cdma_stop(host, &channel->cdma);
-		host1x_cdma_deinit(&channel->cdma);
+	index = find_first_zero_bit(chlist->allocated_channels, max_channels);
+	if (index >= max_channels) {
+		dev_err(host->dev, "failed to find free channel\n");
+		return NULL;
 	}
 
-	channel->refcount--;
+	chlist->channels[index].id = index;
 
-	mutex_unlock(&channel->reflock);
+	set_bit(index, chlist->allocated_channels);
+
+	return &chlist->channels[index];
 }
-EXPORT_SYMBOL(host1x_channel_put);
 
+/**
+ * host1x_channel_request() - Allocate a channel
+ * @device: Host1x unit this channel will be used to send commands to
+ *
+ * Allocates a new host1x channel for @device. If there are no free channels,
+ * this will sleep until one becomes available. May return NULL if CDMA
+ * initialization fails.
+ */
 struct host1x_channel *host1x_channel_request(struct device *dev)
 {
 	struct host1x *host = dev_get_drvdata(dev->parent);
-	unsigned int max_channels = host->info->nb_channels;
-	struct host1x_channel *channel = NULL;
-	unsigned long index;
+	struct host1x_channel_list *chlist = &host->channel_list;
+	struct host1x_channel *channel;
 	int err;
 
-	mutex_lock(&host->chlist_mutex);
+	channel = acquire_unused_channel(host);
+	if (!channel)
+		return NULL;
 
-	index = find_first_zero_bit(&host->allocated_channels, max_channels);
-	if (index >= max_channels)
-		goto fail;
+	kref_init(&channel->refcount);
+	mutex_init(&channel->submitlock);
+	channel->dev = dev;
 
-	channel = kzalloc(sizeof(*channel), GFP_KERNEL);
-	if (!channel)
+	err = host1x_hw_channel_init(host, channel, channel->id);
+	if (err < 0)
 		goto fail;
 
-	err = host1x_hw_channel_init(host, channel, index);
+	err = host1x_cdma_init(&channel->cdma);
 	if (err < 0)
 		goto fail;
 
-	/* Link device to host1x_channel */
-	channel->dev = dev;
-
-	/* Add to channel list */
-	list_add_tail(&channel->list, &host->chlist.list);
-
-	host->allocated_channels |= BIT(index);
-
-	mutex_unlock(&host->chlist_mutex);
 	return channel;
 
 fail:
-	dev_err(dev, "failed to init channel\n");
-	kfree(channel);
-	mutex_unlock(&host->chlist_mutex);
-	return NULL;
-}
-EXPORT_SYMBOL(host1x_channel_request);
+	clear_bit(channel->id, chlist->allocated_channels);
 
-void host1x_channel_free(struct host1x_channel *channel)
-{
-	struct host1x *host = dev_get_drvdata(channel->dev->parent);
+	dev_err(dev, "failed to initialize channel\n");
 
-	host->allocated_channels &= ~BIT(channel->id);
-	list_del(&channel->list);
-	kfree(channel);
+	return NULL;
 }
-EXPORT_SYMBOL(host1x_channel_free);
+EXPORT_SYMBOL(host1x_channel_request);
diff --git a/drivers/gpu/host1x/channel.h b/drivers/gpu/host1x/channel.h
index df767cf90d51..7068e42d42df 100644
--- a/drivers/gpu/host1x/channel.h
+++ b/drivers/gpu/host1x/channel.h
@@ -20,17 +20,21 @@
 #define __HOST1X_CHANNEL_H
 
 #include <linux/io.h>
+#include <linux/kref.h>
 
 #include "cdma.h"
 
 struct host1x;
+struct host1x_channel;
 
-struct host1x_channel {
-	struct list_head list;
+struct host1x_channel_list {
+	struct host1x_channel *channels;
+	unsigned long *allocated_channels;
+};
 
-	unsigned int refcount;
+struct host1x_channel {
+	struct kref refcount;
 	unsigned int id;
-	struct mutex reflock;
 	struct mutex submitlock;
 	void __iomem *regs;
 	struct device *dev;
@@ -38,9 +42,10 @@ struct host1x_channel {
 };
 
 /* channel list operations */
-int host1x_channel_list_init(struct host1x *host);
-
-#define host1x_for_each_channel(host, channel)				\
-	list_for_each_entry(channel, &host->chlist.list, list)
+int host1x_channel_list_init(struct host1x_channel_list *chlist,
+			     unsigned int num_channels);
+void host1x_channel_list_free(struct host1x_channel_list *chlist);
+struct host1x_channel *host1x_channel_get_index(struct host1x *host,
+						unsigned int index);
 
 #endif
diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
index d9330fcc62ad..2aae0e63214c 100644
--- a/drivers/gpu/host1x/debug.c
+++ b/drivers/gpu/host1x/debug.c
@@ -43,24 +43,19 @@ void host1x_debug_output(struct output *o, const char *fmt, ...)
 	o->fn(o->ctx, o->buf, len);
 }
 
-static int show_channels(struct host1x_channel *ch, void *data, bool show_fifo)
+static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
 {
 	struct host1x *m = dev_get_drvdata(ch->dev->parent);
 	struct output *o = data;
 
-	mutex_lock(&ch->reflock);
+	mutex_lock(&ch->cdma.lock);
 
-	if (ch->refcount) {
-		mutex_lock(&ch->cdma.lock);
+	if (show_fifo)
+		host1x_hw_show_channel_fifo(m, ch, o);
 
-		if (show_fifo)
-			host1x_hw_show_channel_fifo(m, ch, o);
+	host1x_hw_show_channel_cdma(m, ch, o);
 
-		host1x_hw_show_channel_cdma(m, ch, o);
-		mutex_unlock(&ch->cdma.lock);
-	}
-
-	mutex_unlock(&ch->reflock);
+	mutex_unlock(&ch->cdma.lock);
 
 	return 0;
 }
@@ -94,28 +89,22 @@ static void show_syncpts(struct host1x *m, struct output *o)
 	host1x_debug_output(o, "\n");
 }
 
-static void show_all(struct host1x *m, struct output *o)
+static void show_all(struct host1x *m, struct output *o, bool show_fifo)
 {
-	struct host1x_channel *ch;
+	int i;
 
 	host1x_hw_show_mlocks(m, o);
 	show_syncpts(m, o);
 	host1x_debug_output(o, "---- channels ----\n");
 
-	host1x_for_each_channel(m, ch)
-		show_channels(ch, o, true);
-}
-
-static void show_all_no_fifo(struct host1x *host1x, struct output *o)
-{
-	struct host1x_channel *ch;
-
-	host1x_hw_show_mlocks(host1x, o);
-	show_syncpts(host1x, o);
-	host1x_debug_output(o, "---- channels ----\n");
+	for (i = 0; i < m->info->nb_channels; ++i) {
+		struct host1x_channel *ch = host1x_channel_get_index(m, i);
 
-	host1x_for_each_channel(host1x, ch)
-		show_channels(ch, o, false);
+		if (ch) {
+			show_channel(ch, o, show_fifo);
+			host1x_channel_put(ch);
+		}
+	}
 }
 
 static int host1x_debug_show_all(struct seq_file *s, void *unused)
@@ -125,7 +114,7 @@ static int host1x_debug_show_all(struct seq_file *s, void *unused)
 		.ctx = s
 	};
 
-	show_all(s->private, &o);
+	show_all(s->private, &o, true);
 
 	return 0;
 }
@@ -137,7 +126,7 @@ static int host1x_debug_show(struct seq_file *s, void *unused)
 		.ctx = s
 	};
 
-	show_all_no_fifo(s->private, &o);
+	show_all(s->private, &o, false);
 
 	return 0;
 }
@@ -216,7 +205,7 @@ void host1x_debug_dump(struct host1x *host1x)
 		.fn = write_to_printk
 	};
 
-	show_all(host1x, &o);
+	show_all(host1x, &o, true);
 }
 
 void host1x_debug_dump_syncpts(struct host1x *host1x)
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index 21da331ce092..18bc9dabb89d 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -203,7 +203,8 @@ static int host1x_probe(struct platform_device *pdev)
 		host->iova_end = geometry->aperture_end;
 	}
 
-	err = host1x_channel_list_init(host);
+	err = host1x_channel_list_init(&host->channel_list,
+				       host->info->nb_channels);
 	if (err) {
 		dev_err(&pdev->dev, "failed to initialize channel list\n");
 		goto fail_detach_device;
@@ -212,7 +213,7 @@ static int host1x_probe(struct platform_device *pdev)
 	err = clk_prepare_enable(host->clk);
 	if (err < 0) {
 		dev_err(&pdev->dev, "failed to enable clock\n");
-		goto fail_detach_device;
+		goto fail_free_channels;
 	}
 
 	err = reset_control_deassert(host->rst);
@@ -249,6 +250,8 @@ static int host1x_probe(struct platform_device *pdev)
 	reset_control_assert(host->rst);
 fail_unprepare_disable:
 	clk_disable_unprepare(host->clk);
+fail_free_channels:
+	host1x_channel_list_free(&host->channel_list);
 fail_detach_device:
 	if (host->domain) {
 		put_iova_domain(&host->iova);
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index c6997651b336..ec2eddec93cb 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -130,10 +130,8 @@ struct host1x {
 	struct host1x_syncpt *nop_sp;
 
 	struct mutex syncpt_mutex;
-	struct mutex chlist_mutex;
-	struct host1x_channel chlist;
-	unsigned long allocated_channels;
-	unsigned int num_allocated_channels;
+
+	struct host1x_channel_list channel_list;
 
 	struct dentry *debugfs;
 
diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
index 5e8df78b7acd..8447a56c41ca 100644
--- a/drivers/gpu/host1x/hw/channel_hw.c
+++ b/drivers/gpu/host1x/hw/channel_hw.c
@@ -181,10 +181,6 @@ static int channel_submit(struct host1x_job *job)
 static int host1x_channel_init(struct host1x_channel *ch, struct host1x *dev,
 			       unsigned int index)
 {
-	ch->id = index;
-	mutex_init(&ch->reflock);
-	mutex_init(&ch->submitlock);
-
 	ch->regs = dev->regs + index * HOST1X_CHANNEL_SIZE;
 	return 0;
 }
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 561d6bb6580d..5dd8df2f8810 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -156,7 +156,6 @@ struct host1x_channel;
 struct host1x_job;
 
 struct host1x_channel *host1x_channel_request(struct device *dev);
-void host1x_channel_free(struct host1x_channel *channel);
 struct host1x_channel *host1x_channel_get(struct host1x_channel *channel);
 void host1x_channel_put(struct host1x_channel *channel);
 int host1x_job_submit(struct host1x_job *job);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 22/22] gpu: host1x: At first try a non-blocking allocation for the gather copy
       [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
                     ` (20 preceding siblings ...)
  2017-06-13 23:16   ` [PATCH v2 21/22] gpu: host1x: Refactor channel allocation code Dmitry Osipenko
@ 2017-06-13 23:16   ` Dmitry Osipenko
       [not found]     ` <036615566011ab4d52190439a12d943df9e704ef.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  21 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-13 23:16 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

The blocking gather copy allocation is a major performance downside of the
Host1x firewall, it may take hundreds milliseconds which is unacceptable
for the real-time graphics operations. Let's try a non-blocking allocation
first as a least invasive solution, it makes opentegra (Xorg driver)
performance indistinguishable with/without the firewall.

Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---

The other much more invasive solution could be: to have a memory pool
reserved for the gather copy and performing the firewall checks (and the
gather copying) on the actual job submission to the CDMA, not on the job
allocation like it is done now. This solution reduces the overhead of a
gather copy allocations.

 drivers/gpu/host1x/job.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index fc194c676d91..ed0575f14093 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -594,12 +594,20 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev)
 		size += g->words * sizeof(u32);
 	}
 
+	/*
+	 * Try a non-blocking allocation from a higher priority pools first,
+	 * as awaiting for the allocation here is a major performance hit.
+	 */
 	job->gather_copy_mapped = dma_alloc_wc(dev, size, &job->gather_copy,
-					       GFP_KERNEL);
-	if (!job->gather_copy_mapped) {
-		job->gather_copy_mapped = NULL;
+					       GFP_NOWAIT);
+
+	/* the higher priority allocation failed, try the generic-blocking */
+	if (!job->gather_copy_mapped)
+		job->gather_copy_mapped = dma_alloc_wc(dev, size,
+						       &job->gather_copy,
+						       GFP_KERNEL);
+	if (!job->gather_copy_mapped)
 		return -ENOMEM;
-	}
 
 	job->gather_copy_size = size;
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall
       [not found]     ` <5e6f06d3e57db40c1c4947282ef7336369c3360f.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  6:50       ` Mikko Perttunen
       [not found]         ` <6db57dae-0de7-45a5-2ed5-1ebd84577742-/1wQRMveznE@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Mikko Perttunen @ 2017-06-14  6:50 UTC (permalink / raw)
  To: Dmitry Osipenko, Thierry Reding, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

On 14.06.2017 02:15, Dmitry Osipenko wrote:
> Incorrectly shifted relocation address will cause a lower memory corruption
> and likely a hang on a write or a read of an arbitrary data in case of IOMMU
> absent. As of now there is no use for the address shifting (at least on
> Tegra20) and adding a proper shifting / sizes validation is much more work.
> Let's forbid it in the firewall till a proper validation is implemented.
>
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
>  drivers/gpu/host1x/dev.c |  4 ++++
>  drivers/gpu/host1x/dev.h |  1 +
>  drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++++++
>  3 files changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index 6a805ed908c3..21da331ce092 100644
> --- a/drivers/gpu/host1x/dev.c
> +++ b/drivers/gpu/host1x/dev.c
> @@ -72,6 +72,7 @@ static const struct host1x_info host1x01_info = {
>  	.init = host1x01_init,
>  	.sync_offset = 0x3000,
>  	.dma_mask = DMA_BIT_MASK(32),
> +	.version = 1,
>  };
>
>  static const struct host1x_info host1x02_info = {
> @@ -82,6 +83,7 @@ static const struct host1x_info host1x02_info = {
>  	.init = host1x02_init,
>  	.sync_offset = 0x3000,
>  	.dma_mask = DMA_BIT_MASK(32),
> +	.version = 2,
>  };
>
>  static const struct host1x_info host1x04_info = {
> @@ -92,6 +94,7 @@ static const struct host1x_info host1x04_info = {
>  	.init = host1x04_init,
>  	.sync_offset = 0x2100,
>  	.dma_mask = DMA_BIT_MASK(34),
> +	.version = 4,
>  };
>
>  static const struct host1x_info host1x05_info = {
> @@ -102,6 +105,7 @@ static const struct host1x_info host1x05_info = {
>  	.init = host1x05_init,
>  	.sync_offset = 0x2100,
>  	.dma_mask = DMA_BIT_MASK(34),
> +	.version = 5,
>  };
>
>  static const struct of_device_id host1x_of_match[] = {
> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
> index 229d08b6a45e..c6997651b336 100644
> --- a/drivers/gpu/host1x/dev.h
> +++ b/drivers/gpu/host1x/dev.h
> @@ -100,6 +100,7 @@ struct host1x_info {
>  	int (*init)(struct host1x *host1x); /* initialize per SoC ops */
>  	unsigned int sync_offset; /* offset of syncpoint registers */
>  	u64 dma_mask; /* mask of addressable memory */
> +	unsigned int version; /* host1x's version */
>  };
>
>  struct host1x {
> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
> index 4208329ca2af..7825643da324 100644
> --- a/drivers/gpu/host1x/job.c
> +++ b/drivers/gpu/host1x/job.c
> @@ -333,6 +333,27 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf,
>  	return true;
>  }
>
> +static bool check_reloc_shift(struct device *dev, struct host1x_reloc *reloc)
> +{
> +	struct host1x *host = dev_get_drvdata(dev->parent);
> +
> +	/* skip newer Tegra's since IOMMU is supposed to be used by
> +	 * them and not all address registers with their shifts are
> +	 * publicly documented */
> +	if (host->info->version > 1)
> +		return true;

I don't like the firewall working differently per SoC - IOMMU can still 
be disabled on later SoCs and in this case there would be a security 
hole even if the firewall is enabled.

> +
> +	/* skip Tegra30 with IOMMU enabled */
> +	if (host->domain)
> +		return true;
> +

So if we want to be able to test the firewall on all SoCs just having 
this check here for all SoCs would be better.

Thanks,
Mikko

> +	/* relocation shift value validation isn't implemented yet */
> +	if (reloc->shift)
> +		return false;
> +
> +	return true;
> +}
> +
>  struct host1x_firewall {
>  	struct host1x_job *job;
>  	struct device *dev;
> @@ -359,6 +380,9 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset)
>  		if (!check_reloc(fw->reloc, fw->cmdbuf, fw->offset))
>  			return -EINVAL;
>
> +		if (!check_reloc_shift(fw->dev, fw->reloc))
> +			return -EINVAL;
> +
>  		fw->num_relocs--;
>  		fw->reloc++;
>  	}
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 01/22] drm/tegra: Fix lockup on a use of staging API
       [not found]     ` <d2616b82a1728d0953e84489cde40364999719f0.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  7:09       ` Erik Faye-Lund
  0 siblings, 0 replies; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:09 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Commit bdd2f9cd ("Don't leak kernel pointer to userspace") added a mutex
> around staging IOCTL's, some of those mutexes are taken twice.
>
> Fixes: bdd2f9cd10eb ("drm/tegra: Don't leak kernel pointer to userspace")
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 06/22] drm/tegra: dc: Avoid reset asserts on Tegra20
       [not found]     ` <1c7b0c7bf713401073d648f9f228fa152c094a5a.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  7:13       ` Erik Faye-Lund
  0 siblings, 0 replies; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:13 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Commit 33a8eb8 ("Implement runtime PM") introduced HW reset control. It
> causes a hang on Tegra20 if both display controllers are utilized (RGB
> panel and HDMI). The TRM suggests that each display controller has its own
> reset control, apparently it is not correct.
>
> Fixes: 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM")
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 07/22] drm/tegra: dc: Apply clipping to the plane
       [not found]     ` <7adbe0c6e67115fa4fd201026533b7cb97fc44d2.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  7:20       ` Erik Faye-Lund
  0 siblings, 0 replies; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:20 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Tegra20 an overlay plane should be clipped, otherwise its output is
> distorted once plane crosses display boundary.
>
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
>  drivers/gpu/drm/tegra/dc.c | 29 +++++++++++++++++++++--------
>  1 file changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index 98ee6abb056c..a7a7cce1afd0 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -486,12 +486,25 @@ static int tegra_plane_state_add(struct tegra_plane *plane,
>  {
>         struct drm_crtc_state *crtc_state;
>         struct tegra_dc_state *tegra;
> +       struct drm_rect clip;
> +       int err;
>
>         /* Propagate errors from allocation or locking failures. */
>         crtc_state = drm_atomic_get_crtc_state(state->state, state->crtc);
>         if (IS_ERR(crtc_state))
>                 return PTR_ERR(crtc_state);
>
> +       clip.x1 = 0;
> +       clip.y1 = 0;
> +       clip.x2 = crtc_state->mode.hdisplay;
> +       clip.y2 = crtc_state->mode.vdisplay;
> +
> +       /* Check plane state for visibility and calculate clipping bounds */
> +       err = drm_plane_helper_check_state(state, &clip, 0, INT_MAX,
> +                                          true, true);
> +       if (err < 0)
> +               return err;
> +
>         tegra = to_dc_state(crtc_state);
>
>         tegra->planes |= WIN_A_ACT_REQ << plane->index;
> @@ -561,14 +574,14 @@ static void tegra_plane_atomic_update(struct drm_plane *plane,
>                 return;
>
>         memset(&window, 0, sizeof(window));
> -       window.src.x = plane->state->src_x >> 16;
> -       window.src.y = plane->state->src_y >> 16;
> -       window.src.w = plane->state->src_w >> 16;
> -       window.src.h = plane->state->src_h >> 16;
> -       window.dst.x = plane->state->crtc_x;
> -       window.dst.y = plane->state->crtc_y;
> -       window.dst.w = plane->state->crtc_w;
> -       window.dst.h = plane->state->crtc_h;
> +       window.src.x = plane->state->src.x1 >> 16;
> +       window.src.y = plane->state->src.y1 >> 16;
> +       window.src.w = drm_rect_width(&plane->state->src) >> 16;
> +       window.src.h = drm_rect_height(&plane->state->src) >> 16;
> +       window.dst.x = plane->state->dst.x1;
> +       window.dst.y = plane->state->dst.y1;
> +       window.dst.w = drm_rect_width(&plane->state->dst);
> +       window.dst.h = drm_rect_height(&plane->state->dst);
>         window.bits_per_pixel = fb->format->cpp[0] * 8;
>         window.bottom_up = tegra_fb_is_bottom_up(fb);
>

Looks good as far as I can tell.

Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 08/22] drm/tegra: dc: Disable plane if it is invisible
  2017-06-13 23:15   ` [PATCH v2 08/22] drm/tegra: dc: Disable plane if it is invisible Dmitry Osipenko
@ 2017-06-14  7:21     ` Erik Faye-Lund
  0 siblings, 0 replies; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:21 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: linux-tegra@vger.kernel.org, Thierry Reding, Mikko Perttunen,
	DRI Development

On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx@gmail.com> wrote:
> On Tegra20 if plane has width or height equal to 0, it will be infinitely
> wide or tall. Let's disable the plane if it is invisible on atomic state
> committing to fix the issue. The Rockchip DRM driver does the same.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>

Looks good as far as I can tell

Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 09/22] drm/tegra: Don't use IOMMU on Tegra20
       [not found]     ` <d707bd34e268f9d8fb29bf510a9f17e5f943a635.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  7:39       ` Erik Faye-Lund
       [not found]         ` <CABPQNSbMQMrahcvWnczywLDYVU0=Ns=FB6Yseisk-5L_8Y=rpQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:39 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU
> provider.
>
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Acked-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
> ---
>  drivers/gpu/drm/tegra/drm.c | 3 ++-
>  drivers/gpu/host1x/dev.c    | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index e999391aedc9..aa7988dcc28f 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -131,7 +131,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>         if (!tegra)
>                 return -ENOMEM;
>
> -       if (iommu_present(&platform_bus_type)) {
> +       if (iommu_present(&platform_bus_type) &&
> +           !of_machine_is_compatible("nvidia,tegra20")) {
>                 u64 carveout_start, carveout_end, gem_start, gem_end;
>                 struct iommu_domain_geometry *geometry;
>                 unsigned long order;
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index f05ebb14fa63..6a805ed908c3 100644
> --- a/drivers/gpu/host1x/dev.c
> +++ b/drivers/gpu/host1x/dev.c
> @@ -177,7 +177,8 @@ static int host1x_probe(struct platform_device *pdev)
>                 return err;
>         }
>
> -       if (iommu_present(&platform_bus_type)) {
> +       if (iommu_present(&platform_bus_type) &&
> +           !of_machine_is_compatible("nvidia,tegra20")) {
>                 struct iommu_domain_geometry *geometry;
>                 unsigned long order;
>

This doesn't feel great... The commit message says there's no IOMMU,
but iommu_present says otherwise. I know there's some more subtleties
here, and the commit message does hint at this. But...

If we don't want to treat the GART as an IOMMU, shouldn't we somehow
make sure iommu_present() doesn't return true in these cases (or
perhaps make something like tegra_use_iommu(), with a comment
explaining why we don't want to allow the GART to be treated like a
proper IOMMU)? These seems to be the only Tegra-specific calls to
iommu_present()...

That being said, the patch seems to have the right effect...

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 11/22] gpu: host1x: Initialize firewall class to the jobs one
       [not found]     ` <8c0a2417584ce51b716f2c286027bcb18fceda56.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  7:41       ` Erik Faye-Lund
  0 siblings, 0 replies; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:41 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> The commands stream is prepended by the jobs class on the CDMA submission,
> so that explicitly setting a module class in the commands stream isn't
> necessary. The firewall initializes its class to 0 and the command stream
> that doesn't explicitly specify the class effectively bypasses the firewall.
>
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 18/22] gpu: host1x: Check waits in the firewall
       [not found]     ` <bd57829e74c6554c46ab761d013e6d9d3c39b693.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  7:47       ` Erik Faye-Lund
  0 siblings, 0 replies; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:47 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Check waits in the firewall in a way it is done for relocations.
>
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 22/22] gpu: host1x: At first try a non-blocking allocation for the gather copy
       [not found]     ` <036615566011ab4d52190439a12d943df9e704ef.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14  7:56       ` Erik Faye-Lund
       [not found]         ` <CABPQNSZ8f8QqW4fJYCGfgf3q0pTyQ7NbS_=64x39kvrNDSyhdw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  7:56 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On Wed, Jun 14, 2017 at 1:16 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> The blocking gather copy allocation is a major performance downside of the
> Host1x firewall, it may take hundreds milliseconds which is unacceptable
> for the real-time graphics operations. Let's try a non-blocking allocation
> first as a least invasive solution, it makes opentegra (Xorg driver)
> performance indistinguishable with/without the firewall.
>
> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
>
> The other much more invasive solution could be: to have a memory pool
> reserved for the gather copy and performing the firewall checks (and the
> gather copying) on the actual job submission to the CDMA, not on the job
> allocation like it is done now. This solution reduces the overhead of a
> gather copy allocations.
>
>  drivers/gpu/host1x/job.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
> index fc194c676d91..ed0575f14093 100644
> --- a/drivers/gpu/host1x/job.c
> +++ b/drivers/gpu/host1x/job.c
> @@ -594,12 +594,20 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev)
>                 size += g->words * sizeof(u32);
>         }
>
> +       /*
> +        * Try a non-blocking allocation from a higher priority pools first,
> +        * as awaiting for the allocation here is a major performance hit.
> +        */
>         job->gather_copy_mapped = dma_alloc_wc(dev, size, &job->gather_copy,
> -                                              GFP_KERNEL);
> -       if (!job->gather_copy_mapped) {
> -               job->gather_copy_mapped = NULL;
> +                                              GFP_NOWAIT);
> +
> +       /* the higher priority allocation failed, try the generic-blocking */
> +       if (!job->gather_copy_mapped)
> +               job->gather_copy_mapped = dma_alloc_wc(dev, size,
> +                                                      &job->gather_copy,
> +                                                      GFP_KERNEL);
> +       if (!job->gather_copy_mapped)
>                 return -ENOMEM;
> -       }
>
>         job->gather_copy_size = size;

Can't we just have a static fixed-size buffer, and validate chunks at
the time? Or why can't we validate directly on the mapped pointers?

It feels pretty silly to have to allocate memory in the first place here...

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 22/22] gpu: host1x: At first try a non-blocking allocation for the gather copy
       [not found]         ` <CABPQNSZ8f8QqW4fJYCGfgf3q0pTyQ7NbS_=64x39kvrNDSyhdw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-14  8:32           ` Dmitry Osipenko
  2017-06-14  8:44             ` Erik Faye-Lund
  0 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-14  8:32 UTC (permalink / raw)
  To: kusmabite-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development

On 14.06.2017 10:56, Erik Faye-Lund wrote:
> On Wed, Jun 14, 2017 at 1:16 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> The blocking gather copy allocation is a major performance downside of the
>> Host1x firewall, it may take hundreds milliseconds which is unacceptable
>> for the real-time graphics operations. Let's try a non-blocking allocation
>> first as a least invasive solution, it makes opentegra (Xorg driver)
>> performance indistinguishable with/without the firewall.
>>
>> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> ---
>>
>> The other much more invasive solution could be: to have a memory pool
>> reserved for the gather copy and performing the firewall checks (and the
>> gather copying) on the actual job submission to the CDMA, not on the job
>> allocation like it is done now. This solution reduces the overhead of a
>> gather copy allocations.
>>
>>  drivers/gpu/host1x/job.c | 16 ++++++++++++----
>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
>> index fc194c676d91..ed0575f14093 100644
>> --- a/drivers/gpu/host1x/job.c
>> +++ b/drivers/gpu/host1x/job.c
>> @@ -594,12 +594,20 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev)
>>                 size += g->words * sizeof(u32);
>>         }
>>
>> +       /*
>> +        * Try a non-blocking allocation from a higher priority pools first,
>> +        * as awaiting for the allocation here is a major performance hit.
>> +        */
>>         job->gather_copy_mapped = dma_alloc_wc(dev, size, &job->gather_copy,
>> -                                              GFP_KERNEL);
>> -       if (!job->gather_copy_mapped) {
>> -               job->gather_copy_mapped = NULL;
>> +                                              GFP_NOWAIT);
>> +
>> +       /* the higher priority allocation failed, try the generic-blocking */
>> +       if (!job->gather_copy_mapped)
>> +               job->gather_copy_mapped = dma_alloc_wc(dev, size,
>> +                                                      &job->gather_copy,
>> +                                                      GFP_KERNEL);
>> +       if (!job->gather_copy_mapped)
>>                 return -ENOMEM;
>> -       }
>>
>>         job->gather_copy_size = size;
> 
> Can't we just have a static fixed-size buffer, and validate chunks at
> the time? Or why can't we validate directly on the mapped pointers?
> 
> It feels pretty silly to have to allocate memory in the first place here...
> 

We can't validate the mapped pointers because userspace may write to the buffers
memory at any time. Maybe it should be possible to write-protect cmdbuffers
memory for the time of its submission and execution, but I'm not sure whether
it's worth the effort. The current-proposed solution is efficient and least
invasive.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 22/22] gpu: host1x: At first try a non-blocking allocation for the gather copy
  2017-06-14  8:32           ` Dmitry Osipenko
@ 2017-06-14  8:44             ` Erik Faye-Lund
  0 siblings, 0 replies; 40+ messages in thread
From: Erik Faye-Lund @ 2017-06-14  8:44 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: linux-tegra@vger.kernel.org, Thierry Reding, Mikko Perttunen,
	DRI Development

On Wed, Jun 14, 2017 at 10:32 AM, Dmitry Osipenko <digetx@gmail.com> wrote:
> On 14.06.2017 10:56, Erik Faye-Lund wrote:
>> On Wed, Jun 14, 2017 at 1:16 AM, Dmitry Osipenko <digetx@gmail.com> wrote:
>>> The blocking gather copy allocation is a major performance downside of the
>>> Host1x firewall, it may take hundreds milliseconds which is unacceptable
>>> for the real-time graphics operations. Let's try a non-blocking allocation
>>> first as a least invasive solution, it makes opentegra (Xorg driver)
>>> performance indistinguishable with/without the firewall.
>>>
>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>> ---
>>>
>>> The other much more invasive solution could be: to have a memory pool
>>> reserved for the gather copy and performing the firewall checks (and the
>>> gather copying) on the actual job submission to the CDMA, not on the job
>>> allocation like it is done now. This solution reduces the overhead of a
>>> gather copy allocations.
>>>
>>>  drivers/gpu/host1x/job.c | 16 ++++++++++++----
>>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
>>> index fc194c676d91..ed0575f14093 100644
>>> --- a/drivers/gpu/host1x/job.c
>>> +++ b/drivers/gpu/host1x/job.c
>>> @@ -594,12 +594,20 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev)
>>>                 size += g->words * sizeof(u32);
>>>         }
>>>
>>> +       /*
>>> +        * Try a non-blocking allocation from a higher priority pools first,
>>> +        * as awaiting for the allocation here is a major performance hit.
>>> +        */
>>>         job->gather_copy_mapped = dma_alloc_wc(dev, size, &job->gather_copy,
>>> -                                              GFP_KERNEL);
>>> -       if (!job->gather_copy_mapped) {
>>> -               job->gather_copy_mapped = NULL;
>>> +                                              GFP_NOWAIT);
>>> +
>>> +       /* the higher priority allocation failed, try the generic-blocking */
>>> +       if (!job->gather_copy_mapped)
>>> +               job->gather_copy_mapped = dma_alloc_wc(dev, size,
>>> +                                                      &job->gather_copy,
>>> +                                                      GFP_KERNEL);
>>> +       if (!job->gather_copy_mapped)
>>>                 return -ENOMEM;
>>> -       }
>>>
>>>         job->gather_copy_size = size;
>>
>> Can't we just have a static fixed-size buffer, and validate chunks at
>> the time? Or why can't we validate directly on the mapped pointers?
>>
>> It feels pretty silly to have to allocate memory in the first place here...
>>
>
> We can't validate the mapped pointers because userspace may write to the buffers
> memory at any time. Maybe it should be possible to write-protect cmdbuffers
> memory for the time of its submission and execution, but I'm not sure whether
> it's worth the effort. The current-proposed solution is efficient and least
> invasive.

Right, good point.

Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall
       [not found]         ` <6db57dae-0de7-45a5-2ed5-1ebd84577742-/1wQRMveznE@public.gmane.org>
@ 2017-06-14  9:06           ` Dmitry Osipenko
       [not found]             ` <3d723511-45f2-3c80-58fe-c9050624ac94-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-14  9:06 UTC (permalink / raw)
  To: Mikko Perttunen, Thierry Reding, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

On 14.06.2017 09:50, Mikko Perttunen wrote:
> On 14.06.2017 02:15, Dmitry Osipenko wrote:
>> Incorrectly shifted relocation address will cause a lower memory corruption
>> and likely a hang on a write or a read of an arbitrary data in case of IOMMU
>> absent. As of now there is no use for the address shifting (at least on
>> Tegra20) and adding a proper shifting / sizes validation is much more work.
>> Let's forbid it in the firewall till a proper validation is implemented.
>>
>> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> ---
>>  drivers/gpu/host1x/dev.c |  4 ++++
>>  drivers/gpu/host1x/dev.h |  1 +
>>  drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++++++
>>  3 files changed, 29 insertions(+)
>>
>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>> index 6a805ed908c3..21da331ce092 100644
>> --- a/drivers/gpu/host1x/dev.c
>> +++ b/drivers/gpu/host1x/dev.c
>> @@ -72,6 +72,7 @@ static const struct host1x_info host1x01_info = {
>>      .init = host1x01_init,
>>      .sync_offset = 0x3000,
>>      .dma_mask = DMA_BIT_MASK(32),
>> +    .version = 1,
>>  };
>>
>>  static const struct host1x_info host1x02_info = {
>> @@ -82,6 +83,7 @@ static const struct host1x_info host1x02_info = {
>>      .init = host1x02_init,
>>      .sync_offset = 0x3000,
>>      .dma_mask = DMA_BIT_MASK(32),
>> +    .version = 2,
>>  };
>>
>>  static const struct host1x_info host1x04_info = {
>> @@ -92,6 +94,7 @@ static const struct host1x_info host1x04_info = {
>>      .init = host1x04_init,
>>      .sync_offset = 0x2100,
>>      .dma_mask = DMA_BIT_MASK(34),
>> +    .version = 4,
>>  };
>>
>>  static const struct host1x_info host1x05_info = {
>> @@ -102,6 +105,7 @@ static const struct host1x_info host1x05_info = {
>>      .init = host1x05_init,
>>      .sync_offset = 0x2100,
>>      .dma_mask = DMA_BIT_MASK(34),
>> +    .version = 5,
>>  };
>>
>>  static const struct of_device_id host1x_of_match[] = {
>> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
>> index 229d08b6a45e..c6997651b336 100644
>> --- a/drivers/gpu/host1x/dev.h
>> +++ b/drivers/gpu/host1x/dev.h
>> @@ -100,6 +100,7 @@ struct host1x_info {
>>      int (*init)(struct host1x *host1x); /* initialize per SoC ops */
>>      unsigned int sync_offset; /* offset of syncpoint registers */
>>      u64 dma_mask; /* mask of addressable memory */
>> +    unsigned int version; /* host1x's version */
>>  };
>>
>>  struct host1x {
>> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
>> index 4208329ca2af..7825643da324 100644
>> --- a/drivers/gpu/host1x/job.c
>> +++ b/drivers/gpu/host1x/job.c
>> @@ -333,6 +333,27 @@ static bool check_reloc(struct host1x_reloc *reloc,
>> struct host1x_bo *cmdbuf,
>>      return true;
>>  }
>>
>> +static bool check_reloc_shift(struct device *dev, struct host1x_reloc *reloc)
>> +{
>> +    struct host1x *host = dev_get_drvdata(dev->parent);
>> +
>> +    /* skip newer Tegra's since IOMMU is supposed to be used by
>> +     * them and not all address registers with their shifts are
>> +     * publicly documented */
>> +    if (host->info->version > 1)
>> +        return true;
> 
> I don't like the firewall working differently per SoC - IOMMU can still be
> disabled on later SoCs and in this case there would be a security hole even if
> the firewall is enabled.
> 

Yes, it would be a security hole like it is is right now. Unfortunately we can't
fix it on later SoC's without knowing all the address registers and their shifts.

>> +
>> +    /* skip Tegra30 with IOMMU enabled */
>> +    if (host->domain)
>> +        return true;
>> +
> 
> So if we want to be able to test the firewall on all SoCs just having this check
> here for all SoCs would be better.
> 

I'm not sure what you are meaning here. You'll get a wrong address resolved in
HW if that address needs to be shifted.

As I wrote before, on later SoC's firewall could be used as a debug feature
without the shifts being validated.

>> +    /* relocation shift value validation isn't implemented yet */
>> +    if (reloc->shift)
>> +        return false;
>> +
>> +    return true;
>> +}
>> +
>>  struct host1x_firewall {
>>      struct host1x_job *job;
>>      struct device *dev;
>> @@ -359,6 +380,9 @@ static int check_register(struct host1x_firewall *fw,
>> unsigned long offset)
>>          if (!check_reloc(fw->reloc, fw->cmdbuf, fw->offset))
>>              return -EINVAL;
>>
>> +        if (!check_reloc_shift(fw->dev, fw->reloc))
>> +            return -EINVAL;
>> +
>>          fw->num_relocs--;
>>          fw->reloc++;
>>      }
>>


-- 
Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 09/22] drm/tegra: Don't use IOMMU on Tegra20
       [not found]         ` <CABPQNSbMQMrahcvWnczywLDYVU0=Ns=FB6Yseisk-5L_8Y=rpQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-06-14 10:22           ` Dmitry Osipenko
       [not found]             ` <16be88ba-808d-4d3e-533b-7dce4a2765e7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-14 10:22 UTC (permalink / raw)
  To: kusmabite-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development, Nicolas Chauvet

On 14.06.2017 10:39, Erik Faye-Lund wrote:
> On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU
>> provider.
>>
>> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Acked-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
>> ---
>>  drivers/gpu/drm/tegra/drm.c | 3 ++-
>>  drivers/gpu/host1x/dev.c    | 3 ++-
>>  2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index e999391aedc9..aa7988dcc28f 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -131,7 +131,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>         if (!tegra)
>>                 return -ENOMEM;
>>
>> -       if (iommu_present(&platform_bus_type)) {
>> +       if (iommu_present(&platform_bus_type) &&
>> +           !of_machine_is_compatible("nvidia,tegra20")) {
>>                 u64 carveout_start, carveout_end, gem_start, gem_end;
>>                 struct iommu_domain_geometry *geometry;
>>                 unsigned long order;
>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>> index f05ebb14fa63..6a805ed908c3 100644
>> --- a/drivers/gpu/host1x/dev.c
>> +++ b/drivers/gpu/host1x/dev.c
>> @@ -177,7 +177,8 @@ static int host1x_probe(struct platform_device *pdev)
>>                 return err;
>>         }
>>
>> -       if (iommu_present(&platform_bus_type)) {
>> +       if (iommu_present(&platform_bus_type) &&
>> +           !of_machine_is_compatible("nvidia,tegra20")) {
>>                 struct iommu_domain_geometry *geometry;
>>                 unsigned long order;
>>
> 
> This doesn't feel great... The commit message says there's no IOMMU,
> but iommu_present says otherwise. I know there's some more subtleties
> here, and the commit message does hint at this. But...
> 
> If we don't want to treat the GART as an IOMMU, shouldn't we somehow
> make sure iommu_present() doesn't return true in these cases (or
> perhaps make something like tegra_use_iommu(), with a comment
> explaining why we don't want to allow the GART to be treated like a
> proper IOMMU)? These seems to be the only Tegra-specific calls to
> iommu_present()...
> 
> That being said, the patch seems to have the right effect...
> 

We don't want to treat the GART as an IOMMU right now, but I'd want to change
that in the (near?) future, so it's a kinda trivial preparation patch that
restores the GART driver and these of_machine_is_compatible() are supposed to go
away later. Probably I should mention this in the commit message(?).

I think we can add a Tegra20 specific IOCTL for an allocation of GART-able
memory and use it for a stuff like opentegra's EXA, its fallback allocations
migration is a pita and reserved CMA is never enough =)

-- 
Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall
       [not found]             ` <3d723511-45f2-3c80-58fe-c9050624ac94-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14 11:47               ` Mikko Perttunen
       [not found]                 ` <bc4d8ea3-0fd7-85ad-be08-bc869d0e89a6-/1wQRMveznE@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Mikko Perttunen @ 2017-06-14 11:47 UTC (permalink / raw)
  To: Dmitry Osipenko, Thierry Reding, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development



On 14.06.2017 12:06, Dmitry Osipenko wrote:
> On 14.06.2017 09:50, Mikko Perttunen wrote:
>> On 14.06.2017 02:15, Dmitry Osipenko wrote:
>>> Incorrectly shifted relocation address will cause a lower memory corruption
>>> and likely a hang on a write or a read of an arbitrary data in case of IOMMU
>>> absent. As of now there is no use for the address shifting (at least on
>>> Tegra20) and adding a proper shifting / sizes validation is much more work.
>>> Let's forbid it in the firewall till a proper validation is implemented.
>>>
>>> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> ---
>>>  drivers/gpu/host1x/dev.c |  4 ++++
>>>  drivers/gpu/host1x/dev.h |  1 +
>>>  drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++++++
>>>  3 files changed, 29 insertions(+)
>>>
>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>>> index 6a805ed908c3..21da331ce092 100644
>>> --- a/drivers/gpu/host1x/dev.c
>>> +++ b/drivers/gpu/host1x/dev.c
>>> @@ -72,6 +72,7 @@ static const struct host1x_info host1x01_info = {
>>>      .init = host1x01_init,
>>>      .sync_offset = 0x3000,
>>>      .dma_mask = DMA_BIT_MASK(32),
>>> +    .version = 1,
>>>  };
>>>
>>>  static const struct host1x_info host1x02_info = {
>>> @@ -82,6 +83,7 @@ static const struct host1x_info host1x02_info = {
>>>      .init = host1x02_init,
>>>      .sync_offset = 0x3000,
>>>      .dma_mask = DMA_BIT_MASK(32),
>>> +    .version = 2,
>>>  };
>>>
>>>  static const struct host1x_info host1x04_info = {
>>> @@ -92,6 +94,7 @@ static const struct host1x_info host1x04_info = {
>>>      .init = host1x04_init,
>>>      .sync_offset = 0x2100,
>>>      .dma_mask = DMA_BIT_MASK(34),
>>> +    .version = 4,
>>>  };
>>>
>>>  static const struct host1x_info host1x05_info = {
>>> @@ -102,6 +105,7 @@ static const struct host1x_info host1x05_info = {
>>>      .init = host1x05_init,
>>>      .sync_offset = 0x2100,
>>>      .dma_mask = DMA_BIT_MASK(34),
>>> +    .version = 5,
>>>  };
>>>
>>>  static const struct of_device_id host1x_of_match[] = {
>>> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
>>> index 229d08b6a45e..c6997651b336 100644
>>> --- a/drivers/gpu/host1x/dev.h
>>> +++ b/drivers/gpu/host1x/dev.h
>>> @@ -100,6 +100,7 @@ struct host1x_info {
>>>      int (*init)(struct host1x *host1x); /* initialize per SoC ops */
>>>      unsigned int sync_offset; /* offset of syncpoint registers */
>>>      u64 dma_mask; /* mask of addressable memory */
>>> +    unsigned int version; /* host1x's version */
>>>  };
>>>
>>>  struct host1x {
>>> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
>>> index 4208329ca2af..7825643da324 100644
>>> --- a/drivers/gpu/host1x/job.c
>>> +++ b/drivers/gpu/host1x/job.c
>>> @@ -333,6 +333,27 @@ static bool check_reloc(struct host1x_reloc *reloc,
>>> struct host1x_bo *cmdbuf,
>>>      return true;
>>>  }
>>>
>>> +static bool check_reloc_shift(struct device *dev, struct host1x_reloc *reloc)
>>> +{
>>> +    struct host1x *host = dev_get_drvdata(dev->parent);
>>> +
>>> +    /* skip newer Tegra's since IOMMU is supposed to be used by
>>> +     * them and not all address registers with their shifts are
>>> +     * publicly documented */
>>> +    if (host->info->version > 1)
>>> +        return true;
>>
>> I don't like the firewall working differently per SoC - IOMMU can still be
>> disabled on later SoCs and in this case there would be a security hole even if
>> the firewall is enabled.
>>
>
> Yes, it would be a security hole like it is is right now. Unfortunately we can't
> fix it on later SoC's without knowing all the address registers and their shifts.

Indeed, we can't fix it, but I was thinking that having the firewall 
enabled in the kernel should always mean that the system is safe, even 
if IOMMU is disabled, and even if it means that you cannot really do 
anything useful with it (at least running host1x_test should be possible 
without shifts, though). Thus no matter what the hardware revision, we 
should reject anything we cannot ensure as safe.

>
>>> +
>>> +    /* skip Tegra30 with IOMMU enabled */
>>> +    if (host->domain)
>>> +        return true;
>>> +
>>
>> So if we want to be able to test the firewall on all SoCs just having this check
>> here for all SoCs would be better.
>>
>
> I'm not sure what you are meaning here. You'll get a wrong address resolved in
> HW if that address needs to be shifted.

So we have two choices here:
a) If IOMMU is enabled, skip the shift check and return true always. 
This means that we still ensure that the system is safe, since with 
IOMMU enabled it is safe even if the firewall cannot ensure that the job 
is proper. This is what I was trying to say above.
b) Always return false if there is a shift, on any SoC and even if IOMMU 
is enabled. This would harmonize the behavior of the firewall across 
systems.

>
> As I wrote before, on later SoC's firewall could be used as a debug feature
> without the shifts being validated.
>
>>> +    /* relocation shift value validation isn't implemented yet */
>>> +    if (reloc->shift)
>>> +        return false;
>>> +
>>> +    return true;
>>> +}
>>> +
>>>  struct host1x_firewall {
>>>      struct host1x_job *job;
>>>      struct device *dev;
>>> @@ -359,6 +380,9 @@ static int check_register(struct host1x_firewall *fw,
>>> unsigned long offset)
>>>          if (!check_reloc(fw->reloc, fw->cmdbuf, fw->offset))
>>>              return -EINVAL;
>>>
>>> +        if (!check_reloc_shift(fw->dev, fw->reloc))
>>> +            return -EINVAL;
>>> +
>>>          fw->num_relocs--;
>>>          fw->reloc++;
>>>      }
>>>
>
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall
       [not found]                 ` <bc4d8ea3-0fd7-85ad-be08-bc869d0e89a6-/1wQRMveznE@public.gmane.org>
@ 2017-06-14 14:49                   ` Dmitry Osipenko
       [not found]                     ` <a18ec3c6-0fa1-4daa-3db4-c81124424cec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-14 14:49 UTC (permalink / raw)
  To: Mikko Perttunen, Thierry Reding, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

On 14.06.2017 14:47, Mikko Perttunen wrote:
> 
> 
> On 14.06.2017 12:06, Dmitry Osipenko wrote:
>> On 14.06.2017 09:50, Mikko Perttunen wrote:
>>> On 14.06.2017 02:15, Dmitry Osipenko wrote:
>>>> Incorrectly shifted relocation address will cause a lower memory corruption
>>>> and likely a hang on a write or a read of an arbitrary data in case of IOMMU
>>>> absent. As of now there is no use for the address shifting (at least on
>>>> Tegra20) and adding a proper shifting / sizes validation is much more work.
>>>> Let's forbid it in the firewall till a proper validation is implemented.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>>> Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>>> ---
>>>>  drivers/gpu/host1x/dev.c |  4 ++++
>>>>  drivers/gpu/host1x/dev.h |  1 +
>>>>  drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++++++
>>>>  3 files changed, 29 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>>>> index 6a805ed908c3..21da331ce092 100644
>>>> --- a/drivers/gpu/host1x/dev.c
>>>> +++ b/drivers/gpu/host1x/dev.c
>>>> @@ -72,6 +72,7 @@ static const struct host1x_info host1x01_info = {
>>>>      .init = host1x01_init,
>>>>      .sync_offset = 0x3000,
>>>>      .dma_mask = DMA_BIT_MASK(32),
>>>> +    .version = 1,
>>>>  };
>>>>
>>>>  static const struct host1x_info host1x02_info = {
>>>> @@ -82,6 +83,7 @@ static const struct host1x_info host1x02_info = {
>>>>      .init = host1x02_init,
>>>>      .sync_offset = 0x3000,
>>>>      .dma_mask = DMA_BIT_MASK(32),
>>>> +    .version = 2,
>>>>  };
>>>>
>>>>  static const struct host1x_info host1x04_info = {
>>>> @@ -92,6 +94,7 @@ static const struct host1x_info host1x04_info = {
>>>>      .init = host1x04_init,
>>>>      .sync_offset = 0x2100,
>>>>      .dma_mask = DMA_BIT_MASK(34),
>>>> +    .version = 4,
>>>>  };
>>>>
>>>>  static const struct host1x_info host1x05_info = {
>>>> @@ -102,6 +105,7 @@ static const struct host1x_info host1x05_info = {
>>>>      .init = host1x05_init,
>>>>      .sync_offset = 0x2100,
>>>>      .dma_mask = DMA_BIT_MASK(34),
>>>> +    .version = 5,
>>>>  };
>>>>
>>>>  static const struct of_device_id host1x_of_match[] = {
>>>> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
>>>> index 229d08b6a45e..c6997651b336 100644
>>>> --- a/drivers/gpu/host1x/dev.h
>>>> +++ b/drivers/gpu/host1x/dev.h
>>>> @@ -100,6 +100,7 @@ struct host1x_info {
>>>>      int (*init)(struct host1x *host1x); /* initialize per SoC ops */
>>>>      unsigned int sync_offset; /* offset of syncpoint registers */
>>>>      u64 dma_mask; /* mask of addressable memory */
>>>> +    unsigned int version; /* host1x's version */
>>>>  };
>>>>
>>>>  struct host1x {
>>>> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
>>>> index 4208329ca2af..7825643da324 100644
>>>> --- a/drivers/gpu/host1x/job.c
>>>> +++ b/drivers/gpu/host1x/job.c
>>>> @@ -333,6 +333,27 @@ static bool check_reloc(struct host1x_reloc *reloc,
>>>> struct host1x_bo *cmdbuf,
>>>>      return true;
>>>>  }
>>>>
>>>> +static bool check_reloc_shift(struct device *dev, struct host1x_reloc *reloc)
>>>> +{
>>>> +    struct host1x *host = dev_get_drvdata(dev->parent);
>>>> +
>>>> +    /* skip newer Tegra's since IOMMU is supposed to be used by
>>>> +     * them and not all address registers with their shifts are
>>>> +     * publicly documented */
>>>> +    if (host->info->version > 1)
>>>> +        return true;
>>>
>>> I don't like the firewall working differently per SoC - IOMMU can still be
>>> disabled on later SoCs and in this case there would be a security hole even if
>>> the firewall is enabled.
>>>
>>
>> Yes, it would be a security hole like it is is right now. Unfortunately we can't
>> fix it on later SoC's without knowing all the address registers and their shifts.
> 
> Indeed, we can't fix it, but I was thinking that having the firewall enabled in
> the kernel should always mean that the system is safe, even if IOMMU is
> disabled, and even if it means that you cannot really do anything useful with it
> (at least running host1x_test should be possible without shifts, though). Thus
> no matter what the hardware revision, we should reject anything we cannot ensure
> as safe.
> 
>>
>>>> +
>>>> +    /* skip Tegra30 with IOMMU enabled */
>>>> +    if (host->domain)
>>>> +        return true;
>>>> +
>>>
>>> So if we want to be able to test the firewall on all SoCs just having this check
>>> here for all SoCs would be better.
>>>
>>
>> I'm not sure what you are meaning here. You'll get a wrong address resolved in
>> HW if that address needs to be shifted.
> 
> So we have two choices here:
> a) If IOMMU is enabled, skip the shift check and return true always. This means
> that we still ensure that the system is safe, since with IOMMU enabled it is
> safe even if the firewall cannot ensure that the job is proper. This is what I
> was trying to say above.
> b) Always return false if there is a shift, on any SoC and even if IOMMU is
> enabled. This would harmonize the behavior of the firewall across systems.

Okay, I think b) is a better choice, actually that's what V1 of the patch did.
Preserving behaviour regardless of the IOMMU state is reasonable and this is an
opensource driver that is supposed to be used with the opensource userspace
after all.

Later I'd want to stricter the firewall checks to disallow any unknown register
accesses, that will make it a bit more secure on a later SoC's as well. After,
we should add a memory access bounds checking for each of HW operations to make
it really secure.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall
       [not found]                     ` <a18ec3c6-0fa1-4daa-3db4-c81124424cec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14 14:59                       ` Mikko Perttunen
  0 siblings, 0 replies; 40+ messages in thread
From: Mikko Perttunen @ 2017-06-14 14:59 UTC (permalink / raw)
  To: Dmitry Osipenko, Thierry Reding, Erik Faye-Lund
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA, DRI Development

On 06/14/2017 05:49 PM, Dmitry Osipenko wrote:
> On 14.06.2017 14:47, Mikko Perttunen wrote:
>>
>>
>> On 14.06.2017 12:06, Dmitry Osipenko wrote:
>>> On 14.06.2017 09:50, Mikko Perttunen wrote:
>>>> On 14.06.2017 02:15, Dmitry Osipenko wrote:
>>>>> Incorrectly shifted relocation address will cause a lower memory corruption
>>>>> and likely a hang on a write or a read of an arbitrary data in case of IOMMU
>>>>> absent. As of now there is no use for the address shifting (at least on
>>>>> Tegra20) and adding a proper shifting / sizes validation is much more work.
>>>>> Let's forbid it in the firewall till a proper validation is implemented.
>>>>>
>>>>> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>>>> Reviewed-by: Erik Faye-Lund <kusmabite-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>>>> ---
>>>>>   drivers/gpu/host1x/dev.c |  4 ++++
>>>>>   drivers/gpu/host1x/dev.h |  1 +
>>>>>   drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++++++
>>>>>   3 files changed, 29 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>>>>> index 6a805ed908c3..21da331ce092 100644
>>>>> --- a/drivers/gpu/host1x/dev.c
>>>>> +++ b/drivers/gpu/host1x/dev.c
>>>>> @@ -72,6 +72,7 @@ static const struct host1x_info host1x01_info = {
>>>>>       .init = host1x01_init,
>>>>>       .sync_offset = 0x3000,
>>>>>       .dma_mask = DMA_BIT_MASK(32),
>>>>> +    .version = 1,
>>>>>   };
>>>>>
>>>>>   static const struct host1x_info host1x02_info = {
>>>>> @@ -82,6 +83,7 @@ static const struct host1x_info host1x02_info = {
>>>>>       .init = host1x02_init,
>>>>>       .sync_offset = 0x3000,
>>>>>       .dma_mask = DMA_BIT_MASK(32),
>>>>> +    .version = 2,
>>>>>   };
>>>>>
>>>>>   static const struct host1x_info host1x04_info = {
>>>>> @@ -92,6 +94,7 @@ static const struct host1x_info host1x04_info = {
>>>>>       .init = host1x04_init,
>>>>>       .sync_offset = 0x2100,
>>>>>       .dma_mask = DMA_BIT_MASK(34),
>>>>> +    .version = 4,
>>>>>   };
>>>>>
>>>>>   static const struct host1x_info host1x05_info = {
>>>>> @@ -102,6 +105,7 @@ static const struct host1x_info host1x05_info = {
>>>>>       .init = host1x05_init,
>>>>>       .sync_offset = 0x2100,
>>>>>       .dma_mask = DMA_BIT_MASK(34),
>>>>> +    .version = 5,
>>>>>   };
>>>>>
>>>>>   static const struct of_device_id host1x_of_match[] = {
>>>>> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
>>>>> index 229d08b6a45e..c6997651b336 100644
>>>>> --- a/drivers/gpu/host1x/dev.h
>>>>> +++ b/drivers/gpu/host1x/dev.h
>>>>> @@ -100,6 +100,7 @@ struct host1x_info {
>>>>>       int (*init)(struct host1x *host1x); /* initialize per SoC ops */
>>>>>       unsigned int sync_offset; /* offset of syncpoint registers */
>>>>>       u64 dma_mask; /* mask of addressable memory */
>>>>> +    unsigned int version; /* host1x's version */
>>>>>   };
>>>>>
>>>>>   struct host1x {
>>>>> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
>>>>> index 4208329ca2af..7825643da324 100644
>>>>> --- a/drivers/gpu/host1x/job.c
>>>>> +++ b/drivers/gpu/host1x/job.c
>>>>> @@ -333,6 +333,27 @@ static bool check_reloc(struct host1x_reloc *reloc,
>>>>> struct host1x_bo *cmdbuf,
>>>>>       return true;
>>>>>   }
>>>>>
>>>>> +static bool check_reloc_shift(struct device *dev, struct host1x_reloc *reloc)
>>>>> +{
>>>>> +    struct host1x *host = dev_get_drvdata(dev->parent);
>>>>> +
>>>>> +    /* skip newer Tegra's since IOMMU is supposed to be used by
>>>>> +     * them and not all address registers with their shifts are
>>>>> +     * publicly documented */
>>>>> +    if (host->info->version > 1)
>>>>> +        return true;
>>>>
>>>> I don't like the firewall working differently per SoC - IOMMU can still be
>>>> disabled on later SoCs and in this case there would be a security hole even if
>>>> the firewall is enabled.
>>>>
>>>
>>> Yes, it would be a security hole like it is is right now. Unfortunately we can't
>>> fix it on later SoC's without knowing all the address registers and their shifts.
>>
>> Indeed, we can't fix it, but I was thinking that having the firewall enabled in
>> the kernel should always mean that the system is safe, even if IOMMU is
>> disabled, and even if it means that you cannot really do anything useful with it
>> (at least running host1x_test should be possible without shifts, though). Thus
>> no matter what the hardware revision, we should reject anything we cannot ensure
>> as safe.
>>
>>>
>>>>> +
>>>>> +    /* skip Tegra30 with IOMMU enabled */
>>>>> +    if (host->domain)
>>>>> +        return true;
>>>>> +
>>>>
>>>> So if we want to be able to test the firewall on all SoCs just having this check
>>>> here for all SoCs would be better.
>>>>
>>>
>>> I'm not sure what you are meaning here. You'll get a wrong address resolved in
>>> HW if that address needs to be shifted.
>>
>> So we have two choices here:
>> a) If IOMMU is enabled, skip the shift check and return true always. This means
>> that we still ensure that the system is safe, since with IOMMU enabled it is
>> safe even if the firewall cannot ensure that the job is proper. This is what I
>> was trying to say above.
>> b) Always return false if there is a shift, on any SoC and even if IOMMU is
>> enabled. This would harmonize the behavior of the firewall across systems.
> 
> Okay, I think b) is a better choice, actually that's what V1 of the patch did.
> Preserving behaviour regardless of the IOMMU state is reasonable and this is an
> opensource driver that is supposed to be used with the opensource userspace
> after all.
> 
> Later I'd want to stricter the firewall checks to disallow any unknown register
> accesses, that will make it a bit more secure on a later SoC's as well. After,
> we should add a memory access bounds checking for each of HW operations to make
> it really secure.
> 

Yep, this sounds good.

Mikko

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 09/22] drm/tegra: Don't use IOMMU on Tegra20
       [not found]             ` <16be88ba-808d-4d3e-533b-7dce4a2765e7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-14 22:24               ` Dmitry Osipenko
  0 siblings, 0 replies; 40+ messages in thread
From: Dmitry Osipenko @ 2017-06-14 22:24 UTC (permalink / raw)
  To: kusmabite-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Thierry Reding, Mikko Perttunen,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	DRI Development, Nicolas Chauvet, Joerg Roedel

On 14.06.2017 13:22, Dmitry Osipenko wrote:
> On 14.06.2017 10:39, Erik Faye-Lund wrote:
>> On Wed, Jun 14, 2017 at 1:15 AM, Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU
>>> provider.
>>>
>>> Signed-off-by: Dmitry Osipenko <digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Acked-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
>>> ---
>>>  drivers/gpu/drm/tegra/drm.c | 3 ++-
>>>  drivers/gpu/host1x/dev.c    | 3 ++-
>>>  2 files changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>>> index e999391aedc9..aa7988dcc28f 100644
>>> --- a/drivers/gpu/drm/tegra/drm.c
>>> +++ b/drivers/gpu/drm/tegra/drm.c
>>> @@ -131,7 +131,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>>         if (!tegra)
>>>                 return -ENOMEM;
>>>
>>> -       if (iommu_present(&platform_bus_type)) {
>>> +       if (iommu_present(&platform_bus_type) &&
>>> +           !of_machine_is_compatible("nvidia,tegra20")) {
>>>                 u64 carveout_start, carveout_end, gem_start, gem_end;
>>>                 struct iommu_domain_geometry *geometry;
>>>                 unsigned long order;
>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>>> index f05ebb14fa63..6a805ed908c3 100644
>>> --- a/drivers/gpu/host1x/dev.c
>>> +++ b/drivers/gpu/host1x/dev.c
>>> @@ -177,7 +177,8 @@ static int host1x_probe(struct platform_device *pdev)
>>>                 return err;
>>>         }
>>>
>>> -       if (iommu_present(&platform_bus_type)) {
>>> +       if (iommu_present(&platform_bus_type) &&
>>> +           !of_machine_is_compatible("nvidia,tegra20")) {
>>>                 struct iommu_domain_geometry *geometry;
>>>                 unsigned long order;
>>>
>>
>> This doesn't feel great... The commit message says there's no IOMMU,
>> but iommu_present says otherwise. I know there's some more subtleties
>> here, and the commit message does hint at this. But...
>>
>> If we don't want to treat the GART as an IOMMU, shouldn't we somehow
>> make sure iommu_present() doesn't return true in these cases (or
>> perhaps make something like tegra_use_iommu(), with a comment
>> explaining why we don't want to allow the GART to be treated like a
>> proper IOMMU)? These seems to be the only Tegra-specific calls to
>> iommu_present()...
>>
>> That being said, the patch seems to have the right effect...
>>
> 
> We don't want to treat the GART as an IOMMU right now, but I'd want to change
> that in the (near?) future, so it's a kinda trivial preparation patch that
> restores the GART driver and these of_machine_is_compatible() are supposed to go
> away later. Probably I should mention this in the commit message(?).
> 
> I think we can add a Tegra20 specific IOCTL for an allocation of GART-able
> memory and use it for a stuff like opentegra's EXA, its fallback allocations
> migration is a pita and reserved CMA is never enough =)
> 

I have hacked the DRM driver to work with GART correctly for the IOMMU
allocations, the grate tests and opentegra are working fine utilizing the GART,
CMA is barely utilized and kmsg shows GART's map/unmap debug messages that I've
enabled in the driver. So I think GART should be really useful for us.

Actually, now I'm thinking that it should be better to postpone the restoring of
the GART driver till we have a full solution, maybe next merge window. I'll drop
these two patches from the series in v3 that I'm going to send shortly.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2017-06-14 22:24 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-13 23:15 [PATCH v2 00/22] Tegra DRM fixes Dmitry Osipenko
     [not found] ` <cover.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-13 23:15   ` [PATCH v2 01/22] drm/tegra: Fix lockup on a use of staging API Dmitry Osipenko
     [not found]     ` <d2616b82a1728d0953e84489cde40364999719f0.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  7:09       ` Erik Faye-Lund
2017-06-13 23:15   ` [PATCH v2 02/22] drm/tegra: Correct idr_alloc() minimum id Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 03/22] drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 04/22] drm/tegra: Correct copying of waitchecks and disable them " Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 05/22] drm/tegra: Check syncpoint ID " Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 06/22] drm/tegra: dc: Avoid reset asserts on Tegra20 Dmitry Osipenko
     [not found]     ` <1c7b0c7bf713401073d648f9f228fa152c094a5a.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  7:13       ` Erik Faye-Lund
2017-06-13 23:15   ` [PATCH v2 07/22] drm/tegra: dc: Apply clipping to the plane Dmitry Osipenko
     [not found]     ` <7adbe0c6e67115fa4fd201026533b7cb97fc44d2.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  7:20       ` Erik Faye-Lund
2017-06-13 23:15   ` [PATCH v2 08/22] drm/tegra: dc: Disable plane if it is invisible Dmitry Osipenko
2017-06-14  7:21     ` Erik Faye-Lund
2017-06-13 23:15   ` [PATCH v2 09/22] drm/tegra: Don't use IOMMU on Tegra20 Dmitry Osipenko
     [not found]     ` <d707bd34e268f9d8fb29bf510a9f17e5f943a635.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  7:39       ` Erik Faye-Lund
     [not found]         ` <CABPQNSbMQMrahcvWnczywLDYVU0=Ns=FB6Yseisk-5L_8Y=rpQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-14 10:22           ` Dmitry Osipenko
     [not found]             ` <16be88ba-808d-4d3e-533b-7dce4a2765e7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14 22:24               ` Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 10/22] Revert "iommu/tegra: gart: Do not register with bus" Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 11/22] gpu: host1x: Initialize firewall class to the jobs one Dmitry Osipenko
     [not found]     ` <8c0a2417584ce51b716f2c286027bcb18fceda56.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  7:41       ` Erik Faye-Lund
2017-06-13 23:15   ` [PATCH v2 12/22] gpu: host1x: Correct host1x_job_pin() error handling Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 13/22] gpu: host1x: Do not leak BO's phys address to userspace Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 14/22] gpu: host1x: Forbid relocation address shifting in the firewall Dmitry Osipenko
     [not found]     ` <5e6f06d3e57db40c1c4947282ef7336369c3360f.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  6:50       ` Mikko Perttunen
     [not found]         ` <6db57dae-0de7-45a5-2ed5-1ebd84577742-/1wQRMveznE@public.gmane.org>
2017-06-14  9:06           ` Dmitry Osipenko
     [not found]             ` <3d723511-45f2-3c80-58fe-c9050624ac94-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14 11:47               ` Mikko Perttunen
     [not found]                 ` <bc4d8ea3-0fd7-85ad-be08-bc869d0e89a6-/1wQRMveznE@public.gmane.org>
2017-06-14 14:49                   ` Dmitry Osipenko
     [not found]                     ` <a18ec3c6-0fa1-4daa-3db4-c81124424cec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14 14:59                       ` Mikko Perttunen
2017-06-13 23:15   ` [PATCH v2 15/22] gpu: host1x: Forbid RESTART opcode " Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 16/22] gpu: host1x: Forbid unrelated SETCLASS " Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 17/22] gpu: host1x: Correct swapped arguments in the is_addr_reg() definition Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 18/22] gpu: host1x: Check waits in the firewall Dmitry Osipenko
     [not found]     ` <bd57829e74c6554c46ab761d013e6d9d3c39b693.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  7:47       ` Erik Faye-Lund
2017-06-13 23:15   ` [PATCH v2 19/22] gpu: host1x: Remove unused 'struct host1x_cmdbuf' Dmitry Osipenko
2017-06-13 23:15   ` [PATCH v2 20/22] gpu: host1x: Remove unused host1x_cdma_stop() definition Dmitry Osipenko
2017-06-13 23:16   ` [PATCH v2 21/22] gpu: host1x: Refactor channel allocation code Dmitry Osipenko
2017-06-13 23:16   ` [PATCH v2 22/22] gpu: host1x: At first try a non-blocking allocation for the gather copy Dmitry Osipenko
     [not found]     ` <036615566011ab4d52190439a12d943df9e704ef.1497394243.git.digetx-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-14  7:56       ` Erik Faye-Lund
     [not found]         ` <CABPQNSZ8f8QqW4fJYCGfgf3q0pTyQ7NbS_=64x39kvrNDSyhdw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-06-14  8:32           ` Dmitry Osipenko
2017-06-14  8:44             ` Erik Faye-Lund

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).