* [PATCH 0/8] Host1x IOMMU support + VIC support
@ 2016-11-10 18:23 Mikko Perttunen
2016-11-10 18:23 ` [PATCH 1/8] drm/tegra: Add Tegra DRM allocation API Mikko Perttunen
` (5 more replies)
0 siblings, 6 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding; +Cc: linux-tegra, dri-devel, Mikko Perttunen
This series adds IOMMU support to Host1x and TegraDRM
and adds support for the VIC host1x client so that
host1x can be tested on modern Tegra platforms.
It depends on the previous fix series. The whole thing
(modulo patch order) is available as a git repository at
git://github.com/cyndis/linux.git; branch vic-v1.
IO memory management is organized such that there are
two domains: the host1x domain and the tegradrm domain.
The host1x domain is used by the host1x engine and
contains the host1x CDMA and pushbuffers for submitted
jobs.
The tegradrm domain is shared by all host1x units and
contains GEM objects and memory allocated by the
separate tegra_drm_alloc function. This function is
currently used to allocate space for firmware blobs
in the tegradrm domain.
A userspace test case for VIC can be found at
https://github.com/cyndis/drm/tree/work/tegra.
The testcase is in tests/tegra and is called submit_vic.
The in-kernel firewall is not implemented for VIC;
therefore, IOMMU must be enabled for the test to pass.
Tested with Jetson TX1 (T210). Probably works also
with Jetson TK1 (T124). Note that due to hardware changes
the testcase also needs to be changed to run properly
on T124.
Thanks,
Mikko.
Arto Merilainen (2):
drm/tegra: Add falcon helper library
drm/tegra: Add VIC support
Mikko Perttunen (6):
drm/tegra: Add Tegra DRM allocation API
drm/tegra: Allocate BOs from lower 4G when without IOMMU
gpu: host1x: Add IOMMU support
dt-bindings: Add bindings for the Tegra VIC
arm64: tegra: Enable VIC on T210
arm64: tegra: Enable IOMMU for Host1x on Tegra210
.../display/tegra/nvidia,tegra20-host1x.txt | 13 +
arch/arm64/boot/dts/nvidia/tegra210.dtsi | 19 +-
drivers/gpu/drm/tegra/Makefile | 4 +-
drivers/gpu/drm/tegra/drm.c | 101 +++++-
drivers/gpu/drm/tegra/drm.h | 8 +
drivers/gpu/drm/tegra/falcon.c | 256 +++++++++++++
drivers/gpu/drm/tegra/falcon.h | 130 +++++++
drivers/gpu/drm/tegra/gem.c | 2 +-
drivers/gpu/drm/tegra/vic.c | 400 +++++++++++++++++++++
drivers/gpu/drm/tegra/vic.h | 31 ++
drivers/gpu/host1x/cdma.c | 65 +++-
drivers/gpu/host1x/cdma.h | 4 +-
drivers/gpu/host1x/dev.c | 39 +-
drivers/gpu/host1x/dev.h | 5 +
drivers/gpu/host1x/hw/cdma_hw.c | 10 +-
drivers/gpu/host1x/job.c | 76 +++-
drivers/gpu/host1x/job.h | 1 +
include/linux/host1x.h | 1 +
18 files changed, 1128 insertions(+), 37 deletions(-)
create mode 100644 drivers/gpu/drm/tegra/falcon.c
create mode 100644 drivers/gpu/drm/tegra/falcon.h
create mode 100644 drivers/gpu/drm/tegra/vic.c
create mode 100644 drivers/gpu/drm/tegra/vic.h
--
2.10.2
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/8] drm/tegra: Add Tegra DRM allocation API
2016-11-10 18:23 [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
@ 2016-11-10 18:23 ` Mikko Perttunen
[not found] ` <20161110182345.31777-2-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2016-11-10 18:23 ` [PATCH 2/8] drm/tegra: Allocate BOs from lower 4G when without IOMMU Mikko Perttunen
` (4 subsequent siblings)
5 siblings, 1 reply; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding; +Cc: linux-tegra, dri-devel, Mikko Perttunen
Add a new IO virtual memory allocation API to allow clients to
allocate non-GEM memory in the Tegra DRM IOMMU domain. This is
required e.g. for loading client firmware when clients are attached
to the IOMMU domain.
The allocator allocates contiguous physical pages that are then
mapped contiguously to the IOMMU domain using the iova_domain
library provided by the kernel. Contiguous physical pages are
used so that the same allocator works also when IOMMU support
is disabled and therefore devices access physical memory directly.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
drivers/gpu/drm/tegra/drm.c | 98 ++++++++++++++++++++++++++++++++++++++++++---
drivers/gpu/drm/tegra/drm.h | 7 ++++
2 files changed, 100 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index b8be3ee..ecfe7ba 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -1,12 +1,13 @@
/*
* Copyright (C) 2012 Avionic Design GmbH
- * Copyright (C) 2012-2013 NVIDIA CORPORATION. All rights reserved.
+ * Copyright (C) 2012-2015 NVIDIA CORPORATION. All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
+#include <linux/bitops.h>
#include <linux/host1x.h>
#include <linux/iommu.h>
@@ -23,6 +24,8 @@
#define DRIVER_MINOR 0
#define DRIVER_PATCHLEVEL 0
+#define IOVA_AREA_SZ (1024 * 1024 * 64) /* 64 MiB */
+
struct tegra_drm_file {
struct list_head contexts;
};
@@ -127,7 +130,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
if (iommu_present(&platform_bus_type)) {
struct iommu_domain_geometry *geometry;
- u64 start, end;
+ unsigned long order;
+ u64 iova_start, start, end;
tegra->domain = iommu_domain_alloc(&platform_bus_type);
if (!tegra->domain) {
@@ -138,10 +142,17 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
geometry = &tegra->domain->geometry;
start = geometry->aperture_start;
end = geometry->aperture_end;
+ iova_start = end - IOVA_AREA_SZ + 1;
+
+ order = __ffs(tegra->domain->pgsize_bitmap);
+ init_iova_domain(&tegra->iova, 1UL << order,
+ iova_start >> order,
+ end >> order);
- DRM_DEBUG_DRIVER("IOMMU aperture initialized (%#llx-%#llx)\n",
- start, end);
- drm_mm_init(&tegra->mm, start, end - start + 1);
+ drm_mm_init(&tegra->mm, start, iova_start - start);
+
+ DRM_DEBUG("IOMMU apertures initialized (GEM: %#llx-%#llx, IOVA: %#llx-%#llx)\n",
+ start, iova_start-1, iova_start, end);
}
mutex_init(&tegra->clients_lock);
@@ -208,6 +219,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
if (tegra->domain) {
iommu_domain_free(tegra->domain);
drm_mm_takedown(&tegra->mm);
+ put_iova_domain(&tegra->iova);
}
free:
kfree(tegra);
@@ -232,6 +244,7 @@ static int tegra_drm_unload(struct drm_device *drm)
if (tegra->domain) {
iommu_domain_free(tegra->domain);
drm_mm_takedown(&tegra->mm);
+ put_iova_domain(&tegra->iova);
}
kfree(tegra);
@@ -975,6 +988,81 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
return 0;
}
+void *tegra_drm_alloc(struct tegra_drm *tegra, size_t size,
+ dma_addr_t *iova)
+{
+ struct iova *alloc;
+ unsigned long shift;
+ void *virt;
+ gfp_t gfp;
+ int err;
+
+ if (tegra->domain)
+ size = iova_align(&tegra->iova, size);
+ else
+ size = PAGE_ALIGN(size);
+
+ gfp = GFP_KERNEL | __GFP_ZERO;
+ if (!tegra->domain) {
+ /*
+ * Tegra210 has 64-bit physical addresses but units only support
+ * 32-bit addresses, so if we don't have an IOMMU to do
+ * translation, force allocations to be in the 32-bit range.
+ */
+ gfp |= GFP_DMA;
+ }
+
+ virt = (void *)__get_free_pages(gfp, get_order(size));
+ if (!virt)
+ return NULL;
+
+ if (!tegra->domain) {
+ /*
+ * If IOMMU is disabled, devices address physical memory
+ * directly.
+ */
+ *iova = virt_to_phys(virt);
+ return virt;
+ }
+
+ shift = iova_shift(&tegra->iova);
+ alloc = alloc_iova(&tegra->iova, size >> shift,
+ tegra->domain->geometry.aperture_end >> shift, true);
+ if (!alloc)
+ goto free_pages;
+
+ *iova = iova_dma_addr(&tegra->iova, alloc);
+ err = iommu_map(tegra->domain, *iova, virt_to_phys(virt),
+ size, IOMMU_READ | IOMMU_WRITE);
+ if (err < 0)
+ goto free_iova;
+
+ return virt;
+
+free_iova:
+ __free_iova(&tegra->iova, alloc);
+free_pages:
+ free_pages((unsigned long)virt, get_order(size));
+
+ return NULL;
+}
+
+void tegra_drm_free(struct tegra_drm *tegra, size_t size, void *virt,
+ dma_addr_t iova)
+{
+ if (tegra->domain)
+ size = iova_align(&tegra->iova, size);
+ else
+ size = PAGE_ALIGN(size);
+
+ if (tegra->domain) {
+ iommu_unmap(tegra->domain, iova, size);
+ free_iova(&tegra->iova, iova_pfn(&tegra->iova, iova));
+ }
+
+ free_pages((unsigned long)virt, get_order(size));
+}
+
static int host1x_drm_probe(struct host1x_device *dev)
{
struct drm_driver *driver = &tegra_drm_driver;
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 0ddcce1..f75b505 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -12,6 +12,7 @@
#include <uapi/drm/tegra_drm.h>
#include <linux/host1x.h>
+#include <linux/iova.h>
#include <linux/of_gpio.h>
#include <drm/drmP.h>
@@ -43,6 +44,8 @@ struct tegra_drm {
struct iommu_domain *domain;
struct drm_mm mm;
+ struct iova_domain iova;
+
struct mutex clients_lock;
struct list_head clients;
@@ -104,6 +107,10 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
int tegra_drm_init(struct tegra_drm *tegra, struct drm_device *drm);
int tegra_drm_exit(struct tegra_drm *tegra);
+void *tegra_drm_alloc(struct tegra_drm *tegra, size_t size, dma_addr_t *iova);
+void tegra_drm_free(struct tegra_drm *tegra, size_t size, void *virt,
+ dma_addr_t iova);
+
struct tegra_dc_soc_info;
struct tegra_output;
--
2.10.2
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 2/8] drm/tegra: Allocate BOs from lower 4G when without IOMMU
2016-11-10 18:23 [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
2016-11-10 18:23 ` [PATCH 1/8] drm/tegra: Add Tegra DRM allocation API Mikko Perttunen
@ 2016-11-10 18:23 ` Mikko Perttunen
[not found] ` <20161110182345.31777-3-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
` (3 subsequent siblings)
5 siblings, 1 reply; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding; +Cc: linux-tegra, dri-devel, Mikko Perttunen
On 64-bit Tegras, buffer object memory allocation may
return memory above 4G that units behind Host1x cannot
access. Add the GFP_DMA flag to these allocation when
IOMMU is not enabled to ensure units can always access
BO memory.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
drivers/gpu/drm/tegra/gem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 19bf9cd..e36e6c5 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -233,7 +233,7 @@ static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo)
size_t size = bo->gem.size;
bo->vaddr = dma_alloc_wc(drm->dev, size, &bo->paddr,
- GFP_KERNEL | __GFP_NOWARN);
+ GFP_KERNEL | GFP_DMA | __GFP_NOWARN);
if (!bo->vaddr) {
dev_err(drm->dev,
"failed to allocate buffer of size %zu\n",
--
2.10.2
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 3/8] drm/tegra: Add falcon helper library
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2016-11-10 18:23 ` Mikko Perttunen
2016-12-05 19:13 ` Thierry Reding
2016-11-10 18:23 ` [PATCH 4/8] drm/tegra: Add VIC support Mikko Perttunen
` (3 subsequent siblings)
4 siblings, 1 reply; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding-Re5JQEeQqe8AvxtiuMwx3w
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c,
Arto Merilainen, Andrew Chew, Mikko Perttunen
From: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Add a set of falcon helper routines for use by the tegradrm client drivers
of the various falcon-based engines.
The falcon is a microcontroller that acts as a frontend for the rest of a
particular Tegra engine. In order to properly utilize these engines, the
frontend must be booted before pushing any commands.
Based on work by Andrew Chew <achew-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Andrew Chew <achew-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
drivers/gpu/drm/tegra/Makefile | 3 +-
drivers/gpu/drm/tegra/falcon.c | 256 +++++++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/tegra/falcon.h | 130 +++++++++++++++++++++
3 files changed, 388 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/tegra/falcon.c
create mode 100644 drivers/gpu/drm/tegra/falcon.h
diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index 2c66a8d..93e9a4a 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -13,6 +13,7 @@ tegra-drm-y := \
sor.o \
dpaux.o \
gr2d.o \
- gr3d.o
+ gr3d.o \
+ falcon.o
obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o
diff --git a/drivers/gpu/drm/tegra/falcon.c b/drivers/gpu/drm/tegra/falcon.c
new file mode 100644
index 0000000..180b2fd
--- /dev/null
+++ b/drivers/gpu/drm/tegra/falcon.c
@@ -0,0 +1,256 @@
+/*
+ * Copyright (c) 2015, NVIDIA Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/platform_device.h>
+#include <linux/dma-mapping.h>
+#include <linux/firmware.h>
+#include <linux/pci_ids.h>
+#include <linux/iopoll.h>
+
+#include "falcon.h"
+#include "drm.h"
+
+#define FALCON_IDLE_TIMEOUT_US 100000
+#define FALCON_IDLE_CHECK_PERIOD_US 10
+
+enum falcon_memory {
+ FALCON_MEMORY_IMEM,
+ FALCON_MEMORY_DATA,
+};
+
+static void falcon_writel(struct falcon *falcon, u32 value, u32 offset)
+{
+ writel(value, falcon->regs + offset);
+}
+
+int falcon_wait_idle(struct falcon *falcon)
+{
+ u32 idlestate;
+
+ return readl_poll_timeout(falcon->regs + FALCON_IDLESTATE, idlestate,
+ (!idlestate),
+ FALCON_IDLE_CHECK_PERIOD_US,
+ FALCON_IDLE_TIMEOUT_US);
+}
+
+static int falcon_dma_wait_idle(struct falcon *falcon)
+{
+ u32 dmatrfcmd;
+
+ return readl_poll_timeout(falcon->regs + FALCON_DMATRFCMD, dmatrfcmd,
+ (dmatrfcmd & FALCON_DMATRFCMD_IDLE),
+ FALCON_IDLE_CHECK_PERIOD_US,
+ FALCON_IDLE_TIMEOUT_US);
+}
+
+static int falcon_copy_chunk(struct falcon *falcon,
+ phys_addr_t base,
+ unsigned long offset,
+ enum falcon_memory target)
+{
+ u32 cmd = FALCON_DMATRFCMD_SIZE_256B;
+
+ if (target == FALCON_MEMORY_IMEM)
+ cmd |= FALCON_DMATRFCMD_IMEM;
+
+ falcon_writel(falcon, offset, FALCON_DMATRFMOFFS);
+ falcon_writel(falcon, base, FALCON_DMATRFFBOFFS);
+ falcon_writel(falcon, cmd, FALCON_DMATRFCMD);
+
+ return falcon_dma_wait_idle(falcon);
+}
+
+static void falcon_copy_firmware_image(struct falcon *falcon,
+ const struct firmware *firmware)
+{
+ u32 *firmware_vaddr = falcon->firmware.vaddr;
+ size_t i;
+
+ /* copy the whole thing taking into account endianness */
+ for (i = 0; i < firmware->size / sizeof(u32); i++)
+ firmware_vaddr[i] = le32_to_cpu(((u32 *)firmware->data)[i]);
+
+ /* ensure that caches are flushed and falcon can see the firmware */
+ dma_sync_single_for_device(falcon->dev, virt_to_phys(firmware_vaddr),
+ falcon->firmware.size, DMA_TO_DEVICE);
+}
+
+static int falcon_parse_firmware_image(struct falcon *falcon)
+{
+ struct falcon_firmware_bin_header_v1 *bin_header =
+ (void *)falcon->firmware.vaddr;
+ struct falcon_firmware_os_header_v1 *os_header;
+
+ /* endian problems would show up right here */
+ if (bin_header->bin_magic != PCI_VENDOR_ID_NVIDIA) {
+ dev_err(falcon->dev, "failed to get firmware magic");
+ return -EINVAL;
+ }
+
+ /* currently only version 1 is supported */
+ if (bin_header->bin_ver != 1) {
+ dev_err(falcon->dev, "unsupported firmware version");
+ return -EINVAL;
+ }
+
+ /* check that the firmware size is consistent */
+ if (bin_header->bin_size > falcon->firmware.size) {
+ dev_err(falcon->dev, "firmware image size inconsistency");
+ return -EINVAL;
+ }
+
+ os_header = (falcon->firmware.vaddr +
+ bin_header->os_bin_header_offset);
+
+ falcon->firmware.bin_data.size = bin_header->os_bin_size;
+ falcon->firmware.bin_data.offset = bin_header->os_bin_data_offset;
+ falcon->firmware.code.offset = os_header->os_code_offset;
+ falcon->firmware.code.size = os_header->os_code_size;
+ falcon->firmware.data.offset = os_header->os_data_offset;
+ falcon->firmware.data.size = os_header->os_data_size;
+
+ return 0;
+}
+
+int falcon_read_firmware(struct falcon *falcon, const char *firmware_name)
+{
+ const struct firmware *firmware;
+ int err;
+
+ if (falcon->firmware.valid)
+ return 0;
+
+ err = request_firmware(&firmware, firmware_name, falcon->dev);
+ if (err < 0) {
+ dev_err(falcon->dev, "failed to get firmware\n");
+ return err;
+ }
+
+ falcon->firmware.size = firmware->size;
+
+ /* allocate iova space for the firmware */
+ falcon->firmware.vaddr = falcon->ops->alloc(falcon, firmware->size,
+ &falcon->firmware.paddr);
+ if (!falcon->firmware.vaddr) {
+ dev_err(falcon->dev, "dma memory mapping failed");
+ err = -ENOMEM;
+ goto err_alloc_firmware;
+ }
+
+ /* copy firmware image into local area. this also ensures endianness */
+ falcon_copy_firmware_image(falcon, firmware);
+
+ /* parse the image data */
+ err = falcon_parse_firmware_image(falcon);
+ if (err < 0) {
+ dev_err(falcon->dev, "failed to parse firmware image\n");
+ goto err_setup_firmware_image;
+ }
+
+ falcon->firmware.valid = true;
+
+ release_firmware(firmware);
+
+ return 0;
+
+err_setup_firmware_image:
+ falcon->ops->free(falcon, falcon->firmware.size,
+ falcon->firmware.paddr, falcon->firmware.vaddr);
+err_alloc_firmware:
+ release_firmware(firmware);
+
+ return err;
+}
+
+int falcon_init(struct falcon *falcon)
+{
+ /* check mandatory ops */
+ if (!falcon->ops || !falcon->ops->alloc || !falcon->ops->free)
+ return -EINVAL;
+
+ falcon->firmware.valid = false;
+
+ return 0;
+}
+
+void falcon_exit(struct falcon *falcon)
+{
+ if (!falcon->firmware.valid)
+ return;
+
+ falcon->ops->free(falcon, falcon->firmware.size,
+ falcon->firmware.paddr,
+ falcon->firmware.vaddr);
+ falcon->firmware.valid = false;
+}
+
+int falcon_boot(struct falcon *falcon)
+{
+ unsigned long offset;
+ int err = 0;
+
+ if (!falcon->firmware.valid)
+ return -EINVAL;
+
+ falcon_writel(falcon, 0, FALCON_DMACTL);
+
+ /* setup the address of the binary data. Falcon can access it later */
+ falcon_writel(falcon, (falcon->firmware.paddr +
+ falcon->firmware.bin_data.offset) >> 8,
+ FALCON_DMATRFBASE);
+
+ /* copy the data segment into Falcon internal memory */
+ for (offset = 0; offset < falcon->firmware.data.size; offset += 256)
+ falcon_copy_chunk(falcon,
+ falcon->firmware.data.offset + offset,
+ offset, FALCON_MEMORY_DATA);
+
+ /* copy the first code segment into Falcon internal memory */
+ falcon_copy_chunk(falcon, falcon->firmware.code.offset,
+ 0, FALCON_MEMORY_IMEM);
+
+ /* setup falcon interrupts */
+ falcon_writel(falcon, FALCON_IRQMSET_EXT(0xff) |
+ FALCON_IRQMSET_SWGEN1 |
+ FALCON_IRQMSET_SWGEN0 |
+ FALCON_IRQMSET_EXTERR |
+ FALCON_IRQMSET_HALT |
+ FALCON_IRQMSET_WDTMR,
+ FALCON_IRQMSET);
+ falcon_writel(falcon, FALCON_IRQDEST_EXT(0xff) |
+ FALCON_IRQDEST_SWGEN1 |
+ FALCON_IRQDEST_SWGEN0 |
+ FALCON_IRQDEST_EXTERR |
+ FALCON_IRQDEST_HALT,
+ FALCON_IRQDEST);
+
+ /* enable interface */
+ falcon_writel(falcon, FALCON_ITFEN_MTHDEN |
+ FALCON_ITFEN_CTXEN,
+ FALCON_ITFEN);
+
+ /* boot falcon */
+ falcon_writel(falcon, 0x00000000, FALCON_BOOTVEC);
+ falcon_writel(falcon, FALCON_CPUCTL_STARTCPU, FALCON_CPUCTL);
+
+ err = falcon_wait_idle(falcon);
+ if (err < 0) {
+ dev_err(falcon->dev, "falcon boot failed due to timeout");
+ return err;
+ }
+
+ dev_info(falcon->dev, "falcon booted");
+
+ return 0;
+}
+
+void falcon_execute_method(struct falcon *falcon, u32 method, u32 data)
+{
+ falcon_writel(falcon, method >> 2, FALCON_UCLASS_METHOD_OFFSET);
+ falcon_writel(falcon, data, FALCON_UCLASS_METHOD_DATA);
+}
diff --git a/drivers/gpu/drm/tegra/falcon.h b/drivers/gpu/drm/tegra/falcon.h
new file mode 100644
index 0000000..56284b9
--- /dev/null
+++ b/drivers/gpu/drm/tegra/falcon.h
@@ -0,0 +1,130 @@
+/*
+ * Copyright (c) 2015, NVIDIA Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _FALCON_H_
+#define _FALCON_H_
+
+#include <linux/types.h>
+
+#define FALCON_UCLASS_METHOD_OFFSET 0x00000040
+
+#define FALCON_UCLASS_METHOD_DATA 0x00000044
+
+#define FALCON_IRQMSET 0x00001010
+#define FALCON_IRQMSET_WDTMR (1 << 1)
+#define FALCON_IRQMSET_HALT (1 << 4)
+#define FALCON_IRQMSET_EXTERR (1 << 5)
+#define FALCON_IRQMSET_SWGEN0 (1 << 6)
+#define FALCON_IRQMSET_SWGEN1 (1 << 7)
+#define FALCON_IRQMSET_EXT(v) (((v) & 0xff) << 8)
+
+#define FALCON_IRQDEST 0x0000101c
+#define FALCON_IRQDEST_HALT (1 << 4)
+#define FALCON_IRQDEST_EXTERR (1 << 5)
+#define FALCON_IRQDEST_SWGEN0 (1 << 6)
+#define FALCON_IRQDEST_SWGEN1 (1 << 7)
+#define FALCON_IRQDEST_EXT(v) (((v) & 0xff) << 8)
+
+#define FALCON_ITFEN 0x00001048
+#define FALCON_ITFEN_CTXEN (1 << 0)
+#define FALCON_ITFEN_MTHDEN (1 << 1)
+
+#define FALCON_IDLESTATE 0x0000104c
+
+#define FALCON_CPUCTL 0x00001100
+#define FALCON_CPUCTL_STARTCPU (1 << 1)
+
+#define FALCON_BOOTVEC 0x00001104
+
+#define FALCON_DMACTL 0x0000110c
+#define FALCON_DMACTL_DMEM_SCRUBBING (1 << 1)
+#define FALCON_DMACTL_IMEM_SCRUBBING (1 << 2)
+
+#define FALCON_DMATRFBASE 0x00001110
+
+#define FALCON_DMATRFMOFFS 0x00001114
+
+#define FALCON_DMATRFCMD 0x00001118
+#define FALCON_DMATRFCMD_IDLE (1 << 1)
+#define FALCON_DMATRFCMD_IMEM (1 << 4)
+#define FALCON_DMATRFCMD_SIZE_256B (6 << 8)
+
+#define FALCON_DMATRFFBOFFS 0x0000111c
+
+struct falcon_firmware_bin_header_v1 {
+ u32 bin_magic; /* 0x10de */
+ u32 bin_ver; /* cya, versioning of bin format (1) */
+ u32 bin_size; /* entire image size including this header */
+ u32 os_bin_header_offset;
+ u32 os_bin_data_offset;
+ u32 os_bin_size;
+};
+
+struct falcon_firmware_os_app_v1 {
+ u32 offset;
+ u32 size;
+};
+
+struct falcon_firmware_os_header_v1 {
+ u32 os_code_offset;
+ u32 os_code_size;
+ u32 os_data_offset;
+ u32 os_data_size;
+ u32 num_apps;
+ struct falcon_firmware_os_app_v1 *app_code;
+ struct falcon_firmware_os_app_v1 *app_data;
+ u32 *os_ovl_offset;
+ u32 *os_ovl_size;
+};
+
+struct falcon;
+
+struct falcon_ops {
+ void *(*alloc)(struct falcon *falcon, size_t size,
+ dma_addr_t *paddr);
+ void (*free)(struct falcon *falcon, size_t size,
+ dma_addr_t paddr, void *vaddr);
+};
+
+struct falcon_firmware_section {
+ unsigned long offset;
+ size_t size;
+};
+
+struct falcon_firmware {
+ /* Raw firmware data */
+ dma_addr_t paddr;
+ void *vaddr;
+ size_t size;
+
+ /* Parsed firmware information */
+ struct falcon_firmware_section bin_data;
+ struct falcon_firmware_section data;
+ struct falcon_firmware_section code;
+
+ bool valid;
+};
+
+struct falcon {
+ /* Set by falcon client */
+ struct device *dev;
+ void __iomem *regs;
+ const struct falcon_ops *ops;
+ void *data;
+
+ struct falcon_firmware firmware;
+};
+
+int falcon_init(struct falcon *falcon);
+void falcon_exit(struct falcon *falcon);
+int falcon_read_firmware(struct falcon *falcon, const char *firmware_name);
+int falcon_boot(struct falcon *falcon);
+void falcon_execute_method(struct falcon *falcon, u32 method, u32 data);
+int falcon_wait_idle(struct falcon *falcon);
+
+#endif /* _FALCON_H_ */
--
2.10.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 4/8] drm/tegra: Add VIC support
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2016-11-10 18:23 ` [PATCH 3/8] drm/tegra: Add falcon helper library Mikko Perttunen
@ 2016-11-10 18:23 ` Mikko Perttunen
2016-12-05 19:35 ` Thierry Reding
2016-11-10 18:23 ` [PATCH 7/8] arm64: tegra: Enable VIC on T210 Mikko Perttunen
` (2 subsequent siblings)
4 siblings, 1 reply; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding-Re5JQEeQqe8AvxtiuMwx3w
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c,
Arto Merilainen, Andrew Chew, Mikko Perttunen
From: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
This patch adds support for Video Image Compositor engine which
can be used for 2d operations.
Signed-off-by: Andrew Chew <achew-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
drivers/gpu/drm/tegra/Makefile | 3 +-
drivers/gpu/drm/tegra/drm.c | 3 +
drivers/gpu/drm/tegra/drm.h | 1 +
drivers/gpu/drm/tegra/vic.c | 400 +++++++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/tegra/vic.h | 31 ++++
include/linux/host1x.h | 1 +
6 files changed, 438 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/tegra/vic.c
create mode 100644 drivers/gpu/drm/tegra/vic.h
diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index 93e9a4a..6af3a9a 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -14,6 +14,7 @@ tegra-drm-y := \
dpaux.o \
gr2d.o \
gr3d.o \
- falcon.o
+ falcon.o \
+ vic.o
obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index ecfe7ba..707404d 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -1151,11 +1151,13 @@ static const struct of_device_id host1x_drm_subdevs[] = {
{ .compatible = "nvidia,tegra124-sor", },
{ .compatible = "nvidia,tegra124-hdmi", },
{ .compatible = "nvidia,tegra124-dsi", },
+ { .compatible = "nvidia,tegra124-vic", },
{ .compatible = "nvidia,tegra132-dsi", },
{ .compatible = "nvidia,tegra210-dc", },
{ .compatible = "nvidia,tegra210-dsi", },
{ .compatible = "nvidia,tegra210-sor", },
{ .compatible = "nvidia,tegra210-sor1", },
+ { .compatible = "nvidia,tegra210-vic", },
{ /* sentinel */ }
};
@@ -1177,6 +1179,7 @@ static struct platform_driver * const drivers[] = {
&tegra_sor_driver,
&tegra_gr2d_driver,
&tegra_gr3d_driver,
+ &tegra_vic_driver,
};
static int __init host1x_drm_init(void)
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index f75b505..8efaaa4 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -292,5 +292,6 @@ extern struct platform_driver tegra_dpaux_driver;
extern struct platform_driver tegra_sor_driver;
extern struct platform_driver tegra_gr2d_driver;
extern struct platform_driver tegra_gr3d_driver;
+extern struct platform_driver tegra_vic_driver;
#endif /* HOST1X_DRM_H */
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
new file mode 100644
index 0000000..9eda5e5
--- /dev/null
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -0,0 +1,400 @@
+/*
+ * Copyright (c) 2015, NVIDIA Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/clk.h>
+#include <linux/host1x.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/reset.h>
+
+#include <soc/tegra/pmc.h>
+
+#include "drm.h"
+#include "falcon.h"
+#include "vic.h"
+
+struct vic_config {
+ /* Firmware name */
+ const char *ucode_name;
+};
+
+struct vic {
+ struct falcon falcon;
+ bool booted;
+
+ void __iomem *regs;
+ struct tegra_drm_client client;
+ struct host1x_channel *channel;
+ struct iommu_domain *domain;
+ struct device *dev;
+ struct clk *clk;
+
+ /* Platform configuration */
+ const struct vic_config *config;
+};
+
+static inline struct vic *to_vic(struct tegra_drm_client *client)
+{
+ return container_of(client, struct vic, client);
+}
+
+static void vic_writel(struct vic *vic, u32 value, unsigned int offset)
+{
+ writel(value, vic->regs + offset);
+}
+
+static int vic_runtime_resume(struct device *dev)
+{
+ struct vic *vic = dev_get_drvdata(dev);
+
+ return clk_prepare_enable(vic->clk);
+}
+
+static int vic_runtime_suspend(struct device *dev)
+{
+ struct vic *vic = dev_get_drvdata(dev);
+
+ clk_disable_unprepare(vic->clk);
+
+ vic->booted = false;
+
+ return 0;
+}
+
+static int vic_boot(struct vic *vic)
+{
+ u32 fce_ucode_size, fce_bin_data_offset;
+ void *hdr;
+ int err = 0;
+
+ if (vic->booted)
+ return 0;
+
+ if (!vic->falcon.firmware.valid) {
+ err = falcon_read_firmware(&vic->falcon,
+ vic->config->ucode_name);
+ if (err < 0)
+ return err;
+ }
+
+ /* setup clockgating registers */
+ vic_writel(vic, CG_IDLE_CG_DLY_CNT(4) |
+ CG_IDLE_CG_EN |
+ CG_WAKEUP_DLY_CNT(4),
+ NV_PVIC_MISC_PRI_VIC_CG);
+
+ err = falcon_boot(&vic->falcon);
+ if (err < 0)
+ return err;
+
+ hdr = vic->falcon.firmware.vaddr;
+ fce_bin_data_offset = *(u32 *)(hdr + VIC_UCODE_FCE_DATA_OFFSET);
+ hdr = vic->falcon.firmware.vaddr +
+ *(u32 *)(hdr + VIC_UCODE_FCE_HEADER_OFFSET);
+ fce_ucode_size = *(u32 *)(hdr + FCE_UCODE_SIZE_OFFSET);
+
+ falcon_execute_method(&vic->falcon, VIC_SET_APPLICATION_ID, 1);
+ falcon_execute_method(&vic->falcon, VIC_SET_FCE_UCODE_SIZE,
+ fce_ucode_size);
+ falcon_execute_method(&vic->falcon, VIC_SET_FCE_UCODE_OFFSET,
+ (vic->falcon.firmware.paddr + fce_bin_data_offset)
+ >> 8);
+
+ err = falcon_wait_idle(&vic->falcon);
+ if (err < 0) {
+ dev_err(vic->dev,
+ "failed to set application ID and FCE base\n");
+ return err;
+ }
+
+ vic->booted = true;
+
+ return 0;
+}
+
+static void *vic_falcon_alloc(struct falcon *falcon, size_t size,
+ dma_addr_t *iova)
+{
+ struct tegra_drm *tegra = falcon->data;
+
+ return tegra_drm_alloc(tegra, size, iova);
+}
+
+static void vic_falcon_free(struct falcon *falcon, size_t size,
+ dma_addr_t iova, void *va)
+{
+ struct tegra_drm *tegra = falcon->data;
+
+ return tegra_drm_free(tegra, size, va, iova);
+}
+
+static const struct falcon_ops vic_falcon_ops = {
+ .alloc = vic_falcon_alloc,
+ .free = vic_falcon_free
+};
+
+static int vic_init(struct host1x_client *client)
+{
+ struct tegra_drm_client *drm = host1x_to_drm_client(client);
+ struct drm_device *dev = dev_get_drvdata(client->parent);
+ struct tegra_drm *tegra = dev->dev_private;
+ struct vic *vic = to_vic(drm);
+ int err;
+
+ if (tegra->domain) {
+ err = iommu_attach_device(tegra->domain, vic->dev);
+ if (err < 0) {
+ dev_err(vic->dev, "failed to attach to domain: %d\n",
+ err);
+ return err;
+ }
+
+ vic->domain = tegra->domain;
+ }
+
+ vic->falcon.dev = vic->dev;
+ vic->falcon.regs = vic->regs;
+ vic->falcon.data = tegra;
+ vic->falcon.ops = &vic_falcon_ops;
+ err = falcon_init(&vic->falcon);
+ if (err < 0)
+ goto detach_device;
+
+ vic->channel = host1x_channel_request(client->dev);
+ if (!vic->channel) {
+ err = -ENOMEM;
+ goto exit_falcon;
+ }
+
+ client->syncpts[0] = host1x_syncpt_request(client->dev, 0);
+ if (!client->syncpts[0]) {
+ err = -ENOMEM;
+ goto free_channel;
+ }
+
+ err = tegra_drm_register_client(tegra, drm);
+ if (err < 0)
+ goto free_syncpt;
+
+ return 0;
+
+free_syncpt:
+ host1x_syncpt_free(client->syncpts[0]);
+free_channel:
+ host1x_channel_free(vic->channel);
+exit_falcon:
+ falcon_exit(&vic->falcon);
+detach_device:
+ if (tegra->domain)
+ iommu_detach_device(tegra->domain, vic->dev);
+
+ return err;
+}
+
+static int vic_exit(struct host1x_client *client)
+{
+ struct tegra_drm_client *drm = host1x_to_drm_client(client);
+ struct drm_device *dev = dev_get_drvdata(client->parent);
+ struct tegra_drm *tegra = dev->dev_private;
+ struct vic *vic = to_vic(drm);
+ int err;
+
+ err = tegra_drm_unregister_client(tegra, drm);
+ if (err < 0)
+ return err;
+
+ host1x_syncpt_free(client->syncpts[0]);
+ host1x_channel_free(vic->channel);
+
+ falcon_exit(&vic->falcon);
+
+ if (vic->domain) {
+ iommu_detach_device(vic->domain, vic->dev);
+ vic->domain = NULL;
+ }
+
+ return 0;
+}
+
+static const struct host1x_client_ops vic_client_ops = {
+ .init = vic_init,
+ .exit = vic_exit,
+};
+
+static int vic_open_channel(struct tegra_drm_client *client,
+ struct tegra_drm_context *context)
+{
+ struct vic *vic = to_vic(client);
+ int err;
+
+ err = pm_runtime_get_sync(vic->dev);
+ if (err < 0)
+ return err;
+
+ /*
+ * Try to boot the Falcon microcontroller. Booting is deferred until
+ * here because the firmware might not yet be available during system
+ * boot, for example if it's on remote storage.
+ */
+ err = vic_boot(vic);
+ if (err < 0) {
+ pm_runtime_put(vic->dev);
+ return err;
+ }
+
+ context->channel = host1x_channel_get(vic->channel);
+ if (!context->channel) {
+ pm_runtime_put(vic->dev);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void vic_close_channel(struct tegra_drm_context *context)
+{
+ struct vic *vic = to_vic(context->client);
+
+ host1x_channel_put(context->channel);
+
+ pm_runtime_put(vic->dev);
+}
+
+static const struct tegra_drm_client_ops vic_ops = {
+ .open_channel = vic_open_channel,
+ .close_channel = vic_close_channel,
+ .submit = tegra_drm_submit,
+};
+
+static const struct vic_config vic_t124_config = {
+ .ucode_name = "nvidia/tegra124/vic03_ucode.bin",
+};
+
+static const struct vic_config vic_t210_config = {
+ .ucode_name = "nvidia/tegra210/vic04_ucode.bin",
+};
+
+static const struct of_device_id vic_match[] = {
+ { .compatible = "nvidia,tegra124-vic", .data = &vic_t124_config },
+ { .compatible = "nvidia,tegra210-vic", .data = &vic_t210_config },
+ { },
+};
+
+static int vic_probe(struct platform_device *pdev)
+{
+ struct vic_config *vic_config = NULL;
+ struct device *dev = &pdev->dev;
+ struct host1x_syncpt **syncpts;
+ struct resource *regs;
+ const struct of_device_id *match;
+ struct vic *vic;
+ int err;
+
+ match = of_match_device(vic_match, dev);
+ vic_config = (struct vic_config *)match->data;
+
+ vic = devm_kzalloc(dev, sizeof(*vic), GFP_KERNEL);
+ if (!vic)
+ return -ENOMEM;
+
+ syncpts = devm_kzalloc(dev, sizeof(*syncpts), GFP_KERNEL);
+ if (!syncpts)
+ return -ENOMEM;
+
+ regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!regs) {
+ dev_err(&pdev->dev, "failed to get registers\n");
+ return -ENXIO;
+ }
+
+ vic->regs = devm_ioremap_resource(dev, regs);
+ if (IS_ERR(vic->regs))
+ return PTR_ERR(vic->regs);
+
+ vic->clk = devm_clk_get(dev, NULL);
+ if (IS_ERR(vic->clk)) {
+ dev_err(&pdev->dev, "failed to get clock\n");
+ return PTR_ERR(vic->clk);
+ }
+
+ platform_set_drvdata(pdev, vic);
+
+ INIT_LIST_HEAD(&vic->client.base.list);
+ vic->client.base.ops = &vic_client_ops;
+ vic->client.base.dev = dev;
+ vic->client.base.class = HOST1X_CLASS_VIC;
+ vic->client.base.syncpts = syncpts;
+ vic->client.base.num_syncpts = 1;
+ vic->dev = dev;
+ vic->config = vic_config;
+
+ INIT_LIST_HEAD(&vic->client.list);
+ vic->client.ops = &vic_ops;
+
+ err = host1x_client_register(&vic->client.base);
+ if (err < 0) {
+ dev_err(dev, "failed to register host1x client: %d\n", err);
+ platform_set_drvdata(pdev, NULL);
+ return err;
+ }
+
+ pm_runtime_enable(&pdev->dev);
+ if (!pm_runtime_enabled(&pdev->dev)) {
+ err = vic_runtime_resume(&pdev->dev);
+ if (err < 0)
+ goto unregister_client;
+ }
+
+ dev_info(&pdev->dev, "initialized");
+
+ return 0;
+
+unregister_client:
+ host1x_client_unregister(&vic->client.base);
+
+ return err;
+}
+
+static int vic_remove(struct platform_device *pdev)
+{
+ struct vic *vic = platform_get_drvdata(pdev);
+ int err;
+
+ err = host1x_client_unregister(&vic->client.base);
+ if (err < 0) {
+ dev_err(&pdev->dev, "failed to unregister host1x client: %d\n",
+ err);
+ return err;
+ }
+
+ if (pm_runtime_enabled(&pdev->dev))
+ pm_runtime_disable(&pdev->dev);
+ else
+ vic_runtime_suspend(&pdev->dev);
+
+ return 0;
+}
+
+static const struct dev_pm_ops vic_pm_ops = {
+ SET_RUNTIME_PM_OPS(vic_runtime_suspend, vic_runtime_resume, NULL)
+};
+
+struct platform_driver tegra_vic_driver = {
+ .driver = {
+ .name = "tegra-vic",
+ .of_match_table = vic_match,
+ .pm = &vic_pm_ops
+ },
+ .probe = vic_probe,
+ .remove = vic_remove,
+};
diff --git a/drivers/gpu/drm/tegra/vic.h b/drivers/gpu/drm/tegra/vic.h
new file mode 100644
index 0000000..2184481
--- /dev/null
+++ b/drivers/gpu/drm/tegra/vic.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright (c) 2015, NVIDIA Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef TEGRA_VIC_H
+#define TEGRA_VIC_H
+
+/* VIC methods */
+
+#define VIC_SET_APPLICATION_ID 0x00000200
+#define VIC_SET_FCE_UCODE_SIZE 0x0000071C
+#define VIC_SET_FCE_UCODE_OFFSET 0x0000072C
+
+/* VIC registers */
+
+#define NV_PVIC_MISC_PRI_VIC_CG 0x000016d0
+#define CG_IDLE_CG_DLY_CNT(val) ((val & 0x3f) << 0)
+#define CG_IDLE_CG_EN (1 << 6)
+#define CG_WAKEUP_DLY_CNT(val) ((val & 0xf) << 16)
+
+/* Firmware offsets */
+
+#define VIC_UCODE_FCE_HEADER_OFFSET (6*4)
+#define VIC_UCODE_FCE_DATA_OFFSET (7*4)
+#define FCE_UCODE_SIZE_OFFSET (2*4)
+
+#endif /* TEGRA_VIC_H */
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 1ffbf2a..3d04aa1 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -26,6 +26,7 @@ enum host1x_class {
HOST1X_CLASS_HOST1X = 0x1,
HOST1X_CLASS_GR2D = 0x51,
HOST1X_CLASS_GR2D_SB = 0x52,
+ HOST1X_CLASS_VIC = 0x5D,
HOST1X_CLASS_GR3D = 0x60,
};
--
2.10.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 5/8] gpu: host1x: Add IOMMU support
2016-11-10 18:23 [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
` (2 preceding siblings ...)
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2016-11-10 18:23 ` Mikko Perttunen
2016-12-05 19:49 ` Thierry Reding
2016-11-10 18:23 ` [PATCH 6/8] dt-bindings: Add bindings for the Tegra VIC Mikko Perttunen
2016-12-05 19:51 ` [PATCH 0/8] Host1x IOMMU support + VIC support Thierry Reding
5 siblings, 1 reply; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding; +Cc: linux-tegra, dri-devel, Mikko Perttunen
Add support for the Host1x unit to be located behind
an IOMMU. This is required when gather buffers may be
allocated non-contiguously in physical memory, as can
be the case when TegraDRM is also using the IOMMU.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
drivers/gpu/host1x/cdma.c | 65 +++++++++++++++++++++++++++++------
drivers/gpu/host1x/cdma.h | 4 ++-
drivers/gpu/host1x/dev.c | 39 +++++++++++++++++++--
drivers/gpu/host1x/dev.h | 5 +++
drivers/gpu/host1x/hw/cdma_hw.c | 10 +++---
drivers/gpu/host1x/job.c | 76 +++++++++++++++++++++++++++++++++++------
drivers/gpu/host1x/job.h | 1 +
7 files changed, 171 insertions(+), 29 deletions(-)
diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
index c5d82a8..14692446e 100644
--- a/drivers/gpu/host1x/cdma.c
+++ b/drivers/gpu/host1x/cdma.c
@@ -51,9 +51,15 @@ static void host1x_pushbuffer_destroy(struct push_buffer *pb)
struct host1x_cdma *cdma = pb_to_cdma(pb);
struct host1x *host1x = cdma_to_host1x(cdma);
- if (pb->phys != 0)
- dma_free_wc(host1x->dev, pb->size_bytes + 4, pb->mapped,
- pb->phys);
+ if (!pb->phys)
+ return;
+
+ if (host1x->domain) {
+ iommu_unmap(host1x->domain, pb->dma, pb->alloc_size_bytes);
+ free_iova(&host1x->iova, iova_pfn(&host1x->iova, pb->dma));
+ }
+
+ dma_free_wc(host1x->dev, pb->alloc_size_bytes, pb->mapped, pb->phys);
pb->mapped = NULL;
pb->phys = 0;
@@ -66,28 +72,65 @@ static int host1x_pushbuffer_init(struct push_buffer *pb)
{
struct host1x_cdma *cdma = pb_to_cdma(pb);
struct host1x *host1x = cdma_to_host1x(cdma);
+ struct iova *alloc;
+ int err;
pb->mapped = NULL;
pb->phys = 0;
pb->size_bytes = HOST1X_PUSHBUFFER_SLOTS * 8;
+ pb->alloc_size_bytes = pb->size_bytes + 4;
/* initialize buffer pointers */
pb->fence = pb->size_bytes - 8;
pb->pos = 0;
- /* allocate and map pushbuffer memory */
- pb->mapped = dma_alloc_wc(host1x->dev, pb->size_bytes + 4, &pb->phys,
- GFP_KERNEL);
- if (!pb->mapped)
- goto fail;
+ if (host1x->domain) {
+ unsigned long shift;
+
+ pb->alloc_size_bytes = iova_align(&host1x->iova,
+ pb->alloc_size_bytes);
+
+ pb->mapped = dma_alloc_wc(host1x->dev, pb->alloc_size_bytes,
+ &pb->phys, GFP_KERNEL);
+ if (!pb->mapped)
+ return -ENOMEM;
+
+ shift = iova_shift(&host1x->iova);
+ alloc = alloc_iova(
+ &host1x->iova, pb->alloc_size_bytes >> shift,
+ host1x->domain->geometry.aperture_end >> shift, true);
+ if (!alloc) {
+ err = -ENOMEM;
+ goto iommu_free_mem;
+ }
+
+ err = iommu_map(host1x->domain,
+ iova_dma_addr(&host1x->iova, alloc),
+ pb->phys, pb->alloc_size_bytes,
+ IOMMU_READ);
+ if (err)
+ goto iommu_free_iova;
+
+ pb->dma = iova_dma_addr(&host1x->iova, alloc);
+ } else {
+ pb->mapped = dma_alloc_wc(host1x->dev, pb->alloc_size_bytes,
+ &pb->phys, GFP_KERNEL);
+ if (!pb->mapped)
+ return -ENOMEM;
+
+ pb->dma = pb->phys;
+ }
host1x_hw_pushbuffer_init(host1x, pb);
return 0;
-fail:
- host1x_pushbuffer_destroy(pb);
- return -ENOMEM;
+iommu_free_iova:
+ __free_iova(&host1x->iova, alloc);
+iommu_free_mem:
+ dma_free_wc(host1x->dev, pb->alloc_size_bytes, pb->mapped, pb->phys);
+
+ return err;
}
/*
diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h
index 470087a..4b3645f 100644
--- a/drivers/gpu/host1x/cdma.h
+++ b/drivers/gpu/host1x/cdma.h
@@ -43,10 +43,12 @@ struct host1x_job;
struct push_buffer {
void *mapped; /* mapped pushbuffer memory */
- dma_addr_t phys; /* physical address of pushbuffer */
+ dma_addr_t dma; /* device address of pushbuffer */
+ phys_addr_t phys; /* physical address of pushbuffer */
u32 fence; /* index we've written */
u32 pos; /* index to write to */
u32 size_bytes;
+ u32 alloc_size_bytes;
};
struct buffer_timeout {
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index a62317a..ca4527e 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -168,16 +168,36 @@ static int host1x_probe(struct platform_device *pdev)
return err;
}
+ if (iommu_present(&platform_bus_type)) {
+ struct iommu_domain_geometry *geometry;
+ unsigned long order;
+
+ host->domain = iommu_domain_alloc(&platform_bus_type);
+ if (!host->domain)
+ return -ENOMEM;
+
+ err = iommu_attach_device(host->domain, &pdev->dev);
+ if (err)
+ goto fail_free_domain;
+
+ geometry = &host->domain->geometry;
+
+ order = __ffs(host->domain->pgsize_bitmap);
+ init_iova_domain(&host->iova, 1UL << order,
+ geometry->aperture_start >> order,
+ geometry->aperture_end >> order);
+ }
+
err = host1x_channel_list_init(host);
if (err) {
dev_err(&pdev->dev, "failed to initialize channel list\n");
- return err;
+ goto fail_detach_device;
}
err = clk_prepare_enable(host->clk);
if (err < 0) {
dev_err(&pdev->dev, "failed to enable clock\n");
- return err;
+ goto fail_detach_device;
}
err = host1x_syncpt_init(host);
@@ -206,6 +226,15 @@ static int host1x_probe(struct platform_device *pdev)
host1x_syncpt_deinit(host);
fail_unprepare_disable:
clk_disable_unprepare(host->clk);
+fail_detach_device:
+ if (host->domain) {
+ put_iova_domain(&host->iova);
+ iommu_detach_device(host->domain, &pdev->dev);
+ }
+fail_free_domain:
+ if (host->domain)
+ iommu_domain_free(host->domain);
+
return err;
}
@@ -218,6 +247,12 @@ static int host1x_remove(struct platform_device *pdev)
host1x_syncpt_deinit(host);
clk_disable_unprepare(host->clk);
+ if (host->domain) {
+ put_iova_domain(&host->iova);
+ iommu_detach_device(host->domain, &pdev->dev);
+ iommu_domain_free(host->domain);
+ }
+
return 0;
}
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index 5220510..6189825 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -19,6 +19,8 @@
#include <linux/platform_device.h>
#include <linux/device.h>
+#include <linux/iommu.h>
+#include <linux/iova.h>
#include "channel.h"
#include "syncpt.h"
@@ -108,6 +110,9 @@ struct host1x {
struct device *dev;
struct clk *clk;
+ struct iommu_domain *domain;
+ struct iova_domain iova;
+
struct mutex intr_mutex;
int intr_syncpt_irq;
diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
index 659c1bb..b8c0348 100644
--- a/drivers/gpu/host1x/hw/cdma_hw.c
+++ b/drivers/gpu/host1x/hw/cdma_hw.c
@@ -55,7 +55,7 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr,
*(p++) = HOST1X_OPCODE_NOP;
*(p++) = HOST1X_OPCODE_NOP;
dev_dbg(host1x->dev, "%s: NOP at %pad+%#x\n", __func__,
- &pb->phys, getptr);
+ &pb->dma, getptr);
getptr = (getptr + 8) & (pb->size_bytes - 1);
}
@@ -78,9 +78,9 @@ static void cdma_start(struct host1x_cdma *cdma)
HOST1X_CHANNEL_DMACTRL);
/* set base, put and end pointer */
- host1x_ch_writel(ch, cdma->push_buffer.phys, HOST1X_CHANNEL_DMASTART);
+ host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
host1x_ch_writel(ch, cdma->push_buffer.pos, HOST1X_CHANNEL_DMAPUT);
- host1x_ch_writel(ch, cdma->push_buffer.phys +
+ host1x_ch_writel(ch, cdma->push_buffer.dma +
cdma->push_buffer.size_bytes + 4,
HOST1X_CHANNEL_DMAEND);
@@ -115,8 +115,8 @@ static void cdma_timeout_restart(struct host1x_cdma *cdma, u32 getptr)
HOST1X_CHANNEL_DMACTRL);
/* set base, end pointer (all of memory) */
- host1x_ch_writel(ch, cdma->push_buffer.phys, HOST1X_CHANNEL_DMASTART);
- host1x_ch_writel(ch, cdma->push_buffer.phys +
+ host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
+ host1x_ch_writel(ch, cdma->push_buffer.dma +
cdma->push_buffer.size_bytes,
HOST1X_CHANNEL_DMAEND);
diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index a91b7c4..222d930 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -174,8 +174,9 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host,
return 0;
}
-static unsigned int pin_job(struct host1x_job *job)
+static unsigned int pin_job(struct host1x *host, struct host1x_job *job)
{
+ int err;
unsigned int i;
job->num_unpins = 0;
@@ -186,12 +187,16 @@ static unsigned int pin_job(struct host1x_job *job)
dma_addr_t phys_addr;
reloc->target.bo = host1x_bo_get(reloc->target.bo);
- if (!reloc->target.bo)
+ if (!reloc->target.bo) {
+ err = -EINVAL;
goto unpin;
+ }
phys_addr = host1x_bo_pin(reloc->target.bo, &sgt);
- if (!phys_addr)
+ if (!phys_addr) {
+ err = -EINVAL;
goto unpin;
+ }
job->addr_phys[job->num_unpins] = phys_addr;
job->unpins[job->num_unpins].bo = reloc->target.bo;
@@ -204,25 +209,68 @@ static unsigned int pin_job(struct host1x_job *job)
struct sg_table *sgt;
dma_addr_t phys_addr;
+ size_t gather_size = 0;
+ struct scatterlist *sg;
+ struct iova *alloc;
+ unsigned long shift;
+ int j;
+
g->bo = host1x_bo_get(g->bo);
- if (!g->bo)
+ if (!g->bo) {
+ err = -EINVAL;
goto unpin;
+ }
phys_addr = host1x_bo_pin(g->bo, &sgt);
- if (!phys_addr)
+ if (!phys_addr) {
+ err = -EINVAL;
goto unpin;
+ }
+
+ if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && host->domain) {
+ for_each_sg(sgt->sgl, sg, sgt->nents, j) {
+ gather_size += sg->length;
+ }
+ gather_size = iova_align(&host->iova, gather_size);
+
+ shift = iova_shift(&host->iova);
+ alloc = alloc_iova(
+ &host->iova, gather_size >> shift,
+ host->domain->geometry.aperture_end >> shift,
+ true);
+ if (!alloc) {
+ err = -ENOMEM;
+ goto unpin;
+ }
+
+ err = iommu_map_sg(host->domain,
+ iova_dma_addr(&host->iova, alloc),
+ sgt->sgl, sgt->nents, IOMMU_READ);
+ if (err == 0) {
+ __free_iova(&host->iova, alloc);
+ err = -EINVAL;
+ goto unpin;
+ }
+
+ job->addr_phys[job->num_unpins] =
+ iova_dma_addr(&host->iova, alloc);
+ job->unpins[job->num_unpins].size = gather_size;
+ } else {
+ job->addr_phys[job->num_unpins] = phys_addr;
+ }
+
+ job->gather_addr_phys[i] = job->addr_phys[job->num_unpins];
- job->addr_phys[job->num_unpins] = phys_addr;
job->unpins[job->num_unpins].bo = g->bo;
job->unpins[job->num_unpins].sgt = sgt;
job->num_unpins++;
}
- return job->num_unpins;
+ return 0;
unpin:
host1x_job_unpin(job);
- return 0;
+ return err;
}
static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf)
@@ -525,8 +573,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)
host1x_syncpt_load(host->syncpt + i);
/* pin memory */
- err = pin_job(job);
- if (!err)
+ err = pin_job(host, job);
+ if (err)
goto out;
/* patch gathers */
@@ -569,11 +617,19 @@ EXPORT_SYMBOL(host1x_job_pin);
void host1x_job_unpin(struct host1x_job *job)
{
+ struct host1x *host = dev_get_drvdata(job->channel->dev->parent);
unsigned int i;
for (i = 0; i < job->num_unpins; i++) {
struct host1x_job_unpin_data *unpin = &job->unpins[i];
+ if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && host->domain) {
+ iommu_unmap(host->domain, job->addr_phys[i],
+ unpin->size);
+ free_iova(&host->iova,
+ iova_pfn(&host->iova, job->addr_phys[i]));
+ }
+
host1x_bo_unpin(unpin->bo, unpin->sgt);
host1x_bo_put(unpin->bo);
}
diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h
index 8b3c15d..878239c4 100644
--- a/drivers/gpu/host1x/job.h
+++ b/drivers/gpu/host1x/job.h
@@ -44,6 +44,7 @@ struct host1x_waitchk {
struct host1x_job_unpin_data {
struct host1x_bo *bo;
struct sg_table *sgt;
+ size_t size;
};
/*
--
2.10.2
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 6/8] dt-bindings: Add bindings for the Tegra VIC
2016-11-10 18:23 [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
` (3 preceding siblings ...)
2016-11-10 18:23 ` [PATCH 5/8] gpu: host1x: Add IOMMU support Mikko Perttunen
@ 2016-11-10 18:23 ` Mikko Perttunen
2016-12-05 19:51 ` [PATCH 0/8] Host1x IOMMU support + VIC support Thierry Reding
5 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding; +Cc: linux-tegra, dri-devel, Mikko Perttunen
The VIC (Video Image Compositor) is a Host1x client unit
that can do various 2D composition and transform operations.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
.../bindings/display/tegra/nvidia,tegra20-host1x.txt | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
index 0fad7ed..74e1e8a 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
@@ -249,6 +249,19 @@ of the following host1x client modules:
See ../pinctrl/nvidia,tegra124-dpaux-padctl.txt for information
regarding the DPAUX pad controller bindings.
+- vic: Video Image Compositor
+ - compatible : "nvidia,tegra<chip>-vic"
+ - reg: Physical base address and length of the controller's registers.
+ - interrupts: The interrupt outputs from the controller.
+ - clocks: Must contain an entry for each entry in clock-names.
+ See ../clocks/clock-bindings.txt for details.
+ - clock-names: Must include the following entries:
+ - vic: clock input for the VIC hardware
+ - resets: Must contain an entry for each entry in reset-names.
+ See ../reset/reset.txt for details.
+ - reset-names: Must include the following entries:
+ - vic
+
Example:
/ {
--
2.10.2
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 7/8] arm64: tegra: Enable VIC on T210
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2016-11-10 18:23 ` [PATCH 3/8] drm/tegra: Add falcon helper library Mikko Perttunen
2016-11-10 18:23 ` [PATCH 4/8] drm/tegra: Add VIC support Mikko Perttunen
@ 2016-11-10 18:23 ` Mikko Perttunen
2016-11-10 18:23 ` [PATCH 8/8] arm64: tegra: Enable IOMMU for Host1x on Tegra210 Mikko Perttunen
2016-12-05 13:25 ` [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
4 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding-Re5JQEeQqe8AvxtiuMwx3w
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c,
Mikko Perttunen
Enable the VIC (Video Image Compositor) host1x
unit on Tegra210 systems.
Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
arch/arm64/boot/dts/nvidia/tegra210.dtsi | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index 46045fe..e12390f 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -122,7 +122,14 @@
vic@54340000 {
compatible = "nvidia,tegra210-vic";
reg = <0x0 0x54340000 0x0 0x00040000>;
- status = "disabled";
+ interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&tegra_car TEGRA210_CLK_VIC03>;
+ clock-names = "vic";
+ resets = <&tegra_car 178>;
+ reset-names = "vic";
+
+ iommus = <&mc TEGRA_SWGROUP_VIC>;
+ power-domains = <&pd_vic>;
};
nvjpg@54380000 {
@@ -692,6 +699,14 @@
resets = <&tegra_car TEGRA210_CLK_XUSB_HOST>;
#power-domain-cells = <0>;
};
+
+ pd_vic: vic {
+ clocks = <&tegra_car TEGRA210_CLK_VIC03>;
+ clock-names = "vic";
+ resets = <&tegra_car TEGRA210_CLK_VIC03>;
+ reset-names = "vic";
+ #power-domain-cells = <0>;
+ };
};
};
--
2.10.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 8/8] arm64: tegra: Enable IOMMU for Host1x on Tegra210
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
` (2 preceding siblings ...)
2016-11-10 18:23 ` [PATCH 7/8] arm64: tegra: Enable VIC on T210 Mikko Perttunen
@ 2016-11-10 18:23 ` Mikko Perttunen
2016-12-05 13:25 ` [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
4 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-11-10 18:23 UTC (permalink / raw)
To: thierry.reding-Re5JQEeQqe8AvxtiuMwx3w
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c,
Mikko Perttunen
The Host1x driver now supports operation behind an IOMMU,
so add its IOMMU domain to the device tree.
Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
arch/arm64/boot/dts/nvidia/tegra210.dtsi | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
index e12390f..8d229c7 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
@@ -26,6 +26,8 @@
ranges = <0x0 0x54000000 0x0 0x54000000 0x0 0x01000000>;
+ iommus = <&mc TEGRA_SWGROUP_HC>;
+
dpaux1: dpaux@54040000 {
compatible = "nvidia,tegra210-dpaux";
reg = <0x0 0x54040000 0x0 0x00040000>;
--
2.10.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] Host1x IOMMU support + VIC support
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
` (3 preceding siblings ...)
2016-11-10 18:23 ` [PATCH 8/8] arm64: tegra: Enable IOMMU for Host1x on Tegra210 Mikko Perttunen
@ 2016-12-05 13:25 ` Mikko Perttunen
4 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-05 13:25 UTC (permalink / raw)
To: Mikko Perttunen, thierry.reding-Re5JQEeQqe8AvxtiuMwx3w
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c
Bump
On 10.11.2016 20:23, Mikko Perttunen wrote:
> This series adds IOMMU support to Host1x and TegraDRM
> and adds support for the VIC host1x client so that
> host1x can be tested on modern Tegra platforms.
> It depends on the previous fix series. The whole thing
> (modulo patch order) is available as a git repository at
> git://github.com/cyndis/linux.git; branch vic-v1.
>
> IO memory management is organized such that there are
> two domains: the host1x domain and the tegradrm domain.
> The host1x domain is used by the host1x engine and
> contains the host1x CDMA and pushbuffers for submitted
> jobs.
>
> The tegradrm domain is shared by all host1x units and
> contains GEM objects and memory allocated by the
> separate tegra_drm_alloc function. This function is
> currently used to allocate space for firmware blobs
> in the tegradrm domain.
>
> A userspace test case for VIC can be found at
> https://github.com/cyndis/drm/tree/work/tegra.
> The testcase is in tests/tegra and is called submit_vic.
> The in-kernel firewall is not implemented for VIC;
> therefore, IOMMU must be enabled for the test to pass.
>
> Tested with Jetson TX1 (T210). Probably works also
> with Jetson TK1 (T124). Note that due to hardware changes
> the testcase also needs to be changed to run properly
> on T124.
>
> Thanks,
> Mikko.
>
> Arto Merilainen (2):
> drm/tegra: Add falcon helper library
> drm/tegra: Add VIC support
>
> Mikko Perttunen (6):
> drm/tegra: Add Tegra DRM allocation API
> drm/tegra: Allocate BOs from lower 4G when without IOMMU
> gpu: host1x: Add IOMMU support
> dt-bindings: Add bindings for the Tegra VIC
> arm64: tegra: Enable VIC on T210
> arm64: tegra: Enable IOMMU for Host1x on Tegra210
>
> .../display/tegra/nvidia,tegra20-host1x.txt | 13 +
> arch/arm64/boot/dts/nvidia/tegra210.dtsi | 19 +-
> drivers/gpu/drm/tegra/Makefile | 4 +-
> drivers/gpu/drm/tegra/drm.c | 101 +++++-
> drivers/gpu/drm/tegra/drm.h | 8 +
> drivers/gpu/drm/tegra/falcon.c | 256 +++++++++++++
> drivers/gpu/drm/tegra/falcon.h | 130 +++++++
> drivers/gpu/drm/tegra/gem.c | 2 +-
> drivers/gpu/drm/tegra/vic.c | 400 +++++++++++++++++++++
> drivers/gpu/drm/tegra/vic.h | 31 ++
> drivers/gpu/host1x/cdma.c | 65 +++-
> drivers/gpu/host1x/cdma.h | 4 +-
> drivers/gpu/host1x/dev.c | 39 +-
> drivers/gpu/host1x/dev.h | 5 +
> drivers/gpu/host1x/hw/cdma_hw.c | 10 +-
> drivers/gpu/host1x/job.c | 76 +++-
> drivers/gpu/host1x/job.h | 1 +
> include/linux/host1x.h | 1 +
> 18 files changed, 1128 insertions(+), 37 deletions(-)
> create mode 100644 drivers/gpu/drm/tegra/falcon.c
> create mode 100644 drivers/gpu/drm/tegra/falcon.h
> create mode 100644 drivers/gpu/drm/tegra/vic.c
> create mode 100644 drivers/gpu/drm/tegra/vic.h
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/8] drm/tegra: Add Tegra DRM allocation API
[not found] ` <20161110182345.31777-2-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2016-12-05 18:37 ` Thierry Reding
[not found] ` <20161205183726.GA22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Thierry Reding @ 2016-12-05 18:37 UTC (permalink / raw)
To: Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c
[-- Attachment #1: Type: text/plain, Size: 6442 bytes --]
On Thu, Nov 10, 2016 at 08:23:38PM +0200, Mikko Perttunen wrote:
> Add a new IO virtual memory allocation API to allow clients to
> allocate non-GEM memory in the Tegra DRM IOMMU domain. This is
> required e.g. for loading client firmware when clients are attached
> to the IOMMU domain.
>
> The allocator allocates contiguous physical pages that are then
> mapped contiguously to the IOMMU domain using the iova_domain
> library provided by the kernel. Contiguous physical pages are
> used so that the same allocator works also when IOMMU support
> is disabled and therefore devices access physical memory directly.
>
> Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> ---
> drivers/gpu/drm/tegra/drm.c | 98 ++++++++++++++++++++++++++++++++++++++++++---
> drivers/gpu/drm/tegra/drm.h | 7 ++++
> 2 files changed, 100 insertions(+), 5 deletions(-)
This looks mostly good, just a few comments below...
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index b8be3ee..ecfe7ba 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -1,12 +1,13 @@
> /*
> * Copyright (C) 2012 Avionic Design GmbH
> - * Copyright (C) 2012-2013 NVIDIA CORPORATION. All rights reserved.
> + * Copyright (C) 2012-2015 NVIDIA CORPORATION. All rights reserved.
It's almost 2017 now, I think this is out of date. =)
> *
> * This program is free software; you can redistribute it and/or modify
> * it under the terms of the GNU General Public License version 2 as
> * published by the Free Software Foundation.
> */
>
> +#include <linux/bitops.h>
> #include <linux/host1x.h>
> #include <linux/iommu.h>
>
> @@ -23,6 +24,8 @@
> #define DRIVER_MINOR 0
> #define DRIVER_PATCHLEVEL 0
>
> +#define IOVA_AREA_SZ (1024 * 1024 * 64) /* 64 MiB */
SZ_64M?
> +
> struct tegra_drm_file {
> struct list_head contexts;
> };
> @@ -127,7 +130,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>
> if (iommu_present(&platform_bus_type)) {
> struct iommu_domain_geometry *geometry;
> - u64 start, end;
> + unsigned long order;
> + u64 iova_start, start, end;
Can we be a little more explicit and keep an additional iova_end to
simplify the code a little?
> tegra->domain = iommu_domain_alloc(&platform_bus_type);
> if (!tegra->domain) {
> @@ -138,10 +142,17 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> geometry = &tegra->domain->geometry;
> start = geometry->aperture_start;
> end = geometry->aperture_end;
> + iova_start = end - IOVA_AREA_SZ + 1;
> +
> + order = __ffs(tegra->domain->pgsize_bitmap);
> + init_iova_domain(&tegra->iova, 1UL << order,
> + iova_start >> order,
> + end >> order);
I hadn't seen this IOVA domain thing and it looks rather handy.
>
> - DRM_DEBUG_DRIVER("IOMMU aperture initialized (%#llx-%#llx)\n",
> - start, end);
> - drm_mm_init(&tegra->mm, start, end - start + 1);
> + drm_mm_init(&tegra->mm, start, iova_start - start);
> +
> + DRM_DEBUG("IOMMU apertures initialized (GEM: %#llx-%#llx, IOVA: %#llx-%#llx)\n",
> + start, iova_start-1, iova_start, end);
Might be worth splitting this into two, or perhaps even three lines:
IOMMU apertures:
GEM: %#llx-%#llx
IOVA: %#llx-%#llx
Maybe also find a better name for IOVA, since GEM will be using I/O
virtual addresses as well. Perhaps just name it "carveout" or something
like that?
> }
>
> mutex_init(&tegra->clients_lock);
> @@ -208,6 +219,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> if (tegra->domain) {
> iommu_domain_free(tegra->domain);
> drm_mm_takedown(&tegra->mm);
> + put_iova_domain(&tegra->iova);
> }
> free:
> kfree(tegra);
> @@ -232,6 +244,7 @@ static int tegra_drm_unload(struct drm_device *drm)
> if (tegra->domain) {
> iommu_domain_free(tegra->domain);
> drm_mm_takedown(&tegra->mm);
> + put_iova_domain(&tegra->iova);
> }
>
> kfree(tegra);
> @@ -975,6 +988,81 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
> return 0;
> }
>
> +void *tegra_drm_alloc(struct tegra_drm *tegra, size_t size,
> + dma_addr_t *iova)
Maybe name the iova parameter phys? iova implies that it's always
translated, whereas according to the code it can be physical, too.
> +{
> + struct iova *alloc;
> + unsigned long shift;
> + void *virt;
> + gfp_t gfp;
> + int err;
> +
> + if (tegra->domain)
> + size = iova_align(&tegra->iova, size);
> + else
> + size = PAGE_ALIGN(size);
> +
> + gfp = GFP_KERNEL | __GFP_ZERO;
> + if (!tegra->domain) {
> + /*
> + * Tegra210 has 64-bit physical addresses but units only support
> + * 32-bit addresses, so if we don't have an IOMMU to do
> + * translation, force allocations to be in the 32-bit range.
> + */
> + gfp |= GFP_DMA;
Technically we do support Tegra132 and that already has 64-bit physical
addresses, so perhaps include that in the comment, or make it more
generic:
/*
* Many units only support 32-bit addresses, even on 64-bit SoCs.
* If there is no IOMMU to translate into a 32-bit IO virtual address
* address space, force allocations to be in the lower 32-bit range.
*/
> + }
> +
> + virt = (void *)__get_free_pages(gfp, get_order(size));
> + if (!virt)
> + return NULL;
Perhaps we want to return an ERR_PTR()-encoded error code here so we can
differentiate between out-of-memory and out-of-virtual-address-space
conditions in the caller? Or perhaps other conditions as well.
Also, is __get_free_pages() the right API here? I thought that it always
returned contiguous memory, which, in the presence of an IOMMU, will be
completely unnecessary.
> +
> + if (!tegra->domain) {
> + /*
> + * If IOMMU is disabled, devices address physical memory
> + * directly.
> + */
> + *iova = virt_to_phys(virt);
> + return virt;
> + }
> +
> + shift = iova_shift(&tegra->iova);
> + alloc = alloc_iova(&tegra->iova, size >> shift,
> + tegra->domain->geometry.aperture_end >> shift, true);
This is fairly cumbersome to use. Maybe a drm_mm would be easier in this
case? If not, could we at least store the upper limit of the carveout so
that we don't have to use the really long expression above?
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/8] drm/tegra: Allocate BOs from lower 4G when without IOMMU
[not found] ` <20161110182345.31777-3-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2016-12-05 18:39 ` Thierry Reding
[not found] ` <20161205183955.GB22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Thierry Reding @ 2016-12-05 18:39 UTC (permalink / raw)
To: Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c
[-- Attachment #1: Type: text/plain, Size: 567 bytes --]
On Thu, Nov 10, 2016 at 08:23:39PM +0200, Mikko Perttunen wrote:
> On 64-bit Tegras, buffer object memory allocation may
> return memory above 4G that units behind Host1x cannot
> access. Add the GFP_DMA flag to these allocation when
> IOMMU is not enabled to ensure units can always access
> BO memory.
I don't think that's entirely true. The display controller can address
more than 32 bits. It's perhaps slightly different from other units that
are behind host1x, but maybe we need a special case here to allow frame-
buffers to reside in higher memory?
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/8] drm/tegra: Add falcon helper library
2016-11-10 18:23 ` [PATCH 3/8] drm/tegra: Add falcon helper library Mikko Perttunen
@ 2016-12-05 19:13 ` Thierry Reding
[not found] ` <20161205191346.GC22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Thierry Reding @ 2016-12-05 19:13 UTC (permalink / raw)
To: Mikko Perttunen; +Cc: linux-tegra, Andrew Chew, Arto Merilainen, dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 14778 bytes --]
On Thu, Nov 10, 2016 at 08:23:40PM +0200, Mikko Perttunen wrote:
> From: Arto Merilainen <amerilainen@nvidia.com>
>
> Add a set of falcon helper routines for use by the tegradrm client drivers
> of the various falcon-based engines.
>
> The falcon is a microcontroller that acts as a frontend for the rest of a
> particular Tegra engine. In order to properly utilize these engines, the
> frontend must be booted before pushing any commands.
>
> Based on work by Andrew Chew <achew@nvidia.com>
>
> Signed-off-by: Andrew Chew <achew@nvidia.com>
> Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> drivers/gpu/drm/tegra/Makefile | 3 +-
> drivers/gpu/drm/tegra/falcon.c | 256 +++++++++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/tegra/falcon.h | 130 +++++++++++++++++++++
> 3 files changed, 388 insertions(+), 1 deletion(-)
> create mode 100644 drivers/gpu/drm/tegra/falcon.c
> create mode 100644 drivers/gpu/drm/tegra/falcon.h
>
> diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
> index 2c66a8d..93e9a4a 100644
> --- a/drivers/gpu/drm/tegra/Makefile
> +++ b/drivers/gpu/drm/tegra/Makefile
> @@ -13,6 +13,7 @@ tegra-drm-y := \
> sor.o \
> dpaux.o \
> gr2d.o \
> - gr3d.o
> + gr3d.o \
> + falcon.o
>
> obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o
> diff --git a/drivers/gpu/drm/tegra/falcon.c b/drivers/gpu/drm/tegra/falcon.c
> new file mode 100644
> index 0000000..180b2fd
> --- /dev/null
> +++ b/drivers/gpu/drm/tegra/falcon.c
> @@ -0,0 +1,256 @@
> +/*
> + * Copyright (c) 2015, NVIDIA Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/platform_device.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/firmware.h>
> +#include <linux/pci_ids.h>
> +#include <linux/iopoll.h>
> +
> +#include "falcon.h"
> +#include "drm.h"
> +
> +#define FALCON_IDLE_TIMEOUT_US 100000
> +#define FALCON_IDLE_CHECK_PERIOD_US 10
Seems a little overkill to have these here, given that their only used
twice. They're also used in two cases where we may end up using
different periods and timeouts eventually, so I'd just stick them into
falcon_wait_idle() and falcon_dma_wait_idle() directly and omit these
definitions. That's kind of a nitpick, so feel free to keep it as-is.
> +
> +enum falcon_memory {
> + FALCON_MEMORY_IMEM,
> + FALCON_MEMORY_DATA,
> +};
> +
> +static void falcon_writel(struct falcon *falcon, u32 value, u32 offset)
> +{
> + writel(value, falcon->regs + offset);
> +}
> +
> +int falcon_wait_idle(struct falcon *falcon)
> +{
> + u32 idlestate;
> +
> + return readl_poll_timeout(falcon->regs + FALCON_IDLESTATE, idlestate,
> + (!idlestate),
That's an odd way to write this. Why not just "idlestate == 0"? That's
much easier to read and more explicit.
> + FALCON_IDLE_CHECK_PERIOD_US,
> + FALCON_IDLE_TIMEOUT_US);
> +}
> +
> +static int falcon_dma_wait_idle(struct falcon *falcon)
> +{
> + u32 dmatrfcmd;
u32 value? Naming it after a register suggests that it may be an offset
rather than the register value.
> +
> + return readl_poll_timeout(falcon->regs + FALCON_DMATRFCMD, dmatrfcmd,
> + (dmatrfcmd & FALCON_DMATRFCMD_IDLE),
> + FALCON_IDLE_CHECK_PERIOD_US,
> + FALCON_IDLE_TIMEOUT_US);
> +}
> +
> +static int falcon_copy_chunk(struct falcon *falcon,
> + phys_addr_t base,
> + unsigned long offset,
> + enum falcon_memory target)
> +{
> + u32 cmd = FALCON_DMATRFCMD_SIZE_256B;
> +
> + if (target == FALCON_MEMORY_IMEM)
> + cmd |= FALCON_DMATRFCMD_IMEM;
> +
> + falcon_writel(falcon, offset, FALCON_DMATRFMOFFS);
> + falcon_writel(falcon, base, FALCON_DMATRFFBOFFS);
> + falcon_writel(falcon, cmd, FALCON_DMATRFCMD);
> +
> + return falcon_dma_wait_idle(falcon);
> +}
> +
> +static void falcon_copy_firmware_image(struct falcon *falcon,
> + const struct firmware *firmware)
> +{
> + u32 *firmware_vaddr = falcon->firmware.vaddr;
> + size_t i;
> +
> + /* copy the whole thing taking into account endianness */
> + for (i = 0; i < firmware->size / sizeof(u32); i++)
> + firmware_vaddr[i] = le32_to_cpu(((u32 *)firmware->data)[i]);
> +
> + /* ensure that caches are flushed and falcon can see the firmware */
> + dma_sync_single_for_device(falcon->dev, virt_to_phys(firmware_vaddr),
> + falcon->firmware.size, DMA_TO_DEVICE);
My understanding is that this is an error and the DMA API will complain
about it if debugging is enabled. I think you need to call one of the
dma_map_*() functions on the memory before you can do a dma_sync_*().
> +}
> +
> +static int falcon_parse_firmware_image(struct falcon *falcon)
> +{
> + struct falcon_firmware_bin_header_v1 *bin_header =
> + (void *)falcon->firmware.vaddr;
> + struct falcon_firmware_os_header_v1 *os_header;
Can we make these shorter? Perhaps by leaving out the _header suffix?
It'd be nice if we could avoid splitting these across multiple lines.
> +
> + /* endian problems would show up right here */
> + if (bin_header->bin_magic != PCI_VENDOR_ID_NVIDIA) {
> + dev_err(falcon->dev, "failed to get firmware magic");
> + return -EINVAL;
> + }
> +
> + /* currently only version 1 is supported */
> + if (bin_header->bin_ver != 1) {
> + dev_err(falcon->dev, "unsupported firmware version");
> + return -EINVAL;
> + }
> +
> + /* check that the firmware size is consistent */
> + if (bin_header->bin_size > falcon->firmware.size) {
> + dev_err(falcon->dev, "firmware image size inconsistency");
> + return -EINVAL;
> + }
These messages are missing newlines at the end.
> +
> + os_header = (falcon->firmware.vaddr +
> + bin_header->os_bin_header_offset);
I think if we shorten the variable names a little this will also fit on
a single line. There's also no need for the parentheses.
> +
> + falcon->firmware.bin_data.size = bin_header->os_bin_size;
> + falcon->firmware.bin_data.offset = bin_header->os_bin_data_offset;
> + falcon->firmware.code.offset = os_header->os_code_offset;
> + falcon->firmware.code.size = os_header->os_code_size;
> + falcon->firmware.data.offset = os_header->os_data_offset;
> + falcon->firmware.data.size = os_header->os_data_size;
Can you remove the extra padding before =, please? It's inconsistent
with the other assignments.
> +
> + return 0;
> +}
> +
> +int falcon_read_firmware(struct falcon *falcon, const char *firmware_name)
The firmware_ prefix in firmware_name is redundant and can be dropped,
in my opinion.
> +{
> + const struct firmware *firmware;
> + int err;
> +
> + if (falcon->firmware.valid)
Do we really need the .valid field? It seems like we should be able to
derive it from the value of one of the other fields.
> + return 0;
> +
> + err = request_firmware(&firmware, firmware_name, falcon->dev);
> + if (err < 0) {
> + dev_err(falcon->dev, "failed to get firmware\n");
> + return err;
It'd be nice to show the user what error occurred. Maybe also show the
name of the firmware that couldn't be loaded.
> + }
> +
> + falcon->firmware.size = firmware->size;
> +
> + /* allocate iova space for the firmware */
> + falcon->firmware.vaddr = falcon->ops->alloc(falcon, firmware->size,
> + &falcon->firmware.paddr);
The arguments aren't properly aligned.
> + if (!falcon->firmware.vaddr) {
> + dev_err(falcon->dev, "dma memory mapping failed");
> + err = -ENOMEM;
> + goto err_alloc_firmware;
> + }
> +
> + /* copy firmware image into local area. this also ensures endianness */
> + falcon_copy_firmware_image(falcon, firmware);
> +
> + /* parse the image data */
> + err = falcon_parse_firmware_image(falcon);
> + if (err < 0) {
> + dev_err(falcon->dev, "failed to parse firmware image\n");
> + goto err_setup_firmware_image;
> + }
> +
> + falcon->firmware.valid = true;
> +
> + release_firmware(firmware);
> +
> + return 0;
> +
> +err_setup_firmware_image:
> + falcon->ops->free(falcon, falcon->firmware.size,
> + falcon->firmware.paddr, falcon->firmware.vaddr);
> +err_alloc_firmware:
> + release_firmware(firmware);
> +
> + return err;
> +}
> +
> +int falcon_init(struct falcon *falcon)
> +{
> + /* check mandatory ops */
> + if (!falcon->ops || !falcon->ops->alloc || !falcon->ops->free)
> + return -EINVAL;
> +
> + falcon->firmware.valid = false;
> +
> + return 0;
> +}
> +
> +void falcon_exit(struct falcon *falcon)
> +{
> + if (!falcon->firmware.valid)
> + return;
> +
> + falcon->ops->free(falcon, falcon->firmware.size,
> + falcon->firmware.paddr,
> + falcon->firmware.vaddr);
> + falcon->firmware.valid = false;
> +}
> +
> +int falcon_boot(struct falcon *falcon)
> +{
> + unsigned long offset;
> + int err = 0;
I don'n think it's necessary to initialize err here.
> +
> + if (!falcon->firmware.valid)
> + return -EINVAL;
> +
> + falcon_writel(falcon, 0, FALCON_DMACTL);
> +
> + /* setup the address of the binary data. Falcon can access it later */
If you write sentences with a full stop, please use a capital first
letter. In this case I think you can just drop the full stop:
/* setup the address of the binary data so Falcon can access it later */
> + falcon_writel(falcon, (falcon->firmware.paddr +
> + falcon->firmware.bin_data.offset) >> 8,
> + FALCON_DMATRFBASE);
> +
> + /* copy the data segment into Falcon internal memory */
> + for (offset = 0; offset < falcon->firmware.data.size; offset += 256)
> + falcon_copy_chunk(falcon,
> + falcon->firmware.data.offset + offset,
> + offset, FALCON_MEMORY_DATA);
> +
> + /* copy the first code segment into Falcon internal memory */
> + falcon_copy_chunk(falcon, falcon->firmware.code.offset,
> + 0, FALCON_MEMORY_IMEM);
> +
> + /* setup falcon interrupts */
> + falcon_writel(falcon, FALCON_IRQMSET_EXT(0xff) |
> + FALCON_IRQMSET_SWGEN1 |
> + FALCON_IRQMSET_SWGEN0 |
> + FALCON_IRQMSET_EXTERR |
> + FALCON_IRQMSET_HALT |
> + FALCON_IRQMSET_WDTMR,
> + FALCON_IRQMSET);
> + falcon_writel(falcon, FALCON_IRQDEST_EXT(0xff) |
> + FALCON_IRQDEST_SWGEN1 |
> + FALCON_IRQDEST_SWGEN0 |
> + FALCON_IRQDEST_EXTERR |
> + FALCON_IRQDEST_HALT,
> + FALCON_IRQDEST);
> +
> + /* enable interface */
> + falcon_writel(falcon, FALCON_ITFEN_MTHDEN |
> + FALCON_ITFEN_CTXEN,
> + FALCON_ITFEN);
> +
> + /* boot falcon */
> + falcon_writel(falcon, 0x00000000, FALCON_BOOTVEC);
> + falcon_writel(falcon, FALCON_CPUCTL_STARTCPU, FALCON_CPUCTL);
> +
> + err = falcon_wait_idle(falcon);
> + if (err < 0) {
> + dev_err(falcon->dev, "falcon boot failed due to timeout");
> + return err;
> + }
> +
> + dev_info(falcon->dev, "falcon booted");
No need to brag here, it's supposed to be successful. Let the user know
if it isn't. If you really want this, make sure it's output at debug
level, not info. Also I think we need to agree on whether these are
named Falcon or falcon and then use it consistently. I'm leaning towards
the former. Also, there are a few messages with missing newlines.
> +
> + return 0;
> +}
> +
> +void falcon_execute_method(struct falcon *falcon, u32 method, u32 data)
> +{
> + falcon_writel(falcon, method >> 2, FALCON_UCLASS_METHOD_OFFSET);
> + falcon_writel(falcon, data, FALCON_UCLASS_METHOD_DATA);
> +}
> diff --git a/drivers/gpu/drm/tegra/falcon.h b/drivers/gpu/drm/tegra/falcon.h
> new file mode 100644
> index 0000000..56284b9
> --- /dev/null
> +++ b/drivers/gpu/drm/tegra/falcon.h
> @@ -0,0 +1,130 @@
> +/*
> + * Copyright (c) 2015, NVIDIA Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef _FALCON_H_
> +#define _FALCON_H_
> +
> +#include <linux/types.h>
> +
> +#define FALCON_UCLASS_METHOD_OFFSET 0x00000040
> +
> +#define FALCON_UCLASS_METHOD_DATA 0x00000044
> +
> +#define FALCON_IRQMSET 0x00001010
> +#define FALCON_IRQMSET_WDTMR (1 << 1)
> +#define FALCON_IRQMSET_HALT (1 << 4)
> +#define FALCON_IRQMSET_EXTERR (1 << 5)
> +#define FALCON_IRQMSET_SWGEN0 (1 << 6)
> +#define FALCON_IRQMSET_SWGEN1 (1 << 7)
> +#define FALCON_IRQMSET_EXT(v) (((v) & 0xff) << 8)
> +
> +#define FALCON_IRQDEST 0x0000101c
> +#define FALCON_IRQDEST_HALT (1 << 4)
> +#define FALCON_IRQDEST_EXTERR (1 << 5)
> +#define FALCON_IRQDEST_SWGEN0 (1 << 6)
> +#define FALCON_IRQDEST_SWGEN1 (1 << 7)
> +#define FALCON_IRQDEST_EXT(v) (((v) & 0xff) << 8)
> +
> +#define FALCON_ITFEN 0x00001048
> +#define FALCON_ITFEN_CTXEN (1 << 0)
> +#define FALCON_ITFEN_MTHDEN (1 << 1)
> +
> +#define FALCON_IDLESTATE 0x0000104c
> +
> +#define FALCON_CPUCTL 0x00001100
> +#define FALCON_CPUCTL_STARTCPU (1 << 1)
> +
> +#define FALCON_BOOTVEC 0x00001104
> +
> +#define FALCON_DMACTL 0x0000110c
> +#define FALCON_DMACTL_DMEM_SCRUBBING (1 << 1)
> +#define FALCON_DMACTL_IMEM_SCRUBBING (1 << 2)
> +
> +#define FALCON_DMATRFBASE 0x00001110
> +
> +#define FALCON_DMATRFMOFFS 0x00001114
> +
> +#define FALCON_DMATRFCMD 0x00001118
> +#define FALCON_DMATRFCMD_IDLE (1 << 1)
> +#define FALCON_DMATRFCMD_IMEM (1 << 4)
> +#define FALCON_DMATRFCMD_SIZE_256B (6 << 8)
> +
> +#define FALCON_DMATRFFBOFFS 0x0000111c
> +
> +struct falcon_firmware_bin_header_v1 {
> + u32 bin_magic; /* 0x10de */
> + u32 bin_ver; /* cya, versioning of bin format (1) */
Not sure everyone understands what cya means, here. Perhaps just drop
it?
> + u32 bin_size; /* entire image size including this header */
> + u32 os_bin_header_offset;
> + u32 os_bin_data_offset;
> + u32 os_bin_size;
> +};
> +
> +struct falcon_firmware_os_app_v1 {
> + u32 offset;
> + u32 size;
> +};
> +
> +struct falcon_firmware_os_header_v1 {
> + u32 os_code_offset;
> + u32 os_code_size;
> + u32 os_data_offset;
> + u32 os_data_size;
> + u32 num_apps;
> + struct falcon_firmware_os_app_v1 *app_code;
> + struct falcon_firmware_os_app_v1 *app_data;
The above two seem to be completel unused. Should we drop them?
> + u32 *os_ovl_offset;
> + u32 *os_ovl_size;
> +};
Can we in general sanitize the names of these a little? It's redundent
to add a os_ prefix in a structure that contains an _os_ infix.
Thierry
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/8] drm/tegra: Add VIC support
2016-11-10 18:23 ` [PATCH 4/8] drm/tegra: Add VIC support Mikko Perttunen
@ 2016-12-05 19:35 ` Thierry Reding
[not found] ` <20161205193559.GD22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Thierry Reding @ 2016-12-05 19:35 UTC (permalink / raw)
To: Mikko Perttunen; +Cc: linux-tegra, Andrew Chew, Arto Merilainen, dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 14093 bytes --]
On Thu, Nov 10, 2016 at 08:23:41PM +0200, Mikko Perttunen wrote:
> From: Arto Merilainen <amerilainen@nvidia.com>
>
> This patch adds support for Video Image Compositor engine which
> can be used for 2d operations.
>
> Signed-off-by: Andrew Chew <achew@nvidia.com>
> Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> drivers/gpu/drm/tegra/Makefile | 3 +-
> drivers/gpu/drm/tegra/drm.c | 3 +
> drivers/gpu/drm/tegra/drm.h | 1 +
> drivers/gpu/drm/tegra/vic.c | 400 +++++++++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/tegra/vic.h | 31 ++++
> include/linux/host1x.h | 1 +
> 6 files changed, 438 insertions(+), 1 deletion(-)
> create mode 100644 drivers/gpu/drm/tegra/vic.c
> create mode 100644 drivers/gpu/drm/tegra/vic.h
>
> diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
> index 93e9a4a..6af3a9a 100644
> --- a/drivers/gpu/drm/tegra/Makefile
> +++ b/drivers/gpu/drm/tegra/Makefile
> @@ -14,6 +14,7 @@ tegra-drm-y := \
> dpaux.o \
> gr2d.o \
> gr3d.o \
> - falcon.o
> + falcon.o \
> + vic.o
>
> obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index ecfe7ba..707404d 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -1151,11 +1151,13 @@ static const struct of_device_id host1x_drm_subdevs[] = {
> { .compatible = "nvidia,tegra124-sor", },
> { .compatible = "nvidia,tegra124-hdmi", },
> { .compatible = "nvidia,tegra124-dsi", },
> + { .compatible = "nvidia,tegra124-vic", },
> { .compatible = "nvidia,tegra132-dsi", },
> { .compatible = "nvidia,tegra210-dc", },
> { .compatible = "nvidia,tegra210-dsi", },
> { .compatible = "nvidia,tegra210-sor", },
> { .compatible = "nvidia,tegra210-sor1", },
> + { .compatible = "nvidia,tegra210-vic", },
> { /* sentinel */ }
> };
>
> @@ -1177,6 +1179,7 @@ static struct platform_driver * const drivers[] = {
> &tegra_sor_driver,
> &tegra_gr2d_driver,
> &tegra_gr3d_driver,
> + &tegra_vic_driver,
> };
>
> static int __init host1x_drm_init(void)
> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> index f75b505..8efaaa4 100644
> --- a/drivers/gpu/drm/tegra/drm.h
> +++ b/drivers/gpu/drm/tegra/drm.h
> @@ -292,5 +292,6 @@ extern struct platform_driver tegra_dpaux_driver;
> extern struct platform_driver tegra_sor_driver;
> extern struct platform_driver tegra_gr2d_driver;
> extern struct platform_driver tegra_gr3d_driver;
> +extern struct platform_driver tegra_vic_driver;
>
> #endif /* HOST1X_DRM_H */
> diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
> new file mode 100644
> index 0000000..9eda5e5
> --- /dev/null
> +++ b/drivers/gpu/drm/tegra/vic.c
> @@ -0,0 +1,400 @@
> +/*
> + * Copyright (c) 2015, NVIDIA Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/host1x.h>
> +#include <linux/iommu.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/reset.h>
> +
> +#include <soc/tegra/pmc.h>
> +
> +#include "drm.h"
> +#include "falcon.h"
> +#include "vic.h"
> +
> +struct vic_config {
> + /* Firmware name */
> + const char *ucode_name;
Maybe just "firmware"? That way you could remove the comment.
> +};
> +
> +struct vic {
> + struct falcon falcon;
> + bool booted;
> +
> + void __iomem *regs;
> + struct tegra_drm_client client;
> + struct host1x_channel *channel;
> + struct iommu_domain *domain;
> + struct device *dev;
> + struct clk *clk;
> +
> + /* Platform configuration */
> + const struct vic_config *config;
> +};
> +
> +static inline struct vic *to_vic(struct tegra_drm_client *client)
> +{
> + return container_of(client, struct vic, client);
> +}
> +
> +static void vic_writel(struct vic *vic, u32 value, unsigned int offset)
> +{
> + writel(value, vic->regs + offset);
> +}
> +
> +static int vic_runtime_resume(struct device *dev)
> +{
> + struct vic *vic = dev_get_drvdata(dev);
> +
> + return clk_prepare_enable(vic->clk);
> +}
> +
> +static int vic_runtime_suspend(struct device *dev)
> +{
> + struct vic *vic = dev_get_drvdata(dev);
> +
> + clk_disable_unprepare(vic->clk);
> +
> + vic->booted = false;
> +
> + return 0;
> +}
> +
> +static int vic_boot(struct vic *vic)
> +{
> + u32 fce_ucode_size, fce_bin_data_offset;
> + void *hdr;
> + int err = 0;
> +
> + if (vic->booted)
> + return 0;
> +
> + if (!vic->falcon.firmware.valid) {
> + err = falcon_read_firmware(&vic->falcon,
> + vic->config->ucode_name);
> + if (err < 0)
> + return err;
> + }
> +
> + /* setup clockgating registers */
> + vic_writel(vic, CG_IDLE_CG_DLY_CNT(4) |
> + CG_IDLE_CG_EN |
> + CG_WAKEUP_DLY_CNT(4),
> + NV_PVIC_MISC_PRI_VIC_CG);
> +
> + err = falcon_boot(&vic->falcon);
> + if (err < 0)
> + return err;
> +
> + hdr = vic->falcon.firmware.vaddr;
> + fce_bin_data_offset = *(u32 *)(hdr + VIC_UCODE_FCE_DATA_OFFSET);
> + hdr = vic->falcon.firmware.vaddr +
> + *(u32 *)(hdr + VIC_UCODE_FCE_HEADER_OFFSET);
> + fce_ucode_size = *(u32 *)(hdr + FCE_UCODE_SIZE_OFFSET);
> +
> + falcon_execute_method(&vic->falcon, VIC_SET_APPLICATION_ID, 1);
> + falcon_execute_method(&vic->falcon, VIC_SET_FCE_UCODE_SIZE,
> + fce_ucode_size);
> + falcon_execute_method(&vic->falcon, VIC_SET_FCE_UCODE_OFFSET,
> + (vic->falcon.firmware.paddr + fce_bin_data_offset)
> + >> 8);
> +
> + err = falcon_wait_idle(&vic->falcon);
> + if (err < 0) {
> + dev_err(vic->dev,
> + "failed to set application ID and FCE base\n");
> + return err;
> + }
> +
> + vic->booted = true;
> +
> + return 0;
> +}
> +
> +static void *vic_falcon_alloc(struct falcon *falcon, size_t size,
> + dma_addr_t *iova)
> +{
> + struct tegra_drm *tegra = falcon->data;
> +
> + return tegra_drm_alloc(tegra, size, iova);
> +}
> +
> +static void vic_falcon_free(struct falcon *falcon, size_t size,
> + dma_addr_t iova, void *va)
> +{
> + struct tegra_drm *tegra = falcon->data;
> +
> + return tegra_drm_free(tegra, size, va, iova);
> +}
> +
> +static const struct falcon_ops vic_falcon_ops = {
> + .alloc = vic_falcon_alloc,
> + .free = vic_falcon_free
> +};
> +
> +static int vic_init(struct host1x_client *client)
> +{
> + struct tegra_drm_client *drm = host1x_to_drm_client(client);
> + struct drm_device *dev = dev_get_drvdata(client->parent);
> + struct tegra_drm *tegra = dev->dev_private;
> + struct vic *vic = to_vic(drm);
> + int err;
> +
> + if (tegra->domain) {
> + err = iommu_attach_device(tegra->domain, vic->dev);
> + if (err < 0) {
> + dev_err(vic->dev, "failed to attach to domain: %d\n",
> + err);
> + return err;
> + }
> +
> + vic->domain = tegra->domain;
> + }
> +
> + vic->falcon.dev = vic->dev;
> + vic->falcon.regs = vic->regs;
> + vic->falcon.data = tegra;
Why do we need this? We can get at struct vic * from struct falcon * via
an upcast, and once we have struct vic, we have struct tegra_drm_client.
But now I realize that apparently there's no way to get from that to the
struct drm_device. I guess we could go via vic->client.base.parent and a
call to dev_get_drvdata(), but that's rather far. It's still a little
less confusing than stashing the struct tegra_drm * in it, since it is
not at all driver-specific data.
> + vic->falcon.ops = &vic_falcon_ops;
> + err = falcon_init(&vic->falcon);
Could use a blank line between the above.
> + if (err < 0)
> + goto detach_device;
> +
> + vic->channel = host1x_channel_request(client->dev);
> + if (!vic->channel) {
> + err = -ENOMEM;
> + goto exit_falcon;
> + }
> +
> + client->syncpts[0] = host1x_syncpt_request(client->dev, 0);
> + if (!client->syncpts[0]) {
> + err = -ENOMEM;
> + goto free_channel;
> + }
> +
> + err = tegra_drm_register_client(tegra, drm);
> + if (err < 0)
> + goto free_syncpt;
> +
> + return 0;
> +
> +free_syncpt:
> + host1x_syncpt_free(client->syncpts[0]);
> +free_channel:
> + host1x_channel_free(vic->channel);
> +exit_falcon:
> + falcon_exit(&vic->falcon);
> +detach_device:
> + if (tegra->domain)
> + iommu_detach_device(tegra->domain, vic->dev);
> +
> + return err;
> +}
> +
> +static int vic_exit(struct host1x_client *client)
> +{
> + struct tegra_drm_client *drm = host1x_to_drm_client(client);
> + struct drm_device *dev = dev_get_drvdata(client->parent);
> + struct tegra_drm *tegra = dev->dev_private;
> + struct vic *vic = to_vic(drm);
> + int err;
> +
> + err = tegra_drm_unregister_client(tegra, drm);
> + if (err < 0)
> + return err;
> +
> + host1x_syncpt_free(client->syncpts[0]);
> + host1x_channel_free(vic->channel);
> +
> + falcon_exit(&vic->falcon);
> +
> + if (vic->domain) {
> + iommu_detach_device(vic->domain, vic->dev);
> + vic->domain = NULL;
> + }
> +
> + return 0;
> +}
> +
> +static const struct host1x_client_ops vic_client_ops = {
> + .init = vic_init,
> + .exit = vic_exit,
> +};
> +
> +static int vic_open_channel(struct tegra_drm_client *client,
> + struct tegra_drm_context *context)
> +{
> + struct vic *vic = to_vic(client);
> + int err;
> +
> + err = pm_runtime_get_sync(vic->dev);
> + if (err < 0)
> + return err;
> +
> + /*
> + * Try to boot the Falcon microcontroller. Booting is deferred until
> + * here because the firmware might not yet be available during system
> + * boot, for example if it's on remote storage.
> + */
That sounds like a hack. Typically you're supposed to have the firmware
on the same medium as the module that requests it. That is, if the
driver is built-in, then you need to make the firmware available as
built-in as well or stash it into an initrd. If the driver is built as a
loadable module and is on the root filesystem, that's where the firmware
should be as well. There's also the possibility to put the loadable
module into an initrd, in which case that's where the firmware should go
as well.
That said, I think it makes sense to defer the actual booting until this
point. Loading the firmware seems unappropriate to me. It really should
be done in ->probe().
> + err = vic_boot(vic);
> + if (err < 0) {
> + pm_runtime_put(vic->dev);
> + return err;
> + }
> +
> + context->channel = host1x_channel_get(vic->channel);
> + if (!context->channel) {
> + pm_runtime_put(vic->dev);
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> +static void vic_close_channel(struct tegra_drm_context *context)
> +{
> + struct vic *vic = to_vic(context->client);
> +
> + host1x_channel_put(context->channel);
> +
> + pm_runtime_put(vic->dev);
> +}
> +
> +static const struct tegra_drm_client_ops vic_ops = {
> + .open_channel = vic_open_channel,
> + .close_channel = vic_close_channel,
> + .submit = tegra_drm_submit,
> +};
> +
> +static const struct vic_config vic_t124_config = {
> + .ucode_name = "nvidia/tegra124/vic03_ucode.bin",
> +};
> +
> +static const struct vic_config vic_t210_config = {
> + .ucode_name = "nvidia/tegra210/vic04_ucode.bin",
> +};
> +
> +static const struct of_device_id vic_match[] = {
> + { .compatible = "nvidia,tegra124-vic", .data = &vic_t124_config },
> + { .compatible = "nvidia,tegra210-vic", .data = &vic_t210_config },
> + { },
> +};
> +
> +static int vic_probe(struct platform_device *pdev)
> +{
> + struct vic_config *vic_config = NULL;
> + struct device *dev = &pdev->dev;
> + struct host1x_syncpt **syncpts;
> + struct resource *regs;
> + const struct of_device_id *match;
> + struct vic *vic;
> + int err;
> +
> + match = of_match_device(vic_match, dev);
> + vic_config = (struct vic_config *)match->data;
> +
> + vic = devm_kzalloc(dev, sizeof(*vic), GFP_KERNEL);
> + if (!vic)
> + return -ENOMEM;
> +
> + syncpts = devm_kzalloc(dev, sizeof(*syncpts), GFP_KERNEL);
> + if (!syncpts)
> + return -ENOMEM;
> +
> + regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + if (!regs) {
> + dev_err(&pdev->dev, "failed to get registers\n");
> + return -ENXIO;
> + }
> +
> + vic->regs = devm_ioremap_resource(dev, regs);
> + if (IS_ERR(vic->regs))
> + return PTR_ERR(vic->regs);
> +
> + vic->clk = devm_clk_get(dev, NULL);
> + if (IS_ERR(vic->clk)) {
> + dev_err(&pdev->dev, "failed to get clock\n");
> + return PTR_ERR(vic->clk);
> + }
> +
> + platform_set_drvdata(pdev, vic);
> +
> + INIT_LIST_HEAD(&vic->client.base.list);
> + vic->client.base.ops = &vic_client_ops;
> + vic->client.base.dev = dev;
> + vic->client.base.class = HOST1X_CLASS_VIC;
> + vic->client.base.syncpts = syncpts;
> + vic->client.base.num_syncpts = 1;
> + vic->dev = dev;
> + vic->config = vic_config;
> +
> + INIT_LIST_HEAD(&vic->client.list);
> + vic->client.ops = &vic_ops;
> +
> + err = host1x_client_register(&vic->client.base);
> + if (err < 0) {
> + dev_err(dev, "failed to register host1x client: %d\n", err);
> + platform_set_drvdata(pdev, NULL);
> + return err;
> + }
> +
> + pm_runtime_enable(&pdev->dev);
> + if (!pm_runtime_enabled(&pdev->dev)) {
> + err = vic_runtime_resume(&pdev->dev);
> + if (err < 0)
> + goto unregister_client;
> + }
That's a slightly ugly construct, but I can't think of an easy way to
get rid of it (other than to depend on PM, which actually might be the
right thing to do here, it's already selected by ARCH_TEGRA on 64-bit
ARM). I can't think of a scenario where we'd want to run without runtime
PM.
> +
> + dev_info(&pdev->dev, "initialized");
Again, no need for this.
Thierry
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 5/8] gpu: host1x: Add IOMMU support
2016-11-10 18:23 ` [PATCH 5/8] gpu: host1x: Add IOMMU support Mikko Perttunen
@ 2016-12-05 19:49 ` Thierry Reding
[not found] ` <20161205194924.GE22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Thierry Reding @ 2016-12-05 19:49 UTC (permalink / raw)
To: Mikko Perttunen; +Cc: linux-tegra, dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 11310 bytes --]
On Thu, Nov 10, 2016 at 08:23:42PM +0200, Mikko Perttunen wrote:
> Add support for the Host1x unit to be located behind
> an IOMMU. This is required when gather buffers may be
> allocated non-contiguously in physical memory, as can
> be the case when TegraDRM is also using the IOMMU.
>
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> drivers/gpu/host1x/cdma.c | 65 +++++++++++++++++++++++++++++------
> drivers/gpu/host1x/cdma.h | 4 ++-
> drivers/gpu/host1x/dev.c | 39 +++++++++++++++++++--
> drivers/gpu/host1x/dev.h | 5 +++
> drivers/gpu/host1x/hw/cdma_hw.c | 10 +++---
> drivers/gpu/host1x/job.c | 76 +++++++++++++++++++++++++++++++++++------
> drivers/gpu/host1x/job.h | 1 +
> 7 files changed, 171 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
> index c5d82a8..14692446e 100644
> --- a/drivers/gpu/host1x/cdma.c
> +++ b/drivers/gpu/host1x/cdma.c
> @@ -51,9 +51,15 @@ static void host1x_pushbuffer_destroy(struct push_buffer *pb)
> struct host1x_cdma *cdma = pb_to_cdma(pb);
> struct host1x *host1x = cdma_to_host1x(cdma);
>
> - if (pb->phys != 0)
> - dma_free_wc(host1x->dev, pb->size_bytes + 4, pb->mapped,
> - pb->phys);
> + if (!pb->phys)
> + return;
> +
> + if (host1x->domain) {
> + iommu_unmap(host1x->domain, pb->dma, pb->alloc_size_bytes);
> + free_iova(&host1x->iova, iova_pfn(&host1x->iova, pb->dma));
> + }
> +
> + dma_free_wc(host1x->dev, pb->alloc_size_bytes, pb->mapped, pb->phys);
>
> pb->mapped = NULL;
> pb->phys = 0;
> @@ -66,28 +72,65 @@ static int host1x_pushbuffer_init(struct push_buffer *pb)
> {
> struct host1x_cdma *cdma = pb_to_cdma(pb);
> struct host1x *host1x = cdma_to_host1x(cdma);
> + struct iova *alloc;
> + int err;
>
> pb->mapped = NULL;
> pb->phys = 0;
> pb->size_bytes = HOST1X_PUSHBUFFER_SLOTS * 8;
> + pb->alloc_size_bytes = pb->size_bytes + 4;
>
> /* initialize buffer pointers */
> pb->fence = pb->size_bytes - 8;
> pb->pos = 0;
>
> - /* allocate and map pushbuffer memory */
> - pb->mapped = dma_alloc_wc(host1x->dev, pb->size_bytes + 4, &pb->phys,
> - GFP_KERNEL);
> - if (!pb->mapped)
> - goto fail;
> + if (host1x->domain) {
> + unsigned long shift;
> +
> + pb->alloc_size_bytes = iova_align(&host1x->iova,
> + pb->alloc_size_bytes);
> +
> + pb->mapped = dma_alloc_wc(host1x->dev, pb->alloc_size_bytes,
> + &pb->phys, GFP_KERNEL);
> + if (!pb->mapped)
> + return -ENOMEM;
> +
> + shift = iova_shift(&host1x->iova);
> + alloc = alloc_iova(
> + &host1x->iova, pb->alloc_size_bytes >> shift,
> + host1x->domain->geometry.aperture_end >> shift, true);
Let's precompute some of these values to make this expression more
compact. For example pb->alloc_size_bytes is used a couple of times, so
it could be stored temporarily in a local variable with a nice and short
name such as "size".
If you store the aperture end, or limit_pfn, as I commented in an
earlier patch, then this also becomes much more compact.
> + if (!alloc) {
> + err = -ENOMEM;
> + goto iommu_free_mem;
> + }
> +
> + err = iommu_map(host1x->domain,
> + iova_dma_addr(&host1x->iova, alloc),
> + pb->phys, pb->alloc_size_bytes,
> + IOMMU_READ);
Same here.
> + if (err)
> + goto iommu_free_iova;
> +
> + pb->dma = iova_dma_addr(&host1x->iova, alloc);
> + } else {
> + pb->mapped = dma_alloc_wc(host1x->dev, pb->alloc_size_bytes,
> + &pb->phys, GFP_KERNEL);
> + if (!pb->mapped)
> + return -ENOMEM;
> +
> + pb->dma = pb->phys;
> + }
>
> host1x_hw_pushbuffer_init(host1x, pb);
>
> return 0;
>
> -fail:
> - host1x_pushbuffer_destroy(pb);
> - return -ENOMEM;
> +iommu_free_iova:
> + __free_iova(&host1x->iova, alloc);
> +iommu_free_mem:
> + dma_free_wc(host1x->dev, pb->alloc_size_bytes, pb->mapped, pb->phys);
> +
> + return err;
> }
>
> /*
> diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h
> index 470087a..4b3645f 100644
> --- a/drivers/gpu/host1x/cdma.h
> +++ b/drivers/gpu/host1x/cdma.h
> @@ -43,10 +43,12 @@ struct host1x_job;
>
> struct push_buffer {
> void *mapped; /* mapped pushbuffer memory */
> - dma_addr_t phys; /* physical address of pushbuffer */
> + dma_addr_t dma; /* device address of pushbuffer */
> + phys_addr_t phys; /* physical address of pushbuffer */
> u32 fence; /* index we've written */
> u32 pos; /* index to write to */
> u32 size_bytes;
> + u32 alloc_size_bytes;
Maybe just "alloc_size"?
> };
>
> struct buffer_timeout {
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index a62317a..ca4527e 100644
> --- a/drivers/gpu/host1x/dev.c
> +++ b/drivers/gpu/host1x/dev.c
> @@ -168,16 +168,36 @@ static int host1x_probe(struct platform_device *pdev)
> return err;
> }
>
> + if (iommu_present(&platform_bus_type)) {
> + struct iommu_domain_geometry *geometry;
> + unsigned long order;
> +
> + host->domain = iommu_domain_alloc(&platform_bus_type);
> + if (!host->domain)
> + return -ENOMEM;
> +
> + err = iommu_attach_device(host->domain, &pdev->dev);
> + if (err)
> + goto fail_free_domain;
> +
> + geometry = &host->domain->geometry;
> +
> + order = __ffs(host->domain->pgsize_bitmap);
> + init_iova_domain(&host->iova, 1UL << order,
> + geometry->aperture_start >> order,
> + geometry->aperture_end >> order);
> + }
> +
> err = host1x_channel_list_init(host);
> if (err) {
> dev_err(&pdev->dev, "failed to initialize channel list\n");
> - return err;
> + goto fail_detach_device;
> }
>
> err = clk_prepare_enable(host->clk);
> if (err < 0) {
> dev_err(&pdev->dev, "failed to enable clock\n");
> - return err;
> + goto fail_detach_device;
> }
>
> err = host1x_syncpt_init(host);
> @@ -206,6 +226,15 @@ static int host1x_probe(struct platform_device *pdev)
> host1x_syncpt_deinit(host);
> fail_unprepare_disable:
> clk_disable_unprepare(host->clk);
> +fail_detach_device:
> + if (host->domain) {
> + put_iova_domain(&host->iova);
> + iommu_detach_device(host->domain, &pdev->dev);
> + }
> +fail_free_domain:
> + if (host->domain)
> + iommu_domain_free(host->domain);
> +
> return err;
> }
>
> @@ -218,6 +247,12 @@ static int host1x_remove(struct platform_device *pdev)
> host1x_syncpt_deinit(host);
> clk_disable_unprepare(host->clk);
>
> + if (host->domain) {
> + put_iova_domain(&host->iova);
> + iommu_detach_device(host->domain, &pdev->dev);
> + iommu_domain_free(host->domain);
> + }
> +
> return 0;
> }
>
> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
> index 5220510..6189825 100644
> --- a/drivers/gpu/host1x/dev.h
> +++ b/drivers/gpu/host1x/dev.h
> @@ -19,6 +19,8 @@
>
> #include <linux/platform_device.h>
> #include <linux/device.h>
> +#include <linux/iommu.h>
> +#include <linux/iova.h>
>
> #include "channel.h"
> #include "syncpt.h"
> @@ -108,6 +110,9 @@ struct host1x {
> struct device *dev;
> struct clk *clk;
>
> + struct iommu_domain *domain;
> + struct iova_domain iova;
> +
> struct mutex intr_mutex;
> int intr_syncpt_irq;
>
> diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
> index 659c1bb..b8c0348 100644
> --- a/drivers/gpu/host1x/hw/cdma_hw.c
> +++ b/drivers/gpu/host1x/hw/cdma_hw.c
> @@ -55,7 +55,7 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr,
> *(p++) = HOST1X_OPCODE_NOP;
> *(p++) = HOST1X_OPCODE_NOP;
> dev_dbg(host1x->dev, "%s: NOP at %pad+%#x\n", __func__,
> - &pb->phys, getptr);
> + &pb->dma, getptr);
> getptr = (getptr + 8) & (pb->size_bytes - 1);
> }
>
> @@ -78,9 +78,9 @@ static void cdma_start(struct host1x_cdma *cdma)
> HOST1X_CHANNEL_DMACTRL);
>
> /* set base, put and end pointer */
> - host1x_ch_writel(ch, cdma->push_buffer.phys, HOST1X_CHANNEL_DMASTART);
> + host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
> host1x_ch_writel(ch, cdma->push_buffer.pos, HOST1X_CHANNEL_DMAPUT);
> - host1x_ch_writel(ch, cdma->push_buffer.phys +
> + host1x_ch_writel(ch, cdma->push_buffer.dma +
> cdma->push_buffer.size_bytes + 4,
> HOST1X_CHANNEL_DMAEND);
>
> @@ -115,8 +115,8 @@ static void cdma_timeout_restart(struct host1x_cdma *cdma, u32 getptr)
> HOST1X_CHANNEL_DMACTRL);
>
> /* set base, end pointer (all of memory) */
> - host1x_ch_writel(ch, cdma->push_buffer.phys, HOST1X_CHANNEL_DMASTART);
> - host1x_ch_writel(ch, cdma->push_buffer.phys +
> + host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
> + host1x_ch_writel(ch, cdma->push_buffer.dma +
> cdma->push_buffer.size_bytes,
> HOST1X_CHANNEL_DMAEND);
>
> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
> index a91b7c4..222d930 100644
> --- a/drivers/gpu/host1x/job.c
> +++ b/drivers/gpu/host1x/job.c
> @@ -174,8 +174,9 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host,
> return 0;
> }
>
> -static unsigned int pin_job(struct host1x_job *job)
> +static unsigned int pin_job(struct host1x *host, struct host1x_job *job)
> {
> + int err;
> unsigned int i;
Maybe use an inverse christmas tree for variables as in other parts of
this driver? That is, add the new int err belowe the existing unsined
int i.
>
> job->num_unpins = 0;
> @@ -186,12 +187,16 @@ static unsigned int pin_job(struct host1x_job *job)
> dma_addr_t phys_addr;
>
> reloc->target.bo = host1x_bo_get(reloc->target.bo);
> - if (!reloc->target.bo)
> + if (!reloc->target.bo) {
> + err = -EINVAL;
> goto unpin;
> + }
>
> phys_addr = host1x_bo_pin(reloc->target.bo, &sgt);
> - if (!phys_addr)
> + if (!phys_addr) {
> + err = -EINVAL;
> goto unpin;
> + }
>
> job->addr_phys[job->num_unpins] = phys_addr;
> job->unpins[job->num_unpins].bo = reloc->target.bo;
> @@ -204,25 +209,68 @@ static unsigned int pin_job(struct host1x_job *job)
> struct sg_table *sgt;
> dma_addr_t phys_addr;
>
> + size_t gather_size = 0;
> + struct scatterlist *sg;
> + struct iova *alloc;
> + unsigned long shift;
> + int j;
There should be no blank line between blocks of variable declarations.
Also, make j an unsigned int.
> +
> g->bo = host1x_bo_get(g->bo);
> - if (!g->bo)
> + if (!g->bo) {
> + err = -EINVAL;
> goto unpin;
> + }
>
> phys_addr = host1x_bo_pin(g->bo, &sgt);
> - if (!phys_addr)
> + if (!phys_addr) {
> + err = -EINVAL;
> goto unpin;
> + }
> +
> + if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && host->domain) {
> + for_each_sg(sgt->sgl, sg, sgt->nents, j) {
> + gather_size += sg->length;
> + }
No need for curly brackets here.
> + gather_size = iova_align(&host->iova, gather_size);
> +
> + shift = iova_shift(&host->iova);
> + alloc = alloc_iova(
> + &host->iova, gather_size >> shift,
> + host->domain->geometry.aperture_end >> shift,
> + true);
This could be made a little more compact as well.
Thierry
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] Host1x IOMMU support + VIC support
2016-11-10 18:23 [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
` (4 preceding siblings ...)
2016-11-10 18:23 ` [PATCH 6/8] dt-bindings: Add bindings for the Tegra VIC Mikko Perttunen
@ 2016-12-05 19:51 ` Thierry Reding
[not found] ` <20161205195131.GF22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
5 siblings, 1 reply; 25+ messages in thread
From: Thierry Reding @ 2016-12-05 19:51 UTC (permalink / raw)
To: Mikko Perttunen; +Cc: linux-tegra, dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 1533 bytes --]
On Thu, Nov 10, 2016 at 08:23:37PM +0200, Mikko Perttunen wrote:
> This series adds IOMMU support to Host1x and TegraDRM
> and adds support for the VIC host1x client so that
> host1x can be tested on modern Tegra platforms.
> It depends on the previous fix series. The whole thing
> (modulo patch order) is available as a git repository at
> git://github.com/cyndis/linux.git; branch vic-v1.
>
> IO memory management is organized such that there are
> two domains: the host1x domain and the tegradrm domain.
> The host1x domain is used by the host1x engine and
> contains the host1x CDMA and pushbuffers for submitted
> jobs.
>
> The tegradrm domain is shared by all host1x units and
> contains GEM objects and memory allocated by the
> separate tegra_drm_alloc function. This function is
> currently used to allocate space for firmware blobs
> in the tegradrm domain.
>
> A userspace test case for VIC can be found at
> https://github.com/cyndis/drm/tree/work/tegra.
> The testcase is in tests/tegra and is called submit_vic.
> The in-kernel firewall is not implemented for VIC;
> therefore, IOMMU must be enabled for the test to pass.
>
> Tested with Jetson TX1 (T210). Probably works also
> with Jetson TK1 (T124). Note that due to hardware changes
> the testcase also needs to be changed to run properly
> on T124.
What's the scope of the changes required for Tegra124? If we add the
kernel bits for Tegra124 we should also have a userspace test program to
exercise it.
Thanks,
Thierry
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] Host1x IOMMU support + VIC support
[not found] ` <20161205195131.GF22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
@ 2016-12-05 19:58 ` Mikko Perttunen
2016-12-06 7:14 ` Daniel Vetter
1 sibling, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-05 19:58 UTC (permalink / raw)
To: Thierry Reding, Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c
On 12/05/2016 09:51 PM, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:37PM +0200, Mikko Perttunen wrote:
>> This series adds IOMMU support to Host1x and TegraDRM
>> and adds support for the VIC host1x client so that
>> host1x can be tested on modern Tegra platforms.
>> It depends on the previous fix series. The whole thing
>> (modulo patch order) is available as a git repository at
>> git://github.com/cyndis/linux.git; branch vic-v1.
>>
>> IO memory management is organized such that there are
>> two domains: the host1x domain and the tegradrm domain.
>> The host1x domain is used by the host1x engine and
>> contains the host1x CDMA and pushbuffers for submitted
>> jobs.
>>
>> The tegradrm domain is shared by all host1x units and
>> contains GEM objects and memory allocated by the
>> separate tegra_drm_alloc function. This function is
>> currently used to allocate space for firmware blobs
>> in the tegradrm domain.
>>
>> A userspace test case for VIC can be found at
>> https://github.com/cyndis/drm/tree/work/tegra.
>> The testcase is in tests/tegra and is called submit_vic.
>> The in-kernel firewall is not implemented for VIC;
>> therefore, IOMMU must be enabled for the test to pass.
>>
>> Tested with Jetson TX1 (T210). Probably works also
>> with Jetson TK1 (T124). Note that due to hardware changes
>> the testcase also needs to be changed to run properly
>> on T124.
>
> What's the scope of the changes required for Tegra124? If we add the
> kernel bits for Tegra124 we should also have a userspace test program to
> exercise it.
There should be no change needed to the kernel driver for T124. For the
userspace test, removing the one or two topmost commits in my repo will
put the testcase to a form working on T124 - I was lazy and just
modified the original file. Should be reasonably easy to have the test
case support both chips, though.
>
> Thanks,
> Thierry
>
Cheers,
Mikko.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] Host1x IOMMU support + VIC support
[not found] ` <20161205195131.GF22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
2016-12-05 19:58 ` Mikko Perttunen
@ 2016-12-06 7:14 ` Daniel Vetter
[not found] ` <20161206071450.uedxvvtcxbkwsoza-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
1 sibling, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2016-12-06 7:14 UTC (permalink / raw)
To: Thierry Reding
Cc: Mikko Perttunen, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On Mon, Dec 05, 2016 at 08:51:31PM +0100, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:37PM +0200, Mikko Perttunen wrote:
> > This series adds IOMMU support to Host1x and TegraDRM
> > and adds support for the VIC host1x client so that
> > host1x can be tested on modern Tegra platforms.
> > It depends on the previous fix series. The whole thing
> > (modulo patch order) is available as a git repository at
> > git://github.com/cyndis/linux.git; branch vic-v1.
> >
> > IO memory management is organized such that there are
> > two domains: the host1x domain and the tegradrm domain.
> > The host1x domain is used by the host1x engine and
> > contains the host1x CDMA and pushbuffers for submitted
> > jobs.
> >
> > The tegradrm domain is shared by all host1x units and
> > contains GEM objects and memory allocated by the
> > separate tegra_drm_alloc function. This function is
> > currently used to allocate space for firmware blobs
> > in the tegradrm domain.
> >
> > A userspace test case for VIC can be found at
> > https://github.com/cyndis/drm/tree/work/tegra.
> > The testcase is in tests/tegra and is called submit_vic.
> > The in-kernel firewall is not implemented for VIC;
> > therefore, IOMMU must be enabled for the test to pass.
> >
> > Tested with Jetson TX1 (T210). Probably works also
> > with Jetson TK1 (T124). Note that due to hardware changes
> > the testcase also needs to be changed to run properly
> > on T124.
>
> What's the scope of the changes required for Tegra124? If we add the
> kernel bits for Tegra124 we should also have a userspace test program to
> exercise it.
Please note that just a test program is not really enough to add more uabi
(and VIC support here maybe classifies at that, no idea how this all
works). Aka does mesa run, and for the vic stuff, does some libva/libvdpau
open source implementation run on this?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] Host1x IOMMU support + VIC support
[not found] ` <20161206071450.uedxvvtcxbkwsoza-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2016-12-06 9:33 ` Mikko Perttunen
2016-12-07 10:36 ` Daniel Vetter
0 siblings, 1 reply; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-06 9:33 UTC (permalink / raw)
To: Daniel Vetter, Thierry Reding
Cc: Mikko Perttunen, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 12/06/2016 09:14 AM, Daniel Vetter wrote:
> On Mon, Dec 05, 2016 at 08:51:31PM +0100, Thierry Reding wrote:
>> On Thu, Nov 10, 2016 at 08:23:37PM +0200, Mikko Perttunen wrote:
>>> This series adds IOMMU support to Host1x and TegraDRM
>>> and adds support for the VIC host1x client so that
>>> host1x can be tested on modern Tegra platforms.
>>> It depends on the previous fix series. The whole thing
>>> (modulo patch order) is available as a git repository at
>>> git://github.com/cyndis/linux.git; branch vic-v1.
>>>
>>> IO memory management is organized such that there are
>>> two domains: the host1x domain and the tegradrm domain.
>>> The host1x domain is used by the host1x engine and
>>> contains the host1x CDMA and pushbuffers for submitted
>>> jobs.
>>>
>>> The tegradrm domain is shared by all host1x units and
>>> contains GEM objects and memory allocated by the
>>> separate tegra_drm_alloc function. This function is
>>> currently used to allocate space for firmware blobs
>>> in the tegradrm domain.
>>>
>>> A userspace test case for VIC can be found at
>>> https://github.com/cyndis/drm/tree/work/tegra.
>>> The testcase is in tests/tegra and is called submit_vic.
>>> The in-kernel firewall is not implemented for VIC;
>>> therefore, IOMMU must be enabled for the test to pass.
>>>
>>> Tested with Jetson TX1 (T210). Probably works also
>>> with Jetson TK1 (T124). Note that due to hardware changes
>>> the testcase also needs to be changed to run properly
>>> on T124.
>>
>> What's the scope of the changes required for Tegra124? If we add the
>> kernel bits for Tegra124 we should also have a userspace test program to
>> exercise it.
>
> Please note that just a test program is not really enough to add more uabi
> (and VIC support here maybe classifies at that, no idea how this all
> works). Aka does mesa run, and for the vic stuff, does some libva/libvdpau
> open source implementation run on this?
> -Daniel
>
VIC is just an image compositor so if we wanted to use it alone we would
probably need to implement X11 2D acceleration (at least nothing else
comes to mind), which isn't that useful nowadays, since nouveau works
too. I'm internally working on NVDEC (the video decoder) at the moment -
if/when I'm allowed to upstream that, I'll provide more extensive
userspace for decoding video. This would most likely be a libva or
libvdpau backend; I have already prototyped these. This will also most
likely use VIC for image format conversions.
As for the "more UABI" question; the UABI used here already exists in
the kernel and libdrm. The same UABI can be used for any Host1x unit
without modifications; we have previously exercised it on older Tegra
revisions with different units. The only thing that changes is a new
enumeration value is added for the new unit.
Is that reasonable?
Thanks,
Mikko.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/8] drm/tegra: Add Tegra DRM allocation API
[not found] ` <20161205183726.GA22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
@ 2016-12-07 9:23 ` Mikko Perttunen
0 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-07 9:23 UTC (permalink / raw)
To: Thierry Reding, Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c
On 05.12.2016 20:37, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:38PM +0200, Mikko Perttunen wrote:
>> Add a new IO virtual memory allocation API to allow clients to
>> allocate non-GEM memory in the Tegra DRM IOMMU domain. This is
>> required e.g. for loading client firmware when clients are attached
>> to the IOMMU domain.
>>
>> The allocator allocates contiguous physical pages that are then
>> mapped contiguously to the IOMMU domain using the iova_domain
>> library provided by the kernel. Contiguous physical pages are
>> used so that the same allocator works also when IOMMU support
>> is disabled and therefore devices access physical memory directly.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>> drivers/gpu/drm/tegra/drm.c | 98 ++++++++++++++++++++++++++++++++++++++++++---
>> drivers/gpu/drm/tegra/drm.h | 7 ++++
>> 2 files changed, 100 insertions(+), 5 deletions(-)
>
> This looks mostly good, just a few comments below...
>
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index b8be3ee..ecfe7ba 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -1,12 +1,13 @@
>> /*
>> * Copyright (C) 2012 Avionic Design GmbH
>> - * Copyright (C) 2012-2013 NVIDIA CORPORATION. All rights reserved.
>> + * Copyright (C) 2012-2015 NVIDIA CORPORATION. All rights reserved.
>
> It's almost 2017 now, I think this is out of date. =)
>
>> *
>> * This program is free software; you can redistribute it and/or modify
>> * it under the terms of the GNU General Public License version 2 as
>> * published by the Free Software Foundation.
>> */
>>
>> +#include <linux/bitops.h>
>> #include <linux/host1x.h>
>> #include <linux/iommu.h>
>>
>> @@ -23,6 +24,8 @@
>> #define DRIVER_MINOR 0
>> #define DRIVER_PATCHLEVEL 0
>>
>> +#define IOVA_AREA_SZ (1024 * 1024 * 64) /* 64 MiB */
>
> SZ_64M?
Fixed.
>
>> +
>> struct tegra_drm_file {
>> struct list_head contexts;
>> };
>> @@ -127,7 +130,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>>
>> if (iommu_present(&platform_bus_type)) {
>> struct iommu_domain_geometry *geometry;
>> - u64 start, end;
>> + unsigned long order;
>> + u64 iova_start, start, end;
>
> Can we be a little more explicit and keep an additional iova_end to
> simplify the code a little?
Good idea. I replaced this with gem_{start,end} and carveout_{start,end}
and it looks nicer.
>
>> tegra->domain = iommu_domain_alloc(&platform_bus_type);
>> if (!tegra->domain) {
>> @@ -138,10 +142,17 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>> geometry = &tegra->domain->geometry;
>> start = geometry->aperture_start;
>> end = geometry->aperture_end;
>> + iova_start = end - IOVA_AREA_SZ + 1;
>> +
>> + order = __ffs(tegra->domain->pgsize_bitmap);
>> + init_iova_domain(&tegra->iova, 1UL << order,
>> + iova_start >> order,
>> + end >> order);
>
> I hadn't seen this IOVA domain thing and it looks rather handy.
Yes, definitely nicer than rolling by hand. Could be a bit easier still
though :) Too many "<< order" everywhere.
>
>>
>> - DRM_DEBUG_DRIVER("IOMMU aperture initialized (%#llx-%#llx)\n",
>> - start, end);
>> - drm_mm_init(&tegra->mm, start, end - start + 1);
>> + drm_mm_init(&tegra->mm, start, iova_start - start);
>> +
>> + DRM_DEBUG("IOMMU apertures initialized (GEM: %#llx-%#llx, IOVA: %#llx-%#llx)\n",
>> + start, iova_start-1, iova_start, end);
>
> Might be worth splitting this into two, or perhaps even three lines:
>
> IOMMU apertures:
> GEM: %#llx-%#llx
> IOVA: %#llx-%#llx
>
> Maybe also find a better name for IOVA, since GEM will be using I/O
> virtual addresses as well. Perhaps just name it "carveout" or something
> like that?
Done and renamed to carveout.
>
>> }
>>
>> mutex_init(&tegra->clients_lock);
>> @@ -208,6 +219,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>> if (tegra->domain) {
>> iommu_domain_free(tegra->domain);
>> drm_mm_takedown(&tegra->mm);
>> + put_iova_domain(&tegra->iova);
>> }
>> free:
>> kfree(tegra);
>> @@ -232,6 +244,7 @@ static int tegra_drm_unload(struct drm_device *drm)
>> if (tegra->domain) {
>> iommu_domain_free(tegra->domain);
>> drm_mm_takedown(&tegra->mm);
>> + put_iova_domain(&tegra->iova);
>> }
>>
>> kfree(tegra);
>> @@ -975,6 +988,81 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
>> return 0;
>> }
>>
>> +void *tegra_drm_alloc(struct tegra_drm *tegra, size_t size,
>> + dma_addr_t *iova)
>
> Maybe name the iova parameter phys? iova implies that it's always
> translated, whereas according to the code it can be physical, too.
How about something like "dma"? I find the use of "phys" for device
addresses slightly confusing since sometimes they are and sometimes not
and it's hard to say at a glance if the code in question actually
assumes that the addresses are physical or not. "dma" should convey the
meaning of "address used by device" a bit better.
>
>> +{
>> + struct iova *alloc;
>> + unsigned long shift;
>> + void *virt;
>> + gfp_t gfp;
>> + int err;
>> +
>> + if (tegra->domain)
>> + size = iova_align(&tegra->iova, size);
>> + else
>> + size = PAGE_ALIGN(size);
>> +
>> + gfp = GFP_KERNEL | __GFP_ZERO;
>> + if (!tegra->domain) {
>> + /*
>> + * Tegra210 has 64-bit physical addresses but units only support
>> + * 32-bit addresses, so if we don't have an IOMMU to do
>> + * translation, force allocations to be in the 32-bit range.
>> + */
>> + gfp |= GFP_DMA;
>
> Technically we do support Tegra132 and that already has 64-bit physical
> addresses, so perhaps include that in the comment, or make it more
> generic:
>
> /*
> * Many units only support 32-bit addresses, even on 64-bit SoCs.
> * If there is no IOMMU to translate into a 32-bit IO virtual address
> * address space, force allocations to be in the lower 32-bit range.
> */
>
Will fix.
>> + }
>> +
>> + virt = (void *)__get_free_pages(gfp, get_order(size));
>> + if (!virt)
>> + return NULL;
>
> Perhaps we want to return an ERR_PTR()-encoded error code here so we can
> differentiate between out-of-memory and out-of-virtual-address-space
> conditions in the caller? Or perhaps other conditions as well.
Sure.
>
> Also, is __get_free_pages() the right API here? I thought that it always
> returned contiguous memory, which, in the presence of an IOMMU, will be
> completely unnecessary.
That is true. However, this API is intended only for a few small
allocations is done by the kernel, so it shouldn't be that bad. And
sources on the internet say that vmalloc should only be used when
absolutely necessary :)
>
>> +
>> + if (!tegra->domain) {
>> + /*
>> + * If IOMMU is disabled, devices address physical memory
>> + * directly.
>> + */
>> + *iova = virt_to_phys(virt);
>> + return virt;
>> + }
>> +
>> + shift = iova_shift(&tegra->iova);
>> + alloc = alloc_iova(&tegra->iova, size >> shift,
>> + tegra->domain->geometry.aperture_end >> shift, true);
>
> This is fairly cumbersome to use. Maybe a drm_mm would be easier in this
> case? If not, could we at least store the upper limit of the carveout so
> that we don't have to use the really long expression above?
Stored shift and aperture_end >> shift.
>
> Thierry
>
Thanks,
Mikko.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/8] drm/tegra: Allocate BOs from lower 4G when without IOMMU
[not found] ` <20161205183955.GB22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
@ 2016-12-07 9:32 ` Mikko Perttunen
0 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-07 9:32 UTC (permalink / raw)
To: Thierry Reding, Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c
On 05.12.2016 20:39, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:39PM +0200, Mikko Perttunen wrote:
>> On 64-bit Tegras, buffer object memory allocation may
>> return memory above 4G that units behind Host1x cannot
>> access. Add the GFP_DMA flag to these allocation when
>> IOMMU is not enabled to ensure units can always access
>> BO memory.
>
> I don't think that's entirely true. The display controller can address
> more than 32 bits. It's perhaps slightly different from other units that
> are behind host1x, but maybe we need a special case here to allow frame-
> buffers to reside in higher memory?
I checked and looks like all (or at least most) units can actually
address 34 bits /almost always/. The thing that tripped me up was that
apparently the firmware for VIC still needs be in the lower 32 bits. So
we should be able to drop this patch.
>
> Thierry
>
Thanks,
Mikko.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] Host1x IOMMU support + VIC support
2016-12-06 9:33 ` Mikko Perttunen
@ 2016-12-07 10:36 ` Daniel Vetter
0 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2016-12-07 10:36 UTC (permalink / raw)
To: Mikko Perttunen; +Cc: linux-tegra, dri-devel, Mikko Perttunen
On Tue, Dec 06, 2016 at 11:33:11AM +0200, Mikko Perttunen wrote:
>
>
> On 12/06/2016 09:14 AM, Daniel Vetter wrote:
> > On Mon, Dec 05, 2016 at 08:51:31PM +0100, Thierry Reding wrote:
> > > On Thu, Nov 10, 2016 at 08:23:37PM +0200, Mikko Perttunen wrote:
> > > > This series adds IOMMU support to Host1x and TegraDRM
> > > > and adds support for the VIC host1x client so that
> > > > host1x can be tested on modern Tegra platforms.
> > > > It depends on the previous fix series. The whole thing
> > > > (modulo patch order) is available as a git repository at
> > > > git://github.com/cyndis/linux.git; branch vic-v1.
> > > >
> > > > IO memory management is organized such that there are
> > > > two domains: the host1x domain and the tegradrm domain.
> > > > The host1x domain is used by the host1x engine and
> > > > contains the host1x CDMA and pushbuffers for submitted
> > > > jobs.
> > > >
> > > > The tegradrm domain is shared by all host1x units and
> > > > contains GEM objects and memory allocated by the
> > > > separate tegra_drm_alloc function. This function is
> > > > currently used to allocate space for firmware blobs
> > > > in the tegradrm domain.
> > > >
> > > > A userspace test case for VIC can be found at
> > > > https://github.com/cyndis/drm/tree/work/tegra.
> > > > The testcase is in tests/tegra and is called submit_vic.
> > > > The in-kernel firewall is not implemented for VIC;
> > > > therefore, IOMMU must be enabled for the test to pass.
> > > >
> > > > Tested with Jetson TX1 (T210). Probably works also
> > > > with Jetson TK1 (T124). Note that due to hardware changes
> > > > the testcase also needs to be changed to run properly
> > > > on T124.
> > >
> > > What's the scope of the changes required for Tegra124? If we add the
> > > kernel bits for Tegra124 we should also have a userspace test program to
> > > exercise it.
> >
> > Please note that just a test program is not really enough to add more uabi
> > (and VIC support here maybe classifies at that, no idea how this all
> > works). Aka does mesa run, and for the vic stuff, does some libva/libvdpau
> > open source implementation run on this?
> > -Daniel
> >
>
> VIC is just an image compositor so if we wanted to use it alone we would
> probably need to implement X11 2D acceleration (at least nothing else comes
> to mind), which isn't that useful nowadays, since nouveau works too. I'm
> internally working on NVDEC (the video decoder) at the moment - if/when I'm
> allowed to upstream that, I'll provide more extensive userspace for decoding
> video. This would most likely be a libva or libvdpau backend; I have already
> prototyped these. This will also most likely use VIC for image format
> conversions.
>
> As for the "more UABI" question; the UABI used here already exists in the
> kernel and libdrm. The same UABI can be used for any Host1x unit without
> modifications; we have previously exercised it on older Tegra revisions with
> different units. The only thing that changes is a new enumeration value is
> added for the new unit.
More uabi also includes stuff like adding more support for newer engines,
more platforms, newer chips and all that. Anything really that is a
contract between the kernel and userspace about how the hw is set up (so
e.g. fancy features that need a few bits to be set correctly, or the right
microcode to be loaded or something like that are also included.
Otherwise it'd be trivial to game by doing a basic driver for the very
first chip that no one cares about any more, open source it, and then do
all future enabling with the blob drivers only in userspace.
This open source userspace requirement isn't just to validate the bare
ioctl structs and stuff. We have it _exactly_ because gpu interfaces are
ridiculously wide (for speed and flexibility reasons), and hence it does
apply for every extension. Even if you don't change any of the uapi
headers (since those describe just a miniscule part of the overall drm
interface exported to userspace). Another example is adding more atomic
properties, which also doesn't change anything with the ioctl or structs
itself.
> Is that reasonable?
If you have the userspace (and it needs to be real, not just promised)
then yes. Otherwise imo no. E.g. Rob Clark always makes sure on msm that
his r/e'ed mesa branch _does_ work properly, before merging support for
new hw or engines or anything like that. Intel/AMD do the same, and so
should nvidia.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/8] drm/tegra: Add falcon helper library
[not found] ` <20161205191346.GC22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
@ 2016-12-07 11:54 ` Mikko Perttunen
0 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-07 11:54 UTC (permalink / raw)
To: Thierry Reding, Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c,
Arto Merilainen, Andrew Chew
On 05.12.2016 21:13, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:40PM +0200, Mikko Perttunen wrote:
>> From: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>>
>> Add a set of falcon helper routines for use by the tegradrm client drivers
>> of the various falcon-based engines.
>>
>> The falcon is a microcontroller that acts as a frontend for the rest of a
>> particular Tegra engine. In order to properly utilize these engines, the
>> frontend must be booted before pushing any commands.
>>
>> Based on work by Andrew Chew <achew-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>>
>> Signed-off-by: Andrew Chew <achew-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> Signed-off-by: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>> drivers/gpu/drm/tegra/Makefile | 3 +-
>> drivers/gpu/drm/tegra/falcon.c | 256 +++++++++++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/tegra/falcon.h | 130 +++++++++++++++++++++
>> 3 files changed, 388 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/gpu/drm/tegra/falcon.c
>> create mode 100644 drivers/gpu/drm/tegra/falcon.h
>>
>> diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
>> index 2c66a8d..93e9a4a 100644
>> --- a/drivers/gpu/drm/tegra/Makefile
>> +++ b/drivers/gpu/drm/tegra/Makefile
>> @@ -13,6 +13,7 @@ tegra-drm-y := \
>> sor.o \
>> dpaux.o \
>> gr2d.o \
>> - gr3d.o
>> + gr3d.o \
>> + falcon.o
>>
>> obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o
>> diff --git a/drivers/gpu/drm/tegra/falcon.c b/drivers/gpu/drm/tegra/falcon.c
>> new file mode 100644
>> index 0000000..180b2fd
>> --- /dev/null
>> +++ b/drivers/gpu/drm/tegra/falcon.c
>> @@ -0,0 +1,256 @@
>> +/*
>> + * Copyright (c) 2015, NVIDIA Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/platform_device.h>
>> +#include <linux/dma-mapping.h>
>> +#include <linux/firmware.h>
>> +#include <linux/pci_ids.h>
>> +#include <linux/iopoll.h>
>> +
>> +#include "falcon.h"
>> +#include "drm.h"
>> +
>> +#define FALCON_IDLE_TIMEOUT_US 100000
>> +#define FALCON_IDLE_CHECK_PERIOD_US 10
>
> Seems a little overkill to have these here, given that their only used
> twice. They're also used in two cases where we may end up using
> different periods and timeouts eventually, so I'd just stick them into
> falcon_wait_idle() and falcon_dma_wait_idle() directly and omit these
> definitions. That's kind of a nitpick, so feel free to keep it as-is.
Fixed.
>
>> +
>> +enum falcon_memory {
>> + FALCON_MEMORY_IMEM,
>> + FALCON_MEMORY_DATA,
>> +};
>> +
>> +static void falcon_writel(struct falcon *falcon, u32 value, u32 offset)
>> +{
>> + writel(value, falcon->regs + offset);
>> +}
>> +
>> +int falcon_wait_idle(struct falcon *falcon)
>> +{
>> + u32 idlestate;
>> +
>> + return readl_poll_timeout(falcon->regs + FALCON_IDLESTATE, idlestate,
>> + (!idlestate),
>
> That's an odd way to write this. Why not just "idlestate == 0"? That's
> much easier to read and more explicit.
Fixed.
>
>> + FALCON_IDLE_CHECK_PERIOD_US,
>> + FALCON_IDLE_TIMEOUT_US);
>> +}
>> +
>> +static int falcon_dma_wait_idle(struct falcon *falcon)
>> +{
>> + u32 dmatrfcmd;
>
> u32 value? Naming it after a register suggests that it may be an offset
> rather than the register value.
Fixed.
>
>> +
>> + return readl_poll_timeout(falcon->regs + FALCON_DMATRFCMD, dmatrfcmd,
>> + (dmatrfcmd & FALCON_DMATRFCMD_IDLE),
>> + FALCON_IDLE_CHECK_PERIOD_US,
>> + FALCON_IDLE_TIMEOUT_US);
>> +}
>> +
>> +static int falcon_copy_chunk(struct falcon *falcon,
>> + phys_addr_t base,
>> + unsigned long offset,
>> + enum falcon_memory target)
>> +{
>> + u32 cmd = FALCON_DMATRFCMD_SIZE_256B;
>> +
>> + if (target == FALCON_MEMORY_IMEM)
>> + cmd |= FALCON_DMATRFCMD_IMEM;
>> +
>> + falcon_writel(falcon, offset, FALCON_DMATRFMOFFS);
>> + falcon_writel(falcon, base, FALCON_DMATRFFBOFFS);
>> + falcon_writel(falcon, cmd, FALCON_DMATRFCMD);
>> +
>> + return falcon_dma_wait_idle(falcon);
>> +}
>> +
>> +static void falcon_copy_firmware_image(struct falcon *falcon,
>> + const struct firmware *firmware)
>> +{
>> + u32 *firmware_vaddr = falcon->firmware.vaddr;
>> + size_t i;
>> +
>> + /* copy the whole thing taking into account endianness */
>> + for (i = 0; i < firmware->size / sizeof(u32); i++)
>> + firmware_vaddr[i] = le32_to_cpu(((u32 *)firmware->data)[i]);
>> +
>> + /* ensure that caches are flushed and falcon can see the firmware */
>> + dma_sync_single_for_device(falcon->dev, virt_to_phys(firmware_vaddr),
>> + falcon->firmware.size, DMA_TO_DEVICE);
>
> My understanding is that this is an error and the DMA API will complain
> about it if debugging is enabled. I think you need to call one of the
> dma_map_*() functions on the memory before you can do a dma_sync_*().
Yeah, seems so. I added dma_map_single and dma_unmap_single around this
call.
>
>> +}
>> +
>> +static int falcon_parse_firmware_image(struct falcon *falcon)
>> +{
>> + struct falcon_firmware_bin_header_v1 *bin_header =
>> + (void *)falcon->firmware.vaddr;
>> + struct falcon_firmware_os_header_v1 *os_header;
>
> Can we make these shorter? Perhaps by leaving out the _header suffix?
> It'd be nice if we could avoid splitting these across multiple lines.
>
>> +
>> + /* endian problems would show up right here */
>> + if (bin_header->bin_magic != PCI_VENDOR_ID_NVIDIA) {
>> + dev_err(falcon->dev, "failed to get firmware magic");
>> + return -EINVAL;
>> + }
>> +
>> + /* currently only version 1 is supported */
>> + if (bin_header->bin_ver != 1) {
>> + dev_err(falcon->dev, "unsupported firmware version");
>> + return -EINVAL;
>> + }
>> +
>> + /* check that the firmware size is consistent */
>> + if (bin_header->bin_size > falcon->firmware.size) {
>> + dev_err(falcon->dev, "firmware image size inconsistency");
>> + return -EINVAL;
>> + }
>
> These messages are missing newlines at the end.
Fixed.
>
>> +
>> + os_header = (falcon->firmware.vaddr +
>> + bin_header->os_bin_header_offset);
>
> I think if we shorten the variable names a little this will also fit on
> a single line. There's also no need for the parentheses.
Dropped the _header from variable names, did s/firmware/fw/ in struct
names and also trimmed the field names a bit.
>
>> +
>> + falcon->firmware.bin_data.size = bin_header->os_bin_size;
>> + falcon->firmware.bin_data.offset = bin_header->os_bin_data_offset;
>> + falcon->firmware.code.offset = os_header->os_code_offset;
>> + falcon->firmware.code.size = os_header->os_code_size;
>> + falcon->firmware.data.offset = os_header->os_data_offset;
>> + falcon->firmware.data.size = os_header->os_data_size;
>
> Can you remove the extra padding before =, please? It's inconsistent
> with the other assignments.
Fixed.
>
>> +
>> + return 0;
>> +}
>> +
>> +int falcon_read_firmware(struct falcon *falcon, const char *firmware_name)
>
> The firmware_ prefix in firmware_name is redundant and can be dropped,
> in my opinion.
Fixed.
>
>> +{
>> + const struct firmware *firmware;
>> + int err;
>> +
>> + if (falcon->firmware.valid)
>
> Do we really need the .valid field? It seems like we should be able to
> derive it from the value of one of the other fields.
Changed to use .vaddr.
>
>> + return 0;
>> +
>> + err = request_firmware(&firmware, firmware_name, falcon->dev);
>> + if (err < 0) {
>> + dev_err(falcon->dev, "failed to get firmware\n");
>> + return err;
>
> It'd be nice to show the user what error occurred. Maybe also show the
> name of the firmware that couldn't be loaded.
Fixed.
>
>> + }
>> +
>> + falcon->firmware.size = firmware->size;
>> +
>> + /* allocate iova space for the firmware */
>> + falcon->firmware.vaddr = falcon->ops->alloc(falcon, firmware->size,
>> + &falcon->firmware.paddr);
>
> The arguments aren't properly aligned.
Fixed.
>
>> + if (!falcon->firmware.vaddr) {
>> + dev_err(falcon->dev, "dma memory mapping failed");
>> + err = -ENOMEM;
>> + goto err_alloc_firmware;
>> + }
>> +
>> + /* copy firmware image into local area. this also ensures endianness */
>> + falcon_copy_firmware_image(falcon, firmware);
>> +
>> + /* parse the image data */
>> + err = falcon_parse_firmware_image(falcon);
>> + if (err < 0) {
>> + dev_err(falcon->dev, "failed to parse firmware image\n");
>> + goto err_setup_firmware_image;
>> + }
>> +
>> + falcon->firmware.valid = true;
>> +
>> + release_firmware(firmware);
>> +
>> + return 0;
>> +
>> +err_setup_firmware_image:
>> + falcon->ops->free(falcon, falcon->firmware.size,
>> + falcon->firmware.paddr, falcon->firmware.vaddr);
>> +err_alloc_firmware:
>> + release_firmware(firmware);
>> +
>> + return err;
>> +}
>> +
>> +int falcon_init(struct falcon *falcon)
>> +{
>> + /* check mandatory ops */
>> + if (!falcon->ops || !falcon->ops->alloc || !falcon->ops->free)
>> + return -EINVAL;
>> +
>> + falcon->firmware.valid = false;
>> +
>> + return 0;
>> +}
>> +
>> +void falcon_exit(struct falcon *falcon)
>> +{
>> + if (!falcon->firmware.valid)
>> + return;
>> +
>> + falcon->ops->free(falcon, falcon->firmware.size,
>> + falcon->firmware.paddr,
>> + falcon->firmware.vaddr);
>> + falcon->firmware.valid = false;
>> +}
>> +
>> +int falcon_boot(struct falcon *falcon)
>> +{
>> + unsigned long offset;
>> + int err = 0;
>
> I don'n think it's necessary to initialize err here.
Fixed.
>
>> +
>> + if (!falcon->firmware.valid)
>> + return -EINVAL;
>> +
>> + falcon_writel(falcon, 0, FALCON_DMACTL);
>> +
>> + /* setup the address of the binary data. Falcon can access it later */
>
> If you write sentences with a full stop, please use a capital first
> letter. In this case I think you can just drop the full stop:
>
> /* setup the address of the binary data so Falcon can access it later */
Fixed.
>
>> + falcon_writel(falcon, (falcon->firmware.paddr +
>> + falcon->firmware.bin_data.offset) >> 8,
>> + FALCON_DMATRFBASE);
>> +
>> + /* copy the data segment into Falcon internal memory */
>> + for (offset = 0; offset < falcon->firmware.data.size; offset += 256)
>> + falcon_copy_chunk(falcon,
>> + falcon->firmware.data.offset + offset,
>> + offset, FALCON_MEMORY_DATA);
>> +
>> + /* copy the first code segment into Falcon internal memory */
>> + falcon_copy_chunk(falcon, falcon->firmware.code.offset,
>> + 0, FALCON_MEMORY_IMEM);
>> +
>> + /* setup falcon interrupts */
>> + falcon_writel(falcon, FALCON_IRQMSET_EXT(0xff) |
>> + FALCON_IRQMSET_SWGEN1 |
>> + FALCON_IRQMSET_SWGEN0 |
>> + FALCON_IRQMSET_EXTERR |
>> + FALCON_IRQMSET_HALT |
>> + FALCON_IRQMSET_WDTMR,
>> + FALCON_IRQMSET);
>> + falcon_writel(falcon, FALCON_IRQDEST_EXT(0xff) |
>> + FALCON_IRQDEST_SWGEN1 |
>> + FALCON_IRQDEST_SWGEN0 |
>> + FALCON_IRQDEST_EXTERR |
>> + FALCON_IRQDEST_HALT,
>> + FALCON_IRQDEST);
>> +
>> + /* enable interface */
>> + falcon_writel(falcon, FALCON_ITFEN_MTHDEN |
>> + FALCON_ITFEN_CTXEN,
>> + FALCON_ITFEN);
>> +
>> + /* boot falcon */
>> + falcon_writel(falcon, 0x00000000, FALCON_BOOTVEC);
>> + falcon_writel(falcon, FALCON_CPUCTL_STARTCPU, FALCON_CPUCTL);
>> +
>> + err = falcon_wait_idle(falcon);
>> + if (err < 0) {
>> + dev_err(falcon->dev, "falcon boot failed due to timeout");
>> + return err;
>> + }
>> +
>> + dev_info(falcon->dev, "falcon booted");
>
> No need to brag here, it's supposed to be successful. Let the user know
> if it isn't. If you really want this, make sure it's output at debug
> level, not info. Also I think we need to agree on whether these are
> named Falcon or falcon and then use it consistently. I'm leaning towards
> the former. Also, there are a few messages with missing newlines.
Removed and fixed.
>
>> +
>> + return 0;
>> +}
>> +
>> +void falcon_execute_method(struct falcon *falcon, u32 method, u32 data)
>> +{
>> + falcon_writel(falcon, method >> 2, FALCON_UCLASS_METHOD_OFFSET);
>> + falcon_writel(falcon, data, FALCON_UCLASS_METHOD_DATA);
>> +}
>> diff --git a/drivers/gpu/drm/tegra/falcon.h b/drivers/gpu/drm/tegra/falcon.h
>> new file mode 100644
>> index 0000000..56284b9
>> --- /dev/null
>> +++ b/drivers/gpu/drm/tegra/falcon.h
>> @@ -0,0 +1,130 @@
>> +/*
>> + * Copyright (c) 2015, NVIDIA Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef _FALCON_H_
>> +#define _FALCON_H_
>> +
>> +#include <linux/types.h>
>> +
>> +#define FALCON_UCLASS_METHOD_OFFSET 0x00000040
>> +
>> +#define FALCON_UCLASS_METHOD_DATA 0x00000044
>> +
>> +#define FALCON_IRQMSET 0x00001010
>> +#define FALCON_IRQMSET_WDTMR (1 << 1)
>> +#define FALCON_IRQMSET_HALT (1 << 4)
>> +#define FALCON_IRQMSET_EXTERR (1 << 5)
>> +#define FALCON_IRQMSET_SWGEN0 (1 << 6)
>> +#define FALCON_IRQMSET_SWGEN1 (1 << 7)
>> +#define FALCON_IRQMSET_EXT(v) (((v) & 0xff) << 8)
>> +
>> +#define FALCON_IRQDEST 0x0000101c
>> +#define FALCON_IRQDEST_HALT (1 << 4)
>> +#define FALCON_IRQDEST_EXTERR (1 << 5)
>> +#define FALCON_IRQDEST_SWGEN0 (1 << 6)
>> +#define FALCON_IRQDEST_SWGEN1 (1 << 7)
>> +#define FALCON_IRQDEST_EXT(v) (((v) & 0xff) << 8)
>> +
>> +#define FALCON_ITFEN 0x00001048
>> +#define FALCON_ITFEN_CTXEN (1 << 0)
>> +#define FALCON_ITFEN_MTHDEN (1 << 1)
>> +
>> +#define FALCON_IDLESTATE 0x0000104c
>> +
>> +#define FALCON_CPUCTL 0x00001100
>> +#define FALCON_CPUCTL_STARTCPU (1 << 1)
>> +
>> +#define FALCON_BOOTVEC 0x00001104
>> +
>> +#define FALCON_DMACTL 0x0000110c
>> +#define FALCON_DMACTL_DMEM_SCRUBBING (1 << 1)
>> +#define FALCON_DMACTL_IMEM_SCRUBBING (1 << 2)
>> +
>> +#define FALCON_DMATRFBASE 0x00001110
>> +
>> +#define FALCON_DMATRFMOFFS 0x00001114
>> +
>> +#define FALCON_DMATRFCMD 0x00001118
>> +#define FALCON_DMATRFCMD_IDLE (1 << 1)
>> +#define FALCON_DMATRFCMD_IMEM (1 << 4)
>> +#define FALCON_DMATRFCMD_SIZE_256B (6 << 8)
>> +
>> +#define FALCON_DMATRFFBOFFS 0x0000111c
>> +
>> +struct falcon_firmware_bin_header_v1 {
>> + u32 bin_magic; /* 0x10de */
>> + u32 bin_ver; /* cya, versioning of bin format (1) */
>
> Not sure everyone understands what cya means, here. Perhaps just drop
> it?
Removed.
>
>> + u32 bin_size; /* entire image size including this header */
>> + u32 os_bin_header_offset;
>> + u32 os_bin_data_offset;
>> + u32 os_bin_size;
>> +};
>> +
>> +struct falcon_firmware_os_app_v1 {
>> + u32 offset;
>> + u32 size;
>> +};
>> +
>> +struct falcon_firmware_os_header_v1 {
>> + u32 os_code_offset;
>> + u32 os_code_size;
>> + u32 os_data_offset;
>> + u32 os_data_size;
>> + u32 num_apps;
>> + struct falcon_firmware_os_app_v1 *app_code;
>> + struct falcon_firmware_os_app_v1 *app_data;
>
> The above two seem to be completel unused. Should we drop them?
Dropped the last four fields.
>
>> + u32 *os_ovl_offset;
>> + u32 *os_ovl_size;
>> +};
>
> Can we in general sanitize the names of these a little? It's redundent
> to add a os_ prefix in a structure that contains an _os_ infix.
Yep, removed the os_ prefixes.
>
> Thierry
>
Thanks,
Mikko.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/8] drm/tegra: Add VIC support
[not found] ` <20161205193559.GD22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
@ 2016-12-14 8:28 ` Mikko Perttunen
0 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-14 8:28 UTC (permalink / raw)
To: Thierry Reding, Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c,
Arto Merilainen, Andrew Chew
On 05.12.2016 21:35, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:41PM +0200, Mikko Perttunen wrote:
>> From: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>>
>> This patch adds support for Video Image Compositor engine which
>> can be used for 2d operations.
>>
>> Signed-off-by: Andrew Chew <achew-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> Signed-off-by: Arto Merilainen <amerilainen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>> drivers/gpu/drm/tegra/Makefile | 3 +-
>> drivers/gpu/drm/tegra/drm.c | 3 +
>> drivers/gpu/drm/tegra/drm.h | 1 +
>> drivers/gpu/drm/tegra/vic.c | 400 +++++++++++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/tegra/vic.h | 31 ++++
>> include/linux/host1x.h | 1 +
>> 6 files changed, 438 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/gpu/drm/tegra/vic.c
>> create mode 100644 drivers/gpu/drm/tegra/vic.h
>>
>> diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
>> index 93e9a4a..6af3a9a 100644
>> --- a/drivers/gpu/drm/tegra/Makefile
>> +++ b/drivers/gpu/drm/tegra/Makefile
>> @@ -14,6 +14,7 @@ tegra-drm-y := \
>> dpaux.o \
>> gr2d.o \
>> gr3d.o \
>> - falcon.o
>> + falcon.o \
>> + vic.o
>>
>> obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index ecfe7ba..707404d 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -1151,11 +1151,13 @@ static const struct of_device_id host1x_drm_subdevs[] = {
>> { .compatible = "nvidia,tegra124-sor", },
>> { .compatible = "nvidia,tegra124-hdmi", },
>> { .compatible = "nvidia,tegra124-dsi", },
>> + { .compatible = "nvidia,tegra124-vic", },
>> { .compatible = "nvidia,tegra132-dsi", },
>> { .compatible = "nvidia,tegra210-dc", },
>> { .compatible = "nvidia,tegra210-dsi", },
>> { .compatible = "nvidia,tegra210-sor", },
>> { .compatible = "nvidia,tegra210-sor1", },
>> + { .compatible = "nvidia,tegra210-vic", },
>> { /* sentinel */ }
>> };
>>
>> @@ -1177,6 +1179,7 @@ static struct platform_driver * const drivers[] = {
>> &tegra_sor_driver,
>> &tegra_gr2d_driver,
>> &tegra_gr3d_driver,
>> + &tegra_vic_driver,
>> };
>>
>> static int __init host1x_drm_init(void)
>> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
>> index f75b505..8efaaa4 100644
>> --- a/drivers/gpu/drm/tegra/drm.h
>> +++ b/drivers/gpu/drm/tegra/drm.h
>> @@ -292,5 +292,6 @@ extern struct platform_driver tegra_dpaux_driver;
>> extern struct platform_driver tegra_sor_driver;
>> extern struct platform_driver tegra_gr2d_driver;
>> extern struct platform_driver tegra_gr3d_driver;
>> +extern struct platform_driver tegra_vic_driver;
>>
>> #endif /* HOST1X_DRM_H */
>> diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
>> new file mode 100644
>> index 0000000..9eda5e5
>> --- /dev/null
>> +++ b/drivers/gpu/drm/tegra/vic.c
>> @@ -0,0 +1,400 @@
>> +/*
>> + * Copyright (c) 2015, NVIDIA Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/clk.h>
>> +#include <linux/host1x.h>
>> +#include <linux/iommu.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/of_device.h>
>> +#include <linux/of_platform.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/pm_runtime.h>
>> +#include <linux/reset.h>
>> +
>> +#include <soc/tegra/pmc.h>
>> +
>> +#include "drm.h"
>> +#include "falcon.h"
>> +#include "vic.h"
>> +
>> +struct vic_config {
>> + /* Firmware name */
>> + const char *ucode_name;
>
> Maybe just "firmware"? That way you could remove the comment.
Fixed.
>
>> +};
>> +
>> +struct vic {
>> + struct falcon falcon;
>> + bool booted;
>> +
>> + void __iomem *regs;
>> + struct tegra_drm_client client;
>> + struct host1x_channel *channel;
>> + struct iommu_domain *domain;
>> + struct device *dev;
>> + struct clk *clk;
>> +
>> + /* Platform configuration */
>> + const struct vic_config *config;
>> +};
>> +
>> +static inline struct vic *to_vic(struct tegra_drm_client *client)
>> +{
>> + return container_of(client, struct vic, client);
>> +}
>> +
>> +static void vic_writel(struct vic *vic, u32 value, unsigned int offset)
>> +{
>> + writel(value, vic->regs + offset);
>> +}
>> +
>> +static int vic_runtime_resume(struct device *dev)
>> +{
>> + struct vic *vic = dev_get_drvdata(dev);
>> +
>> + return clk_prepare_enable(vic->clk);
>> +}
>> +
>> +static int vic_runtime_suspend(struct device *dev)
>> +{
>> + struct vic *vic = dev_get_drvdata(dev);
>> +
>> + clk_disable_unprepare(vic->clk);
>> +
>> + vic->booted = false;
>> +
>> + return 0;
>> +}
>> +
>> +static int vic_boot(struct vic *vic)
>> +{
>> + u32 fce_ucode_size, fce_bin_data_offset;
>> + void *hdr;
>> + int err = 0;
>> +
>> + if (vic->booted)
>> + return 0;
>> +
>> + if (!vic->falcon.firmware.valid) {
>> + err = falcon_read_firmware(&vic->falcon,
>> + vic->config->ucode_name);
>> + if (err < 0)
>> + return err;
>> + }
>> +
>> + /* setup clockgating registers */
>> + vic_writel(vic, CG_IDLE_CG_DLY_CNT(4) |
>> + CG_IDLE_CG_EN |
>> + CG_WAKEUP_DLY_CNT(4),
>> + NV_PVIC_MISC_PRI_VIC_CG);
>> +
>> + err = falcon_boot(&vic->falcon);
>> + if (err < 0)
>> + return err;
>> +
>> + hdr = vic->falcon.firmware.vaddr;
>> + fce_bin_data_offset = *(u32 *)(hdr + VIC_UCODE_FCE_DATA_OFFSET);
>> + hdr = vic->falcon.firmware.vaddr +
>> + *(u32 *)(hdr + VIC_UCODE_FCE_HEADER_OFFSET);
>> + fce_ucode_size = *(u32 *)(hdr + FCE_UCODE_SIZE_OFFSET);
>> +
>> + falcon_execute_method(&vic->falcon, VIC_SET_APPLICATION_ID, 1);
>> + falcon_execute_method(&vic->falcon, VIC_SET_FCE_UCODE_SIZE,
>> + fce_ucode_size);
>> + falcon_execute_method(&vic->falcon, VIC_SET_FCE_UCODE_OFFSET,
>> + (vic->falcon.firmware.paddr + fce_bin_data_offset)
>> + >> 8);
>> +
>> + err = falcon_wait_idle(&vic->falcon);
>> + if (err < 0) {
>> + dev_err(vic->dev,
>> + "failed to set application ID and FCE base\n");
>> + return err;
>> + }
>> +
>> + vic->booted = true;
>> +
>> + return 0;
>> +}
>> +
>> +static void *vic_falcon_alloc(struct falcon *falcon, size_t size,
>> + dma_addr_t *iova)
>> +{
>> + struct tegra_drm *tegra = falcon->data;
>> +
>> + return tegra_drm_alloc(tegra, size, iova);
>> +}
>> +
>> +static void vic_falcon_free(struct falcon *falcon, size_t size,
>> + dma_addr_t iova, void *va)
>> +{
>> + struct tegra_drm *tegra = falcon->data;
>> +
>> + return tegra_drm_free(tegra, size, va, iova);
>> +}
>> +
>> +static const struct falcon_ops vic_falcon_ops = {
>> + .alloc = vic_falcon_alloc,
>> + .free = vic_falcon_free
>> +};
>> +
>> +static int vic_init(struct host1x_client *client)
>> +{
>> + struct tegra_drm_client *drm = host1x_to_drm_client(client);
>> + struct drm_device *dev = dev_get_drvdata(client->parent);
>> + struct tegra_drm *tegra = dev->dev_private;
>> + struct vic *vic = to_vic(drm);
>> + int err;
>> +
>> + if (tegra->domain) {
>> + err = iommu_attach_device(tegra->domain, vic->dev);
>> + if (err < 0) {
>> + dev_err(vic->dev, "failed to attach to domain: %d\n",
>> + err);
>> + return err;
>> + }
>> +
>> + vic->domain = tegra->domain;
>> + }
>> +
>> + vic->falcon.dev = vic->dev;
>> + vic->falcon.regs = vic->regs;
>> + vic->falcon.data = tegra;
>
> Why do we need this? We can get at struct vic * from struct falcon * via
> an upcast, and once we have struct vic, we have struct tegra_drm_client.
>
> But now I realize that apparently there's no way to get from that to the
> struct drm_device. I guess we could go via vic->client.base.parent and a
> call to dev_get_drvdata(), but that's rather far. It's still a little
> less confusing than stashing the struct tegra_drm * in it, since it is
> not at all driver-specific data.
I don't know, falcon.data is just supposed to be "priv pointer" for
falcon ops callbacks so it could be anything, in principle. I can change
this but I somewhat prefer the current version.
>
>> + vic->falcon.ops = &vic_falcon_ops;
>> + err = falcon_init(&vic->falcon);
>
> Could use a blank line between the above.
Fixed.
>
>> + if (err < 0)
>> + goto detach_device;
>> +
>> + vic->channel = host1x_channel_request(client->dev);
>> + if (!vic->channel) {
>> + err = -ENOMEM;
>> + goto exit_falcon;
>> + }
>> +
>> + client->syncpts[0] = host1x_syncpt_request(client->dev, 0);
>> + if (!client->syncpts[0]) {
>> + err = -ENOMEM;
>> + goto free_channel;
>> + }
>> +
>> + err = tegra_drm_register_client(tegra, drm);
>> + if (err < 0)
>> + goto free_syncpt;
>> +
>> + return 0;
>> +
>> +free_syncpt:
>> + host1x_syncpt_free(client->syncpts[0]);
>> +free_channel:
>> + host1x_channel_free(vic->channel);
>> +exit_falcon:
>> + falcon_exit(&vic->falcon);
>> +detach_device:
>> + if (tegra->domain)
>> + iommu_detach_device(tegra->domain, vic->dev);
>> +
>> + return err;
>> +}
>> +
>> +static int vic_exit(struct host1x_client *client)
>> +{
>> + struct tegra_drm_client *drm = host1x_to_drm_client(client);
>> + struct drm_device *dev = dev_get_drvdata(client->parent);
>> + struct tegra_drm *tegra = dev->dev_private;
>> + struct vic *vic = to_vic(drm);
>> + int err;
>> +
>> + err = tegra_drm_unregister_client(tegra, drm);
>> + if (err < 0)
>> + return err;
>> +
>> + host1x_syncpt_free(client->syncpts[0]);
>> + host1x_channel_free(vic->channel);
>> +
>> + falcon_exit(&vic->falcon);
>> +
>> + if (vic->domain) {
>> + iommu_detach_device(vic->domain, vic->dev);
>> + vic->domain = NULL;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static const struct host1x_client_ops vic_client_ops = {
>> + .init = vic_init,
>> + .exit = vic_exit,
>> +};
>> +
>> +static int vic_open_channel(struct tegra_drm_client *client,
>> + struct tegra_drm_context *context)
>> +{
>> + struct vic *vic = to_vic(client);
>> + int err;
>> +
>> + err = pm_runtime_get_sync(vic->dev);
>> + if (err < 0)
>> + return err;
>> +
>> + /*
>> + * Try to boot the Falcon microcontroller. Booting is deferred until
>> + * here because the firmware might not yet be available during system
>> + * boot, for example if it's on remote storage.
>> + */
>
> That sounds like a hack. Typically you're supposed to have the firmware
> on the same medium as the module that requests it. That is, if the
> driver is built-in, then you need to make the firmware available as
> built-in as well or stash it into an initrd. If the driver is built as a
> loadable module and is on the root filesystem, that's where the firmware
> should be as well. There's also the possibility to put the loadable
> module into an initrd, in which case that's where the firmware should go
> as well.
>
> That said, I think it makes sense to defer the actual booting until this
> point. Loading the firmware seems unappropriate to me. It really should
> be done in ->probe().
Fixed. The way it's now done is that the firmware is read in probe and
loaded in vic_init, as we cannot call tegra_drm functions from probe.
>
>> + err = vic_boot(vic);
>> + if (err < 0) {
>> + pm_runtime_put(vic->dev);
>> + return err;
>> + }
>> +
>> + context->channel = host1x_channel_get(vic->channel);
>> + if (!context->channel) {
>> + pm_runtime_put(vic->dev);
>> + return -ENOMEM;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static void vic_close_channel(struct tegra_drm_context *context)
>> +{
>> + struct vic *vic = to_vic(context->client);
>> +
>> + host1x_channel_put(context->channel);
>> +
>> + pm_runtime_put(vic->dev);
>> +}
>> +
>> +static const struct tegra_drm_client_ops vic_ops = {
>> + .open_channel = vic_open_channel,
>> + .close_channel = vic_close_channel,
>> + .submit = tegra_drm_submit,
>> +};
>> +
>> +static const struct vic_config vic_t124_config = {
>> + .ucode_name = "nvidia/tegra124/vic03_ucode.bin",
>> +};
>> +
>> +static const struct vic_config vic_t210_config = {
>> + .ucode_name = "nvidia/tegra210/vic04_ucode.bin",
>> +};
>> +
>> +static const struct of_device_id vic_match[] = {
>> + { .compatible = "nvidia,tegra124-vic", .data = &vic_t124_config },
>> + { .compatible = "nvidia,tegra210-vic", .data = &vic_t210_config },
>> + { },
>> +};
>> +
>> +static int vic_probe(struct platform_device *pdev)
>> +{
>> + struct vic_config *vic_config = NULL;
>> + struct device *dev = &pdev->dev;
>> + struct host1x_syncpt **syncpts;
>> + struct resource *regs;
>> + const struct of_device_id *match;
>> + struct vic *vic;
>> + int err;
>> +
>> + match = of_match_device(vic_match, dev);
>> + vic_config = (struct vic_config *)match->data;
>> +
>> + vic = devm_kzalloc(dev, sizeof(*vic), GFP_KERNEL);
>> + if (!vic)
>> + return -ENOMEM;
>> +
>> + syncpts = devm_kzalloc(dev, sizeof(*syncpts), GFP_KERNEL);
>> + if (!syncpts)
>> + return -ENOMEM;
>> +
>> + regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> + if (!regs) {
>> + dev_err(&pdev->dev, "failed to get registers\n");
>> + return -ENXIO;
>> + }
>> +
>> + vic->regs = devm_ioremap_resource(dev, regs);
>> + if (IS_ERR(vic->regs))
>> + return PTR_ERR(vic->regs);
>> +
>> + vic->clk = devm_clk_get(dev, NULL);
>> + if (IS_ERR(vic->clk)) {
>> + dev_err(&pdev->dev, "failed to get clock\n");
>> + return PTR_ERR(vic->clk);
>> + }
>> +
>> + platform_set_drvdata(pdev, vic);
>> +
>> + INIT_LIST_HEAD(&vic->client.base.list);
>> + vic->client.base.ops = &vic_client_ops;
>> + vic->client.base.dev = dev;
>> + vic->client.base.class = HOST1X_CLASS_VIC;
>> + vic->client.base.syncpts = syncpts;
>> + vic->client.base.num_syncpts = 1;
>> + vic->dev = dev;
>> + vic->config = vic_config;
>> +
>> + INIT_LIST_HEAD(&vic->client.list);
>> + vic->client.ops = &vic_ops;
>> +
>> + err = host1x_client_register(&vic->client.base);
>> + if (err < 0) {
>> + dev_err(dev, "failed to register host1x client: %d\n", err);
>> + platform_set_drvdata(pdev, NULL);
>> + return err;
>> + }
>> +
>> + pm_runtime_enable(&pdev->dev);
>> + if (!pm_runtime_enabled(&pdev->dev)) {
>> + err = vic_runtime_resume(&pdev->dev);
>> + if (err < 0)
>> + goto unregister_client;
>> + }
>
> That's a slightly ugly construct, but I can't think of an easy way to
> get rid of it (other than to depend on PM, which actually might be the
> right thing to do here, it's already selected by ARCH_TEGRA on 64-bit
> ARM). I can't think of a scenario where we'd want to run without runtime
> PM.
Kept this for now.
>
>> +
>> + dev_info(&pdev->dev, "initialized");
>
> Again, no need for this.
Removed.
>
> Thierry
>
Thanks,
Mikko.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 5/8] gpu: host1x: Add IOMMU support
[not found] ` <20161205194924.GE22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
@ 2016-12-14 8:50 ` Mikko Perttunen
0 siblings, 0 replies; 25+ messages in thread
From: Mikko Perttunen @ 2016-12-14 8:50 UTC (permalink / raw)
To: Thierry Reding, Mikko Perttunen
Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, airlied-cv59FeDIM0c
On 05.12.2016 21:49, Thierry Reding wrote:
> On Thu, Nov 10, 2016 at 08:23:42PM +0200, Mikko Perttunen wrote:
>> Add support for the Host1x unit to be located behind
>> an IOMMU. This is required when gather buffers may be
>> allocated non-contiguously in physical memory, as can
>> be the case when TegraDRM is also using the IOMMU.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
>> ---
>> drivers/gpu/host1x/cdma.c | 65 +++++++++++++++++++++++++++++------
>> drivers/gpu/host1x/cdma.h | 4 ++-
>> drivers/gpu/host1x/dev.c | 39 +++++++++++++++++++--
>> drivers/gpu/host1x/dev.h | 5 +++
>> drivers/gpu/host1x/hw/cdma_hw.c | 10 +++---
>> drivers/gpu/host1x/job.c | 76 +++++++++++++++++++++++++++++++++++------
>> drivers/gpu/host1x/job.h | 1 +
>> 7 files changed, 171 insertions(+), 29 deletions(-)
>>
>> diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
>> index c5d82a8..14692446e 100644
>> --- a/drivers/gpu/host1x/cdma.c
>> +++ b/drivers/gpu/host1x/cdma.c
>> @@ -51,9 +51,15 @@ static void host1x_pushbuffer_destroy(struct push_buffer *pb)
>> struct host1x_cdma *cdma = pb_to_cdma(pb);
>> struct host1x *host1x = cdma_to_host1x(cdma);
>>
>> - if (pb->phys != 0)
>> - dma_free_wc(host1x->dev, pb->size_bytes + 4, pb->mapped,
>> - pb->phys);
>> + if (!pb->phys)
>> + return;
>> +
>> + if (host1x->domain) {
>> + iommu_unmap(host1x->domain, pb->dma, pb->alloc_size_bytes);
>> + free_iova(&host1x->iova, iova_pfn(&host1x->iova, pb->dma));
>> + }
>> +
>> + dma_free_wc(host1x->dev, pb->alloc_size_bytes, pb->mapped, pb->phys);
>>
>> pb->mapped = NULL;
>> pb->phys = 0;
>> @@ -66,28 +72,65 @@ static int host1x_pushbuffer_init(struct push_buffer *pb)
>> {
>> struct host1x_cdma *cdma = pb_to_cdma(pb);
>> struct host1x *host1x = cdma_to_host1x(cdma);
>> + struct iova *alloc;
>> + int err;
>>
>> pb->mapped = NULL;
>> pb->phys = 0;
>> pb->size_bytes = HOST1X_PUSHBUFFER_SLOTS * 8;
>> + pb->alloc_size_bytes = pb->size_bytes + 4;
>>
>> /* initialize buffer pointers */
>> pb->fence = pb->size_bytes - 8;
>> pb->pos = 0;
>>
>> - /* allocate and map pushbuffer memory */
>> - pb->mapped = dma_alloc_wc(host1x->dev, pb->size_bytes + 4, &pb->phys,
>> - GFP_KERNEL);
>> - if (!pb->mapped)
>> - goto fail;
>> + if (host1x->domain) {
>> + unsigned long shift;
>> +
>> + pb->alloc_size_bytes = iova_align(&host1x->iova,
>> + pb->alloc_size_bytes);
>> +
>> + pb->mapped = dma_alloc_wc(host1x->dev, pb->alloc_size_bytes,
>> + &pb->phys, GFP_KERNEL);
>> + if (!pb->mapped)
>> + return -ENOMEM;
>> +
>> + shift = iova_shift(&host1x->iova);
>> + alloc = alloc_iova(
>> + &host1x->iova, pb->alloc_size_bytes >> shift,
>> + host1x->domain->geometry.aperture_end >> shift, true);
>
> Let's precompute some of these values to make this expression more
> compact. For example pb->alloc_size_bytes is used a couple of times, so
> it could be stored temporarily in a local variable with a nice and short
> name such as "size".
>
> If you store the aperture end, or limit_pfn, as I commented in an
> earlier patch, then this also becomes much more compact.
Made a local "size" variable and stored the aperture end and did some
other cleanup as well.
>
>> + if (!alloc) {
>> + err = -ENOMEM;
>> + goto iommu_free_mem;
>> + }
>> +
>> + err = iommu_map(host1x->domain,
>> + iova_dma_addr(&host1x->iova, alloc),
>> + pb->phys, pb->alloc_size_bytes,
>> + IOMMU_READ);
>
> Same here.
Cleaned up.
>
>> + if (err)
>> + goto iommu_free_iova;
>> +
>> + pb->dma = iova_dma_addr(&host1x->iova, alloc);
>> + } else {
>> + pb->mapped = dma_alloc_wc(host1x->dev, pb->alloc_size_bytes,
>> + &pb->phys, GFP_KERNEL);
>> + if (!pb->mapped)
>> + return -ENOMEM;
>> +
>> + pb->dma = pb->phys;
>> + }
>>
>> host1x_hw_pushbuffer_init(host1x, pb);
>>
>> return 0;
>>
>> -fail:
>> - host1x_pushbuffer_destroy(pb);
>> - return -ENOMEM;
>> +iommu_free_iova:
>> + __free_iova(&host1x->iova, alloc);
>> +iommu_free_mem:
>> + dma_free_wc(host1x->dev, pb->alloc_size_bytes, pb->mapped, pb->phys);
>> +
>> + return err;
>> }
>>
>> /*
>> diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h
>> index 470087a..4b3645f 100644
>> --- a/drivers/gpu/host1x/cdma.h
>> +++ b/drivers/gpu/host1x/cdma.h
>> @@ -43,10 +43,12 @@ struct host1x_job;
>>
>> struct push_buffer {
>> void *mapped; /* mapped pushbuffer memory */
>> - dma_addr_t phys; /* physical address of pushbuffer */
>> + dma_addr_t dma; /* device address of pushbuffer */
>> + phys_addr_t phys; /* physical address of pushbuffer */
>> u32 fence; /* index we've written */
>> u32 pos; /* index to write to */
>> u32 size_bytes;
>> + u32 alloc_size_bytes;
>
> Maybe just "alloc_size"?
Renamed. Also renamed size_bytes to size for consistency.
>
>> };
>>
>> struct buffer_timeout {
>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>> index a62317a..ca4527e 100644
>> --- a/drivers/gpu/host1x/dev.c
>> +++ b/drivers/gpu/host1x/dev.c
>> @@ -168,16 +168,36 @@ static int host1x_probe(struct platform_device *pdev)
>> return err;
>> }
>>
>> + if (iommu_present(&platform_bus_type)) {
>> + struct iommu_domain_geometry *geometry;
>> + unsigned long order;
>> +
>> + host->domain = iommu_domain_alloc(&platform_bus_type);
>> + if (!host->domain)
>> + return -ENOMEM;
>> +
>> + err = iommu_attach_device(host->domain, &pdev->dev);
>> + if (err)
>> + goto fail_free_domain;
>> +
>> + geometry = &host->domain->geometry;
>> +
>> + order = __ffs(host->domain->pgsize_bitmap);
>> + init_iova_domain(&host->iova, 1UL << order,
>> + geometry->aperture_start >> order,
>> + geometry->aperture_end >> order);
>> + }
>> +
>> err = host1x_channel_list_init(host);
>> if (err) {
>> dev_err(&pdev->dev, "failed to initialize channel list\n");
>> - return err;
>> + goto fail_detach_device;
>> }
>>
>> err = clk_prepare_enable(host->clk);
>> if (err < 0) {
>> dev_err(&pdev->dev, "failed to enable clock\n");
>> - return err;
>> + goto fail_detach_device;
>> }
>>
>> err = host1x_syncpt_init(host);
>> @@ -206,6 +226,15 @@ static int host1x_probe(struct platform_device *pdev)
>> host1x_syncpt_deinit(host);
>> fail_unprepare_disable:
>> clk_disable_unprepare(host->clk);
>> +fail_detach_device:
>> + if (host->domain) {
>> + put_iova_domain(&host->iova);
>> + iommu_detach_device(host->domain, &pdev->dev);
>> + }
>> +fail_free_domain:
>> + if (host->domain)
>> + iommu_domain_free(host->domain);
>> +
>> return err;
>> }
>>
>> @@ -218,6 +247,12 @@ static int host1x_remove(struct platform_device *pdev)
>> host1x_syncpt_deinit(host);
>> clk_disable_unprepare(host->clk);
>>
>> + if (host->domain) {
>> + put_iova_domain(&host->iova);
>> + iommu_detach_device(host->domain, &pdev->dev);
>> + iommu_domain_free(host->domain);
>> + }
>> +
>> return 0;
>> }
>>
>> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
>> index 5220510..6189825 100644
>> --- a/drivers/gpu/host1x/dev.h
>> +++ b/drivers/gpu/host1x/dev.h
>> @@ -19,6 +19,8 @@
>>
>> #include <linux/platform_device.h>
>> #include <linux/device.h>
>> +#include <linux/iommu.h>
>> +#include <linux/iova.h>
>>
>> #include "channel.h"
>> #include "syncpt.h"
>> @@ -108,6 +110,9 @@ struct host1x {
>> struct device *dev;
>> struct clk *clk;
>>
>> + struct iommu_domain *domain;
>> + struct iova_domain iova;
>> +
>> struct mutex intr_mutex;
>> int intr_syncpt_irq;
>>
>> diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
>> index 659c1bb..b8c0348 100644
>> --- a/drivers/gpu/host1x/hw/cdma_hw.c
>> +++ b/drivers/gpu/host1x/hw/cdma_hw.c
>> @@ -55,7 +55,7 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr,
>> *(p++) = HOST1X_OPCODE_NOP;
>> *(p++) = HOST1X_OPCODE_NOP;
>> dev_dbg(host1x->dev, "%s: NOP at %pad+%#x\n", __func__,
>> - &pb->phys, getptr);
>> + &pb->dma, getptr);
>> getptr = (getptr + 8) & (pb->size_bytes - 1);
>> }
>>
>> @@ -78,9 +78,9 @@ static void cdma_start(struct host1x_cdma *cdma)
>> HOST1X_CHANNEL_DMACTRL);
>>
>> /* set base, put and end pointer */
>> - host1x_ch_writel(ch, cdma->push_buffer.phys, HOST1X_CHANNEL_DMASTART);
>> + host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
>> host1x_ch_writel(ch, cdma->push_buffer.pos, HOST1X_CHANNEL_DMAPUT);
>> - host1x_ch_writel(ch, cdma->push_buffer.phys +
>> + host1x_ch_writel(ch, cdma->push_buffer.dma +
>> cdma->push_buffer.size_bytes + 4,
>> HOST1X_CHANNEL_DMAEND);
>>
>> @@ -115,8 +115,8 @@ static void cdma_timeout_restart(struct host1x_cdma *cdma, u32 getptr)
>> HOST1X_CHANNEL_DMACTRL);
>>
>> /* set base, end pointer (all of memory) */
>> - host1x_ch_writel(ch, cdma->push_buffer.phys, HOST1X_CHANNEL_DMASTART);
>> - host1x_ch_writel(ch, cdma->push_buffer.phys +
>> + host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
>> + host1x_ch_writel(ch, cdma->push_buffer.dma +
>> cdma->push_buffer.size_bytes,
>> HOST1X_CHANNEL_DMAEND);
>>
>> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
>> index a91b7c4..222d930 100644
>> --- a/drivers/gpu/host1x/job.c
>> +++ b/drivers/gpu/host1x/job.c
>> @@ -174,8 +174,9 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host,
>> return 0;
>> }
>>
>> -static unsigned int pin_job(struct host1x_job *job)
>> +static unsigned int pin_job(struct host1x *host, struct host1x_job *job)
>> {
>> + int err;
>> unsigned int i;
>
> Maybe use an inverse christmas tree for variables as in other parts of
> this driver? That is, add the new int err belowe the existing unsined
> int i.
Fixed.
>
>>
>> job->num_unpins = 0;
>> @@ -186,12 +187,16 @@ static unsigned int pin_job(struct host1x_job *job)
>> dma_addr_t phys_addr;
>>
>> reloc->target.bo = host1x_bo_get(reloc->target.bo);
>> - if (!reloc->target.bo)
>> + if (!reloc->target.bo) {
>> + err = -EINVAL;
>> goto unpin;
>> + }
>>
>> phys_addr = host1x_bo_pin(reloc->target.bo, &sgt);
>> - if (!phys_addr)
>> + if (!phys_addr) {
>> + err = -EINVAL;
>> goto unpin;
>> + }
>>
>> job->addr_phys[job->num_unpins] = phys_addr;
>> job->unpins[job->num_unpins].bo = reloc->target.bo;
>> @@ -204,25 +209,68 @@ static unsigned int pin_job(struct host1x_job *job)
>> struct sg_table *sgt;
>> dma_addr_t phys_addr;
>>
>> + size_t gather_size = 0;
>> + struct scatterlist *sg;
>> + struct iova *alloc;
>> + unsigned long shift;
>> + int j;
>
> There should be no blank line between blocks of variable declarations.
> Also, make j an unsigned int.
Fixed.
>
>> +
>> g->bo = host1x_bo_get(g->bo);
>> - if (!g->bo)
>> + if (!g->bo) {
>> + err = -EINVAL;
>> goto unpin;
>> + }
>>
>> phys_addr = host1x_bo_pin(g->bo, &sgt);
>> - if (!phys_addr)
>> + if (!phys_addr) {
>> + err = -EINVAL;
>> goto unpin;
>> + }
>> +
>> + if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && host->domain) {
>> + for_each_sg(sgt->sgl, sg, sgt->nents, j) {
>> + gather_size += sg->length;
>> + }
>
> No need for curly brackets here.
Fixed.
>
>> + gather_size = iova_align(&host->iova, gather_size);
>> +
>> + shift = iova_shift(&host->iova);
>> + alloc = alloc_iova(
>> + &host->iova, gather_size >> shift,
>> + host->domain->geometry.aperture_end >> shift,
>> + true);
>
> This could be made a little more compact as well.
Fixed.
>
> Thierry
>
Thanks,
Mikko.
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2016-12-14 8:50 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-10 18:23 [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
2016-11-10 18:23 ` [PATCH 1/8] drm/tegra: Add Tegra DRM allocation API Mikko Perttunen
[not found] ` <20161110182345.31777-2-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2016-12-05 18:37 ` Thierry Reding
[not found] ` <20161205183726.GA22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
2016-12-07 9:23 ` Mikko Perttunen
2016-11-10 18:23 ` [PATCH 2/8] drm/tegra: Allocate BOs from lower 4G when without IOMMU Mikko Perttunen
[not found] ` <20161110182345.31777-3-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2016-12-05 18:39 ` Thierry Reding
[not found] ` <20161205183955.GB22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
2016-12-07 9:32 ` Mikko Perttunen
[not found] ` <20161110182345.31777-1-mperttunen-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2016-11-10 18:23 ` [PATCH 3/8] drm/tegra: Add falcon helper library Mikko Perttunen
2016-12-05 19:13 ` Thierry Reding
[not found] ` <20161205191346.GC22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
2016-12-07 11:54 ` Mikko Perttunen
2016-11-10 18:23 ` [PATCH 4/8] drm/tegra: Add VIC support Mikko Perttunen
2016-12-05 19:35 ` Thierry Reding
[not found] ` <20161205193559.GD22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
2016-12-14 8:28 ` Mikko Perttunen
2016-11-10 18:23 ` [PATCH 7/8] arm64: tegra: Enable VIC on T210 Mikko Perttunen
2016-11-10 18:23 ` [PATCH 8/8] arm64: tegra: Enable IOMMU for Host1x on Tegra210 Mikko Perttunen
2016-12-05 13:25 ` [PATCH 0/8] Host1x IOMMU support + VIC support Mikko Perttunen
2016-11-10 18:23 ` [PATCH 5/8] gpu: host1x: Add IOMMU support Mikko Perttunen
2016-12-05 19:49 ` Thierry Reding
[not found] ` <20161205194924.GE22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
2016-12-14 8:50 ` Mikko Perttunen
2016-11-10 18:23 ` [PATCH 6/8] dt-bindings: Add bindings for the Tegra VIC Mikko Perttunen
2016-12-05 19:51 ` [PATCH 0/8] Host1x IOMMU support + VIC support Thierry Reding
[not found] ` <20161205195131.GF22918-EkSeR96xj6Pcmrwk2tT4+A@public.gmane.org>
2016-12-05 19:58 ` Mikko Perttunen
2016-12-06 7:14 ` Daniel Vetter
[not found] ` <20161206071450.uedxvvtcxbkwsoza-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2016-12-06 9:33 ` Mikko Perttunen
2016-12-07 10:36 ` Daniel Vetter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).