* [PATCH 0/9 v6] Batch submission via GuC
@ 2015-08-12 14:43 Dave Gordon
2015-08-12 14:43 ` [PATCH 1/9 v6] drm/i915: GuC-specific firmware loader Dave Gordon
` (9 more replies)
0 siblings, 10 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
This patch series enables command submission via the GuC. In this mode,
instead of the host CPU driving the execlist port directly, it hands
over work items to the GuC, using a doorbell mechanism to tell the GuC
that new items have been added to its work queue. The GuC then dispatches
contexts to the various GPU engines, and manages the resulting context-
switch interrupts. Completion of a batch is however still signalled to
the CPU; the GuC is not involved in handling user interrupts.
There are two subsequences within the patch series:
drm/i915: GuC-specific firmware loader
drm/i915: Debugfs interface to read GuC load status
These two patches provide the GuC loader and a debugfs interface to
verify the resulting state. At this point in the sequence we can load
and activate the GuC firmware, but not submit any batches through it.
(This is nonetheless a potentially useful state, as the GuC could do
other useful work even when not handling batch submissions).
drm/i915: Expose one LRC function for GuC submission mode
drm/i915: Prepare for GuC-based command submission
drm/i915: Enable GuC firmware log
drm/i915: Implementation of GuC submission client
drm/i915: Interrupt routing for GuC submission
drm/i915: Integrate GuC-based command submission
drm/i915: Debugfs interface for GuC submission statistics
In this second section, we implement the GuC submission mechanism, and
link it into the (execlist-based) submission path. At this stage, it is
not yet enabled by default; it is activated only if the kernel command
line explicitly switches it on.
The GuC firmware itself is not included in this patchset; it is or will
be available for download from https://01.org/linuxgraphics/downloads/
This driver works with and requires GuC firmware revision 3.x. It will
not work with any firmware version 1.x, as the GuC protocol in those
revisions was incompatible and is no longer supported.
On platforms where there is no GuC, or where GuC submission is not
specifically enabled, batch submission will continue to use the execlist
(or ringbuffer) mechanisms. On the other hand, if GuC submission is
requested but the GuC firmware cannot be found or is invalid, the GPU
will be unusable.
It is expected that a subsequent patch will enable GuC submission by
default once the signed firmware is published on 01.org.
Ben Widawsky (0):
Vinit Azad (0):
Michael H. Nguyen (0):
created the original versions on which some of these patches are based.
Alex Dai (5):
drm/i915: GuC-specific firmware loader
drm/i915: Debugfs interface to read GuC load status
drm/i915: Prepare for GuC-based command submission
drm/i915: Enable GuC firmware log
drm/i915: Integrate GuC-based command submission
Dave Gordon (4):
drm/i915: Expose one LRC function for GuC submission mode
drm/i915: Implementation of GuC submission client
drm/i915: Interrupt routing for GuC submission
drm/i915: Debugfs interface for GuC submission statistics
Documentation/DocBook/drm.tmpl | 14 +
drivers/gpu/drm/i915/Makefile | 4 +
drivers/gpu/drm/i915/i915_debugfs.c | 146 ++++-
drivers/gpu/drm/i915/i915_dma.c | 9 +
drivers/gpu/drm/i915/i915_drv.h | 11 +
drivers/gpu/drm/i915/i915_gem.c | 16 +
drivers/gpu/drm/i915/i915_gpu_error.c | 14 +-
drivers/gpu/drm/i915/i915_guc_reg.h | 17 +-
drivers/gpu/drm/i915/i915_guc_submission.c | 901 +++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_reg.h | 15 +-
drivers/gpu/drm/i915/intel_guc.h | 122 ++++
drivers/gpu/drm/i915/intel_guc_fwif.h | 9 +-
drivers/gpu/drm/i915/intel_guc_loader.c | 611 +++++++++++++++++++
drivers/gpu/drm/i915/intel_lrc.c | 65 ++-
drivers/gpu/drm/i915/intel_lrc.h | 8 +
15 files changed, 1918 insertions(+), 44 deletions(-)
create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
create mode 100644 drivers/gpu/drm/i915/intel_guc.h
create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/9 v6] drm/i915: GuC-specific firmware loader
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-14 9:05 ` Daniel Vetter
2015-08-12 14:43 ` [PATCH 2/9 v6] drm/i915: Debugfs interface to read GuC load status Dave Gordon
` (8 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
From: Alex Dai <yu.dai@intel.com>
This fetches the required firmware image from the filesystem,
then loads it into the GuC's memory via a dedicated DMA engine.
This patch is derived from GuC loading work originally done by
Vinit Azad and Ben Widawsky.
v2:
Various improvements per review comments by Chris Wilson
v3:
Removed 'wait' parameter to intel_guc_ucode_load() as firmware
prefetch is no longer supported in the common firmware loader,
per Daniel Vetter's request.
Firmware checker callback fn now returns errno rather than bool.
v4:
Squash uC-independent code into GuC-specifc loader [Daniel Vetter]
Don't keep the driver working (by falling back to execlist mode)
if GuC firmware loading fails [Daniel Vetter]
v5:
Clarify WOPCM-related #defines [Tom O'Rourke]
Delete obsolete code no longer required with current h/w & f/w
[Tom O'Rourke]
Move the call to intel_guc_ucode_init() later, so that it can
allocate GEM objects, and have it fetch the firmware; then
intel_guc_ucode_load() doesn't need to fetch it later.
[Daniel Vetter].
v6:
Update comment describing intel_guc_ucode_load() [Tom O'Rourke]
Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/Makefile | 3 +
drivers/gpu/drm/i915/i915_dma.c | 9 +
drivers/gpu/drm/i915/i915_drv.h | 11 +
drivers/gpu/drm/i915/i915_gem.c | 16 +
drivers/gpu/drm/i915/i915_guc_reg.h | 17 +-
drivers/gpu/drm/i915/i915_reg.h | 4 +-
drivers/gpu/drm/i915/intel_guc.h | 67 ++++
drivers/gpu/drm/i915/intel_guc_fwif.h | 3 +-
drivers/gpu/drm/i915/intel_guc_loader.c | 529 ++++++++++++++++++++++++++++++++
9 files changed, 650 insertions(+), 9 deletions(-)
create mode 100644 drivers/gpu/drm/i915/intel_guc.h
create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 998b464..65e52fa 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -40,6 +40,9 @@ i915-y += i915_cmd_parser.o \
intel_ringbuffer.o \
intel_uncore.o
+# general-purpose microcontroller (GuC) support
+i915-y += intel_guc_loader.o
+
# autogenerated null render state
i915-y += intel_renderstate_gen6.o \
intel_renderstate_gen7.o \
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index ab37d11..2193cc2 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -435,6 +435,11 @@ static int i915_load_modeset_init(struct drm_device *dev)
* working irqs for e.g. gmbus and dp aux transfers. */
intel_modeset_init(dev);
+ /* intel_guc_ucode_init() needs the mutex to allocate GEM objects */
+ mutex_lock(&dev->struct_mutex);
+ intel_guc_ucode_init(dev);
+ mutex_unlock(&dev->struct_mutex);
+
ret = i915_gem_init(dev);
if (ret)
goto cleanup_irq;
@@ -476,6 +481,9 @@ cleanup_gem:
i915_gem_context_fini(dev);
mutex_unlock(&dev->struct_mutex);
cleanup_irq:
+ mutex_lock(&dev->struct_mutex);
+ intel_guc_ucode_fini(dev);
+ mutex_unlock(&dev->struct_mutex);
drm_irq_uninstall(dev);
cleanup_gem_stolen:
i915_gem_cleanup_stolen(dev);
@@ -1128,6 +1136,7 @@ int i915_driver_unload(struct drm_device *dev)
flush_workqueue(dev_priv->wq);
mutex_lock(&dev->struct_mutex);
+ intel_guc_ucode_fini(dev);
i915_gem_cleanup_ringbuffer(dev);
i915_gem_context_fini(dev);
mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 674b223..0c0125c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
#include <linux/intel-iommu.h>
#include <linux/kref.h>
#include <linux/pm_qos.h>
+#include "intel_guc.h"
/* General customization:
*/
@@ -1706,6 +1707,8 @@ struct drm_i915_private {
struct i915_virtual_gpu vgpu;
+ struct intel_guc guc;
+
struct intel_csr csr;
/* Display CSR-related protection */
@@ -1950,6 +1953,11 @@ static inline struct drm_i915_private *dev_to_i915(struct device *dev)
return to_i915(dev_get_drvdata(dev));
}
+static inline struct drm_i915_private *guc_to_i915(struct intel_guc *guc)
+{
+ return container_of(guc, struct drm_i915_private, guc);
+}
+
/* Iterate over initialised rings */
#define for_each_ring(ring__, dev_priv__, i__) \
for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
@@ -2554,6 +2562,9 @@ struct drm_i915_cmd_table {
#define HAS_CSR(dev) (IS_SKYLAKE(dev))
+#define HAS_GUC_UCODE(dev) (IS_GEN9(dev))
+#define HAS_GUC_SCHED(dev) (IS_GEN9(dev))
+
#define HAS_RESOURCE_STREAMER(dev) (IS_HASWELL(dev) || \
INTEL_INFO(dev)->gen >= 8)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c54601..23263e2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4680,6 +4680,22 @@ i915_gem_init_hw(struct drm_device *dev)
goto out;
}
+ /* We can't enable contexts until all firmware is loaded */
+ ret = intel_guc_ucode_load(dev);
+ if (ret) {
+ /*
+ * If we got an error and GuC submission is enabled, map
+ * the error to -EIO so the GPU will be declared wedged.
+ * OTOH, if we didn't intend to use the GuC anyway, just
+ * discard the error and carry on.
+ */
+ DRM_ERROR("Failed to initialize GuC, error %d%s\n", ret,
+ i915.enable_guc_submission ? "" : " (ignored)");
+ ret = i915.enable_guc_submission ? -EIO : 0;
+ if (ret)
+ goto out;
+ }
+
/* Now it is safe to go back round and do everything else: */
for_each_ring(ring, dev_priv, i) {
struct drm_i915_gem_request *req;
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
index ccdc6c8..8c8e574 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -38,10 +38,6 @@
#define GS_MIA_SHIFT 16
#define GS_MIA_MASK (0x07 << GS_MIA_SHIFT)
-#define GUC_WOPCM_SIZE 0xc050
-#define GUC_WOPCM_SIZE_VALUE (0x80 << 12) /* 512KB */
-#define GUC_WOPCM_OFFSET 0x80000 /* 512KB */
-
#define SOFT_SCRATCH(n) (0xc180 + ((n) * 4))
#define UOS_RSA_SCRATCH_0 0xc200
@@ -56,10 +52,18 @@
#define UOS_MOVE (1<<4)
#define START_DMA (1<<0)
#define DMA_GUC_WOPCM_OFFSET 0xc340
+#define GUC_WOPCM_OFFSET_VALUE 0x80000 /* 512KB */
+
+#define GUC_WOPCM_SIZE 0xc050
+#define GUC_WOPCM_SIZE_VALUE (0x80 << 12) /* 512KB */
+
+/* GuC addresses below GUC_WOPCM_TOP don't map through the GTT */
+#define GUC_WOPCM_TOP (GUC_WOPCM_SIZE_VALUE)
#define GEN8_GT_PM_CONFIG 0x138140
+#define GEN9LP_GT_PM_CONFIG 0x138140
#define GEN9_GT_PM_CONFIG 0x13816c
-#define GEN8_GT_DOORBELL_ENABLE (1<<0)
+#define GT_DOORBELL_ENABLE (1<<0)
#define GEN8_GTCR 0x4274
#define GEN8_GTCR_INVALIDATE (1<<0)
@@ -80,7 +84,8 @@
GUC_ENABLE_READ_CACHE_LOGIC | \
GUC_ENABLE_MIA_CACHING | \
GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA | \
- GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA)
+ GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA | \
+ GUC_ENABLE_MIA_CLOCK_GATING)
#define HOST2GUC_INTERRUPT 0xc4c8
#define HOST2GUC_TRIGGER (1<<0)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 6786e94..d54dc2c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6871,7 +6871,9 @@ enum skl_disp_power_wells {
#define GEN9_PGCTL_SSB_EU311_ACK (1 << 14)
#define GEN7_MISCCPCTL (0x9424)
-#define GEN7_DOP_CLOCK_GATE_ENABLE (1<<0)
+#define GEN7_DOP_CLOCK_GATE_ENABLE (1<<0)
+#define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1<<2)
+#define GEN8_DOP_CLOCK_GATE_GUC_ENABLE (1<<4)
#define GEN8_GARBCNTL 0xB004
#define GEN9_GAPS_TSV_CREDIT_DISABLE (1<<7)
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
new file mode 100644
index 0000000..2846b6d
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -0,0 +1,67 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+#ifndef _INTEL_GUC_H_
+#define _INTEL_GUC_H_
+
+#include "intel_guc_fwif.h"
+#include "i915_guc_reg.h"
+
+enum intel_guc_fw_status {
+ GUC_FIRMWARE_FAIL = -1,
+ GUC_FIRMWARE_NONE = 0,
+ GUC_FIRMWARE_PENDING,
+ GUC_FIRMWARE_SUCCESS
+};
+
+/*
+ * This structure encapsulates all the data needed during the process
+ * of fetching, caching, and loading the firmware image into the GuC.
+ */
+struct intel_guc_fw {
+ struct drm_device * guc_dev;
+ const char * guc_fw_path;
+ size_t guc_fw_size;
+ struct drm_i915_gem_object * guc_fw_obj;
+ enum intel_guc_fw_status guc_fw_fetch_status;
+ enum intel_guc_fw_status guc_fw_load_status;
+
+ uint16_t guc_fw_major_wanted;
+ uint16_t guc_fw_minor_wanted;
+ uint16_t guc_fw_major_found;
+ uint16_t guc_fw_minor_found;
+};
+
+struct intel_guc {
+ struct intel_guc_fw guc_fw;
+
+ uint32_t log_flags;
+};
+
+/* intel_guc_loader.c */
+extern void intel_guc_ucode_init(struct drm_device *dev);
+extern int intel_guc_ucode_load(struct drm_device *dev);
+extern void intel_guc_ucode_fini(struct drm_device *dev);
+extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
+
+#endif
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 18d7f20..06aad6f 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -32,9 +32,8 @@
* EDITING THIS FILE IS THEREFORE NOT RECOMMENDED - YOUR CHANGES MAY BE LOST.
*/
-#define GFXCORE_FAMILY_GEN8 11
#define GFXCORE_FAMILY_GEN9 12
-#define GFXCORE_FAMILY_FORCE_ULONG 0x7fffffff
+#define GFXCORE_FAMILY_UNKNOWN 0x7fffffff
#define GUC_CTX_PRIORITY_CRITICAL 0
#define GUC_CTX_PRIORITY_HIGH 1
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
new file mode 100644
index 0000000..dd62c31
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -0,0 +1,529 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ * Vinit Azad <vinit.azad@intel.com>
+ * Ben Widawsky <ben@bwidawsk.net>
+ * Dave Gordon <david.s.gordon@intel.com>
+ * Alex Dai <yu.dai@intel.com>
+ */
+#include <linux/firmware.h>
+#include "i915_drv.h"
+#include "intel_guc.h"
+
+/**
+ * DOC: GuC
+ *
+ * intel_guc:
+ * Top level structure of guc. It handles firmware loading and manages client
+ * pool and doorbells. intel_guc owns a i915_guc_client to replace the legacy
+ * ExecList submission.
+ *
+ * Firmware versioning:
+ * The firmware build process will generate a version header file with major and
+ * minor version defined. The versions are built into CSS header of firmware.
+ * i915 kernel driver set the minimal firmware version required per platform.
+ * The firmware installation package will install (symbolic link) proper version
+ * of firmware.
+ *
+ * GuC address space:
+ * GuC does not allow any gfx GGTT address that falls into range [0, WOPCM_TOP),
+ * which is reserved for Boot ROM, SRAM and WOPCM. Currently this top address is
+ * 512K. In order to exclude 0-512K address space from GGTT, all gfx objects
+ * used by GuC is pinned with PIN_OFFSET_BIAS along with size of WOPCM.
+ *
+ * Firmware log:
+ * Firmware log is enabled by setting i915.guc_log_level to non-negative level.
+ * Log data is printed out via reading debugfs i915_guc_log_dump. Reading from
+ * i915_guc_load_status will print out firmware loading status and scratch
+ * registers value.
+ *
+ */
+
+#define I915_SKL_GUC_UCODE "i915/skl_guc_ver3.bin"
+MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
+
+/* User-friendly representation of an enum */
+const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
+{
+ switch (status) {
+ case GUC_FIRMWARE_FAIL:
+ return "FAIL";
+ case GUC_FIRMWARE_NONE:
+ return "NONE";
+ case GUC_FIRMWARE_PENDING:
+ return "PENDING";
+ case GUC_FIRMWARE_SUCCESS:
+ return "SUCCESS";
+ default:
+ return "UNKNOWN!";
+ }
+};
+
+static u32 get_gttype(struct drm_i915_private *dev_priv)
+{
+ /* XXX: GT type based on PCI device ID? field seems unused by fw */
+ return 0;
+}
+
+static u32 get_core_family(struct drm_i915_private *dev_priv)
+{
+ switch (INTEL_INFO(dev_priv)->gen) {
+ case 9:
+ return GFXCORE_FAMILY_GEN9;
+
+ default:
+ DRM_ERROR("GUC: unsupported core family\n");
+ return GFXCORE_FAMILY_UNKNOWN;
+ }
+}
+
+static void set_guc_init_params(struct drm_i915_private *dev_priv)
+{
+ struct intel_guc *guc = &dev_priv->guc;
+ u32 params[GUC_CTL_MAX_DWORDS];
+ int i;
+
+ memset(¶ms, 0, sizeof(params));
+
+ params[GUC_CTL_DEVICE_INFO] |=
+ (get_gttype(dev_priv) << GUC_CTL_GTTYPE_SHIFT) |
+ (get_core_family(dev_priv) << GUC_CTL_COREFAMILY_SHIFT);
+
+ /*
+ * GuC ARAT increment is 10 ns. GuC default scheduler quantum is one
+ * second. This ARAR is calculated by:
+ * Scheduler-Quantum-in-ns / ARAT-increment-in-ns = 1000000000 / 10
+ */
+ params[GUC_CTL_ARAT_HIGH] = 0;
+ params[GUC_CTL_ARAT_LOW] = 100000000;
+
+ params[GUC_CTL_WA] |= GUC_CTL_WA_UK_BY_DRIVER;
+
+ params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
+ GUC_CTL_VCS2_ENABLED;
+
+ if (i915.guc_log_level >= 0) {
+ params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
+ params[GUC_CTL_DEBUG] =
+ i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
+ }
+
+ I915_WRITE(SOFT_SCRATCH(0), 0);
+
+ for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
+ I915_WRITE(SOFT_SCRATCH(1 + i), params[i]);
+}
+
+/*
+ * Read the GuC status register (GUC_STATUS) and store it in the
+ * specified location; then return a boolean indicating whether
+ * the value matches either of two values representing completion
+ * of the GuC boot process.
+ *
+ * This is used for polling the GuC status in a wait_for_atomic()
+ * loop below.
+ */
+static inline bool guc_ucode_response(struct drm_i915_private *dev_priv,
+ u32 *status)
+{
+ u32 val = I915_READ(GUC_STATUS);
+ *status = val;
+ return ((val & GS_UKERNEL_MASK) == GS_UKERNEL_READY ||
+ (val & GS_UKERNEL_MASK) == GS_UKERNEL_LAPIC_DONE);
+}
+
+/*
+ * Transfer the firmware image to RAM for execution by the microcontroller.
+ *
+ * GuC Firmware layout:
+ * +-------------------------------+ ----
+ * | CSS header | 128B
+ * | contains major/minor version |
+ * +-------------------------------+ ----
+ * | uCode |
+ * +-------------------------------+ ----
+ * | RSA signature | 256B
+ * +-------------------------------+ ----
+ * | RSA public Key | 256B
+ * +-------------------------------+ ----
+ * | Public key modulus | 4B
+ * +-------------------------------+ ----
+ *
+ * Architecturally, the DMA engine is bidirectional, and can potentially even
+ * transfer between GTT locations. This functionality is left out of the API
+ * for now as there is no need for it.
+ *
+ * Note that GuC needs the CSS header plus uKernel code to be copied by the
+ * DMA engine in one operation, whereas the RSA signature is loaded via MMIO.
+ */
+
+#define UOS_CSS_HEADER_OFFSET 0
+#define UOS_VER_MINOR_OFFSET 0x44
+#define UOS_VER_MAJOR_OFFSET 0x46
+#define UOS_CSS_HEADER_SIZE 0x80
+#define UOS_RSA_SIG_SIZE 0x100
+#define UOS_CSS_SIGNING_SIZE 0x204
+
+static int guc_ucode_xfer_dma(struct drm_i915_private *dev_priv)
+{
+ struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+ struct drm_i915_gem_object *fw_obj = guc_fw->guc_fw_obj;
+ unsigned long offset;
+ struct sg_table *sg = fw_obj->pages;
+ u32 status, ucode_size, rsa[UOS_RSA_SIG_SIZE / sizeof(u32)];
+ int i, ret = 0;
+
+ /* uCode size, also is where RSA signature starts */
+ offset = ucode_size = guc_fw->guc_fw_size - UOS_CSS_SIGNING_SIZE;
+ I915_WRITE(DMA_COPY_SIZE, ucode_size);
+
+ /* Copy RSA signature from the fw image to HW for verification */
+ sg_pcopy_to_buffer(sg->sgl, sg->nents, rsa, UOS_RSA_SIG_SIZE, offset);
+ for (i = 0; i < UOS_RSA_SIG_SIZE / sizeof(u32); i++)
+ I915_WRITE(UOS_RSA_SCRATCH_0 + i * sizeof(u32), rsa[i]);
+
+ /* Set the source address for the new blob */
+ offset = i915_gem_obj_ggtt_offset(fw_obj);
+ I915_WRITE(DMA_ADDR_0_LOW, lower_32_bits(offset));
+ I915_WRITE(DMA_ADDR_0_HIGH, upper_32_bits(offset) & 0xFFFF);
+
+ /*
+ * Set the DMA destination. Current uCode expects the code to be
+ * loaded at 8k; locations below this are used for the stack.
+ */
+ I915_WRITE(DMA_ADDR_1_LOW, 0x2000);
+ I915_WRITE(DMA_ADDR_1_HIGH, DMA_ADDRESS_SPACE_WOPCM);
+
+ /* Finally start the DMA */
+ I915_WRITE(DMA_CTRL, _MASKED_BIT_ENABLE(UOS_MOVE | START_DMA));
+
+ /*
+ * Spin-wait for the DMA to complete & the GuC to start up.
+ * NB: Docs recommend not using the interrupt for completion.
+ * Measurements indicate this should take no more than 20ms, so a
+ * timeout here indicates that the GuC has failed and is unusable.
+ * (Higher levels of the driver will attempt to fall back to
+ * execlist mode if this happens.)
+ */
+ ret = wait_for_atomic(guc_ucode_response(dev_priv, &status), 100);
+
+ DRM_DEBUG_DRIVER("DMA status 0x%x, GuC status 0x%x\n",
+ I915_READ(DMA_CTRL), status);
+
+ if ((status & GS_BOOTROM_MASK) == GS_BOOTROM_RSA_FAILED) {
+ DRM_ERROR("GuC firmware signature verification failed\n");
+ ret = -ENOEXEC;
+ }
+
+ DRM_DEBUG_DRIVER("returning %d\n", ret);
+
+ return ret;
+}
+
+/*
+ * Load the GuC firmware blob into the MinuteIA.
+ */
+static int guc_ucode_xfer(struct drm_i915_private *dev_priv)
+{
+ struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+ struct drm_device *dev = dev_priv->dev;
+ int ret;
+
+ ret = i915_gem_object_set_to_gtt_domain(guc_fw->guc_fw_obj, false);
+ if (ret) {
+ DRM_DEBUG_DRIVER("set-domain failed %d\n", ret);
+ return ret;
+ }
+
+ ret = i915_gem_obj_ggtt_pin(guc_fw->guc_fw_obj, 0, 0);
+ if (ret) {
+ DRM_DEBUG_DRIVER("pin failed %d\n", ret);
+ return ret;
+ }
+
+ /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
+ I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
+ intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
+ /* init WOPCM */
+ I915_WRITE(GUC_WOPCM_SIZE, GUC_WOPCM_SIZE_VALUE);
+ I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET_VALUE);
+
+ /* Enable MIA caching. GuC clock gating is disabled. */
+ I915_WRITE(GUC_SHIM_CONTROL, GUC_SHIM_CONTROL_VALUE);
+
+ /* WaC6DisallowByGfxPause*/
+ I915_WRITE(GEN6_GFXPAUSE, 0x30FFF);
+
+ if (IS_BROXTON(dev))
+ I915_WRITE(GEN9LP_GT_PM_CONFIG, GT_DOORBELL_ENABLE);
+ else
+ I915_WRITE(GEN9_GT_PM_CONFIG, GT_DOORBELL_ENABLE);
+
+ if (IS_GEN9(dev)) {
+ /* DOP Clock Gating Enable for GuC clocks */
+ I915_WRITE(GEN7_MISCCPCTL, (GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
+ I915_READ(GEN7_MISCCPCTL)));
+
+ /* allows for 5us before GT can go to RC6 */
+ I915_WRITE(GUC_ARAT_C6DIS, 0x1FF);
+ }
+
+ set_guc_init_params(dev_priv);
+
+ ret = guc_ucode_xfer_dma(dev_priv);
+
+ intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+ /*
+ * We keep the object pages for reuse during resume. But we can unpin it
+ * now that DMA has completed, so it doesn't continue to take up space.
+ */
+ i915_gem_object_ggtt_unpin(guc_fw->guc_fw_obj);
+
+ return ret;
+}
+
+/**
+ * intel_guc_ucode_load() - load GuC uCode into the device
+ * @dev: drm device
+ *
+ * Called from gem_init_hw() during driver loading and also after a GPU reset.
+ *
+ * The firmware image should have already been fetched into memory by the
+ * earlier call to intel_guc_ucode_init(), so here we need only check that
+ * is succeeded, and then transfer the image to the h/w.
+ *
+ * Return: non-zero code on error
+ */
+int intel_guc_ucode_load(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+ int err = 0;
+
+ DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
+ intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
+ intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+
+ if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
+ return 0;
+
+ if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_SUCCESS &&
+ guc_fw->guc_fw_load_status == GUC_FIRMWARE_FAIL)
+ return -ENOEXEC;
+
+ guc_fw->guc_fw_load_status = GUC_FIRMWARE_PENDING;
+
+ DRM_DEBUG_DRIVER("GuC fw fetch status %s\n",
+ intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+
+ switch (guc_fw->guc_fw_fetch_status) {
+ case GUC_FIRMWARE_FAIL:
+ /* something went wrong :( */
+ err = -EIO;
+ goto fail;
+
+ case GUC_FIRMWARE_NONE:
+ case GUC_FIRMWARE_PENDING:
+ default:
+ /* "can't happen" */
+ WARN_ONCE(1, "GuC fw %s invalid guc_fw_fetch_status %s [%d]\n",
+ guc_fw->guc_fw_path,
+ intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
+ guc_fw->guc_fw_fetch_status);
+ err = -ENXIO;
+ goto fail;
+
+ case GUC_FIRMWARE_SUCCESS:
+ break;
+ }
+
+ err = guc_ucode_xfer(dev_priv);
+ if (err)
+ goto fail;
+
+ guc_fw->guc_fw_load_status = GUC_FIRMWARE_SUCCESS;
+
+ DRM_DEBUG_DRIVER("GuC fw status: fetch %s, load %s\n",
+ intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
+ intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+
+ return 0;
+
+fail:
+ if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
+ guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
+
+ return err;
+}
+
+static void guc_fw_fetch(struct drm_device *dev, struct intel_guc_fw *guc_fw)
+{
+ struct drm_i915_gem_object *obj;
+ const struct firmware *fw;
+ const u8 *css_header;
+ const size_t minsize = UOS_CSS_HEADER_SIZE + UOS_CSS_SIGNING_SIZE;
+ const size_t maxsize = GUC_WOPCM_SIZE_VALUE + UOS_CSS_SIGNING_SIZE
+ - 0x8000; /* 32k reserved (8K stack + 24k context) */
+ int err;
+
+ DRM_DEBUG_DRIVER("before requesting firmware: GuC fw fetch status %s\n",
+ intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+
+ err = request_firmware(&fw, guc_fw->guc_fw_path, &dev->pdev->dev);
+ if (err)
+ goto fail;
+ if (!fw)
+ goto fail;
+
+ DRM_DEBUG_DRIVER("fetch GuC fw from %s succeeded, fw %p\n",
+ guc_fw->guc_fw_path, fw);
+ DRM_DEBUG_DRIVER("firmware file size %zu (minimum %zu, maximum %zu)\n",
+ fw->size, minsize, maxsize);
+
+ /* Check the size of the blob befoe examining buffer contents */
+ if (fw->size < minsize || fw->size > maxsize)
+ goto fail;
+
+ /*
+ * The GuC firmware image has the version number embedded at a well-known
+ * offset within the firmware blob; note that major / minor version are
+ * TWO bytes each (i.e. u16), although all pointers and offsets are defined
+ * in terms of bytes (u8).
+ */
+ css_header = fw->data + UOS_CSS_HEADER_OFFSET;
+ guc_fw->guc_fw_major_found = *(u16 *)(css_header + UOS_VER_MAJOR_OFFSET);
+ guc_fw->guc_fw_minor_found = *(u16 *)(css_header + UOS_VER_MINOR_OFFSET);
+
+ if (guc_fw->guc_fw_major_found != guc_fw->guc_fw_major_wanted ||
+ guc_fw->guc_fw_minor_found < guc_fw->guc_fw_minor_wanted) {
+ DRM_ERROR("GuC firmware version %d.%d, required %d.%d\n",
+ guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found,
+ guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+ err = -ENOEXEC;
+ goto fail;
+ }
+
+ DRM_DEBUG_DRIVER("firmware version %d.%d OK (minimum %d.%d)\n",
+ guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found,
+ guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+
+ obj = i915_gem_object_create_from_data(dev, fw->data, fw->size);
+ if (IS_ERR_OR_NULL(obj)) {
+ err = obj ? PTR_ERR(obj) : -ENOMEM;
+ goto fail;
+ }
+
+ guc_fw->guc_fw_obj = obj;
+ guc_fw->guc_fw_size = fw->size;
+
+ DRM_DEBUG_DRIVER("GuC fw fetch status SUCCESS, obj %p\n",
+ guc_fw->guc_fw_obj);
+
+ release_firmware(fw);
+ guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_SUCCESS;
+ return;
+
+fail:
+ DRM_DEBUG_DRIVER("GuC fw fetch status FAIL; err %d, fw %p, obj %p\n",
+ err, fw, guc_fw->guc_fw_obj);
+ DRM_ERROR("Failed to fetch GuC firmware from %s (error %d)\n",
+ guc_fw->guc_fw_path, err);
+
+ obj = guc_fw->guc_fw_obj;
+ if (obj)
+ drm_gem_object_unreference(&obj->base);
+ guc_fw->guc_fw_obj = NULL;
+
+ release_firmware(fw); /* OK even if fw is NULL */
+ guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_FAIL;
+}
+
+/**
+ * intel_guc_ucode_init() - define parameters and fetch firmware
+ * @dev: drm device
+ *
+ * Called early during driver load, but after GEM is initialised.
+ * The device struct_mutex must be held by the caller, as we're
+ * going to allocate a GEM object to hold the firmware image.
+ *
+ * The firmware will be transferred to the GuC's memory later,
+ * when intel_guc_ucode_load() is called.
+ */
+void intel_guc_ucode_init(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+ const char *fw_path;
+
+ if (!HAS_GUC_SCHED(dev))
+ i915.enable_guc_submission = false;
+
+ if (!HAS_GUC_UCODE(dev)) {
+ fw_path = NULL;
+ } else if (IS_SKYLAKE(dev)) {
+ fw_path = I915_SKL_GUC_UCODE;
+ guc_fw->guc_fw_major_wanted = 3;
+ guc_fw->guc_fw_minor_wanted = 0;
+ } else {
+ i915.enable_guc_submission = false;
+ fw_path = ""; /* unknown device */
+ }
+
+ guc_fw->guc_dev = dev;
+ guc_fw->guc_fw_path = fw_path;
+ guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
+ guc_fw->guc_fw_load_status = GUC_FIRMWARE_NONE;
+
+ if (fw_path == NULL)
+ return;
+
+ if (*fw_path == '\0') {
+ DRM_ERROR("No GuC firmware known for this platform\n");
+ guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_FAIL;
+ return;
+ }
+
+ guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_PENDING;
+ DRM_DEBUG_DRIVER("GuC firmware pending, path %s\n", fw_path);
+ guc_fw_fetch(dev, guc_fw);
+ /* status must now be FAIL or SUCCESS */
+}
+
+/**
+ * intel_guc_ucode_fini() - clean up all allocated resources
+ * @dev: drm device
+ */
+void intel_guc_ucode_fini(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+
+ if (guc_fw->guc_fw_obj)
+ drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
+ guc_fw->guc_fw_obj = NULL;
+
+ guc_fw->guc_fw_fetch_status = GUC_FIRMWARE_NONE;
+}
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/9 v6] drm/i915: Debugfs interface to read GuC load status
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
2015-08-12 14:43 ` [PATCH 1/9 v6] drm/i915: GuC-specific firmware loader Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:43 ` [PATCH 3/9 v6] drm/i915: Expose one LRC function for GuC submission mode Dave Gordon
` (7 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
From: Alex Dai <yu.dai@intel.com>
The new node provides access to the status of the GuC-specific loader;
also the scratch registers used for communication between the i915
driver and the GuC firmware.
v2:
Changes to output formats per Chris Wilson's suggestions
v6:
Rebased
Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 39 +++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 86734be..3609b13 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2374,6 +2374,44 @@ static int i915_llc(struct seq_file *m, void *data)
return 0;
}
+static int i915_guc_load_status_info(struct seq_file *m, void *data)
+{
+ struct drm_info_node *node = m->private;
+ struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
+ struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+ u32 tmp, i;
+
+ if (!HAS_GUC_UCODE(dev_priv->dev))
+ return 0;
+
+ seq_printf(m, "GuC firmware status:\n");
+ seq_printf(m, "\tpath: %s\n",
+ guc_fw->guc_fw_path);
+ seq_printf(m, "\tfetch: %s\n",
+ intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status));
+ seq_printf(m, "\tload: %s\n",
+ intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+ seq_printf(m, "\tversion wanted: %d.%d\n",
+ guc_fw->guc_fw_major_wanted, guc_fw->guc_fw_minor_wanted);
+ seq_printf(m, "\tversion found: %d.%d\n",
+ guc_fw->guc_fw_major_found, guc_fw->guc_fw_minor_found);
+
+ tmp = I915_READ(GUC_STATUS);
+
+ seq_printf(m, "\nGuC status 0x%08x:\n", tmp);
+ seq_printf(m, "\tBootrom status = 0x%x\n",
+ (tmp & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT);
+ seq_printf(m, "\tuKernel status = 0x%x\n",
+ (tmp & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT);
+ seq_printf(m, "\tMIA Core status = 0x%x\n",
+ (tmp & GS_MIA_MASK) >> GS_MIA_SHIFT);
+ seq_puts(m, "\nScratch registers:\n");
+ for (i = 0; i < 16; i++)
+ seq_printf(m, "\t%2d: \t0x%x\n", i, I915_READ(SOFT_SCRATCH(i)));
+
+ return 0;
+}
+
static int i915_edp_psr_status(struct seq_file *m, void *data)
{
struct drm_info_node *node = m->private;
@@ -5033,6 +5071,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
+ {"i915_guc_load_status", i915_guc_load_status_info, 0},
{"i915_frequency_info", i915_frequency_info, 0},
{"i915_hangcheck_info", i915_hangcheck_info, 0},
{"i915_drpc_info", i915_drpc_info, 0},
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/9 v6] drm/i915: Expose one LRC function for GuC submission mode
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
2015-08-12 14:43 ` [PATCH 1/9 v6] drm/i915: GuC-specific firmware loader Dave Gordon
2015-08-12 14:43 ` [PATCH 2/9 v6] drm/i915: Debugfs interface to read GuC load status Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:43 ` [PATCH 4/9 v6] drm/i915: Prepare for GuC-based command submission Dave Gordon
` (6 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
GuC submission is basically execlist submission, but with the GuC
handling the actual writes to the ELSP and the resulting context
switch interrupts. So to describe a context for submission via
the GuC, we need one of the same functions used in execlist mode.
This commit exposes one such function, changing its name to better
describe what it does (it's related to logical ring contexts rather
than to execlists per se).
v2:
Replaces previous "drm/i915: Move execlists defines from .c to .h"
v3:
Incorporates a change to one of the functions exposed here that was
previously part of an internal patch, but which was omitted from
the version recently committed to drm-intel-nightly:
7a01a0a drm/i915/lrc: Update PDPx registers with lri commands
So we reinstate this change here.
v4:
Drop v3 change, update function parameters due to collision with
8ee3615 drm/i915: Convert execlists_ctx_descriptor() for requests
v5:
Don't expose execlists_update_context() after all. The current
version is no longer compatible with GuC submission; trying to
share the execlist version of this function results in both GuC
and CPU updating TAIL in the context image, with bad results when
they get out of step. The GuC submission path now has its own
private version that just updates the ringbuffer start address,
and not TAIL or PDPx.
v6:
Rebased
Issue: VIZ-4884
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/intel_lrc.c | 10 +++++-----
drivers/gpu/drm/i915/intel_lrc.h | 2 ++
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 138964a..f3411f8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -270,11 +270,11 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
return lrca >> 12;
}
-static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_request *rq)
+uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
+ struct intel_engine_cs *ring)
{
- struct intel_engine_cs *ring = rq->ring;
struct drm_device *dev = ring->dev;
- struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
+ struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
uint64_t desc;
uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
@@ -312,13 +312,13 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
uint64_t desc[2];
if (rq1) {
- desc[1] = execlists_ctx_descriptor(rq1);
+ desc[1] = intel_lr_context_descriptor(rq1->ctx, rq1->ring);
rq1->elsp_submitted++;
} else {
desc[1] = 0;
}
- desc[0] = execlists_ctx_descriptor(rq0);
+ desc[0] = intel_lr_context_descriptor(rq0->ctx, rq0->ring);
rq0->elsp_submitted++;
/* You must always write both descriptors in the order below. */
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 64f89f99..5e5788c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -74,6 +74,8 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
void intel_lr_context_unpin(struct drm_i915_gem_request *req);
void intel_lr_context_reset(struct drm_device *dev,
struct intel_context *ctx);
+uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
+ struct intel_engine_cs *ring);
/* Execlists */
int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/9 v6] drm/i915: Prepare for GuC-based command submission
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
` (2 preceding siblings ...)
2015-08-12 14:43 ` [PATCH 3/9 v6] drm/i915: Expose one LRC function for GuC submission mode Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:43 ` [PATCH 5/9 v6] drm/i915: Enable GuC firmware log Dave Gordon
` (5 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
From: Alex Dai <yu.dai@intel.com>
This adds the first of the data structures used to communicate with the
GuC (the pool of guc_context structures).
We create a GuC-specific wrapper round the GEM object allocator as all
GEM objects shared with the GuC must be pinned into GGTT space at an
address that is NOT in the range [0..WOPCM_TOP), as that range of GGTT
addresses is not accessible to the GuC (from the GuC's point of view,
it's permanently reserved for other objects such as the BootROM & SRAM).
Later, we will need to allocate additional GuC-sharable objects for the
submission client(s) and the GuC's debug log.
v2:
Remove redundant initialisation [Chris Wilson]
Defer adding struct members until needed [Chris Wilson]
Local functions should pass dev_priv rather than dev [Chris Wilson]
v5:
Invalidate GuC TLB after allocating and pinning a new object
v6:
Rebased
Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/Makefile | 3 +-
drivers/gpu/drm/i915/i915_guc_submission.c | 118 +++++++++++++++++++++++++++++
drivers/gpu/drm/i915/intel_guc.h | 7 ++
drivers/gpu/drm/i915/intel_guc_loader.c | 21 +++++
4 files changed, 148 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 65e52fa..44d290a 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -41,7 +41,8 @@ i915-y += i915_cmd_parser.o \
intel_uncore.o
# general-purpose microcontroller (GuC) support
-i915-y += intel_guc_loader.o
+i915-y += intel_guc_loader.o \
+ i915_guc_submission.o
# autogenerated null render state
i915-y += intel_renderstate_gen6.o \
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
new file mode 100644
index 0000000..8ff59aa
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -0,0 +1,118 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+#include <linux/firmware.h>
+#include <linux/circ_buf.h>
+#include "i915_drv.h"
+#include "intel_guc.h"
+
+/**
+ * gem_allocate_guc_obj() - Allocate gem object for GuC usage
+ * @dev: drm device
+ * @size: size of object
+ *
+ * This is a wrapper to create a gem obj. In order to use it inside GuC, the
+ * object needs to be pinned lifetime. Also we must pin it to gtt space other
+ * than [0, GUC_WOPCM_TOP) because this range is reserved inside GuC.
+ *
+ * Return: A drm_i915_gem_object if successful, otherwise NULL.
+ */
+static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev,
+ u32 size)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_gem_object *obj;
+
+ obj = i915_gem_alloc_object(dev, size);
+ if (!obj)
+ return NULL;
+
+ if (i915_gem_object_get_pages(obj)) {
+ drm_gem_object_unreference(&obj->base);
+ return NULL;
+ }
+
+ if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
+ PIN_OFFSET_BIAS | GUC_WOPCM_TOP)) {
+ drm_gem_object_unreference(&obj->base);
+ return NULL;
+ }
+
+ /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
+ I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
+ return obj;
+}
+
+/**
+ * gem_release_guc_obj() - Release gem object allocated for GuC usage
+ * @obj: gem obj to be released
+ */
+static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
+{
+ if (!obj)
+ return;
+
+ if (i915_gem_obj_is_pinned(obj))
+ i915_gem_object_ggtt_unpin(obj);
+
+ drm_gem_object_unreference(&obj->base);
+}
+
+/*
+ * Set up the memory resources to be shared with the GuC. At this point,
+ * we require just one object that can be mapped through the GGTT.
+ */
+int i915_guc_submission_init(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ const size_t ctxsize = sizeof(struct guc_context_desc);
+ const size_t poolsize = GUC_MAX_GPU_CONTEXTS * ctxsize;
+ const size_t gemsize = round_up(poolsize, PAGE_SIZE);
+ struct intel_guc *guc = &dev_priv->guc;
+
+ if (!i915.enable_guc_submission)
+ return 0; /* not enabled */
+
+ if (guc->ctx_pool_obj)
+ return 0; /* already allocated */
+
+ guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize);
+ if (!guc->ctx_pool_obj)
+ return -ENOMEM;
+
+ ida_init(&guc->ctx_ids);
+
+ return 0;
+}
+
+void i915_guc_submission_fini(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc *guc = &dev_priv->guc;
+
+ if (guc->ctx_pool_obj)
+ ida_destroy(&guc->ctx_ids);
+ gem_release_guc_obj(guc->ctx_pool_obj);
+ guc->ctx_pool_obj = NULL;
+}
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 2846b6d..be3cad8 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -56,6 +56,9 @@ struct intel_guc {
struct intel_guc_fw guc_fw;
uint32_t log_flags;
+
+ struct drm_i915_gem_object *ctx_pool_obj;
+ struct ida ctx_ids;
};
/* intel_guc_loader.c */
@@ -64,4 +67,8 @@ extern int intel_guc_ucode_load(struct drm_device *dev);
extern void intel_guc_ucode_fini(struct drm_device *dev);
extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
+/* i915_guc_submission.c */
+int i915_guc_submission_init(struct drm_device *dev);
+void i915_guc_submission_fini(struct drm_device *dev);
+
#endif
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index dd62c31..6ff7fea 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -128,6 +128,21 @@ static void set_guc_init_params(struct drm_i915_private *dev_priv)
i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
}
+ /* If GuC submission is enabled, set up additional parameters here */
+ if (i915.enable_guc_submission) {
+ u32 pgs = i915_gem_obj_ggtt_offset(dev_priv->guc.ctx_pool_obj);
+ u32 ctx_in_16 = GUC_MAX_GPU_CONTEXTS / 16;
+
+ pgs >>= PAGE_SHIFT;
+ params[GUC_CTL_CTXINFO] = (pgs << GUC_CTL_BASE_ADDR_SHIFT) |
+ (ctx_in_16 << GUC_CTL_CTXNUM_IN16_SHIFT);
+
+ params[GUC_CTL_FEATURE] |= GUC_CTL_KERNEL_SUBMISSIONS;
+
+ /* Unmask this bit to enable the GuC's internal scheduler */
+ params[GUC_CTL_FEATURE] &= ~GUC_CTL_DISABLE_SCHEDULER;
+ }
+
I915_WRITE(SOFT_SCRATCH(0), 0);
for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
@@ -360,6 +375,10 @@ int intel_guc_ucode_load(struct drm_device *dev)
break;
}
+ err = i915_guc_submission_init(dev);
+ if (err)
+ goto fail;
+
err = guc_ucode_xfer(dev_priv);
if (err)
goto fail;
@@ -521,6 +540,8 @@ void intel_guc_ucode_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+ i915_guc_submission_fini(dev);
+
if (guc_fw->guc_fw_obj)
drm_gem_object_unreference(&guc_fw->guc_fw_obj->base);
guc_fw->guc_fw_obj = NULL;
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/9 v6] drm/i915: Enable GuC firmware log
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
` (3 preceding siblings ...)
2015-08-12 14:43 ` [PATCH 4/9 v6] drm/i915: Prepare for GuC-based command submission Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:43 ` [PATCH 6/9 v6] drm/i915: Implementation of GuC submission client Dave Gordon
` (4 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
From: Alex Dai <yu.dai@intel.com>
Allocate a GEM object to hold GuC log data. A debugfs interface
(i915_guc_log_dump) is provided to print out the log content.
v2:
Add struct members at point of use [Chris Wilson]
v6:
Rebased
Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 29 +++++++++++++++++++
drivers/gpu/drm/i915/i915_guc_submission.c | 46 ++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/intel_guc.h | 1 +
3 files changed, 76 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3609b13..529d2fd 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2412,6 +2412,34 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
return 0;
}
+static int i915_guc_log_dump(struct seq_file *m, void *data)
+{
+ struct drm_info_node *node = m->private;
+ struct drm_device *dev = node->minor->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj;
+ u32 *log;
+ int i = 0, pg;
+
+ if (!log_obj)
+ return 0;
+
+ for (pg = 0; pg < log_obj->base.size / PAGE_SIZE; pg++) {
+ log = kmap_atomic(i915_gem_object_get_page(log_obj, pg));
+
+ for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4)
+ seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n",
+ *(log + i), *(log + i + 1),
+ *(log + i + 2), *(log + i + 3));
+
+ kunmap_atomic(log);
+ }
+
+ seq_putc(m, '\n');
+
+ return 0;
+}
+
static int i915_edp_psr_status(struct seq_file *m, void *data)
{
struct drm_info_node *node = m->private;
@@ -5072,6 +5100,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
{"i915_guc_load_status", i915_guc_load_status_info, 0},
+ {"i915_guc_log_dump", i915_guc_log_dump, 0},
{"i915_frequency_info", i915_frequency_info, 0},
{"i915_hangcheck_info", i915_hangcheck_info, 0},
{"i915_drpc_info", i915_drpc_info, 0},
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8ff59aa..669c889 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -79,6 +79,47 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
drm_gem_object_unreference(&obj->base);
}
+static void guc_create_log(struct intel_guc *guc)
+{
+ struct drm_i915_private *dev_priv = guc_to_i915(guc);
+ struct drm_i915_gem_object *obj;
+ unsigned long offset;
+ uint32_t size, flags;
+
+ if (i915.guc_log_level < GUC_LOG_VERBOSITY_MIN)
+ return;
+
+ if (i915.guc_log_level > GUC_LOG_VERBOSITY_MAX)
+ i915.guc_log_level = GUC_LOG_VERBOSITY_MAX;
+
+ /* The first page is to save log buffer state. Allocate one
+ * extra page for others in case for overlap */
+ size = (1 + GUC_LOG_DPC_PAGES + 1 +
+ GUC_LOG_ISR_PAGES + 1 +
+ GUC_LOG_CRASH_PAGES + 1) << PAGE_SHIFT;
+
+ obj = guc->log_obj;
+ if (!obj) {
+ obj = gem_allocate_guc_obj(dev_priv->dev, size);
+ if (!obj) {
+ /* logging will be off */
+ i915.guc_log_level = -1;
+ return;
+ }
+
+ guc->log_obj = obj;
+ }
+
+ /* each allocated unit is a page */
+ flags = GUC_LOG_VALID | GUC_LOG_NOTIFY_ON_HALF_FULL |
+ (GUC_LOG_DPC_PAGES << GUC_LOG_DPC_SHIFT) |
+ (GUC_LOG_ISR_PAGES << GUC_LOG_ISR_SHIFT) |
+ (GUC_LOG_CRASH_PAGES << GUC_LOG_CRASH_SHIFT);
+
+ offset = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT; /* in pages */
+ guc->log_flags = (offset << GUC_LOG_BUF_ADDR_SHIFT) | flags;
+}
+
/*
* Set up the memory resources to be shared with the GuC. At this point,
* we require just one object that can be mapped through the GGTT.
@@ -103,6 +144,8 @@ int i915_guc_submission_init(struct drm_device *dev)
ida_init(&guc->ctx_ids);
+ guc_create_log(guc);
+
return 0;
}
@@ -111,6 +154,9 @@ void i915_guc_submission_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc *guc = &dev_priv->guc;
+ gem_release_guc_obj(dev_priv->guc.log_obj);
+ guc->log_obj = NULL;
+
if (guc->ctx_pool_obj)
ida_destroy(&guc->ctx_ids);
gem_release_guc_obj(guc->ctx_pool_obj);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index be3cad8..5b51b05 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -56,6 +56,7 @@ struct intel_guc {
struct intel_guc_fw guc_fw;
uint32_t log_flags;
+ struct drm_i915_gem_object *log_obj;
struct drm_i915_gem_object *ctx_pool_obj;
struct ida ctx_ids;
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 6/9 v6] drm/i915: Implementation of GuC submission client
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
` (4 preceding siblings ...)
2015-08-12 14:43 ` [PATCH 5/9 v6] drm/i915: Enable GuC firmware log Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:43 ` [PATCH 7/9 v6] drm/i915: Interrupt routing for GuC submission Dave Gordon
` (3 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
A GuC client has its own doorbell and workqueue. It maintains the
doorbell cache line, process description object and work queue item.
A default guc_client is created for the i915 driver to use for
normal-priority in-order submission.
Note that the created client is not yet ready for use; doorbell
allocation will fail as we haven't yet linked the GuC's context
descriptor to the default contexts for each ring (see later patch).
v2:
Defer adding structure members until needed [Chris Wilson]
Rationalise type declarations [Chris Wilson]
v5:
Add GuC per-engine submission & seqno statistics.
Move wq locking to encompass both get_space() and add_item().
Take forcewake lock in host2guc_action() [Tom O'Rourke]
v6:
Fix GuC doorbell cacheline selection code (the
cacheline-within-page calculation was wrong).
Rename GuC priorities to make them closer to the names used in
the GuC firmware source, matching what the autogenerated
versions will (probably) be.
Add per-ring statistics to client.
Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/i915_guc_submission.c | 666 +++++++++++++++++++++++++++++
drivers/gpu/drm/i915/intel_guc.h | 46 ++
drivers/gpu/drm/i915/intel_guc_fwif.h | 6 +-
drivers/gpu/drm/i915/intel_guc_loader.c | 10 +
4 files changed, 725 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 669c889..3352b85 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -27,6 +27,529 @@
#include "intel_guc.h"
/**
+ * DOC: GuC Client
+ *
+ * i915_guc_client:
+ * We use the term client to avoid confusion with contexts. A i915_guc_client is
+ * equivalent to GuC object guc_context_desc. This context descriptor is
+ * allocated from a pool of 1024 entries. Kernel driver will allocate doorbell
+ * and workqueue for it. Also the process descriptor (guc_process_desc), which
+ * is mapped to client space. So the client can write Work Item then ring the
+ * doorbell.
+ *
+ * To simplify the implementation, we allocate one gem object that contains all
+ * pages for doorbell, process descriptor and workqueue.
+ *
+ * The Scratch registers:
+ * There are 16 MMIO-based registers start from 0xC180. The kernel driver writes
+ * a value to the action register (SOFT_SCRATCH_0) along with any data. It then
+ * triggers an interrupt on the GuC via another register write (0xC4C8).
+ * Firmware writes a success/fail code back to the action register after
+ * processes the request. The kernel driver polls waiting for this update and
+ * then proceeds.
+ * See host2guc_action()
+ *
+ * Doorbells:
+ * Doorbells are interrupts to uKernel. A doorbell is a single cache line (QW)
+ * mapped into process space.
+ *
+ * Work Items:
+ * There are several types of work items that the host may place into a
+ * workqueue, each with its own requirements and limitations. Currently only
+ * WQ_TYPE_INORDER is needed to support legacy submission via GuC, which
+ * represents in-order queue. The kernel driver packs ring tail pointer and an
+ * ELSP context descriptor dword into Work Item.
+ * See guc_add_workqueue_item()
+ *
+ */
+
+/*
+ * Read GuC command/status register (SOFT_SCRATCH_0)
+ * Return true if it contains a response rather than a command
+ */
+static inline bool host2guc_action_response(struct drm_i915_private *dev_priv,
+ u32 *status)
+{
+ u32 val = I915_READ(SOFT_SCRATCH(0));
+ *status = val;
+ return GUC2HOST_IS_RESPONSE(val);
+}
+
+static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
+{
+ struct drm_i915_private *dev_priv = guc_to_i915(guc);
+ u32 status;
+ int i;
+ int ret;
+
+ if (WARN_ON(len < 1 || len > 15))
+ return -EINVAL;
+
+ intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+ spin_lock(&dev_priv->guc.host2guc_lock);
+
+ dev_priv->guc.action_count += 1;
+ dev_priv->guc.action_cmd = data[0];
+
+ for (i = 0; i < len; i++)
+ I915_WRITE(SOFT_SCRATCH(i), data[i]);
+
+ POSTING_READ(SOFT_SCRATCH(i - 1));
+
+ I915_WRITE(HOST2GUC_INTERRUPT, HOST2GUC_TRIGGER);
+
+ /* No HOST2GUC command should take longer than 10ms */
+ ret = wait_for_atomic(host2guc_action_response(dev_priv, &status), 10);
+ if (status != GUC2HOST_STATUS_SUCCESS) {
+ /*
+ * Either the GuC explicitly returned an error (which
+ * we convert to -EIO here) or no response at all was
+ * received within the timeout limit (-ETIMEDOUT)
+ */
+ if (ret != -ETIMEDOUT)
+ ret = -EIO;
+
+ DRM_ERROR("GUC: host2guc action 0x%X failed. ret=%d "
+ "status=0x%08X response=0x%08X\n",
+ data[0], ret, status,
+ I915_READ(SOFT_SCRATCH(15)));
+
+ dev_priv->guc.action_fail += 1;
+ dev_priv->guc.action_err = ret;
+ }
+ dev_priv->guc.action_status = status;
+
+ spin_unlock(&dev_priv->guc.host2guc_lock);
+ intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+ return ret;
+}
+
+/*
+ * Tell the GuC to allocate or deallocate a specific doorbell
+ */
+
+static int host2guc_allocate_doorbell(struct intel_guc *guc,
+ struct i915_guc_client *client)
+{
+ u32 data[2];
+
+ data[0] = HOST2GUC_ACTION_ALLOCATE_DOORBELL;
+ data[1] = client->ctx_index;
+
+ return host2guc_action(guc, data, 2);
+}
+
+static int host2guc_release_doorbell(struct intel_guc *guc,
+ struct i915_guc_client *client)
+{
+ u32 data[2];
+
+ data[0] = HOST2GUC_ACTION_DEALLOCATE_DOORBELL;
+ data[1] = client->ctx_index;
+
+ return host2guc_action(guc, data, 2);
+}
+
+/*
+ * Initialise, update, or clear doorbell data shared with the GuC
+ *
+ * These functions modify shared data and so need access to the mapped
+ * client object which contains the page being used for the doorbell
+ */
+
+static void guc_init_doorbell(struct intel_guc *guc,
+ struct i915_guc_client *client)
+{
+ struct guc_doorbell_info *doorbell;
+ void *base;
+
+ base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
+ doorbell = base + client->doorbell_offset;
+
+ doorbell->db_status = 1;
+ doorbell->cookie = 0;
+
+ kunmap_atomic(base);
+}
+
+static int guc_ring_doorbell(struct i915_guc_client *gc)
+{
+ struct guc_process_desc *desc;
+ union guc_doorbell_qw db_cmp, db_exc, db_ret;
+ union guc_doorbell_qw *db;
+ void *base;
+ int attempt = 2, ret = -EAGAIN;
+
+ base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
+ desc = base + gc->proc_desc_offset;
+
+ /* Update the tail so it is visible to GuC */
+ desc->tail = gc->wq_tail;
+
+ /* current cookie */
+ db_cmp.db_status = GUC_DOORBELL_ENABLED;
+ db_cmp.cookie = gc->cookie;
+
+ /* cookie to be updated */
+ db_exc.db_status = GUC_DOORBELL_ENABLED;
+ db_exc.cookie = gc->cookie + 1;
+ if (db_exc.cookie == 0)
+ db_exc.cookie = 1;
+
+ /* pointer of current doorbell cacheline */
+ db = base + gc->doorbell_offset;
+
+ while (attempt--) {
+ /* lets ring the doorbell */
+ db_ret.value_qw = atomic64_cmpxchg((atomic64_t *)db,
+ db_cmp.value_qw, db_exc.value_qw);
+
+ /* if the exchange was successfully executed */
+ if (db_ret.value_qw == db_cmp.value_qw) {
+ /* db was successfully rung */
+ gc->cookie = db_exc.cookie;
+ ret = 0;
+ break;
+ }
+
+ /* XXX: doorbell was lost and need to acquire it again */
+ if (db_ret.db_status == GUC_DOORBELL_DISABLED)
+ break;
+
+ DRM_ERROR("Cookie mismatch. Expected %d, returned %d\n",
+ db_cmp.cookie, db_ret.cookie);
+
+ /* update the cookie to newly read cookie from GuC */
+ db_cmp.cookie = db_ret.cookie;
+ db_exc.cookie = db_ret.cookie + 1;
+ if (db_exc.cookie == 0)
+ db_exc.cookie = 1;
+ }
+
+ kunmap_atomic(base);
+ return ret;
+}
+
+static void guc_disable_doorbell(struct intel_guc *guc,
+ struct i915_guc_client *client)
+{
+ struct drm_i915_private *dev_priv = guc_to_i915(guc);
+ struct guc_doorbell_info *doorbell;
+ void *base;
+ int drbreg = GEN8_DRBREGL(client->doorbell_id);
+ int value;
+
+ base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
+ doorbell = base + client->doorbell_offset;
+
+ doorbell->db_status = 0;
+
+ kunmap_atomic(base);
+
+ I915_WRITE(drbreg, I915_READ(drbreg) & ~GEN8_DRB_VALID);
+
+ value = I915_READ(drbreg);
+ WARN_ON((value & GEN8_DRB_VALID) != 0);
+
+ I915_WRITE(GEN8_DRBREGU(client->doorbell_id), 0);
+ I915_WRITE(drbreg, 0);
+
+ /* XXX: wait for any interrupts */
+ /* XXX: wait for workqueue to drain */
+}
+
+/*
+ * Select, assign and relase doorbell cachelines
+ *
+ * These functions track which doorbell cachelines are in use.
+ * The data they manipulate is protected by the host2guc lock.
+ */
+
+static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
+{
+ const uint32_t cacheline_size = cache_line_size();
+ uint32_t offset;
+
+ spin_lock(&guc->host2guc_lock);
+
+ /* Doorbell uses a single cache line within a page */
+ offset = offset_in_page(guc->db_cacheline);
+
+ /* Moving to next cache line to reduce contention */
+ guc->db_cacheline += cacheline_size;
+
+ spin_unlock(&guc->host2guc_lock);
+
+ DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
+ offset, guc->db_cacheline, cacheline_size);
+
+ return offset;
+}
+
+static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
+{
+ /*
+ * The bitmap is split into two halves; the first half is used for
+ * normal priority contexts, the second half for high-priority ones.
+ * Note that logically higher priorities are numerically less than
+ * normal ones, so the test below means "is it high-priority?"
+ */
+ const bool hi_pri = (priority <= GUC_CTX_PRIORITY_HIGH);
+ const uint16_t half = GUC_MAX_DOORBELLS / 2;
+ const uint16_t start = hi_pri ? half : 0;
+ const uint16_t end = start + half;
+ uint16_t id;
+
+ spin_lock(&guc->host2guc_lock);
+ id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
+ if (id == end)
+ id = GUC_INVALID_DOORBELL_ID;
+ else
+ bitmap_set(guc->doorbell_bitmap, id, 1);
+ spin_unlock(&guc->host2guc_lock);
+
+ DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
+ hi_pri ? "high" : "normal", id);
+
+ return id;
+}
+
+static void release_doorbell(struct intel_guc *guc, uint16_t id)
+{
+ spin_lock(&guc->host2guc_lock);
+ bitmap_clear(guc->doorbell_bitmap, id, 1);
+ spin_unlock(&guc->host2guc_lock);
+}
+
+/*
+ * Initialise the process descriptor shared with the GuC firmware.
+ */
+static void guc_init_proc_desc(struct intel_guc *guc,
+ struct i915_guc_client *client)
+{
+ struct guc_process_desc *desc;
+ void *base;
+
+ base = kmap_atomic(i915_gem_object_get_page(client->client_obj, 0));
+ desc = base + client->proc_desc_offset;
+
+ memset(desc, 0, sizeof(*desc));
+
+ /*
+ * XXX: pDoorbell and WQVBaseAddress are pointers in process address
+ * space for ring3 clients (set them as in mmap_ioctl) or kernel
+ * space for kernel clients (map on demand instead? May make debug
+ * easier to have it mapped).
+ */
+ desc->wq_base_addr = 0;
+ desc->db_base_addr = 0;
+
+ desc->context_id = client->ctx_index;
+ desc->wq_size_bytes = client->wq_size;
+ desc->wq_status = WQ_STATUS_ACTIVE;
+ desc->priority = client->priority;
+
+ kunmap_atomic(base);
+}
+
+/*
+ * Initialise/clear the context descriptor shared with the GuC firmware.
+ *
+ * This descriptor tells the GuC where (in GGTT space) to find the important
+ * data structures relating to this client (doorbell, process descriptor,
+ * write queue, etc).
+ */
+
+static void guc_init_ctx_desc(struct intel_guc *guc,
+ struct i915_guc_client *client)
+{
+ struct guc_context_desc desc;
+ struct sg_table *sg;
+
+ memset(&desc, 0, sizeof(desc));
+
+ desc.attribute = GUC_CTX_DESC_ATTR_ACTIVE | GUC_CTX_DESC_ATTR_KERNEL;
+ desc.context_id = client->ctx_index;
+ desc.priority = client->priority;
+ desc.engines_used = (1 << RCS) | (1 << VCS) | (1 << BCS) |
+ (1 << VECS) | (1 << VCS2); /* all engines */
+ desc.db_id = client->doorbell_id;
+
+ /*
+ * The CPU address is only needed at certain points, so kmap_atomic on
+ * demand instead of storing it in the ctx descriptor.
+ * XXX: May make debug easier to have it mapped
+ */
+ desc.db_trigger_cpu = 0;
+ desc.db_trigger_uk = client->doorbell_offset +
+ i915_gem_obj_ggtt_offset(client->client_obj);
+ desc.db_trigger_phy = client->doorbell_offset +
+ sg_dma_address(client->client_obj->pages->sgl);
+
+ desc.process_desc = client->proc_desc_offset +
+ i915_gem_obj_ggtt_offset(client->client_obj);
+
+ desc.wq_addr = client->wq_offset +
+ i915_gem_obj_ggtt_offset(client->client_obj);
+
+ desc.wq_size = client->wq_size;
+
+ /*
+ * XXX: Take LRCs from an existing intel_context if this is not an
+ * IsKMDCreatedContext client
+ */
+ desc.desc_private = (uintptr_t)client;
+
+ /* Pool context is pinned already */
+ sg = guc->ctx_pool_obj->pages;
+ sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
+ sizeof(desc) * client->ctx_index);
+}
+
+static void guc_fini_ctx_desc(struct intel_guc *guc,
+ struct i915_guc_client *client)
+{
+ struct guc_context_desc desc;
+ struct sg_table *sg;
+
+ memset(&desc, 0, sizeof(desc));
+
+ sg = guc->ctx_pool_obj->pages;
+ sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
+ sizeof(desc) * client->ctx_index);
+}
+
+/* Get valid workqueue item and return it back to offset */
+static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+{
+ struct guc_process_desc *desc;
+ void *base;
+ u32 size = sizeof(struct guc_wq_item);
+ int ret = 0, timeout_counter = 200;
+
+ base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
+ desc = base + gc->proc_desc_offset;
+
+ while (timeout_counter-- > 0) {
+ ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
+ gc->wq_size) >= size, 1);
+
+ if (!ret) {
+ *offset = gc->wq_tail;
+
+ /* advance the tail for next workqueue item */
+ gc->wq_tail += size;
+ gc->wq_tail &= gc->wq_size - 1;
+
+ /* this will break the loop */
+ timeout_counter = 0;
+ }
+ };
+
+ kunmap_atomic(base);
+
+ return ret;
+}
+
+static int guc_add_workqueue_item(struct i915_guc_client *gc,
+ struct drm_i915_gem_request *rq)
+{
+ enum intel_ring_id ring_id = rq->ring->id;
+ struct guc_wq_item *wqi;
+ void *base;
+ u32 tail, wq_len, wq_off = 0;
+ int ret;
+
+ ret = guc_get_workqueue_space(gc, &wq_off);
+ if (ret)
+ return ret;
+
+ /* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
+ * should not have the case where structure wqi is across page, neither
+ * wrapped to the beginning. This simplifies the implementation below.
+ *
+ * XXX: if not the case, we need save data to a temp wqi and copy it to
+ * workqueue buffer dw by dw.
+ */
+ WARN_ON(sizeof(struct guc_wq_item) != 16);
+ WARN_ON(wq_off & 3);
+
+ /* wq starts from the page after doorbell / process_desc */
+ base = kmap_atomic(i915_gem_object_get_page(gc->client_obj,
+ (wq_off + GUC_DB_SIZE) >> PAGE_SHIFT));
+ wq_off &= PAGE_SIZE - 1;
+ wqi = (struct guc_wq_item *)((char *)base + wq_off);
+
+ /* len does not include the header */
+ wq_len = sizeof(struct guc_wq_item) / sizeof(u32) - 1;
+ wqi->header = WQ_TYPE_INORDER |
+ (wq_len << WQ_LEN_SHIFT) |
+ (ring_id << WQ_TARGET_SHIFT) |
+ WQ_NO_WCFLUSH_WAIT;
+
+ /* The GuC wants only the low-order word of the context descriptor */
+ wqi->context_desc = (u32)intel_lr_context_descriptor(rq->ctx, rq->ring);
+
+ /* The GuC firmware wants the tail index in QWords, not bytes */
+ tail = rq->ringbuf->tail >> 3;
+ wqi->ring_tail = tail << WQ_RING_TAIL_SHIFT;
+ wqi->fence_id = 0; /*XXX: what fence to be here */
+
+ kunmap_atomic(base);
+
+ return 0;
+}
+
+/**
+ * i915_guc_submit() - Submit commands through GuC
+ * @client: the guc client where commands will go through
+ * @ctx: LRC where commands come from
+ * @ring: HW engine that will excute the commands
+ *
+ * Return: 0 if succeed
+ */
+int i915_guc_submit(struct i915_guc_client *client,
+ struct drm_i915_gem_request *rq)
+{
+ struct intel_guc *guc = client->guc;
+ enum intel_ring_id ring_id = rq->ring->id;
+ unsigned long flags;
+ int q_ret, b_ret;
+
+ spin_lock_irqsave(&client->wq_lock, flags);
+
+ q_ret = guc_add_workqueue_item(client, rq);
+ if (q_ret == 0)
+ b_ret = guc_ring_doorbell(client);
+
+ client->submissions[ring_id] += 1;
+ if (q_ret) {
+ client->q_fail += 1;
+ client->retcode = q_ret;
+ } else if (b_ret) {
+ client->b_fail += 1;
+ client->retcode = q_ret = b_ret;
+ } else {
+ client->retcode = 0;
+ }
+ spin_unlock_irqrestore(&client->wq_lock, flags);
+
+ spin_lock(&guc->host2guc_lock);
+ guc->submissions[ring_id] += 1;
+ guc->last_seqno[ring_id] = rq->seqno;
+ spin_unlock(&guc->host2guc_lock);
+
+ return q_ret;
+}
+
+/*
+ * Everything below here is concerned with setup & teardown, and is
+ * therefore not part of the somewhat time-critical batch-submission
+ * path of i915_guc_submit() above.
+ */
+
+/**
* gem_allocate_guc_obj() - Allocate gem object for GuC usage
* @dev: drm device
* @size: size of object
@@ -79,6 +602,121 @@ static void gem_release_guc_obj(struct drm_i915_gem_object *obj)
drm_gem_object_unreference(&obj->base);
}
+static void guc_client_free(struct drm_device *dev,
+ struct i915_guc_client *client)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc *guc = &dev_priv->guc;
+
+ if (!client)
+ return;
+
+ if (client->doorbell_id != GUC_INVALID_DOORBELL_ID) {
+ /*
+ * First disable the doorbell, then tell the GuC we've
+ * finished with it, finally deallocate it in our bitmap
+ */
+ guc_disable_doorbell(guc, client);
+ host2guc_release_doorbell(guc, client);
+ release_doorbell(guc, client->doorbell_id);
+ }
+
+ /*
+ * XXX: wait for any outstanding submissions before freeing memory.
+ * Be sure to drop any locks
+ */
+
+ gem_release_guc_obj(client->client_obj);
+
+ if (client->ctx_index != GUC_INVALID_CTX_ID) {
+ guc_fini_ctx_desc(guc, client);
+ ida_simple_remove(&guc->ctx_ids, client->ctx_index);
+ }
+
+ kfree(client);
+}
+
+/**
+ * guc_client_alloc() - Allocate an i915_guc_client
+ * @dev: drm device
+ * @priority: four levels priority _CRITICAL, _HIGH, _NORMAL and _LOW
+ * The kernel client to replace ExecList submission is created with
+ * NORMAL priority. Priority of a client for scheduler can be HIGH,
+ * while a preemption context can use CRITICAL.
+ *
+ * Return: An i915_guc_client object if success.
+ */
+static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
+ uint32_t priority)
+{
+ struct i915_guc_client *client;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc *guc = &dev_priv->guc;
+ struct drm_i915_gem_object *obj;
+
+ client = kzalloc(sizeof(*client), GFP_KERNEL);
+ if (!client)
+ return NULL;
+
+ client->doorbell_id = GUC_INVALID_DOORBELL_ID;
+ client->priority = priority;
+ client->guc = guc;
+
+ client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
+ GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
+ if (client->ctx_index >= GUC_MAX_GPU_CONTEXTS) {
+ client->ctx_index = GUC_INVALID_CTX_ID;
+ goto err;
+ }
+
+ /* The first page is doorbell/proc_desc. Two followed pages are wq. */
+ obj = gem_allocate_guc_obj(dev, GUC_DB_SIZE + GUC_WQ_SIZE);
+ if (!obj)
+ goto err;
+
+ client->client_obj = obj;
+ client->wq_offset = GUC_DB_SIZE;
+ client->wq_size = GUC_WQ_SIZE;
+ spin_lock_init(&client->wq_lock);
+
+ client->doorbell_offset = select_doorbell_cacheline(guc);
+
+ /*
+ * Since the doorbell only requires a single cacheline, we can save
+ * space by putting the application process descriptor in the same
+ * page. Use the half of the page that doesn't include the doorbell.
+ */
+ if (client->doorbell_offset >= (GUC_DB_SIZE / 2))
+ client->proc_desc_offset = 0;
+ else
+ client->proc_desc_offset = (GUC_DB_SIZE / 2);
+
+ client->doorbell_id = assign_doorbell(guc, client->priority);
+ if (client->doorbell_id == GUC_INVALID_DOORBELL_ID)
+ /* XXX: evict a doorbell instead */
+ goto err;
+
+ guc_init_proc_desc(guc, client);
+ guc_init_ctx_desc(guc, client);
+ guc_init_doorbell(guc, client);
+
+ /* XXX: Any cache flushes needed? General domain mgmt calls? */
+
+ if (host2guc_allocate_doorbell(guc, client))
+ goto err;
+
+ DRM_DEBUG_DRIVER("new priority %u client %p: ctx_index %u db_id %u\n",
+ priority, client, client->ctx_index, client->doorbell_id);
+
+ return client;
+
+err:
+ DRM_ERROR("FAILED to create priority %u GuC client!\n", priority);
+
+ guc_client_free(dev, client);
+ return NULL;
+}
+
static void guc_create_log(struct intel_guc *guc)
{
struct drm_i915_private *dev_priv = guc_to_i915(guc);
@@ -142,6 +780,8 @@ int i915_guc_submission_init(struct drm_device *dev)
if (!guc->ctx_pool_obj)
return -ENOMEM;
+ spin_lock_init(&dev_priv->guc.host2guc_lock);
+
ida_init(&guc->ctx_ids);
guc_create_log(guc);
@@ -149,6 +789,32 @@ int i915_guc_submission_init(struct drm_device *dev)
return 0;
}
+int i915_guc_submission_enable(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc *guc = &dev_priv->guc;
+ struct i915_guc_client *client;
+
+ /* client for execbuf submission */
+ client = guc_client_alloc(dev, GUC_CTX_PRIORITY_KMD_NORMAL);
+ if (!client) {
+ DRM_ERROR("Failed to create execbuf guc_client\n");
+ return -ENOMEM;
+ }
+
+ guc->execbuf_client = client;
+ return 0;
+}
+
+void i915_guc_submission_disable(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc *guc = &dev_priv->guc;
+
+ guc_client_free(dev, guc->execbuf_client);
+ guc->execbuf_client = NULL;
+}
+
void i915_guc_submission_fini(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5b51b05..92658ce 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -27,6 +27,31 @@
#include "intel_guc_fwif.h"
#include "i915_guc_reg.h"
+struct i915_guc_client {
+ struct drm_i915_gem_object *client_obj;
+ struct intel_guc *guc;
+ uint32_t priority;
+ uint32_t ctx_index;
+
+ uint32_t proc_desc_offset;
+ uint32_t doorbell_offset;
+ uint32_t cookie;
+ uint16_t doorbell_id;
+ uint16_t padding; /* Maintain alignment */
+
+ uint32_t wq_offset;
+ uint32_t wq_size;
+
+ spinlock_t wq_lock; /* Protects all data below */
+ uint32_t wq_tail;
+
+ /* GuC submission statistics & status */
+ uint64_t submissions[I915_NUM_RINGS];
+ uint32_t q_fail;
+ uint32_t b_fail;
+ int retcode;
+};
+
enum intel_guc_fw_status {
GUC_FIRMWARE_FAIL = -1,
GUC_FIRMWARE_NONE = 0,
@@ -60,6 +85,23 @@ struct intel_guc {
struct drm_i915_gem_object *ctx_pool_obj;
struct ida ctx_ids;
+
+ struct i915_guc_client *execbuf_client;
+
+ spinlock_t host2guc_lock; /* Protects all data below */
+
+ DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
+ uint32_t db_cacheline; /* Cyclic counter mod pagesize */
+
+ /* Action status & statistics */
+ uint64_t action_count; /* Total commands issued */
+ uint32_t action_cmd; /* Last command word */
+ uint32_t action_status; /* Last return status */
+ uint32_t action_fail; /* Total number of failures */
+ int32_t action_err; /* Last error code */
+
+ uint64_t submissions[I915_NUM_RINGS];
+ uint32_t last_seqno[I915_NUM_RINGS];
};
/* intel_guc_loader.c */
@@ -70,6 +112,10 @@ extern const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status);
/* i915_guc_submission.c */
int i915_guc_submission_init(struct drm_device *dev);
+int i915_guc_submission_enable(struct drm_device *dev);
+int i915_guc_submit(struct i915_guc_client *client,
+ struct drm_i915_gem_request *rq);
+void i915_guc_submission_disable(struct drm_device *dev);
void i915_guc_submission_fini(struct drm_device *dev);
#endif
diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 06aad6f..950c7e7 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -35,10 +35,10 @@
#define GFXCORE_FAMILY_GEN9 12
#define GFXCORE_FAMILY_UNKNOWN 0x7fffffff
-#define GUC_CTX_PRIORITY_CRITICAL 0
+#define GUC_CTX_PRIORITY_KMD_HIGH 0
#define GUC_CTX_PRIORITY_HIGH 1
-#define GUC_CTX_PRIORITY_NORMAL 2
-#define GUC_CTX_PRIORITY_LOW 3
+#define GUC_CTX_PRIORITY_KMD_NORMAL 2
+#define GUC_CTX_PRIORITY_NORMAL 3
#define GUC_MAX_GPU_CONTEXTS 1024
#define GUC_INVALID_CTX_ID (GUC_MAX_GPU_CONTEXTS + 1)
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 6ff7fea..8b4b057 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -342,6 +342,8 @@ int intel_guc_ucode_load(struct drm_device *dev)
intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+ i915_guc_submission_disable(dev);
+
if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
return 0;
@@ -389,12 +391,20 @@ int intel_guc_ucode_load(struct drm_device *dev)
intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+ if (i915.enable_guc_submission) {
+ err = i915_guc_submission_enable(dev);
+ if (err)
+ goto fail;
+ }
+
return 0;
fail:
if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
+ i915_guc_submission_disable(dev);
+
return err;
}
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 7/9 v6] drm/i915: Interrupt routing for GuC submission
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
` (5 preceding siblings ...)
2015-08-12 14:43 ` [PATCH 6/9 v6] drm/i915: Implementation of GuC submission client Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:43 ` [PATCH 8/9 v6] drm/i915: Integrate GuC-based command submission Dave Gordon
` (2 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
Turn on interrupt steering to route necessary interrupts to GuC.
v6:
Rebased
Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 11 +++++--
drivers/gpu/drm/i915/intel_guc_loader.c | 51 +++++++++++++++++++++++++++++++++
2 files changed, 60 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d54dc2c..d4dfd06 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1674,6 +1674,7 @@ enum skl_disp_power_wells {
#define GFX_MODE_GEN7 0x0229c
#define RING_MODE_GEN7(ring) ((ring)->mmio_base+0x29c)
#define GFX_RUN_LIST_ENABLE (1<<15)
+#define GFX_INTERRUPT_STEERING (1<<14)
#define GFX_TLB_INVALIDATE_EXPLICIT (1<<13)
#define GFX_SURFACE_FAULT_ENABLE (1<<12)
#define GFX_REPLAY_MODE (1<<11)
@@ -1681,6 +1682,11 @@ enum skl_disp_power_wells {
#define GFX_PPGTT_ENABLE (1<<9)
#define GEN8_GFX_PPGTT_48B (1<<7)
+#define GFX_FORWARD_VBLANK_MASK (3<<5)
+#define GFX_FORWARD_VBLANK_NEVER (0<<5)
+#define GFX_FORWARD_VBLANK_ALWAYS (1<<5)
+#define GFX_FORWARD_VBLANK_COND (2<<5)
+
#define VLV_DISPLAY_BASE 0x180000
#define VLV_MIPI_BASE VLV_DISPLAY_BASE
@@ -5694,11 +5700,12 @@ enum skl_disp_power_wells {
#define GEN8_GT_IIR(which) (0x44308 + (0x10 * (which)))
#define GEN8_GT_IER(which) (0x4430c + (0x10 * (which)))
-#define GEN8_BCS_IRQ_SHIFT 16
#define GEN8_RCS_IRQ_SHIFT 0
-#define GEN8_VCS2_IRQ_SHIFT 16
+#define GEN8_BCS_IRQ_SHIFT 16
#define GEN8_VCS1_IRQ_SHIFT 0
+#define GEN8_VCS2_IRQ_SHIFT 16
#define GEN8_VECS_IRQ_SHIFT 0
+#define GEN8_WD_IRQ_SHIFT 16
#define GEN8_DE_PIPE_ISR(pipe) (0x44400 + (0x10 * (pipe)))
#define GEN8_DE_PIPE_IMR(pipe) (0x44404 + (0x10 * (pipe)))
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 8b4b057..13e75f6 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -79,6 +79,53 @@ const char *intel_guc_fw_status_repr(enum intel_guc_fw_status status)
}
};
+static void direct_interrupts_to_host(struct drm_i915_private *dev_priv)
+{
+ struct intel_engine_cs *ring;
+ int i, irqs;
+
+ /* tell all command streamers NOT to forward interrupts and vblank to GuC */
+ irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_NEVER);
+ irqs |= _MASKED_BIT_DISABLE(GFX_INTERRUPT_STEERING);
+ for_each_ring(ring, dev_priv, i)
+ I915_WRITE(RING_MODE_GEN7(ring), irqs);
+
+ /* tell DE to send nothing to GuC */
+ I915_WRITE(DE_GUCRMR, ~0);
+
+ /* route all GT interrupts to the host */
+ I915_WRITE(GUC_BCS_RCS_IER, 0);
+ I915_WRITE(GUC_VCS2_VCS1_IER, 0);
+ I915_WRITE(GUC_WD_VECS_IER, 0);
+}
+
+static void direct_interrupts_to_guc(struct drm_i915_private *dev_priv)
+{
+ struct intel_engine_cs *ring;
+ int i, irqs;
+
+ /* tell all command streamers to forward interrupts and vblank to GuC */
+ irqs = _MASKED_FIELD(GFX_FORWARD_VBLANK_MASK, GFX_FORWARD_VBLANK_ALWAYS);
+ irqs |= _MASKED_BIT_ENABLE(GFX_INTERRUPT_STEERING);
+ for_each_ring(ring, dev_priv, i)
+ I915_WRITE(RING_MODE_GEN7(ring), irqs);
+
+ /* tell DE to send (all) flip_done to GuC */
+ irqs = DERRMR_PIPEA_PRI_FLIP_DONE | DERRMR_PIPEA_SPR_FLIP_DONE |
+ DERRMR_PIPEB_PRI_FLIP_DONE | DERRMR_PIPEB_SPR_FLIP_DONE |
+ DERRMR_PIPEC_PRI_FLIP_DONE | DERRMR_PIPEC_SPR_FLIP_DONE;
+ /* Unmasked bits will cause GuC response message to be sent */
+ I915_WRITE(DE_GUCRMR, ~irqs);
+
+ /* route USER_INTERRUPT to Host, all others are sent to GuC. */
+ irqs = GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
+ GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
+ /* These three registers have the same bit definitions */
+ I915_WRITE(GUC_BCS_RCS_IER, ~irqs);
+ I915_WRITE(GUC_VCS2_VCS1_IER, ~irqs);
+ I915_WRITE(GUC_WD_VECS_IER, ~irqs);
+}
+
static u32 get_gttype(struct drm_i915_private *dev_priv)
{
/* XXX: GT type based on PCI device ID? field seems unused by fw */
@@ -342,6 +389,7 @@ int intel_guc_ucode_load(struct drm_device *dev)
intel_guc_fw_status_repr(guc_fw->guc_fw_fetch_status),
intel_guc_fw_status_repr(guc_fw->guc_fw_load_status));
+ direct_interrupts_to_host(dev_priv);
i915_guc_submission_disable(dev);
if (guc_fw->guc_fw_fetch_status == GUC_FIRMWARE_NONE)
@@ -395,6 +443,7 @@ int intel_guc_ucode_load(struct drm_device *dev)
err = i915_guc_submission_enable(dev);
if (err)
goto fail;
+ direct_interrupts_to_guc(dev_priv);
}
return 0;
@@ -403,6 +452,7 @@ fail:
if (guc_fw->guc_fw_load_status == GUC_FIRMWARE_PENDING)
guc_fw->guc_fw_load_status = GUC_FIRMWARE_FAIL;
+ direct_interrupts_to_host(dev_priv);
i915_guc_submission_disable(dev);
return err;
@@ -550,6 +600,7 @@ void intel_guc_ucode_fini(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc_fw *guc_fw = &dev_priv->guc.guc_fw;
+ direct_interrupts_to_host(dev_priv);
i915_guc_submission_fini(dev);
if (guc_fw->guc_fw_obj)
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 8/9 v6] drm/i915: Integrate GuC-based command submission
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
` (6 preceding siblings ...)
2015-08-12 14:43 ` [PATCH 7/9 v6] drm/i915: Interrupt routing for GuC submission Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:43 ` [PATCH 9/9 v6] drm/i915: Debugfs interface for GuC submission statistics Dave Gordon
2015-08-12 14:57 ` [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
From: Alex Dai <yu.dai@intel.com>
GuC-based submission is mostly the same as execlist mode, up to
intel_logical_ring_advance_and_submit(), where the context being
dispatched would be added to the execlist queue; at this point
we submit the context to the GuC backend instead.
There are, however, a few other changes also required, notably:
1. Contexts must be pinned at GGTT addresses accessible by the GuC
i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the
PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls.
2. The GuC's TLB must be invalidated after a context is pinned at
a new GGTT address.
3. GuC firmware uses the one page before Ring Context as shared data.
Therefore, whenever driver wants to get base address of LRC, we
will offset one page for it. LRC_PPHWSP_PN is defined as the page
number of LRCA.
4. In the work queue used to pass requests to the GuC, the GuC
firmware requires the ring-tail-offset to be represented as an
11-bit value, expressed in QWords. Therefore, the ringbuffer
size must be reduced to the representable range (4 pages).
v2:
Defer adding #defines until needed [Chris Wilson]
Rationalise type declarations [Chris Wilson]
v4:
Squashed kerneldoc patch into here [Daniel Vetter]
v5:
Update request->tail in code common to both GuC and execlist modes.
Add a private version of lr_context_update(), as sharing the
execlist version leads to race conditions when the CPU and
the GuC both update TAIL in the context image.
Conversion of error-captured HWS page to string must account
for offset from start of object to actual HWS (LRC_PPHWSP_PN).
Issue: VIZ-4884
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
Documentation/DocBook/drm.tmpl | 14 ++++++
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu/drm/i915/i915_gpu_error.c | 14 +++---
drivers/gpu/drm/i915/i915_guc_submission.c | 79 ++++++++++++++++++++++++++++--
drivers/gpu/drm/i915/intel_guc.h | 1 +
drivers/gpu/drm/i915/intel_lrc.c | 55 ++++++++++++++-------
drivers/gpu/drm/i915/intel_lrc.h | 6 +++
7 files changed, 142 insertions(+), 29 deletions(-)
diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
index 4111902..a01fca9 100644
--- a/Documentation/DocBook/drm.tmpl
+++ b/Documentation/DocBook/drm.tmpl
@@ -4152,6 +4152,20 @@ int num_ioctls;</synopsis>
</sect2>
</sect1>
<sect1>
+ <title>GuC-based Command Submission</title>
+ <sect2>
+ <title>GuC</title>
+!Pdrivers/gpu/drm/i915/intel_guc_loader.c GuC-specific firmware loader
+!Idrivers/gpu/drm/i915/intel_guc_loader.c
+ </sect2>
+ <sect2>
+ <title>GuC Client</title>
+!Pdrivers/gpu/drm/i915/intel_guc_submission.c GuC-based command submissison
+!Idrivers/gpu/drm/i915/intel_guc_submission.c
+ </sect2>
+ </sect1>
+
+ <sect1>
<title> Tracing </title>
<para>
This sections covers all things related to the tracepoints implemented in
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 529d2fd..cfddc9a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1995,7 +1995,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
return;
}
- page = i915_gem_object_get_page(ctx_obj, 1);
+ page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
if (!WARN_ON(page == NULL)) {
reg_state = kmap_atomic(page);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index f79c952..94012e7 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -461,17 +461,17 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
}
if ((obj = error->ring[i].hws_page)) {
- err_printf(m, "%s --- HW Status = 0x%08x\n",
- dev_priv->ring[i].name,
- lower_32_bits(obj->gtt_offset));
+ err_printf(m, "%s --- HW Status = 0x%08llx\n",
+ dev_priv->ring[i].name,
+ obj->gtt_offset + LRC_PPHWSP_PN * PAGE_SIZE);
offset = 0;
for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
err_printf(m, "[%04x] %08x %08x %08x %08x\n",
offset,
- obj->pages[0][elt],
- obj->pages[0][elt+1],
- obj->pages[0][elt+2],
- obj->pages[0][elt+3]);
+ obj->pages[LRC_PPHWSP_PN][elt],
+ obj->pages[LRC_PPHWSP_PN][elt+1],
+ obj->pages[LRC_PPHWSP_PN][elt+2],
+ obj->pages[LRC_PPHWSP_PN][elt+3]);
offset += 16;
}
}
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 3352b85..ec70393 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -364,18 +364,58 @@ static void guc_init_proc_desc(struct intel_guc *guc,
static void guc_init_ctx_desc(struct intel_guc *guc,
struct i915_guc_client *client)
{
+ struct intel_context *ctx = client->owner;
struct guc_context_desc desc;
struct sg_table *sg;
+ int i;
memset(&desc, 0, sizeof(desc));
desc.attribute = GUC_CTX_DESC_ATTR_ACTIVE | GUC_CTX_DESC_ATTR_KERNEL;
desc.context_id = client->ctx_index;
desc.priority = client->priority;
- desc.engines_used = (1 << RCS) | (1 << VCS) | (1 << BCS) |
- (1 << VECS) | (1 << VCS2); /* all engines */
desc.db_id = client->doorbell_id;
+ for (i = 0; i < I915_NUM_RINGS; i++) {
+ struct guc_execlist_context *lrc = &desc.lrc[i];
+ struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
+ struct intel_engine_cs *ring;
+ struct drm_i915_gem_object *obj;
+ uint64_t ctx_desc;
+
+ /* TODO: We have a design issue to be solved here. Only when we
+ * receive the first batch, we know which engine is used by the
+ * user. But here GuC expects the lrc and ring to be pinned. It
+ * is not an issue for default context, which is the only one
+ * for now who owns a GuC client. But for future owner of GuC
+ * client, need to make sure lrc is pinned prior to enter here.
+ */
+ obj = ctx->engine[i].state;
+ if (!obj)
+ break; /* XXX: continue? */
+
+ ring = ringbuf->ring;
+ ctx_desc = intel_lr_context_descriptor(ctx, ring);
+ lrc->context_desc = (u32)ctx_desc;
+
+ /* The state page is after PPHWSP */
+ lrc->ring_lcra = i915_gem_obj_ggtt_offset(obj) +
+ LRC_STATE_PN * PAGE_SIZE;
+ lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
+ (ring->id << GUC_ELC_ENGINE_OFFSET);
+
+ obj = ringbuf->obj;
+
+ lrc->ring_begin = i915_gem_obj_ggtt_offset(obj);
+ lrc->ring_end = lrc->ring_begin + obj->base.size - 1;
+ lrc->ring_next_free_location = lrc->ring_begin;
+ lrc->ring_current_tail_pointer_value = 0;
+
+ desc.engines_used |= (1 << ring->id);
+ }
+
+ WARN_ON(desc.engines_used == 0);
+
/*
* The CPU address is only needed at certain points, so kmap_atomic on
* demand instead of storing it in the ctx descriptor.
@@ -501,6 +541,29 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
return 0;
}
+#define CTX_RING_BUFFER_START 0x08
+
+/* Update the ringbuffer pointer in a saved context image */
+static void lr_context_update(struct drm_i915_gem_request *rq)
+{
+ enum intel_ring_id ring_id = rq->ring->id;
+ struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring_id].state;
+ struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
+ struct page *page;
+ uint32_t *reg_state;
+
+ BUG_ON(!ctx_obj);
+ WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
+ WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
+
+ page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
+ reg_state = kmap_atomic(page);
+
+ reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
+
+ kunmap_atomic(reg_state);
+}
+
/**
* i915_guc_submit() - Submit commands through GuC
* @client: the guc client where commands will go through
@@ -517,6 +580,10 @@ int i915_guc_submit(struct i915_guc_client *client,
unsigned long flags;
int q_ret, b_ret;
+ /* Need this because of the deferred pin ctx and ring */
+ /* Shall we move this right after ring is pinned? */
+ lr_context_update(rq);
+
spin_lock_irqsave(&client->wq_lock, flags);
q_ret = guc_add_workqueue_item(client, rq);
@@ -643,11 +710,13 @@ static void guc_client_free(struct drm_device *dev,
* The kernel client to replace ExecList submission is created with
* NORMAL priority. Priority of a client for scheduler can be HIGH,
* while a preemption context can use CRITICAL.
+ * @ctx the context to own the client (we use the default render context)
*
* Return: An i915_guc_client object if success.
*/
static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
- uint32_t priority)
+ uint32_t priority,
+ struct intel_context *ctx)
{
struct i915_guc_client *client;
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -660,6 +729,7 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
client->doorbell_id = GUC_INVALID_DOORBELL_ID;
client->priority = priority;
+ client->owner = ctx;
client->guc = guc;
client->ctx_index = (uint32_t)ida_simple_get(&guc->ctx_ids, 0,
@@ -793,10 +863,11 @@ int i915_guc_submission_enable(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_guc *guc = &dev_priv->guc;
+ struct intel_context *ctx = dev_priv->ring[RCS].default_context;
struct i915_guc_client *client;
/* client for execbuf submission */
- client = guc_client_alloc(dev, GUC_CTX_PRIORITY_KMD_NORMAL);
+ client = guc_client_alloc(dev, GUC_CTX_PRIORITY_KMD_NORMAL, ctx);
if (!client) {
DRM_ERROR("Failed to create execbuf guc_client\n");
return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 92658ce..4ec2d27 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -29,6 +29,7 @@
struct i915_guc_client {
struct drm_i915_gem_object *client_obj;
+ struct intel_context *owner;
struct intel_guc *guc;
uint32_t priority;
uint32_t ctx_index;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f3411f8..e77b6b0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -263,7 +263,8 @@ int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists
*/
u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
{
- u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);
+ u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
+ LRC_PPHWSP_PN * PAGE_SIZE;
/* LRCA is required to be 4K aligned so the more significant 20 bits
* are globally unique */
@@ -276,7 +277,8 @@ uint64_t intel_lr_context_descriptor(struct intel_context *ctx,
struct drm_device *dev = ring->dev;
struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
uint64_t desc;
- uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
+ uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj) +
+ LRC_PPHWSP_PN * PAGE_SIZE;
WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
@@ -350,7 +352,7 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
- page = i915_gem_object_get_page(ctx_obj, 1);
+ page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
reg_state = kmap_atomic(page);
reg_state[CTX_RING_TAIL+1] = rq->tail;
@@ -548,8 +550,6 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
i915_gem_request_reference(request);
- request->tail = request->ringbuf->tail;
-
spin_lock_irq(&ring->execlist_lock);
list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
@@ -702,13 +702,19 @@ static void
intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
{
struct intel_engine_cs *ring = request->ring;
+ struct drm_i915_private *dev_priv = request->i915;
intel_logical_ring_advance(request->ringbuf);
+ request->tail = request->ringbuf->tail;
+
if (intel_ring_stopped(ring))
return;
- execlists_context_queue(request);
+ if (dev_priv->guc.execbuf_client)
+ i915_guc_submit(dev_priv->guc.execbuf_client, request);
+ else
+ execlists_context_queue(request);
}
static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
@@ -998,6 +1004,7 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
{
+ struct drm_i915_private *dev_priv = rq->i915;
struct intel_engine_cs *ring = rq->ring;
struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
struct intel_ringbuffer *ringbuf = rq->ringbuf;
@@ -1005,14 +1012,18 @@ static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
if (rq->ctx->engine[ring->id].pin_count++ == 0) {
- ret = i915_gem_obj_ggtt_pin(ctx_obj,
- GEN8_LR_CONTEXT_ALIGN, 0);
+ ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
+ PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
if (ret)
goto reset_pin_count;
ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
if (ret)
goto unpin_ctx_obj;
+
+ /* Invalidate GuC TLB. */
+ if (i915.enable_guc_submission)
+ I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
}
return ret;
@@ -2137,7 +2148,7 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
/* The second page of the context object contains some fields which must
* be set up prior to the first execution. */
- page = i915_gem_object_get_page(ctx_obj, 1);
+ page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
reg_state = kmap_atomic(page);
/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
@@ -2307,12 +2318,13 @@ static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
struct drm_i915_gem_object *default_ctx_obj)
{
struct drm_i915_private *dev_priv = ring->dev->dev_private;
+ struct page *page;
- /* The status page is offset 0 from the default context object
- * in LRC mode. */
- ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj);
- ring->status_page.page_addr =
- kmap(sg_page(default_ctx_obj->pages->sgl));
+ /* The HWSP is part of the default context object in LRC mode. */
+ ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj)
+ + LRC_PPHWSP_PN * PAGE_SIZE;
+ page = i915_gem_object_get_page(default_ctx_obj, LRC_PPHWSP_PN);
+ ring->status_page.page_addr = kmap(page);
ring->status_page.obj = default_ctx_obj;
I915_WRITE(RING_HWS_PGA(ring->mmio_base),
@@ -2338,6 +2350,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
{
const bool is_global_default_ctx = (ctx == ring->default_context);
struct drm_device *dev = ring->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_i915_gem_object *ctx_obj;
uint32_t context_size;
struct intel_ringbuffer *ringbuf;
@@ -2348,6 +2361,9 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
context_size = round_up(get_lr_context_size(ring), 4096);
+ /* One extra page as the sharing data between driver and GuC */
+ context_size += PAGE_SIZE * LRC_PPHWSP_PN;
+
ctx_obj = i915_gem_alloc_object(dev, context_size);
if (!ctx_obj) {
DRM_DEBUG_DRIVER("Alloc LRC backing obj failed.\n");
@@ -2355,13 +2371,18 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
}
if (is_global_default_ctx) {
- ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
+ ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN,
+ PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
if (ret) {
DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n",
ret);
drm_gem_object_unreference(&ctx_obj->base);
return ret;
}
+
+ /* Invalidate GuC TLB. */
+ if (i915.enable_guc_submission)
+ I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
}
ringbuf = kzalloc(sizeof(*ringbuf), GFP_KERNEL);
@@ -2374,7 +2395,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx,
ringbuf->ring = ring;
- ringbuf->size = 32 * PAGE_SIZE;
+ ringbuf->size = 4 * PAGE_SIZE;
ringbuf->effective_size = ringbuf->size;
ringbuf->head = 0;
ringbuf->tail = 0;
@@ -2474,7 +2495,7 @@ void intel_lr_context_reset(struct drm_device *dev,
WARN(1, "Failed get_pages for context obj\n");
continue;
}
- page = i915_gem_object_get_page(ctx_obj, 1);
+ page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
reg_state = kmap_atomic(page);
reg_state[CTX_RING_HEAD+1] = 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 5e5788c..4cc54b3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -68,6 +68,12 @@ static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf,
}
/* Logical Ring Contexts */
+
+/* One extra page is added before LRC for GuC as shared data */
+#define LRC_GUCSHR_PN (0)
+#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + 1)
+#define LRC_STATE_PN (LRC_PPHWSP_PN + 1)
+
void intel_lr_context_free(struct intel_context *ctx);
int intel_lr_context_deferred_create(struct intel_context *ctx,
struct intel_engine_cs *ring);
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 9/9 v6] drm/i915: Debugfs interface for GuC submission statistics
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
` (7 preceding siblings ...)
2015-08-12 14:43 ` [PATCH 8/9 v6] drm/i915: Integrate GuC-based command submission Dave Gordon
@ 2015-08-12 14:43 ` Dave Gordon
2015-08-12 14:57 ` [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
9 siblings, 0 replies; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:43 UTC (permalink / raw)
To: intel-gfx
This provides a means of reading status and counts relating
to GuC actions and submissions.
v2:
Remove surplus blank line in output [Chris Wilson]
v5:
Added GuC per-engine submission & seqno statistics
v6:
Add per-ring statistics to client, refactor client-dumper.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Alex Dai <yu.dai@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 76 +++++++++++++++++++++++++++++++++++++
1 file changed, 76 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index cfddc9a..7a28de5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2412,6 +2412,81 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
return 0;
}
+static void i915_guc_client_info(struct seq_file *m,
+ struct drm_i915_private *dev_priv,
+ struct i915_guc_client *client)
+{
+ struct intel_engine_cs *ring;
+ uint64_t tot = 0;
+ uint32_t i;
+
+ seq_printf(m, "\tPriority %d, GuC ctx index: %u, PD offset 0x%x\n",
+ client->priority, client->ctx_index, client->proc_desc_offset);
+ seq_printf(m, "\tDoorbell id %d, offset: 0x%x, cookie 0x%x\n",
+ client->doorbell_id, client->doorbell_offset, client->cookie);
+ seq_printf(m, "\tWQ size %d, offset: 0x%x, tail %d\n",
+ client->wq_size, client->wq_offset, client->wq_tail);
+
+ seq_printf(m, "\tFailed to queue: %u\n", client->q_fail);
+ seq_printf(m, "\tFailed doorbell: %u\n", client->b_fail);
+ seq_printf(m, "\tLast submission result: %d\n", client->retcode);
+
+ for_each_ring(ring, dev_priv, i) {
+ seq_printf(m, "\tSubmissions: %llu %s\n",
+ client->submissions[i],
+ ring->name);
+ tot += client->submissions[i];
+ }
+ seq_printf(m, "\tTotal: %llu\n", tot);
+}
+
+static int i915_guc_info(struct seq_file *m, void *data)
+{
+ struct drm_info_node *node = m->private;
+ struct drm_device *dev = node->minor->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct intel_guc guc;
+ struct i915_guc_client client = { .client_obj = 0 };
+ struct intel_engine_cs *ring;
+ enum intel_ring_id i;
+ u64 total = 0;
+
+ if (!HAS_GUC_SCHED(dev_priv->dev))
+ return 0;
+
+ /* Take a local copy of the GuC data, so we can dump it at leisure */
+ spin_lock(&dev_priv->guc.host2guc_lock);
+ guc = dev_priv->guc;
+ if (guc.execbuf_client) {
+ spin_lock(&guc.execbuf_client->wq_lock);
+ client = *guc.execbuf_client;
+ spin_unlock(&guc.execbuf_client->wq_lock);
+ }
+ spin_unlock(&dev_priv->guc.host2guc_lock);
+
+ seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
+ seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
+ seq_printf(m, "GuC last action command: 0x%x\n", guc.action_cmd);
+ seq_printf(m, "GuC last action status: 0x%x\n", guc.action_status);
+ seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
+
+ seq_printf(m, "\nGuC submissions:\n");
+ for_each_ring(ring, dev_priv, i) {
+ seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x %9d\n",
+ ring->name, guc.submissions[i],
+ guc.last_seqno[i], guc.last_seqno[i]);
+ total += guc.submissions[i];
+ }
+ seq_printf(m, "\t%s: %llu\n", "Total", total);
+
+ seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
+ i915_guc_client_info(m, dev_priv, &client);
+
+ /* Add more as required ... */
+
+ return 0;
+}
+
static int i915_guc_log_dump(struct seq_file *m, void *data)
{
struct drm_info_node *node = m->private;
@@ -5099,6 +5174,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_gem_hws_bsd", i915_hws_info, 0, (void *)VCS},
{"i915_gem_hws_vebox", i915_hws_info, 0, (void *)VECS},
{"i915_gem_batch_pool", i915_gem_batch_pool_info, 0},
+ {"i915_guc_info", i915_guc_info, 0},
{"i915_guc_load_status", i915_guc_load_status_info, 0},
{"i915_guc_log_dump", i915_guc_log_dump, 0},
{"i915_frequency_info", i915_frequency_info, 0},
--
1.9.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 0/9 v6] Batch submission via GuC
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
` (8 preceding siblings ...)
2015-08-12 14:43 ` [PATCH 9/9 v6] drm/i915: Debugfs interface for GuC submission statistics Dave Gordon
@ 2015-08-12 14:57 ` Dave Gordon
2015-08-13 2:35 ` O'Rourke, Tom
9 siblings, 1 reply; 14+ messages in thread
From: Dave Gordon @ 2015-08-12 14:57 UTC (permalink / raw)
To: Daniel Vetter, O'Rourke, Tom; +Cc: Intel Graphics Development
On 12/08/15 15:43, Dave Gordon wrote:
> This patch series enables command submission via the GuC. In this mode,
> instead of the host CPU driving the execlist port directly, it hands
> over work items to the GuC, using a doorbell mechanism to tell the GuC
> that new items have been added to its work queue. The GuC then dispatches
> contexts to the various GPU engines, and manages the resulting context-
> switch interrupts. Completion of a batch is however still signalled to
> the CPU; the GuC is not involved in handling user interrupts.
>
> There are two subsequences within the patch series:
>
> drm/i915: GuC-specific firmware loader
> drm/i915: Debugfs interface to read GuC load status
>
> These two patches provide the GuC loader and a debugfs interface to
> verify the resulting state. At this point in the sequence we can load
> and activate the GuC firmware, but not submit any batches through it.
> (This is nonetheless a potentially useful state, as the GuC could do
> other useful work even when not handling batch submissions).
>
> drm/i915: Expose one LRC function for GuC submission mode
> drm/i915: Prepare for GuC-based command submission
> drm/i915: Enable GuC firmware log
> drm/i915: Implementation of GuC submission client
> drm/i915: Interrupt routing for GuC submission
> drm/i915: Integrate GuC-based command submission
> drm/i915: Debugfs interface for GuC submission statistics
>
> In this second section, we implement the GuC submission mechanism, and
> link it into the (execlist-based) submission path. At this stage, it is
> not yet enabled by default; it is activated only if the kernel command
> line explicitly switches it on.
>
> The GuC firmware itself is not included in this patchset; it is or will
> be available for download from https://01.org/linuxgraphics/downloads/
> This driver works with and requires GuC firmware revision 3.x. It will
> not work with any firmware version 1.x, as the GuC protocol in those
> revisions was incompatible and is no longer supported.
>
> On platforms where there is no GuC, or where GuC submission is not
> specifically enabled, batch submission will continue to use the execlist
> (or ringbuffer) mechanisms. On the other hand, if GuC submission is
> requested but the GuC firmware cannot be found or is invalid, the GPU
> will be unusable.
>
> It is expected that a subsequent patch will enable GuC submission by
> default once the signed firmware is published on 01.org.
>
> Ben Widawsky (0):
> Vinit Azad (0):
> Michael H. Nguyen (0):
> created the original versions on which some of these patches are based.
>
> Alex Dai (5):
> drm/i915: GuC-specific firmware loader
> drm/i915: Debugfs interface to read GuC load status
> drm/i915: Prepare for GuC-based command submission
> drm/i915: Enable GuC firmware log
> drm/i915: Integrate GuC-based command submission
>
> Dave Gordon (4):
> drm/i915: Expose one LRC function for GuC submission mode
> drm/i915: Implementation of GuC submission client
> drm/i915: Interrupt routing for GuC submission
> drm/i915: Debugfs interface for GuC submission statistics
>
> Documentation/DocBook/drm.tmpl | 14 +
> drivers/gpu/drm/i915/Makefile | 4 +
> drivers/gpu/drm/i915/i915_debugfs.c | 146 ++++-
> drivers/gpu/drm/i915/i915_dma.c | 9 +
> drivers/gpu/drm/i915/i915_drv.h | 11 +
> drivers/gpu/drm/i915/i915_gem.c | 16 +
> drivers/gpu/drm/i915/i915_gpu_error.c | 14 +-
> drivers/gpu/drm/i915/i915_guc_reg.h | 17 +-
> drivers/gpu/drm/i915/i915_guc_submission.c | 901 +++++++++++++++++++++++++++++
> drivers/gpu/drm/i915/i915_reg.h | 15 +-
> drivers/gpu/drm/i915/intel_guc.h | 122 ++++
> drivers/gpu/drm/i915/intel_guc_fwif.h | 9 +-
> drivers/gpu/drm/i915/intel_guc_loader.c | 611 +++++++++++++++++++
> drivers/gpu/drm/i915/intel_lrc.c | 65 ++-
> drivers/gpu/drm/i915/intel_lrc.h | 8 +
> 15 files changed, 1918 insertions(+), 44 deletions(-)
> create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
> create mode 100644 drivers/gpu/drm/i915/intel_guc.h
> create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c
Tom O'Rourke has already R-B'ed the previous [v5] versions of these, and
there are no substantive changes to patches 2, 3, 4, 5, 7 and 8, so we
can carry that over; also, the change in patch 1 is just an update to a
comment noted in Tom's earlier reviews as being out-of-date.
I haven't included patch 10 (enable-by-default) as we've decided to wait
until the signed firmware is publicly available on 01.org before
making it the default.
So, it's really just patches 6/9 and 9/9 that need a re-review from Tom.
Thanks,
.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/9 v6] Batch submission via GuC
2015-08-12 14:57 ` [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
@ 2015-08-13 2:35 ` O'Rourke, Tom
2015-08-14 9:09 ` Daniel Vetter
0 siblings, 1 reply; 14+ messages in thread
From: O'Rourke, Tom @ 2015-08-13 2:35 UTC (permalink / raw)
To: Gordon, David S; +Cc: Intel Graphics Development
On Wed, Aug 12, 2015 at 07:57:37AM -0700, Gordon, David S wrote:
> On 12/08/15 15:43, Dave Gordon wrote:
> > This patch series enables command submission via the GuC. In this mode,
> > instead of the host CPU driving the execlist port directly, it hands
> > over work items to the GuC, using a doorbell mechanism to tell the GuC
> > that new items have been added to its work queue. The GuC then dispatches
> > contexts to the various GPU engines, and manages the resulting context-
> > switch interrupts. Completion of a batch is however still signalled to
> > the CPU; the GuC is not involved in handling user interrupts.
> >
> > There are two subsequences within the patch series:
> >
> > drm/i915: GuC-specific firmware loader
> > drm/i915: Debugfs interface to read GuC load status
> >
> > These two patches provide the GuC loader and a debugfs interface to
> > verify the resulting state. At this point in the sequence we can load
> > and activate the GuC firmware, but not submit any batches through it.
> > (This is nonetheless a potentially useful state, as the GuC could do
> > other useful work even when not handling batch submissions).
> >
> > drm/i915: Expose one LRC function for GuC submission mode
> > drm/i915: Prepare for GuC-based command submission
> > drm/i915: Enable GuC firmware log
> > drm/i915: Implementation of GuC submission client
> > drm/i915: Interrupt routing for GuC submission
> > drm/i915: Integrate GuC-based command submission
> > drm/i915: Debugfs interface for GuC submission statistics
> >
> > In this second section, we implement the GuC submission mechanism, and
> > link it into the (execlist-based) submission path. At this stage, it is
> > not yet enabled by default; it is activated only if the kernel command
> > line explicitly switches it on.
> >
> > The GuC firmware itself is not included in this patchset; it is or will
> > be available for download from https://01.org/linuxgraphics/downloads/
> > This driver works with and requires GuC firmware revision 3.x. It will
> > not work with any firmware version 1.x, as the GuC protocol in those
> > revisions was incompatible and is no longer supported.
> >
> > On platforms where there is no GuC, or where GuC submission is not
> > specifically enabled, batch submission will continue to use the execlist
> > (or ringbuffer) mechanisms. On the other hand, if GuC submission is
> > requested but the GuC firmware cannot be found or is invalid, the GPU
> > will be unusable.
> >
> > It is expected that a subsequent patch will enable GuC submission by
> > default once the signed firmware is published on 01.org.
> >
> > Ben Widawsky (0):
> > Vinit Azad (0):
> > Michael H. Nguyen (0):
> > created the original versions on which some of these patches are based.
> >
> > Alex Dai (5):
> > drm/i915: GuC-specific firmware loader
> > drm/i915: Debugfs interface to read GuC load status
> > drm/i915: Prepare for GuC-based command submission
> > drm/i915: Enable GuC firmware log
> > drm/i915: Integrate GuC-based command submission
> >
> > Dave Gordon (4):
> > drm/i915: Expose one LRC function for GuC submission mode
> > drm/i915: Implementation of GuC submission client
> > drm/i915: Interrupt routing for GuC submission
> > drm/i915: Debugfs interface for GuC submission statistics
> >
> > Documentation/DocBook/drm.tmpl | 14 +
> > drivers/gpu/drm/i915/Makefile | 4 +
> > drivers/gpu/drm/i915/i915_debugfs.c | 146 ++++-
> > drivers/gpu/drm/i915/i915_dma.c | 9 +
> > drivers/gpu/drm/i915/i915_drv.h | 11 +
> > drivers/gpu/drm/i915/i915_gem.c | 16 +
> > drivers/gpu/drm/i915/i915_gpu_error.c | 14 +-
> > drivers/gpu/drm/i915/i915_guc_reg.h | 17 +-
> > drivers/gpu/drm/i915/i915_guc_submission.c | 901 +++++++++++++++++++++++++++++
> > drivers/gpu/drm/i915/i915_reg.h | 15 +-
> > drivers/gpu/drm/i915/intel_guc.h | 122 ++++
> > drivers/gpu/drm/i915/intel_guc_fwif.h | 9 +-
> > drivers/gpu/drm/i915/intel_guc_loader.c | 611 +++++++++++++++++++
> > drivers/gpu/drm/i915/intel_lrc.c | 65 ++-
> > drivers/gpu/drm/i915/intel_lrc.h | 8 +
> > 15 files changed, 1918 insertions(+), 44 deletions(-)
> > create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
> > create mode 100644 drivers/gpu/drm/i915/intel_guc.h
> > create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c
>
> Tom O'Rourke has already R-B'ed the previous [v5] versions of these, and
> there are no substantive changes to patches 2, 3, 4, 5, 7 and 8, so we
> can carry that over; also, the change in patch 1 is just an update to a
> comment noted in Tom's earlier reviews as being out-of-date.
>
> I haven't included patch 10 (enable-by-default) as we've decided to wait
> until the signed firmware is publicly available on 01.org before
> making it the default.
>
> So, it's really just patches 6/9 and 9/9 that need a re-review from Tom.
>
> Thanks,
> .Dave.
[TOR:] These patches still look good. Good catch in patch 6 with the offset.
For all 9 patches:
Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/9 v6] drm/i915: GuC-specific firmware loader
2015-08-12 14:43 ` [PATCH 1/9 v6] drm/i915: GuC-specific firmware loader Dave Gordon
@ 2015-08-14 9:05 ` Daniel Vetter
0 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2015-08-14 9:05 UTC (permalink / raw)
To: Dave Gordon; +Cc: intel-gfx
On Wed, Aug 12, 2015 at 03:43:36PM +0100, Dave Gordon wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> This fetches the required firmware image from the filesystem,
> then loads it into the GuC's memory via a dedicated DMA engine.
>
> This patch is derived from GuC loading work originally done by
> Vinit Azad and Ben Widawsky.
>
> v2:
> Various improvements per review comments by Chris Wilson
>
> v3:
> Removed 'wait' parameter to intel_guc_ucode_load() as firmware
> prefetch is no longer supported in the common firmware loader,
> per Daniel Vetter's request.
> Firmware checker callback fn now returns errno rather than bool.
>
> v4:
> Squash uC-independent code into GuC-specifc loader [Daniel Vetter]
> Don't keep the driver working (by falling back to execlist mode)
> if GuC firmware loading fails [Daniel Vetter]
>
> v5:
> Clarify WOPCM-related #defines [Tom O'Rourke]
> Delete obsolete code no longer required with current h/w & f/w
> [Tom O'Rourke]
> Move the call to intel_guc_ucode_init() later, so that it can
> allocate GEM objects, and have it fetch the firmware; then
> intel_guc_ucode_load() doesn't need to fetch it later.
> [Daniel Vetter].
>
> v6:
> Update comment describing intel_guc_ucode_load() [Tom O'Rourke]
>
> Issue: VIZ-4884
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Checkpatch was pretty unhappy with long lines and alignment of
continuations. But didn't look too bad overall so pulled it in.
> +/*
> + * Transfer the firmware image to RAM for execution by the microcontroller.
> + *
> + * GuC Firmware layout:
> + * +-------------------------------+ ----
> + * | CSS header | 128B
> + * | contains major/minor version |
> + * +-------------------------------+ ----
> + * | uCode |
> + * +-------------------------------+ ----
> + * | RSA signature | 256B
> + * +-------------------------------+ ----
> + * | RSA public Key | 256B
> + * +-------------------------------+ ----
> + * | Public key modulus | 4B
> + * +-------------------------------+ ----
> + *
> + * Architecturally, the DMA engine is bidirectional, and can potentially even
> + * transfer between GTT locations. This functionality is left out of the API
> + * for now as there is no need for it.
> + *
> + * Note that GuC needs the CSS header plus uKernel code to be copied by the
> + * DMA engine in one operation, whereas the RSA signature is loaded via MMIO.
Imo this should be included in the kerneldoc too, especially now that we
can handle tables with markdown.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/9 v6] Batch submission via GuC
2015-08-13 2:35 ` O'Rourke, Tom
@ 2015-08-14 9:09 ` Daniel Vetter
0 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2015-08-14 9:09 UTC (permalink / raw)
To: O'Rourke, Tom; +Cc: Intel Graphics Development
On Wed, Aug 12, 2015 at 07:35:10PM -0700, O'Rourke, Tom wrote:
> On Wed, Aug 12, 2015 at 07:57:37AM -0700, Gordon, David S wrote:
> > On 12/08/15 15:43, Dave Gordon wrote:
> > > This patch series enables command submission via the GuC. In this mode,
> > > instead of the host CPU driving the execlist port directly, it hands
> > > over work items to the GuC, using a doorbell mechanism to tell the GuC
> > > that new items have been added to its work queue. The GuC then dispatches
> > > contexts to the various GPU engines, and manages the resulting context-
> > > switch interrupts. Completion of a batch is however still signalled to
> > > the CPU; the GuC is not involved in handling user interrupts.
> > >
> > > There are two subsequences within the patch series:
> > >
> > > drm/i915: GuC-specific firmware loader
> > > drm/i915: Debugfs interface to read GuC load status
> > >
> > > These two patches provide the GuC loader and a debugfs interface to
> > > verify the resulting state. At this point in the sequence we can load
> > > and activate the GuC firmware, but not submit any batches through it.
> > > (This is nonetheless a potentially useful state, as the GuC could do
> > > other useful work even when not handling batch submissions).
> > >
> > > drm/i915: Expose one LRC function for GuC submission mode
> > > drm/i915: Prepare for GuC-based command submission
> > > drm/i915: Enable GuC firmware log
> > > drm/i915: Implementation of GuC submission client
> > > drm/i915: Interrupt routing for GuC submission
> > > drm/i915: Integrate GuC-based command submission
> > > drm/i915: Debugfs interface for GuC submission statistics
> > >
> > > In this second section, we implement the GuC submission mechanism, and
> > > link it into the (execlist-based) submission path. At this stage, it is
> > > not yet enabled by default; it is activated only if the kernel command
> > > line explicitly switches it on.
> > >
> > > The GuC firmware itself is not included in this patchset; it is or will
> > > be available for download from https://01.org/linuxgraphics/downloads/
> > > This driver works with and requires GuC firmware revision 3.x. It will
> > > not work with any firmware version 1.x, as the GuC protocol in those
> > > revisions was incompatible and is no longer supported.
> > >
> > > On platforms where there is no GuC, or where GuC submission is not
> > > specifically enabled, batch submission will continue to use the execlist
> > > (or ringbuffer) mechanisms. On the other hand, if GuC submission is
> > > requested but the GuC firmware cannot be found or is invalid, the GPU
> > > will be unusable.
> > >
> > > It is expected that a subsequent patch will enable GuC submission by
> > > default once the signed firmware is published on 01.org.
> > >
> > > Ben Widawsky (0):
> > > Vinit Azad (0):
> > > Michael H. Nguyen (0):
> > > created the original versions on which some of these patches are based.
> > >
> > > Alex Dai (5):
> > > drm/i915: GuC-specific firmware loader
> > > drm/i915: Debugfs interface to read GuC load status
> > > drm/i915: Prepare for GuC-based command submission
> > > drm/i915: Enable GuC firmware log
> > > drm/i915: Integrate GuC-based command submission
> > >
> > > Dave Gordon (4):
> > > drm/i915: Expose one LRC function for GuC submission mode
> > > drm/i915: Implementation of GuC submission client
> > > drm/i915: Interrupt routing for GuC submission
> > > drm/i915: Debugfs interface for GuC submission statistics
> > >
> > > Documentation/DocBook/drm.tmpl | 14 +
> > > drivers/gpu/drm/i915/Makefile | 4 +
> > > drivers/gpu/drm/i915/i915_debugfs.c | 146 ++++-
> > > drivers/gpu/drm/i915/i915_dma.c | 9 +
> > > drivers/gpu/drm/i915/i915_drv.h | 11 +
> > > drivers/gpu/drm/i915/i915_gem.c | 16 +
> > > drivers/gpu/drm/i915/i915_gpu_error.c | 14 +-
> > > drivers/gpu/drm/i915/i915_guc_reg.h | 17 +-
> > > drivers/gpu/drm/i915/i915_guc_submission.c | 901 +++++++++++++++++++++++++++++
> > > drivers/gpu/drm/i915/i915_reg.h | 15 +-
> > > drivers/gpu/drm/i915/intel_guc.h | 122 ++++
> > > drivers/gpu/drm/i915/intel_guc_fwif.h | 9 +-
> > > drivers/gpu/drm/i915/intel_guc_loader.c | 611 +++++++++++++++++++
> > > drivers/gpu/drm/i915/intel_lrc.c | 65 ++-
> > > drivers/gpu/drm/i915/intel_lrc.h | 8 +
> > > 15 files changed, 1918 insertions(+), 44 deletions(-)
> > > create mode 100644 drivers/gpu/drm/i915/i915_guc_submission.c
> > > create mode 100644 drivers/gpu/drm/i915/intel_guc.h
> > > create mode 100644 drivers/gpu/drm/i915/intel_guc_loader.c
> >
> > Tom O'Rourke has already R-B'ed the previous [v5] versions of these, and
> > there are no substantive changes to patches 2, 3, 4, 5, 7 and 8, so we
> > can carry that over; also, the change in patch 1 is just an update to a
> > comment noted in Tom's earlier reviews as being out-of-date.
> >
> > I haven't included patch 10 (enable-by-default) as we've decided to wait
> > until the signed firmware is publicly available on 01.org before
> > making it the default.
> >
> > So, it's really just patches 6/9 and 9/9 that need a re-review from Tom.
> >
> > Thanks,
> > .Dave.
>
> [TOR:] These patches still look good. Good catch in patch 6 with the offset.
>
> For all 9 patches:
> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
All merged to dinq, thanks.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-08-14 9:09 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-12 14:43 [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
2015-08-12 14:43 ` [PATCH 1/9 v6] drm/i915: GuC-specific firmware loader Dave Gordon
2015-08-14 9:05 ` Daniel Vetter
2015-08-12 14:43 ` [PATCH 2/9 v6] drm/i915: Debugfs interface to read GuC load status Dave Gordon
2015-08-12 14:43 ` [PATCH 3/9 v6] drm/i915: Expose one LRC function for GuC submission mode Dave Gordon
2015-08-12 14:43 ` [PATCH 4/9 v6] drm/i915: Prepare for GuC-based command submission Dave Gordon
2015-08-12 14:43 ` [PATCH 5/9 v6] drm/i915: Enable GuC firmware log Dave Gordon
2015-08-12 14:43 ` [PATCH 6/9 v6] drm/i915: Implementation of GuC submission client Dave Gordon
2015-08-12 14:43 ` [PATCH 7/9 v6] drm/i915: Interrupt routing for GuC submission Dave Gordon
2015-08-12 14:43 ` [PATCH 8/9 v6] drm/i915: Integrate GuC-based command submission Dave Gordon
2015-08-12 14:43 ` [PATCH 9/9 v6] drm/i915: Debugfs interface for GuC submission statistics Dave Gordon
2015-08-12 14:57 ` [PATCH 0/9 v6] Batch submission via GuC Dave Gordon
2015-08-13 2:35 ` O'Rourke, Tom
2015-08-14 9:09 ` Daniel Vetter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox