* GuC vs multiple timelines
@ 2016-11-25 9:30 Chris Wilson
2016-11-25 9:30 ` [PATCH 01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs Chris Wilson
` (15 more replies)
0 siblings, 16 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Switching the GuC on for BAT testing of another patch revealed I had broken
(or at least made spammy) resets with requests in flight on the GuC. In the
course of fixing that, I kept tripping over a nasty timing issue in the
reuse of requests (with many threads running in parallel, we would retire
a request whilst it was still in the middle of the execute callback
elsewhere), so this series carries debug patches plus a small fix for that.
And then there were a few other issues found whilst looking at those...
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 02/15] drm: Pull together probe + setup for drm_fb_helper Chris Wilson
` (14 subsequent siblings)
15 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
The fb_helper->connector_count is modified when a new connector is
constructed following a hotplug event (e.g. DP-MST). This causes trouble
for drm_setup_crtcs() and friends, which assume that the fb_helper
connector arrays are constant:
[ 1250.872997] BUG: KASAN: slab-out-of-bounds in drm_setup_crtcs+0x320/0xf80 at addr ffff88074cdd2608
[ 1250.873020] Write of size 40 by task kworker/u8:3/480
[ 1250.873039] CPU: 2 PID: 480 Comm: kworker/u8:3 Tainted: G U 4.9.0-rc6+ #285
[ 1250.873043] Hardware name: /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
[ 1250.873050] Workqueue: events_unbound async_run_entry_fn
[ 1250.873056] ffff88070f9d78f0 ffffffff814b72aa ffff88074e40c5c0 ffff88074cdd2608
[ 1250.873067] ffff88070f9d7918 ffffffff8124ff3c ffff88070f9d79b0 ffff88074cdd2600
[ 1250.873079] ffff88074e40c5c0 ffff88070f9d79a0 ffffffff812501e4 0000000000000005
[ 1250.873090] Call Trace:
[ 1250.873099] [<ffffffff814b72aa>] dump_stack+0x67/0x9d
[ 1250.873106] [<ffffffff8124ff3c>] kasan_object_err+0x1c/0x70
[ 1250.873113] [<ffffffff812501e4>] kasan_report_error+0x204/0x4f0
[ 1250.873120] [<ffffffff81698df0>] ? drm_dev_printk+0x140/0x140
[ 1250.873127] [<ffffffff81250ac3>] kasan_report+0x53/0x60
[ 1250.873134] [<ffffffff81688b40>] ? drm_setup_crtcs+0x320/0xf80
[ 1250.873142] [<ffffffff8124f18e>] check_memory_region+0x13e/0x1a0
[ 1250.873147] [<ffffffff8124f5f3>] memset+0x23/0x40
[ 1250.873154] [<ffffffff81688b40>] drm_setup_crtcs+0x320/0xf80
[ 1250.873161] [<ffffffff810be7c5>] ? wake_up_q+0x45/0x80
[ 1250.873169] [<ffffffff81b0c180>] ? mutex_lock_nested+0x5a0/0x5a0
[ 1250.873176] [<ffffffff8168a0e6>] drm_fb_helper_initial_config+0x206/0x7a0
[ 1250.873183] [<ffffffff81689ee0>] ? drm_fb_helper_set_par+0x90/0x90
[ 1250.873303] [<ffffffffa0b68690>] ? intel_fbdev_fini+0x140/0x140 [i915]
[ 1250.873387] [<ffffffffa0b686b2>] intel_fbdev_initial_config+0x22/0x40 [i915]
[ 1250.873391] [<ffffffff810b50ff>] async_run_entry_fn+0x7f/0x270
[ 1250.873394] [<ffffffff810a64b0>] process_one_work+0x3d0/0x960
[ 1250.873398] [<ffffffff810a641d>] ? process_one_work+0x33d/0x960
[ 1250.873401] [<ffffffff810a60e0>] ? max_active_store+0xf0/0xf0
[ 1250.873406] [<ffffffff810f6f9d>] ? do_raw_spin_lock+0x10d/0x1a0
[ 1250.873413] [<ffffffff810a767d>] worker_thread+0x8d/0x840
[ 1250.873419] [<ffffffff810a75f0>] ? create_worker+0x2e0/0x2e0
[ 1250.873426] [<ffffffff810b0454>] kthread+0x194/0x1c0
[ 1250.873432] [<ffffffff810b02c0>] ? kthread_park+0x60/0x60
[ 1250.873438] [<ffffffff810f095d>] ? trace_hardirqs_on+0xd/0x10
[ 1250.873446] [<ffffffff810b02c0>] ? kthread_park+0x60/0x60
[ 1250.873453] [<ffffffff810b02c0>] ? kthread_park+0x60/0x60
[ 1250.873457] [<ffffffff81b12277>] ret_from_fork+0x27/0x40
[ 1250.873460] Object at ffff88074cdd2608, in cache kmalloc-32 size: 32
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98826
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/drm_fb_helper.c | 73 ++++++++++++++++++++++-------------------
1 file changed, 40 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 14547817566d..d13c85682891 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -97,6 +97,10 @@ static LIST_HEAD(kernel_fb_helper_list);
* mmap page writes.
*/
+#define drm_fb_helper_for_each_connector(fbh, i__) \
+ for (({lockdep_assert_held(&(fbh)->dev->mode_config.mutex); 1;}), \
+ i__ = 0; i__ < (fbh)->connector_count; i__++)
+
/**
* drm_fb_helper_single_add_all_connectors() - add all connectors to fbdev
* emulation helper
@@ -130,7 +134,7 @@ int drm_fb_helper_single_add_all_connectors(struct drm_fb_helper *fb_helper)
mutex_unlock(&dev->mode_config.mutex);
return 0;
fail:
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
struct drm_fb_helper_connector *fb_helper_connector =
fb_helper->connector_info[i];
@@ -565,7 +569,7 @@ static void drm_fb_helper_dpms(struct fb_info *info, int dpms_mode)
continue;
/* Walk the connectors & encoders on this fb turning them on/off */
- for (j = 0; j < fb_helper->connector_count; j++) {
+ drm_fb_helper_for_each_connector(fb_helper, j) {
connector = fb_helper->connector_info[j]->connector;
connector->funcs->dpms(connector, dpms_mode);
drm_object_property_set_value(&connector->base,
@@ -1469,7 +1473,6 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
int ret = 0;
int crtc_count = 0;
int i;
- struct fb_info *info;
struct drm_fb_helper_surface_size sizes;
int gamma_size = 0;
@@ -1485,7 +1488,7 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
sizes.surface_depth = sizes.surface_bpp = preferred_bpp;
/* first up get a count of crtcs now in use and new min/maxes width/heights */
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
struct drm_fb_helper_connector *fb_helper_conn = fb_helper->connector_info[i];
struct drm_cmdline_mode *cmdline_mode;
@@ -1572,8 +1575,6 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
if (ret < 0)
return ret;
- info = fb_helper->fbdev;
-
/*
* Set the fb pointer - usually drm_setup_crtcs does this for hotplug
* events, but at init time drm_setup_crtcs needs to be called before
@@ -1585,20 +1586,6 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
if (fb_helper->crtc_info[i].mode_set.num_connectors)
fb_helper->crtc_info[i].mode_set.fb = fb_helper->fb;
-
- info->var.pixclock = 0;
- if (register_framebuffer(info) < 0)
- return -EINVAL;
-
- dev_info(fb_helper->dev->dev, "fb%d: %s frame buffer device\n",
- info->node, info->fix.id);
-
- if (list_empty(&kernel_fb_helper_list)) {
- register_sysrq_key('v', &sysrq_drm_fb_helper_restore_op);
- }
-
- list_add(&fb_helper->kernel_fb_list, &kernel_fb_helper_list);
-
return 0;
}
@@ -1730,7 +1717,7 @@ static int drm_fb_helper_probe_connector_modes(struct drm_fb_helper *fb_helper,
int count = 0;
int i;
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
connector = fb_helper->connector_info[i]->connector;
count += connector->funcs->fill_modes(connector, maxX, maxY);
}
@@ -1830,7 +1817,7 @@ static void drm_enable_connectors(struct drm_fb_helper *fb_helper,
struct drm_connector *connector;
int i = 0;
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
connector = fb_helper->connector_info[i]->connector;
enabled[i] = drm_connector_enabled(connector, true);
DRM_DEBUG_KMS("connector %d enabled? %s\n", connector->base.id,
@@ -1841,7 +1828,7 @@ static void drm_enable_connectors(struct drm_fb_helper *fb_helper,
if (any_enabled)
return;
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
connector = fb_helper->connector_info[i]->connector;
enabled[i] = drm_connector_enabled(connector, false);
}
@@ -1862,7 +1849,7 @@ static bool drm_target_cloned(struct drm_fb_helper *fb_helper,
return false;
count = 0;
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
if (enabled[i])
count++;
}
@@ -1873,7 +1860,7 @@ static bool drm_target_cloned(struct drm_fb_helper *fb_helper,
/* check the command line or if nothing common pick 1024x768 */
can_clone = true;
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
if (!enabled[i])
continue;
fb_helper_conn = fb_helper->connector_info[i];
@@ -1899,8 +1886,7 @@ static bool drm_target_cloned(struct drm_fb_helper *fb_helper,
can_clone = true;
dmt_mode = drm_mode_find_dmt(fb_helper->dev, 1024, 768, 60, false);
- for (i = 0; i < fb_helper->connector_count; i++) {
-
+ drm_fb_helper_for_each_connector(fb_helper, i) {
if (!enabled[i])
continue;
@@ -1931,7 +1917,7 @@ static int drm_get_tile_offsets(struct drm_fb_helper *fb_helper,
int i;
int hoffset = 0, voffset = 0;
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
fb_helper_conn = fb_helper->connector_info[i];
if (!fb_helper_conn->connector->has_tile)
continue;
@@ -1964,7 +1950,7 @@ static bool drm_target_preferred(struct drm_fb_helper *fb_helper,
int tile_pass = 0;
mask = (1 << fb_helper->connector_count) - 1;
retry:
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
fb_helper_conn = fb_helper->connector_info[i];
if (conn_configured & (1 << i))
@@ -2127,6 +2113,9 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper)
width = dev->mode_config.max_width;
height = dev->mode_config.max_height;
+ /* prevent concurrent modification of connector_count by hotplug */
+ lockdep_assert_held(&fb_helper->dev->mode_config.mutex);
+
crtcs = kcalloc(fb_helper->connector_count,
sizeof(struct drm_fb_helper_crtc *), GFP_KERNEL);
modes = kcalloc(fb_helper->connector_count,
@@ -2140,7 +2129,6 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper)
goto out;
}
-
drm_enable_connectors(fb_helper, enabled);
if (!(fb_helper->funcs->initial_config &&
@@ -2169,7 +2157,7 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper)
drm_fb_helper_modeset_release(fb_helper,
&fb_helper->crtc_info[i].mode_set);
- for (i = 0; i < fb_helper->connector_count; i++) {
+ drm_fb_helper_for_each_connector(fb_helper, i) {
struct drm_display_mode *mode = modes[i];
struct drm_fb_helper_crtc *fb_crtc = crtcs[i];
struct drm_fb_offset *offset = &offsets[i];
@@ -2246,7 +2234,9 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper)
int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel)
{
struct drm_device *dev = fb_helper->dev;
+ struct fb_info *info;
int count = 0;
+ int ret;
if (!drm_fbdev_emulation)
return 0;
@@ -2255,7 +2245,6 @@ int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel)
count = drm_fb_helper_probe_connector_modes(fb_helper,
dev->mode_config.max_width,
dev->mode_config.max_height);
- mutex_unlock(&dev->mode_config.mutex);
/*
* we shouldn't end up with no modes here.
*/
@@ -2263,8 +2252,26 @@ int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel)
dev_info(fb_helper->dev->dev, "No connectors reported connected with modes\n");
drm_setup_crtcs(fb_helper);
+ ret = drm_fb_helper_single_fb_probe(fb_helper, bpp_sel);
+ mutex_unlock(&dev->mode_config.mutex);
+ if (ret)
+ return ret;
- return drm_fb_helper_single_fb_probe(fb_helper, bpp_sel);
+ info = fb_helper->fbdev;
+ info->var.pixclock = 0;
+ ret = register_framebuffer(info);
+ if (ret < 0)
+ return ret;
+
+ dev_info(dev->dev, "fb%d: %s frame buffer device\n",
+ info->node, info->fix.id);
+
+ if (list_empty(&kernel_fb_helper_list))
+ register_sysrq_key('v', &sysrq_drm_fb_helper_restore_op);
+
+ list_add(&fb_helper->kernel_fb_list, &kernel_fb_helper_list);
+
+ return 0;
}
EXPORT_SYMBOL(drm_fb_helper_initial_config);
--
2.10.2
* [PATCH 02/15] drm: Pull together probe + setup for drm_fb_helper
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
2016-11-25 9:30 ` [PATCH 01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 03/15] drm: Protect fb_helper list manipulation with a mutex Chris Wilson
` (13 subsequent siblings)
15 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
drm_fb_helper_probe_connector_modes() is always called before
drm_setup_crtcs(), so just move the call into drm_setup_crtcs() for a
small bit of code compaction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/drm_fb_helper.c | 37 +++++++++++--------------------------
1 file changed, 11 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index d13c85682891..a19afc7eccde 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -2098,20 +2098,19 @@ static int drm_pick_crtcs(struct drm_fb_helper *fb_helper,
return best_score;
}
-static void drm_setup_crtcs(struct drm_fb_helper *fb_helper)
+static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
+ u32 width, u32 height)
{
struct drm_device *dev = fb_helper->dev;
struct drm_fb_helper_crtc **crtcs;
struct drm_display_mode **modes;
struct drm_fb_offset *offsets;
bool *enabled;
- int width, height;
int i;
DRM_DEBUG_KMS("\n");
-
- width = dev->mode_config.max_width;
- height = dev->mode_config.max_height;
+ if (drm_fb_helper_probe_connector_modes(fb_helper, width, height) == 0)
+ DRM_DEBUG_KMS("No connectors reported connected with modes\n");
/* prevent concurrent modification of connector_count by hotplug */
lockdep_assert_held(&fb_helper->dev->mode_config.mutex);
@@ -2235,23 +2234,15 @@ int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel)
{
struct drm_device *dev = fb_helper->dev;
struct fb_info *info;
- int count = 0;
int ret;
if (!drm_fbdev_emulation)
return 0;
mutex_lock(&dev->mode_config.mutex);
- count = drm_fb_helper_probe_connector_modes(fb_helper,
- dev->mode_config.max_width,
- dev->mode_config.max_height);
- /*
- * we shouldn't end up with no modes here.
- */
- if (count == 0)
- dev_info(fb_helper->dev->dev, "No connectors reported connected with modes\n");
-
- drm_setup_crtcs(fb_helper);
+ drm_setup_crtcs(fb_helper,
+ dev->mode_config.max_width,
+ dev->mode_config.max_height);
ret = drm_fb_helper_single_fb_probe(fb_helper, bpp_sel);
mutex_unlock(&dev->mode_config.mutex);
if (ret)
@@ -2299,28 +2290,22 @@ EXPORT_SYMBOL(drm_fb_helper_initial_config);
int drm_fb_helper_hotplug_event(struct drm_fb_helper *fb_helper)
{
struct drm_device *dev = fb_helper->dev;
- u32 max_width, max_height;
if (!drm_fbdev_emulation)
return 0;
- mutex_lock(&fb_helper->dev->mode_config.mutex);
+ mutex_lock(&dev->mode_config.mutex);
if (!fb_helper->fb || !drm_fb_helper_is_bound(fb_helper)) {
fb_helper->delayed_hotplug = true;
- mutex_unlock(&fb_helper->dev->mode_config.mutex);
+ mutex_unlock(&dev->mode_config.mutex);
return 0;
}
DRM_DEBUG_KMS("\n");
- max_width = fb_helper->fb->width;
- max_height = fb_helper->fb->height;
+ drm_setup_crtcs(fb_helper, fb_helper->fb->width, fb_helper->fb->height);
- drm_fb_helper_probe_connector_modes(fb_helper, max_width, max_height);
- mutex_unlock(&fb_helper->dev->mode_config.mutex);
+ mutex_unlock(&dev->mode_config.mutex);
- drm_modeset_lock_all(dev);
- drm_setup_crtcs(fb_helper);
- drm_modeset_unlock_all(dev);
drm_fb_helper_set_par(fb_helper->fbdev);
return 0;
--
2.10.2
* [PATCH 03/15] drm: Protect fb_helper list manipulation with a mutex
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
2016-11-25 9:30 ` [PATCH 01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs Chris Wilson
2016-11-25 9:30 ` [PATCH 02/15] drm: Pull together probe + setup for drm_fb_helper Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 04/15] drm/i915/perf: Wrap 64bit divides in do_div() Chris Wilson
` (12 subsequent siblings)
15 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Though we only walk the kernel_fb_helper_list inside a panic (or
single-thread debugging), we still need to protect the list manipulation
when creating/removing a framebuffer device in order to prevent list
corruption.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/drm_fb_helper.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index a19afc7eccde..2ac2f462d37b 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -49,6 +49,7 @@ MODULE_PARM_DESC(fbdev_emulation,
"Enable legacy fbdev emulation [default=true]");
static LIST_HEAD(kernel_fb_helper_list);
+static DEFINE_MUTEX(kernel_fb_helper_lock);
/**
* DOC: fbdev helpers
@@ -855,12 +856,14 @@ void drm_fb_helper_fini(struct drm_fb_helper *fb_helper)
if (!drm_fbdev_emulation)
return;
+ mutex_lock(&kernel_fb_helper_lock);
if (!list_empty(&fb_helper->kernel_fb_list)) {
list_del(&fb_helper->kernel_fb_list);
if (list_empty(&kernel_fb_helper_list)) {
unregister_sysrq_key('v', &sysrq_drm_fb_helper_restore_op);
}
}
+ mutex_unlock(&kernel_fb_helper_lock);
drm_fb_helper_crtc_free(fb_helper);
@@ -2257,10 +2260,12 @@ int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel)
dev_info(dev->dev, "fb%d: %s frame buffer device\n",
info->node, info->fix.id);
+ mutex_lock(&kernel_fb_helper_lock);
if (list_empty(&kernel_fb_helper_list))
register_sysrq_key('v', &sysrq_drm_fb_helper_restore_op);
list_add(&fb_helper->kernel_fb_list, &kernel_fb_helper_list);
+ mutex_unlock(&kernel_fb_helper_lock);
return 0;
}
--
2.10.2
* [PATCH 04/15] drm/i915/perf: Wrap 64bit divides in do_div()
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (2 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 03/15] drm: Protect fb_helper list manipulation with a mutex Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 05/15] drm/i915: Add is-completed assert to request retire entrypoint Chris Wilson
` (11 subsequent siblings)
15 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Just a couple of naked 64bit divides causing link errors on 32bit
builds, with:
ERROR: "__udivdi3" [drivers/gpu/drm/i915/i915.ko] undefined!
v2: do_div() is only u64/u32, we need a u32/u64!
v3: div_u64() == u64/u32, div64_u64() == u64/u64
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: d79651522e89 ("drm/i915: Enable i915 perf stream for Haswell OA unit")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Robert Bragg <robert@sixbynine.org>
---
drivers/gpu/drm/i915/i915_perf.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 95512824922b..14de9a4eee27 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -974,8 +974,8 @@ static void i915_oa_stream_disable(struct i915_perf_stream *stream)
static u64 oa_exponent_to_ns(struct drm_i915_private *dev_priv, int exponent)
{
- return 1000000000ULL * (2ULL << exponent) /
- dev_priv->perf.oa.timestamp_frequency;
+ return div_u64(1000000000ULL * (2ULL << exponent),
+ dev_priv->perf.oa.timestamp_frequency);
}
static const struct i915_perf_stream_ops i915_oa_stream_ops = {
@@ -1051,16 +1051,17 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
dev_priv->perf.oa.periodic = props->oa_periodic;
if (dev_priv->perf.oa.periodic) {
- u64 period_ns = oa_exponent_to_ns(dev_priv,
- props->oa_period_exponent);
+ u32 tail;
dev_priv->perf.oa.period_exponent = props->oa_period_exponent;
/* See comment for OA_TAIL_MARGIN_NSEC for details
* about this tail_margin...
*/
- dev_priv->perf.oa.tail_margin =
- ((OA_TAIL_MARGIN_NSEC / period_ns) + 1) * format_size;
+ tail = div64_u64(OA_TAIL_MARGIN_NSEC,
+ oa_exponent_to_ns(dev_priv,
+ props->oa_period_exponent));
+ dev_priv->perf.oa.tail_margin = (tail + 1) * format_size;
}
if (stream->ctx) {
--
2.10.2
* [PATCH 05/15] drm/i915: Add is-completed assert to request retire entrypoint
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (3 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 04/15] drm/i915/perf: Wrap 64bit divides in do_div() Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 10:36 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 06/15] drm/i915: Assert no external observers when unwind a failed request alloc Chris Wilson
` (10 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
While we already check that the request is completed prior to being
retired, placing an assert that the request is complete at the
entrypoint of the function more clearly documents the function's
preconditions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem_request.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 82904595eaae..bd7b21f70283 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -281,6 +281,8 @@ void i915_gem_request_retire_upto(struct drm_i915_gem_request *req)
struct drm_i915_gem_request *tmp;
lockdep_assert_held(&req->i915->drm.struct_mutex);
+ GEM_BUG_ON(!i915_gem_request_completed(req));
+
if (list_empty(&req->link))
return;
--
2.10.2
* [PATCH 06/15] drm/i915: Assert no external observers when unwind a failed request alloc
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (4 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 05/15] drm/i915: Add is-completed assert to request retire entrypoint Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 10:39 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 07/15] drm/i915: Hold a reference on the request for its fence chain Chris Wilson
` (9 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Before we return the request back to the kmem_cache after a failed
i915_gem_request_alloc(), we should assert that it has not been added to
any global state tracking.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem_request.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index bd7b21f70283..ed6cead22a86 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -599,6 +599,11 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
return req;
err_ctx:
+ /* Make sure we didn't add ourselves to external state before freeing */
+ GEM_BUG_ON(!list_empty(&req->active_list));
+ GEM_BUG_ON(!list_empty(&req->priotree.signalers_list));
+ GEM_BUG_ON(!list_empty(&req->priotree.waiters_list));
+
i915_gem_context_put(ctx);
kmem_cache_free(dev_priv->requests, req);
err_unreserve:
--
2.10.2
* [PATCH 07/15] drm/i915: Hold a reference on the request for its fence chain
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (5 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 06/15] drm/i915: Assert no external observers when unwind a failed request alloc Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 10:31 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 08/15] drm/i915: Integrate i915_sw_fence with debugobjects Chris Wilson
` (8 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Currently, we have an active reference for the request until it is
retired. Though it cannot be retired before it has been executed by
hardware, the request may be completed before we have finished
processing the execute fence, i.e. we may continue to process that fence
as we free the request.
Fixes: 5590af3e115a ("drm/i915: Drive request submission through fence callbacks")
Fixes: 23902e49c999 ("drm/i915: Split request submit/execute phase into two")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem_request.c | 34 ++++++++++++++++++++++++++-------
drivers/gpu/drm/i915/i915_sw_fence.h | 5 +++++
2 files changed, 32 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index ed6cead22a86..94d71b639f12 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -200,8 +200,8 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
struct i915_gem_active *active, *next;
lockdep_assert_held(&request->i915->drm.struct_mutex);
- GEM_BUG_ON(!i915_sw_fence_done(&request->submit));
- GEM_BUG_ON(!i915_sw_fence_done(&request->execute));
+ GEM_BUG_ON(!i915_sw_fence_signaled(&request->submit));
+ GEM_BUG_ON(!i915_sw_fence_signaled(&request->execute));
GEM_BUG_ON(!i915_gem_request_completed(request));
GEM_BUG_ON(!request->i915->gt.active_requests);
@@ -451,11 +451,17 @@ void i915_gem_request_submit(struct drm_i915_gem_request *request)
static int __i915_sw_fence_call
submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
{
- if (state == FENCE_COMPLETE) {
- struct drm_i915_gem_request *request =
- container_of(fence, typeof(*request), submit);
+ struct drm_i915_gem_request *request =
+ container_of(fence, typeof(*request), submit);
+ switch (state) {
+ case FENCE_COMPLETE:
request->engine->submit_request(request);
+ break;
+
+ case FENCE_FREE:
+ i915_gem_request_put(request);
+ break;
}
return NOTIFY_DONE;
@@ -464,6 +470,18 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
static int __i915_sw_fence_call
execute_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
{
+ struct drm_i915_gem_request *request =
+ container_of(fence, typeof(*request), execute);
+
+ switch (state) {
+ case FENCE_COMPLETE:
+ break;
+
+ case FENCE_FREE:
+ i915_gem_request_put(request);
+ break;
+ }
+
return NOTIFY_DONE;
}
@@ -551,8 +569,10 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
req->timeline->fence_context,
__timeline_get_seqno(req->timeline->common));
- i915_sw_fence_init(&req->submit, submit_notify);
- i915_sw_fence_init(&req->execute, execute_notify);
+ /* We bump the ref for the fence chain */
+ i915_sw_fence_init(&i915_gem_request_get(req)->submit, submit_notify);
+ i915_sw_fence_init(&i915_gem_request_get(req)->execute, execute_notify);
+
/* Ensure that the execute fence completes after the submit fence -
* as we complete the execute fence from within the submit fence
* callback, its completion would otherwise be visible first.
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index 7508d23f823b..0f3185ef7f4e 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -75,6 +75,11 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
unsigned long timeout,
gfp_t gfp);
+static inline bool i915_sw_fence_signaled(const struct i915_sw_fence *fence)
+{
+ return atomic_read(&fence->pending) <= 0;
+}
+
static inline bool i915_sw_fence_done(const struct i915_sw_fence *fence)
{
return atomic_read(&fence->pending) < 0;
--
2.10.2
* [PATCH 08/15] drm/i915: Integrate i915_sw_fence with debugobjects
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (6 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 07/15] drm/i915: Hold a reference on the request for its fence chain Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 10:24 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 09/15] drm/i915: Enable swfence debugobject support for i915.ko Chris Wilson
` (7 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Add the tracking required to enable debugobjects to improve error
detection in BAT. The debugobject interface lets us track the lifetime
of the fences even while they are embedded in larger structs, i.e. to
check that they are not used after they have been released.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/Kconfig.debug | 13 +++
drivers/gpu/drm/i915/i915_gem_request.c | 3 +
drivers/gpu/drm/i915/i915_sw_fence.c | 155 ++++++++++++++++++++++++++++++--
drivers/gpu/drm/i915/i915_sw_fence.h | 6 ++
4 files changed, 170 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 51ba630a134b..a6c69b8cb1d2 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -22,6 +22,7 @@ config DRM_I915_DEBUG
select X86_MSR # used by igt/pm_rpm
select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
select DRM_DEBUG_MM if DRM=y
+ select DRM_I915_SW_FENCE_DEBUG_OBJECTS if DRM_I915=y
default n
help
Choose this option to turn on extra driver debugging that may affect
@@ -43,3 +44,15 @@ config DRM_I915_DEBUG_GEM
If in doubt, say "N".
+config DRM_I915_SW_FENCE_DEBUG_OBJECTS
+ bool "Enable additional driver debugging for fence objects"
+ depends on DRM_I915=y
+ select DEBUG_OBJECTS
+ default n
+ help
+ Choose this option to turn on extra driver debugging that may affect
+ performance but will catch some internal issues.
+
+ Recommended for driver developers only.
+
+ If in doubt, say "N".
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 94d71b639f12..db17111a5a12 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -62,6 +62,9 @@ static void i915_fence_release(struct dma_fence *fence)
{
struct drm_i915_gem_request *req = to_request(fence);
+ i915_sw_fence_free(&req->submit);
+ i915_sw_fence_free(&req->execute);
+
kmem_cache_free(req->i915->requests, req);
}
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 147420ccf49c..e93aa15e96b3 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -17,6 +17,107 @@
static DEFINE_SPINLOCK(i915_sw_fence_lock);
+enum {
+ DEBUG_FENCE_IDLE = 0,
+ DEBUG_FENCE_NOTIFY,
+};
+
+#ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
+
+static void *i915_sw_fence_debug_hint(void *addr)
+{
+ return (void *)(((struct i915_sw_fence *)addr)->flags & I915_SW_FENCE_MASK);
+}
+
+static bool i915_sw_fence_is_static_object(void *addr)
+{
+ return false;
+}
+
+static bool i915_sw_fence_fixup_fail(void *addr, enum debug_obj_state state)
+{
+ return false;
+}
+
+static struct debug_obj_descr i915_sw_fence_debug_descr = {
+ .name = "i915_sw_fence",
+ .debug_hint = i915_sw_fence_debug_hint,
+ .is_static_object = i915_sw_fence_is_static_object,
+ .fixup_init = i915_sw_fence_fixup_fail,
+ .fixup_activate = i915_sw_fence_fixup_fail,
+ .fixup_free = i915_sw_fence_fixup_fail,
+ .fixup_assert_init = i915_sw_fence_fixup_fail,
+};
+
+static inline void debug_fence_init(struct i915_sw_fence *fence)
+{
+ debug_object_init(fence, &i915_sw_fence_debug_descr);
+}
+
+static inline void debug_fence_activate(struct i915_sw_fence *fence)
+{
+ debug_object_activate(fence, &i915_sw_fence_debug_descr);
+}
+
+static inline void debug_fence_set_state(struct i915_sw_fence *fence,
+ int old, int new)
+{
+ debug_object_active_state(fence, &i915_sw_fence_debug_descr, old, new);
+}
+
+static inline void debug_fence_deactivate(struct i915_sw_fence *fence)
+{
+ debug_object_deactivate(fence, &i915_sw_fence_debug_descr);
+}
+
+static inline void debug_fence_destroy(struct i915_sw_fence *fence)
+{
+ debug_object_destroy(fence, &i915_sw_fence_debug_descr);
+}
+
+static inline void debug_fence_free(struct i915_sw_fence *fence)
+{
+ debug_object_free(fence, &i915_sw_fence_debug_descr);
+}
+
+static inline void debug_fence_assert(struct i915_sw_fence *fence)
+{
+ debug_object_assert_init(fence, &i915_sw_fence_debug_descr);
+}
+
+#else
+
+static inline void debug_fence_init(struct i915_sw_fence *fence)
+{
+}
+
+static inline void debug_fence_activate(struct i915_sw_fence *fence)
+{
+}
+
+static inline void debug_fence_set_state(struct i915_sw_fence *fence,
+ int old, int new)
+{
+}
+
+static inline void debug_fence_deactivate(struct i915_sw_fence *fence)
+{
+}
+
+static inline void debug_fence_destroy(struct i915_sw_fence *fence)
+{
+}
+
+static inline void debug_fence_free(struct i915_sw_fence *fence)
+{
+}
+
+static inline void debug_fence_assert(struct i915_sw_fence *fence)
+{
+}
+
+#endif
+
static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
enum i915_sw_fence_notify state)
{
@@ -26,25 +127,37 @@ static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
return fn(fence, state);
}
-static void i915_sw_fence_free(struct kref *kref)
+#ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
+void i915_sw_fence_free(struct i915_sw_fence *fence)
+{
+ debug_fence_free(fence);
+}
+#endif
+
+static void i915_sw_fence_release(struct kref *kref)
{
struct i915_sw_fence *fence = container_of(kref, typeof(*fence), kref);
WARN_ON(atomic_read(&fence->pending) > 0);
+ debug_fence_destroy(fence);
- if (fence->flags & I915_SW_FENCE_MASK)
+ if (fence->flags & I915_SW_FENCE_MASK) {
__i915_sw_fence_notify(fence, FENCE_FREE);
- else
+ } else {
+ i915_sw_fence_free(fence);
kfree(fence);
+ }
}
static void i915_sw_fence_put(struct i915_sw_fence *fence)
{
- kref_put(&fence->kref, i915_sw_fence_free);
+ debug_fence_assert(fence);
+ kref_put(&fence->kref, i915_sw_fence_release);
}
static struct i915_sw_fence *i915_sw_fence_get(struct i915_sw_fence *fence)
{
+ debug_fence_assert(fence);
kref_get(&fence->kref);
return fence;
}
@@ -56,6 +169,7 @@ static void __i915_sw_fence_wake_up_all(struct i915_sw_fence *fence,
wait_queue_t *pos, *next;
unsigned long flags;
+ debug_fence_deactivate(fence);
atomic_set_release(&fence->pending, -1); /* 0 -> -1 [done] */
/*
@@ -88,23 +202,33 @@ static void __i915_sw_fence_wake_up_all(struct i915_sw_fence *fence,
} while (1);
}
spin_unlock_irqrestore(&x->lock, flags);
+
+ debug_fence_assert(fence);
}
static void __i915_sw_fence_complete(struct i915_sw_fence *fence,
struct list_head *continuation)
{
+ debug_fence_assert(fence);
+
if (!atomic_dec_and_test(&fence->pending))
return;
+ debug_fence_set_state(fence, DEBUG_FENCE_IDLE, DEBUG_FENCE_NOTIFY);
+
if (fence->flags & I915_SW_FENCE_MASK &&
__i915_sw_fence_notify(fence, FENCE_COMPLETE) != NOTIFY_DONE)
return;
+ debug_fence_set_state(fence, DEBUG_FENCE_NOTIFY, DEBUG_FENCE_IDLE);
+
__i915_sw_fence_wake_up_all(fence, continuation);
}
static void i915_sw_fence_complete(struct i915_sw_fence *fence)
{
+ debug_fence_assert(fence);
+
if (WARN_ON(i915_sw_fence_done(fence)))
return;
@@ -113,6 +237,7 @@ static void i915_sw_fence_complete(struct i915_sw_fence *fence)
static void i915_sw_fence_await(struct i915_sw_fence *fence)
{
+ debug_fence_assert(fence);
WARN_ON(atomic_inc_return(&fence->pending) <= 1);
}
@@ -123,18 +248,26 @@ void __i915_sw_fence_init(struct i915_sw_fence *fence,
{
BUG_ON((unsigned long)fn & ~I915_SW_FENCE_MASK);
+ debug_fence_init(fence);
+
__init_waitqueue_head(&fence->wait, name, key);
kref_init(&fence->kref);
atomic_set(&fence->pending, 1);
fence->flags = (unsigned long)fn;
}
-void i915_sw_fence_commit(struct i915_sw_fence *fence)
+static void __i915_sw_fence_commit(struct i915_sw_fence *fence)
{
i915_sw_fence_complete(fence);
i915_sw_fence_put(fence);
}
+void i915_sw_fence_commit(struct i915_sw_fence *fence)
+{
+ debug_fence_activate(fence);
+ __i915_sw_fence_commit(fence);
+}
+
static int i915_sw_fence_wake(wait_queue_t *wq, unsigned mode, int flags, void *key)
{
list_del(&wq->task_list);
@@ -206,9 +339,13 @@ static int __i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence,
unsigned long flags;
int pending;
+ debug_fence_assert(fence);
+
if (i915_sw_fence_done(signaler))
return 0;
+ debug_fence_assert(signaler);
+
/* The dependency graph must be acyclic. */
if (unlikely(i915_sw_fence_check_if_after(fence, signaler)))
return -EINVAL;
@@ -279,7 +416,7 @@ static void timer_i915_sw_fence_wake(unsigned long data)
dma_fence_put(cb->dma);
cb->dma = NULL;
- i915_sw_fence_commit(cb->fence);
+ __i915_sw_fence_commit(cb->fence);
cb->timer.function = NULL;
}
@@ -290,7 +427,7 @@ static void dma_i915_sw_fence_wake(struct dma_fence *dma,
del_timer_sync(&cb->timer);
if (cb->timer.function)
- i915_sw_fence_commit(cb->fence);
+ __i915_sw_fence_commit(cb->fence);
dma_fence_put(cb->dma);
kfree(cb);
@@ -304,6 +441,8 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
struct i915_sw_dma_fence_cb *cb;
int ret;
+ debug_fence_assert(fence);
+
if (dma_fence_is_signaled(dma))
return 0;
@@ -349,6 +488,8 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_fence *excl;
int ret = 0, pending;
+ debug_fence_assert(fence);
+
if (write) {
struct dma_fence **shared;
unsigned int count, i;
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index 0f3185ef7f4e..d3d6aba22122 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -56,6 +56,12 @@ do { \
__i915_sw_fence_init((fence), (fn), NULL, NULL)
#endif
+#ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
+void i915_sw_fence_free(struct i915_sw_fence *fence);
+#else
+static inline void i915_sw_fence_free(struct i915_sw_fence *fence) {}
+#endif
+
void i915_sw_fence_commit(struct i915_sw_fence *fence);
int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence,
--
2.10.2
* [PATCH 09/15] drm/i915: Enable swfence debugobject support for i915.ko
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (7 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 08/15] drm/i915: Integrate i915_sw_fence with debugobjects Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 10:32 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 10/15] HAX drm/i915: Enable guc submission Chris Wilson
` (6 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Only once the debugobject symbols are exported can we enable support for
debugging swfences when i915 is built as a module. This requires commit
2617fdca3f68 ("lib/debugobjects: export for use in modules").
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/Kconfig.debug | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index a6c69b8cb1d2..f0e769290e0b 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -22,7 +22,7 @@ config DRM_I915_DEBUG
select X86_MSR # used by igt/pm_rpm
select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
select DRM_DEBUG_MM if DRM=y
- select DRM_I915_SW_FENCE_DEBUG_OBJECTS if DRM_I915=y
+ select DRM_I915_SW_FENCE_DEBUG_OBJECTS
default n
help
Choose this option to turn on extra driver debugging that may affect
@@ -46,7 +46,7 @@ config DRM_I915_DEBUG_GEM
config DRM_I915_SW_FENCE_DEBUG_OBJECTS
bool "Enable additional driver debugging for fence objects"
- depends on DRM_I915=y
+ depends on DRM_I915
select DEBUG_OBJECTS
default n
help
--
2.10.2
* [PATCH 10/15] HAX drm/i915: Enable guc submission
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (8 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 09/15] drm/i915: Enable swfence debugobject support for i915.ko Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage Chris Wilson
` (5 subsequent siblings)
15 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
---
drivers/gpu/drm/i915/i915_params.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 0e280fbd52f1..ef1e9921a2af 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -56,8 +56,8 @@ struct i915_params i915 __read_mostly = {
.verbose_state_checks = 1,
.nuclear_pageflip = 0,
.edp_vswing = 0,
- .enable_guc_loading = 0,
- .enable_guc_submission = 0,
+ .enable_guc_loading = -1,
+ .enable_guc_submission = -1,
.guc_log_level = -1,
.enable_dp_mst = true,
.inject_load_failure = 0,
--
2.10.2
* [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (9 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 10/15] HAX drm/i915: Enable guc submission Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-28 11:15 ` Tvrtko Ursulin
2016-11-25 9:30 ` [PATCH 12/15] drm/i915/guc: Rename client->cookie to match use Chris Wilson
` (4 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
i915_guc_info() (part of the debugfs output) tries to avoid holding
struct_mutex for a long period by copying the GuC state onto the stack.
This triggers a warning that the stack frame is massive, so stop doing
that. We can even forgo holding struct_mutex here, as the mutex doesn't
serialise the values being read (and the lists used exist for the device
lifetime).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_debugfs.c | 39 +++++++++++++------------------------
1 file changed, 14 insertions(+), 25 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8eb8c29b7492..7676e88ae5f2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2434,47 +2434,36 @@ static void i915_guc_client_info(struct seq_file *m,
static int i915_guc_info(struct seq_file *m, void *data)
{
struct drm_i915_private *dev_priv = node_to_i915(m->private);
- struct drm_device *dev = &dev_priv->drm;
- struct intel_guc guc;
- struct i915_guc_client client = {};
+ const struct intel_guc *guc = &dev_priv->guc;
struct intel_engine_cs *engine;
enum intel_engine_id id;
- u64 total = 0;
+ u64 total;
if (!HAS_GUC_SCHED(dev_priv))
return 0;
- if (mutex_lock_interruptible(&dev->struct_mutex))
- return 0;
-
- /* Take a local copy of the GuC data, so we can dump it at leisure */
- guc = dev_priv->guc;
- if (guc.execbuf_client)
- client = *guc.execbuf_client;
-
- mutex_unlock(&dev->struct_mutex);
-
seq_printf(m, "Doorbell map:\n");
- seq_printf(m, "\t%*pb\n", GUC_MAX_DOORBELLS, guc.doorbell_bitmap);
- seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc.db_cacheline);
+ seq_printf(m, "\t%*pb\n", GUC_MAX_DOORBELLS, guc->doorbell_bitmap);
+ seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc->db_cacheline);
- seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
- seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
- seq_printf(m, "GuC last action command: 0x%x\n", guc.action_cmd);
- seq_printf(m, "GuC last action status: 0x%x\n", guc.action_status);
- seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
+ seq_printf(m, "GuC total action count: %llu\n", guc->action_count);
+ seq_printf(m, "GuC action failure count: %u\n", guc->action_fail);
+ seq_printf(m, "GuC last action command: 0x%x\n", guc->action_cmd);
+ seq_printf(m, "GuC last action status: 0x%x\n", guc->action_status);
+ seq_printf(m, "GuC last action error code: %d\n", guc->action_err);
+ total = 0;
seq_printf(m, "\nGuC submissions:\n");
for_each_engine(engine, dev_priv, id) {
- u64 submissions = guc.submissions[id];
+ u64 submissions = guc->submissions[id];
total += submissions;
seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x\n",
- engine->name, submissions, guc.last_seqno[id]);
+ engine->name, submissions, guc->last_seqno[id]);
}
seq_printf(m, "\t%s: %llu\n", "Total", total);
- seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
- i915_guc_client_info(m, dev_priv, &client);
+ seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
+ i915_guc_client_info(m, dev_priv, guc->execbuf_client);
i915_guc_log_info(m, dev_priv);
--
2.10.2
* [PATCH 12/15] drm/i915/guc: Rename client->cookie to match use
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (10 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-28 11:37 ` Tvrtko Ursulin
2016-11-25 9:30 ` [PATCH 13/15] drm/i915/guc: Initialise doorbell cookie to matching value Chris Wilson
` (3 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
The client->cookie is a shadow of the doorbell->cookie value, so rename
it to indicate its association with the doorbell, like the doorbell id
and offset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu/drm/i915/i915_guc_submission.c | 6 +++---
drivers/gpu/drm/i915/intel_guc.h | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 7676e88ae5f2..fe62a9da4dd1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2414,7 +2414,7 @@ static void i915_guc_client_info(struct seq_file *m,
seq_printf(m, "\tPriority %d, GuC ctx index: %u, PD offset 0x%x\n",
client->priority, client->ctx_index, client->proc_desc_offset);
seq_printf(m, "\tDoorbell id %d, offset: 0x%x, cookie 0x%x\n",
- client->doorbell_id, client->doorbell_offset, client->cookie);
+ client->doorbell_id, client->doorbell_offset, client->doorbell_cookie);
seq_printf(m, "\tWQ size %d, offset: 0x%x, tail %d\n",
client->wq_size, client->wq_offset, client->wq_tail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 4462112725ef..81aa5d4265c0 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -567,11 +567,11 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
/* current cookie */
db_cmp.db_status = GUC_DOORBELL_ENABLED;
- db_cmp.cookie = gc->cookie;
+ db_cmp.cookie = gc->doorbell_cookie;
/* cookie to be updated */
db_exc.db_status = GUC_DOORBELL_ENABLED;
- db_exc.cookie = gc->cookie + 1;
+ db_exc.cookie = gc->doorbell_cookie + 1;
if (db_exc.cookie == 0)
db_exc.cookie = 1;
@@ -586,7 +586,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
/* if the exchange was successfully executed */
if (db_ret.value_qw == db_cmp.value_qw) {
/* db was successfully rung */
- gc->cookie = db_exc.cookie;
+ gc->doorbell_cookie = db_exc.cookie;
ret = 0;
break;
}
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 02337a81abc2..731e8a2f2644 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -74,7 +74,7 @@ struct i915_guc_client {
uint32_t proc_desc_offset;
uint32_t doorbell_offset;
- uint32_t cookie;
+ uint32_t doorbell_cookie;
uint16_t doorbell_id;
uint16_t padding[3]; /* Maintain alignment */
--
2.10.2
* [PATCH 13/15] drm/i915/guc: Initialise doorbell cookie to matching value
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (11 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 12/15] drm/i915/guc: Rename client->cookie to match use Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-28 12:09 ` Tvrtko Ursulin
2016-11-25 9:30 ` [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset Chris Wilson
` (2 subsequent siblings)
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Set the initial value of the doorbell cookie from the client.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 81aa5d4265c0..800dc5bb732f 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -247,8 +247,8 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
/* Activate the new doorbell */
__set_bit(new_id, doorbell_bitmap);
- doorbell->cookie = 0;
doorbell->db_status = GUC_DOORBELL_ENABLED;
+ doorbell->cookie = client->doorbell_cookie;
return host2guc_allocate_doorbell(guc, client);
}
--
2.10.2
* [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (12 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 13/15] drm/i915/guc: Initialise doorbell cookie to matching value Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-28 13:49 ` Tvrtko Ursulin
2016-11-25 9:30 ` [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset Chris Wilson
2016-11-25 10:16 ` ✗ Fi.CI.BAT: warning for series starting with [01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs Patchwork
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
In order to avoid some complexity in trying to reconstruct the
workqueues across reset, remember them instead. The issue arises when we
have to handle a reset between request allocation and submission: the
request has reserved space in the wq, but is not on any list, so we fail
to restore the reserved space. By keeping the execbuf client intact
across the reset, we also keep the reservations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_guc_submission.c | 83 +++++++++++++++++++-----------
1 file changed, 52 insertions(+), 31 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 800dc5bb732f..00b5fa871644 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -252,13 +252,6 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
return host2guc_allocate_doorbell(guc, client);
}
-static int guc_init_doorbell(struct intel_guc *guc,
- struct i915_guc_client *client,
- uint16_t db_id)
-{
- return guc_update_doorbell_id(guc, client, db_id);
-}
-
static void guc_disable_doorbell(struct intel_guc *guc,
struct i915_guc_client *client)
{
@@ -779,8 +772,7 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
uint16_t db_id;
int i, err;
- /* Save client's original doorbell selection */
- db_id = client->doorbell_id;
+ guc_disable_doorbell(guc, client);
for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
/* Skip if doorbell is OK */
@@ -793,7 +785,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
i, err);
}
- /* Restore to original value */
+ db_id = select_doorbell_register(guc, client->priority);
+ WARN_ON(db_id == GUC_INVALID_DOORBELL_ID);
+
err = guc_update_doorbell_id(guc, client, db_id);
if (err)
DRM_WARN("Failed to restore doorbell to %d, err %d\n",
@@ -883,8 +877,13 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
guc_proc_desc_init(guc, client);
guc_ctx_desc_init(guc, client);
- if (guc_init_doorbell(guc, client, db_id))
- goto err;
+
+ /* For runtime client allocation we need to enable the doorbell. Not
+ * required yet for the static execbuf_client as this special kernel
+ * client is enabled from i915_guc_submission_enable().
+ *
+ * guc_update_doorbell_id(guc, client, db_id);
+ */
DRM_DEBUG_DRIVER("new priority %u client %p for engine(s) 0x%x: ctx_index %u\n",
priority, client, client->engines, client->ctx_index);
@@ -1484,6 +1483,9 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
struct intel_guc *guc = &dev_priv->guc;
struct i915_vma *vma;
+ if (!HAS_GUC_SCHED(dev_priv))
+ return 0;
+
/* Wipe bitmap & delete client in case of reinitialisation */
bitmap_clear(guc->doorbell_bitmap, 0, GUC_MAX_DOORBELLS);
i915_guc_submission_disable(dev_priv);
@@ -1504,42 +1506,57 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
guc_log_create(guc);
guc_addon_create(guc);
+ guc->execbuf_client = guc_client_alloc(dev_priv,
+ INTEL_INFO(dev_priv)->ring_mask,
+ GUC_CTX_PRIORITY_KMD_NORMAL,
+ dev_priv->kernel_context);
+ if (!guc->execbuf_client) {
+ DRM_ERROR("Failed to create GuC client for execbuf!\n");
+ goto err;
+ }
+
return 0;
+
+err:
+ i915_guc_submission_fini(dev_priv);
+ return -ENOMEM;
+}
+
+static void guc_reset_wq(struct i915_guc_client *gc)
+{
+ struct guc_process_desc *desc = gc->vaddr + gc->proc_desc_offset;
+
+ desc->head = 0;
+ desc->tail = 0;
+
+ gc->wq_tail = 0;
}
int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
{
struct intel_guc *guc = &dev_priv->guc;
- struct drm_i915_gem_request *request;
- struct i915_guc_client *client;
+ struct i915_guc_client *client = guc->execbuf_client;
struct intel_engine_cs *engine;
enum intel_engine_id id;
- /* client for execbuf submission */
- client = guc_client_alloc(dev_priv,
- INTEL_INFO(dev_priv)->ring_mask,
- GUC_CTX_PRIORITY_KMD_NORMAL,
- dev_priv->kernel_context);
- if (!client) {
- DRM_ERROR("Failed to create normal GuC client!\n");
- return -ENOMEM;
- }
+ if (!client)
+ return -ENODEV;
- guc->execbuf_client = client;
+ guc_reset_wq(client);
host2guc_sample_forcewake(guc, client);
guc_init_doorbell_hw(guc);
/* Take over from manual control of ELSP (execlists) */
for_each_engine(engine, dev_priv, id) {
+ struct drm_i915_gem_request *rq;
+
engine->submit_request = i915_guc_submit;
engine->schedule = NULL;
/* Replay the current set of previously submitted requests */
- list_for_each_entry(request,
- &engine->timeline->requests, link) {
+ list_for_each_entry(rq, &engine->timeline->requests, link) {
client->wq_rsvd += sizeof(struct guc_wq_item);
- if (i915_sw_fence_done(&request->submit))
- i915_guc_submit(request);
+ i915_guc_submit(rq);
}
}
@@ -1555,14 +1572,18 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
/* Revert back to manual ELSP submission */
intel_execlists_enable_submission(dev_priv);
-
- guc_client_free(dev_priv, guc->execbuf_client);
- guc->execbuf_client = NULL;
}
void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
{
struct intel_guc *guc = &dev_priv->guc;
+ struct i915_guc_client *client;
+
+ client = fetch_and_zero(&guc->execbuf_client);
+ if (!client)
+ return;
+
+ guc_client_free(dev_priv, client);
i915_vma_unpin_and_release(&guc->ads_vma);
i915_vma_unpin_and_release(&guc->log.vma);
--
2.10.2
* [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (13 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset Chris Wilson
@ 2016-11-25 9:30 ` Chris Wilson
2016-11-28 14:02 ` Tvrtko Ursulin
2016-11-25 10:16 ` ✗ Fi.CI.BAT: warning for series starting with [01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs Patchwork
15 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 9:30 UTC (permalink / raw)
To: intel-gfx
Something I missed before sending off the partial series was that the
non-scheduler GuC reset path was broken (in the full series, this is
pushed to the execlists reset handler). The issue is that after a reset
we have to refill the GuC workqueues, which we do by resubmitting the
requests. However, if we have already submitted them, the fences within
them have already been used, and triggering them again is an error.
Instead, just repopulate the GuC workqueue.
[ 115.858560] [IGT] gem_busy: starting subtest hang-render
[ 135.839867] [drm] GPU HANG: ecode 9:0:0xe757fefe, in gem_busy [1716], reason: Hang on render ring, action: reset
[ 135.839902] drm/i915: Resetting chip after gpu hang
[ 135.839957] [drm] RC6 on
[ 135.858351] ------------[ cut here ]------------
[ 135.858357] WARNING: CPU: 2 PID: 45 at drivers/gpu/drm/i915/i915_sw_fence.c:108 i915_sw_fence_complete+0x25/0x30
[ 135.858357] Modules linked in: rfcomm bnep binfmt_misc nls_iso8859_1 input_leds snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core btusb btrtl snd_hwdep snd_pcm 8250_dw snd_seq_midi hid_lenovo snd_seq_midi_event snd_rawmidi iwlwifi x86_pkg_temp_thermal coretemp snd_seq crct10dif_pclmul snd_seq_device hci_uart snd_timer crc32_pclmul ghash_clmulni_intel idma64 aesni_intel virt_dma btbcm snd btqca aes_x86_64 btintel lrw cfg80211 bluetooth gf128mul glue_helper ablk_helper cryptd soundcore intel_lpss_pci intel_pch_thermal intel_lpss_acpi intel_lpss acpi_als mfd_core kfifo_buf acpi_pad industrialio autofs4 hid_plantronics usbhid dm_mirror dm_region_hash dm_log sdhci_pci ahci sdhci libahci i2c_hid hid
[ 135.858389] CPU: 2 PID: 45 Comm: kworker/2:1 Tainted: G W 4.9.0-rc4+ #238
[ 135.858389] Hardware name: /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
[ 135.858392] Workqueue: events_long i915_hangcheck_elapsed
[ 135.858394] ffffc900001bf9b8 ffffffff812bb238 0000000000000000 0000000000000000
[ 135.858396] ffffc900001bf9f8 ffffffff8104f621 0000006c00000000 ffff8808296137f8
[ 135.858398] 0000000000000a00 ffff8808457a0000 ffff880845764e60 ffff880845760000
[ 135.858399] Call Trace:
[ 135.858403] [<ffffffff812bb238>] dump_stack+0x4d/0x65
[ 135.858405] [<ffffffff8104f621>] __warn+0xc1/0xe0
[ 135.858406] [<ffffffff8104f748>] warn_slowpath_null+0x18/0x20
[ 135.858408] [<ffffffff813f8c15>] i915_sw_fence_complete+0x25/0x30
[ 135.858410] [<ffffffff813f8fad>] i915_sw_fence_commit+0xd/0x30
[ 135.858412] [<ffffffff8142e591>] __i915_gem_request_submit+0xe1/0xf0
[ 135.858413] [<ffffffff8142e5c8>] i915_gem_request_submit+0x28/0x40
[ 135.858415] [<ffffffff814433e7>] i915_guc_submit+0x47/0x210
[ 135.858417] [<ffffffff81443e98>] i915_guc_submission_enable+0x468/0x540
[ 135.858419] [<ffffffff81442495>] intel_guc_setup+0x715/0x810
[ 135.858421] [<ffffffff8142b6b4>] i915_gem_init_hw+0x114/0x2a0
[ 135.858423] [<ffffffff813eeaa8>] i915_reset+0xe8/0x120
[ 135.858424] [<ffffffff813f3937>] i915_reset_and_wakeup+0x157/0x180
[ 135.858426] [<ffffffff813f79db>] i915_handle_error+0x1ab/0x230
[ 135.858428] [<ffffffff812c760d>] ? scnprintf+0x4d/0x90
[ 135.858430] [<ffffffff81435985>] i915_hangcheck_elapsed+0x275/0x3d0
[ 135.858432] [<ffffffff810668cf>] process_one_work+0x12f/0x410
[ 135.858433] [<ffffffff81066bf3>] worker_thread+0x43/0x4d0
[ 135.858435] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
[ 135.858436] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
[ 135.858438] [<ffffffff8106bbb4>] kthread+0xd4/0xf0
[ 135.858440] [<ffffffff8106bae0>] ? kthread_park+0x60/0x60
v2: Only resubmit submitted requests
v3: Don't forget the pending requests have reserved space.
Fixes: d55ac5bf97c6 ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_guc_submission.c | 37 ++++++++++++++++--------------
1 file changed, 20 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 00b5fa871644..e12ff881d38d 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -602,12 +602,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
}
/**
- * i915_guc_submit() - Submit commands through GuC
+ * __i915_guc_submit() - Submit commands through GuC
* @rq: request associated with the commands
*
- * Return: 0 on success, otherwise an errno.
- * (Note: nonzero really shouldn't happen!)
- *
* The caller must have already called i915_guc_wq_reserve() above with
* a result of 0 (success), guaranteeing that there is space in the work
* queue for the new request, so enqueuing the item cannot fail.
@@ -619,7 +616,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
* The only error here arises if the doorbell hardware isn't functioning
* as expected, which really shouldn't happen.
*/
-static void i915_guc_submit(struct drm_i915_gem_request *rq)
+static void __i915_guc_submit(struct drm_i915_gem_request *rq)
{
struct drm_i915_private *dev_priv = rq->i915;
struct intel_engine_cs *engine = rq->engine;
@@ -628,17 +625,6 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
struct i915_guc_client *client = guc->execbuf_client;
int b_ret;
- /* We keep the previous context alive until we retire the following
- * request. This ensures that any the context object is still pinned
- * for any residual writes the HW makes into it on the context switch
- * into the next object following the breadcrumb. Otherwise, we may
- * retire the context too early.
- */
- rq->previous_context = engine->last_context;
- engine->last_context = rq->ctx;
-
- i915_gem_request_submit(rq);
-
spin_lock(&client->wq_lock);
guc_wq_item_append(client, rq);
@@ -658,6 +644,23 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
spin_unlock(&client->wq_lock);
}
+static void i915_guc_submit(struct drm_i915_gem_request *rq)
+{
+ struct intel_engine_cs *engine = rq->engine;
+
+ /* We keep the previous context alive until we retire the following
+ * request. This ensures that the context object is still pinned
+ * for any residual writes the HW makes into it on the context switch
+ * into the next object following the breadcrumb. Otherwise, we may
+ * retire the context too early.
+ */
+ rq->previous_context = engine->last_context;
+ engine->last_context = rq->ctx;
+
+ i915_gem_request_submit(rq);
+ __i915_guc_submit(rq);
+}
+
/*
* Everything below here is concerned with setup & teardown, and is
* therefore not part of the somewhat time-critical batch-submission
@@ -1556,7 +1559,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
/* Replay the current set of previously submitted requests */
list_for_each_entry(rq, &engine->timeline->requests, link) {
client->wq_rsvd += sizeof(struct guc_wq_item);
- i915_guc_submit(rq);
+ __i915_guc_submit(rq);
}
}
--
2.10.2
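The split above can be illustrated with a minimal userspace sketch (all names hypothetical, not the driver's actual code): submit() performs the one-time bookkeeping and then calls __submit(), while reset replay re-reserves workqueue space and calls only __submit(), so the bookkeeping is never repeated for replayed requests.

```c
/* Sketch of the submit/__submit split under discussion.
 * submit() = bookkeeping + enqueue; replay() re-reserves space and
 * re-enqueues already-submitted requests without repeating bookkeeping. */
#include <assert.h>
#include <stddef.h>

struct request { int submitted; int enqueued; };

static size_t wq_reserved; /* reserved workqueue space, in items */

static void __submit(struct request *rq)
{
	rq->enqueued++; /* append a wq item and ring the doorbell */
}

static void submit(struct request *rq)
{
	rq->submitted = 1; /* one-time bookkeeping (context, fences) */
	__submit(rq);
}

static void replay(struct request *rqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		wq_reserved++;      /* pending requests keep their space */
		__submit(&rqs[i]);  /* no double bookkeeping */
	}
}
```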
* ✗ Fi.CI.BAT: warning for series starting with [01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
` (14 preceding siblings ...)
2016-11-25 9:30 ` [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset Chris Wilson
@ 2016-11-25 10:16 ` Patchwork
15 siblings, 0 replies; 39+ messages in thread
From: Patchwork @ 2016-11-25 10:16 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs
URL : https://patchwork.freedesktop.org/series/15950/
State : warning
== Summary ==
Series 15950v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/15950/revisions/1/mbox/
Test kms_pipe_crc_basic:
Subgroup nonblocking-crc-pipe-a:
pass -> DMESG-WARN (fi-ilk-650)
fi-bdw-5557u total:244 pass:229 dwarn:0 dfail:0 fail:0 skip:15
fi-bxt-t5700 total:244 pass:216 dwarn:0 dfail:0 fail:0 skip:28
fi-byt-j1900 total:244 pass:216 dwarn:0 dfail:0 fail:0 skip:28
fi-byt-n2820 total:244 pass:212 dwarn:0 dfail:0 fail:0 skip:32
fi-hsw-4770 total:244 pass:224 dwarn:0 dfail:0 fail:0 skip:20
fi-hsw-4770r total:244 pass:224 dwarn:0 dfail:0 fail:0 skip:20
fi-ilk-650 total:244 pass:190 dwarn:1 dfail:0 fail:0 skip:53
fi-ivb-3520m total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22
fi-ivb-3770 total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22
fi-kbl-7200u total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22
fi-kbl-7500u total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22
fi-skl-6260u total:244 pass:230 dwarn:0 dfail:0 fail:0 skip:14
fi-skl-6700hq total:244 pass:223 dwarn:0 dfail:0 fail:0 skip:21
fi-skl-6700k total:244 pass:222 dwarn:1 dfail:0 fail:0 skip:21
fi-skl-6770hq total:244 pass:230 dwarn:0 dfail:0 fail:0 skip:14
fi-snb-2520m total:244 pass:212 dwarn:0 dfail:0 fail:0 skip:32
fi-snb-2600 total:244 pass:211 dwarn:0 dfail:0 fail:0 skip:33
fi-bsw-n3050 failed to collect. IGT log at Patchwork_3112/fi-bsw-n3050/igt.log
90e312d771441d1ec7bf3cd72620a3fe9a4aef82 drm-tip: 2016y-11m-25d-07h-11m-59s UTC integration manifest
43ad2d6 drm/i915/guc: Split hw submission for replay after GPU reset
504f114 drm/i915/guc: Keep the execbuf client allocated across reset
11a5fc4 drm/i915/guc: Initialise doorbell cookie to matching value
b32f532 drm/i915/guc: Rename client->cookie to match use
80f383a drm/i915: Trim i915_guc_info() stack usage
4ab155c HAX drm/i915: Enable guc submission
c123832 drm/i915: Enable swfence debugobject support for i915.ko
25278f3 drm/i915: Integrate i915_sw_fence with debugobjects
d617030 drm/i915: Hold a reference on the request for its fence chain
ec83476 drm/i915: Assert no external observers when unwind a failed request alloc
f0ac1c1 drm/i915: Add is-completed assert to request retire entrypoint
51f5855 drm/i915/perf: Wrap 64bit divides in do_div()
e810a5e drm: Protect fb_helper list manipulation with a mutex
2ae4afc drm: Pull together probe + setup for drm_fb_helper
f5d6587 drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3112/
* Re: [PATCH 08/15] drm/i915: Integrate i915_sw_fence with debugobjects
2016-11-25 9:30 ` [PATCH 08/15] drm/i915: Integrate i915_sw_fence with debugobjects Chris Wilson
@ 2016-11-25 10:24 ` Joonas Lahtinen
0 siblings, 0 replies; 39+ messages in thread
From: Joonas Lahtinen @ 2016-11-25 10:24 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On pe, 2016-11-25 at 09:30 +0000, Chris Wilson wrote:
> Add the tracking required to enable debugobjects to improve error
> detection in BAT. The debugobject interface lets us track the
> lifetime of the fences even while they are embedded in larger structs,
> i.e. to check they are not used after they have been released.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
<SNIP>
> +static bool i915_sw_fence_fixup_fail(void *addr, enum debug_obj_state state)
> +{
> + return false;
> +}
This is unnecessary, just leave the funcs unpopulated.
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH 07/15] drm/i915: Hold a reference on the request for its fence chain
2016-11-25 9:30 ` [PATCH 07/15] drm/i915: Hold a reference on the request for its fence chain Chris Wilson
@ 2016-11-25 10:31 ` Joonas Lahtinen
2016-11-25 10:46 ` Chris Wilson
0 siblings, 1 reply; 39+ messages in thread
From: Joonas Lahtinen @ 2016-11-25 10:31 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On pe, 2016-11-25 at 09:30 +0000, Chris Wilson wrote:
> Currently, we have an active reference for the request until it is
> retired. Though it cannot be retired before it has been executed by
> hardware, the request may be completed before we have finished
> processing the execute fence, i.e. we may continue to process that fence
> as we free the request.
>
> Fixes: 5590af3e115a ("drm/i915: Drive request submission through fence callbacks")
> Fixes: 23902e49c999 ("drm/i915: Split request submit/execute phase into two")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
<SNIP>
> @@ -551,8 +569,10 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> req->timeline->fence_context,
> __timeline_get_seqno(req->timeline->common));
>
> - i915_sw_fence_init(&req->submit, submit_notify);
> - i915_sw_fence_init(&req->execute, execute_notify);
> + /* We bump the ref for the fence chain */
> + i915_sw_fence_init(&i915_gem_request_get(req)->submit, submit_notify);
> + i915_sw_fence_init(&i915_gem_request_get(req)->execute, execute_notify);
I would put the gem_request_get on a separate line before fence_init
and remove the comment. Make the code speak for itself instead of
camouflaging it. Now it looks much like a type conversion helper.
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH 09/15] drm/i915: Enable swfence debugobject support for i915.ko
2016-11-25 9:30 ` [PATCH 09/15] drm/i915: Enable swfence debugobject support for i915.ko Chris Wilson
@ 2016-11-25 10:32 ` Joonas Lahtinen
0 siblings, 0 replies; 39+ messages in thread
From: Joonas Lahtinen @ 2016-11-25 10:32 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On pe, 2016-11-25 at 09:30 +0000, Chris Wilson wrote:
> Only once the debugobject symbols are exported can we enable support for
> debugging swfences when i915 is built as a module. Requires commit
> 2617fdca3f68 ("lib/debugobjects: export for use in modules")
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH 05/15] drm/i915: Add is-completed assert to request retire entrypoint
2016-11-25 9:30 ` [PATCH 05/15] drm/i915: Add is-completed assert to request retire entrypoint Chris Wilson
@ 2016-11-25 10:36 ` Joonas Lahtinen
0 siblings, 0 replies; 39+ messages in thread
From: Joonas Lahtinen @ 2016-11-25 10:36 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On pe, 2016-11-25 at 09:30 +0000, Chris Wilson wrote:
> While we will check that the request is completed prior to being
> retired, by placing an assert that the request is complete at the
> entrypoint of the function we can more clearly document the function's
> preconditions.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH 06/15] drm/i915: Assert no external observers when unwind a failed request alloc
2016-11-25 9:30 ` [PATCH 06/15] drm/i915: Assert no external observers when unwind a failed request alloc Chris Wilson
@ 2016-11-25 10:39 ` Joonas Lahtinen
2016-11-25 11:27 ` Chris Wilson
0 siblings, 1 reply; 39+ messages in thread
From: Joonas Lahtinen @ 2016-11-25 10:39 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On pe, 2016-11-25 at 09:30 +0000, Chris Wilson wrote:
> Before we return the request back to the kmem_cache after a failed
> i915_gem_request_alloc(), we should assert that it has not been added to
> any global state tracking.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_gem_request.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index bd7b21f70283..ed6cead22a86 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -599,6 +599,11 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> > return req;
>
> err_ctx:
> + /* Make sure we didn't add ourselves to external state before freeing */
> + GEM_BUG_ON(!list_empty(&req->active_list));
> + GEM_BUG_ON(!list_empty(&req->priotree.signalers_list));
> + GEM_BUG_ON(!list_empty(&req->priotree.waiters_list));
This is some paranoia already, but if you managed to hit it;
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* Re: [PATCH 07/15] drm/i915: Hold a reference on the request for its fence chain
2016-11-25 10:31 ` Joonas Lahtinen
@ 2016-11-25 10:46 ` Chris Wilson
0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 10:46 UTC (permalink / raw)
To: Joonas Lahtinen; +Cc: intel-gfx
On Fri, Nov 25, 2016 at 12:31:03PM +0200, Joonas Lahtinen wrote:
> On pe, 2016-11-25 at 09:30 +0000, Chris Wilson wrote:
> > Currently, we have an active reference for the request until it is
> > retired. Though it cannot be retired before it has been executed by
> > hardware, the request may be completed before we have finished
> > processing the execute fence, i.e. we may continue to process that fence
> > as we free the request.
> >
> > Fixes: 5590af3e115a ("drm/i915: Drive request submission through fence callbacks")
> > Fixes: 23902e49c999 ("drm/i915: Split request submit/execute phase into two")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> <SNIP>
>
> > @@ -551,8 +569,10 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> > req->timeline->fence_context,
> > __timeline_get_seqno(req->timeline->common));
> >
> > - i915_sw_fence_init(&req->submit, submit_notify);
> > - i915_sw_fence_init(&req->execute, execute_notify);
> > + /* We bump the ref for the fence chain */
> > + i915_sw_fence_init(&i915_gem_request_get(req)->submit, submit_notify);
> > + i915_sw_fence_init(&i915_gem_request_get(req)->execute, execute_notify);
>
> I would put the gem_request_get on a separate line before fence_init
> and remove the comment. Make the code speak for itself instead of
> camouflaging it. Now it looks much like a type conversion helper.
I don't like the kref_init(); kref_get(); kref_get(); sequence, but that's the
interface we have, so yes I was trying to hide it as I think it is ugly.
And also because it pairs with the submit_notify, i.e. we pass a new
ref to sw_fence_init that it deals with. Hmm, still favouring the
all-in-one.
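A minimal userspace sketch of the pattern being debated (all names hypothetical): the request starts with one reference for the caller and takes one extra reference per fence, so completing either fence cannot free the request while the other is still being processed.

```c
/* One reference per fence chain: release order does not matter, the
 * object only "dies" once every holder has dropped its reference. */
#include <assert.h>
#include <stdlib.h>

struct request {
	int refcount;
	int freed; /* for illustration only; stands in for actual freeing */
};

static struct request *request_get(struct request *rq)
{
	rq->refcount++;
	return rq;
}

static void request_put(struct request *rq)
{
	if (--rq->refcount == 0)
		rq->freed = 1; /* stands in for kmem_cache_free() */
}

static struct request *request_alloc(void)
{
	struct request *rq = calloc(1, sizeof(*rq));

	rq->refcount = 1; /* caller's reference */
	request_get(rq);  /* held by the submit fence */
	request_get(rq);  /* held by the execute fence */
	return rq;
}
```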
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 06/15] drm/i915: Assert no external observers when unwind a failed request alloc
2016-11-25 10:39 ` Joonas Lahtinen
@ 2016-11-25 11:27 ` Chris Wilson
0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-25 11:27 UTC (permalink / raw)
To: Joonas Lahtinen; +Cc: intel-gfx
On Fri, Nov 25, 2016 at 12:39:25PM +0200, Joonas Lahtinen wrote:
> On pe, 2016-11-25 at 09:30 +0000, Chris Wilson wrote:
> > Before we return the request back to the kmem_cache after a failed
> > i915_gem_request_alloc(), we should assert that it has not been added to
> > any global state tracking.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/i915_gem_request.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > index bd7b21f70283..ed6cead22a86 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > @@ -599,6 +599,11 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
> > > return req;
> >
> > err_ctx:
> > + /* Make sure we didn't add ourselves to external state before freeing */
> > + GEM_BUG_ON(!list_empty(&req->active_list));
> > + GEM_BUG_ON(!list_empty(&req->priotree.signalers_list));
> > + GEM_BUG_ON(!list_empty(&req->priotree.waiters_list));
>
> This is some paranoia already, but if you managed to hit it;
Fortunately not. I scoured the init_context to make sure we weren't
triggering it - but it is a bug we have had in the past and the reason
why requests are so tricky to construct, so I thought having some
paranoia was worth the documentation.
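The defensive check in the hunk above, reduced to a userspace sketch (minimal list primitives; names mirror the kernel's but are local to this sketch): before handing an object back to the allocator, assert that no external list still links to it.

```c
/* Assert-before-free: an object is only safe to recycle once it is
 * unlinked from all external tracking lists. */
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

struct request {
	struct list_head active_list;
	struct list_head signalers_list;
	struct list_head waiters_list;
};

static void request_init(struct request *rq)
{
	INIT_LIST_HEAD(&rq->active_list);
	INIT_LIST_HEAD(&rq->signalers_list);
	INIT_LIST_HEAD(&rq->waiters_list);
}

/* Returns 1 if no external state still points at the request,
 * i.e. it is safe to hand back to the allocator. */
static int request_unlinked(const struct request *rq)
{
	return list_empty(&rq->active_list) &&
	       list_empty(&rq->signalers_list) &&
	       list_empty(&rq->waiters_list);
}
```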
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage
2016-11-25 9:30 ` [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage Chris Wilson
@ 2016-11-28 11:15 ` Tvrtko Ursulin
2016-11-28 11:35 ` Chris Wilson
0 siblings, 1 reply; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 11:15 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 25/11/2016 09:30, Chris Wilson wrote:
> i915_guc_info() (part of debugfs output) tries to avoid holding
> struct_mutex for a long period by copying onto the stack. This causes a
> warning that the stack frame is massive, so stop doing that. We can even
> forgo holding the struct_mutex here as that doesn't serialise the values
> being read (and the lists used exist for the device lifetime).
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 39 +++++++++++++------------------------
> 1 file changed, 14 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 8eb8c29b7492..7676e88ae5f2 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2434,47 +2434,36 @@ static void i915_guc_client_info(struct seq_file *m,
> static int i915_guc_info(struct seq_file *m, void *data)
> {
> struct drm_i915_private *dev_priv = node_to_i915(m->private);
> - struct drm_device *dev = &dev_priv->drm;
> - struct intel_guc guc;
> - struct i915_guc_client client = {};
> + const struct intel_guc *guc = &dev_priv->guc;
> struct intel_engine_cs *engine;
> enum intel_engine_id id;
> - u64 total = 0;
> + u64 total;
>
> if (!HAS_GUC_SCHED(dev_priv))
> return 0;
>
> - if (mutex_lock_interruptible(&dev->struct_mutex))
> - return 0;
> -
> - /* Take a local copy of the GuC data, so we can dump it at leisure */
> - guc = dev_priv->guc;
> - if (guc.execbuf_client)
> - client = *guc.execbuf_client;
So this used to print out all zeros when GuC submission is disabled.
Should we instead just skip all the counter dumping if execbuf_client ==
NULL and just print "disabled" or something?
> -
> - mutex_unlock(&dev->struct_mutex);
> -
> seq_printf(m, "Doorbell map:\n");
> - seq_printf(m, "\t%*pb\n", GUC_MAX_DOORBELLS, guc.doorbell_bitmap);
> - seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc.db_cacheline);
> + seq_printf(m, "\t%*pb\n", GUC_MAX_DOORBELLS, guc->doorbell_bitmap);
> + seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc->db_cacheline);
>
> - seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
> - seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
> - seq_printf(m, "GuC last action command: 0x%x\n", guc.action_cmd);
> - seq_printf(m, "GuC last action status: 0x%x\n", guc.action_status);
> - seq_printf(m, "GuC last action error code: %d\n", guc.action_err);
> + seq_printf(m, "GuC total action count: %llu\n", guc->action_count);
> + seq_printf(m, "GuC action failure count: %u\n", guc->action_fail);
> + seq_printf(m, "GuC last action command: 0x%x\n", guc->action_cmd);
> + seq_printf(m, "GuC last action status: 0x%x\n", guc->action_status);
> + seq_printf(m, "GuC last action error code: %d\n", guc->action_err);
>
> + total = 0;
> seq_printf(m, "\nGuC submissions:\n");
> for_each_engine(engine, dev_priv, id) {
> - u64 submissions = guc.submissions[id];
> + u64 submissions = guc->submissions[id];
> total += submissions;
> seq_printf(m, "\t%-24s: %10llu, last seqno 0x%08x\n",
> - engine->name, submissions, guc.last_seqno[id]);
> + engine->name, submissions, guc->last_seqno[id]);
> }
> seq_printf(m, "\t%s: %llu\n", "Total", total);
>
> - seq_printf(m, "\nGuC execbuf client @ %p:\n", guc.execbuf_client);
> - i915_guc_client_info(m, dev_priv, &client);
> + seq_printf(m, "\nGuC execbuf client @ %p:\n", guc->execbuf_client);
Dereferencing guc->execbuf_client doesn't oops in execlist mode? I must
be misreading something in that case.
> + i915_guc_client_info(m, dev_priv, guc->execbuf_client);
>
> i915_guc_log_info(m, dev_priv);
>
>
Regards,
Tvrtko
* Re: [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage
2016-11-28 11:15 ` Tvrtko Ursulin
@ 2016-11-28 11:35 ` Chris Wilson
2016-11-28 12:17 ` Tvrtko Ursulin
0 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-28 11:35 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Mon, Nov 28, 2016 at 11:15:27AM +0000, Tvrtko Ursulin wrote:
>
> On 25/11/2016 09:30, Chris Wilson wrote:
> >i915_guc_info() (part of debugfs output) tries to avoid holding
> >struct_mutex for a long period by copying onto the stack. This causes a
> >warning that the stack frame is massive, so stop doing that. We can even
> >forgo holding the struct_mutex here as that doesn't serialise the values
> >being read (and the lists used exist for the device lifetime).
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/i915_debugfs.c | 39 +++++++++++++------------------------
> > 1 file changed, 14 insertions(+), 25 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> >index 8eb8c29b7492..7676e88ae5f2 100644
> >--- a/drivers/gpu/drm/i915/i915_debugfs.c
> >+++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >@@ -2434,47 +2434,36 @@ static void i915_guc_client_info(struct seq_file *m,
> > static int i915_guc_info(struct seq_file *m, void *data)
> > {
> > struct drm_i915_private *dev_priv = node_to_i915(m->private);
> >- struct drm_device *dev = &dev_priv->drm;
> >- struct intel_guc guc;
> >- struct i915_guc_client client = {};
> >+ const struct intel_guc *guc = &dev_priv->guc;
> > struct intel_engine_cs *engine;
> > enum intel_engine_id id;
> >- u64 total = 0;
> >+ u64 total;
> >
> > if (!HAS_GUC_SCHED(dev_priv))
> > return 0;
> >
> >- if (mutex_lock_interruptible(&dev->struct_mutex))
> >- return 0;
> >-
> >- /* Take a local copy of the GuC data, so we can dump it at leisure */
> >- guc = dev_priv->guc;
> >- if (guc.execbuf_client)
> >- client = *guc.execbuf_client;
>
> So this used to print out all zeros when GuC submission is disabled.
> Should we instead just skip all the counter dumping if
> execbuf_client == NULL and just print "disabled" or something?
That would be the sensible escape instead of HAS_GUC_SCHED.
if (!dev_priv->guc.execbuf_client) {
seq_puts(m, "GuC scheduling %s\n",
HAS_GUC_SCHED(dev_priv) ? "disabled" : "not supported");
return 0;
}
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 12/15] drm/i915/guc: Rename client->cookie to match use
2016-11-25 9:30 ` [PATCH 12/15] drm/i915/guc: Rename client->cookie to match use Chris Wilson
@ 2016-11-28 11:37 ` Tvrtko Ursulin
0 siblings, 0 replies; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 11:37 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 25/11/2016 09:30, Chris Wilson wrote:
> The client->cookie is a shadow of the doorbell->cookie value, so rename
> it to indicate its association with the doorbell, like the doorbell id
> and offset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
> drivers/gpu/drm/i915/i915_guc_submission.c | 6 +++---
> drivers/gpu/drm/i915/intel_guc.h | 2 +-
> 3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 7676e88ae5f2..fe62a9da4dd1 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2414,7 +2414,7 @@ static void i915_guc_client_info(struct seq_file *m,
> seq_printf(m, "\tPriority %d, GuC ctx index: %u, PD offset 0x%x\n",
> client->priority, client->ctx_index, client->proc_desc_offset);
> seq_printf(m, "\tDoorbell id %d, offset: 0x%x, cookie 0x%x\n",
> - client->doorbell_id, client->doorbell_offset, client->cookie);
> + client->doorbell_id, client->doorbell_offset, client->doorbell_cookie);
> seq_printf(m, "\tWQ size %d, offset: 0x%x, tail %d\n",
> client->wq_size, client->wq_offset, client->wq_tail);
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 4462112725ef..81aa5d4265c0 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -567,11 +567,11 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>
> /* current cookie */
> db_cmp.db_status = GUC_DOORBELL_ENABLED;
> - db_cmp.cookie = gc->cookie;
> + db_cmp.cookie = gc->doorbell_cookie;
>
> /* cookie to be updated */
> db_exc.db_status = GUC_DOORBELL_ENABLED;
> - db_exc.cookie = gc->cookie + 1;
> + db_exc.cookie = gc->doorbell_cookie + 1;
> if (db_exc.cookie == 0)
> db_exc.cookie = 1;
>
> @@ -586,7 +586,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> /* if the exchange was successfully executed */
> if (db_ret.value_qw == db_cmp.value_qw) {
> /* db was successfully rung */
> - gc->cookie = db_exc.cookie;
> + gc->doorbell_cookie = db_exc.cookie;
> ret = 0;
> break;
> }
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 02337a81abc2..731e8a2f2644 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -74,7 +74,7 @@ struct i915_guc_client {
> uint32_t proc_desc_offset;
>
> uint32_t doorbell_offset;
> - uint32_t cookie;
> + uint32_t doorbell_cookie;
> uint16_t doorbell_id;
> uint16_t padding[3]; /* Maintain alignment */
>
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH 13/15] drm/i915/guc: Initialise doorbell cookie to matching value
2016-11-25 9:30 ` [PATCH 13/15] drm/i915/guc: Initialise doorbell cookie to matching value Chris Wilson
@ 2016-11-28 12:09 ` Tvrtko Ursulin
2016-11-28 12:18 ` Chris Wilson
0 siblings, 1 reply; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 12:09 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 25/11/2016 09:30, Chris Wilson wrote:
> Set the initial value of the doorbell cookie from the client.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 81aa5d4265c0..800dc5bb732f 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -247,8 +247,8 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
>
> /* Activate the new doorbell */
> __set_bit(new_id, doorbell_bitmap);
> - doorbell->cookie = 0;
> doorbell->db_status = GUC_DOORBELL_ENABLED;
> + doorbell->cookie = client->doorbell_cookie;
> return host2guc_allocate_doorbell(guc, client);
> }
>
>
I assume this is to enable the following patch where the cookie starts
being preserved?
It appears to make no functional change when looked at just to this
point. On that basis it looks fine to let it in.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage
2016-11-28 11:35 ` Chris Wilson
@ 2016-11-28 12:17 ` Tvrtko Ursulin
0 siblings, 0 replies; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 12:17 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 28/11/2016 11:35, Chris Wilson wrote:
> On Mon, Nov 28, 2016 at 11:15:27AM +0000, Tvrtko Ursulin wrote:
>>
>> On 25/11/2016 09:30, Chris Wilson wrote:
>>> i915_guc_info() (part of debugfs output) tries to avoid holding
>>> struct_mutex for a long period by copying onto the stack. This causes a
>>> warning that the stack frame is massive, so stop doing that. We can even
>>> forgo holding the struct_mutex here as that doesn't serialise the values
>>> being read (and the lists used exist for the device lifetime).
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>> drivers/gpu/drm/i915/i915_debugfs.c | 39 +++++++++++++------------------------
>>> 1 file changed, 14 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 8eb8c29b7492..7676e88ae5f2 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -2434,47 +2434,36 @@ static void i915_guc_client_info(struct seq_file *m,
>>> static int i915_guc_info(struct seq_file *m, void *data)
>>> {
>>> struct drm_i915_private *dev_priv = node_to_i915(m->private);
>>> - struct drm_device *dev = &dev_priv->drm;
>>> - struct intel_guc guc;
>>> - struct i915_guc_client client = {};
>>> + const struct intel_guc *guc = &dev_priv->guc;
>>> struct intel_engine_cs *engine;
>>> enum intel_engine_id id;
>>> - u64 total = 0;
>>> + u64 total;
>>>
>>> if (!HAS_GUC_SCHED(dev_priv))
>>> return 0;
>>>
>>> - if (mutex_lock_interruptible(&dev->struct_mutex))
>>> - return 0;
>>> -
>>> - /* Take a local copy of the GuC data, so we can dump it at leisure */
>>> - guc = dev_priv->guc;
>>> - if (guc.execbuf_client)
>>> - client = *guc.execbuf_client;
>>
>> So this used to print out all zeros when GuC submission is disabled.
>> Should we instead just skip all the counter dumping if
>> execbuf_client == NULL and just print "disabled" or something?
>
> That would be the sensible escape instead of HAS_GUC_SCHED.
>
> if (!dev_priv->guc.execbuf_client) {
> seq_puts(m, "GuC scheduling %s\n",
> HAS_GUC_SCHED(dev_priv) ? "disabled" : "not supported");
> return 0;
> }
That looks good to me. (Just printf, not puts.)
Regards,
Tvrtko
* Re: [PATCH 13/15] drm/i915/guc: Initialise doorbell cookie to matching value
2016-11-28 12:09 ` Tvrtko Ursulin
@ 2016-11-28 12:18 ` Chris Wilson
0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-28 12:18 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Mon, Nov 28, 2016 at 12:09:38PM +0000, Tvrtko Ursulin wrote:
>
> On 25/11/2016 09:30, Chris Wilson wrote:
> >Set the initial value of the doorbell cookie from the client.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >index 81aa5d4265c0..800dc5bb732f 100644
> >--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >@@ -247,8 +247,8 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
> >
> > /* Activate the new doorbell */
> > __set_bit(new_id, doorbell_bitmap);
> >- doorbell->cookie = 0;
> > doorbell->db_status = GUC_DOORBELL_ENABLED;
> >+ doorbell->cookie = client->doorbell_cookie;
> > return host2guc_allocate_doorbell(guc, client);
> > }
> >
> >
>
> I assume this is to enable the following patch where the cookie
> starts being preserved?
Exactly. At present the value is zero, but when we start carrying the
client over across a reset, the client's cookie persists, so we either
want to reset the client or set the cookie to match the client. I chose
the latter as I think that is more consistent with how we ring the
doorbell (by setting the expected value from the client and comparing
against the current doorbell in the fw).
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset
2016-11-25 9:30 ` [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset Chris Wilson
@ 2016-11-28 13:49 ` Tvrtko Ursulin
2016-11-28 14:11 ` Chris Wilson
0 siblings, 1 reply; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 13:49 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 25/11/2016 09:30, Chris Wilson wrote:
> In order to avoid some complexity in trying to reconstruct the
> workqueues across reset, remember them instead. The issue comes when we
> have to handle a reset between request allocation and submission, the
> request has reserved space in the wq, but is not in any list so we fail
> to restore the reserved space. By keeping the execbuf client intact
> across the reset, we also keep the reservations.
I have lost track a bit of why we need to reserve the space at request
creation time. Is it not becoming a bit cumbersome?
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_guc_submission.c | 83 +++++++++++++++++++-----------
> 1 file changed, 52 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 800dc5bb732f..00b5fa871644 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -252,13 +252,6 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
> return host2guc_allocate_doorbell(guc, client);
> }
>
> -static int guc_init_doorbell(struct intel_guc *guc,
> - struct i915_guc_client *client,
> - uint16_t db_id)
> -{
> - return guc_update_doorbell_id(guc, client, db_id);
> -}
> -
> static void guc_disable_doorbell(struct intel_guc *guc,
> struct i915_guc_client *client)
> {
> @@ -779,8 +772,7 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
> uint16_t db_id;
> int i, err;
>
> - /* Save client's original doorbell selection */
> - db_id = client->doorbell_id;
> + guc_disable_doorbell(guc, client);
>
> for (i = 0; i < GUC_MAX_DOORBELLS; ++i) {
> /* Skip if doorbell is OK */
> @@ -793,7 +785,9 @@ static void guc_init_doorbell_hw(struct intel_guc *guc)
> i, err);
> }
>
> - /* Restore to original value */
> + db_id = select_doorbell_register(guc, client->priority);
> + WARN_ON(db_id == GUC_INVALID_DOORBELL_ID);
> +
> err = guc_update_doorbell_id(guc, client, db_id);
> if (err)
> DRM_WARN("Failed to restore doorbell to %d, err %d\n",
> @@ -883,8 +877,13 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>
> guc_proc_desc_init(guc, client);
> guc_ctx_desc_init(guc, client);
> - if (guc_init_doorbell(guc, client, db_id))
> - goto err;
> +
> + /* For runtime client allocation we need to enable the doorbell. Not
> + * required yet for the static execbuf_client as this special kernel
> + * client is enabled from i915_guc_submission_enable().
> + *
> + * guc_update_doorbell_id(guc, client, db_id);
> + */
What future is the "not yet" part referring to? What are the other clients?
>
> DRM_DEBUG_DRIVER("new priority %u client %p for engine(s) 0x%x: ctx_index %u\n",
> priority, client, client->engines, client->ctx_index);
> @@ -1484,6 +1483,9 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
> struct intel_guc *guc = &dev_priv->guc;
> struct i915_vma *vma;
>
> + if (!HAS_GUC_SCHED(dev_priv))
> + return 0;
Why did you have to add this hunk? I think this function does not get
called unless there is a GuC.
> +
> /* Wipe bitmap & delete client in case of reinitialisation */
> bitmap_clear(guc->doorbell_bitmap, 0, GUC_MAX_DOORBELLS);
> i915_guc_submission_disable(dev_priv);
> @@ -1504,42 +1506,57 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
> guc_log_create(guc);
> guc_addon_create(guc);
>
> + guc->execbuf_client = guc_client_alloc(dev_priv,
> + INTEL_INFO(dev_priv)->ring_mask,
> + GUC_CTX_PRIORITY_KMD_NORMAL,
> + dev_priv->kernel_context);
> + if (!guc->execbuf_client) {
> + DRM_ERROR("Failed to create GuC client for execbuf!\n");
> + goto err;
> + }
> +
> return 0;
> +
> +err:
> + i915_guc_submission_fini(dev_priv);
> + return -ENOMEM;
> +}
> +
> +static void guc_reset_wq(struct i915_guc_client *gc)
> +{
> + struct guc_process_desc *desc = gc->vaddr + gc->proc_desc_offset;
> +
> + desc->head = 0;
> + desc->tail = 0;
> +
> + gc->wq_tail = 0;
> }
>
> int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
> {
> struct intel_guc *guc = &dev_priv->guc;
> - struct drm_i915_gem_request *request;
> - struct i915_guc_client *client;
> + struct i915_guc_client *client = guc->execbuf_client;
> struct intel_engine_cs *engine;
> enum intel_engine_id id;
>
> - /* client for execbuf submission */
> - client = guc_client_alloc(dev_priv,
> - INTEL_INFO(dev_priv)->ring_mask,
> - GUC_CTX_PRIORITY_KMD_NORMAL,
> - dev_priv->kernel_context);
> - if (!client) {
> - DRM_ERROR("Failed to create normal GuC client!\n");
> - return -ENOMEM;
> - }
> + if (!client)
> + return -ENODEV;
>
> - guc->execbuf_client = client;
> + guc_reset_wq(client);
> host2guc_sample_forcewake(guc, client);
> guc_init_doorbell_hw(guc);
>
> /* Take over from manual control of ELSP (execlists) */
> for_each_engine(engine, dev_priv, id) {
> + struct drm_i915_gem_request *rq;
> +
> engine->submit_request = i915_guc_submit;
> engine->schedule = NULL;
>
> /* Replay the current set of previously submitted requests */
> - list_for_each_entry(request,
> - &engine->timeline->requests, link) {
> + list_for_each_entry(rq, &engine->timeline->requests, link) {
> client->wq_rsvd += sizeof(struct guc_wq_item);
> - if (i915_sw_fence_done(&request->submit))
> - i915_guc_submit(request);
The i915_sw_fence_done() check is not needed because only submit-ready
requests can be on the engine timeline?
> + i915_guc_submit(rq);
> }
> }
>
> @@ -1555,14 +1572,18 @@ void i915_guc_submission_disable(struct drm_i915_private *dev_priv)
>
> /* Revert back to manual ELSP submission */
> intel_execlists_enable_submission(dev_priv);
> -
> - guc_client_free(dev_priv, guc->execbuf_client);
> - guc->execbuf_client = NULL;
> }
>
> void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
> {
> struct intel_guc *guc = &dev_priv->guc;
> + struct i915_guc_client *client;
> +
> + client = fetch_and_zero(&guc->execbuf_client);
> + if (!client)
> + return;
> +
> + guc_client_free(dev_priv, client);
>
> i915_vma_unpin_and_release(&guc->ads_vma);
> i915_vma_unpin_and_release(&guc->log.vma);
>
Regards,
Tvrtko
* Re: [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset
2016-11-25 9:30 ` [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset Chris Wilson
@ 2016-11-28 14:02 ` Tvrtko Ursulin
2016-11-28 14:19 ` Chris Wilson
0 siblings, 1 reply; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 14:02 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 25/11/2016 09:30, Chris Wilson wrote:
> Something I missed before sending off the partial series was that the
> non-scheduler guc reset path was broken (in the full series, this is
> pushed to the execlists reset handler). The issue is that after a reset,
> we have to refill the GuC workqueues, which we do by resubmitting the
> requests. However, if we already have submitted them, the fences within
> them have already been used and triggering them again is an error.
> Instead, just repopulate the guc workqueue.
>
> [ 115.858560] [IGT] gem_busy: starting subtest hang-render
> [ 135.839867] [drm] GPU HANG: ecode 9:0:0xe757fefe, in gem_busy [1716], reason: Hang on render ring, action: reset
> [ 135.839902] drm/i915: Resetting chip after gpu hang
> [ 135.839957] [drm] RC6 on
> [ 135.858351] ------------[ cut here ]------------
> [ 135.858357] WARNING: CPU: 2 PID: 45 at drivers/gpu/drm/i915/i915_sw_fence.c:108 i915_sw_fence_complete+0x25/0x30
> [ 135.858357] Modules linked in: rfcomm bnep binfmt_misc nls_iso8859_1 input_leds snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core btusb btrtl snd_hwdep snd_pcm 8250_dw snd_seq_midi hid_lenovo snd_seq_midi_event snd_rawmidi iwlwifi x86_pkg_temp_thermal coretemp snd_seq crct10dif_pclmul snd_seq_device hci_uart snd_timer crc32_pclmul ghash_clmulni_intel idma64 aesni_intel virt_dma btbcm snd btqca aes_x86_64 btintel lrw cfg80211 bluetooth gf128mul glue_helper ablk_helper cryptd soundcore intel_lpss_pci intel_pch_thermal intel_lpss_acpi intel_lpss acpi_als mfd_core kfifo_buf acpi_pad industrialio autofs4 hid_plantronics usbhid dm_mirror dm_region_hash dm_log sdhci_pci ahci sdhci libahci i2c_hid hid
> [ 135.858389] CPU: 2 PID: 45 Comm: kworker/2:1 Tainted: G W 4.9.0-rc4+ #238
> [ 135.858389] Hardware name: /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
> [ 135.858392] Workqueue: events_long i915_hangcheck_elapsed
> [ 135.858394] ffffc900001bf9b8 ffffffff812bb238 0000000000000000 0000000000000000
> [ 135.858396] ffffc900001bf9f8 ffffffff8104f621 0000006c00000000 ffff8808296137f8
> [ 135.858398] 0000000000000a00 ffff8808457a0000 ffff880845764e60 ffff880845760000
> [ 135.858399] Call Trace:
> [ 135.858403] [<ffffffff812bb238>] dump_stack+0x4d/0x65
> [ 135.858405] [<ffffffff8104f621>] __warn+0xc1/0xe0
> [ 135.858406] [<ffffffff8104f748>] warn_slowpath_null+0x18/0x20
> [ 135.858408] [<ffffffff813f8c15>] i915_sw_fence_complete+0x25/0x30
> [ 135.858410] [<ffffffff813f8fad>] i915_sw_fence_commit+0xd/0x30
> [ 135.858412] [<ffffffff8142e591>] __i915_gem_request_submit+0xe1/0xf0
> [ 135.858413] [<ffffffff8142e5c8>] i915_gem_request_submit+0x28/0x40
> [ 135.858415] [<ffffffff814433e7>] i915_guc_submit+0x47/0x210
> [ 135.858417] [<ffffffff81443e98>] i915_guc_submission_enable+0x468/0x540
> [ 135.858419] [<ffffffff81442495>] intel_guc_setup+0x715/0x810
> [ 135.858421] [<ffffffff8142b6b4>] i915_gem_init_hw+0x114/0x2a0
> [ 135.858423] [<ffffffff813eeaa8>] i915_reset+0xe8/0x120
> [ 135.858424] [<ffffffff813f3937>] i915_reset_and_wakeup+0x157/0x180
> [ 135.858426] [<ffffffff813f79db>] i915_handle_error+0x1ab/0x230
> [ 135.858428] [<ffffffff812c760d>] ? scnprintf+0x4d/0x90
> [ 135.858430] [<ffffffff81435985>] i915_hangcheck_elapsed+0x275/0x3d0
> [ 135.858432] [<ffffffff810668cf>] process_one_work+0x12f/0x410
> [ 135.858433] [<ffffffff81066bf3>] worker_thread+0x43/0x4d0
> [ 135.858435] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
> [ 135.858436] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
> [ 135.858438] [<ffffffff8106bbb4>] kthread+0xd4/0xf0
> [ 135.858440] [<ffffffff8106bae0>] ? kthread_park+0x60/0x60
>
> v2: Only resubmit submitted requests
> v3: Don't forget the pending requests have reserved space.
>
> Fixes: d55ac5bf97c6 ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_guc_submission.c | 37 ++++++++++++++++--------------
> 1 file changed, 20 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 00b5fa871644..e12ff881d38d 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -602,12 +602,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> }
>
> /**
> - * i915_guc_submit() - Submit commands through GuC
> + * __i915_guc_submit() - Submit commands through GuC
> * @rq: request associated with the commands
> *
> - * Return: 0 on success, otherwise an errno.
> - * (Note: nonzero really shouldn't happen!)
> - *
> * The caller must have already called i915_guc_wq_reserve() above with
> * a result of 0 (success), guaranteeing that there is space in the work
> * queue for the new request, so enqueuing the item cannot fail.
> @@ -619,7 +616,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> * The only error here arises if the doorbell hardware isn't functioning
> * as expected, which really shouln't happen.
> */
> -static void i915_guc_submit(struct drm_i915_gem_request *rq)
> +static void __i915_guc_submit(struct drm_i915_gem_request *rq)
> {
> struct drm_i915_private *dev_priv = rq->i915;
> struct intel_engine_cs *engine = rq->engine;
> @@ -628,17 +625,6 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
> struct i915_guc_client *client = guc->execbuf_client;
> int b_ret;
>
> - /* We keep the previous context alive until we retire the following
> - * request. This ensures that any the context object is still pinned
> - * for any residual writes the HW makes into it on the context switch
> - * into the next object following the breadcrumb. Otherwise, we may
> - * retire the context too early.
> - */
> - rq->previous_context = engine->last_context;
> - engine->last_context = rq->ctx;
> -
> - i915_gem_request_submit(rq);
> -
> spin_lock(&client->wq_lock);
> guc_wq_item_append(client, rq);
>
> @@ -658,6 +644,23 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
> spin_unlock(&client->wq_lock);
> }
>
> +static void i915_guc_submit(struct drm_i915_gem_request *rq)
> +{
> + struct intel_engine_cs *engine = rq->engine;
> +
> + /* We keep the previous context alive until we retire the following
> + * request. This ensures that any the context object is still pinned
> + * for any residual writes the HW makes into it on the context switch
> + * into the next object following the breadcrumb. Otherwise, we may
> + * retire the context too early.
> + */
> + rq->previous_context = engine->last_context;
> + engine->last_context = rq->ctx;
> +
> + i915_gem_request_submit(rq);
> + __i915_guc_submit(rq);
> +}
> +
> /*
> * Everything below here is concerned with setup & teardown, and is
> * therefore not part of the somewhat time-critical batch-submission
> @@ -1556,7 +1559,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
> /* Replay the current set of previously submitted requests */
> list_for_each_entry(rq, &engine->timeline->requests, link) {
> client->wq_rsvd += sizeof(struct guc_wq_item);
> - i915_guc_submit(rq);
> + __i915_guc_submit(rq);
> }
> }
>
>
The only thing bothering me here is how this interacts with the common,
or "execlist", part of the reset.
Copying Mika in the hope he can help here. :)
I see that i915_gem_reset_engine runs before this GuC wq refill and does
some magic with the requests on the engine->timeline->requests list.
Would it be possible to somehow better integrate the two things?
Regards,
Tvrtko
* Re: [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset
2016-11-28 13:49 ` Tvrtko Ursulin
@ 2016-11-28 14:11 ` Chris Wilson
2016-11-28 15:44 ` Tvrtko Ursulin
0 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-11-28 14:11 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Mon, Nov 28, 2016 at 01:49:03PM +0000, Tvrtko Ursulin wrote:
>
> On 25/11/2016 09:30, Chris Wilson wrote:
> >In order to avoid some complexity in trying to reconstruct the
> >workqueues across reset, remember them instead. The issue comes when we
> >have to handle a reset between request allocation and submission, the
> >request has reserved space in the wq, but is not in any list so we fail
> >to restore the reserved space. By keeping the execbuf client intact
> >across the reset, we also keep the reservations.
>
> I have lost track a bit of why we need to reserve the space at request
> creation time. Is it not becoming a bit cumbersome?
It is very, very hard to handle a failure. We have to be careful not to
alter global state prior to executing the request, or at least prior to
submitting it, which we currently are not. Since we can't unwind the
global state changes, that imposes a point of no return on request
construction, after which the request must be submitted. (It is
possible, though, to detect when a request makes no global state changes
and drop it on add.) As the reservation may fail, we have to do it
early.
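A minimal sketch of that reserve-early pattern, with made-up sizes and
names (the real workqueue accounting is more involved):

```c
#include <stddef.h>
#include <assert.h>

/* Toy workqueue reservation: space is reserved at request creation,
 * where failure is still recoverable, and consumed at submission,
 * past the point of no return. Illustrative only. */
#define TOY_WQ_SIZE 4
#define TOY_ITEM_SZ 1

struct toy_wq {
	size_t used;		/* items already written */
	size_t reserved;	/* items promised to in-flight requests */
};

/* Called during request allocation: the only step allowed to fail. */
static int toy_wq_reserve(struct toy_wq *wq)
{
	if (wq->used + wq->reserved + TOY_ITEM_SZ > TOY_WQ_SIZE)
		return -1;	/* caller can still back out cleanly */
	wq->reserved += TOY_ITEM_SZ;
	return 0;
}

/* Called at submission: must not fail, the space was promised earlier. */
static void toy_wq_append(struct toy_wq *wq)
{
	wq->reserved -= TOY_ITEM_SZ;
	wq->used += TOY_ITEM_SZ;
}
```

Keeping the client (and hence wq->reserved) intact across a reset is
what preserves the promise for requests that were allocated but not yet
submitted.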
> >@@ -883,8 +877,13 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
> >
> > guc_proc_desc_init(guc, client);
> > guc_ctx_desc_init(guc, client);
> >- if (guc_init_doorbell(guc, client, db_id))
> >- goto err;
> >+
> >+ /* For runtime client allocation we need to enable the doorbell. Not
> >+ * required yet for the static execbuf_client as this special kernel
> >+ * client is enabled from i915_guc_submission_enable().
> >+ *
> >+ * guc_update_doorbell_id(guc, client, db_id);
> >+ */
>
> What future is the "not yet" part referring to? What are the other clients?
No idea. The code is designed around the premise that users could
allocate guc contexts and do direct submission on their private
channels. That may be more appropriate in a bufferless world, but not
yet.
> > DRM_DEBUG_DRIVER("new priority %u client %p for engine(s) 0x%x: ctx_index %u\n",
> > priority, client, client->engines, client->ctx_index);
> >@@ -1484,6 +1483,9 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
> > struct intel_guc *guc = &dev_priv->guc;
> > struct i915_vma *vma;
> >
> >+ if (!HAS_GUC_SCHED(dev_priv))
> >+ return 0;
>
> Why did you have to add this hunk? I think this function does not
> get called unless there is a GuC.
I too thought that it would not be called without a GuC.
> > int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
> > {
> > struct intel_guc *guc = &dev_priv->guc;
> >- struct drm_i915_gem_request *request;
> >- struct i915_guc_client *client;
> >+ struct i915_guc_client *client = guc->execbuf_client;
> > struct intel_engine_cs *engine;
> > enum intel_engine_id id;
> >
> >- /* client for execbuf submission */
> >- client = guc_client_alloc(dev_priv,
> >- INTEL_INFO(dev_priv)->ring_mask,
> >- GUC_CTX_PRIORITY_KMD_NORMAL,
> >- dev_priv->kernel_context);
> >- if (!client) {
> >- DRM_ERROR("Failed to create normal GuC client!\n");
> >- return -ENOMEM;
> >- }
> >+ if (!client)
> >+ return -ENODEV;
> >
> >- guc->execbuf_client = client;
> >+ guc_reset_wq(client);
> > host2guc_sample_forcewake(guc, client);
> > guc_init_doorbell_hw(guc);
> >
> > /* Take over from manual control of ELSP (execlists) */
> > for_each_engine(engine, dev_priv, id) {
> >+ struct drm_i915_gem_request *rq;
> >+
> > engine->submit_request = i915_guc_submit;
> > engine->schedule = NULL;
> >
> > /* Replay the current set of previously submitted requests */
> >- list_for_each_entry(request,
> >- &engine->timeline->requests, link) {
> >+ list_for_each_entry(rq, &engine->timeline->requests, link) {
> > client->wq_rsvd += sizeof(struct guc_wq_item);
> >- if (i915_sw_fence_done(&request->submit))
> >- i915_guc_submit(request);
>
> i915_sw_fence_done check is not needed because only submit-ready
> requests can be on the engine timeline?
More than that: only requests that have not only passed the submit
fence but also had their execute fence signaled can be on the
engine/execution timeline. (The distinction becomes more interesting
when the sw scheduler delays the execute.)
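A toy model of that two-stage lifecycle (illustrative names, not the
driver's): a request lands on the engine timeline only once both fences
have signaled, which is why the replay loop needs no done-check.

```c
#include <assert.h>

enum toy_stage {
	TOY_QUEUED,		/* neither fence signaled */
	TOY_SUBMIT_SIGNALED,	/* submit fence done, execute pending */
	TOY_EXECUTE_SIGNALED,	/* both fences done */
};

struct toy_request {
	enum toy_stage stage;
	int on_engine_timeline;
};

/* Moving onto the engine timeline requires both fences signaled. */
static int toy_move_to_engine_timeline(struct toy_request *rq)
{
	if (rq->stage != TOY_EXECUTE_SIGNALED)
		return -1;
	rq->on_engine_timeline = 1;
	return 0;
}
```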
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset
2016-11-28 14:02 ` Tvrtko Ursulin
@ 2016-11-28 14:19 ` Chris Wilson
2016-11-28 15:20 ` Mika Kuoppala
2016-11-28 15:55 ` Tvrtko Ursulin
0 siblings, 2 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-28 14:19 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Mon, Nov 28, 2016 at 02:02:33PM +0000, Tvrtko Ursulin wrote:
>
> On 25/11/2016 09:30, Chris Wilson wrote:
> >Something I missed before sending off the partial series was that the
> >non-scheduler guc reset path was broken (in the full series, this is
> >pushed to the execlists reset handler). The issue is that after a reset,
> >we have to refill the GuC workqueues, which we do by resubmitting the
> >requests. However, if we already have submitted them, the fences within
> >them have already been used and triggering them again is an error.
> >Instead, just repopulate the guc workqueue.
> >
> >[ 115.858560] [IGT] gem_busy: starting subtest hang-render
> >[ 135.839867] [drm] GPU HANG: ecode 9:0:0xe757fefe, in gem_busy [1716], reason: Hang on render ring, action: reset
> >[ 135.839902] drm/i915: Resetting chip after gpu hang
> >[ 135.839957] [drm] RC6 on
> >[ 135.858351] ------------[ cut here ]------------
> >[ 135.858357] WARNING: CPU: 2 PID: 45 at drivers/gpu/drm/i915/i915_sw_fence.c:108 i915_sw_fence_complete+0x25/0x30
> >[ 135.858357] Modules linked in: rfcomm bnep binfmt_misc nls_iso8859_1 input_leds snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core btusb btrtl snd_hwdep snd_pcm 8250_dw snd_seq_midi hid_lenovo snd_seq_midi_event snd_rawmidi iwlwifi x86_pkg_temp_thermal coretemp snd_seq crct10dif_pclmul snd_seq_device hci_uart snd_timer crc32_pclmul ghash_clmulni_intel idma64 aesni_intel virt_dma btbcm snd btqca aes_x86_64 btintel lrw cfg80211 bluetooth gf128mul glue_helper ablk_helper cryptd soundcore intel_lpss_pci intel_pch_thermal intel_lpss_acpi intel_lpss acpi_als mfd_core kfifo_buf acpi_pad industrialio autofs4 hid_plantronics usbhid dm_mirror dm_region_hash dm_log sdhci_pci ahci sdhci libahci i2c_hid hid
> >[ 135.858389] CPU: 2 PID: 45 Comm: kworker/2:1 Tainted: G W 4.9.0-rc4+ #238
> >[ 135.858389] Hardware name: /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
> >[ 135.858392] Workqueue: events_long i915_hangcheck_elapsed
> >[ 135.858394] ffffc900001bf9b8 ffffffff812bb238 0000000000000000 0000000000000000
> >[ 135.858396] ffffc900001bf9f8 ffffffff8104f621 0000006c00000000 ffff8808296137f8
> >[ 135.858398] 0000000000000a00 ffff8808457a0000 ffff880845764e60 ffff880845760000
> >[ 135.858399] Call Trace:
> >[ 135.858403] [<ffffffff812bb238>] dump_stack+0x4d/0x65
> >[ 135.858405] [<ffffffff8104f621>] __warn+0xc1/0xe0
> >[ 135.858406] [<ffffffff8104f748>] warn_slowpath_null+0x18/0x20
> >[ 135.858408] [<ffffffff813f8c15>] i915_sw_fence_complete+0x25/0x30
> >[ 135.858410] [<ffffffff813f8fad>] i915_sw_fence_commit+0xd/0x30
> >[ 135.858412] [<ffffffff8142e591>] __i915_gem_request_submit+0xe1/0xf0
> >[ 135.858413] [<ffffffff8142e5c8>] i915_gem_request_submit+0x28/0x40
> >[ 135.858415] [<ffffffff814433e7>] i915_guc_submit+0x47/0x210
> >[ 135.858417] [<ffffffff81443e98>] i915_guc_submission_enable+0x468/0x540
> >[ 135.858419] [<ffffffff81442495>] intel_guc_setup+0x715/0x810
> >[ 135.858421] [<ffffffff8142b6b4>] i915_gem_init_hw+0x114/0x2a0
> >[ 135.858423] [<ffffffff813eeaa8>] i915_reset+0xe8/0x120
> >[ 135.858424] [<ffffffff813f3937>] i915_reset_and_wakeup+0x157/0x180
> >[ 135.858426] [<ffffffff813f79db>] i915_handle_error+0x1ab/0x230
> >[ 135.858428] [<ffffffff812c760d>] ? scnprintf+0x4d/0x90
> >[ 135.858430] [<ffffffff81435985>] i915_hangcheck_elapsed+0x275/0x3d0
> >[ 135.858432] [<ffffffff810668cf>] process_one_work+0x12f/0x410
> >[ 135.858433] [<ffffffff81066bf3>] worker_thread+0x43/0x4d0
> >[ 135.858435] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
> >[ 135.858436] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
> >[ 135.858438] [<ffffffff8106bbb4>] kthread+0xd4/0xf0
> >[ 135.858440] [<ffffffff8106bae0>] ? kthread_park+0x60/0x60
> >
> >v2: Only resubmit submitted requests
> >v3: Don't forget the pending requests have reserved space.
> >
> >Fixes: d55ac5bf97c6 ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >---
> > drivers/gpu/drm/i915/i915_guc_submission.c | 37 ++++++++++++++++--------------
> > 1 file changed, 20 insertions(+), 17 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >index 00b5fa871644..e12ff881d38d 100644
> >--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >@@ -602,12 +602,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> > }
> >
> > /**
> >- * i915_guc_submit() - Submit commands through GuC
> >+ * __i915_guc_submit() - Submit commands through GuC
> > * @rq: request associated with the commands
> > *
> >- * Return: 0 on success, otherwise an errno.
> >- * (Note: nonzero really shouldn't happen!)
> >- *
> > * The caller must have already called i915_guc_wq_reserve() above with
> > * a result of 0 (success), guaranteeing that there is space in the work
> > * queue for the new request, so enqueuing the item cannot fail.
> >@@ -619,7 +616,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> > * The only error here arises if the doorbell hardware isn't functioning
> > * as expected, which really shouln't happen.
> > */
> >-static void i915_guc_submit(struct drm_i915_gem_request *rq)
> >+static void __i915_guc_submit(struct drm_i915_gem_request *rq)
> > {
> > struct drm_i915_private *dev_priv = rq->i915;
> > struct intel_engine_cs *engine = rq->engine;
> >@@ -628,17 +625,6 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
> > struct i915_guc_client *client = guc->execbuf_client;
> > int b_ret;
> >
> >- /* We keep the previous context alive until we retire the following
> >- * request. This ensures that any the context object is still pinned
> >- * for any residual writes the HW makes into it on the context switch
> >- * into the next object following the breadcrumb. Otherwise, we may
> >- * retire the context too early.
> >- */
> >- rq->previous_context = engine->last_context;
> >- engine->last_context = rq->ctx;
> >-
> >- i915_gem_request_submit(rq);
> >-
> > spin_lock(&client->wq_lock);
> > guc_wq_item_append(client, rq);
> >
> >@@ -658,6 +644,23 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
> > spin_unlock(&client->wq_lock);
> > }
> >
> >+static void i915_guc_submit(struct drm_i915_gem_request *rq)
> >+{
> >+ struct intel_engine_cs *engine = rq->engine;
> >+
> >+ /* We keep the previous context alive until we retire the following
> >+ * request. This ensures that any the context object is still pinned
> >+ * for any residual writes the HW makes into it on the context switch
> >+ * into the next object following the breadcrumb. Otherwise, we may
> >+ * retire the context too early.
> >+ */
> >+ rq->previous_context = engine->last_context;
> >+ engine->last_context = rq->ctx;
> >+
> >+ i915_gem_request_submit(rq);
> >+ __i915_guc_submit(rq);
> >+}
> >+
> > /*
> > * Everything below here is concerned with setup & teardown, and is
> > * therefore not part of the somewhat time-critical batch-submission
> >@@ -1556,7 +1559,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
> > /* Replay the current set of previously submitted requests */
> > list_for_each_entry(rq, &engine->timeline->requests, link) {
> > client->wq_rsvd += sizeof(struct guc_wq_item);
> >- i915_guc_submit(rq);
> >+ __i915_guc_submit(rq);
> > }
> > }
> >
> >
>
> The only thing bothering me here is how this interacts with the
> common, or "execlist" part of the reset.
Also applies to ringbuffer (some parts at least).
> Copying Mika in hope he can help here. :)
>
> I see that i915_gem_reset_engine runs before this GuC wq refill and
> does some magic with the requests on the engine->timeline->requests
> lists.
>
> Would it be possible to somehow better integrate the two things?
I thought they were nicely separated. The common work to disable the
guilty context is handled by the core reset function; the specific work
required to restart and replay the requests is handled by the different
backends.

i915_gem_reset_engine: disables the guilty context
engine->reset: updates the context/ring for restarting past a hung
batch (if required)
engine->init: starts the engine, processing the pending requests, if any

engine->reset is a bad name, since we have per-engine resets as well,
but I haven't found a better verb yet for handling the request bumping
after a reset.
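The split could be sketched as a toy model (function names are
paraphrases of the hooks above, not the driver's exact symbols):

```c
#include <assert.h>

/* Toy model of the reset split: common code disables the guilty
 * context, then the backend-specific hooks fix up state and restart. */
struct toy_engine {
	int guilty_disabled;
	int fixed_up;
	int restarted;
	void (*reset)(struct toy_engine *);	/* backend fixup */
	void (*init)(struct toy_engine *);	/* restart + replay */
};

/* Models the common reset path calling into the backend hooks. */
static void toy_common_reset(struct toy_engine *e)
{
	e->guilty_disabled = 1;	/* i915_gem_reset_engine's common job */
	e->reset(e);
	e->init(e);
}

/* Example backend (GuC or execlists would each supply their own). */
static void toy_backend_reset(struct toy_engine *e) { e->fixed_up = 1; }
static void toy_backend_init(struct toy_engine *e)  { e->restarted = 1; }
```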
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset
2016-11-28 14:19 ` Chris Wilson
@ 2016-11-28 15:20 ` Mika Kuoppala
2016-11-28 15:55 ` Tvrtko Ursulin
1 sibling, 0 replies; 39+ messages in thread
From: Mika Kuoppala @ 2016-11-28 15:20 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin; +Cc: intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> On Mon, Nov 28, 2016 at 02:02:33PM +0000, Tvrtko Ursulin wrote:
>>
>> On 25/11/2016 09:30, Chris Wilson wrote:
>> >Something I missed before sending off the partial series was that the
>> >non-scheduler guc reset path was broken (in the full series, this is
>> >pushed to the execlists reset handler). The issue is that after a reset,
>> >we have to refill the GuC workqueues, which we do by resubmitting the
>> >requests. However, if we already have submitted them, the fences within
>> >them have already been used and triggering them again is an error.
>> >Instead, just repopulate the guc workqueue.
>> >
>> >[ 115.858560] [IGT] gem_busy: starting subtest hang-render
>> >[ 135.839867] [drm] GPU HANG: ecode 9:0:0xe757fefe, in gem_busy [1716], reason: Hang on render ring, action: reset
>> >[ 135.839902] drm/i915: Resetting chip after gpu hang
>> >[ 135.839957] [drm] RC6 on
>> >[ 135.858351] ------------[ cut here ]------------
>> >[ 135.858357] WARNING: CPU: 2 PID: 45 at drivers/gpu/drm/i915/i915_sw_fence.c:108 i915_sw_fence_complete+0x25/0x30
>> >[ 135.858357] Modules linked in: rfcomm bnep binfmt_misc nls_iso8859_1 input_leds snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core btusb btrtl snd_hwdep snd_pcm 8250_dw snd_seq_midi hid_lenovo snd_seq_midi_event snd_rawmidi iwlwifi x86_pkg_temp_thermal coretemp snd_seq crct10dif_pclmul snd_seq_device hci_uart snd_timer crc32_pclmul ghash_clmulni_intel idma64 aesni_intel virt_dma btbcm snd btqca aes_x86_64 btintel lrw cfg80211 bluetooth gf128mul glue_helper ablk_helper cryptd soundcore intel_lpss_pci intel_pch_thermal intel_lpss_acpi intel_lpss acpi_als mfd_core kfifo_buf acpi_pad industrialio autofs4 hid_plantronics usbhid dm_mirror dm_region_hash dm_log sdhci_pci ahci sdhci libahci i2c_hid hid
>> >[ 135.858389] CPU: 2 PID: 45 Comm: kworker/2:1 Tainted: G W 4.9.0-rc4+ #238
>> >[ 135.858389] Hardware name: /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
>> >[ 135.858392] Workqueue: events_long i915_hangcheck_elapsed
>> >[ 135.858394] ffffc900001bf9b8 ffffffff812bb238 0000000000000000 0000000000000000
>> >[ 135.858396] ffffc900001bf9f8 ffffffff8104f621 0000006c00000000 ffff8808296137f8
>> >[ 135.858398] 0000000000000a00 ffff8808457a0000 ffff880845764e60 ffff880845760000
>> >[ 135.858399] Call Trace:
>> >[ 135.858403] [<ffffffff812bb238>] dump_stack+0x4d/0x65
>> >[ 135.858405] [<ffffffff8104f621>] __warn+0xc1/0xe0
>> >[ 135.858406] [<ffffffff8104f748>] warn_slowpath_null+0x18/0x20
>> >[ 135.858408] [<ffffffff813f8c15>] i915_sw_fence_complete+0x25/0x30
>> >[ 135.858410] [<ffffffff813f8fad>] i915_sw_fence_commit+0xd/0x30
>> >[ 135.858412] [<ffffffff8142e591>] __i915_gem_request_submit+0xe1/0xf0
>> >[ 135.858413] [<ffffffff8142e5c8>] i915_gem_request_submit+0x28/0x40
>> >[ 135.858415] [<ffffffff814433e7>] i915_guc_submit+0x47/0x210
>> >[ 135.858417] [<ffffffff81443e98>] i915_guc_submission_enable+0x468/0x540
>> >[ 135.858419] [<ffffffff81442495>] intel_guc_setup+0x715/0x810
>> >[ 135.858421] [<ffffffff8142b6b4>] i915_gem_init_hw+0x114/0x2a0
>> >[ 135.858423] [<ffffffff813eeaa8>] i915_reset+0xe8/0x120
>> >[ 135.858424] [<ffffffff813f3937>] i915_reset_and_wakeup+0x157/0x180
>> >[ 135.858426] [<ffffffff813f79db>] i915_handle_error+0x1ab/0x230
>> >[ 135.858428] [<ffffffff812c760d>] ? scnprintf+0x4d/0x90
>> >[ 135.858430] [<ffffffff81435985>] i915_hangcheck_elapsed+0x275/0x3d0
>> >[ 135.858432] [<ffffffff810668cf>] process_one_work+0x12f/0x410
>> >[ 135.858433] [<ffffffff81066bf3>] worker_thread+0x43/0x4d0
>> >[ 135.858435] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
>> >[ 135.858436] [<ffffffff81066bb0>] ? process_one_work+0x410/0x410
>> >[ 135.858438] [<ffffffff8106bbb4>] kthread+0xd4/0xf0
>> >[ 135.858440] [<ffffffff8106bae0>] ? kthread_park+0x60/0x60
>> >
>> >v2: Only resubmit submitted requests
>> >v3: Don't forget the pending requests have reserved space.
>> >
>> >Fixes: d55ac5bf97c6 ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
>> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> >Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> >---
>> > drivers/gpu/drm/i915/i915_guc_submission.c | 37 ++++++++++++++++--------------
>> > 1 file changed, 20 insertions(+), 17 deletions(-)
>> >
>> >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> >index 00b5fa871644..e12ff881d38d 100644
>> >--- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> >@@ -602,12 +602,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>> > }
>> >
>> > /**
>> >- * i915_guc_submit() - Submit commands through GuC
>> >+ * __i915_guc_submit() - Submit commands through GuC
>> > * @rq: request associated with the commands
>> > *
>> >- * Return: 0 on success, otherwise an errno.
>> >- * (Note: nonzero really shouldn't happen!)
>> >- *
>> > * The caller must have already called i915_guc_wq_reserve() above with
>> > * a result of 0 (success), guaranteeing that there is space in the work
>> > * queue for the new request, so enqueuing the item cannot fail.
>> >@@ -619,7 +616,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>> > * The only error here arises if the doorbell hardware isn't functioning
>> > * as expected, which really shouln't happen.
>> > */
>> >-static void i915_guc_submit(struct drm_i915_gem_request *rq)
>> >+static void __i915_guc_submit(struct drm_i915_gem_request *rq)
>> > {
>> > struct drm_i915_private *dev_priv = rq->i915;
>> > struct intel_engine_cs *engine = rq->engine;
>> >@@ -628,17 +625,6 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
>> > struct i915_guc_client *client = guc->execbuf_client;
>> > int b_ret;
>> >
>> >- /* We keep the previous context alive until we retire the following
>> >- * request. This ensures that any the context object is still pinned
>> >- * for any residual writes the HW makes into it on the context switch
>> >- * into the next object following the breadcrumb. Otherwise, we may
>> >- * retire the context too early.
>> >- */
>> >- rq->previous_context = engine->last_context;
>> >- engine->last_context = rq->ctx;
>> >-
>> >- i915_gem_request_submit(rq);
>> >-
>> > spin_lock(&client->wq_lock);
>> > guc_wq_item_append(client, rq);
>> >
>> >@@ -658,6 +644,23 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
>> > spin_unlock(&client->wq_lock);
>> > }
>> >
>> >+static void i915_guc_submit(struct drm_i915_gem_request *rq)
>> >+{
>> >+ struct intel_engine_cs *engine = rq->engine;
>> >+
>> >+ /* We keep the previous context alive until we retire the following
>> >+ * request. This ensures that any the context object is still pinned
>> >+ * for any residual writes the HW makes into it on the context switch
>> >+ * into the next object following the breadcrumb. Otherwise, we may
>> >+ * retire the context too early.
>> >+ */
>> >+ rq->previous_context = engine->last_context;
>> >+ engine->last_context = rq->ctx;
>> >+
>> >+ i915_gem_request_submit(rq);
>> >+ __i915_guc_submit(rq);
>> >+}
>> >+
>> > /*
>> > * Everything below here is concerned with setup & teardown, and is
>> > * therefore not part of the somewhat time-critical batch-submission
>> >@@ -1556,7 +1559,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>> > /* Replay the current set of previously submitted requests */
>> > list_for_each_entry(rq, &engine->timeline->requests, link) {
>> > client->wq_rsvd += sizeof(struct guc_wq_item);
>> >- i915_guc_submit(rq);
>> >+ __i915_guc_submit(rq);
>> > }
>> > }
>> >
>> >
>>
>> The only thing bothering me here is how this interacts with the
>> common, or "execlist" part of the reset.
>
> Also applies to ringbuffer (some parts at least).
>
>> Copying Mika in hope he can help here. :)
>>
>> I see that i915_gem_reset_engine runs before this GuC wq refill and
>> does some magic with the requests on the engine->timeline->requests
>> lists.
>>
>> Would it be possible to somehow better integrate the two things?
>
> I thought they were nicely separated. The common work to disable the
> guilty context is handled by the core reset function. The specific work
> required to restart and replay the requests is handled by the different
> backends.
>
> i915_gem_reset_engine: is for disabling the guilty context
> engine->reset: updates the context / ring for restarting past a hung
> batch (if required)
> engine->init: starts the engine, processing the pending requests if any
>
> engine->reset is a bad name, since we have per-engine resets as well,
> but I haven't a better verb yet for handling the request bumping after
> a reset.
> -Chris
How about
engine->setup(request *) to replace ->reset
and engine->start() to replace ->init()?
-Mika
>
> --
> Chris Wilson, Intel Open Source Technology Centre
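The effect of the split in this patch can be sketched as a toy model in plain, userspace C. The toy_* names and structs below are invented, only the shape mirrors i915_guc_submit()/__i915_guc_submit(): the outer function signals the submit fence exactly once, the inner half only appends to the workqueue, so the post-reset replay can refill the GuC workqueue without re-triggering an already-used fence (the WARN in the quoted trace).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the patch's split (illustrative only, not driver code):
 * a request's submit fence may be signalled only once, but its
 * workqueue item can be re-appended after a reset.
 */
struct toy_request {
	int fence_signals;	/* i915_sw_fence warns if this exceeds 1 */
	int wq_items;		/* items appended to the GuC workqueue */
};

/* Models __i915_guc_submit(): hw-only half, safe to call again on replay */
void toy__guc_submit(struct toy_request *rq)
{
	rq->wq_items++;		/* guc_wq_item_append() + ring the doorbell */
}

/* Models i915_guc_submit(): signals the submit fence, then queues to hw */
void toy_guc_submit(struct toy_request *rq)
{
	rq->fence_signals++;	/* i915_gem_request_submit() */
	toy__guc_submit(rq);
}

/* Post-reset replay: refill the workqueue without touching fences */
void toy_replay(struct toy_request *rq)
{
	toy__guc_submit(rq);
}

bool toy_fence_ok(const struct toy_request *rq)
{
	return rq->fence_signals <= 1;	/* the WARN in the trace fired here */
}
```

With the v1 code, replay went through the fence-signalling path and toy_fence_ok() would fail after a reset; with the split it holds.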
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset
2016-11-28 14:11 ` Chris Wilson
@ 2016-11-28 15:44 ` Tvrtko Ursulin
2016-11-28 16:01 ` Chris Wilson
0 siblings, 1 reply; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 15:44 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 28/11/2016 14:11, Chris Wilson wrote:
> On Mon, Nov 28, 2016 at 01:49:03PM +0000, Tvrtko Ursulin wrote:
>>
>> On 25/11/2016 09:30, Chris Wilson wrote:
>>> In order to avoid some complexity in trying to reconstruct the
>>> workqueues across reset, remember them instead. The issue comes when we
>>> have to handle a reset between request allocation and submission, the
>>> request has reserved space in the wq, but is not in any list so we fail
>>> to restore the reserved space. By keeping the execbuf client intact
>>> across the reset, we also keep the reservations.
>>
>> I lost track a bit of why we need to reserve the space at request
>> creation time. Is it not becoming a bit cumbersome?
>
> It is very, very hard to handle a failure. We have to be careful not to
> alter global state prior to executing the request, or at least
> submitting the request, which we are currently not. Since we can't
> unwind the global state changes, that imposes a point-of-no-return on
> request construction after which, the request must be submitted. (It is
> possible though to detect when a request doesn't make any global state
> changes and drop the request on add.) As the reservation may fail, we
> have to do that early.
Couldn't we just skip the ring buffer writing at execbuf time entirely,
add the request to the appropriate timeline, and do all of that later,
when the scheduler decides it is time to actually submit it?
>>> @@ -883,8 +877,13 @@ guc_client_alloc(struct drm_i915_private *dev_priv,
>>>
>>> guc_proc_desc_init(guc, client);
>>> guc_ctx_desc_init(guc, client);
>>> - if (guc_init_doorbell(guc, client, db_id))
>>> - goto err;
>>> +
>>> + /* For runtime client allocation we need to enable the doorbell. Not
>>> + * required yet for the static execbuf_client as this special kernel
>>> + * client is enabled from i915_guc_submission_enable().
>>> + *
>>> + * guc_update_doorbell_id(guc, client, db_id);
>>> + */
>>
>> What future is the "not yet" part referring to? What are the other clients?
>
> No idea. The code is designed around the premise that users could
> allocate guc contexts and do direct submission on their private
> channels. That may be more appropriate in a bufferless world, but not
> yet.
>
>>> DRM_DEBUG_DRIVER("new priority %u client %p for engine(s) 0x%x: ctx_index %u\n",
>>> priority, client, client->engines, client->ctx_index);
>>> @@ -1484,6 +1483,9 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
>>> struct intel_guc *guc = &dev_priv->guc;
>>> struct i915_vma *vma;
>>>
>>> + if (!HAS_GUC_SCHED(dev_priv))
>>> + return 0;
>>
>> Why did you have to add this hunk? I think this function does not
>> get called unless there is a GuC.
>
> I too thought that it would not called without a guc.
But it is called, then?
>>> int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
>>> {
>>> struct intel_guc *guc = &dev_priv->guc;
>>> - struct drm_i915_gem_request *request;
>>> - struct i915_guc_client *client;
>>> + struct i915_guc_client *client = guc->execbuf_client;
>>> struct intel_engine_cs *engine;
>>> enum intel_engine_id id;
>>>
>>> - /* client for execbuf submission */
>>> - client = guc_client_alloc(dev_priv,
>>> - INTEL_INFO(dev_priv)->ring_mask,
>>> - GUC_CTX_PRIORITY_KMD_NORMAL,
>>> - dev_priv->kernel_context);
>>> - if (!client) {
>>> - DRM_ERROR("Failed to create normal GuC client!\n");
>>> - return -ENOMEM;
>>> - }
>>> + if (!client)
>>> + return -ENODEV;
>>>
>>> - guc->execbuf_client = client;
>>> + guc_reset_wq(client);
>>> host2guc_sample_forcewake(guc, client);
>>> guc_init_doorbell_hw(guc);
>>>
>>> /* Take over from manual control of ELSP (execlists) */
>>> for_each_engine(engine, dev_priv, id) {
>>> + struct drm_i915_gem_request *rq;
>>> +
>>> engine->submit_request = i915_guc_submit;
>>> engine->schedule = NULL;
>>>
>>> /* Replay the current set of previously submitted requests */
>>> - list_for_each_entry(request,
>>> - &engine->timeline->requests, link) {
>>> + list_for_each_entry(rq, &engine->timeline->requests, link) {
>>> client->wq_rsvd += sizeof(struct guc_wq_item);
>>> - if (i915_sw_fence_done(&request->submit))
>>> - i915_guc_submit(request);
>>
>> i915_sw_fence_done check is not needed because only submit-ready
>> requests can be on the engine timeline?
>
> More so, only requests that have not only passed the submit fence but
> also have the execute fence signaled can be on the engine/execution
> timeline. (Distinction is more interesting when the sw scheduler delays
> the execute.)
Hm yes, that is correct.
Patch itself looked fine to me. It was just the complexity which made me
wonder whether we couldn't have taken a different route. But that was a
long time ago, so it is not relevant to this patch anyway.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
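The invariant agreed on above (only requests past both the submit and the execute fences sit on the engine/execution timeline, so the replay loop needs no i915_sw_fence_done() check) can be modelled in a few lines of plain C. The toy_* names are invented; this is only a sketch of the gating, not driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two-fence gating (illustrative only): a request is
 * moved onto the engine timeline only once both its submit and execute
 * fences are done, so everything the replay loop walks is guaranteed
 * submit-ready.
 */
struct toy_req {
	bool submit_done;
	bool execute_done;
	bool on_engine_timeline;
};

/* Only this transition ever puts a request on the engine timeline. */
bool toy_move_to_timeline(struct toy_req *rq)
{
	if (!rq->submit_done || !rq->execute_done)
		return false;
	rq->on_engine_timeline = true;
	return true;
}

/* The invariant the replay loop relies on. */
bool toy_replay_safe(const struct toy_req *rq)
{
	return !rq->on_engine_timeline || rq->submit_done;
}
```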
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset
2016-11-28 14:19 ` Chris Wilson
2016-11-28 15:20 ` Mika Kuoppala
@ 2016-11-28 15:55 ` Tvrtko Ursulin
2016-11-28 16:06 ` Chris Wilson
1 sibling, 1 reply; 39+ messages in thread
From: Tvrtko Ursulin @ 2016-11-28 15:55 UTC (permalink / raw)
To: Chris Wilson, intel-gfx, Mika Kuoppala
On 28/11/2016 14:19, Chris Wilson wrote:
> On Mon, Nov 28, 2016 at 02:02:33PM +0000, Tvrtko Ursulin wrote:
>>
>> On 25/11/2016 09:30, Chris Wilson wrote:
>>> Something I missed before sending off the partial series was that the
>>> non-scheduler guc reset path was broken (in the full series, this is
>>> pushed to the execlists reset handler). The issue is that after a reset,
>>> we have to refill the GuC workqueues, which we do by resubmitting the
>>> requests. However, if we already have submitted them, the fences within
>>> them have already been used and triggering them again is an error.
>>> Instead, just repopulate the guc workqueue.
>>>
>>> [snip - GPU hang backtrace quoted in full earlier in the thread]
>>> v2: Only resubmit submitted requests
>>> v3: Don't forget the pending requests have reserved space.
>>>
>>> Fixes: d55ac5bf97c6 ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_guc_submission.c | 37 ++++++++++++++++--------------
>>> 1 file changed, 20 insertions(+), 17 deletions(-)
>>>
>>> [snip - patch quoted in full earlier in the thread]
>>
>> The only thing bothering me here is how this interacts with the
>> common, or "execlist" part of the reset.
>
> Also applies to ringbuffer (some parts at least).
>
>> Copying Mika in hope he can help here. :)
>>
>> I see that i915_gem_reset_engine runs before this GuC wq refill and
>> does some magic with the requests on the engine->timeline->requests
>> lists.
>>
>> Would it be possible to somehow better integrate the two things?
>
> I thought they were nicely separated. The common work to disable the
> guilty context is handled by the core reset function. The specific work
> required to restart and replay the requests is handled by the different
> backends.
>
> i915_gem_reset_engine: is for disabling the guilty context
> engine->reset: updates the context / ring for restarting past a hung
> batch (if required)
> engine->init: starts the engine, processing the pending requests if any
>
> engine->reset is a bad name, since we have per-engine resets as well,
> but I haven't a better verb yet for handling the request bumping after a
> reset.
Yes, but the final part of the reset when the GuC is enabled is not in
any of those, and is not even called from i915_gem_reset. It happens
later, via i915_gem_init_hw, in i915_guc_submission_enable.
I was wondering if that is the most logical way of doing it - does it
belong with the GEM reset, or the hardware reset?
For example, could we, for the GuC, not use reset_common_ring but add a
guc_reset_engine (incidentally, why is it named "ring" when we moved to
engines?).
But I see that the ordering of operations would make that very
problematic, since the GuC is not set up until later, following a reset.
So it all becomes very complicated, and since it also predates this
patch it doesn't matter. It just confused me a little, and probably will
again in the future. :)
So..
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
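The ordering Tvrtko describes can be sketched as a toy call sequence in plain C. The toy_* helpers are invented and merely record the order of the real entry points seen in the backtrace (i915_gem_reset, i915_gem_init_hw, i915_guc_submission_enable); the point is that the GuC replay is reachable only from the hardware re-init, not from the GEM reset.

```c
#include <assert.h>
#include <string.h>

/* Toy sketch of the reset ordering discussed above (simplified). */
static char toy_log[128];

static void toy_step(const char *name)
{
	strcat(toy_log, name);	/* record the order of the phases */
	strcat(toy_log, ";");
}

void toy_i915_reset(void)
{
	/* 1. common/GEM part: disable the guilty context, bump requests */
	toy_step("gem_reset");
	/* 2. reinitialise the hardware */
	toy_step("init_hw");
	/* 3. only now, as part of hw init, does the GuC come back up and
	 * replay its workqueue; it is not reachable from gem_reset */
	toy_step("guc_enable");
}
```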
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset
2016-11-28 15:44 ` Tvrtko Ursulin
@ 2016-11-28 16:01 ` Chris Wilson
0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-28 16:01 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Mon, Nov 28, 2016 at 03:44:41PM +0000, Tvrtko Ursulin wrote:
>
> On 28/11/2016 14:11, Chris Wilson wrote:
> >On Mon, Nov 28, 2016 at 01:49:03PM +0000, Tvrtko Ursulin wrote:
> >>
> >>On 25/11/2016 09:30, Chris Wilson wrote:
> >>>In order to avoid some complexity in trying to reconstruct the
> >>>workqueues across reset, remember them instead. The issue comes when we
> >>>have to handle a reset between request allocation and submission, the
> >>>request has reserved space in the wq, but is not in any list so we fail
> >>>to restore the reserved space. By keeping the execbuf client intact
> >>>across the reset, we also keep the reservations.
> >>
> >>I lost track a bit of why we need to reserve the space at request
> >>creation time. Is it not becoming a bit cumbersome?
> >
> >It is very, very hard to handle a failure. We have to be careful not to
> >alter global state prior to executing the request, or at least
> >submitting the request, which we are currently not. Since we can't
> >unwind the global state changes, that imposes a point-of-no-return on
> >request construction after which, the request must be submitted. (It is
> >possible though to detect when a request doesn't make any global state
> >changes and drop the request on add.) As the reservation may fail, we
> >have to do that early.
>
> Couldn't we just skip the ring buffer writing at execbuf time
> entirely, add the request to the appropriate timeline, and do all of
> that later, when the scheduler decides it is time to actually submit it?
It's the fence/request tracking that is the hard part, which ties into
the active tracking required to hold the GTT reservation for the request
(among many other external factors).
> >>>@@ -1484,6 +1483,9 @@ int i915_guc_submission_init(struct drm_i915_private *dev_priv)
> >>> struct intel_guc *guc = &dev_priv->guc;
> >>> struct i915_vma *vma;
> >>>
> >>>+ if (!HAS_GUC_SCHED(dev_priv))
> >>>+ return 0;
> >>
> >>Why did you have to add this hunk? I think this function does not
> >>get called unless there is a GuC.
> >
> >I too thought that it would not called without a guc.
>
> But it is called, then?
Yes. It was clobbering hw state on gen3-gen5 (CI had an ilk fail, I had
a pnv fail).
[snip]
> Patch itself looked fine to me. It was just the complexity which made
> me wonder whether we couldn't have taken a different route. But that
> was a long time ago, so it is not relevant to this patch anyway.
It's still a valid question. I think the current no-fail request
handling is the simpler approach atm, but it is almost certain that at
some point we may need a new scheme (either for greater concurrency,
late changes to the requests, or something even more wacky). I'd just
like to fix the last few remaining bugs in the state handling (legacy
switch context) before embarking on a new journey.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
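Chris's point-of-no-return argument above can be sketched as a toy accounting model in plain C. The toy_* names and the 16-byte item size are invented; only the pattern mirrors i915_guc_wq_reserve()/guc_wq_item_append(): the only fallible step runs at request-creation time, before any global state has changed, and submission merely consumes space that is already guaranteed. Keeping the client (and so wq_rsvd) alive across a reset is what preserves these outstanding reservations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model of the reserve-early pattern (not driver code). */
struct toy_client {
	uint32_t wq_size;	/* total workqueue space */
	uint32_t wq_used;	/* space taken by queued items */
	uint32_t wq_rsvd;	/* space promised to in-flight requests */
};

#define TOY_WQ_ITEM_SIZE 16u

/* May fail: runs at request-creation time, while failure is still
 * trivial to unwind. */
bool toy_wq_reserve(struct toy_client *c)
{
	if (c->wq_used + c->wq_rsvd + TOY_WQ_ITEM_SIZE > c->wq_size)
		return false;
	c->wq_rsvd += TOY_WQ_ITEM_SIZE;
	return true;
}

/* Cannot fail: converts a prior reservation into a queued item. */
void toy_wq_append(struct toy_client *c)
{
	assert(c->wq_rsvd >= TOY_WQ_ITEM_SIZE);
	c->wq_rsvd -= TOY_WQ_ITEM_SIZE;
	c->wq_used += TOY_WQ_ITEM_SIZE;
}
```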
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset
2016-11-28 15:55 ` Tvrtko Ursulin
@ 2016-11-28 16:06 ` Chris Wilson
0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-11-28 16:06 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Mon, Nov 28, 2016 at 03:55:42PM +0000, Tvrtko Ursulin wrote:
>
> On 28/11/2016 14:19, Chris Wilson wrote:
> >On Mon, Nov 28, 2016 at 02:02:33PM +0000, Tvrtko Ursulin wrote:
> >>
> >>On 25/11/2016 09:30, Chris Wilson wrote:
> >>>Something I missed before sending off the partial series was that the
> >>>non-scheduler guc reset path was broken (in the full series, this is
> >>>pushed to the execlists reset handler). The issue is that after a reset,
> >>>we have to refill the GuC workqueues, which we do by resubmitting the
> >>>requests. However, if we already have submitted them, the fences within
> >>>them have already been used and triggering them again is an error.
> >>>Instead, just repopulate the guc workqueue.
> >>>
> >>>[snip - GPU hang backtrace quoted in full earlier in the thread]
> >>>
> >>>v2: Only resubmit submitted requests
> >>>v3: Don't forget the pending requests have reserved space.
> >>>
> >>>Fixes: d55ac5bf97c6 ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
> >>>Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>>Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>---
> >>>drivers/gpu/drm/i915/i915_guc_submission.c | 37 ++++++++++++++++--------------
> >>>1 file changed, 20 insertions(+), 17 deletions(-)
> >>>
> >>>diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>index 00b5fa871644..e12ff881d38d 100644
> >>>--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >>>@@ -602,12 +602,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> >>>}
> >>>
> >>>/**
> >>>- * i915_guc_submit() - Submit commands through GuC
> >>>+ * __i915_guc_submit() - Submit commands through GuC
> >>> * @rq: request associated with the commands
> >>> *
> >>>- * Return: 0 on success, otherwise an errno.
> >>>- * (Note: nonzero really shouldn't happen!)
> >>>- *
> >>> * The caller must have already called i915_guc_wq_reserve() above with
> >>> * a result of 0 (success), guaranteeing that there is space in the work
> >>> * queue for the new request, so enqueuing the item cannot fail.
> >>>@@ -619,7 +616,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> >>> * The only error here arises if the doorbell hardware isn't functioning
> >>> * as expected, which really shouln't happen.
> >>> */
> >>>-static void i915_guc_submit(struct drm_i915_gem_request *rq)
> >>>+static void __i915_guc_submit(struct drm_i915_gem_request *rq)
> >>>{
> >>> struct drm_i915_private *dev_priv = rq->i915;
> >>> struct intel_engine_cs *engine = rq->engine;
> >>>@@ -628,17 +625,6 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
> >>> struct i915_guc_client *client = guc->execbuf_client;
> >>> int b_ret;
> >>>
> >>>- /* We keep the previous context alive until we retire the following
> >>>- * request. This ensures that any the context object is still pinned
> >>>- * for any residual writes the HW makes into it on the context switch
> >>>- * into the next object following the breadcrumb. Otherwise, we may
> >>>- * retire the context too early.
> >>>- */
> >>>- rq->previous_context = engine->last_context;
> >>>- engine->last_context = rq->ctx;
> >>>-
> >>>- i915_gem_request_submit(rq);
> >>>-
> >>> spin_lock(&client->wq_lock);
> >>> guc_wq_item_append(client, rq);
> >>>
> >>>@@ -658,6 +644,23 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
> >>> spin_unlock(&client->wq_lock);
> >>>}
> >>>
> >>>+static void i915_guc_submit(struct drm_i915_gem_request *rq)
> >>>+{
> >>>+ struct intel_engine_cs *engine = rq->engine;
> >>>+
> >>>+ /* We keep the previous context alive until we retire the following
> >>>+ * request. This ensures that the context object is still pinned
> >>>+ * for any residual writes the HW makes into it on the context switch
> >>>+ * into the next object following the breadcrumb. Otherwise, we may
> >>>+ * retire the context too early.
> >>>+ */
> >>>+ rq->previous_context = engine->last_context;
> >>>+ engine->last_context = rq->ctx;
> >>>+
> >>>+ i915_gem_request_submit(rq);
> >>>+ __i915_guc_submit(rq);
> >>>+}
> >>>+
> >>>/*
> >>> * Everything below here is concerned with setup & teardown, and is
> >>> * therefore not part of the somewhat time-critical batch-submission
> >>>@@ -1556,7 +1559,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
> >>> /* Replay the current set of previously submitted requests */
> >>> list_for_each_entry(rq, &engine->timeline->requests, link) {
> >>> client->wq_rsvd += sizeof(struct guc_wq_item);
> >>>- i915_guc_submit(rq);
> >>>+ __i915_guc_submit(rq);
> >>> }
> >>> }
> >>>
> >>>
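[Stepping outside the quoted patch for a moment: the split above can be modelled as a standalone sketch. This is plain C with hypothetical stand-in types and counters, not kernel code; only the control flow mirrors the i915_guc_submission.c change.]

```c
#include <assert.h>

/* Stand-in for drm_i915_gem_request; only the fields the sketch needs. */
struct request {
	int ctx;
	int submitted;	/* set once by the bookkeeping half */
};

static int doorbells;	/* counts hw submissions (wq item + doorbell ring) */

/* Models __i915_guc_submit(): the hardware-only half. Safe to call a
 * second time during replay because it touches no request bookkeeping. */
static void hw_submit(struct request *rq)
{
	(void)rq;
	doorbells++;
}

/* Models i915_guc_submit(): bookkeeping (context pinning,
 * i915_gem_request_submit) followed by the hardware half. */
static void submit(struct request *rq)
{
	rq->submitted = 1;
	hw_submit(rq);
}

/* Models the replay loop in i915_guc_submission_enable(): requests still
 * on the timeline were already submitted once, so only the hw half runs. */
static void replay(struct request *rqs, int n)
{
	for (int i = 0; i < n; i++)
		if (rqs[i].submitted)
			hw_submit(&rqs[i]);
}
```

Calling submit() once and then replay() rings the doorbell twice for the same request without repeating the bookkeeping, which is the point of splitting the function before reusing it in the reset path.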
> >>
> >>The only thing bothering me here is how this interacts with the
> >>common, or "execlists", part of the reset.
> >
> >Also applies to ringbuffer (some parts at least).
> >
> >>Copying Mika in hope he can help here. :)
> >>
> >>I see that i915_gem_reset_engine runs before this GuC wq refill and
> >>does some magic with the requests on the engine->timeline->requests
> >>lists.
> >>
> >>Would it be possible to somehow better integrate the two things?
> >
> >I thought they were nicely separated. The common work to disable the
> >guilty context is handled by the core reset function. The specific work
> >required to restart and replay the requests handled by the different
> >backends.
> >
> >i915_gem_reset_engine: is for disabling the guilty context
> >engine->reset: updates the context / ring for restarting past a hung
> >batch (if required)
> >engine->init: starts the engine, processing the pending requests if any
> >
> >engine->reset is a bad name, since we have per-engine resets as well,
> >but I haven't a better verb yet for handling the request bumping after a
> >reset.
>
> Yes, but the final part of the reset, when GuC is enabled, is not in
> any of those and is not even called from i915_gem_reset. It happens
> later via i915_gem_init_hw in i915_guc_submission_enable.
Which is the init_hw...
> I was wondering if that is the most logical way of doing it - does
> it belong with the GEM reset, or the hardware reset?
It's just GuC not fitting in very well.
> For example could we, for GuC, not use reset_common_ring but add
> guc_reset_engine (yes, why did you name it ring when we went to
> engines?).
It was already called ring, I just cut'n'sed. :)
> But I see that the ordering of operations would make it very
> problematic, since GuC is not setup until later following a reset.
>
> So it all becomes very complicated and since it also predates this
> patch it doesn't matter. Just confused me a little bit, and probably
> will again in the future. :)
Yes, I think with care we could remove the GuC special casing. Just
the discontent hasn't yet boiled over into anger with a bug to justify
some gutting.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 39+ messages in thread
end of thread, other threads: [~2016-11-28 16:06 UTC | newest]
Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-25 9:30 GuC vs multiple timelines Chris Wilson
2016-11-25 9:30 ` [PATCH 01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs Chris Wilson
2016-11-25 9:30 ` [PATCH 02/15] drm: Pull together probe + setup for drm_fb_helper Chris Wilson
2016-11-25 9:30 ` [PATCH 03/15] drm: Protect fb_helper list manipulation with a mutex Chris Wilson
2016-11-25 9:30 ` [PATCH 04/15] drm/i915/perf: Wrap 64bit divides in do_div() Chris Wilson
2016-11-25 9:30 ` [PATCH 05/15] drm/i915: Add is-completed assert to request retire entrypoint Chris Wilson
2016-11-25 10:36 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 06/15] drm/i915: Assert no external observers when unwind a failed request alloc Chris Wilson
2016-11-25 10:39 ` Joonas Lahtinen
2016-11-25 11:27 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 07/15] drm/i915: Hold a reference on the request for its fence chain Chris Wilson
2016-11-25 10:31 ` Joonas Lahtinen
2016-11-25 10:46 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 08/15] drm/i915: Integrate i915_sw_fence with debugobjects Chris Wilson
2016-11-25 10:24 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 09/15] drm/i915: Enable swfence debugobject support for i915.ko Chris Wilson
2016-11-25 10:32 ` Joonas Lahtinen
2016-11-25 9:30 ` [PATCH 10/15] HAX drm/i915: Enable guc submission Chris Wilson
2016-11-25 9:30 ` [PATCH 11/15] drm/i915: Trim i915_guc_info() stack usage Chris Wilson
2016-11-28 11:15 ` Tvrtko Ursulin
2016-11-28 11:35 ` Chris Wilson
2016-11-28 12:17 ` Tvrtko Ursulin
2016-11-25 9:30 ` [PATCH 12/15] drm/i915/guc: Rename client->cookie to match use Chris Wilson
2016-11-28 11:37 ` Tvrtko Ursulin
2016-11-25 9:30 ` [PATCH 13/15] drm/i915/guc: Initialise doorbell cookie to matching value Chris Wilson
2016-11-28 12:09 ` Tvrtko Ursulin
2016-11-28 12:18 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 14/15] drm/i915/guc: Keep the execbuf client allocated across reset Chris Wilson
2016-11-28 13:49 ` Tvrtko Ursulin
2016-11-28 14:11 ` Chris Wilson
2016-11-28 15:44 ` Tvrtko Ursulin
2016-11-28 16:01 ` Chris Wilson
2016-11-25 9:30 ` [PATCH 15/15] drm/i915/guc: Split hw submission for replay after GPU reset Chris Wilson
2016-11-28 14:02 ` Tvrtko Ursulin
2016-11-28 14:19 ` Chris Wilson
2016-11-28 15:20 ` Mika Kuoppala
2016-11-28 15:55 ` Tvrtko Ursulin
2016-11-28 16:06 ` Chris Wilson
2016-11-25 10:16 ` ✗ Fi.CI.BAT: warning for series starting with [01/15] drm: Hold mode_config.lock to prevent hotplug whilst setting up crtcs Patchwork