All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [Intel-gfx] [PATCH i-g-t] i915/gem_mmap_gtt: Reset faster and longer to catch fencing errors
Date: Thu, 24 Jan 2019 17:25:10 +0200	[thread overview]
Message-ID: <87va2ekrhl.fsf@gaia.fi.intel.com> (raw)
In-Reply-To: <20190118003238.30588-1-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Performing a GPU reset clobbers the fence registers, affecting which
> addresses the tiled GTT mmap access. If the driver does not take
> precautions across a GPU reset, a client may read the wrong values (but
> only within their own buffer as the fence will only be degraded to
> I915_TILING_NONE, reducing the access area). However, as this requires
> performing a read using the indirect GTT at exactly the same time as the
> reset occurs, it can be quite difficult to catch, so repeat the test
> many times and across all cores simultaneously.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  tests/i915/gem_mmap_gtt.c | 99 +++++++++++++++++++++++++++------------
>  1 file changed, 68 insertions(+), 31 deletions(-)
>
> diff --git a/tests/i915/gem_mmap_gtt.c b/tests/i915/gem_mmap_gtt.c
> index f63535556..21880d31d 100644
> --- a/tests/i915/gem_mmap_gtt.c
> +++ b/tests/i915/gem_mmap_gtt.c
> @@ -38,6 +38,7 @@
>  #include "drm.h"
>  
>  #include "igt.h"
> +#include "igt_sysfs.h"
>  #include "igt_x86.h"
>  
>  #ifndef PAGE_SIZE
> @@ -375,50 +376,86 @@ test_clflush(int fd)
>  static void
>  test_hang(int fd)
>  {
> -	igt_hang_t hang;
> -	uint32_t patterns[] = {
> +	const uint32_t patterns[] = {
>  		0, 0xaaaaaaaa, 0x55555555, 0xcccccccc,
>  	};
> -	uint32_t *gtt[3];
> -	int last_pattern = 0;
> -	int next_pattern = 1;
> -	int i;
> +	const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
> +	struct {
> +		bool done;
> +		bool error;
> +	} *control;
> +	unsigned long count;
> +	igt_hang_t hang;
> +	int dir;
>  
> -	for (i = I915_TILING_NONE; i <= I915_TILING_Y; i++) {
> -		uint32_t handle;
> +	hang = igt_allow_hang(fd, 0, 0);
> +	igt_sysfs_set_parameter(fd, "reset", "1"); /* global resets */

igt_assert to be sure that you made it?

igt_sysfs_set_module_parameter would be less misleadning :P

>  
> -		handle = gem_create(fd, OBJECT_SIZE);
> -		gem_set_tiling(fd, handle, i, 2048);
> +	control = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> +	igt_assert(control != MAP_FAILED);
>  
> -		gtt[i] = gem_mmap__gtt(fd, handle, OBJECT_SIZE, PROT_WRITE);
> -		set_domain_gtt(fd, handle);
> -		gem_close(fd, handle);
> -	}
> +	igt_fork(child, ncpus) {
> +		int last_pattern = 0;
> +		int next_pattern = 1;
> +		uint32_t *gtt[2];

You throw tiling none out as it is just a distraction and
waste of cycles?

>  
> -	hang = igt_hang_ring(fd, I915_EXEC_RENDER);
> +		for (int i = 0; i < ARRAY_SIZE(gtt); i++) {
> +			uint32_t handle;
>  
> -	do {
> -		for (i = 0; i < OBJECT_SIZE / 64; i++) {
> -			int x = 16*i + (i%16);
> +			handle = gem_create(fd, OBJECT_SIZE);
> +			gem_set_tiling(fd, handle, I915_TILING_X + i, 2048);

You could have setup a priori. But this prolly is faster than
one reset cycle of tests so nothing to gain.

>  
> -			igt_assert(gtt[0][x] == patterns[last_pattern]);
> -			igt_assert(gtt[1][x] == patterns[last_pattern]);
> -			igt_assert(gtt[2][x] == patterns[last_pattern]);
> +			gtt[i] = gem_mmap__gtt(fd, handle, OBJECT_SIZE, PROT_WRITE);
> +			set_domain_gtt(fd, handle);
> +			gem_close(fd, handle);
> +		}
>  
> -			gtt[0][x] = patterns[next_pattern];
> -			gtt[1][x] = patterns[next_pattern];
> -			gtt[2][x] = patterns[next_pattern];
> +		while (!READ_ONCE(control->done)) {
> +			for (int i = 0; i < OBJECT_SIZE / 64; i++) {
> +				uint32_t expected = patterns[last_pattern];
> +				uint32_t found[2];
> +				int x = 16*i + (i%16);

nitpicking here for consts and unsigned x.

> +
> +				found[0] = READ_ONCE(gtt[0][x]);
> +				found[1] = READ_ONCE(gtt[1][x]);
> +
> +				if (found[0] != expected ||
> +				    found[1] != expected) {
> +					igt_warn("child[%d] found (%x, %x), expecting %x\n",
> +						 child,
> +						 found[0], found[1],
> +						 expected);
> +					control->error = true;
> +					exit(0);
> +				}
> +
> +				gtt[0][x] = patterns[next_pattern];
> +				gtt[1][x] = patterns[next_pattern];
> +			}
> +
> +			last_pattern = next_pattern;
> +			next_pattern = (next_pattern + 1) % ARRAY_SIZE(patterns);
>  		}
> +	}
>  
> -		last_pattern = next_pattern;
> -		next_pattern = (next_pattern + 1) % ARRAY_SIZE(patterns);
> -	} while (gem_bo_busy(fd, hang.spin->handle));

Well, no concern here anymore that something would sync on
there.

Only nitpicks so,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>


> +	count = 0;
> +	dir = igt_debugfs_dir(fd);
> +	igt_until_timeout(5) {
> +		igt_sysfs_set(dir, "i915_wedged", "-1");
> +		if (READ_ONCE(control->error))
> +			break;
> +		count++;
> +	}
> +	close(dir);
> +	igt_info("%lu resets\n", count);
> +
> +	control->done = true;
> +	igt_waitchildren();
>  
> -	igt_post_hang_ring(fd, hang);
> +	igt_assert(!control->error);
> +	munmap(control, 4096);
>  
> -	munmap(gtt[0], OBJECT_SIZE);
> -	munmap(gtt[1], OBJECT_SIZE);
> -	munmap(gtt[2], OBJECT_SIZE);
> +	igt_disallow_hang(fd, hang);
>  }
>  
>  static int min_tile_width(uint32_t devid, int tiling)
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

WARNING: multiple messages have this Message-ID (diff)
From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [PATCH i-g-t] i915/gem_mmap_gtt: Reset faster and longer to catch fencing errors
Date: Thu, 24 Jan 2019 17:25:10 +0200	[thread overview]
Message-ID: <87va2ekrhl.fsf@gaia.fi.intel.com> (raw)
In-Reply-To: <20190118003238.30588-1-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Performing a GPU reset clobbers the fence registers, affecting which
> addresses the tiled GTT mmap access. If the driver does not take
> precautions across a GPU reset, a client may read the wrong values (but
> only within their own buffer as the fence will only be degraded to
> I915_TILING_NONE, reducing the access area). However, as this requires
> performing a read using the indirect GTT at exactly the same time as the
> reset occurs, it can be quite difficult to catch, so repeat the test
> many times and across all cores simultaneously.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  tests/i915/gem_mmap_gtt.c | 99 +++++++++++++++++++++++++++------------
>  1 file changed, 68 insertions(+), 31 deletions(-)
>
> diff --git a/tests/i915/gem_mmap_gtt.c b/tests/i915/gem_mmap_gtt.c
> index f63535556..21880d31d 100644
> --- a/tests/i915/gem_mmap_gtt.c
> +++ b/tests/i915/gem_mmap_gtt.c
> @@ -38,6 +38,7 @@
>  #include "drm.h"
>  
>  #include "igt.h"
> +#include "igt_sysfs.h"
>  #include "igt_x86.h"
>  
>  #ifndef PAGE_SIZE
> @@ -375,50 +376,86 @@ test_clflush(int fd)
>  static void
>  test_hang(int fd)
>  {
> -	igt_hang_t hang;
> -	uint32_t patterns[] = {
> +	const uint32_t patterns[] = {
>  		0, 0xaaaaaaaa, 0x55555555, 0xcccccccc,
>  	};
> -	uint32_t *gtt[3];
> -	int last_pattern = 0;
> -	int next_pattern = 1;
> -	int i;
> +	const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
> +	struct {
> +		bool done;
> +		bool error;
> +	} *control;
> +	unsigned long count;
> +	igt_hang_t hang;
> +	int dir;
>  
> -	for (i = I915_TILING_NONE; i <= I915_TILING_Y; i++) {
> -		uint32_t handle;
> +	hang = igt_allow_hang(fd, 0, 0);
> +	igt_sysfs_set_parameter(fd, "reset", "1"); /* global resets */

igt_assert to be sure that you made it?

igt_sysfs_set_module_parameter would be less misleadning :P

>  
> -		handle = gem_create(fd, OBJECT_SIZE);
> -		gem_set_tiling(fd, handle, i, 2048);
> +	control = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> +	igt_assert(control != MAP_FAILED);
>  
> -		gtt[i] = gem_mmap__gtt(fd, handle, OBJECT_SIZE, PROT_WRITE);
> -		set_domain_gtt(fd, handle);
> -		gem_close(fd, handle);
> -	}
> +	igt_fork(child, ncpus) {
> +		int last_pattern = 0;
> +		int next_pattern = 1;
> +		uint32_t *gtt[2];

You throw tiling none out as it is just a distraction and
waste of cycles?

>  
> -	hang = igt_hang_ring(fd, I915_EXEC_RENDER);
> +		for (int i = 0; i < ARRAY_SIZE(gtt); i++) {
> +			uint32_t handle;
>  
> -	do {
> -		for (i = 0; i < OBJECT_SIZE / 64; i++) {
> -			int x = 16*i + (i%16);
> +			handle = gem_create(fd, OBJECT_SIZE);
> +			gem_set_tiling(fd, handle, I915_TILING_X + i, 2048);

You could have setup a priori. But this prolly is faster than
one reset cycle of tests so nothing to gain.

>  
> -			igt_assert(gtt[0][x] == patterns[last_pattern]);
> -			igt_assert(gtt[1][x] == patterns[last_pattern]);
> -			igt_assert(gtt[2][x] == patterns[last_pattern]);
> +			gtt[i] = gem_mmap__gtt(fd, handle, OBJECT_SIZE, PROT_WRITE);
> +			set_domain_gtt(fd, handle);
> +			gem_close(fd, handle);
> +		}
>  
> -			gtt[0][x] = patterns[next_pattern];
> -			gtt[1][x] = patterns[next_pattern];
> -			gtt[2][x] = patterns[next_pattern];
> +		while (!READ_ONCE(control->done)) {
> +			for (int i = 0; i < OBJECT_SIZE / 64; i++) {
> +				uint32_t expected = patterns[last_pattern];
> +				uint32_t found[2];
> +				int x = 16*i + (i%16);

nitpicking here for consts and unsigned x.

> +
> +				found[0] = READ_ONCE(gtt[0][x]);
> +				found[1] = READ_ONCE(gtt[1][x]);
> +
> +				if (found[0] != expected ||
> +				    found[1] != expected) {
> +					igt_warn("child[%d] found (%x, %x), expecting %x\n",
> +						 child,
> +						 found[0], found[1],
> +						 expected);
> +					control->error = true;
> +					exit(0);
> +				}
> +
> +				gtt[0][x] = patterns[next_pattern];
> +				gtt[1][x] = patterns[next_pattern];
> +			}
> +
> +			last_pattern = next_pattern;
> +			next_pattern = (next_pattern + 1) % ARRAY_SIZE(patterns);
>  		}
> +	}
>  
> -		last_pattern = next_pattern;
> -		next_pattern = (next_pattern + 1) % ARRAY_SIZE(patterns);
> -	} while (gem_bo_busy(fd, hang.spin->handle));

Well, no concern here anymore that something would sync on
there.

Only nitpicks so,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>


> +	count = 0;
> +	dir = igt_debugfs_dir(fd);
> +	igt_until_timeout(5) {
> +		igt_sysfs_set(dir, "i915_wedged", "-1");
> +		if (READ_ONCE(control->error))
> +			break;
> +		count++;
> +	}
> +	close(dir);
> +	igt_info("%lu resets\n", count);
> +
> +	control->done = true;
> +	igt_waitchildren();
>  
> -	igt_post_hang_ring(fd, hang);
> +	igt_assert(!control->error);
> +	munmap(control, 4096);
>  
> -	munmap(gtt[0], OBJECT_SIZE);
> -	munmap(gtt[1], OBJECT_SIZE);
> -	munmap(gtt[2], OBJECT_SIZE);
> +	igt_disallow_hang(fd, hang);
>  }
>  
>  static int min_tile_width(uint32_t devid, int tiling)
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2019-01-24 15:25 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-18  0:32 [igt-dev] [PATCH i-g-t] i915/gem_mmap_gtt: Reset faster and longer to catch fencing errors Chris Wilson
2019-01-18  0:32 ` Chris Wilson
2019-01-18  1:02 ` [igt-dev] ✓ Fi.CI.BAT: success for " Patchwork
2019-01-18 10:55 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
2019-01-24 15:25 ` Mika Kuoppala [this message]
2019-01-24 15:25   ` [PATCH i-g-t] " Mika Kuoppala
2019-01-24 15:34   ` [igt-dev] [Intel-gfx] " Chris Wilson
2019-01-24 15:34     ` Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87va2ekrhl.fsf@gaia.fi.intel.com \
    --to=mika.kuoppala@linux.intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=igt-dev@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.