From: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
To: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>,
igt-dev@lists.freedesktop.org, intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Subject: Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] tests/core_hotunplug: Take care of closing fences before failing
Date: Thu, 15 Oct 2020 19:18:53 +0200
Message-ID: <7e13ba8833d8fbb887c01e420591a10944965147.camel@linux.intel.com>
In-Reply-To: <80c901a14b3e4976fa670b410b96926cd97b370d.camel@linux.intel.com>
On Thu, 2020-10-15 at 09:40 +0200, Marcin Bernatowicz wrote:
> On Wed, 2020-10-14 at 18:55 +0200, Janusz Krzysztofik wrote:
> > The test was designed to keep track of open device file descriptors
> > for safe driver unbind on recovery from a failed subtest. In that
> > context, fences introduced by commit 1fbd127bd4e1 ("core_hotplug:
> > Teach the healthcheck how to check execution status") can affect device
> > recovery as much as an open device file if not closed before unbind.
> >
> > Moreover, forced GPU reset which used to be applied on recovery from a
> > failed i915 GPU health check is no longer reachable since a GPU hang
> > hopefully detected by the new health check algorithm can now break the
> > whole recovery procedure prematurely.
> >
> > Refactor local_i915_healthcheck() so it takes care of closing fences
> > and returns a result to its caller instead of long jumping on failures
> > believed to be recoverable. While avoiding use of igt_assert() and
> > friends, report actual source and error code of failures via
> > igt_warn_on_f().
> >
> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> > tests/core_hotunplug.c | 46 ++++++++++++++++++++++++++++++++----------
> > 1 file changed, 35 insertions(+), 11 deletions(-)
> >
> > diff --git a/tests/core_hotunplug.c b/tests/core_hotunplug.c
> > index 70669c590..d6db02bad 100644
> > --- a/tests/core_hotunplug.c
> > +++ b/tests/core_hotunplug.c
> > @@ -233,9 +233,9 @@ static int merge_fences(int old, int new)
> > return new;
> >
> > merge = sync_fence_merge(old, new);
> > - igt_assert(merge != -1);
> > - close(old);
> > - close(new);
> > + /* Assume fence close errors don't affect device close status */
> > + igt_ignore_warn(local_close(old, "old fence close failed"));
> > + igt_ignore_warn(local_close(new, "new fence close failed"));
> >
> > return merge;
> > }
> > @@ -249,29 +249,53 @@ static int local_i915_healthcheck(int i915, const char *prefix)
> > .buffer_count = 1,
> > };
> > const struct intel_execution_engine2 *engine;
> > - int fence = -1;
> > + int fence = -1, err = 0, status = 1;
> >
> > local_debug("%s%s\n", prefix, "running i915 GPU healthcheck");
> > - if (local_i915_is_wedged(i915))
> > + if (igt_warn_on_f(local_i915_is_wedged(i915), "GPU found wedged\n"))
> > return -EIO;
> >
> > + /* Assume gem_create()/gem_write() failures are unrecoverable */
> > obj.handle = gem_create(i915, 4096);
> > gem_write(i915, obj.handle, 0, &bbe, sizeof(bbe));
> >
> > + /* As soon as a fence is open, don't fail before closing it */
> > __for_each_physical_engine(i915, engine) {
> > execbuf.flags = engine->flags | I915_EXEC_FENCE_OUT;
> > - gem_execbuf_wr(i915, &execbuf);
> > + err = __gem_execbuf_wr(i915, &execbuf);
> > + if (igt_warn_on_f(err < 0, "__gem_execbuf_wr() returned %d\n",
> > + err))
> > + break;
> >
> > fence = merge_fences(fence, execbuf.rsvd2 >> 32);
> > + if (igt_warn_on_f(fence < 0, "merge_fences() returned %d\n",
> > + fence)) {
> > + err = fence;
> > + break;
> > + }
> > + }
> > + if (fence >= 0) {
> > + status = sync_fence_wait(fence, -1);
> > + if (igt_warn_on_f(status < 0, "sync_fence_wait() returned %d\n",
> > + status))
> > + err = status;
> > + if (!err)
> > + status = sync_fence_status(fence);
> > +
> > + /* Assume fence close errors don't affect device close status */
> > + igt_ignore_warn(local_close(fence, "fence close failed"));
> > }
> > - igt_assert(fence != -1);
> > +
> > + /* Assume gem_close() failure is unrecoverable */
> > gem_close(i915, obj.handle);
> >
> > - igt_assert_eq(sync_fence_wait(fence, -1), 0);
> > - igt_assert_eq(sync_fence_status(fence), 1);
> > - close(fence);
> > + if (err < 0)
> > + return err;
> > + if (igt_warn_on_f(status != 1, "sync_fence_status() returned %d\n",
> > + status))
> > + return -1;
> >
> > - if (local_i915_is_wedged(i915))
> > + if (igt_warn_on_f(local_i915_is_wedged(i915), "GPU turned wedged\n"))
> > return -EIO;
> LGTM,
> Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Thanks Marcin, pushed.
Unfortunately, I forgot to include the R-b and pushed without it, so it
will exist only in the list archives. Sorry.
Thanks,
Janusz
>
> >
> > return 0;
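
For readers of the archive, the control flow the patch aims for (once a
fence is open, record errors and fall through to cleanup rather than
bailing out) can be sketched in isolation. This is only a minimal
sketch under stated assumptions, not the IGT code: open_fake_fence(),
close_warn() and healthcheck_sketch() are hypothetical names, the read
end of a pipe stands in for a sync fence fd, and "keep the newest fd"
stands in for merge_fences().

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for a fence fd: the read end of a fresh pipe. */
static int open_fake_fence(void)
{
	int fds[2];

	if (pipe(fds) < 0)
		return -errno;
	close(fds[1]);
	return fds[0];
}

/* Hypothetical helper: close fd, warn on error, return 0 or -errno. */
static int close_warn(int fd, const char *msg)
{
	int err;

	if (close(fd) == 0)
		return 0;
	err = -errno;
	fprintf(stderr, "%s: %d\n", msg, err);
	return err;
}

/*
 * Run 'steps' fake submissions; inject a failure at step 'fail_at'
 * (1-based, 0 means no failure). Errors are recorded in err and the
 * loop is left with break, so the open fd is always closed before
 * the result is returned to the caller.
 */
static int healthcheck_sketch(int steps, int fail_at)
{
	int fence = -1, err = 0;

	for (int i = 1; i <= steps; i++) {
		int new_fence;

		if (i == fail_at) {
			err = -EIO;
			fprintf(stderr, "step %d failed: %d\n", i, err);
			break;
		}

		new_fence = open_fake_fence();
		if (new_fence < 0) {
			err = new_fence;
			break;
		}

		/* Keep only the newest fd; the real test merges fences here. */
		if (fence >= 0)
			(void)close_warn(fence, "old fence close failed");
		fence = new_fence;
	}

	/* Close errors are reported but do not change the result. */
	if (fence >= 0)
		(void)close_warn(fence, "fence close failed");

	return err;
}
```

With this sketch, healthcheck_sketch(3, 2) returns -EIO only after the
descriptor opened in step 1 has been closed, so the caller is free to
proceed with its own recovery without a stray open fd in the way.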