public inbox for intel-gfx@lists.freedesktop.org
From: Matthew Brost <matthew.brost@intel.com>
To: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx]  ✗ Fi.CI.BAT: failure for Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)
Date: Thu, 26 Aug 2021 10:44:44 -0700
Message-ID: <20210826174444.GA20202@jons-linux-dev-box>
In-Reply-To: <162999462762.15048.16301274628038623814@emeril.freedesktop.org>

On Thu, Aug 26, 2021 at 04:17:07PM +0000, Patchwork wrote:
> Patch Details
> 
> Series:  Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)
> URL:     https://patchwork.freedesktop.org/series/93704/
> State:   failure
> Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/index.html
> 
> CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20904
> 
> Summary
> 
> FAILURE
> 
> Serious unknown changes coming with Patchwork_20904 absolutely need to be
> verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_20904, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
> 
> External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/index.html
> 
> Possible new issues
> 
> Here are the unknown changes that may have been introduced in Patchwork_20904:
> 
> IGT changes
> 
> Possible regressions
> 
>   • igt@i915_selftest@live@hangcheck:
>       □ fi-rkl-guc: PASS -> INCOMPLETE

I've seen this locally both before and after this series. I wouldn't hold
off the merge of this series because of this, as I don't believe it is a
regression, just an existing instability in the stack. I haven't been
able to root cause this yet, but my initial analysis points to the GuC
losing a submission after the GuC has reset a context. Will dig into
this and hopefully get a fix after I'm back from vacation on 9/7.

Matt 

> 
> New tests
> 
> New tests have been introduced between CI_DRM_10525 and Patchwork_20904:
> 
> New IGT tests (1)
> 
>   • igt@i915_selftest@live@guc:
>       □ Statuses : 30 pass(s)
>       □ Exec time: [0.41, 5.26] s
> 
> Known issues
> 
> Here are the changes found in Patchwork_20904 that come from known issues:
> 
> IGT changes
> 
> Issues hit
> 
>   • igt@amdgpu/amd_cs_nop@sync-compute0:
> 
>       □ fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271) +5 similar issues
>   • igt@runner@aborted:
> 
>       □ fi-rkl-guc: NOTRUN -> FAIL (i915#3928)
> 
> {name}: This element is suppressed. This means it is ignored when computing
> the status of the difference (SUCCESS, WARNING, or FAILURE).
> 
> Participating hosts (40 -> 33)
> 
> Missing (7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan
> fi-bdw-samus bat-jsl-1
> 
> Build changes
> 
>   • Linux: CI_DRM_10525 -> Patchwork_20904
> 
> CI-20190529: 20190529
> CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ git://anongit.freedesktop.org/gfx-ci/linux
> IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> Patchwork_20904: 0c1d27ac9fce7e231e7dddebcf56905e05302cae @ git://anongit.freedesktop.org/gfx-ci/linux
> 
> == Linux commits ==
> 
> 0c1d27ac9fce drm/i915/guc: Drop static inline functions intel_guc_submission.c
> 50ada01b3d95 drm/i915/guc: Add GuC kernel doc
> 883eccfa8221 drm/i915/guc: Drop guc_active move everything into guc_state
> fa075902c938 drm/i915/guc: Move fields protected by guc->contexts_lock into sub
> structure
> a1c73c8c481a drm/i915/guc: Move GuC priority fields in context under guc_active
> f16c0554ae08 drm/i915/guc: Drop pin count check trick between sched_disable and
> re-pin
> 42ac1b77a019 drm/i915/guc: Proper xarray usage for contexts_lookup
> 9b9222998c83 drm/i915/guc: Rework and simplify locking
> 244934484f63 drm/i915/guc: Move guc_blocked fence to struct guc_state
> ba695a58136a drm/i915/guc: Release submit fence from an irq_work
> 3bd5803d5e25 drm/i915/guc: Flush G2H work queue during reset
> b87ba9121748 drm/i915: Allocate error capture in nowait context
> adb35ad83c76 drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
> 97e616063006 drm/i915/guc: Don't touch guc_state.sched_state without a lock
> 1ff99308ef88 drm/i915/guc: Take context ref when cancelling request
> ff84f14ddceb drm/i915/selftests: Add initial GuC selftest for scrubbing lost
> G2H
> abd6a8884cf4 drm/i915/guc: Copy whole golden context, set engine state size of
> subset
> a19ba1f51009 drm/i915/guc: Don't enable scheduling on a banned context, guc_id
> invalid, not registered
> f29b2b338002 drm/i915/guc: Kick tasklet after queuing a request
> f577a4fdeeab drm/i915/selftests: Add a cancel request selftest that triggers a
> reset
> da3d87dfe8c5 Revert "drm/i915/gt: Propagate change in error status to children
> on unhold"
> 25273a034c8d drm/i915/guc: Workaround reset G2H is received after schedule done
> G2H
> c00d543957c2 drm/i915/guc: Process all G2H message at once in work queue
> 5b7ff1fa9e43 drm/i915/guc: Don't drop ce->guc_active.lock when unwinding
> context
> 54cd904fa232 drm/i915/guc: Unwind context requests in reverse order
> 593f21493fda drm/i915/guc: Fix outstanding G2H accounting
> 6b511953d015 drm/i915/guc: Fix blocked context accounting
> 
> SECURITY NOTE: file ~/.netrc must not be accessible by others

