From: Matthew Brost <matthew.brost@intel.com>
To: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915: fix blank screen booting crashes
Date: Tue, 21 Sep 2021 15:55:15 -0700
Message-ID: <20210921225514.GA8109@jons-linux-dev-box>
In-Reply-To: <20210921184637.ullcwswqd6z5hi4j@ldmartin-desk2>
On Tue, Sep 21, 2021 at 11:46:37AM -0700, Lucas De Marchi wrote:
> On Tue, Sep 21, 2021 at 10:43:32AM -0700, Matthew Brost wrote:
> > From: Hugh Dickins <hughd@google.com>
> >
> > 5.15-rc1 crashes with blank screen when booting up on two ThinkPads
> > using i915. Bisections converge convincingly, but arrive at different
> > and surprising "culprits", none of them the actual culprit.
> >
> > netconsole (with init_netconsole() hacked to call i915_init() when
> > logging has started, instead of by module_init()) tells the story:
> >
> > kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!
> > with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify().
> > I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that
> > function needs to be 4-byte aligned.
> >
> > v2:
> > (Jani Nikula)
> > - Change BUG_ON to WARN_ON
> > v3:
> > (Jani / Tvrtko)
> > - Short circuit __i915_sw_fence_init on WARN_ON
> >
> > Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> > drivers/gpu/drm/i915/gt/intel_context.c | 4 ++--
> > drivers/gpu/drm/i915/i915_sw_fence.c | 17 ++++++++++-------
> > 2 files changed, 12 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> > index ff637147b1a9..e7f78bc7ebfc 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -362,8 +362,8 @@ static int __intel_context_active(struct i915_active *active)
> > return 0;
> > }
> >
>
> > -static int sw_fence_dummy_notify(struct i915_sw_fence *sf,
> > - enum i915_sw_fence_notify state)
> > +static int __i915_sw_fence_call
> > +sw_fence_dummy_notify(struct i915_sw_fence *sf, enum i915_sw_fence_notify state)
> > {
> > return NOTIFY_DONE;
> > }
> > diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
> > index c589a681da77..08cea73264e7 100644
> > --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> > +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> > @@ -13,9 +13,9 @@
> > #include "i915_selftest.h"
> >
> > #if IS_ENABLED(CONFIG_DRM_I915_DEBUG)
> > -#define I915_SW_FENCE_BUG_ON(expr) BUG_ON(expr)
> > +#define I915_SW_FENCE_WARN_ON(expr) WARN_ON(expr)
> > #else
> > -#define I915_SW_FENCE_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
> > +#define I915_SW_FENCE_WARN_ON(expr) BUILD_BUG_ON_INVALID(expr)
> > #endif
> >
> > static DEFINE_SPINLOCK(i915_sw_fence_lock);
> > @@ -129,7 +129,10 @@ static int __i915_sw_fence_notify(struct i915_sw_fence *fence,
> > i915_sw_fence_notify_t fn;
> >
> > fn = (i915_sw_fence_notify_t)(fence->flags & I915_SW_FENCE_MASK);
> > - return fn(fence, state);
> > + if (likely(fn))
> > + return fn(fence, state);
> > + else
> > + return 0;
>
> Since the knowledge of these being NULL (or having the wrong alignment)
> is in the init/reinit functions, wouldn't it be better to just add a
> fence_nop() and assign it there instead of this likely() here?
>
Maybe? I prefer the way it is.
> > }
> >
> > #ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
> > @@ -242,9 +245,9 @@ void __i915_sw_fence_init(struct i915_sw_fence *fence,
> > const char *name,
> > struct lock_class_key *key)
> > {
> > - BUG_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK);
> > -
> > __init_waitqueue_head(&fence->wait, name, key);
> > + if (WARN_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK))
> > + return;
>
> like:
> if (WARN_ON(!fn || (unsigned long)fn & ~I915_SW_FENCE_MASK))
> fence->flags = (unsigned long)sw_fence_dummy_notify;
> else
> fence->flags = (unsigned long)fn;
>
>
> If you return here instead of calling i915_sw_fence_reinit(), aren't you
> just going to use uninitialized memory later? At least in the selftests,
> which allocate it with kmalloc()... I didn't check others.
>
I don't think so. The fence may not work, but it won't blow up
either.
>
> For the bug fix we could just add the __aligned(4) and leave the rest to a
> separate patch.
>
The bug was that sw_fence_dummy_notify() in gt/intel_context.c was not
4-byte aligned, which triggered a BUG_ON during boot and blank-screened a
laptop. Jani / Tvrtko suggested that we change the BUG_ONs to WARN_ONs so
that, if someone makes this mistake in the future, the kernel should still
boot, albeit with a WARNING.

The long-term fix is to just pull out the I915_SW_FENCE_MASK (stealing bits
from a pointer) so that we don't have to worry about any of this.
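
To make the bit stealing concrete, here is a minimal userspace sketch of the
idea, not the i915 code. It assumes GCC or Clang (for the aligned attribute)
and a platform where a function pointer round-trips through uintptr_t. The
names struct fence, fence_init(), fence_notify() and dummy_notify() are
hypothetical stand-ins for the driver's equivalents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: callbacks are at least 4-byte aligned, so the two low bits
 * of their address are always zero and can be borrowed for flags. */
#define FENCE_FLAG_MASK ((uintptr_t)3)
#define FENCE_PTR_MASK  (~FENCE_FLAG_MASK)

struct fence {
	uintptr_t flags; /* callback address in the high bits, flags in the low two */
};

typedef int (*notify_fn)(struct fence *fence, int state);

/* Without the explicit alignment, a size-optimized build may place the
 * function on an odd address and the check in fence_init() would fail. */
static int __attribute__((aligned(4)))
dummy_notify(struct fence *fence, int state)
{
	(void)fence;
	(void)state;
	return 1; /* arbitrary non-zero value so a real call is distinguishable */
}

/* Returns 0 on success, -1 if fn is NULL or its low bits are not free. */
static int fence_init(struct fence *fence, notify_fn fn)
{
	uintptr_t raw = (uintptr_t)fn;

	assert(fence != NULL);

	if (!fn || (raw & FENCE_FLAG_MASK)) {
		fence->flags = 0; /* well-defined "no callback" state */
		return -1;
	}

	fence->flags = raw;
	return 0;
}

/* Strips the flag bits before calling; a missing callback is a no-op. */
static int fence_notify(struct fence *fence, int state)
{
	notify_fn fn = (notify_fn)(fence->flags & FENCE_PTR_MASK);

	return fn ? fn(fence, state) : 0;
}
```

In this sketch a rejected callback leaves flags at 0, so a later
fence_notify() does nothing instead of calling through a bad pointer.
Storing the callback and the flags in separate fields would remove the
alignment requirement altogether.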
Matt
>
> Lucas De Marchi
>
> > fence->flags = (unsigned long)fn;
> >
> > i915_sw_fence_reinit(fence);
> > @@ -257,8 +260,8 @@ void i915_sw_fence_reinit(struct i915_sw_fence *fence)
> > atomic_set(&fence->pending, 1);
> > fence->error = 0;
> >
> > - I915_SW_FENCE_BUG_ON(!fence->flags);
> > - I915_SW_FENCE_BUG_ON(!list_empty(&fence->wait.head));
> > + I915_SW_FENCE_WARN_ON(!fence->flags);
> > + I915_SW_FENCE_WARN_ON(!list_empty(&fence->wait.head));
> > }
> >
> > void i915_sw_fence_commit(struct i915_sw_fence *fence)
> > --
> > 2.32.0
> >