From: Jani Nikula <jani.nikula@linux.intel.com>
To: Hugh Dickins <hughd@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Hugh Dickins <hughd@google.com>,
	Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	Matt Roper <matthew.d.roper@intel.com>,
	Lucas De Marchi <lucas.demarchi@intel.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Caz Yokoyama <caz.yokoyama@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Matthew Brost <matthew.brost@intel.com>,
	intel-gfx <intel-gfx@lists.freedesktop.org>,
	dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [BUG 5.15-rc3] kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!
Date: Mon, 04 Oct 2021 10:39:29 +0300
Message-ID: <87mtnp2q8e.fsf@intel.com>
In-Reply-To: <7bad278d-ff81-21aa-48a-b46b9453b2b@google.com>

On Sat, 02 Oct 2021, Hugh Dickins <hughd@google.com> wrote:
> On Sat, 2 Oct 2021, Linus Torvalds wrote:
>> On Sat, Oct 2, 2021 at 5:17 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>> > On Sat, 2 Oct 2021 03:17:29 -0700 (PDT)
>> > Hugh Dickins <hughd@google.com> wrote:
>> >
>> > > Yes (though bisection doesn't work right on this one): the fix
>> >
>> > Interesting, as it appeared to be very reliable. But I didn't do the
>> > "try before / after" on the patch.
>> 
>> Well, even the before/after might well have worked, since the problem
>> depended on how that sw_fence_dummy_notify() function ended up
>> aligned. So random unrelated changes could re-align it just by
>> mistake.
>
> Yup.
>
>> 
>> Patch applied directly.
>
> Great, thanks a lot.

Thanks & sorry, really looks like we managed to drop this between the
cracks. :(

>
>> 
>> I'd also like to point out how that BUG_ON() actually made things
>> worse, and made this harder to debug. If it had been a WARN_ON_ONCE(),
>> this would presumably not even have needed bisecting, it would have
>> been obvious.
>> 
>> BUG_ON() really is pretty much *always* the wrong thing to do. It
>> only results in problems being harder to see because you end up with
>> a dead machine and the message is often hidden.
>
> Jani made the same point. But I guess they then went so far into the
> weeds of how to recover when warning that the fix itself did not progress.

Yes. That, as well as removing the entire alignment thing to reuse a
couple of bits for flags. Too fragile for its own good.
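
For readers unfamiliar with the pattern, here is a minimal userspace sketch
of what "reusing a couple of bits for flags" means and why it depends on
alignment. It is not the i915 code: the names pack_callback, unpack_fn,
unpack_flags and dummy_notify are hypothetical, the aligned attribute
assumes GCC or Clang, and the check prints a warning and returns an error
where a BUG_ON() would have halted the machine.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical illustration only; none of these names come from i915. */
#define FLAG_MASK ((uintptr_t)0x3) /* two low address bits reused as flags */

typedef int (*notify_fn)(int state);

/*
 * Request 4-byte alignment so the two low bits of the function's address
 * are guaranteed to be zero (GCC/Clang attribute; assumes the platform
 * does not use low address bits for its own purposes).
 */
__attribute__((aligned(4)))
int dummy_notify(int state)
{
	return state;
}

/*
 * Pack a callback and up to two flag bits into one word.
 * Returns 0 on success, -1 if the callback is not 4-byte aligned.
 * The failure is reported and handed back to the caller, not fatal.
 */
int pack_callback(notify_fn fn, unsigned int flags, uintptr_t *out)
{
	uintptr_t addr = (uintptr_t)fn;

	if (addr & FLAG_MASK) {
		fprintf(stderr, "warning: callback %p is not 4-byte aligned\n",
			(void *)addr);
		return -1;
	}

	*out = addr | ((uintptr_t)flags & FLAG_MASK);
	return 0;
}

/* Recover the callback by masking off the flag bits. */
notify_fn unpack_fn(uintptr_t packed)
{
	return (notify_fn)(packed & ~FLAG_MASK);
}

/* Recover the flag bits stored in the low part of the word. */
unsigned int unpack_flags(uintptr_t packed)
{
	return (unsigned int)(packed & FLAG_MASK);
}
```

With the attribute in place, pack_callback(dummy_notify, 2, &packed) is
expected to succeed. Without it, the outcome depends on where the compiler
and linker happen to place the function, which would be consistent with
unrelated changes hiding or exposing the problem.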

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center

