Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Jani Nikula <jani.nikula@linux.intel.com>
Cc: "Dugast, Francois" <francois.dugast@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [Intel-xe] [RFC] drm/xe: Introduce xe_ASSERT macros
Date: Fri, 4 Aug 2023 16:47:48 -0400	[thread overview]
Message-ID: <ZM1j9PFKuQEt32xb@intel.com> (raw)
In-Reply-To: <87r0onw4xw.fsf@intel.com>

On Tue, Aug 01, 2023 at 11:19:07AM +0300, Jani Nikula wrote:
> On Fri, 28 Jul 2023, Michal Wajdeczko <michal.wajdeczko@intel.com> wrote:
> > On 28.07.2023 21:47, Vivi, Rodrigo wrote:
> >> On Fri, 2023-07-28 at 21:25 +0200, Michal Wajdeczko wrote:
> >>> As we are moving away from the controversial XE_BUG_ON macro,
> >>> relying just on WARN_ON or drm_err does not cover the cases
> >>> where we want to annotate functions with additional detailed
> >>> debug checks to assert that all prerequisites are satisfied,
> >>> without paying footprint or performance penalty on non-debug
> >>> builds, where all misuses introduced during code integration
> >>> were already fixed.
> >>>
> >>> Introduce family of xe_ASSERT macros that try to follow classic
> >>> assert() utility and can be compiled out on non-debug builds.
> >> 
> >> This is indeed much better and clear then the XE_BUG_ON.
> >> 
> >> Also, let me be clear that my intention with reply is not to
> >> block the introduction of this assert macro.
> >
> > thanks
> >
> >> 
> >> But I'd like to raise my concern(s?) with such assert macros:
> >> 
> >> In many cases that we currently use these macros are actually
> >> very useful on production build. Whenever we receive bug reports
> >> we should be able to ask the reporter to get a dmesg or enable
> >> a drm.debug and get a message and we could see these, without
> >> ever having to ask the report to rebuild the kernel with a debug
> >> config.
> >
> > you should think of xe_assert() as extension to static_assert() but
> > aimed to be executed during internal CI builds only, when we run basic
> > integration tests, and were we are able to quickly catch all misuses,
> > caused by rapid or multiple changes going in.
> >
> >> 
> >> Then, we add this in the code and we start using this everywhere
> >> without stopping to think if that would be good in a report and
> >> we end up with a code full of asserts and big problem to get the
> >> useful information from the bug reports.
> >
> > those asserts shouldn't fire beyond integration builds - if someone uses
> > assert() to tag unfinished work, then it's simply wrong, in such cases
> > true WARN and correct fallback shall be implemented - and that's the
> > role of the reviewers to spot it ;)
> 
> Naming this ASSERT is the first step towards proper usage, because it
> implies it might be different for regular and debug builds.
> 
> The second step would be adding a comment documenting the usage.

Yes, please. Let's add the documentation so it gets easier to everyone
to understand the rules and help during reviews to avoid the same
mistake written below.

With the doc in place:

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> 
> I think in general the idea behind the conditional GEM_BUG_ONs was to
> avoid the performance impact of leaving them in place. But then we've
> started sprinkling GEM_BUG_ON indiscriminately, because you didn't have
> to think about performance. 1200+ instances of GEM_BUG_ON in i915. It's
> just crazy.
> 
> >
> >> 
> >> Aside from the fact that we end up going far from other drm and
> >> non-drm kernel drivers and starting from day 0 a driver full of
> >> driver-isms.
> >
> > maybe because our driver(s) are bigger and more complex ? also much more
> > developers are working on it, with multiple refactoring going constantly
> > on, so we should have mechanism to allow us catching unexpected uses
> > without polluting production driver too much.
> >
> > btw, DRM already has assert() but named as bug()
> 
> Do git blame on that and realize it originated from us following the
> same pattern as GEM_BUG_ON, which got copy-pasted as XE_BUG_ON. It's
> just problematic all around, and upstream consensus is that BUG_ON() is
> not even for CI. Use warn and panic_on_warn instead.
> 
> BR,
> Jani.
> 
> >
> > [1] https://elixir.bootlin.com/linux/latest/C/ident/DRM_MM_BUG_ON
> >
> >> 
> >> I would prefer we have a driver where we make usage of the all
> >> the drm error, warning, debug, info variants on a case by case,
> >> carefully analyzing the needs and thinking about the bug reports
> >> that we might receive from end users.
> >
> > +1
> >
> > it would be also nice if driver is correctly handling all errors or
> > invalid input paths, but at the same time, it shouldn't duplicate checks
> > that shall be already done at higher layer/caller - with asserts() we
> > just express our expectations without ignoring other bugs by trying
> > handle that unexpected case down the stack
> >
> >> 
> >>>
> >>> Macros are based on drm_WARN, but unlikely to origin, disallow
> >>> use in expressions since we will compile that code out.
> >>>
> >>> As we are operating on the xe pointers, we can print additional
> >>> information about the device, like tile or GT identifier, that
> >>> is not available from generic WARN report:
> >>>
> >>> [ ] xe 0000:00:02.0: [drm] Assertion `false` failed!
> >>>     graphics: Xe_LP 12.0 media: Xe_M 12.0 GT0
> >>> [ ] WARNING: CPU: 4 PID: 8442 at drivers/gpu/drm/xe/xe_device.c:280
> >>> xe_device_probe+0x2f5/0x4a0 [xe]
> >>> [ ] RIP: 0010:xe_device_probe+0x2f5/0x4a0 [xe]
> >>>
> >>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >>> Cc: Francois Dugast <francois.dugast@intel.com>
> >>> Cc: Matthew Brost <matthew.brost@intel.com>
> >>> ---
> >>>  drivers/gpu/drm/xe/xe_assert.h | 47
> >>> ++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 47 insertions(+)
> >>>  create mode 100644 drivers/gpu/drm/xe/xe_assert.h
> >>>
> >>> diff --git a/drivers/gpu/drm/xe/xe_assert.h
> >>> b/drivers/gpu/drm/xe/xe_assert.h
> >>> new file mode 100644
> >>> index 000000000000..65306768d637
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/drm/xe/xe_assert.h
> >>> @@ -0,0 +1,47 @@
> >>> +/* SPDX-License-Identifier: MIT */
> >>> +/*
> >>> + * Copyright © 2023 Intel Corporation
> >>> + */
> >>> +
> >>> +#ifndef __XE_ASSERT_H__
> >>> +#define __XE_ASSERT_H__
> >>> +
> >>> +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
> >>> +
> >>> +#define xe_ASSERT_MSG(xe, condition, fmt, arg...)
> >>> ({                                           \
> >>> +       struct xe_device *__xe =
> >>> (xe);                                                          \
> >>> +       int __passed =
> >>> !!(condition);                                                       
> >>>     \
> >>> +       drm_WARN(&__xe->drm, !__passed, "[" DRM_NAME "] Assertion
> >>> `%s` failed!\n"               \
> >>> +                "graphics: %s %u.%u media: %s %u.%u "
> >>> fmt,                                     \
> >>> +               
> >>> __stringify(condition),                                              
> >>>           \
> >>> +                __xe-
> >>>> info.graphics_name,                                                 
> >>>      \
> >>> +                __xe->info.graphics_verx100 / 100, __xe-
> >>>> info.graphics_verx100 % 100,          \
> >>> +                __xe-
> >>>> info.media_name,                                                    
> >>>      \
> >>> +                __xe->info.media_verx100 / 100, __xe-
> >>>> info.media_verx100 % 100,                \
> >>> +                ##
> >>> arg);                                                                
> >>>        \
> >>> +       (void)(__passed);                                            
> >>>                            \
> >>> +})
> >>> +
> >>> +#else
> >>> +
> >>> +#define xe_ASSERT_MSG(xe, condition, fmt, arg...)
> >>> ({                                           \
> >>> +       typecheck(struct xe_device *,
> >>> xe);                                                      \
> >>> +       BUILD_BUG_ON_INVALID(condition);                             
> >>>                            \
> >>> +})
> >>> +
> >>> +#endif
> >>> +
> >>> +#define xe_ASSERT(xe, condition) \
> >>> +       xe_ASSERT_MSG((xe), condition, "")
> >>> +
> >>> +#define xe_tile_ASSERT(tile, condition)
> >>> ({                                                     \
> >>> +       struct xe_tile *__tile =
> >>> (tile);                                                        \
> >>> +       xe_ASSERT_MSG(tile_to_xe(__tile), condition, "tile%u ",
> >>> __tile->id);                    \
> >>> +})
> >>> +
> >>> +#define xe_gt_ASSERT(gt, condition)
> >>> ({                                                         \
> >>> +       struct xe_gt *__gt =
> >>> (gt);                                                              \
> >>> +       xe_ASSERT_MSG(gt_to_xe(__gt), condition, "GT%u ", __gt-
> >>>> info.id);                       \
> >>> +})
> >>> +
> >>> +#endif /* __XE_ASSERT_H__ */
> >> 
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

  reply	other threads:[~2023-08-04 20:48 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-28 19:25 [Intel-xe] [RFC] drm/xe: Introduce xe_ASSERT macros Michal Wajdeczko
2023-07-28 19:47 ` [Intel-xe] ✓ CI.Patch_applied: success for " Patchwork
2023-07-28 19:47 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-07-28 19:47 ` [Intel-xe] [RFC] " Vivi, Rodrigo
2023-07-28 20:33   ` Michal Wajdeczko
2023-08-01  8:19     ` Jani Nikula
2023-08-04 20:47       ` Rodrigo Vivi [this message]
2023-07-28 19:48 ` [Intel-xe] ✓ CI.KUnit: success for " Patchwork
2023-07-28 19:52 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-07-28 19:52 ` [Intel-xe] ✓ CI.Hooks: " Patchwork
2023-07-28 19:53 ` [Intel-xe] ✓ CI.checksparse: " Patchwork
2023-07-28 20:26 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-08-04  3:58 ` [Intel-xe] [RFC] " Matthew Brost

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZM1j9PFKuQEt32xb@intel.com \
    --to=rodrigo.vivi@intel.com \
    --cc=francois.dugast@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox