From: Jani Nikula <jani.nikula@linux.intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>,
"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Cc: "Dugast, Francois" <francois.dugast@intel.com>
Subject: Re: [Intel-xe] [RFC] drm/xe: Introduce xe_ASSERT macros
Date: Tue, 01 Aug 2023 11:19:07 +0300 [thread overview]
Message-ID: <87r0onw4xw.fsf@intel.com> (raw)
In-Reply-To: <5e88b39b-6c15-cae4-0a15-0dbd7ef594b4@intel.com>
On Fri, 28 Jul 2023, Michal Wajdeczko <michal.wajdeczko@intel.com> wrote:
> On 28.07.2023 21:47, Vivi, Rodrigo wrote:
>> On Fri, 2023-07-28 at 21:25 +0200, Michal Wajdeczko wrote:
>>> As we are moving away from the controversial XE_BUG_ON macro,
>>> relying just on WARN_ON or drm_err does not cover the cases
>>> where we want to annotate functions with additional detailed
>>> debug checks to assert that all prerequisites are satisfied,
>>> without paying footprint or performance penalty on non-debug
>>> builds, where all misuses introduced during code integration
>>> were already fixed.
>>>
>>> Introduce family of xe_ASSERT macros that try to follow classic
>>> assert() utility and can be compiled out on non-debug builds.
>>
>> This is indeed much better and clear then the XE_BUG_ON.
>>
>> Also, let me be clear that my intention with reply is not to
>> block the introduction of this assert macro.
>
> thanks
>
>>
>> But I'd like to raise my concern(s?) with such assert macros:
>>
>> In many cases that we currently use these macros are actually
>> very useful on production build. Whenever we receive bug reports
>> we should be able to ask the reporter to get a dmesg or enable
>> a drm.debug and get a message and we could see these, without
>> ever having to ask the report to rebuild the kernel with a debug
>> config.
>
> you should think of xe_assert() as extension to static_assert() but
> aimed to be executed during internal CI builds only, when we run basic
> integration tests, and were we are able to quickly catch all misuses,
> caused by rapid or multiple changes going in.
>
>>
>> Then, we add this in the code and we start using this everywhere
>> without stopping to think if that would be good in a report and
>> we end up with a code full of asserts and big problem to get the
>> useful information from the bug reports.
>
> those asserts shouldn't fire beyond integration builds - if someone uses
> assert() to tag unfinished work, then it's simply wrong, in such cases
> true WARN and correct fallback shall be implemented - and that's the
> role of the reviewers to spot it ;)
Naming this ASSERT is the first step towards proper usage, because it
implies it might be different for regular and debug builds.
The second step would be adding a comment documenting the usage.
I think in general the idea behind the conditional GEM_BUG_ONs was to
avoid the performance impact of leaving them in place. But then we've
started sprinkling GEM_BUG_ON indiscriminately, because you didn't have
to think about performance. 1200+ instances of GEM_BUG_ON in i915. It's
just crazy.
>
>>
>> Aside from the fact that we end up going far from other drm and
>> non-drm kernel drivers and starting from day 0 a driver full of
>> driver-isms.
>
> maybe because our driver(s) are bigger and more complex ? also much more
> developers are working on it, with multiple refactoring going constantly
> on, so we should have mechanism to allow us catching unexpected uses
> without polluting production driver too much.
>
> btw, DRM already has assert() but named as bug()
Do git blame on that and realize it originated from us following the
same pattern as GEM_BUG_ON, which got copy-pasted as XE_BUG_ON. It's
just problematic all around, and upstream consensus is that BUG_ON() is
not even for CI. Use warn and panic_on_warn instead.
BR,
Jani.
>
> [1] https://elixir.bootlin.com/linux/latest/C/ident/DRM_MM_BUG_ON
>
>>
>> I would prefer we have a driver where we make usage of the all
>> the drm error, warning, debug, info variants on a case by case,
>> carefully analyzing the needs and thinking about the bug reports
>> that we might receive from end users.
>
> +1
>
> it would be also nice if driver is correctly handling all errors or
> invalid input paths, but at the same time, it shouldn't duplicate checks
> that shall be already done at higher layer/caller - with asserts() we
> just express our expectations without ignoring other bugs by trying
> handle that unexpected case down the stack
>
>>
>>>
>>> Macros are based on drm_WARN, but unlikely to origin, disallow
>>> use in expressions since we will compile that code out.
>>>
>>> As we are operating on the xe pointers, we can print additional
>>> information about the device, like tile or GT identifier, that
>>> is not available from generic WARN report:
>>>
>>> [ ] xe 0000:00:02.0: [drm] Assertion `false` failed!
>>> graphics: Xe_LP 12.0 media: Xe_M 12.0 GT0
>>> [ ] WARNING: CPU: 4 PID: 8442 at drivers/gpu/drm/xe/xe_device.c:280
>>> xe_device_probe+0x2f5/0x4a0 [xe]
>>> [ ] RIP: 0010:xe_device_probe+0x2f5/0x4a0 [xe]
>>>
>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> Cc: Francois Dugast <francois.dugast@intel.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_assert.h | 47
>>> ++++++++++++++++++++++++++++++++++
>>> 1 file changed, 47 insertions(+)
>>> create mode 100644 drivers/gpu/drm/xe/xe_assert.h
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_assert.h
>>> b/drivers/gpu/drm/xe/xe_assert.h
>>> new file mode 100644
>>> index 000000000000..65306768d637
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/xe/xe_assert.h
>>> @@ -0,0 +1,47 @@
>>> +/* SPDX-License-Identifier: MIT */
>>> +/*
>>> + * Copyright © 2023 Intel Corporation
>>> + */
>>> +
>>> +#ifndef __XE_ASSERT_H__
>>> +#define __XE_ASSERT_H__
>>> +
>>> +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
>>> +
>>> +#define xe_ASSERT_MSG(xe, condition, fmt, arg...)
>>> ({ \
>>> + struct xe_device *__xe =
>>> (xe); \
>>> + int __passed =
>>> !!(condition);
>>> \
>>> + drm_WARN(&__xe->drm, !__passed, "[" DRM_NAME "] Assertion
>>> `%s` failed!\n" \
>>> + "graphics: %s %u.%u media: %s %u.%u "
>>> fmt, \
>>> +
>>> __stringify(condition),
>>> \
>>> + __xe-
>>>> info.graphics_name,
>>> \
>>> + __xe->info.graphics_verx100 / 100, __xe-
>>>> info.graphics_verx100 % 100, \
>>> + __xe-
>>>> info.media_name,
>>> \
>>> + __xe->info.media_verx100 / 100, __xe-
>>>> info.media_verx100 % 100, \
>>> + ##
>>> arg);
>>> \
>>> + (void)(__passed);
>>> \
>>> +})
>>> +
>>> +#else
>>> +
>>> +#define xe_ASSERT_MSG(xe, condition, fmt, arg...)
>>> ({ \
>>> + typecheck(struct xe_device *,
>>> xe); \
>>> + BUILD_BUG_ON_INVALID(condition);
>>> \
>>> +})
>>> +
>>> +#endif
>>> +
>>> +#define xe_ASSERT(xe, condition) \
>>> + xe_ASSERT_MSG((xe), condition, "")
>>> +
>>> +#define xe_tile_ASSERT(tile, condition)
>>> ({ \
>>> + struct xe_tile *__tile =
>>> (tile); \
>>> + xe_ASSERT_MSG(tile_to_xe(__tile), condition, "tile%u ",
>>> __tile->id); \
>>> +})
>>> +
>>> +#define xe_gt_ASSERT(gt, condition)
>>> ({ \
>>> + struct xe_gt *__gt =
>>> (gt); \
>>> + xe_ASSERT_MSG(gt_to_xe(__gt), condition, "GT%u ", __gt-
>>>> info.id); \
>>> +})
>>> +
>>> +#endif /* __XE_ASSERT_H__ */
>>
--
Jani Nikula, Intel Open Source Graphics Center
next prev parent reply other threads:[~2023-08-01 8:19 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-28 19:25 [Intel-xe] [RFC] drm/xe: Introduce xe_ASSERT macros Michal Wajdeczko
2023-07-28 19:47 ` [Intel-xe] ✓ CI.Patch_applied: success for " Patchwork
2023-07-28 19:47 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-07-28 19:47 ` [Intel-xe] [RFC] " Vivi, Rodrigo
2023-07-28 20:33 ` Michal Wajdeczko
2023-08-01 8:19 ` Jani Nikula [this message]
2023-08-04 20:47 ` Rodrigo Vivi
2023-07-28 19:48 ` [Intel-xe] ✓ CI.KUnit: success for " Patchwork
2023-07-28 19:52 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-07-28 19:52 ` [Intel-xe] ✓ CI.Hooks: " Patchwork
2023-07-28 19:53 ` [Intel-xe] ✓ CI.checksparse: " Patchwork
2023-07-28 20:26 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-08-04 3:58 ` [Intel-xe] [RFC] " Matthew Brost
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87r0onw4xw.fsf@intel.com \
--to=jani.nikula@linux.intel.com \
--cc=francois.dugast@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=michal.wajdeczko@intel.com \
--cc=rodrigo.vivi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox