From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Jani Nikula <jani.nikula@intel.com>,
Francois Dugast <francois.dugast@intel.com>,
intel-xe@lists.freedesktop.org
Cc: Matt Roper <matthew.d.roper@intel.com>,
Lucas De Marchi <lucas.demarchi@intel.com>,
Rodrigo Vivi <rodrigo.vivi@intel.com>
Subject: Re: [Intel-xe] [PATCH v4 2/3] drm/xe: Introduce Xe assert macros
Date: Tue, 12 Sep 2023 15:48:01 +0200 [thread overview]
Message-ID: <268cacb5-2feb-b43f-325f-5aef9cf8951f@intel.com> (raw)
In-Reply-To: <871qf3y0qa.fsf@intel.com>
On 12.09.2023 15:34, Jani Nikula wrote:
> On Tue, 12 Sep 2023, Michal Wajdeczko <michal.wajdeczko@intel.com> wrote:
>> On 12.09.2023 13:35, Jani Nikula wrote:
>>> On Tue, 12 Sep 2023, Francois Dugast <francois.dugast@intel.com> wrote:
>>>> From: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>>
>>>> As we are moving away from the controversial XE_BUG_ON macro,
>>>> relying just on WARN_ON or drm_err does not cover the cases
>>>> where we want to annotate functions with additional detailed
>>>> debug checks to assert that all prerequisites are satisfied,
>>>> without paying footprint or performance penalty on non-debug
>>>> builds, where all misuses introduced during code integration
>>>> were already fixed.
>>>>
>>>> Introduce family of Xe assert macros that try to follow classic
>>>> assert() utility and can be compiled out on non-debug builds.
>>>>
>>>> Macros are based on drm_WARN, but unlikely to origin, disallow
>>>> use in expressions since we will compile that code out.
>>>>
>>>> As we are operating on the xe pointers, we can print additional
>>>> information about the device, like tile or GT identifier, that
>>>> is not available from generic WARN report:
>>>>
>>>> [ ] xe 0000:00:02.0: [drm] Assertion `true == false` failed!
>>>> platform: 1 subplatform: 1
>>>> graphics: Xe_LP 12.00 step B0
>>>> media: Xe_M 12.00 step B0
>>>> display: enabled step D0
>>>> tile: 0 VRAM 0 B
>>>> GT: 0 type 1
>>>>
>>>> [ ] xe 0000:b3:00.0: [drm] Assertion `true == false` failed!
>>>> platform: 7 subplatform: 3
>>>> graphics: Xe_HPG 12.55 step A1
>>>> media: Xe_HPM 12.55 step A1
>>>> display: disabled step **
>>>> tile: 0 VRAM 14.0 GiB
>>>> GT: 0 type 1
>>>>
>>>> [ ] WARNING: CPU: 0 PID: 2687 at drivers/gpu/drm/xe/xe_device.c:281 xe_device_probe+0x374/0x520 [xe]
>>>> [ ] RIP: 0010:xe_device_probe+0x374/0x520 [xe]
>>>> [ ] Call Trace:
>>>> [ ] ? __warn+0x7b/0x160
>>>> [ ] ? xe_device_probe+0x374/0x520 [xe]
>>>> [ ] ? report_bug+0x1c3/0x1d0
>>>> [ ] ? handle_bug+0x42/0x70
>>>> [ ] ? exc_invalid_op+0x14/0x70
>>>> [ ] ? asm_exc_invalid_op+0x16/0x20
>>>> [ ] ? xe_device_probe+0x374/0x520 [xe]
>>>> [ ] ? xe_device_probe+0x374/0x520 [xe]
>>>> [ ] xe_pci_probe+0x6e3/0x950 [xe]
>>>> [ ] ? lockdep_hardirqs_on+0xc7/0x140
>>>> [ ] pci_device_probe+0x9e/0x160
>>>> [ ] really_probe+0x19d/0x400
>>>>
>>>> v2: use lowercase names
>>>> v3: apply xe coding style
>>>>
>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>> Cc: Oded Gabbay <ogabbay@kernel.org>
>>>> Cc: Jani Nikula <jani.nikula@intel.com>
>>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>>> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
>>>> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_assert.h | 177 +++++++++++++++++++++++++++++++++
>>>> 1 file changed, 177 insertions(+)
>>>> create mode 100644 drivers/gpu/drm/xe/xe_assert.h
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_assert.h b/drivers/gpu/drm/xe/xe_assert.h
>>>> new file mode 100644
>>>> index 000000000000..b2d3c9b82b31
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/xe/xe_assert.h
>>>> @@ -0,0 +1,177 @@
>>>> +/* SPDX-License-Identifier: MIT */
>>>> +/*
>>>> + * Copyright © 2023 Intel Corporation
>>>> + */
>>>> +
>>>> +#ifndef _XE_ASSERT_H_
>>>> +#define _XE_ASSERT_H_
>>>> +
>>>> +#include <linux/string_helpers.h>
>>>> +
>>>> +#include <drm/drm_print.h>
>>>> +
>>>> +#include "xe_device_types.h"
>>>> +#include "xe_step.h"
>>>> +
>>>> +/**
>>>> + * DOC: Xe ASSERTs
>>>> + *
>>>> + * While Xe driver aims to be simpler than legacy i915 driver it is still
>>>> + * complex enough that some changes introduced while adding new functionality
>>>> + * could break the existing code.
>>>> + *
>>>> + * Adding &drm_WARN or &drm_err to catch unwanted programming usage could lead
>>>> + * to undesired increased driver footprint and may impact production driver
>>>> + * performance as this additional code will be always present.
>>>> + *
>>>> + * To allow annotate functions with additional detailed debug checks to assert
>>>> + * that all prerequisites are satisfied, without worrying about footprint or
>>>> + * performance penalty on production builds where all potential misuses
>>>> + * introduced during code integration were already fixed, we introduce family
>>>> + * of Xe assert macros that try to follow classic assert() utility:
>>>> + *
>>>> + * * &xe_assert
>>>> + * * &xe_tile_assert
>>>> + * * &xe_gt_assert
>>>> + *
>>>> + * These macros are implemented on top of &drm_WARN, but unlikely to the origin,
>>>> + * warning is triggered when provided condition is false. Additionally all above
>>>> + * assert macros cannot be used in expressions or as a condition, since
>>>> + * underlying code will be compiled out on non-debug builds.
>>>> + *
>>>> + * Note that these macros are not intended for use to cover known gaps in the
>>>> + * implementation; for such cases use regular &drm_WARN or &drm_err and provide
>>>> + * valid safe fallback.
>>>> + *
>>>> + * Also in cases where performance or footprint is not an issue, developers
>>>> + * should continue to use the regular &drm_WARN or &drm_err to ensure that bug
>>>> + * reports from production builds will contain meagningful diagnostics data.
>>>> + *
>>>> + * Below code shows how asserts could help in debug to catch unplanned use::
>>>> + *
>>>> + * static void one_igfx(struct xe_device *xe)
>>>> + * {
>>>> + * xe_assert(xe, xe->info.is_dgfx == false);
>>>> + * xe_assert(xe, xe->info.tile_count == 1);
>>>> + * }
>>>> + *
>>>> + * static void two_dgfx(struct xe_device *xe)
>>>> + * {
>>>> + * xe_assert(xe, xe->info.is_dgfx);
>>>> + * xe_assert(xe, xe->info.tile_count == 2);
>>>> + * }
>>>> + *
>>>> + * void foo(struct xe_device *xe)
>>>> + * {
>>>> + * if (xe->info.dgfx)
>>>> + * return two_dgfx(xe);
>>>> + * return one_igfx(xe);
>>>> + * }
>>>> + *
>>>> + * void bar(struct xe_device *xe)
>>>> + * {
>>>> + * if (drm_WARN_ON(xe->drm, xe->info.tile_count > 2))
>>>> + * return;
>>>> + *
>>>> + * if (xe->info.tile_count == 2)
>>>> + * return two_dgfx(xe);
>>>> + * return one_igfx(xe);
>>>> + * }
>>>> + */
>>>> +
>>>> +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
>>>> +#define __xe_assert_msg(xe, condition, msg, arg...) ({ \
>>>> + (void)drm_WARN(&(xe)->drm, !(condition), "[" DRM_NAME "] Assertion `%s` failed!\n" msg, \
>>>> + __stringify(condition), ## arg); \
>>>> +})
>>>> +#else
>>>> +#define __xe_assert_msg(xe, condition, msg, arg...) ({ \
>>>> + typecheck(struct xe_device *, xe); \
>>>> + BUILD_BUG_ON_INVALID(condition); \
>>>> +})
>>>> +#endif
>>>> +
>>>> +/**
>>>> + * xe_assert - warn if condition is false when debugging.
>>>> + * @xe: the &struct xe_device pointer to which &condition applies
>>>> + * @condition: condition to check
>>>> + *
>>>> + * xe_assert() uses &drm_WARN to emit a warning and print additional information
>>>> + * that could be read from the &xe pointer if provided &condition is false.
>>>> + *
>>>> + * Contrary to &drm_WARN, xe_assert() is effective only on debug builds
>>>> + * (&CONFIG_DRM_XE_DEBUG must be enabled) and cannot be used in expressions
>>>> + * or as a condition.
>>>> + *
>>>> + * See `Xe ASSERTs`_ for general usage guidelines.
>>>> + */
>>>> +#define xe_assert(xe, condition) xe_assert_msg((xe), condition, "")
>>>> +#define xe_assert_msg(xe, condition, msg, arg...) ({ \
>>>> + struct xe_device *__xe = (xe); \
>>>> + __xe_assert_msg(__xe, condition, \
>>>> + "platform: %d subplatform: %d\n" \
>>>> + "graphics: %s %u.%02u step %s\n" \
>>>> + "media: %s %u.%02u step %s\n" \
>>>> + "display: %s step %s\n" \
>>>> + msg, \
>>>> + __xe->info.platform, __xe->info.subplatform, \
>>>> + __xe->info.graphics_name, \
>>>> + __xe->info.graphics_verx100 / 100, \
>>>> + __xe->info.graphics_verx100 % 100, \
>>>> + xe_step_name(__xe->info.step.graphics), \
>>>> + __xe->info.media_name, \
>>>> + __xe->info.media_verx100 / 100, \
>>>> + __xe->info.media_verx100 % 100, \
>>>> + xe_step_name(__xe->info.step.media), \
>>>> + str_enabled_disabled(__xe->info.enable_display), \
>>>> + xe_step_name(__xe->info.step.display), \
>>>> + ## arg); \
>>>> +})
>>>
>>> I guess I have missed this huge splat all along... Why is it necessary?
>>> If you print the device id, all the information should be there already,
>>> right?
>>
>> somewhere in the dmesg (if someone/CI was clever enough) maybe yes
>>
>> but in bug reports usually only the WARN is included, so exposing some
>> basic info here for quicker triage
>
> I just think it's unnecessary duplication.
but what if assert() fires before driver print welcome messages ;)
IMO we should try to collect whatever is possible and comes almost at no
extra cost (you don't need to type anything, macro takes care of that)
> Most likely this will only be
> enabled in CI only anyway.
I assume asserts will fire mostly during pre-merge testing (either on CI
or on tested manually on DUT machines)
>
>>
>>>
>>> This also makes it impossible to use xe_assert() with a NULL xe device
>>> pointer in contexts where you don't have the device available.
>>
>> there was no such requirement (but we can add that if needed)
>>
>> note that code is based on drm_WARN() which also doesn't work with NULL
>
> Ah, true. The drm_dbg and friends do.
true, so do you want xe_assert() and family to also accept NULL ?
>
>>
>> Michal
>>
>>>
>>> BR,
>>> Jani.
>>>
>>>
>>>> +
>>>> +/**
>>>> + * xe_tile_assert - warn if condition is false when debugging.
>>>> + * @tile: the &struct xe_tile pointer to which &condition applies
>>>> + * @condition: condition to check
>>>> + *
>>>> + * xe_tile_assert() uses &drm_WARN to emit a warning and print additional
>>>> + * information that could be read from the &tile pointer if provided &condition
>>>> + * is false.
>>>> + *
>>>> + * Contrary to &drm_WARN, xe_tile_assert() is effective only on debug builds
>>>> + * (&CONFIG_DRM_XE_DEBUG must be enabled) and cannot be used in expressions
>>>> + * or as a condition.
>>>> + *
>>>> + * See `Xe ASSERTs`_ for general usage guidelines.
>>>> + */
>>>> +#define xe_tile_assert(tile, condition) xe_tile_assert_msg((tile), condition, "")
>>>> +#define xe_tile_assert_msg(tile, condition, msg, arg...) ({ \
>>>> + struct xe_tile *__tile = (tile); \
>>>> + char __buf[10]; \
>>>> + xe_assert_msg(tile_to_xe(__tile), condition, "tile: %u VRAM %s\n" msg, \
>>>> + __tile->id, ({ string_get_size(__tile->mem.vram.actual_physical_size, 1, \
>>>> + STRING_UNITS_2, __buf, sizeof(__buf)); __buf; }), ## arg); \
>>>> +})
>>>> +
>>>> +/**
>>>> + * xe_gt_assert - warn if condition is false when debugging.
>>>> + * @gt: the &struct xe_gt pointer to which &condition applies
>>>> + * @condition: condition to check
>>>> + *
>>>> + * xe_gt_assert() uses &drm_WARN to emit a warning and print additional
>>>> + * information that could be safetely read from the > pointer if provided
>>>> + * &condition is false.
>>>> + *
>>>> + * Contrary to &drm_WARN, xe_gt_assert() is effective only on debug builds
>>>> + * (&CONFIG_DRM_XE_DEBUG must be enabled) and cannot be used in expressions
>>>> + * or as a condition.
>>>> + *
>>>> + * See `Xe ASSERTs`_ for general usage guidelines.
>>>> + */
>>>> +#define xe_gt_assert(gt, condition) xe_gt_assert_msg((gt), condition, "")
>>>> +#define xe_gt_assert_msg(gt, condition, msg, arg...) ({ \
>>>> + struct xe_gt *__gt = (gt); \
>>>> + xe_tile_assert_msg(gt_to_tile(__gt), condition, "GT: %u type %d\n" msg, \
>>>> + __gt->info.id, __gt->info.type, ## arg); \
>>>> +})
>>>> +
>>>> +#endif
>>>
>
next prev parent reply other threads:[~2023-09-12 13:48 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-12 8:36 [Intel-xe] [PATCH v4 0/3] Use Xe assert macros instead of XE_WARN_ON macro Francois Dugast
2023-09-12 8:36 ` [Intel-xe] [PATCH v4 1/3] drm/xe: Replace XE_WARN_ON with drm_warn when just printing a string Francois Dugast
2023-09-12 8:36 ` [Intel-xe] [PATCH v4 2/3] drm/xe: Introduce Xe assert macros Francois Dugast
2023-09-12 9:11 ` Francois Dugast
2023-09-12 10:17 ` Jani Nikula
2023-09-12 11:18 ` Michal Wajdeczko
2023-09-12 11:35 ` Jani Nikula
2023-09-12 13:18 ` Michal Wajdeczko
2023-09-12 13:34 ` Jani Nikula
2023-09-12 13:48 ` Michal Wajdeczko [this message]
2023-09-12 14:21 ` Rodrigo Vivi
2023-09-12 15:11 ` Michal Wajdeczko
2023-09-12 15:39 ` Rodrigo Vivi
2023-09-12 15:46 ` Matt Roper
2023-09-12 16:08 ` Rodrigo Vivi
2023-09-12 8:36 ` [Intel-xe] [PATCH v4 3/3] drm/xe: Use Xe assert macros instead of XE_WARN_ON macro Francois Dugast
2023-09-12 8:39 ` [Intel-xe] ✓ CI.Patch_applied: success for Use Xe assert macros instead of XE_WARN_ON macro (rev4) Patchwork
2023-09-12 8:39 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-09-12 8:40 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-09-12 8:46 ` [Intel-xe] ✗ CI.Build: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=268cacb5-2feb-b43f-325f-5aef9cf8951f@intel.com \
--to=michal.wajdeczko@intel.com \
--cc=francois.dugast@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=jani.nikula@intel.com \
--cc=lucas.demarchi@intel.com \
--cc=matthew.d.roper@intel.com \
--cc=rodrigo.vivi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox