public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: John Harrison <john.c.harrison@intel.com>,
	"Ceraolo Spurio, Daniele" <daniele.ceraolospurio@intel.com>,
	Intel-GFX@Lists.FreeDesktop.Org
Cc: DRI-Devel@Lists.FreeDesktop.Org, Rodrigo Vivi <rodrigo.vivi@intel.com>
Subject: Re: [Intel-gfx] [PATCH 1/2] drm/i915/gt: Add GT oriented dmesg output
Date: Wed, 9 Nov 2022 11:05:36 +0000	[thread overview]
Message-ID: <ad19d7ce-4102-4f8f-903d-7390b004b2e9@linux.intel.com> (raw)
In-Reply-To: <c9742b0f-546f-cccc-021a-7bad68410838@intel.com>


On 08/11/2022 20:15, John Harrison wrote:
> On 11/8/2022 01:01, Tvrtko Ursulin wrote:
>> On 07/11/2022 19:14, John Harrison wrote:
>>> On 11/7/2022 08:17, Tvrtko Ursulin wrote:
>>>> On 07/11/2022 09:33, Tvrtko Ursulin wrote:
>>>>> On 05/11/2022 01:03, Ceraolo Spurio, Daniele wrote:
>>>>>> On 11/4/2022 10:25 AM, John.C.Harrison@Intel.com wrote:
>>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>>
>>>>>>> When trying to analyse bug reports from CI, customers, etc. it 
>>>>>>> can be
>>>>>>> difficult to work out exactly what is happening on which GT in a
>>>>>>> multi-GT system. So add GT oriented debug/error message wrappers. If
>>>>>>> used instead of the drm_ equivalents, you get the same output but 
>>>>>>> with
>>>>>>> a GT# prefix on it.
>>>>>>>
>>>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>>>
>>>>>> The only downside to this is that we'll print "GT0: " even on 
>>>>>> single-GT devices. We could introduce a gt->info.name and print 
>>>>>> that, so we could have it different per-platform, but IMO it's not 
>>>>>> worth the effort.
>>>>>>
>>>>>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>>>>
>>>>>> I think it might be worth getting an ack from one of the 
>>>>>> maintainers to make sure we're all aligned on transitioning to 
>>>>>> these new logging macro for gt code.
>>>>>
>>>>> Idea is I think a very good one. First I would suggest 
>>>>> standardising to lowercase GT in logs because:
>>>>>
>>>>> $ grep "GT%" i915/ -r
>>>>> $ grep "gt%" i915/ -r
>>>>> i915/gt/intel_gt_sysfs.c: gt->i915->sysfs_gt, "gt%d", gt->info.id))
>>>>> i915/gt/intel_gt_sysfs.c:                "failed to initialize gt%d 
>>>>> sysfs root\n", gt->info.id);
>>>>> i915/gt/intel_gt_sysfs_pm.c:                     "failed to create 
>>>>> gt%u RC6 sysfs files (%pe)\n",
>>>>> i915/gt/intel_gt_sysfs_pm.c: "failed to create gt%u RC6p sysfs 
>>>>> files (%pe)\n",
>>>>> i915/gt/intel_gt_sysfs_pm.c:                     "failed to create 
>>>>> gt%u RPS sysfs files (%pe)",
>>>>> i915/gt/intel_gt_sysfs_pm.c:                     "failed to create 
>>>>> gt%u punit_req_freq_mhz sysfs (%pe)",
>>>>> i915/gt/intel_gt_sysfs_pm.c: "failed to create gt%u throttle sysfs 
>>>>> files (%pe)",
>>>>> i915/gt/intel_gt_sysfs_pm.c: "failed to create gt%u 
>>>>> media_perf_power_attrs sysfs (%pe)\n",
>>>>> i915/gt/intel_gt_sysfs_pm.c:                     "failed to add 
>>>>> gt%u rps defaults (%pe)\n",
>>>>> i915/i915_driver.c: drm_err(&gt->i915->drm, "gt%d: intel_pcode_init 
>>>>> failed %d\n", id, ret);
>>>>> i915/i915_hwmon.c:              snprintf(ddat_gt->name, 
>>>>> sizeof(ddat_gt->name), "i915_gt%u", i);
>>>>>
>>>
>>> Just because there are 11 existing instances of one form doesn't mean 
>>> that the 275 instances that are waiting to be converted should be 
>>> done incorrectly. GT is an acronym and should be capitalised.
>>
>> Okay just make it consistent then.
>>
>>> Besides:
>>> grep -r "GT " i915 | grep '"'
>>> i915/vlv_suspend.c:             drm_err(&i915->drm, "timeout 
>>> disabling GT waking\n");
>>> i915/vlv_suspend.c:                     "timeout waiting for GT wells 
>>> to go %s\n",
>>> i915/vlv_suspend.c:     drm_dbg(&i915->drm, "GT register access while 
>>> GT waking disabled\n");
>>> i915/i915_gpu_error.c:  err_printf(m, "GT awake: %s\n", 
>>> str_yes_no(gt->awake));
>>> i915/i915_debugfs.c:    seq_printf(m, "GT awake? %s [%d], %llums\n",
>>> i915/selftests/i915_gem_evict.c: pr_err("Failed to idle GT (on %s)", 
>>> engine->name);
>>> i915/intel_uncore.c:                  "GT thread status wait timed 
>>> out\n");
>>> i915/gt/uc/selftest_guc_multi_lrc.c: drm_err(&gt->i915->drm, "GT 
>>> failed to idle: %d\n", ret);
>>> i915/gt/uc/selftest_guc.c: drm_err(&gt->i915->drm, "GT failed to 
>>> idle: %d\n", ret);
>>> i915/gt/uc/selftest_guc.c: drm_err(&gt->i915->drm, "GT failed to 
>>> idle: %d\n", ret);
>>> i915/gt/intel_gt_mcr.c: * Some GT registers are designed as 
>>> "multicast" or "replicated" registers:
>>> i915/gt/selftest_rps.c:                 pr_info("%s: rps counted %d 
>>> C0 cycles [%lldns] in %lldns [%d cycles], using GT clock frequency of 
>>> %uKHz\n",
>>> i915/gt/selftest_hangcheck.c:                   pr_err("[%s] GT is 
>>> wedged!\n", engine->name);
>>> i915/gt/selftest_hangcheck.c:           pr_err("GT is wedged!\n");
>>> i915/gt/intel_gt_clock_utils.c:                 "GT clock frequency 
>>> changed, was %uHz, now %uHz!\n",
>>> i915/gt/selftest_engine_pm.c:           pr_err("Unable to flush GT pm 
>>> before test\n");
>>> i915/gt/selftest_engine_pm.c: pr_err("GT failed to idle\n");
>>> i915/i915_sysfs.c:                       "failed to register GT sysfs 
>>> directory\n");
>>> i915/intel_uncore.h:     * of the basic non-engine GT registers 
>>> (referred to as "GSI" on
>>> i915/intel_uncore.h:     * newer platforms, or "GT block" on older 
>>> platforms)?  If so, we'll
>>>
>>>
>>>
>>>>> Then there is a question of naming. Are we okay with GT_XXX or, do 
>>>>> we want intel_gt_, or something completely different. I don't have 
>>>>> a strong opinion at the moment so I'll add some more folks to Cc.
>>>>
>>> You mean GT_ERR("msg") vs intel_gt_err("msg")? Personally, I would 
>>> prefer just gt_err("msg") to keep it as close to the official drm_* 
>>> versions as possible. Print lines tend to be excessively long 
>>> already. Taking a 'gt' parameter instead of a '&gt->i915->drm' 
>>> parameter does help with that but it seems like calling the wrapper 
>>> intel_gt_* is shooting ourselves in the foot on that one. And GT_ERR 
>>> vs gt_err just comes down to the fact that it is a macro wrapper and 
>>> therefore is required to be in upper case.
>>>
>>>> There was a maintainer level mini-discussion on this topic which I 
>>>> will try to summarise.
>>>>
>>>> Main contention point was the maintenance cost and generally an 
>>>> undesirable pattern of needing to add many 
>>>> subsystem/component/directory specific macros. Which then typically 
>>>> need extra flavours and so on. But over verbosity of the 
>>> How many versions are you expecting to add? Beyond the tile instance, 
>>> what further addressing requirements are there? The card instance is 
>>> already printed as part of the PCI address. The only other reason to 
>>> add per component wrappers would be to wrap the mechanism for getting 
>>> from some random per component object back to the intel_gt structure. 
>>> But that is hardware a new issue being added by this wrapper. It is 
>>> also not a requirement. Much of the code has a gt pointer already. 
>>> For the parts that don't, some of it would be a trivial engine->gt 
>>> type dereference, some of it is a more complex container_of type 
>>> construction. But for those, the given file will already have 
>>> multiple instances of that already (usually as the first or second 
>>> line of the function - 'intel_gt *gt = fancy_access_method(my_obj)' 
>>> so adding one or two more of those as necessary is not making the 
>>> code harder to read.
>>>
>>>> code is obviously also bad, so one compromise idea was to add a 
>>>> macro which builds the GT string and use drm logging helpers 
>>>> directly. This would be something like:
>>>>
>>>>  drm_err(GT_LOG("something went wrong ret=%d\n", gt), ret);
>>>>  drm_info(GT_LOG(...same...));
>>> Seriously? As above, some of these lines are already way too long, 
>>> this version makes them even longer with no obvious benefit. Worse, 
>>> it makes it harder to read what is going on. It is much less 
>>> intuitive to read than just replacing the drm_err itself. And having 
>>> two sets of parenthesis with some parameters inside the first and 
>>> some only inside the second is really horrid! Also, putting the 'gt' 
>>> parameter in the middle just confuses it with the rest of the printf 
>>> arguments even though there is no %d in the string for it. So now a 
>>> quick glances tells you that your code is wrong because you have 
>>> three format specifiers but four parameters.
>>>
>>> Whereas, just replacing drm_err with gt_err (or GT_ERR or 
>>> intel_gt_err) keeps everything else consistent. The first parameter 
>>> changes from 'drm' to 'gt' but is still the master object parameter 
>>> and it matches the function/macro prefix so inherently looks correct. 
>>> Then you have your message plus parameters. No confusing orders, no 
>>> confusing parenthesis, no excessive macro levels, no confusion at 
>>> all. Just nice simple, easy to read, easy to maintain code.
>>
>> I am personally okay with gt_err/GT_ERR some other folks might object 
>> though. And I can also understand the argument why it is better to not 
>> have to define gt_err, gt_warn, gt_info, gt_notice, gt_debug, 
>> gt_err_ratelimited, gt_warn_once.. and instead have only one macro.
> A small set of trivial macro definitions vs a complicated and unreadable 
> construct on every single print? Erm, isn't that the very definition of 
> abstracting to helpers as generally required by every code review ever?
> 
> And what 'other folks might object'? People already CC'd? People outside 
> of i915?
> 
> 
>>
>> Because of that I was passing on to you the compromise option.
>>
>> It maybe still has net space savings since we wouldn't have to be 
>> repeating the gt->i915->drm whatever and gt->info.id on every line.
>>
>> You are free to try the most compact one and see how hard those 
>> objections will be.
> Um. I already did. This patch. And you are the only person to have 
> objected in any manner at all.

Where I have objected?

I was a) asking to convert all gt/ within one kernel release, b) 
transferring the maintainer discussion from IRC to this email chain to 
outlay one alternative, for which I said I could see the pros and cons 
of both, and c) raised the naming question early since that can usually 
become a churn point later on when we have large scale code transformations.

As said, FWIW you have my ack for GT_XXX naming and approach, but please 
do convert the whole of gt/ so we don't ship with a mish-mash of log 
messages.

Regards,

Tvrtko

  reply	other threads:[~2022-11-09 11:05 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04 17:25 [Intel-gfx] [PATCH 0/2] Add GT oriented dmesg output John.C.Harrison
2022-11-04 17:25 ` [Intel-gfx] [PATCH 1/2] drm/i915/gt: " John.C.Harrison
2022-11-05  1:03   ` Ceraolo Spurio, Daniele
2022-11-07  9:33     ` Tvrtko Ursulin
2022-11-07 16:17       ` Tvrtko Ursulin
2022-11-07 19:14         ` John Harrison
2022-11-08  9:01           ` Tvrtko Ursulin
2022-11-08 20:15             ` John Harrison
2022-11-09 11:05               ` Tvrtko Ursulin [this message]
2022-11-09 17:46                 ` John Harrison
2022-11-09 19:57                   ` Michal Wajdeczko
2022-11-10  9:55                     ` Tvrtko Ursulin
2022-11-10 10:35                       ` Michal Wajdeczko
2022-11-15 10:23                         ` Tvrtko Ursulin
2022-11-10 10:33                     ` Jani Nikula
2022-11-14 22:10                       ` John Harrison
2022-11-10  9:43                   ` Tvrtko Ursulin
2022-11-14 22:14                     ` John Harrison
2022-11-04 17:25 ` [Intel-gfx] [PATCH 2/2] drm/i915/uc: Update the gt/uc code to use GT_ERR and friends John.C.Harrison
2022-11-04 17:47 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Add GT oriented dmesg output Patchwork
2022-11-04 18:04 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-11-05  9:24 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ad19d7ce-4102-4f8f-903d-7390b004b2e9@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=DRI-Devel@Lists.FreeDesktop.Org \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=daniele.ceraolospurio@intel.com \
    --cc=john.c.harrison@intel.com \
    --cc=rodrigo.vivi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox