From: John Harrison <john.c.harrison@intel.com>
To: "Cavitt, Jonathan" <jonathan.cavitt@intel.com>,
"Intel-Xe@Lists.FreeDesktop.Org" <Intel-Xe@Lists.FreeDesktop.Org>
Subject: Re: [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target
Date: Thu, 12 Jun 2025 11:47:00 -0700 [thread overview]
Message-ID: <733f6db7-ce2a-4f67-9c66-d44702a03ab0@intel.com> (raw)
In-Reply-To: <CH0PR11MB5444E69BE6FC406E515DC61BE574A@CH0PR11MB5444.namprd11.prod.outlook.com>
On 6/12/2025 11:27 AM, Cavitt, Jonathan wrote:
> ----Original Message-----
> From: Harrison, John C <john.c.harrison@intel.com>
> Sent: Thursday, June 12, 2025 10:50 AM
> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; Intel-Xe@Lists.FreeDesktop.Org
> Subject: Re: [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target
>> On 6/12/2025 7:32 AM, Cavitt, Jonathan wrote:
>>> -----Original Message-----
>>> From: Harrison, John C <john.c.harrison@intel.com>
>>> Sent: Wednesday, June 11, 2025 4:57 PM
>>> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; Intel-Xe@Lists.FreeDesktop.Org
>>> Subject: Re: [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target
>>>> On 6/11/2025 3:04 PM, Cavitt, Jonathan wrote:
>>>>> -----Original Message-----
>>>>> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of John.C.Harrison@Intel.com
>>>>> Sent: Wednesday, June 11, 2025 2:06 PM
>>>>> To: Intel-Xe@Lists.FreeDesktop.Org
>>>>> Cc: Harrison, John C <john.c.harrison@intel.com>
>>>>> Subject: [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target
>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>
>>>>>> The GuC has an option to write log data via NPK. This is basically a
>>>>>> magic IO address that GuC writes arbitrary data to and which can be
>>>>>> logged by a suitable hardware logger. This can allow retrieval of the
>>>>>> GuC log in hardware debug environments even when the system as a whole
>>>>>> dies horribly.
>>>>>>
>>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> So, this is basically a new modparam value that redirects GuC logs to
>>>>> a specific IO address? I take it guc_log_target = 2 is the default value, and
>>>>> guc_log_target = 1 would print the logs to stdout, then? I'd ask why we
>>>>> use 0 as a default value and not just default to 2 all the time, but I think I
>>>>> already know why (we need to guard against guc_log_target = 0 anyways
>>>>> to prevent printing to stdin).
>>>> Um, read the patch - "(0=memory [default], 1 = NPK, 2 = memory + NPK)".
>>>> The default is zero. And no, nothing prints to stdout. This is about
>>>> hardware level debugging. It has nothing to do with stdin/stdout/stderr.
>>>> Those concepts do not exist in hardware nor in the KMD. If you send the
>>>> GuC log to the NPK target then you need a hardware debugger (JTAG, etc.)
>>>> to read it, as described in the commit message.
>>> Ah, okay. When I read "This is basically a magic IO address that GuC writes
>>> arbitrary data to", I thought that was indicating that we're writing to the
>>> in/out/err IO addresses, and that the data being logged "by a suitable
>> You say "the in/out/err IO addresses" like there is such a thing.
>> In/out/err generally refers to stdin/stdout/stderr, being the default
>> first three file handles of a unix process. File handles are not IO
>> addresses. Such file handles also do not exist in the kernel. And they
>> absolutely do not exist inside the GT hardware. My comment was referring
> Okay, right. The last time the distinction between I/O as a device concept
> and I/O as an interface concept was relevant to me was about 8 years ago,
> so I can understand how I got confused there.
>
>> to memory mapped IO addresses, i.e. hardware registers. On a normal
>> system, there is nothing connected to said hardware so the logged output
>> goes nowhere. However, if you have a hardware debugger attached to the
>> system then it can trap those accesses and record the log.
> You know, sometimes I forget that the intended customers for these
> products are major companies that end up shoving these cards into
> giant server racks and not PC users with screens and keyboards.
>
>>> hardware logger" was occurring separately. I guess I also thought NPK
>>> was a writing protocol and not a hardware address.
>> NPK is neither a protocol nor an address. It is a block of silicon
>> called North Peak, also known as the Intel Trace Hub.
> "The GuC has an option to write log data via NPK. This is basically a magic IO address..."
>
> If NPK is "a block of silicon" and not an IO address, then perhaps this would be better
> worded as:
>
> "The GuC has the option to write log data to NPK, which is basically a block of silicon..."
NPK is a block which implements a register which is written to by GuC as
an MMIO access. The point of saying 'basically' is because this is very
simple view - GuC writes to a register. All else is irrelevant. If you
want the full details then there is a white paper about it. But
generally speaking, if you don't know already then it isn't relevant to
you because you don't have the hundreds of thousands of dollars of
hardware required to make use of it. So complicated descriptions are
pointless.
>>> It didn't occur to me that we'd need to directly write the logs to the
>>> hardware logger in cases of catastrophic failure because we already have
>>> methods of streaming the logs directly via the serial ports. Though I
>>> suppose that we're talking about different "logs" at this point?
>> I don't know what logs you are talking about. This patch is quite
>> clearly only talking about the GuC log. Which is generally accessible
>> via debugfs snapshots or as part of a devcoredump capture.
>> It is not ever 'streamed directly via the serial ports'.
> I was referring to the dmesg logs at the time, though I will admit that I forgot
> there were classes of logs that never get printed to dmesg. I don't
> personally agree with the practice and think that all relevant logs should
> be printed to dmesg if possible, even if only at certain debug levels or
Sure, lets write to dmesg from inside the GT hardware. I'm sure we can
add that in to the next product...
DMesg is for super important kernel messages. Sometimes, it is the only
way to debug kernel related problems because the system dies and
userland is no longer functional. But it is absolutely not meant to be
the default output method for all possible logging systems. For example,
the ftrace mechanism exists precisely because dmesg is not the right way
to log many things.
> upon direct request. However, I can at least see why we'd want to store those
> separately in the NPK silicon block in case of catastrophic failure given the files
> they normally get saved to are wiped on system reset.
Nothing is stored. NPK is simply a transport mechanism here. If there is
nothing connected to it then it goes nowhere. /dev/null if you prefer.
And there is no file to get wiped. The normal target for GuC logging is
a memory buffer. Any access via a debugfs or sysfs 'file' is just code
being run inside the kernel to read that memory buffer and return it to
the user.
John.
>
>> John.
>>
>>> The reviewed-by still stands.
>>> -Jonathan Cavitt
>>>
>>>>> I also take it this is modified on boot by, for example, writing
>>>>> "xe.guc_log_target=1" to CMDLINE_LINUX_DEFAULT as a part of the grub file.
>>>> That is generally how module parameters work. You can also set via
>>>> modprobe.conf files as long as the Xe driver is a module and not
>>>> compiled in.
>>>>
>>>> John.
>>>>
>>>>> Yeah, seems good.
>>>>> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
>>>>> -Jonathan Cavitt
>>>>>
>>>>>> ---
>>>>>> drivers/gpu/drm/xe/xe_guc.c | 4 ++++
>>>>>> drivers/gpu/drm/xe/xe_module.c | 4 ++++
>>>>>> drivers/gpu/drm/xe/xe_module.h | 1 +
>>>>>> 3 files changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
>>>>>> index e16d19b44bcc..9c0e3113f7d5 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_guc.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc.c
>>>>>> @@ -35,6 +35,7 @@
>>>>>> #include "xe_guc_submit.h"
>>>>>> #include "xe_memirq.h"
>>>>>> #include "xe_mmio.h"
>>>>>> +#include "xe_module.h"
>>>>>> #include "xe_platform_types.h"
>>>>>> #include "xe_sriov.h"
>>>>>> #include "xe_uc.h"
>>>>>> @@ -74,6 +75,9 @@ static u32 guc_ctl_debug_flags(struct xe_guc *guc)
>>>>>> else
>>>>>> flags |= FIELD_PREP(GUC_LOG_VERBOSITY, GUC_LOG_LEVEL_TO_VERBOSITY(level));
>>>>>>
>>>>>> + if (xe_modparam.guc_log_target)
>>>>>> + flags |= FIELD_PREP(GUC_LOG_DESTINATION, xe_modparam.guc_log_target);
>>>>>> +
>>>>>> return flags;
>>>>>> }
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
>>>>>> index 1c4dfafbcd0b..fc8c681819b9 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_module.c
>>>>>> +++ b/drivers/gpu/drm/xe/xe_module.c
>>>>>> @@ -21,6 +21,7 @@
>>>>>> struct xe_modparam xe_modparam = {
>>>>>> .probe_display = true,
>>>>>> .guc_log_level = 3,
>>>>>> + .guc_log_target = 0,
>>>>>> .force_probe = CONFIG_DRM_XE_FORCE_PROBE,
>>>>>> #ifdef CONFIG_PCI_IOV
>>>>>> .max_vfs = IS_ENABLED(CONFIG_DRM_XE_DEBUG) ? ~0 : 0,
>>>>>> @@ -45,6 +46,9 @@ MODULE_PARM_DESC(vram_bar_size, "Set the vram bar size (in MiB) - <0=disable-res
>>>>>> module_param_named(guc_log_level, xe_modparam.guc_log_level, int, 0600);
>>>>>> MODULE_PARM_DESC(guc_log_level, "GuC firmware logging level (0=disable, 1..5=enable with verbosity min..max)");
>>>>>>
>>>>>> +module_param_named(guc_log_target, xe_modparam.guc_log_target, int, 0600);
>>>>>> +MODULE_PARM_DESC(guc_log_target, "GuC firmware logging target (0=memory [default], 1 = NPK, 2 = memory + NPK)");
>>>>>> +
>>>>>> module_param_named_unsafe(guc_firmware_path, xe_modparam.guc_firmware_path, charp, 0400);
>>>>>> MODULE_PARM_DESC(guc_firmware_path,
>>>>>> "GuC firmware path to use instead of the default one");
>>>>>> diff --git a/drivers/gpu/drm/xe/xe_module.h b/drivers/gpu/drm/xe/xe_module.h
>>>>>> index 5a3bfea8b7b4..4d978f6f26b6 100644
>>>>>> --- a/drivers/gpu/drm/xe/xe_module.h
>>>>>> +++ b/drivers/gpu/drm/xe/xe_module.h
>>>>>> @@ -14,6 +14,7 @@ struct xe_modparam {
>>>>>> bool probe_display;
>>>>>> u32 force_vram_bar_size;
>>>>>> int guc_log_level;
>>>>>> + int guc_log_target;
>>>>>> char *guc_firmware_path;
>>>>>> char *huc_firmware_path;
>>>>>> char *gsc_firmware_path;
>>>>>> --
>>>>>> 2.49.0
>>>>>>
>>>>>>
>>
next prev parent reply other threads:[~2025-06-12 18:47 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-11 21:05 [PATCH 0/2] Clean up of GuC init data macros John.C.Harrison
2025-06-11 21:05 ` [PATCH 1/2] drm/xe/guc: Clean up of GuC 'CTL' defines John.C.Harrison
2025-06-11 22:04 ` Cavitt, Jonathan
2025-06-12 22:01 ` Lucas De Marchi
2025-06-11 21:05 ` [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target John.C.Harrison
2025-06-11 21:49 ` Lucas De Marchi
2025-06-11 23:51 ` John Harrison
2025-06-12 22:05 ` Lucas De Marchi
2025-06-12 23:43 ` John Harrison
2025-06-11 22:04 ` Cavitt, Jonathan
2025-06-11 23:57 ` John Harrison
2025-06-12 14:32 ` Cavitt, Jonathan
2025-06-12 17:49 ` John Harrison
2025-06-12 18:27 ` Cavitt, Jonathan
2025-06-12 18:47 ` John Harrison [this message]
2025-06-12 19:52 ` Cavitt, Jonathan
2025-06-12 19:30 ` Summers, Stuart
2025-06-11 21:11 ` ✓ CI.checkpatch: success for Clean up of GuC init data macros Patchwork
2025-06-11 21:12 ` ✓ CI.KUnit: " Patchwork
2025-06-11 21:23 ` ✓ CI.Build: " Patchwork
2025-06-11 21:25 ` ✓ CI.Hooks: " Patchwork
2025-06-11 21:27 ` ✓ CI.checksparse: " Patchwork
2025-06-11 22:19 ` ✓ Xe.CI.BAT: " Patchwork
2025-06-12 8:01 ` ✗ Xe.CI.Full: failure " Patchwork
-- strict thread matches above, loose matches on Subject: below --
2025-07-23 21:20 [PATCH 0/2] Clean up of GuC init data macros & add extra log option John.C.Harrison
2025-07-23 21:20 ` [PATCH 2/2] drm/xe/guc: Add support for NPK as a GuC log target John.C.Harrison
2025-07-24 16:01 ` Lucas De Marchi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=733f6db7-ce2a-4f67-9c66-d44702a03ab0@intel.com \
--to=john.c.harrison@intel.com \
--cc=Intel-Xe@Lists.FreeDesktop.Org \
--cc=jonathan.cavitt@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox