Re: [PATCH v8 4/6] drm/xe/guc: Extract GuC error capture lists

From: "Dong, Zhanjun" <zhanjun.dong@intel.com>
To: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v8 4/6] drm/xe/guc: Extract GuC error capture lists
Date: Wed, 15 May 2024 17:55:10 -0400
Message-ID: <afea7b56-2d7d-4686-aac1-012288e1bab5@intel.com>
In-Reply-To: <d89e9233-083d-4ab3-bb8d-9fad07922864@intel.com>



On 2024-05-15 5:45 p.m., Dong, Zhanjun wrote:
> See my comments below.
> 
> Regards,
> Zhanjun Dong
> 
> On 2024-05-10 9:43 p.m., Teres Alexis, Alan Previn wrote:
>> On Mon, 2024-05-06 at 18:47 -0700, Zhanjun Dong wrote:
>>> Upon the G2H Notify-Err-Capture event, parse through the
>>> GuC Log Buffer (error-capture-subregion) and generate one or
>>> more capture-nodes. A single node represents a single "engine-
>>> instance-capture-dump" and contains at least 3 register lists:
>>> global, engine-class and engine-instance. An internal link
>>> list is maintained to store one or more nodes.
>>>
>>> Because the link-list node generation happens before the call
>>> to devcoredump, duplicate global and engine-class register
>>> lists for each engine-instance register dump if we find
>>> dependent-engine resets in an engine-capture-group.
>>>
>> alan:snip
>>> diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_guc_capture.c
>>> index d2df027081b5..71d7c4a58925 100644
>>> --- a/drivers/gpu/drm/xe/xe_guc_capture.c
>>> +++ b/drivers/gpu/drm/xe/xe_guc_capture.c
>>> @@ -520,6 +520,560 @@ static void check_guc_capture_size(struct xe_guc *guc)
>>>                            buffer_size, spare_size, capture_size);
>>>   }
>>>
>> alan:snip
>>> +static struct __guc_capture_parsed_output *
>>> +guc_capture_get_prealloc_node(struct xe_guc *guc)
>>> +{
>>> +       struct __guc_capture_parsed_output *found = NULL;
>>> +
>>> +       if (!list_empty(&guc->capture->cachelist)) {
>>> +               struct __guc_capture_parsed_output *n, *ntmp;
>>> +
>>> +               /* get first avail node from the cache list */
>>> +               list_for_each_entry_safe(n, ntmp, &guc->capture->cachelist, link) {
>>> +                       found = n;
>>> +                       list_del(&n->link);
>>> +                       break;
>>> +               }
>>> +       } else {
>>> +               struct __guc_capture_parsed_output *n, *ntmp;
>>> +
>>> +               /* traverse down and steal back the oldest node already allocated */
>>> +               list_for_each_entry_safe(n, ntmp, &guc->capture->outlist, link) {
>>> +                       found = n;
>>> +               }
>>> +               if (found)
>>> +                       list_del(&found->link);
>>> +       }
>>> +       if (found)
>>> +               guc_capture_init_node(guc, found);
>>> +
>>> +       return found;
>>> +}
>> alan: I mentioned this in rev6: you cannot start using the pre-allocated
>> nodelist anywhere in this patch when you are only allocating it in patch 6.
>> Look back at my rev 6 comments on this. Also, take a look at the original
>> i915 patch on how to implement guc_capture_alloc/delete_one_node without a
>> preallocated nodelist:
>> https://patchwork.freedesktop.org/patch/479022/?series=101604&rev=1
>> (note: watch especially for the use of new->reginfo[i].regs, which needed an
>> additional allocation step). Alternatively we could squash patch 4 and
>> patch 6 together and change patch 4's comment, but I am not sure whether
>> that would be too large a patch (can discuss offline).
> 
> Good point, let me try pre-alloc vs. GFP_ATOMIC and I will get back to you.
> 
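For discussion, a minimal userspace sketch of the node-selection policy in guc_capture_get_prealloc_node() quoted above: take the first free node from the cache list, otherwise steal the oldest node from the out list. Every mock_* name is hypothetical and only stands in for the kernel list API. It also assumes new nodes are added at the head of the out list, so the oldest one sits at the tail (this matches the "steal back the oldest node" comment, but the insertion side is not shown in the quoted hunk). Under that assumption the tail can be picked directly instead of walking the whole list.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct mock_list {
	struct mock_list *prev, *next;
};

/* Hypothetical stand-in for struct __guc_capture_parsed_output. */
struct mock_node {
	struct mock_list link;	/* first member, so a link pointer can be cast back */
	int id;
};

static void mock_list_init(struct mock_list *head)
{
	head->prev = head->next = head;
}

static int mock_list_empty(const struct mock_list *head)
{
	return head->next == head;
}

/* Insert right after the head, mirroring what list_add() does. */
static void mock_list_add(struct mock_list *entry, struct mock_list *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static void mock_list_del(struct mock_list *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->prev = entry->next = entry;
}

/*
 * Prefer a free node from the cache list; if there is none, reuse the
 * oldest node on the out list (its tail, under the assumption above).
 */
static struct mock_node *mock_get_node(struct mock_list *cachelist,
				       struct mock_list *outlist)
{
	struct mock_list *pick = NULL;

	if (!mock_list_empty(cachelist))
		pick = cachelist->next;
	else if (!mock_list_empty(outlist))
		pick = outlist->prev;

	if (!pick)
		return NULL;

	mock_list_del(pick);
	return (struct mock_node *)pick;
}
```

The caller would still have to reinitialize the returned node before reuse, as the quoted code does with guc_capture_init_node().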
>>
>>> +static int
>>> +guc_capture_extract_reglists(struct xe_guc *guc, struct __guc_capture_bufstate *buf)
>>> +{
>>> +       struct xe_gt *gt = guc_to_gt(guc);
>>> +       struct guc_state_capture_group_header_t ghdr = {0};
>>> +       struct guc_state_capture_header_t hdr = {0};
>>> +       struct __guc_capture_parsed_output *node = NULL;
>>> +       struct guc_mmio_reg *regs = NULL;
>>> +       int i, numlists, numregs, ret = 0;
>>> +       enum guc_capture_type datatype;
>>> +       struct guc_mmio_reg tmp;
>>> +       bool is_partial = false;
>> alan:snip
>>> +               if (!node) {
>>> +                       node = guc_capture_get_prealloc_node(guc);
>> alan: see above comment on the use of prealloc_node (as per rev 6's comments)
>> alan:snip
>>
>>> +static void __guc_capture_process_output(struct xe_guc *guc)
>>> +{
>>> +       unsigned int buffer_size, read_offset, write_offset, full_count;
>>> +       struct xe_uc *uc = container_of(guc, typeof(*uc), guc);
>>> +       struct guc_log_buffer_state log_buf_state_local;
>>> +       struct guc_log_buffer_state *log_buf_state;
>>> +       struct __guc_capture_bufstate buf;
>>> +       bool new_overflow;
>>> +       int ret;
>>> +       u32 log_buf_state_offset;
>>> +       u32 src_data_offset;
>>> +
>>> +       log_buf_state = (struct guc_log_buffer_state *)((ulong)guc->log.bo->vmap.vaddr +
>>> +                       (sizeof(struct guc_log_buffer_state) * GUC_CAPTURE_LOG_BUFFER));
>> alan: once again, I don't think we can use vmap.vaddr directly like this
>> anymore, right? I don't think we use "log_buf_state" until the end of this
>> function, to set the new read_ptr and flush flag. We ought to use xe_map_wr
>> below?
> Yes, need xe_map_xxx helper, as we are doing read, so should be xe_map_rd
Oops, this line only gets the pointer. It is the code at the bottom of the
function:
	/* Update the state of log buffer err-cap state */
	log_buf_state->read_ptr = write_offset;
	log_buf_state->flush_to_file = 0;
that needs to be replaced by xe_map_wr.
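
To make the intent concrete, here is a minimal userspace sketch of updating the two fields through offset-based accessors instead of dereferencing a raw pointer into the mapping. Everything named mock_* is hypothetical: the struct is a simplified stand-in for struct guc_log_buffer_state (field layout assumed), the buffer index value is assumed, and mock_map_rd32/mock_map_wr32 only imitate the driver's xe_map_rd/xe_map_wr helpers. Since flush_to_file is a bitfield, it has no offset of its own, so the sketch reads the whole flags word, clears the bit and writes the word back.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct guc_log_buffer_state (layout is an assumption). */
struct mock_log_buffer_state {
	uint32_t marker[2];
	uint32_t read_ptr;
	uint32_t write_ptr;
	uint32_t size;
	uint32_t sampled_write_ptr;
	uint32_t wrap_offset;
	uint32_t flags;		/* bit 0 stands in for flush_to_file */
	uint32_t version;
};

#define MOCK_FLUSH_TO_FILE	(1u << 0)
#define MOCK_CAPTURE_LOG_BUFFER	2	/* assumed index of the capture sub-buffer */

/* Hypothetical stand-in for the mapped log buffer object. */
struct mock_map {
	uint8_t *vaddr;
	size_t size;
};

/* Offset-based 32-bit read, imitating what an xe_map_rd() call would do. */
static uint32_t mock_map_rd32(const struct mock_map *map, size_t offset)
{
	uint32_t val;

	memcpy(&val, map->vaddr + offset, sizeof(val));
	return val;
}

/* Offset-based 32-bit write, imitating what an xe_map_wr() call would do. */
static void mock_map_wr32(struct mock_map *map, size_t offset, uint32_t val)
{
	memcpy(map->vaddr + offset, &val, sizeof(val));
}

/* Publish the new read pointer and clear the flush request for the capture buffer. */
static void mock_update_capture_state(struct mock_map *map, uint32_t write_offset)
{
	size_t state_off = sizeof(struct mock_log_buffer_state) * MOCK_CAPTURE_LOG_BUFFER;
	uint32_t flags;

	mock_map_wr32(map, state_off + offsetof(struct mock_log_buffer_state, read_ptr),
		      write_offset);

	flags = mock_map_rd32(map, state_off + offsetof(struct mock_log_buffer_state, flags));
	flags &= ~MOCK_FLUSH_TO_FILE;
	mock_map_wr32(map, state_off + offsetof(struct mock_log_buffer_state, flags), flags);
}
```

In the driver, the flags word could likely come from the local copy already filled by xe_map_memcpy_from() rather than from a second read of the mapping.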

>>> +
>>> +       log_buf_state_offset = sizeof(struct guc_log_buffer_state) * GUC_CAPTURE_LOG_BUFFER;
>>> +       src_data_offset = xe_guc_get_log_buffer_offset(&guc->log, GUC_CAPTURE_LOG_BUFFER);
>>> +
>>> +       /*
>>> +        * Make a copy of the state structure, inside GuC log buffer
>>> +        * (which is uncached mapped), on the stack to avoid reading
>>> +        * from it multiple times.
>>> +        */
>>> +       xe_map_memcpy_from(guc_to_xe(guc), &log_buf_state_local, &guc->log.bo->vmap,
>>> +                          log_buf_state_offset, sizeof(struct guc_log_buffer_state));
>> alan:snip
>>
