From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
To: "Somalapuram, Amaranath" <asomalap@amd.com>,
Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>,
amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com, christian.koenig@amd.com,
shashank.sharma@amd.com
Subject: Re: [PATCH v4 2/2] drm/amdgpu: add reset register dump trace on GPU reset
Date: Wed, 16 Feb 2022 10:40:29 -0500
Message-ID: <0cddf8b7-39b1-ae3c-6b3e-c5946d4f96cc@amd.com>
In-Reply-To: <bd8ad3f6-ce54-6498-5b79-74e4a5457492@amd.com>
On 2022-02-16 05:46, Somalapuram, Amaranath wrote:
>
> On 2/15/2022 10:09 PM, Andrey Grodzovsky wrote:
>>
>> On 2022-02-15 05:12, Somalapuram Amaranath wrote:
>>> Dump the list of register values to trace event on GPU reset.
>>>
>>> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 17 ++++++++++++++++-
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 16 ++++++++++++++++
>>> 2 files changed, 32 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 1e651b959141..ff21262c6fea 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -4534,6 +4534,19 @@ int amdgpu_device_pre_asic_reset(struct
>>> amdgpu_device *adev,
>>> return r;
>>> }
>>> +static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev)
>>> +{
>>> + uint32_t reg_value;
>>> + int i;
>>> +
>>> + for (i = 0; i < adev->n_regs; i++) {
>>> + reg_value = RREG32(adev->reset_dump_reg_list[i]);
>>> + trace_amdgpu_reset_reg_dumps(adev->reset_dump_reg_list[i],
>>> reg_value);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> int amdgpu_do_asic_reset(struct list_head *device_list_handle,
>>> struct amdgpu_reset_context *reset_context)
>>> {
>>> @@ -4567,8 +4580,10 @@ int amdgpu_do_asic_reset(struct list_head
>>> *device_list_handle,
>>> tmp_adev->gmc.xgmi.pending_reset = false;
>>> if (!queue_work(system_unbound_wq,
>>> &tmp_adev->xgmi_reset_work))
>>> r = -EALREADY;
>>> - } else
>>> + } else {
>>> + amdgpu_reset_reg_dumps(tmp_adev);
>>> r = amdgpu_asic_reset(tmp_adev);
>>> + }
>>
>>
>> Is there any particular reason you only dump registers in single ASIC
>> case and not for XGMI ?
>>
>> Andrey
>>
> Not really, should I move it to the top of function?
>
> Regards,
>
> S.Amarnath
Yes, there is no reason to skip dumping this info in the XGMI case.
Also consider that you may not want to print this when the reset
is not caused by an error state, such as a manual GPU reset from
debugfs. To know that, though, you would have to wire a flag down
from high up in the call stack to here, so it is probably not worth
all these changes.
Andrey
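
To illustrate the suggestion above, here is a minimal user-space sketch of
dumping the registers at the top of the reset function, so that both the
XGMI path and the single-ASIC path are covered. It is not the driver code:
struct mock_device, mock_read_reg() and mock_do_asic_reset() are
hypothetical stand-ins for struct amdgpu_device, RREG32() and
amdgpu_do_asic_reset(), and the dump_regs parameter is an assumed example
of the flag that would have to be wired down from the caller.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_DUMP_REGS 8

/* Hypothetical stand-in for struct amdgpu_device (only what the sketch needs). */
struct mock_device {
	uint32_t reset_dump_reg_list[MAX_DUMP_REGS];
	int n_regs;
	bool in_xgmi_hive;	/* assumed: true when the device is part of an XGMI hive */
	int dumps_emitted;	/* bookkeeping for the sketch only */
};

/* Stand-in for RREG32(); returns a made-up value derived from the offset. */
static uint32_t mock_read_reg(uint32_t offset)
{
	return offset ^ 0xdeadbeefu;
}

/* Mirrors amdgpu_reset_reg_dumps() from the patch, printing instead of tracing. */
static void mock_reset_reg_dumps(struct mock_device *dev)
{
	for (int i = 0; i < dev->n_regs; i++) {
		uint32_t reg = dev->reset_dump_reg_list[i];

		printf("amdgpu register dump 0x%x: 0x%x\n",
		       (unsigned int)reg, (unsigned int)mock_read_reg(reg));
		dev->dumps_emitted++;
	}
}

/*
 * Dump first, for every device in the list, before choosing a reset path.
 * dump_regs is the hypothetical flag a caller would clear for a manual reset.
 */
static int mock_do_asic_reset(struct mock_device *devs, int count, bool dump_regs)
{
	if (dump_regs) {
		for (int i = 0; i < count; i++)
			mock_reset_reg_dumps(&devs[i]);
	}

	for (int i = 0; i < count; i++) {
		if (devs[i].in_xgmi_hive) {
			/* the real driver queues xgmi_reset_work here */
		} else {
			/* the real driver calls amdgpu_asic_reset() here */
		}
	}

	return 0;
}
```

With the dump hoisted out of the else branch, the caller decides once
whether to dump, and neither reset path needs its own call.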
>
>>
>>> if (r) {
>>> dev_err(tmp_adev->dev, "ASIC reset failed with
>>> error, %d for drm dev, %s",
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> index d855cb53c7e0..b9637925e85c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
>>> @@ -537,6 +537,22 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
>>> __entry->seqno)
>>> );
>>> +TRACE_EVENT(amdgpu_reset_reg_dumps,
>>> + TP_PROTO(uint32_t address, uint32_t value),
>>> + TP_ARGS(address, value),
>>> + TP_STRUCT__entry(
>>> + __field(uint32_t, address)
>>> + __field(uint32_t, value)
>>> + ),
>>> + TP_fast_assign(
>>> + __entry->address = address;
>>> + __entry->value = value;
>>> + ),
>>> + TP_printk("amdgpu register dump 0x%x: 0x%x",
>>> + __entry->address,
>>> + __entry->value)
>>> +);
>>> +
>>> #undef AMDGPU_JOB_GET_TIMELINE_NAME
>>> #endif