From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: "Khatri, Sunil" <sukhatri@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Sunil Khatri" <sunil.khatri@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Shashank Sharma" <shashank.sharma@amd.com>
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
linux-kernel@vger.kernel.org, Mukul Joshi <mukul.joshi@amd.com>,
Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Subject: Re: [PATCH v2 2/2] drm/amdgpu: add vm fault information to devcoredump
Date: Fri, 8 Mar 2024 11:11:57 +0100
Message-ID: <b1f8dedf-e671-464d-9087-483e46bbd462@gmail.com>
In-Reply-To: <83c46d51-7d4c-4a2f-b34e-8b6700a5fca7@amd.com>
On 08.03.24 at 10:16, Khatri, Sunil wrote:
>
> On 3/8/2024 2:39 PM, Christian König wrote:
>> On 07.03.24 at 21:50, Sunil Khatri wrote:
>>> Add page fault information to the devcoredump.
>>>
>>> Output of devcoredump:
>>> **** AMDGPU Device Coredump ****
>>> version: 1
>>> kernel: 6.7.0-amd-staging-drm-next
>>> module: amdgpu
>>> time: 29.725011811
>>> process_name: soft_recovery_p PID: 1720
>>>
>>> Ring timed out details
>>> IP Type: 0 Ring Name: gfx_0.0.0
>>>
>>> [gfxhub] Page fault observed
>>> Faulty page starting at address: 0x0000000000000000
>>> Protection fault status register: 0x301031
>>>
>>> VRAM is lost due to GPU reset!
>>>
>>> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +++++++++++++-
>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> index 147100c27c2d..8794a3c21176 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> @@ -203,8 +203,20 @@ amdgpu_devcoredump_read(char *buffer, loff_t offset, size_t count,
>>>  			   coredump->ring->name);
>>>  	}
>>> +	if (coredump->adev) {
>>> +		struct amdgpu_vm_fault_info *fault_info =
>>> +			&coredump->adev->vm_manager.fault_info;
>>> +
>>> +		drm_printf(&p, "\n[%s] Page fault observed\n",
>>> +			   fault_info->vmhub ? "mmhub" : "gfxhub");
>>> +		drm_printf(&p, "Faulty page starting at address: 0x%016llx\n",
>>> +			   fault_info->addr);
>>> +		drm_printf(&p, "Protection fault status register: 0x%x\n",
>>> +			   fault_info->status);
>>> +	}
>>> +
>>>  	if (coredump->reset_vram_lost)
>>> -		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
>>> +		drm_printf(&p, "\nVRAM is lost due to GPU reset!\n");
>>
>> Why this additional new line?
> The intent is for the devcoredump to have its sections clearly
> demarcated with a new line; otherwise "VRAM is lost due to GPU reset!"
> seems to be part of the page fault information.
> [gfxhub] Page fault observed
> Faulty page starting at address: 0x0000000000000000
> Protection fault status register: 0x301031
>
> VRAM is lost due to GPU reset!
In that case I would print the newline regardless of whether VRAM is
lost or not. Otherwise you get:

Protection fault status register:...

VRAM is lost due to GPU reset!
AMDGPU register dumps:

in one case and:

Protection fault status register:...
AMDGPU register dumps:

in the other case, which breaks this sectioning quite a bit.
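
A minimal userspace sketch of that suggestion, only to illustrate the
resulting layout. It is not the kernel code: struct mock_coredump,
emit() and mock_coredump_format() are hypothetical stand-ins for the
coredump state and drm_printf(), and the hub name is hard-coded. The
only point is that the separating newline is emitted unconditionally,
so the page fault section always ends with a blank line.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the coredump state; not the kernel struct. */
struct mock_coredump {
	bool has_fault_info;
	unsigned long long fault_addr;
	unsigned int fault_status;
	bool reset_vram_lost;
	bool has_regs;
};

/* Append formatted text at buf[pos]; the result is always NUL-terminated. */
static size_t emit(char *buf, size_t size, size_t pos, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (pos >= size)
		return pos;
	va_start(ap, fmt);
	n = vsnprintf(buf + pos, size - pos, fmt, ap);
	va_end(ap);
	if (n < 0)
		return pos;
	/* vsnprintf reports the untruncated length, so clamp on overflow. */
	return (size_t)n < size - pos ? pos + (size_t)n : size - 1;
}

/* Format the tail of the dump with an unconditional section separator. */
size_t mock_coredump_format(const struct mock_coredump *cd,
			    char *buf, size_t size)
{
	size_t pos = 0;

	assert(buf && size > 0);
	buf[0] = '\0';

	if (cd->has_fault_info) {
		pos = emit(buf, size, pos, "\n[%s] Page fault observed\n",
			   "gfxhub");
		pos = emit(buf, size, pos,
			   "Faulty page starting at address: 0x%016llx\n",
			   cd->fault_addr);
		pos = emit(buf, size, pos,
			   "Protection fault status register: 0x%x\n",
			   cd->fault_status);
	}

	/* Separator is printed whether or not VRAM was lost. */
	pos = emit(buf, size, pos, "\n");
	if (cd->reset_vram_lost)
		pos = emit(buf, size, pos, "VRAM is lost due to GPU reset!\n");

	if (cd->has_regs)
		pos = emit(buf, size, pos, "AMDGPU register dumps:\n");

	return pos;
}
```

Calling mock_coredump_format() once with reset_vram_lost set and once
with it cleared gives a blank line after the status register line in
both cases, which is the consistent sectioning asked for above.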
Regards,
Christian.
>
> Regards
> Sunil
>
>>
>> Apart from that looks really good to me.
>>
>> Regards,
>> Christian.
>>
>>> if (coredump->adev->reset_info.num_regs) {
>>> drm_printf(&p, "AMDGPU register dumps:\nOffset:
>>> Value:\n");
>>