From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: "Khatri, Sunil" <sukhatri@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Sunil Khatri" <sunil.khatri@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Shashank Sharma" <shashank.sharma@amd.com>
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
linux-kernel@vger.kernel.org, Mukul Joshi <mukul.joshi@amd.com>,
Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Subject: Re: [PATCH v2 2/2] drm/amdgpu: add vm fault information to devcoredump
Date: Fri, 8 Mar 2024 11:11:57 +0100
Message-ID: <b1f8dedf-e671-464d-9087-483e46bbd462@gmail.com>
In-Reply-To: <83c46d51-7d4c-4a2f-b34e-8b6700a5fca7@amd.com>
On 08.03.24 at 10:16, Khatri, Sunil wrote:
>
> On 3/8/2024 2:39 PM, Christian König wrote:
>> On 07.03.24 at 21:50, Sunil Khatri wrote:
>>> Add page fault information to the devcoredump.
>>>
>>> Output of devcoredump:
>>> **** AMDGPU Device Coredump ****
>>> version: 1
>>> kernel: 6.7.0-amd-staging-drm-next
>>> module: amdgpu
>>> time: 29.725011811
>>> process_name: soft_recovery_p PID: 1720
>>>
>>> Ring timed out details
>>> IP Type: 0 Ring Name: gfx_0.0.0
>>>
>>> [gfxhub] Page fault observed
>>> Faulty page starting at address: 0x0000000000000000
>>> Protection fault status register: 0x301031
>>>
>>> VRAM is lost due to GPU reset!
>>>
>>> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +++++++++++++-
>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> index 147100c27c2d..8794a3c21176 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> @@ -203,8 +203,20 @@ amdgpu_devcoredump_read(char *buffer, loff_t
>>> offset, size_t count,
>>>  			   coredump->ring->name);
>>>  	}
>>> +	if (coredump->adev) {
>>> +		struct amdgpu_vm_fault_info *fault_info =
>>> +			&coredump->adev->vm_manager.fault_info;
>>> +
>>> +		drm_printf(&p, "\n[%s] Page fault observed\n",
>>> +			   fault_info->vmhub ? "mmhub" : "gfxhub");
>>> +		drm_printf(&p, "Faulty page starting at address: 0x%016llx\n",
>>> +			   fault_info->addr);
>>> +		drm_printf(&p, "Protection fault status register: 0x%x\n",
>>> +			   fault_info->status);
>>> +	}
>>> +
>>>  	if (coredump->reset_vram_lost)
>>> -		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
>>> +		drm_printf(&p, "\nVRAM is lost due to GPU reset!\n");
>>
>> Why this additional new line?
> The intent is for the devcoredump to have its sections clearly
> demarcated by a new line; otherwise "VRAM is lost due to GPU reset!"
> seems to be part of the page fault information.
> [gfxhub] Page fault observed
> Faulty page starting at address: 0x0000000000000000
> Protection fault status register: 0x301031
>
> VRAM is lost due to GPU reset!
In that case I would print the newline regardless of whether VRAM is
lost or not. Otherwise you get:

    Protection fault status register:...
    VRAM is lost due to GPU reset!
    AMDGPU register dumps:

in one case and:

    Protection fault status register:...
    AMDGPU register dumps:

in the other case, which breaks this sectioning quite a bit.
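
As a rough illustration of the suggestion above, here is a standalone
userspace sketch, not the driver code. It assumes the separator is emitted
right after the page fault section, before the VRAM check. The names
`struct fake_coredump`, `emit()` and `format_dump()` are hypothetical, and
`vsnprintf()` stands in for `drm_printf()`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fields the dump reads; not the amdgpu structs. */
struct fake_coredump {
	bool vmhub;              /* false: gfxhub, true: mmhub */
	unsigned long long addr; /* faulting page address */
	unsigned int status;     /* protection fault status register */
	bool reset_vram_lost;
	bool has_regs;           /* whether a register dump section follows */
};

/* Append formatted text at offset 'off'; returns the new offset (clamped). */
static size_t emit(char *buf, size_t size, size_t off, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (off >= size)
		return off;
	va_start(ap, fmt);
	n = vsnprintf(buf + off, size - off, fmt, ap);
	va_end(ap);
	if (n < 0)
		return off;
	return (size_t)n < size - off ? off + (size_t)n : size - 1;
}

static size_t format_dump(const struct fake_coredump *cd, char *buf, size_t size)
{
	size_t off = 0;

	assert(cd && buf && size > 0);
	buf[0] = '\0';

	off = emit(buf, size, off, "\n[%s] Page fault observed\n",
		   cd->vmhub ? "mmhub" : "gfxhub");
	off = emit(buf, size, off,
		   "Faulty page starting at address: 0x%016llx\n", cd->addr);
	off = emit(buf, size, off,
		   "Protection fault status register: 0x%x\n", cd->status);

	/* Separator printed unconditionally, so the section always ends the same way. */
	off = emit(buf, size, off, "\n");

	if (cd->reset_vram_lost)
		off = emit(buf, size, off, "VRAM is lost due to GPU reset!\n");
	if (cd->has_regs)
		off = emit(buf, size, off, "AMDGPU register dumps:\n");

	return off;
}
```

Calling `format_dump()` once with `reset_vram_lost` set and once without
gives a blank line after the status register line in both cases, so the
next section starts the same way either way.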
Regards,
Christian.
>
> Regards
> Sunil
>
>>
>> Apart from that looks really good to me.
>>
>> Regards,
>> Christian.
>>
>>> if (coredump->adev->reset_info.num_regs) {
>>> drm_printf(&p, "AMDGPU register dumps:\nOffset:
>>> Value:\n");
>>