From: "Kuppuswamy, Sathyanarayanan" <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>,
amd-gfx@lists.freedesktop.org, linux-pci@vger.kernel.org
Cc: nirmodas@amd.com, bhelgaas@google.com, luben.tuikov@amd.com,
alexander.deucher@amd.com, christian.koenig@amd.com,
Dennis.Li@amd.com
Subject: Re: [PATCH v4 8/8] Revert "PCI/ERR: Update error status after reset_link()"
Date: Wed, 2 Sep 2020 13:27:33 -0700
Message-ID: <d4da4c2c-4fdb-08ee-c514-acfbcb67e16b@linux.intel.com>
In-Reply-To: <a3cadf36-d597-97fe-a096-83baa73c6f8f@amd.com>
On 9/2/20 12:54 PM, Andrey Grodzovsky wrote:
> Yes, works also.
>
> Can you provide me a formal patch that I can commit into our local AMD staging tree with my patch set?
https://patchwork.kernel.org/patch/11684175/mbox/
>
> Alex - is that how we want to do it? Without this patch or a revert of the original patch, the feature
> is broken.
>
> Andrey
>
> On 9/2/20 3:00 PM, Kuppuswamy, Sathyanarayanan wrote:
>>
>>
>> On 9/2/20 11:42 AM, Andrey Grodzovsky wrote:
>>> This reverts commit 6d2c89441571ea534d6240f7724f518936c44f8d.
>>>
>>> In the code below
>>>
>>> pci_walk_bus(bus, report_frozen_detected, &status);
>>> - if (reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
>>> + status = reset_link(dev, service);
>>>
>>> the status returned from report_frozen_detected is unconditionally masked
>>> by the status returned from reset_link, which is wrong.
>>>
>>> This breaks the error recovery implementation for the AMDGPU driver
>>> by masking PCI_ERS_RESULT_NEED_RESET returned from amdgpu_pci_error_detected
>>> and hence skipping the slot reset callback, which is necessary for proper
>>> ASIC recovery. As implemented now, no callback other than the resume callback
>>> is called after link reset, regardless of the value the error_detected
>>> callback returns.
>>>
>> }
>>
>> Instead of reverting this change, can you try the following patch?
>> https://lore.kernel.org/linux-pci/56ad4901-725f-7b88-2117-b124b28b027f@linux.intel.com/T/#me8029c04f63c21f9d1cb3b1ba2aeffbca3a60df5
>>
>>
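
For reference, the masking described in the quoted commit message can be
modelled in a few lines of userspace C. This is only a sketch under stated
assumptions, not the kernel's pcie_do_recovery(): the enum, the two model_*
helpers and both function names are hypothetical stand-ins, and the model
assumes the driver asks for a reset and the link reset itself succeeds.

```c
#include <stdbool.h>

/* Userspace model only; the names loosely mirror the kernel's results. */
enum ers_result {
	ERS_NEED_RESET,
	ERS_RECOVERED,
	ERS_DISCONNECT,
};

/* Hypothetical stand-in for a driver's error_detected callback result. */
static enum ers_result model_error_detected(void)
{
	return ERS_NEED_RESET;
}

/* Hypothetical stand-in for a link reset that succeeds. */
static enum ers_result model_reset_link(void)
{
	return ERS_RECOVERED;
}

/*
 * Flow with "status = reset_link(...)": the link-reset result overwrites
 * whatever the driver reported, so NEED_RESET is lost.
 * Returns true if the slot reset step would run.
 */
bool slot_reset_runs_when_masked(void)
{
	enum ers_result status = model_error_detected();

	status = model_reset_link();	/* driver's NEED_RESET is dropped here */
	return status == ERS_NEED_RESET;
}

/*
 * Flow with "if (reset_link(...) != RECOVERED)": the link-reset result is
 * only checked for failure and the driver's status is kept.
 * Returns true if the slot reset step would run.
 */
bool slot_reset_runs_when_preserved(void)
{
	enum ers_result status = model_error_detected();

	if (model_reset_link() != ERS_RECOVERED)
		return false;	/* recovery is abandoned on link-reset failure */
	return status == ERS_NEED_RESET;
}
```

In this model only the second variant reaches the slot reset step, which
matches the behaviour the commit message says the AMDGPU recovery path needs.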
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer