* [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF @ 2023-11-01 13:04 George Shuklin 2023-11-01 15:20 ` Pavan Chebbi 0 siblings, 1 reply; 8+ messages in thread From: George Shuklin @ 2023-11-01 13:04 UTC (permalink / raw) To: netdev; +Cc: George Shuklin Dell R650xs servers hangs if tg3 driver calls tg3_power_down. This happens only if network adapters (BCM5720 for R650xs) were initialized using SNP (e.g. by booting ipxe.efi). This is partial revert of commit 2ca1c94ce0b. The actual problem is on Dell side, but this fix allow servers to come back alive after reboot. --- drivers/net/ethernet/broadcom/tg3.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index 14b311196b8f..22b00912f7ac 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -18078,7 +18078,8 @@ static void tg3_shutdown(struct pci_dev *pdev) if (netif_running(dev)) dev_close(dev); - tg3_power_down(tp); + if (system_state == SYSTEM_POWER_OFF) + tg3_power_down(tp); rtnl_unlock(); -- 2.42.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF 2023-11-01 13:04 [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF George Shuklin @ 2023-11-01 15:20 ` Pavan Chebbi 2023-11-01 19:58 ` George Shuklin 0 siblings, 1 reply; 8+ messages in thread From: Pavan Chebbi @ 2023-11-01 15:20 UTC (permalink / raw) To: George Shuklin; +Cc: netdev, Andrew Gospodarek, Michael Chan [-- Attachment #1: Type: text/plain, Size: 1183 bytes --] On Wed, Nov 1, 2023 at 6:34 PM George Shuklin <george.shuklin@gmail.com> wrote: > > Dell R650xs servers hangs if tg3 driver calls tg3_power_down. > > This happens only if network adapters (BCM5720 for R650xs) were > initialized using SNP (e.g. by booting ipxe.efi). > > This is partial revert of commit 2ca1c94ce0b. > > The actual problem is on Dell side, but this fix allow servers > to come back alive after reboot. How are you sure that the problem solved by 2ca1c94ce0b is not reintroduced with this change? > --- > drivers/net/ethernet/broadcom/tg3.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c > index 14b311196b8f..22b00912f7ac 100644 > --- a/drivers/net/ethernet/broadcom/tg3.c > +++ b/drivers/net/ethernet/broadcom/tg3.c > @@ -18078,7 +18078,8 @@ static void tg3_shutdown(struct pci_dev *pdev) > if (netif_running(dev)) > dev_close(dev); > > - tg3_power_down(tp); > + if (system_state == SYSTEM_POWER_OFF) > + tg3_power_down(tp); > > rtnl_unlock(); > > -- > 2.42.0 > > [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4209 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF 2023-11-01 15:20 ` Pavan Chebbi @ 2023-11-01 19:58 ` George Shuklin 2023-11-02 7:04 ` Pavan Chebbi 0 siblings, 1 reply; 8+ messages in thread From: George Shuklin @ 2023-11-01 19:58 UTC (permalink / raw) To: Pavan Chebbi; +Cc: netdev, Andrew Gospodarek, Michael Chan On 01/11/2023 17:20, Pavan Chebbi wrote: > On Wed, Nov 1, 2023 at 6:34 PM George Shuklin <george.shuklin@gmail.com> wrote: >> Dell R650xs servers hangs if tg3 driver calls tg3_power_down. >> >> This happens only if network adapters (BCM5720 for R650xs) were >> initialized using SNP (e.g. by booting ipxe.efi). >> >> This is partial revert of commit 2ca1c94ce0b. >> >> The actual problem is on Dell side, but this fix allow servers >> to come back alive after reboot. > How are you sure that the problem solved by 2ca1c94ce0b is not > reintroduced with this change? I contacted the author of original patch, no reply yet (1st day). Also, I tested it on few generations of available Dell servers (R330, R340, R350 and R650sx, for which this fix should help). It does produce log message from https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1917471, but, at least, it reboots without issues. Actually, original patch is regression: 5.19 rebooting just fine, 6.0 start to hang. I also reported it to dell support forum, but I'm not sure if they pick it up or not. What would be the proper course of actions for such problem (outside of fixing UEFI SNP, for which I don't have access to sources)? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF 2023-11-01 19:58 ` George Shuklin @ 2023-11-02 7:04 ` Pavan Chebbi 2023-11-02 14:14 ` George Shuklin 0 siblings, 1 reply; 8+ messages in thread From: Pavan Chebbi @ 2023-11-02 7:04 UTC (permalink / raw) To: George Shuklin; +Cc: netdev, Andrew Gospodarek, Michael Chan [-- Attachment #1: Type: text/plain, Size: 1630 bytes --] On Thu, Nov 2, 2023 at 1:28 AM George Shuklin <george.shuklin@gmail.com> wrote: > > On 01/11/2023 17:20, Pavan Chebbi wrote: > > On Wed, Nov 1, 2023 at 6:34 PM George Shuklin <george.shuklin@gmail.com> wrote: > >> Dell R650xs servers hangs if tg3 driver calls tg3_power_down. > >> > >> This happens only if network adapters (BCM5720 for R650xs) were > >> initialized using SNP (e.g. by booting ipxe.efi). > >> > >> This is partial revert of commit 2ca1c94ce0b. > >> > >> The actual problem is on Dell side, but this fix allow servers > >> to come back alive after reboot. > > How are you sure that the problem solved by 2ca1c94ce0b is not > > reintroduced with this change? > > I contacted the author of original patch, no reply yet (1st day). Also, > I tested it on few generations of available Dell servers (R330, R340, > R350 and R650sx, for which this fix should help). It does produce log > message from > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1917471, but, at > least, it reboots without issues. > > Actually, original patch is regression: 5.19 rebooting just fine, 6.0 > start to hang. I also reported it to dell support forum, but I'm not > sure if they pick it up or not. > > What would be the proper course of actions for such problem (outside of > fixing UEFI SNP, for which I don't have access to sources)? > Thanks for the explanation. I am not sure if we should make this change unless we are 100pc sure that this patch won't cause regression. I feel Dell is in the best position to debug this and they can even contact Broadcom if they see any problem in UEFI. [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4209 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF 2023-11-02 7:04 ` Pavan Chebbi @ 2023-11-02 14:14 ` George Shuklin 2023-11-02 17:11 ` Pavan Chebbi 0 siblings, 1 reply; 8+ messages in thread From: George Shuklin @ 2023-11-02 14:14 UTC (permalink / raw) To: Pavan Chebbi; +Cc: netdev, Andrew Gospodarek, Michael Chan On 11/2/23 09:04, Pavan Chebbi wrote: > On Thu, Nov 2, 2023 at 1:28 AM George Shuklin <george.shuklin@gmail.com> wrote: >> On 01/11/2023 17:20, Pavan Chebbi wrote: >>> On Wed, Nov 1, 2023 at 6:34 PM George Shuklin <george.shuklin@gmail.com> wrote: >>>> Dell R650xs servers hangs if tg3 driver calls tg3_power_down. >>>> >>>> This happens only if network adapters (BCM5720 for R650xs) were >>>> initialized using SNP (e.g. by booting ipxe.efi). >>>> >>>> This is partial revert of commit 2ca1c94ce0b. >>>> >>>> The actual problem is on Dell side, but this fix allow servers >>>> to come back alive after reboot. >>> How are you sure that the problem solved by 2ca1c94ce0b is not >>> reintroduced with this change? >> I contacted the author of original patch, no reply yet (1st day). Also, >> I tested it on few generations of available Dell servers (R330, R340, >> R350 and R650sx, for which this fix should help). It does produce log >> message from >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1917471, but, at >> least, it reboots without issues. >> >> Actually, original patch is regression: 5.19 rebooting just fine, 6.0 >> start to hang. I also reported it to dell support forum, but I'm not >> sure if they pick it up or not. >> >> What would be the proper course of actions for such problem (outside of >> fixing UEFI SNP, for which I don't have access to sources)? >> > Thanks for the explanation. I am not sure if we should make this > change unless we are 100pc sure that this patch won't cause > regression. > I feel Dell is in the best position to debug this and they can even > contact Broadcom if they see any problem in UEFI. I'm right now with dell support, and what they asked is to 'try this on supported distros', which at newest are 5.15. I'll try to bypass their L1 with Ubuntu + HWE to get to 6+ versions... I was able to reproduce hanging at reboot there (without ACPI messages), and patching helps there too. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF 2023-11-02 14:14 ` George Shuklin @ 2023-11-02 17:11 ` Pavan Chebbi 2023-11-02 17:49 ` Michael Chan 0 siblings, 1 reply; 8+ messages in thread From: Pavan Chebbi @ 2023-11-02 17:11 UTC (permalink / raw) To: George Shuklin; +Cc: netdev, Andrew Gospodarek, Michael Chan [-- Attachment #1: Type: text/plain, Size: 2351 bytes --] On Thu, Nov 2, 2023 at 7:44 PM George Shuklin <george.shuklin@gmail.com> wrote: > > On 11/2/23 09:04, Pavan Chebbi wrote: > > On Thu, Nov 2, 2023 at 1:28 AM George Shuklin <george.shuklin@gmail.com> wrote: > >> On 01/11/2023 17:20, Pavan Chebbi wrote: > >>> On Wed, Nov 1, 2023 at 6:34 PM George Shuklin <george.shuklin@gmail.com> wrote: > >>>> Dell R650xs servers hangs if tg3 driver calls tg3_power_down. > >>>> > >>>> This happens only if network adapters (BCM5720 for R650xs) were > >>>> initialized using SNP (e.g. by booting ipxe.efi). > >>>> > >>>> This is partial revert of commit 2ca1c94ce0b. > >>>> > >>>> The actual problem is on Dell side, but this fix allow servers > >>>> to come back alive after reboot. > >>> How are you sure that the problem solved by 2ca1c94ce0b is not > >>> reintroduced with this change? > >> I contacted the author of original patch, no reply yet (1st day). Also, > >> I tested it on few generations of available Dell servers (R330, R340, > >> R350 and R650sx, for which this fix should help). It does produce log > >> message from > >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1917471, but, at > >> least, it reboots without issues. > >> > >> Actually, original patch is regression: 5.19 rebooting just fine, 6.0 > >> start to hang. I also reported it to dell support forum, but I'm not > >> sure if they pick it up or not. > >> > >> What would be the proper course of actions for such problem (outside of > >> fixing UEFI SNP, for which I don't have access to sources)? > >> > > Thanks for the explanation. I am not sure if we should make this > > change unless we are 100pc sure that this patch won't cause > > regression. > > I feel Dell is in the best position to debug this and they can even > > contact Broadcom if they see any problem in UEFI. > > I'm right now with dell support, and what they asked is to 'try this on > supported distros', which at newest are 5.15. I'll try to bypass their > L1 with Ubuntu + HWE to get to 6+ versions... > > I was able to reproduce hanging at reboot there (without ACPI messages), > and patching helps there too. > OK. I am not too sure what we should do. The change as such looks fine to me. Of course, the patch needs proper tags (tree.fixes, cc ...) @Michael : Do you have any suggestions on this? [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4209 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF 2023-11-02 17:11 ` Pavan Chebbi @ 2023-11-02 17:49 ` Michael Chan 2023-11-03 3:48 ` Pavan Chebbi 0 siblings, 1 reply; 8+ messages in thread From: Michael Chan @ 2023-11-02 17:49 UTC (permalink / raw) To: Pavan Chebbi; +Cc: George Shuklin, netdev, Andrew Gospodarek [-- Attachment #1: Type: text/plain, Size: 1357 bytes --] On Thu, Nov 2, 2023 at 10:11 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote: > > On Thu, Nov 2, 2023 at 7:44 PM George Shuklin <george.shuklin@gmail.com> wrote: > > > > I'm right now with dell support, and what they asked is to 'try this on > > supported distros', which at newest are 5.15. I'll try to bypass their > > L1 with Ubuntu + HWE to get to 6+ versions... > > > > I was able to reproduce hanging at reboot there (without ACPI messages), > > and patching helps there too. > > > OK. I am not too sure what we should do. The change as such looks fine to me. > Of course, the patch needs proper tags (tree.fixes, cc ...) > > @Michael : Do you have any suggestions on this? I think that this logic: if (system_state == SYSTEM_POWER_OFF) tg3_power_down(tp); is correct in the shutdown method. Many other drivers do the same thing. tg3_power_down() is just to enable WoL if it should be enabled and to put the device in D3hot state. The original commit to change this (2ca1c94ce0b6 tg3: Disable tg3 device on system reboot to avoid triggering AER) claims that calling tg3_power_down() unconditionally is to avoid getting spurious MSI. But the dev_close() call in tg3_shutdown() should have shut everything down including MSI. So I don't think it is necessary to call tg3_power_down() unconditionally. [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4209 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF 2023-11-02 17:49 ` Michael Chan @ 2023-11-03 3:48 ` Pavan Chebbi 0 siblings, 0 replies; 8+ messages in thread From: Pavan Chebbi @ 2023-11-03 3:48 UTC (permalink / raw) To: Michael Chan; +Cc: George Shuklin, netdev, Andrew Gospodarek [-- Attachment #1: Type: text/plain, Size: 1710 bytes --] On Thu, Nov 2, 2023 at 11:19 PM Michael Chan <michael.chan@broadcom.com> wrote: > > On Thu, Nov 2, 2023 at 10:11 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote: > > > > On Thu, Nov 2, 2023 at 7:44 PM George Shuklin <george.shuklin@gmail.com> wrote: > > > > > > I'm right now with dell support, and what they asked is to 'try this on > > > supported distros', which at newest are 5.15. I'll try to bypass their > > > L1 with Ubuntu + HWE to get to 6+ versions... > > > > > > I was able to reproduce hanging at reboot there (without ACPI messages), > > > and patching helps there too. > > > > > OK. I am not too sure what we should do. The change as such looks fine to me. > > Of course, the patch needs proper tags (tree.fixes, cc ...) > > > > @Michael : Do you have any suggestions on this? > > I think that this logic: > > if (system_state == SYSTEM_POWER_OFF) > tg3_power_down(tp); > > is correct in the shutdown method. Many other drivers do the same > thing. tg3_power_down() is just to enable WoL if it should be enabled > and to put the device in D3hot state. > > The original commit to change this (2ca1c94ce0b6 tg3: Disable tg3 > device on system reboot to avoid triggering AER) claims that calling > tg3_power_down() unconditionally is to avoid getting spurious MSI. > But the dev_close() call in tg3_shutdown() should have shut everything > down including MSI. So I don't think it is necessary to call > tg3_power_down() unconditionally. Agree. So let us have this change in. George, can you pls spin a v2 with proper tagging/addressing done to the patch. Please see https://docs.kernel.org/process/submitting-patches.html for help. Thanks. [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4209 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-11-03 3:48 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-11-01 13:04 [PATCH] [PATCH net] tg3: power down device only on SYSTEM_POWER_OFF George Shuklin 2023-11-01 15:20 ` Pavan Chebbi 2023-11-01 19:58 ` George Shuklin 2023-11-02 7:04 ` Pavan Chebbi 2023-11-02 14:14 ` George Shuklin 2023-11-02 17:11 ` Pavan Chebbi 2023-11-02 17:49 ` Michael Chan 2023-11-03 3:48 ` Pavan Chebbi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox