* [DISCUSS] tg3 reboot handling on Dell T440 (BCM5720)
@ 2025-12-21 17:50 Noam D. Eliyahu
2025-12-22 5:53 ` Pavan Chebbi
0 siblings, 1 reply; 5+ messages in thread
From: Noam D. Eliyahu @ 2025-12-21 17:50 UTC (permalink / raw)
To: netdev; +Cc: pavan.chebbi, mchan
Hi all,
I have not contributed to the Linux kernel before, but after following and learning from the work here for several years, I am now trying to make my first contribution.
I would like to discuss a reboot-related issue I am seeing on a Dell PowerEdge T440 with a BCM5720 NIC. With recent stable kernels (6.17+) and up-to-date Dell firmware, the system hits AER fatal errors during ACPI _PTS. On older kernels (starting around 6.6.2), the behavior is different and appears related to the conditional tg3_power_down flow introduced by commit 9931c9d04f4d.
Below is a short summary of how the driver logic evolved.
### Relevant driver evolution
* **9931c9d04f4d – tg3: power down device only on SYSTEM_POWER_OFF**
Added to avoid reboot hangs when the NIC was initialized via SNP (PXE):
```
- tg3_power_down(tp);
+ if (system_state == SYSTEM_POWER_OFF)
+ tg3_power_down(tp);
```
* **2ca1c94ce0b6 – tg3: Disable tg3 device on system reboot to avoid triggering AER**
Addressed a separate issue and partially reverted earlier behavior:
```
+ tg3_reset_task_cancel(tp);
+
rtnl_lock();
+
netif_device_detach(dev);
if (netif_running(dev))
dev_close(dev);
- if (system_state == SYSTEM_POWER_OFF)
- tg3_power_down(tp);
+ tg3_power_down(tp); /* unconditional again */
rtnl_unlock();
+
+ pci_disable_device(pdev);
```
* **e0efe83ed3252 – tg3: Disable tg3 PCIe AER on system reboot**
Combined both approaches, resulting in:
* Conditional tg3_power_down based on SYSTEM_STATE
* A DMI-based whitelist for AER handling
```
static const struct dmi_system_id tg3_restart_aer_quirk_table[] = {
{
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R440"),
},
},
/* other R-series entries omitted */
{}
};
```
```
else if (system_state == SYSTEM_RESTART &&
dmi_first_match(tg3_restart_aer_quirk_table) &&
pdev->current_state != PCI_D3cold &&
pdev->current_state != PCI_UNKNOWN) {
pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,
PCI_EXP_DEVCTL_CERE |
PCI_EXP_DEVCTL_NFERE |
PCI_EXP_DEVCTL_FERE |
PCI_EXP_DEVCTL_URRE);
}
```
On my T440, this combined design is what causes problems. Simply removing the DMI table does not help, since the conditional tg3_power_down logic itself also causes issues on this hardware. Adding “PowerEdge T440” to the DMI list avoids the AER error, but this does not scale well and requires constant maintenance.
I also could not reproduce the original reboot hang that motivated 9931c9d04f4d when running current firmware, even when initializing the NICs via SNP (PXE). This makes it look like the original problem has been resolved in firmware, while the workaround logic now causes trouble on up to date systems.
### Possible ways forward (from cleanest to minimal)
1. **Cleanest (recent firmware only)**
Remove both the conditional tg3_power_down and the AER disablement, and always call tg3_power_down. This restores the pre-workaround behavior and works reliably on my system.
2. **Flip the conditioning**
Keep the DMI list, but use it to guard the conditional tg3_power_down instead (only for models where the original issue was observed, e.g. R650xs). Drop the AER handling entirely. This limits risk to known systems while simplifying the flow.
3. **Minimal change**
Add “T” variants (e.g. T440) to the existing DMI table. This fixes my system but keeps the current complexity and maintenance burden.
I wanted to start with a discussion and a detailed report before sending any patches, to get guidance on what approach makes sense upstream. I can provide logs, kernel versions, and test results if useful. My testing (down to 6.6.1) was done on my own hardware. I could not reproduce either the original SNP reboot hang (9931c9d04f4d) or the AER ACPI _PTS failure (e0efe83ed3252) on current firmware so currently only (2ca1c94ce0b6) seems required, which suggests both issues may now be firmware-resolved.
Thanks for taking the time to read this.
Best regards,
Noam D. Eliyahu
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [DISCUSS] tg3 reboot handling on Dell T440 (BCM5720)
2025-12-21 17:50 [DISCUSS] tg3 reboot handling on Dell T440 (BCM5720) Noam D. Eliyahu
@ 2025-12-22 5:53 ` Pavan Chebbi
2025-12-24 16:53 ` Noam D. Eliyahu
0 siblings, 1 reply; 5+ messages in thread
From: Pavan Chebbi @ 2025-12-22 5:53 UTC (permalink / raw)
To: Noam D. Eliyahu; +Cc: netdev, mchan
[-- Attachment #1: Type: text/plain, Size: 5222 bytes --]
On Sun, Dec 21, 2025 at 11:20 PM Noam D. Eliyahu
<noam.d.eliyahu@gmail.com> wrote:
>
> Hi all,
>
> I have not contributed to the Linux kernel before, but after following and learning from the work here for several years, I am now trying to make my first contribution.
>
> I would like to discuss a reboot-related issue I am seeing on a Dell PowerEdge T440 with a BCM5720 NIC. With recent stable kernels (6.17+) and up-to-date Dell firmware, the system hits AER fatal errors during ACPI _PTS. On older kernels (starting around 6.6.2), the behavior is different and appears related to the conditional tg3_power_down flow introduced by commit 9931c9d04f4d.
>
> Below is a short summary of how the driver logic evolved.
>
> ### Relevant driver evolution
>
> * **9931c9d04f4d – tg3: power down device only on SYSTEM_POWER_OFF**
I think this commit id is wrong. Anyway, I know the commit.
> Added to avoid reboot hangs when the NIC was initialized via SNP (PXE):
>
> ```
> - tg3_power_down(tp);
> + if (system_state == SYSTEM_POWER_OFF)
> + tg3_power_down(tp);
> ```
>
> * **2ca1c94ce0b6 – tg3: Disable tg3 device on system reboot to avoid triggering AER**
> Addressed a separate issue and partially reverted earlier behavior:
>
> ```
> + tg3_reset_task_cancel(tp);
> +
> rtnl_lock();
> +
> netif_device_detach(dev);
>
> if (netif_running(dev))
> dev_close(dev);
>
> - if (system_state == SYSTEM_POWER_OFF)
> - tg3_power_down(tp);
> + tg3_power_down(tp); /* unconditional again */
>
> rtnl_unlock();
> +
> + pci_disable_device(pdev);
> ```
>
> * **e0efe83ed3252 – tg3: Disable tg3 PCIe AER on system reboot**
> Combined both approaches, resulting in:
>
> * Conditional tg3_power_down based on SYSTEM_STATE
> * A DMI-based whitelist for AER handling
>
> ```
> static const struct dmi_system_id tg3_restart_aer_quirk_table[] = {
> {
> .matches = {
> DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
> DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R440"),
> },
> },
> /* other R-series entries omitted */
> {}
> };
> ```
>
> ```
> else if (system_state == SYSTEM_RESTART &&
> dmi_first_match(tg3_restart_aer_quirk_table) &&
> pdev->current_state != PCI_D3cold &&
> pdev->current_state != PCI_UNKNOWN) {
> pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,
> PCI_EXP_DEVCTL_CERE |
> PCI_EXP_DEVCTL_NFERE |
> PCI_EXP_DEVCTL_FERE |
> PCI_EXP_DEVCTL_URRE);
> }
> ```
>
> On my T440, this combined design is what causes problems. Simply removing the DMI table does not help, since the conditional tg3_power_down logic itself also causes issues on this hardware. Adding “PowerEdge T440” to the DMI list avoids the AER error, but this does not scale well and requires constant maintenance.
>
> I also could not reproduce the original reboot hang that motivated 9931c9d04f4d when running current firmware, even when initializing the NICs via SNP (PXE). This makes it look like the original problem has been resolved in firmware, while the workaround logic now causes trouble on up to date systems.
>
> ### Possible ways forward (from cleanest to minimal)
>
> 1. **Cleanest (recent firmware only)**
> Remove both the conditional tg3_power_down and the AER disablement, and always call tg3_power_down. This restores the pre-workaround behavior and works reliably on my system.
This is going to be a problem, please follow the discussion here:
https://lore.kernel.org/netdev/CALs4sv1-6mgQ2JfF9MYiRADxumJD7m7OGWhCB5aWj1tGP0OPJg@mail.gmail.com/
where regression risk is flagged and it came true in
https://lore.kernel.org/netdev/CALs4sv2_JZd5K-ZgBkjL=QpXVEXnoJrjuqwwKg0+jo2-4taHJw@mail.gmail.com/
>
> 2. **Flip the conditioning**
> Keep the DMI list, but use it to guard the conditional tg3_power_down instead (only for models where the original issue was observed, e.g. R650xs). Drop the AER handling entirely. This limits risk to known systems while simplifying the flow.
But I am not sure how systems affected in the commit e0efe83ed3252
will react. Can't tell 100pc without testing.
>
> 3. **Minimal change**
> Add “T” variants (e.g. T440) to the existing DMI table. This fixes my system but keeps the current complexity and maintenance burden.
I can understand the burden. But unless Dell gets involved I don't see
a better way. Intel i40e also has a similar work around.
>
> I wanted to start with a discussion and a detailed report before sending any patches, to get guidance on what approach makes sense upstream. I can provide logs, kernel versions, and test results if useful. My testing (down to 6.6.1) was done on my own hardware. I could not reproduce either the original SNP reboot hang (9931c9d04f4d) or the AER ACPI _PTS failure (e0efe83ed3252) on current firmware so currently only (2ca1c94ce0b6) seems required, which suggests both issues may now be firmware-resolved.
>
> Thanks for taking the time to read this.
>
> Best regards,
> Noam D. Eliyahu
>
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5469 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [DISCUSS] tg3 reboot handling on Dell T440 (BCM5720)
2025-12-22 5:53 ` Pavan Chebbi
@ 2025-12-24 16:53 ` Noam D. Eliyahu
2025-12-31 15:10 ` Pavan Chebbi
0 siblings, 1 reply; 5+ messages in thread
From: Noam D. Eliyahu @ 2025-12-24 16:53 UTC (permalink / raw)
To: pavan.chebbi; +Cc: mchan, netdev
Thanks for the quick reply!
On Mon, Dec 22, 2025 at 11:23 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
>
> On Sun, Dec 21, 2025 at 11:20 PM Noam D. Eliyahu
> <noam.d.eliyahu@gmail.com> wrote:
> >
> > ### Relevant driver evolution
> >
> > * **9931c9d04f4d – tg3: power down device only on SYSTEM_POWER_OFF**
>
> I think this commit id is wrong. Anyway, I know the commit.
My apologies, I likely copied a local hash; the upstream commit ID is 9fc3bc764334.
> This is going to be a problem, please follow the discussion here:
> https://lore.kernel.org/netdev/CALs4sv1-6mgQ2JfF9MYiRADxumJD7m7OGWhCB5aWj1tGP0OPJg@mail.gmail.com/
> where regression risk is flagged and it came true in
> https://lore.kernel.org/netdev/CALs4sv2_JZd5K-ZgBkjL=QpXVEXnoJrjuqwwKg0+jo2-4taHJw@mail.gmail.com/
Thank you for the links.
I understand the need to prevent regressions, especially in an area where it happened before.
That said, I still think the design of the first and second fixes is problematic and needs adjusting.
The original bug (Fixed in: 9fc3bc764334) was triggered by SNP initialization on specific models (R650xs with BCM5720).
The fix, the conditional tg3_power_down call, *was applied globally regardless of models*.
The second bug I mentioned (Fixed in: e0efe83ed3252) was triggered mainly due to the conditional tg3_power_down call.
Look again at the changes made in the commit referenced by e0efe83ed3252 (2ca1c94ce0b6 as the original fix):
```
+ tg3_reset_task_cancel(tp);
+
rtnl_lock();
+
netif_device_detach(dev);
if (netif_running(dev))
dev_close(dev);
- if (system_state == SYSTEM_POWER_OFF)
- tg3_power_down(tp);
+ tg3_power_down(tp); /* NOTE: the conditional system state based tg3_power_down call was problematic */
rtnl_unlock();
+
+ pci_disable_device(pdev);
}
```
The changes in 2ca1c94ce0b6 caused the regression which later led to the AER disablement in e0efe83ed3252.
The problem is that it was decided to apply the change to a specific set of models, even though it originated from 9fc3bc764334 which was applied globally.
If we apply the conditional tg3_power_down specifically for the R650xs, we can guarantee no regression, as the logic for the models with the first bug stays the same, just now limited to their set of models.
By applying the conditional tg3_power_down this way, we won't need the AER disablement at all.
> >
> > 2. **Flip the conditioning**
> > Keep the DMI list, but use it to guard the conditional tg3_power_down instead (only for models where the original issue was observed, e.g. R650xs). Drop the AER handling entirely. This limits risk to known systems while simplifying the flow.
> But I am not sure how systems affected in the commit e0efe83ed3252 will react. Can't tell 100pc without testing.
Regarding your concern about systems affected in e0efe83ed3252: my hardware (Dell PowerEdge T440) is one of the models affected in e0efe83ed3252 but not listed in the initial DMI match list.
I tested all of the solutions I suggested, others have reported the same regarding my first suggestion (to remove both the conditional tg3_power_down and the AER disablement) online, and most importantly, the e0efe83ed3252 commit itself references the original fix (2ca1c94ce0b6) which didn't include a specific set of models and was considered a viable fix.
If we restrict the system_state check (from 9fc3bc764334) to a DMI table for the R650xs, all other systems would revert to the 'unconditional' tg3_power_down which was the standard for years. This would naturally prevent the AER errors (as I and others had seen on our machines) without needing to touch the AER registers at all.
For reference:
- https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e0efe83ed3252
- https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2ca1c94ce0b6
- https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9fc3bc764334
Best regards,
Noam D. Eliyahu
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [DISCUSS] tg3 reboot handling on Dell T440 (BCM5720)
2025-12-24 16:53 ` Noam D. Eliyahu
@ 2025-12-31 15:10 ` Pavan Chebbi
2026-01-02 16:22 ` Noam D. Eliyahu
0 siblings, 1 reply; 5+ messages in thread
From: Pavan Chebbi @ 2025-12-31 15:10 UTC (permalink / raw)
To: Noam D. Eliyahu; +Cc: mchan, netdev, George Shuklin, Lenny Szubowicz
[-- Attachment #1: Type: text/plain, Size: 4941 bytes --]
On Wed, Dec 24, 2025 at 10:23 PM Noam D. Eliyahu
<noam.d.eliyahu@gmail.com> wrote:
>
> Thanks for the quick reply!
>
> On Mon, Dec 22, 2025 at 11:23 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
> >
> > On Sun, Dec 21, 2025 at 11:20 PM Noam D. Eliyahu
> > <noam.d.eliyahu@gmail.com> wrote:
> > >
> > > ### Relevant driver evolution
> > >
> > > * **9931c9d04f4d – tg3: power down device only on SYSTEM_POWER_OFF**
> >
> > I think this commit id is wrong. Anyway, I know the commit.
>
> My apologies, I likely copied a local hash; the upstream commit ID is 9fc3bc764334.
>
> > This is going to be a problem, please follow the discussion here:
> > https://lore.kernel.org/netdev/CALs4sv1-6mgQ2JfF9MYiRADxumJD7m7OGWhCB5aWj1tGP0OPJg@mail.gmail.com/
> > where regression risk is flagged and it came true in
> > https://lore.kernel.org/netdev/CALs4sv2_JZd5K-ZgBkjL=QpXVEXnoJrjuqwwKg0+jo2-4taHJw@mail.gmail.com/
>
> Thank you for the links.
> I understand the need to prevent regressions, especially in an area where it happened before.
> That said, I still think the design of the first and second fixes is problematic and needs adjusting.
>
> The original bug (Fixed in: 9fc3bc764334) was triggered by SNP initialization on specific models (R650xs with BCM5720).
> The fix, the conditional tg3_power_down call, *was applied globally regardless of models*.
>
> The second bug I mentioned (Fixed in: e0efe83ed3252) was triggered mainly due to the conditional tg3_power_down call.
> Look again at the changes made in the commit referenced by e0efe83ed3252 (2ca1c94ce0b6 as the original fix):
> ```
> + tg3_reset_task_cancel(tp);
> +
> rtnl_lock();
> +
> netif_device_detach(dev);
>
> if (netif_running(dev))
> dev_close(dev);
>
> - if (system_state == SYSTEM_POWER_OFF)
> - tg3_power_down(tp);
> + tg3_power_down(tp); /* NOTE: the conditional system state based tg3_power_down call was problematic */
>
> rtnl_unlock();
> +
> + pci_disable_device(pdev);
> }
> ```
>
> The changes in 2ca1c94ce0b6 caused the regression which later led to the AER disablement in e0efe83ed3252.
> The problem is that it was decided to apply the change to a specific set of models, even though it originated from 9fc3bc764334 which was applied globally.
>
> If we apply the conditional tg3_power_down specifically for the R650xs, we can guarantee no regression, as the logic for the models with the first bug stays the same, just now limited to their set of models.
> By applying the conditional tg3_power_down this way, we won't need the AER disablement at all.
>
> > >
> > > 2. **Flip the conditioning**
> > > Keep the DMI list, but use it to guard the conditional tg3_power_down instead (only for models where the original issue was observed, e.g. R650xs). Drop the AER handling entirely. This limits risk to known systems while simplifying the flow.
>
> > But I am not sure how systems affected in the commit e0efe83ed3252 will react. Can't tell 100pc without testing.
>
> Regarding your concern about systems affected in e0efe83ed3252: my hardware (Dell PowerEdge T440) is one of the models affected in e0efe83ed3252 but not listed in the initial DMI match list.
> I tested all of the solutions I suggested, others have reported the same regarding my first suggestion (to remove both the conditional tg3_power_down and the AER disablement) online, and most importantly, the e0efe83ed3252 commit itself references the original fix (2ca1c94ce0b6) which didn't include a specific set of models and was considered a viable fix.
>
> If we restrict the system_state check (from 9fc3bc764334) to a DMI table for the R650xs, all other systems would revert to the 'unconditional' tg3_power_down which was the standard for years. This would naturally prevent the AER errors (as I and others had seen on our machines) without needing to touch the AER registers at all.
OK I now understand. You make a lucid argument. 9fc3bc764334 does
indicate that the problem it is trying to fix is very much limited to
R650xs. Hence while I am almost sure that your proposal (about adding
conditional tg3_power_down for R650xs alone) won't break anything for
cc: George, we need an ack from cc: Lenny, to see if he is OK with an
unconditional power down, and that he had no other motive to disable
AER in tg3 config space. In all possible logic, it should just be fine
because e0efe83ed3252 is fixing 9fc3bc764334.
P.S Sorry for the delayed reply. I am on vacation.
>
> For reference:
> - https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e0efe83ed3252
> - https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2ca1c94ce0b6
> - https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9fc3bc764334
>
> Best regards,
> Noam D. Eliyahu
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5469 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [DISCUSS] tg3 reboot handling on Dell T440 (BCM5720)
2025-12-31 15:10 ` Pavan Chebbi
@ 2026-01-02 16:22 ` Noam D. Eliyahu
0 siblings, 0 replies; 5+ messages in thread
From: Noam D. Eliyahu @ 2026-01-02 16:22 UTC (permalink / raw)
To: Pavan Chebbi; +Cc: mchan, netdev, George Shuklin, Lenny Szubowicz
On Wed Dec 31, 2025 at 5:10 PM IST, Pavan Chebbi wrote:
> OK I now understand. You make a lucid argument. 9fc3bc764334 does
> indicate that the problem it is trying to fix is very much limited to
> R650xs.
Indeed AER suppression is needed only because of the conditional
tg3_power_down.
So both the conditional tg3_power_down and the AER suppression should
be scoped only to the platforms that originally required them.
This preserves the intent of the original commits while avoiding
whitelist maintenance and reflecting the relationship between the two bugs.
> Hence while I am almost sure that your proposal (about adding
> conditional tg3_power_down for R650xs alone) won't break anything for
> cc: George, we need an ack from cc: Lenny, to see if he is OK with an
> unconditional power down, and that he had no other motive to disable
> AER in tg3 config space. In all possible logic, it should just be fine
> because e0efe83ed3252 is fixing 9fc3bc764334.
Sounds great!
I'll wait for an ACK from Lenny confirming that this approach -
limiting both conditional tg3_power_down and AER suppression to the
affected platforms - works for his setups and doesn't need broader
application.
> P.S Sorry for the delayed reply. I am on vacation.
Totally reasonable - between Christmas and the New Year, I haven't been
very active myself.
Wishing everyone a happy and hopefully reboot-hang-free start to the New Year!
If it helps, I can provide a pre-approved patch so George or Lenny can
test it on their hardware before I try to commit one.
Thanks,
Noam D. Eliyahu
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-01-02 16:23 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-21 17:50 [DISCUSS] tg3 reboot handling on Dell T440 (BCM5720) Noam D. Eliyahu
2025-12-22 5:53 ` Pavan Chebbi
2025-12-24 16:53 ` Noam D. Eliyahu
2025-12-31 15:10 ` Pavan Chebbi
2026-01-02 16:22 ` Noam D. Eliyahu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).