From: "Ruinskiy, Dima" <dima.ruinskiy@intel.com>
To: Helge Deller <deller@gmx.de>, Andrew Lunn <andrew@lunn.ch>,
Helge Deller <deller@kernel.org>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>,
Przemek Kitszel <przemyslaw.kitszel@intel.com>,
<intel-wired-lan@lists.osuosl.org>, <netdev@vger.kernel.org>
Subject: Re: [Intel-wired-lan] e1000e: Report link down after "Detected Hardware Unit Hang" ?
Date: Tue, 16 Jun 2026 19:20:00 +0300
Message-ID: <51828156-e859-44db-9926-c076796d0f75@intel.com>
In-Reply-To: <9d80ed59-5483-4c33-9d27-52fdf24aac6e@gmx.de>
On 15/06/2026 23:36, Helge Deller wrote:
> On 6/15/26 18:41, Andrew Lunn wrote:
>> On Sun, Jun 14, 2026 at 11:48:08PM +0200, Helge Deller wrote:
>>> I'm regularly facing the known "eno1: Detected Hardware Unit Hang:"
>>> with my on-board Intel e1000e NIC hardware.
>>> Since none of the various tips on the internet helped, I had the idea
>>> to set up a master/slave bond to fail over to another NIC when
>>> the Intel chip hangs.
>>>
>>> Sadly this doesn't work as intended, because the link of the Intel NIC
>>> isn't reported "down", so the failover never happens, unless I manually
>>> run "ifconfig eno1 down".
>>>
>>> My question: Shouldn't the Intel NIC ideally report Link Down if we know
>>> it hangs? That way a fail-over should at least happen, right?
>>>
>>> Below is a completely untested patch.
>>> Does it make sense that I try to test and/or develop such a patch, or
>>> are there things I miss?
>>
>> If the interface is dead, then setting the carrier down makes a lot of
>> sense.
>
> That's what I think as well. Thanks for confirming.
>
>> One question I have is, what do you need to do to recover the
>> hardware? Will it correctly set the carrier up when you do the
>> recovery?
>
> The only way I could recover was to unplug the network cable and
> re-insert it.
> I have not tested bringing the NIC down.
> But in both cases the driver will need to re-detect the media & link.
>
>> Also, just looking at your proposed change, it is not clear to me why
>> such an assignment will result in carrier down. It would be good to
>> explain it in the commit message.
>
> Sure. The patch I attached was completely untested and just based on
> the analysis of the flow and how to make the Link possibly report to be
> down.
> Maybe someone knowledgeable of the driver has a better suggestion how to
> report the link down situation in a clean way?
>
> Helge

This does not seem like the right direction to me.

The "Detected Hardware Unit Hang" message does not indicate that the
interface is dead, only that the transmitter is stalled. This can be
due to an unusually high load, or a HW fault / race condition with
another component, etc.

When a hang is detected, the transmitter is stopped with
netif_stop_queue(), and eventually ndo_tx_timeout triggers a full reset
of the device, which in many cases recovers it from the hang.

If the hang is persistent, we try to understand the cause and debug it.
Permanently marking the device as 'down' because it hung once is not
the optimal solution.
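
To make the flow above concrete, here is a rough userspace model of it.
This is only a sketch, not e1000e code: every name in it (sim_nic,
sim_tick, SIM_TX_TIMEOUT_TICKS, and so on) is hypothetical, and the
timeout value is arbitrary. It only illustrates the sequence "hang
detected -> queue stopped -> timeout -> full reset" and that the carrier
flag is never touched along the way.

```c
/* Userspace model only: none of these names are kernel or e1000e symbols. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SIM_TX_TIMEOUT_TICKS 5	/* hypothetical timeout, in ticks */

struct sim_nic {
	bool carrier_ok;	/* link state as a bond would see it */
	bool queue_stopped;	/* models the effect of netif_stop_queue() */
	bool tx_stalled;	/* transmitter is making no progress */
	int stalled_ticks;	/* ticks spent with the queue stopped */
	int resets;		/* number of full resets performed */
};

/* Hang check: stop the queue on a stall, leave the carrier alone. */
static void sim_check_hang(struct sim_nic *nic)
{
	assert(nic != NULL);
	if (nic->tx_stalled && !nic->queue_stopped) {
		printf("Detected Hardware Unit Hang\n");
		nic->queue_stopped = true;
		nic->stalled_ticks = 0;
	}
}

/*
 * Models the full reset triggered from the TX timeout path.
 * reset_fixes_hw selects a transient hang (true) or a persistent one.
 */
static void sim_reset(struct sim_nic *nic, bool reset_fixes_hw)
{
	nic->resets++;
	nic->stalled_ticks = 0;
	if (reset_fixes_hw) {
		nic->tx_stalled = false;
		nic->queue_stopped = false;
	}
}

/* One watchdog tick: check for a hang, then for an expired timeout. */
static void sim_tick(struct sim_nic *nic, bool reset_fixes_hw)
{
	sim_check_hang(nic);
	if (nic->queue_stopped &&
	    ++nic->stalled_ticks >= SIM_TX_TIMEOUT_TICKS)
		sim_reset(nic, reset_fixes_hw);
}
```

Calling sim_tick() repeatedly on a sim_nic with tx_stalled set shows
both cases: with reset_fixes_hw true the queue restarts after one
reset, and with it false the model keeps resetting while carrier_ok
stays true, which is the situation where a bond sees no link change.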