* TX watchdog vs link-layer flow control
From: Ben Hutchings @ 2011-06-02 20:48 UTC
To: netdev; +Cc: linux-net-drivers
The TX watchdog will fire if and only if a TX queue remains stopped for
longer than the timeout period (dev->watchdog_timeo) for no apparent
reason. Specifically, 'no apparent reason' means that
netif_device_present(dev) && netif_running(dev) &&
netif_carrier_ok(dev) all hold.
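To make the condition concrete, here is a minimal userspace sketch of the
check described above. It is only a model: the model_* structs and
functions are hypothetical stand-ins invented for illustration, not the
kernel's struct net_device or dev_watchdog(), and ticks stand in for
jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_TXQ 4

/* Hypothetical stand-in for per-queue TX state (not the kernel struct). */
struct model_txq {
	bool stopped;              /* queue has been stopped by the driver */
	unsigned long trans_start; /* tick of the last transmit on this queue */
};

/* Hypothetical stand-in for the device state the watchdog looks at. */
struct model_dev {
	bool present;                 /* models netif_device_present(dev) */
	bool running;                 /* models netif_running(dev) */
	bool carrier_ok;              /* models netif_carrier_ok(dev) */
	unsigned long watchdog_timeo; /* timeout in ticks */
	unsigned int num_tx_queues;
	struct model_txq txq[MODEL_MAX_TXQ];
};

/* Wraparound-safe "a is later than b", same idea as the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Return the index of the first TX queue that would trip the watchdog
 * at tick 'now', or -1 if the watchdog would stay quiet.
 */
static int model_watchdog_check(const struct model_dev *dev, unsigned long now)
{
	assert(dev != NULL);
	assert(dev->num_tx_queues <= MODEL_MAX_TXQ);

	/* If there is an apparent reason for the stall, nothing is reported. */
	if (!(dev->present && dev->running && dev->carrier_ok))
		return -1;

	for (unsigned int i = 0; i < dev->num_tx_queues; i++) {
		const struct model_txq *q = &dev->txq[i];

		if (q->stopped &&
		    model_time_after(now, q->trans_start + dev->watchdog_timeo))
			return (int)i;
	}
	return -1;
}
```

Note that nothing in this check knows why the queue is stopped; a queue
held up by the link partner looks exactly like one held up by a local
hardware or driver fault.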
However, even if the link is up it can still be blocked by link-layer
flow control. A customer report (which has not yet been reproduced
here) suggests that when Ethernet flow control is enabled a switch may
in some circumstances throttle the TX packet rate to the extent that a
TX queue cannot be unblocked before the watchdog fires. It is certainly
possible for a misbehaving link partner to do this, and this should
probably not be considered a bug in the local hardware or driver!
TX may also be blocked by a 'remote fault' indication. This should
possibly be translated into netif_carrier_off(), but I'm not sure that
all drivers will be able to detect remote fault without polling.
Perhaps dev_watchdog() should support a driver operation to poll for
cases like this before it decides that the local device is actually
misbehaving?
Even then, I can't think of a reliable way to detect a pause frame
flood. Also, drivers might well require process context for such an
operation.
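As a rough illustration of the kind of driver operation suggested above,
the following userspace sketch lets the watchdog ask the driver for the
cause of a stall before reporting a timeout. Everything here is
hypothetical: no such hook exists in the kernel, and the sketch_* names
are invented for this example. It also ignores the context problem (the
real watchdog runs from a timer, so a hook that needs process context
would have to be deferred), and it makes no attempt to detect a pause
frame flood.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stall causes a driver might be able to report. */
enum sketch_stall_cause {
	SKETCH_STALL_LOCAL,        /* nothing external found; suspect local device */
	SKETCH_STALL_REMOTE_FAULT, /* link partner is signalling remote fault */
};

struct sketch_dev;

/* Hypothetical optional driver operation polled by the watchdog. */
typedef enum sketch_stall_cause (*sketch_poll_fn)(const struct sketch_dev *dev);

struct sketch_dev {
	bool queue_timed_out;            /* a TX queue exceeded the timeout */
	bool remote_fault;               /* models a hardware status bit */
	sketch_poll_fn poll_stall_cause; /* NULL if the driver has no such op */
};

/* Example driver implementation: only knows how to check remote fault. */
static enum sketch_stall_cause
sketch_poll_remote_fault(const struct sketch_dev *dev)
{
	return dev->remote_fault ? SKETCH_STALL_REMOTE_FAULT
				 : SKETCH_STALL_LOCAL;
}

/* Return true if the watchdog should report a TX timeout for this device. */
static bool sketch_should_report_timeout(const struct sketch_dev *dev)
{
	assert(dev != NULL);

	if (!dev->queue_timed_out)
		return false;

	/* Without the optional op, behave as the watchdog does today. */
	if (dev->poll_stall_cause == NULL)
		return true;

	/* Only blame the local device if the driver finds no external cause. */
	return dev->poll_stall_cause(dev) == SKETCH_STALL_LOCAL;
}
```

Drivers that cannot tell the cases apart would leave the operation unset
and keep the current behaviour.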
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* Re: TX watchdog vs link-layer flow control
From: David Miller @ 2011-06-02 20:55 UTC
To: bhutchings; +Cc: netdev, linux-net-drivers
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 02 Jun 2011 21:48:40 +0100
> However, even if the link is up it can still be blocked by link-layer
> flow control. A customer report (which has not yet been reproduced
> here) suggests that when Ethernet flow control is enabled a switch may
> in some circumstances throttle the TX packet rate to the extent that a
> TX queue cannot be unblocked before the watchdog fires. It is certainly
> possible for a misbehaving link partner to do this, and this should
> probably not be considered a bug in the local hardware or driver!
>
> TX may also be blocked by a 'remote fault' indication. This should
> possibly be translated into netif_carrier_off(), but I'm not sure that
> all drivers will be able to detect remote fault without polling.
>
> Perhaps dev_watchdog() should support a driver operation to poll for
> cases like this before it decides that the local device is actually
> misbehaving?
>
> Even then, I can't think of a reliable way to detect a pause frame
> flood. Also, drivers might well require process context for such an
> operation.
Frankly, if the switch can't take packets for several seconds I want a
notification as that's a serious condition.
In lieu of a reliable way to poll for these kinds of cases in order to
distinguish properly, I think what happens now is the best we can do.
* Re: TX watchdog vs link-layer flow control
From: Ben Hutchings @ 2011-06-02 21:01 UTC
To: netdev; +Cc: linux-net-drivers
Also: given that the watchdog fires if *any* TX queue stays stopped past
the timeout, a device with hardware TX priority could trigger it for a
low-priority TX queue if the link is filled with higher-priority TX
packets for long enough. (This depends on how 'hard' the priorities are,
of course.)
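A toy simulation of that starvation case, assuming strictly 'hard'
priorities: the high-priority queue always wins the wire while it has
packets, and the low-priority queue is assumed to be backlogged (and so
stopped) whenever it is not transmitting. The function name and tick
model are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simulate one packet slot per tick under strict priority.
 * The high-priority queue has packets for ticks 1..high_busy_ticks.
 * Returns the first tick at which the low-priority queue has gone longer
 * than 'timeo' ticks without transmitting, or -1 if that never happens
 * within 'total_ticks'.
 */
static long sim_low_prio_timeout_tick(unsigned int high_busy_ticks,
				      unsigned int total_ticks,
				      unsigned int timeo)
{
	unsigned int low_last_tx = 0; /* tick of last low-priority transmit */

	assert(total_ticks >= high_busy_ticks);

	for (unsigned int now = 1; now <= total_ticks; now++) {
		bool high_has_packet = now <= high_busy_ticks;

		if (!high_has_packet) {
			/* The wire is free, so the low-priority queue transmits. */
			low_last_tx = now;
			continue;
		}

		/* Starved this tick; would the watchdog consider it hung? */
		if (now - low_last_tx > timeo)
			return (long)now;
	}
	return -1;
}
```

With a timeout of 5 ticks, a high-priority burst of 10 ticks trips the
check partway through the burst, while a burst of 3 ticks never does.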
Ben.