netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: reset ct before calling ndo_start_xmit
@ 2017-01-24  9:40 Paolo Abeni
  2017-01-24 14:04 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Paolo Abeni @ 2017-01-24  9:40 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Hannes Frederic Sowa, Florian Westphal

Some devices (e.g. ipoib and most wifi drivers) can retain the
to-be-xmitted packets on some internal queue for a possibly
unlimited time.

Removing conntrack modules after some skbs are queued on any of
those devices may cause rmmod to hang waiting for ct refcount going
away.

Since clearing skb nfct early can also improve the performance,
we now clear skb nfct before calling ndo_start_xmit() for all
the devices not strictly requiring such information, that is,
all virtual devices.

Currently we use the NETIF_F_LLTX feature bit to identify such devices,
since all the [legacy] phys drivers setting such bit are not prone
the hangup issue. The plan is adding a specific 'this is a
virtual device' priv flag and use it instead, in a later net-next
patch.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/linux/netdevice.h | 7 +++++++
 net/core/dev.c            | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9bde955..6e6b2ea 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4200,6 +4200,13 @@ static inline bool netif_reduces_vlan_mtu(struct net_device *dev)
 	return dev->priv_flags & IFF_MACSEC;
 }
 
+/* return true if we should preserve skb nfct before calling ndo_start_xmit() */
+static inline bool netif_needs_ct(struct net_device *dev)
+{
+	/* any kind of virtual device needs to preserve the ct entry */
+	return dev->features & NETIF_F_LLTX;
+}
+
 extern struct pernet_operations __net_initdata loopback_net_ops;
 
 /* Logging, debugging and troubleshooting/diagnostic helpers. */
diff --git a/net/core/dev.c b/net/core/dev.c
index 7f218e0..85fcae0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2886,6 +2886,9 @@ static int xmit_one(struct sk_buff *skb, struct net_device *dev,
 	unsigned int len;
 	int rc;
 
+	if (!netif_needs_ct(dev))
+		nf_reset(skb);
+
 	if (!list_empty(&ptype_all) || !list_empty(&dev->ptype_all))
 		dev_queue_xmit_nit(skb, dev);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: reset ct before calling ndo_start_xmit
  2017-01-24  9:40 [PATCH net] net: reset ct before calling ndo_start_xmit Paolo Abeni
@ 2017-01-24 14:04 ` Eric Dumazet
  2017-01-24 15:04   ` Paolo Abeni
  2017-01-24 16:21 ` David Miller
  2017-01-24 17:37 ` Stephen Hemminger
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2017-01-24 14:04 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David S. Miller, Hannes Frederic Sowa, Florian Westphal

On Tue, 2017-01-24 at 10:40 +0100, Paolo Abeni wrote:

> Currently we use the NETIF_F_LLTX feature bit to identify such devices,
> since all the [legacy] phys drivers setting such bit are not prone
> the hangup issue. The plan is adding a specific 'this is a
> virtual device' priv flag and use it instead, in a later net-next
> patch.

This is too ugly in my opinion.

LLTX is LLTX, and has absolutely nothing to do with connection tracking.

We have ndo_features_check, and this can be trivially backported to
stable versions.

No need for yet another flag really.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: reset ct before calling ndo_start_xmit
  2017-01-24 14:04 ` Eric Dumazet
@ 2017-01-24 15:04   ` Paolo Abeni
  2017-01-24 15:42     ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Abeni @ 2017-01-24 15:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, David S. Miller, Hannes Frederic Sowa, Florian Westphal

On Tue, 2017-01-24 at 06:04 -0800, Eric Dumazet wrote:
> On Tue, 2017-01-24 at 10:40 +0100, Paolo Abeni wrote:
> 
> > Currently we use the NETIF_F_LLTX feature bit to identify such
> > devices,
> > since all the [legacy] phys drivers setting such bit are not prone
> > the hangup issue. The plan is adding a specific 'this is a
> > virtual device' priv flag and use it instead, in a later net-next
> > patch.
> 
> This is too ugly in my opinion.
> 
> LLTX is LLTX, and has absolutely nothing to do with connection
> tracking.
> 
> We have ndo_features_check, and this can be trivially backported to
> stable versions.
> 
> No need for yet another flag really.

Thank you for the feedback.

Double checking to see if I understood the above correctly: do you
suggest to call nf_reset() from the affected drivers's
ndo_features_check(), eventually adding such ndo if needed ?

I think calling nf_reset() in the common code should be better: the
conntrack entry is hot in the cache and we may want to clear it early
for as many devices as possible.

Thank you,

Paolo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: reset ct before calling ndo_start_xmit
  2017-01-24 15:04   ` Paolo Abeni
@ 2017-01-24 15:42     ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2017-01-24 15:42 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David S. Miller, Hannes Frederic Sowa, Florian Westphal

On Tue, 2017-01-24 at 16:04 +0100, Paolo Abeni wrote:

> Double checking to see if I understood the above correctly: do you
> suggest to call nf_reset() from the affected drivers's
> ndo_features_check(), eventually adding such ndo if needed ?
> 
> I think calling nf_reset() in the common code should be better: the
> conntrack entry is hot in the cache and we may want to clear it early
> for as many devices as possible.

This would add a conditional test, which will be not correctly predicted
by CPU in tunnel or bonding/team very common cases.

While doing it in an affected driver would not penalize fast path.

A little bit harder sure, but performance matters ;)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: reset ct before calling ndo_start_xmit
  2017-01-24  9:40 [PATCH net] net: reset ct before calling ndo_start_xmit Paolo Abeni
  2017-01-24 14:04 ` Eric Dumazet
@ 2017-01-24 16:21 ` David Miller
  2017-01-24 17:37 ` Stephen Hemminger
  2 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2017-01-24 16:21 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, hannes, fw

From: Paolo Abeni <pabeni@redhat.com>
Date: Tue, 24 Jan 2017 10:40:13 +0100

> Some devices (e.g. ipoib and most wifi drivers) can retain the
> to-be-xmitted packets on some internal queue for a possibly
> unlimited time.

This is really not legal, such behavior also potentially holds onto
sockets forever.

If driver really wants to hold onto an SKB potentially forever, it
must release these resources.

This belongs in the driver that violates, and I strongly say
"violates", the basic fundamental rules of TX handling in the stack.
Drivers absolutely must liberate queued TX packets in a reasonable,
finite, amount of time.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: reset ct before calling ndo_start_xmit
  2017-01-24  9:40 [PATCH net] net: reset ct before calling ndo_start_xmit Paolo Abeni
  2017-01-24 14:04 ` Eric Dumazet
  2017-01-24 16:21 ` David Miller
@ 2017-01-24 17:37 ` Stephen Hemminger
  2 siblings, 0 replies; 6+ messages in thread
From: Stephen Hemminger @ 2017-01-24 17:37 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David S. Miller, Hannes Frederic Sowa, Florian Westphal

On Tue, 24 Jan 2017 10:40:13 +0100
Paolo Abeni <pabeni@redhat.com> wrote:

>  
> +/* return true if we should preserve skb nfct before calling ndo_start_xmit() */
> +static inline bool netif_needs_ct(struct net_device *dev)
> +{
> +	/* any kind of virtual device needs to preserve the ct entry */
> +	return dev->features & NETIF_F_LLTX;
> +}
> +

I am not sure I fully understand the problem, and this looks like a special case
hack. You are overloading meaning of LLTX flag in a new way.

If you must do it then code this in place, and put comment there.
The addition of accessor function just obscures the purpose.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-01-24 17:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-24  9:40 [PATCH net] net: reset ct before calling ndo_start_xmit Paolo Abeni
2017-01-24 14:04 ` Eric Dumazet
2017-01-24 15:04   ` Paolo Abeni
2017-01-24 15:42     ` Eric Dumazet
2017-01-24 16:21 ` David Miller
2017-01-24 17:37 ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).