netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net: Add ndo_gso_check
@ 2014-09-29  3:50 Tom Herbert
  2014-09-29 19:59 ` Or Gerlitz
                   ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Tom Herbert @ 2014-09-29  3:50 UTC (permalink / raw)
  To: davem, netdev, gerlitz.or

Add ndo_gso_check which a device can define to indicate whether is
is capable of doing GSO on a packet. This funciton would be called from
the stack to determine whether software GSO is needed to be done. A
driver should populate this function if it advertises GSO types for
which there are combinations that it wouldn't be able to handle. For
instance a device that performs UDP tunneling might only implement
support for transparent Ethernet bridging type of inner packets
or might have limitations on lengths of inner headers.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 include/linux/netdevice.h | 12 +++++++++++-
 net/core/dev.c            |  2 +-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9f5d293..f8c2027 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -997,6 +997,12 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *	Callback to use for xmit over the accelerated station. This
  *	is used in place of ndo_start_xmit on accelerated net
  *	devices.
+ * bool	(*ndo_gso_check) (struct sk_buff *skb,
+ *			  struct net_device *dev);
+ *	Called by core transmit path to determine if device is capable of
+ *	performing GSO on a packet. The device returns true if it is
+ *	able to GSO the packet, false otherwise. If the return value is
+ *	false the stack will do software GSO.
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1146,6 +1152,8 @@ struct net_device_ops {
 							struct net_device *dev,
 							void *priv);
 	int			(*ndo_get_lock_subclass)(struct net_device *dev);
+	bool			(*ndo_gso_check) (struct sk_buff *skb,
+						  struct net_device *dev);
 };
 
 /**
@@ -3536,10 +3544,12 @@ static inline bool skb_gso_ok(struct sk_buff *skb, netdev_features_t features)
 	       (!skb_has_frag_list(skb) || (features & NETIF_F_FRAGLIST));
 }
 
-static inline bool netif_needs_gso(struct sk_buff *skb,
+static inline bool netif_needs_gso(struct net_device *dev, struct sk_buff *skb,
 				   netdev_features_t features)
 {
 	return skb_is_gso(skb) && (!skb_gso_ok(skb, features) ||
+		(dev->netdev_ops->ndo_gso_check &&
+		 !dev->netdev_ops->ndo_gso_check(skb, dev)) ||
 		unlikely((skb->ip_summed != CHECKSUM_PARTIAL) &&
 			 (skb->ip_summed != CHECKSUM_UNNECESSARY)));
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index e2ced01..8c2b9bb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2680,7 +2680,7 @@ struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev)
 	if (skb->encapsulation)
 		features &= dev->hw_enc_features;
 
-	if (netif_needs_gso(skb, features)) {
+	if (netif_needs_gso(dev, skb, features)) {
 		struct sk_buff *segs;
 
 		segs = skb_gso_segment(skb, features);
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related	[flat|nested] 51+ messages in thread
* Re: [PATCH] net: Add ndo_gso_check
@ 2014-10-07 16:50 Alexei Starovoitov
  2014-10-07 17:05 ` David Miller
  2014-10-07 17:23 ` Eric Dumazet
  0 siblings, 2 replies; 51+ messages in thread
From: Alexei Starovoitov @ 2014-10-07 16:50 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Jesse Gross, Or Gerlitz, Alexander Duyck, John Fastabend,
	Jeff Kirsher, David Miller, Linux Netdev List, Thomas Graf,
	Pravin Shelar, Andy Zhou

On Mon, Oct 6, 2014 at 5:17 PM, Tom Herbert <therbert@google.com> wrote:
> On Mon, Oct 6, 2014 at 3:33 PM, Jesse Gross <jesse@nicira.com> wrote:
>
>> I have no disagreement with trying to be generic across protocols. I'm
>> just not convinced that it is a realistic plan. It's obvious that it
>> is not doable today nor will be it be in the next generation of NICs
>> (which are guaranteed to add support for new protocols). Furthermore,
>> there will be more advanced stuff coming in the future that I think
>> will be difficult or impossible to make protocol agnostic. Rather than
>> pretending that this doesn't exist or will never happen, it's better
>> focus on how to integrating it cleanly.
>
> Sorry, but I don't understand how supporting a new protocols in a
> device for the purposes of returning CHECKSUM_UNNECESSARY is better or
> easier to implement than just returning CHECKSUM_COMPLETE. Same thing
> for trying to use NETIF_F_IP_CSUM with encapsulation rather than
> NETIF_F_HW_CSUM. I'm not a hardware guy, so it's possible I'm missing
> something obvious...

it's definitely more difficult to properly implement
CHECKSUM_UNNECESSARY in HW, but it's worth it.
CHECKSUM_COMPLETE is a burden on software. Old NICs
used to do that, but overhead of recomputing csum for every
step of packet parsing and header modifications is too high.
sw routers, bridges and < L4 networking devices are
simpler and faster with CHECKSUM_UNNECESSARY.

^ permalink raw reply	[flat|nested] 51+ messages in thread
* Re: [PATCH] net: Add ndo_gso_check
@ 2014-10-07 22:05 Alexei Starovoitov
  2014-10-07 23:43 ` Tom Herbert
  0 siblings, 1 reply; 51+ messages in thread
From: Alexei Starovoitov @ 2014-10-07 22:05 UTC (permalink / raw)
  To: David Miller
  Cc: Tom Herbert, Jesse Gross, gerlitz.or@gmail.com, Alexander Duyck,
	John Fastabend, Jeff Kirsher, netdev@vger.kernel.org, Thomas Graf,
	Pravin Shelar, Andy Zhou

On Tue, Oct 7, 2014 at 1:48 PM, David Miller <davem@davemloft.net> wrote:
>> They're exposing packet parsers to users, so we will be able to
>> program any protocol into the device without reflashing it.
>> Some of the guys are even allowing reprogramming the parser
>> while packets are flowing.
>
> So we have to write new software in _EVERY_ driver to accomodate this.
>
> That makes zero sense either, and it is unneeded complexity in the
> hardware.

I have to agree that for single purpose of csum verify
adding all the complexity around programmable
parsers doesn't make sense.
To me packet parsing is the first step of HW offload,
regardless whether it's flow based or what else this hw can do.

I'd rather see HW vendors provide programmable parser
for any and all protocols instead of them saying:
here is my device X that supports protocols defined
by standards Y and Z.
(which is the case today and it's not pretty)

> COMPLETE works everywhere, on everything, with no driver changes, and
> is so much harder to get wrong.

agree. I'm not against COMPLETE in any way.

> Every protocol specific feature has major downsides whether you choose
> to see them or not.

I surely don't have a preference to a protocol.
I want to see programmable HW that supports any crazy
packet format.

Let's get back to COMPLETE and VMs example,
because I want to see it fixed.
In such case kernel stack in hypervisor will be pulling
encap headers then populating new field in virtio
ring with updated csum and let guest VMs to continue
adjusting that csum, right?
Alternative would be to do inner processing and
csum verification in hypervisor so that packet
can be marked as DATA_VALID, but is it better?

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2015-01-15 18:18 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-29  3:50 [PATCH] net: Add ndo_gso_check Tom Herbert
2014-09-29 19:59 ` Or Gerlitz
2014-09-29 20:12   ` David Miller
2014-09-29 20:53   ` Tom Herbert
2014-09-29 21:10     ` Or Gerlitz
2014-09-29 21:38       ` Tom Herbert
2014-09-30 14:30         ` Or Gerlitz
2014-09-30 15:34           ` Tom Herbert
2014-09-30 21:37             ` Stephen Hemminger
2014-09-30 22:11               ` Eric Dumazet
2014-10-01  0:05               ` Tom Herbert
2014-10-01  0:18                 ` Eric Dumazet
2014-10-01  6:12                   ` Eric Dumazet
2014-10-01 20:58             ` Or Gerlitz
2014-10-01 21:12               ` Jesse Gross
2014-10-01 23:06               ` Tom Herbert
2014-10-05 14:04                 ` Or Gerlitz
2014-10-05 18:49                   ` Tom Herbert
2014-10-05 19:13                     ` Or Gerlitz
2014-10-06 17:59                       ` Tom Herbert
2014-10-06 20:22                         ` Or Gerlitz
2014-10-06 21:28                           ` Tom Herbert
2014-10-07 11:07                             ` Or Gerlitz
2015-01-15 18:18                             ` Or Gerlitz
2014-10-06 22:33                         ` Jesse Gross
2014-10-07  0:17                           ` Tom Herbert
2014-10-09  0:30                             ` Jesse Gross
2014-10-09  1:46                               ` Tom Herbert
2014-10-06 21:51                     ` Jesse Gross
2014-09-29 20:13 ` Jesse Gross
2014-09-29 20:47   ` Tom Herbert
2014-09-30  0:33     ` Jesse Gross
2014-09-30  1:59       ` Tom Herbert
2014-10-07 18:13 ` Tom Herbert
2014-10-07 20:15   ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2014-10-07 16:50 Alexei Starovoitov
2014-10-07 17:05 ` David Miller
2014-10-07 17:18   ` Alexei Starovoitov
2014-10-07 18:47     ` David Miller
2014-10-07 20:28       ` Alexei Starovoitov
2014-10-07 20:32         ` David Miller
2014-10-07 20:43           ` Alexei Starovoitov
2014-10-07 20:48             ` David Miller
2014-10-07 21:41               ` Thomas Graf
2014-10-07 17:23 ` Eric Dumazet
2014-10-07 18:15   ` Alexei Starovoitov
2014-10-07 18:50     ` Eric Dumazet
2014-10-07 18:51     ` David Miller
2014-10-07 19:10       ` Tom Herbert
2014-10-07 22:05 Alexei Starovoitov
2014-10-07 23:43 ` Tom Herbert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).