netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Definition and usage of NETIF_F_HW_SUM?
@ 2007-05-29 20:58 Stephen Hemminger
  2007-05-29 21:36 ` Herbert Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2007-05-29 20:58 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev

The flag NETIF_F_HW_SUM is being misused. The definition says that the device
is capable of checksumming any packet. When in fact from usage it seems to
imply that the device is capable of doing IPV6 as well as IPV4.

Some devices like 8139too do "fake checksum offloading" because they always have to copy
the packet.

Some devices like via-rhine, don't really checksum but if they see CHECKSUM_PARTIAL then they
copy. This is bogus, they should just let higher layer do checksum/copy.

Devices like e1000, and bnx2 are broken because they assume only TCP/UDP and IPV4/IPV6. 
The definition of the flag says other protocols should work, but they probably send the
hardware into a state of confusion.

A few devices take a offset, starting point, and insertion point. This looks like
the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum
offload at present, so it seems moot.

IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag
NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM.


-- 
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Definition and usage of NETIF_F_HW_SUM?
  2007-05-29 20:58 Definition and usage of NETIF_F_HW_SUM? Stephen Hemminger
@ 2007-05-29 21:36 ` Herbert Xu
  2007-05-29 21:58   ` Stephen Hemminger
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Herbert Xu @ 2007-05-29 21:36 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Michael Chan

On Tue, May 29, 2007 at 01:58:13PM -0700, Stephen Hemminger wrote:
> The flag NETIF_F_HW_SUM is being misused. The definition says that the device
> is capable of checksumming any packet. When in fact from usage it seems to
> imply that the device is capable of doing IPV6 as well as IPV4.

That would be a problem.

> Some devices like 8139too do "fake checksum offloading" because they always have to copy
> the packet.
> 
> Some devices like via-rhine, don't really checksum but if they see CHECKSUM_PARTIAL then they
> copy. This is bogus, they should just let higher layer do checksum/copy.

Actually this is OK because if they have to copy it then it's cheaper to
checksum it there.  Both of these should be able to support all protocols.

> Devices like e1000, and bnx2 are broken because they assume only TCP/UDP and IPV4/IPV6. 
> The definition of the flag says other protocols should work, but they probably send the
> hardware into a state of confusion.

I just checked e1000 and it's correct as it does use the csum_offset
when doing TX offload.  However, you're definitely right that bnx2
seems to be broken.

> A few devices take a offset, starting point, and insertion point. This looks like
> the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum
> offload at present, so it seems moot.

I could easily whip up a patch to get GRE to use it for a start :)

> IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag
> NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM.

We should definitely keep NETIF_F_HW_SUM for sane hardware such as the
e1000.  Unfortunately we may just have to invent IPV6_SUM for the broken
ones.

Ccing Michael to see if the bnx2 chip can actually do offset-based
checksum offload.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Definition and usage of NETIF_F_HW_SUM?
  2007-05-29 21:36 ` Herbert Xu
@ 2007-05-29 21:58   ` Stephen Hemminger
  2007-05-30  0:10   ` Michael Chan
  2007-05-30 15:53   ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger
  2 siblings, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2007-05-29 21:58 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, Michael Chan

On Wed, 30 May 2007 07:36:18 +1000
Herbert Xu <herbert.xu@redhat.com> wrote:

> On Tue, May 29, 2007 at 01:58:13PM -0700, Stephen Hemminger wrote:
> > The flag NETIF_F_HW_SUM is being misused. The definition says that the device
> > is capable of checksumming any packet. When in fact from usage it seems to
> > imply that the device is capable of doing IPV6 as well as IPV4.
> 
> That would be a problem.
> 
> > Some devices like 8139too do "fake checksum offloading" because they always have to copy
> > the packet.
> > 
> > Some devices like via-rhine, don't really checksum but if they see CHECKSUM_PARTIAL then they
> > copy. This is bogus, they should just let higher layer do checksum/copy.
> 
> Actually this is OK because if they have to copy it then it's cheaper to
> checksum it there.  Both of these should be able to support all protocols.
> 
> > Devices like e1000, and bnx2 are broken because they assume only TCP/UDP and IPV4/IPV6. 
> > The definition of the flag says other protocols should work, but they probably send the
> > hardware into a state of confusion.
> 
> I just checked e1000 and it's correct as it does use the csum_offset
> when doing TX offload.  However, you're definitely right that bnx2
> seems to be broken.
> 
> > A few devices take a offset, starting point, and insertion point. This looks like
> > the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum
> > offload at present, so it seems moot.
> 
> I could easily whip up a patch to get GRE to use it for a start :)
> 
> > IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag
> > NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM.
> 
> We should definitely keep NETIF_F_HW_SUM for sane hardware such as the
> e1000.  Unfortunately we may just have to invent IPV6_SUM for the broken
> ones.
> 

The Marvell 88e8071 does IPV4 and IPV6 checksum, earlier chips could do arbitrary
checksum. Looks like when they added the TSO6 logic, they made transmit state machine
more protocol dependent.
-- 
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Definition and usage of NETIF_F_HW_SUM?
  2007-05-30  0:10   ` Michael Chan
@ 2007-05-29 23:45     ` Stephen Hemminger
  2007-06-04 15:35       ` Ron Mercer
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2007-05-29 23:45 UTC (permalink / raw)
  To: Michael Chan; +Cc: Herbert Xu, netdev

On Tue, 29 May 2007 17:10:52 -0700
"Michael Chan" <mchan@broadcom.com> wrote:

> On Wed, 2007-05-30 at 07:36 +1000, Herbert Xu wrote:
> 
> > 
> > I just checked e1000 and it's correct as it does use the csum_offset
> > when doing TX offload.  However, you're definitely right that bnx2
> > seems to be broken.
> > 
> > > A few devices take a offset, starting point, and insertion point. This looks like
> > > the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum
> > > offload at present, so it seems moot.
> > 
> > I could easily whip up a patch to get GRE to use it for a start :)
> > 
> > > IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag
> > > NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM.
> > 
> > We should definitely keep NETIF_F_HW_SUM for sane hardware such as the
> > e1000.  Unfortunately we may just have to invent IPV6_SUM for the broken
> > ones.
> > 
> > Ccing Michael to see if the bnx2 chip can actually do offset-based
> > checksum offload.
> > 
> 
> bnx2 and tg3 cannot do offset-based checksumming because the hardware
> doesn't have room in the buffer descriptors to specify the offsets.  So
> regrettably, the NETIF_F_HW_SUM flag has been misused in these drivers.
> A new NETIF_F_IPV6_SUM flag will be very useful for us.

Look furthur many drivers are just plain broken and use F_HW_SUM
and can't even do IPV6 properly.  I'll fix

The worst code award goes to: qla3xxx.c
which is broken on IPV6 and goes to trouble of computing all the
offsets and they are already there in skb...



-- 
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Definition and usage of NETIF_F_HW_SUM?
  2007-05-29 21:36 ` Herbert Xu
  2007-05-29 21:58   ` Stephen Hemminger
@ 2007-05-30  0:10   ` Michael Chan
  2007-05-29 23:45     ` Stephen Hemminger
  2007-05-30 15:53   ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger
  2 siblings, 1 reply; 10+ messages in thread
From: Michael Chan @ 2007-05-30  0:10 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Stephen Hemminger, netdev

On Wed, 2007-05-30 at 07:36 +1000, Herbert Xu wrote:

> 
> I just checked e1000 and it's correct as it does use the csum_offset
> when doing TX offload.  However, you're definitely right that bnx2
> seems to be broken.
> 
> > A few devices take a offset, starting point, and insertion point. This looks like
> > the correct model. But no upper layer protocols other than IPV4/IPV6 can do checksum
> > offload at present, so it seems moot.
> 
> I could easily whip up a patch to get GRE to use it for a start :)
> 
> > IMHO the correct solution would be to get rid if NETIF_F_HW_SUM and make a new flag
> > NETIF_F_IPV6_SUM. Devices that can checksum both could do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM.
> 
> We should definitely keep NETIF_F_HW_SUM for sane hardware such as the
> e1000.  Unfortunately we may just have to invent IPV6_SUM for the broken
> ones.
> 
> Ccing Michael to see if the bnx2 chip can actually do offset-based
> checksum offload.
> 

bnx2 and tg3 cannot do offset-based checksumming because the hardware
doesn't have room in the buffer descriptors to specify the offsets.  So
regrettably, the NETIF_F_HW_SUM flag has been misused in these drivers.
A new NETIF_F_IPV6_SUM flag will be very useful for us.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC] IPV6 checksum offloading in network devices
  2007-05-29 21:36 ` Herbert Xu
  2007-05-29 21:58   ` Stephen Hemminger
  2007-05-30  0:10   ` Michael Chan
@ 2007-05-30 15:53   ` Stephen Hemminger
  2007-05-30 16:13     ` Patrick McHardy
  2 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2007-05-30 15:53 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller; +Cc: netdev, Michael Chan

Here is a proposed change to address hardware that can do IPV6 checksum
offload, but doesn't truly do generic hw checksumming.  The bnx2 and tg3
are like this for some revisions, and upcoming Marvell 88e8071 is similar.


--- a/drivers/net/bnx2.c	2007-05-30 08:26:18.000000000 -0700
+++ b/drivers/net/bnx2.c	2007-05-30 08:26:32.000000000 -0700
@@ -6471,10 +6471,10 @@ bnx2_init_one(struct pci_dev *pdev, cons
 	memcpy(dev->perm_addr, bp->mac_addr, 6);
 	bp->name = board_info[ent->driver_data].name;
 
+	dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
 	if (CHIP_NUM(bp) == CHIP_NUM_5709)
-		dev->features |= NETIF_F_HW_CSUM | NETIF_F_SG;
-	else
-		dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
+		dev->features |= NETIF_F_IPV6_CSUM;
+
 #ifdef BCM_VLAN
 	dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
 #endif
--- a/drivers/net/tg3.c	2007-05-30 08:26:18.000000000 -0700
+++ b/drivers/net/tg3.c	2007-05-30 08:26:32.000000000 -0700
@@ -11959,12 +11959,11 @@ static int __devinit tg3_init_one(struct
 	 * checksumming.
 	 */
 	if ((tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) == 0) {
+		dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
 		if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5755 ||
 		    GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5787)
-			dev->features |= NETIF_F_HW_CSUM;
-		else
-			dev->features |= NETIF_F_IP_CSUM;
-		dev->features |= NETIF_F_SG;
+			dev->features |= NETIF_F_IPV6_CSUM;
+
 		tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS;
 	} else
 		tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS;
--- a/include/linux/netdevice.h	2007-05-30 08:26:18.000000000 -0700
+++ b/include/linux/netdevice.h	2007-05-30 08:30:20.000000000 -0700
@@ -314,9 +314,10 @@ struct net_device
 	/* Net device features */
 	unsigned long		features;
 #define NETIF_F_SG		1	/* Scatter/gather IO. */
-#define NETIF_F_IP_CSUM		2	/* Can checksum only TCP/UDP over IPv4. */
+#define NETIF_F_IP_CSUM		2	/* Can checksum TCP/UDP over IPv4. */
 #define NETIF_F_NO_CSUM		4	/* Does not require checksum. F.e. loopack. */
 #define NETIF_F_HW_CSUM		8	/* Can checksum all the packets. */
+#define NETIF_F_IPV6_CSUM	16	/* Can checksum TCP/UDP over IPV6 */
 #define NETIF_F_HIGHDMA		32	/* Can DMA to high memory. */
 #define NETIF_F_FRAGLIST	64	/* Scatter/gather IO. */
 #define NETIF_F_HW_VLAN_TX	128	/* Transmit VLAN hw acceleration */
@@ -339,7 +340,8 @@ struct net_device
 #define NETIF_F_GSO_SOFTWARE	(NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)
 
 #define NETIF_F_GEN_CSUM	(NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
-#define NETIF_F_ALL_CSUM	(NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)
+#define NETIF_F_ALL_CSUM	(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM \
+				 | NETIF_F_GEN_CSUM)
 
 	struct net_device	*next_sched;
 
--- a/net/bridge/br_if.c	2007-05-30 08:26:18.000000000 -0700
+++ b/net/bridge/br_if.c	2007-05-30 08:26:32.000000000 -0700
@@ -368,10 +368,18 @@ void br_features_recompute(struct net_br
 	list_for_each_entry(p, &br->port_list, list) {
 		unsigned long feature = p->dev->features;
 
+		/* if device needs checksumming, downgrade to hw checksumming */
 		if (checksum & NETIF_F_NO_CSUM && !(feature & NETIF_F_NO_CSUM))
 			checksum ^= NETIF_F_NO_CSUM | NETIF_F_HW_CSUM;
+
+		/* if device can't do all checksum, downgrade to ipv4/ipv6 */
 		if (checksum & NETIF_F_HW_CSUM && !(feature & NETIF_F_HW_CSUM))
-			checksum ^= NETIF_F_HW_CSUM | NETIF_F_IP_CSUM;
+			checksum ^= NETIF_F_HW_CSUM
+				| NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
+
+		if (checksum & NETIF_F_IPV6_CSUM && !(feature & NETIF_F_IPV6_CSUM))
+			checksum &= ~NETIF_F_IPV6_CSUM;
+
 		if (!(feature & NETIF_F_IP_CSUM))
 			checksum = 0;
 
--- a/net/core/dev.c	2007-05-30 08:26:18.000000000 -0700
+++ b/net/core/dev.c	2007-05-30 08:49:54.000000000 -0700
@@ -1509,9 +1509,11 @@ int dev_queue_xmit(struct sk_buff *skb)
 		skb_set_transport_header(skb, skb->csum_start -
 					      skb_headroom(skb));
 
-		if (!(dev->features & NETIF_F_GEN_CSUM) &&
-		    (!(dev->features & NETIF_F_IP_CSUM) ||
-		     skb->protocol != htons(ETH_P_IP)))
+		if (!(dev->features & NETIF_F_GEN_CSUM)
+		    || ((dev->features & NETIF_F_IP_CSUM)
+			&& skb->protocol == htons(ETH_P_IP))
+		    || ((dev->features & NETIF_F_IPV6_CSUM)
+			&& skb->protocol == htons(ETH_P_IPV6)))
 			if (skb_checksum_help(skb))
 				goto out_kfree_skb;
 	}
@@ -3105,6 +3107,22 @@ int register_netdevice(struct net_device
 		}
 	}
 
+	/* Fix illegal checksum combinations */
+	if ((dev->features & NETIF_F_HW_CSUM) &&
+	    (dev->features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) {
+		printk(KERN_NOTICE "%s: mixed HW and IP checksum settings.\n",
+		       dev->name);
+		dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM);
+	}
+
+	if ((dev->features & NETIF_F_NO_CSUM) &&
+	    (dev->features & (NETIF_F_HW_CSUM|NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) {
+		printk(KERN_NOTICE "%s: mixed no checksumming and other settings.\n",
+		       dev->name);
+		dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM|NETIF_F_HW_CSUM);
+	}
+
+
 	/* Fix illegal SG+CSUM combinations. */
 	if ((dev->features & NETIF_F_SG) &&
 	    !(dev->features & NETIF_F_ALL_CSUM)) {
--- a/net/ipv6/ipv6_sockglue.c	2007-05-29 16:44:45.000000000 -0700
+++ b/net/ipv6/ipv6_sockglue.c	2007-05-30 08:48:35.000000000 -0700
@@ -123,7 +123,7 @@ static struct sk_buff *ipv6_gso_segment(
 	struct ipv6hdr *ipv6h;
 	struct inet6_protocol *ops;
 
-	if (!(features & NETIF_F_HW_CSUM))
+	if (!(features & (NETIF_F_HW_CSUM|NETIF_F_IPV6_CSUM))
 		features &= ~NETIF_F_SG;
 
 	if (unlikely(skb_shinfo(skb)->gso_type &

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] IPV6 checksum offloading in network devices
  2007-05-30 15:53   ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger
@ 2007-05-30 16:13     ` Patrick McHardy
  2007-05-30 21:00       ` Stephen Hemminger
  0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2007-05-30 16:13 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Herbert Xu, David S. Miller, netdev, Michael Chan

Stephen Hemminger wrote:
> Here is a proposed change to address hardware that can do IPV6 checksum
> offload, but doesn't truly do generic hw checksumming.  The bnx2 and tg3
> are like this for some revisions, and upcoming Marvell 88e8071 is similar.
> 
> 
> --- a/include/linux/netdevice.h	2007-05-30 08:26:18.000000000 -0700
> +++ b/include/linux/netdevice.h	2007-05-30 08:30:20.000000000 -0700
> @@ -314,9 +314,10 @@ struct net_device
>  	/* Net device features */
>  	unsigned long		features;
>  #define NETIF_F_SG		1	/* Scatter/gather IO. */
> -#define NETIF_F_IP_CSUM		2	/* Can checksum only TCP/UDP over IPv4. */
> +#define NETIF_F_IP_CSUM		2	/* Can checksum TCP/UDP over IPv4. */
>  #define NETIF_F_NO_CSUM		4	/* Does not require checksum. F.e. loopack. */
>  #define NETIF_F_HW_CSUM		8	/* Can checksum all the packets. */
> +#define NETIF_F_IPV6_CSUM	16	/* Can checksum TCP/UDP over IPV6 */
>  #define NETIF_F_HIGHDMA		32	/* Can DMA to high memory. */
>  #define NETIF_F_FRAGLIST	64	/* Scatter/gather IO. */
>  #define NETIF_F_HW_VLAN_TX	128	/* Transmit VLAN hw acceleration */
> @@ -339,7 +340,8 @@ struct net_device
>  #define NETIF_F_GSO_SOFTWARE	(NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)
>  
>  #define NETIF_F_GEN_CSUM	(NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
> -#define NETIF_F_ALL_CSUM	(NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)
> +#define NETIF_F_ALL_CSUM	(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM \
> +				 | NETIF_F_GEN_CSUM)


This might confuse some of the existing IPv4 code that
checks for NETIF_F_ALL_CSUM if we ever get a device that
has NETIF_F_IPV6_CSUM but not NETIF_F_IP_CSUM.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] IPV6 checksum offloading in network devices
  2007-05-30 16:13     ` Patrick McHardy
@ 2007-05-30 21:00       ` Stephen Hemminger
  2007-06-27  7:44         ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2007-05-30 21:00 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Herbert Xu, David S. Miller, netdev, Michael Chan

This is better. 
There is still a possiblity when a device allows IPV6 and not IPV4
checksumming, that the checksum will be done in the fixup code in
dev_queue_xmit.

The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies device can do any arbitrary protocol.

This patch:
 * adds NETIF_F_IPV6_CSUM for those devices
 * fixes bnx2 and tg3 devices that need it
 * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
 * fixes assumptions about NETIF_F_ALL_CSUM in nat
 * adjusts bridge union of checksumming computation


---
 drivers/net/bnx2.c                 |    6 +++---
 drivers/net/tg3.c                  |    7 +++----
 include/linux/netdevice.h          |    8 ++++++--
 net/bridge/br_if.c                 |   10 +++++++++-
 net/core/dev.c                     |   24 +++++++++++++++++++++---
 net/ipv4/af_inet.c                 |    3 +++
 net/ipv4/ip_output.c               |    2 +-
 net/ipv4/netfilter/nf_nat_helper.c |    4 ++--
 net/ipv6/ipv6_sockglue.c           |    2 +-
 9 files changed, 49 insertions(+), 17 deletions(-)

--- a/drivers/net/bnx2.c	2007-05-30 08:26:18.000000000 -0700
+++ b/drivers/net/bnx2.c	2007-05-30 08:26:32.000000000 -0700
@@ -6471,10 +6471,10 @@ bnx2_init_one(struct pci_dev *pdev, cons
 	memcpy(dev->perm_addr, bp->mac_addr, 6);
 	bp->name = board_info[ent->driver_data].name;
 
+	dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
 	if (CHIP_NUM(bp) == CHIP_NUM_5709)
-		dev->features |= NETIF_F_HW_CSUM | NETIF_F_SG;
-	else
-		dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
+		dev->features |= NETIF_F_IPV6_CSUM;
+
 #ifdef BCM_VLAN
 	dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
 #endif
--- a/drivers/net/tg3.c	2007-05-30 08:26:18.000000000 -0700
+++ b/drivers/net/tg3.c	2007-05-30 08:26:32.000000000 -0700
@@ -11959,12 +11959,11 @@ static int __devinit tg3_init_one(struct
 	 * checksumming.
 	 */
 	if ((tp->tg3_flags & TG3_FLAG_BROKEN_CHECKSUMS) == 0) {
+		dev->features |= NETIF_F_IP_CSUM | NETIF_F_SG;
 		if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5755 ||
 		    GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5787)
-			dev->features |= NETIF_F_HW_CSUM;
-		else
-			dev->features |= NETIF_F_IP_CSUM;
-		dev->features |= NETIF_F_SG;
+			dev->features |= NETIF_F_IPV6_CSUM;
+
 		tp->tg3_flags |= TG3_FLAG_RX_CHECKSUMS;
 	} else
 		tp->tg3_flags &= ~TG3_FLAG_RX_CHECKSUMS;
--- a/include/linux/netdevice.h	2007-05-30 08:26:18.000000000 -0700
+++ b/include/linux/netdevice.h	2007-05-30 10:10:57.000000000 -0700
@@ -314,9 +314,10 @@ struct net_device
 	/* Net device features */
 	unsigned long		features;
 #define NETIF_F_SG		1	/* Scatter/gather IO. */
-#define NETIF_F_IP_CSUM		2	/* Can checksum only TCP/UDP over IPv4. */
+#define NETIF_F_IP_CSUM		2	/* Can checksum TCP/UDP over IPv4. */
 #define NETIF_F_NO_CSUM		4	/* Does not require checksum. F.e. loopack. */
 #define NETIF_F_HW_CSUM		8	/* Can checksum all the packets. */
+#define NETIF_F_IPV6_CSUM	16	/* Can checksum TCP/UDP over IPV6 */
 #define NETIF_F_HIGHDMA		32	/* Can DMA to high memory. */
 #define NETIF_F_FRAGLIST	64	/* Scatter/gather IO. */
 #define NETIF_F_HW_VLAN_TX	128	/* Transmit VLAN hw acceleration */
@@ -338,8 +339,11 @@ struct net_device
 	/* List of features with software fallbacks. */
 #define NETIF_F_GSO_SOFTWARE	(NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)
 
+
 #define NETIF_F_GEN_CSUM	(NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
-#define NETIF_F_ALL_CSUM	(NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)
+#define NETIF_F_V4_CSUM		(NETIF_F_GEN_CSUM | NETIF_F_IP_CSUM)
+#define NETIF_F_V6_CSUM		(NETIF_F_GEN_CSUM | NETIF_F_IPV6_CSUM)
+#define NETIF_F_ALL_CSUM	(NETIF_F_V4_CSUM | NETIF_F_V6_CSUM)
 
 	struct net_device	*next_sched;
 
--- a/net/bridge/br_if.c	2007-05-30 08:26:18.000000000 -0700
+++ b/net/bridge/br_if.c	2007-05-30 10:07:22.000000000 -0700
@@ -368,10 +368,18 @@ void br_features_recompute(struct net_br
 	list_for_each_entry(p, &br->port_list, list) {
 		unsigned long feature = p->dev->features;
 
+		/* if device needs checksumming, downgrade to hw checksumming */
 		if (checksum & NETIF_F_NO_CSUM && !(feature & NETIF_F_NO_CSUM))
 			checksum ^= NETIF_F_NO_CSUM | NETIF_F_HW_CSUM;
+
+		/* if device can't do all checksum, downgrade to ipv4/ipv6 */
 		if (checksum & NETIF_F_HW_CSUM && !(feature & NETIF_F_HW_CSUM))
-			checksum ^= NETIF_F_HW_CSUM | NETIF_F_IP_CSUM;
+			checksum ^= NETIF_F_HW_CSUM
+				| NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
+
+		if (checksum & NETIF_F_IPV6_CSUM && !(feature & NETIF_F_IPV6_CSUM))
+			checksum &= ~NETIF_F_IPV6_CSUM;
+
 		if (!(feature & NETIF_F_IP_CSUM))
 			checksum = 0;
 
--- a/net/core/dev.c	2007-05-30 08:26:18.000000000 -0700
+++ b/net/core/dev.c	2007-05-30 10:07:31.000000000 -0700
@@ -1509,9 +1509,11 @@ int dev_queue_xmit(struct sk_buff *skb)
 		skb_set_transport_header(skb, skb->csum_start -
 					      skb_headroom(skb));
 
-		if (!(dev->features & NETIF_F_GEN_CSUM) &&
-		    (!(dev->features & NETIF_F_IP_CSUM) ||
-		     skb->protocol != htons(ETH_P_IP)))
+		if (!(dev->features & NETIF_F_GEN_CSUM)
+		    || ((dev->features & NETIF_F_IP_CSUM)
+			&& skb->protocol == htons(ETH_P_IP))
+		    || ((dev->features & NETIF_F_IPV6_CSUM)
+			&& skb->protocol == htons(ETH_P_IPV6)))
 			if (skb_checksum_help(skb))
 				goto out_kfree_skb;
 	}
@@ -3105,6 +3107,22 @@ int register_netdevice(struct net_device
 		}
 	}
 
+	/* Fix illegal checksum combinations */
+	if ((dev->features & NETIF_F_HW_CSUM) &&
+	    (dev->features & (NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) {
+		printk(KERN_NOTICE "%s: mixed HW and IP checksum settings.\n",
+		       dev->name);
+		dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM);
+	}
+
+	if ((dev->features & NETIF_F_NO_CSUM) &&
+	    (dev->features & (NETIF_F_HW_CSUM|NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM))) {
+		printk(KERN_NOTICE "%s: mixed no checksumming and other settings.\n",
+		       dev->name);
+		dev->features &= ~(NETIF_F_IP_CSUM|NETIF_F_IPV6_CSUM|NETIF_F_HW_CSUM);
+	}
+
+
 	/* Fix illegal SG+CSUM combinations. */
 	if ((dev->features & NETIF_F_SG) &&
 	    !(dev->features & NETIF_F_ALL_CSUM)) {
--- a/net/ipv6/ipv6_sockglue.c	2007-05-29 16:44:45.000000000 -0700
+++ b/net/ipv6/ipv6_sockglue.c	2007-05-30 09:54:52.000000000 -0700
@@ -123,7 +123,7 @@ static struct sk_buff *ipv6_gso_segment(
 	struct ipv6hdr *ipv6h;
 	struct inet6_protocol *ops;
 
-	if (!(features & NETIF_F_HW_CSUM))
+	if (!(features & NETIF_F_V6_CSUM))
 		features &= ~NETIF_F_SG;
 
 	if (unlikely(skb_shinfo(skb)->gso_type &
--- a/net/ipv4/af_inet.c	2007-05-29 16:44:45.000000000 -0700
+++ b/net/ipv4/af_inet.c	2007-05-30 09:55:11.000000000 -0700
@@ -1170,6 +1170,9 @@ static struct sk_buff *inet_gso_segment(
 	int ihl;
 	int id;
 
+	if (!(features & NETIF_F_V4_CSUM))
+		features &= ~NETIF_F_SG;
+
 	if (unlikely(skb_shinfo(skb)->gso_type &
 		     ~(SKB_GSO_TCPV4 |
 		       SKB_GSO_UDP |
--- a/net/ipv4/ip_output.c	2007-05-29 16:44:45.000000000 -0700
+++ b/net/ipv4/ip_output.c	2007-05-30 09:51:33.000000000 -0700
@@ -837,7 +837,7 @@ int ip_append_data(struct sock *sk,
 	 */
 	if (transhdrlen &&
 	    length + fragheaderlen <= mtu &&
-	    rt->u.dst.dev->features & NETIF_F_ALL_CSUM &&
+	    rt->u.dst.dev->features & NETIF_F_V4_CSUM &&
 	    !exthdrlen)
 		csummode = CHECKSUM_PARTIAL;
 
--- a/net/ipv4/netfilter/nf_nat_helper.c	2007-05-29 16:44:45.000000000 -0700
+++ b/net/ipv4/netfilter/nf_nat_helper.c	2007-05-30 10:01:33.000000000 -0700
@@ -178,7 +178,7 @@ nf_nat_mangle_tcp_packet(struct sk_buff 
 	datalen = (*pskb)->len - iph->ihl*4;
 	if ((*pskb)->ip_summed != CHECKSUM_PARTIAL) {
 		if (!(rt->rt_flags & RTCF_LOCAL) &&
-		    (*pskb)->dev->features & NETIF_F_ALL_CSUM) {
+		    (*pskb)->dev->features & NETIF_F_V4_CSUM) {
 			(*pskb)->ip_summed = CHECKSUM_PARTIAL;
 			(*pskb)->csum_start = skb_headroom(*pskb) +
 					      skb_network_offset(*pskb) +
@@ -265,7 +265,7 @@ nf_nat_mangle_udp_packet(struct sk_buff 
 
 	if ((*pskb)->ip_summed != CHECKSUM_PARTIAL) {
 		if (!(rt->rt_flags & RTCF_LOCAL) &&
-		    (*pskb)->dev->features & NETIF_F_ALL_CSUM) {
+		    (*pskb)->dev->features & NETIF_F_V4_CSUM) {
 			(*pskb)->ip_summed = CHECKSUM_PARTIAL;
 			(*pskb)->csum_start = skb_headroom(*pskb) +
 					      skb_network_offset(*pskb) +

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: Definition and usage of NETIF_F_HW_SUM?
  2007-05-29 23:45     ` Stephen Hemminger
@ 2007-06-04 15:35       ` Ron Mercer
  0 siblings, 0 replies; 10+ messages in thread
From: Ron Mercer @ 2007-06-04 15:35 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

I was out of town last week and did not have a chance to respond.  Yes,
qla3xxx is (before Stephen's fix) broken on IPV6.  I will review the
changes and post a patch if necessary. 

> -----Original Message-----
> From: netdev-owner@vger.kernel.org 
> [mailto:netdev-owner@vger.kernel.org] On Behalf Of Stephen Hemminger
> Sent: Tuesday, May 29, 2007 4:46 PM
> To: Michael Chan
> Cc: Herbert Xu; netdev
> Subject: Re: Definition and usage of NETIF_F_HW_SUM?
> 
> On Tue, 29 May 2007 17:10:52 -0700
> "Michael Chan" <mchan@broadcom.com> wrote:
> 
> > On Wed, 2007-05-30 at 07:36 +1000, Herbert Xu wrote:
> > 
> > > 
> > > I just checked e1000 and it's correct as it does use the 
> csum_offset
> > > when doing TX offload.  However, you're definitely right that bnx2
> > > seems to be broken.
> > > 
> > > > A few devices take a offset, starting point, and 
> insertion point. This looks like
> > > > the correct model. But no upper layer protocols other 
> than IPV4/IPV6 can do checksum
> > > > offload at present, so it seems moot.
> > > 
> > > I could easily whip up a patch to get GRE to use it for a start :)
> > > 
> > > > IMHO the correct solution would be to get rid if 
> NETIF_F_HW_SUM and make a new flag
> > > > NETIF_F_IPV6_SUM. Devices that can checksum both could 
> do NETIF_F_IPV4_SUM|NETI_F_IPV6_SUM.
> > > 
> > > We should definitely keep NETIF_F_HW_SUM for sane 
> hardware such as the
> > > e1000.  Unfortunately we may just have to invent IPV6_SUM 
> for the broken
> > > ones.
> > > 
> > > Ccing Michael to see if the bnx2 chip can actually do offset-based
> > > checksum offload.
> > > 
> > 
> > bnx2 and tg3 cannot do offset-based checksumming because 
> the hardware
> > doesn't have room in the buffer descriptors to specify the 
> offsets.  So
> > regrettably, the NETIF_F_HW_SUM flag has been misused in 
> these drivers.
> > A new NETIF_F_IPV6_SUM flag will be very useful for us.
> 
> Look furthur many drivers are just plain broken and use F_HW_SUM
> and can't even do IPV6 properly.  I'll fix
> 
> The worst code award goes to: qla3xxx.c
> which is broken on IPV6 and goes to trouble of computing all the
> offsets and they are already there in skb...
> 
> 
> 
> -- 
> Stephen Hemminger <shemminger@linux-foundation.org>
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC] IPV6 checksum offloading in network devices
  2007-05-30 21:00       ` Stephen Hemminger
@ 2007-06-27  7:44         ` David Miller
  0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2007-06-27  7:44 UTC (permalink / raw)
  To: shemminger; +Cc: kaber, herbert.xu, netdev, mchan

From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Wed, 30 May 2007 14:00:34 -0700

> This is better. 
> There is still a possiblity when a device allows IPV6 and not IPV4
> checksumming, that the checksum will be done in the fixup code in
> dev_queue_xmit.
> 
> The existing model for checksum offload does not correctly handle
> devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
> implies device can do any arbitrary protocol.
> 
> This patch:
>  * adds NETIF_F_IPV6_CSUM for those devices
>  * fixes bnx2 and tg3 devices that need it
>  * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
>  * fixes assumptions about NETIF_F_ALL_CSUM in nat
>  * adjusts bridge union of checksumming computation

I've applied this to net-2.6.23, we can back it out or rework
it if there are problems.

Several of these checksum feature tests are getting out of
hand :-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-06-27  7:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-29 20:58 Definition and usage of NETIF_F_HW_SUM? Stephen Hemminger
2007-05-29 21:36 ` Herbert Xu
2007-05-29 21:58   ` Stephen Hemminger
2007-05-30  0:10   ` Michael Chan
2007-05-29 23:45     ` Stephen Hemminger
2007-06-04 15:35       ` Ron Mercer
2007-05-30 15:53   ` [RFC] IPV6 checksum offloading in network devices Stephen Hemminger
2007-05-30 16:13     ` Patrick McHardy
2007-05-30 21:00       ` Stephen Hemminger
2007-06-27  7:44         ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).