* [PATCH net-next v5 1/2] IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
@ 2022-12-07 22:54 Coco Li
2022-12-07 22:54 ` [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path Coco Li
2022-12-08 2:21 ` [PATCH net-next v5 1/2] IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver Eric Dumazet
0 siblings, 2 replies; 6+ messages in thread
From: Coco Li @ 2022-12-07 22:54 UTC (permalink / raw)
To: David S. Miller, Hideaki YOSHIFUJI, David Ahern, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Michael Chan
Cc: netdev, linux-kernel, Coco Li
IPv6/TCP and GRO stacks can build big TCP packets with an added
temporary Hop By Hop header.
If GSO is not involved, then the temporary header needs to be removed in
the driver. This patch provides a generic helper for drivers that need
to modify their headers in place.
Tested:
Compiled and ran with ethtool -K eth1 tso off
Could send Big TCP packets
Signed-off-by: Coco Li <lixiaoyan@google.com>
---
include/net/ipv6.h | 36 ++++++++++++++++++++++++++++++++++++
net/ipv6/ip6_offload.c | 27 ++++-----------------------
2 files changed, 40 insertions(+), 23 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index d383c895592a..6dcf93a1ec14 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -500,6 +500,42 @@ static inline int ipv6_has_hopopt_jumbo(const struct sk_buff *skb)
 	return jhdr->nexthdr;
 }
 
+/* Return 0 if HBH header is successfully removed
+ * Or if HBH removal is unnecessary (packet is not big TCP)
+ * Return error to indicate dropping the packet
+ */
+static inline int ipv6_hopopt_jumbo_remove(struct sk_buff *skb)
+{
+	const int hophdr_len = sizeof(struct hop_jumbo_hdr);
+	int nexthdr = ipv6_has_hopopt_jumbo(skb);
+	struct ipv6hdr *h6;
+
+	if (!nexthdr)
+		return 0;
+
+	if (skb_cow_head(skb, 0))
+		return -1;
+
+	/* Remove the HBH header.
+	 * Layout: [Ethernet header][IPv6 header][HBH][L4 Header]
+	 */
+	memmove(skb_mac_header(skb) + hophdr_len, skb_mac_header(skb),
+		skb_network_header(skb) - skb_mac_header(skb) +
+		sizeof(struct ipv6hdr));
+
+	if (unlikely(!pskb_may_pull(skb, hophdr_len)))
+		return -1;
+
+	__skb_pull(skb, hophdr_len);
+	skb->network_header += hophdr_len;
+	skb->mac_header += hophdr_len;
+
+	h6 = ipv6_hdr(skb);
+	h6->nexthdr = nexthdr;
+
+	return 0;
+}
+
 static inline bool ipv6_accept_ra(struct inet6_dev *idev)
 {
 	/* If forwarding is enabled, RA are not accepted unless the special
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 3ee345672849..00dc2e3b0184 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -77,7 +77,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
 	struct ipv6hdr *ipv6h;
 	const struct net_offload *ops;
-	int proto, nexthdr;
+	int proto, err;
 	struct frag_hdr *fptr;
 	unsigned int payload_len;
 	u8 *prevhdr;
@@ -87,28 +87,9 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 	bool gso_partial;
 
 	skb_reset_network_header(skb);
-	nexthdr = ipv6_has_hopopt_jumbo(skb);
-	if (nexthdr) {
-		const int hophdr_len = sizeof(struct hop_jumbo_hdr);
-		int err;
-
-		err = skb_cow_head(skb, 0);
-		if (err < 0)
-			return ERR_PTR(err);
-
-		/* remove the HBH header.
-		 * Layout: [Ethernet header][IPv6 header][HBH][TCP header]
-		 */
-		memmove(skb_mac_header(skb) + hophdr_len,
-			skb_mac_header(skb),
-			ETH_HLEN + sizeof(struct ipv6hdr));
-		skb->data += hophdr_len;
-		skb->len -= hophdr_len;
-		skb->network_header += hophdr_len;
-		skb->mac_header += hophdr_len;
-		ipv6h = (struct ipv6hdr *)skb->data;
-		ipv6h->nexthdr = nexthdr;
-	}
+	err = ipv6_hopopt_jumbo_remove(skb);
+	if (err)
+		return ERR_PTR(err);
 	nhoff = skb_network_header(skb) - skb_mac_header(skb);
 	if (unlikely(!pskb_may_pull(skb, sizeof(*ipv6h))))
 		goto out;
--
2.39.0.rc0.267.gcb52ba06e7-goog
* [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path
2022-12-07 22:54 [PATCH net-next v5 1/2] IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver Coco Li
@ 2022-12-07 22:54 ` Coco Li
2022-12-08 0:32 ` Saeed Mahameed
2022-12-08 19:54 ` Michael Chan
2022-12-08 2:21 ` [PATCH net-next v5 1/2] IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver Eric Dumazet
1 sibling, 2 replies; 6+ messages in thread
From: Coco Li @ 2022-12-07 22:54 UTC (permalink / raw)
To: David S. Miller, Hideaki YOSHIFUJI, David Ahern, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Michael Chan
Cc: netdev, linux-kernel, Coco Li
Eric Dumazet implemented Big TCP that allowed bigger TSO/GRO packet sizes
for IPv6 traffic. See patch series:
'commit 89527be8d8d6 ("net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes")'
This reduces the number of packets traversing the networking stack and
should usually improve performance. However, it also inserts a
temporary Hop-by-hop IPv6 extension header.
Using the HBH header removal method in the previous patch, the extra header
can be removed in the bnxt driver to allow it to send big TCP packets (bigger
TSO packets) as well.
Tested:
Compiled locally
To further test functional correctness, update the GSO/GRO limit on the
physical NIC:
ip link set eth0 gso_max_size 181000
ip link set eth0 gro_max_size 181000
Note that if there are bonding or ipvlan devices on top of the physical
NIC, their GSO sizes need to be updated as well.
Then, IPv6/TCP packets with sizes larger than 64k can be observed.
Big TCP functionality has been tested by Michael; the feature checks have not been tested yet.
Tested by Michael:
I've confirmed with our hardware team that this is supported by our
chips, and I've tested it up to gso_max_size of 524280. Thanks.
Tested-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 26 ++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 0fe164b42c5d..6ba1cd342a80 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -389,6 +389,9 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_BUSY;
 	}
 
+	if (unlikely(ipv6_hopopt_jumbo_remove(skb)))
+		goto tx_free;
+
 	length = skb->len;
 	len = skb_headlen(skb);
 	last_frag = skb_shinfo(skb)->nr_frags;
@@ -11315,6 +11318,7 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
 			      u8 **nextp)
 {
 	struct ipv6hdr *ip6h = (struct ipv6hdr *)(skb->data + nw_off);
+	struct hop_jumbo_hdr *jhdr;
 	int hdr_count = 0;
 	u8 *nexthdr;
 	int start;
@@ -11342,9 +11346,27 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
 
 		if (hdrlen > 64)
 			return false;
+
+		/* The ext header may be a hop-by-hop header inserted for
+		 * big TCP purposes. This will be removed before sending
+		 * from NIC, so do not count it.
+		 */
+		if (*nexthdr == NEXTHDR_HOP) {
+			if (likely(skb->len <= GRO_LEGACY_MAX_SIZE))
+				goto increment_hdr;
+
+			jhdr = (struct hop_jumbo_hdr *)nexthdr;
+			if (jhdr->tlv_type != IPV6_TLV_JUMBO || jhdr->hdrlen != 0 ||
+			    jhdr->nexthdr != IPPROTO_TCP)
+				goto increment_hdr;
+
+			goto next_hdr;
+		}
+increment_hdr:
+		hdr_count++;
+next_hdr:
 		nexthdr = &hp->nexthdr;
 		start += hdrlen;
-		hdr_count++;
 	}
 	if (nextp) {
 		/* Caller will check inner protocol */
@@ -13657,6 +13679,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		dev->features &= ~NETIF_F_LRO;
 	dev->priv_flags |= IFF_UNICAST_FLT;
 
+	netif_set_tso_max_size(dev, GSO_MAX_SIZE);
+
 #ifdef CONFIG_BNXT_SRIOV
 	init_waitqueue_head(&bp->sriov_cfg_wait);
 #endif
--
2.39.0.rc0.267.gcb52ba06e7-goog
* Re: [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path
2022-12-07 22:54 ` [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path Coco Li
@ 2022-12-08 0:32 ` Saeed Mahameed
2022-12-10 3:53 ` Coco Li
2022-12-08 19:54 ` Michael Chan
1 sibling, 1 reply; 6+ messages in thread
From: Saeed Mahameed @ 2022-12-08 0:32 UTC (permalink / raw)
To: Coco Li
Cc: David S. Miller, Hideaki YOSHIFUJI, David Ahern, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Michael Chan, netdev, linux-kernel
On 07 Dec 14:54, Coco Li wrote:
>Eric Dumazet implemented Big TCP that allowed bigger TSO/GRO packet sizes
>for IPv6 traffic. See patch series:
>'commit 89527be8d8d6 ("net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes")'
>
>This reduces the number of packets traversing the networking stack and
>should usually improves performance. However, it also inserts a
>temporary Hop-by-hop IPv6 extension header.
>
>Using the HBH header removal method in the previous path, the extra header
^ patch
>be removed in bnxt drivers to allow it to send big TCP packets (bigger
>TSO packets) as well.
>
I think Eric didn't expose this function because it isn't efficient for
drivers that already process the headers separately from the payload for
LSO packets. The trick is to have an optimized copy method that suits
your driver's xmit function; usually you would just memcpy the TCP header
over the HBH at the point where you copy/process those headers into the HW
descriptor.
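The header-copy approach described above could look roughly like the
following userspace sketch. It is a simplified model, not driver code: the
name copy_headers_skip_hbh, the fixed offsets (untagged Ethernet, no other
extension headers) and the flat byte buffers are assumptions made for
illustration, and the caller is assumed to have already confirmed that the
temporary jumbo HBH header is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header sizes for an untagged Ethernet + IPv6 frame (assumed layout). */
enum {
	ETH_LEN = 14,		/* Ethernet header */
	IP6_LEN = 40,		/* fixed IPv6 header */
	HBH_LEN = 8,		/* size of the temporary jumbo HBH header */
	IP6_NEXTHDR_OFF = 6,	/* offset of nexthdr inside the IPv6 header */
};

/*
 * Copy [Ethernet][IPv6][L4 header] from src into dst, skipping the HBH
 * header that sits between IPv6 and L4 in src. The source frame is left
 * untouched. Returns the number of bytes written to dst.
 */
size_t copy_headers_skip_hbh(uint8_t *dst, const uint8_t *src,
			     size_t l4_hdr_len)
{
	const size_t l3_end = ETH_LEN + IP6_LEN;

	assert(dst != NULL && src != NULL);

	/* Ethernet and IPv6 headers are copied as they are. */
	memcpy(dst, src, l3_end);

	/* The first byte of the HBH header is its nexthdr field; it
	 * replaces NEXTHDR_HOP in the copied IPv6 header.
	 */
	dst[ETH_LEN + IP6_NEXTHDR_OFF] = src[l3_end];

	/* The L4 header lands directly after the IPv6 header. */
	memcpy(dst + l3_end, src + l3_end + HBH_LEN, l4_hdr_len);

	return l3_end + l4_hdr_len;
}
```

Compared with an in-place removal, this variant does no memmove() on the
skb and needs no writable headroom, at the price of being tied to how a
given driver stages its headers.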
>Tested:
>Compiled locally
>
>To further test functional correctness, update the GSO/GRO limit on the
>physical NIC:
>
>ip link set eth0 gso_max_size 181000
>ip link set eth0 gro_max_size 181000
>
>Note that if there are bonding or ipvan devices on top of the physical
>NIC, their GSO sizes need to be updated as well.
>
>Then, IPv6/TCP packets with sizes larger than 64k can be observed.
>
>Big TCP functionality is tested by Michael, feature checks not yet.
>
>Tested by Michael:
>I've confirmed with our hardware team that this is supported by our
>chips, and I've tested it up to gso_max_size of 524280. Thanks.
>
>Tested-by: Michael Chan <michael.chan@broadcom.com>
>Reviewed-by: Michael Chan <michael.chan@broadcom.com>
>Signed-off-by: Coco Li <lixiaoyan@google.com>
>---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 26 ++++++++++++++++++++++-
> 1 file changed, 25 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>index 0fe164b42c5d..6ba1cd342a80 100644
>--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>@@ -389,6 +389,9 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
> return NETDEV_TX_BUSY;
> }
>
>+ if (unlikely(ipv6_hopopt_jumbo_remove(skb)))
>+ goto tx_free;
>+
> length = skb->len;
> len = skb_headlen(skb);
> last_frag = skb_shinfo(skb)->nr_frags;
>@@ -11315,6 +11318,7 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
> u8 **nextp)
> {
> struct ipv6hdr *ip6h = (struct ipv6hdr *)(skb->data + nw_off);
>+ struct hop_jumbo_hdr *jhdr;
> int hdr_count = 0;
> u8 *nexthdr;
> int start;
>@@ -11342,9 +11346,27 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
>
> if (hdrlen > 64)
> return false;
>+
>+ /* The ext header may be a hop-by-hop header inserted for
>+ * big TCP purposes. This will be removed before sending
>+ * from NIC, so do not count it.
>+ */
>+ if (*nexthdr == NEXTHDR_HOP) {
>+ if (likely(skb->len <= GRO_LEGACY_MAX_SIZE))
>+ goto increment_hdr;
>+
>+ jhdr = (struct hop_jumbo_hdr *)nexthdr;
>+ if (jhdr->tlv_type != IPV6_TLV_JUMBO || jhdr->hdrlen != 0 ||
>+ jhdr->nexthdr != IPPROTO_TCP)
>+ goto increment_hdr;
>+
>+ goto next_hdr;
>+ }
>+increment_hdr:
>+ hdr_count++;
>+next_hdr:
> nexthdr = &hp->nexthdr;
> start += hdrlen;
>- hdr_count++;
> }
> if (nextp) {
> /* Caller will check inner protocol */
>@@ -13657,6 +13679,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> dev->features &= ~NETIF_F_LRO;
> dev->priv_flags |= IFF_UNICAST_FLT;
>
>+ netif_set_tso_max_size(dev, GSO_MAX_SIZE);
>+
> #ifdef CONFIG_BNXT_SRIOV
> init_waitqueue_head(&bp->sriov_cfg_wait);
> #endif
>--
>2.39.0.rc0.267.gcb52ba06e7-goog
>
* Re: [PATCH net-next v5 1/2] IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
2022-12-07 22:54 [PATCH net-next v5 1/2] IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver Coco Li
2022-12-07 22:54 ` [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path Coco Li
@ 2022-12-08 2:21 ` Eric Dumazet
1 sibling, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2022-12-08 2:21 UTC (permalink / raw)
To: Coco Li
Cc: David S. Miller, Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski,
Paolo Abeni, Michael Chan, netdev, linux-kernel
On Wed, Dec 7, 2022 at 11:54 PM Coco Li <lixiaoyan@google.com> wrote:
>
> IPv6/TCP and GRO stacks can build big TCP packets with an added
> temporary Hop By Hop header.
>
> Is GSO is not involved, then the temporary header needs to be removed in
> the driver. This patch provides a generic helper for drivers that need
> to modify their headers in place.
>
> Tested:
> Compiled and ran with ethtool -K eth1 tso off
> Could send Big TCP packets
>
> Signed-off-by: Coco Li <lixiaoyan@google.com>
> ---
> include/net/ipv6.h | 36 ++++++++++++++++++++++++++++++++++++
> net/ipv6/ip6_offload.c | 27 ++++-----------------------
> 2 files changed, 40 insertions(+), 23 deletions(-)
>
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index d383c895592a..6dcf93a1ec14 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -500,6 +500,42 @@ static inline int ipv6_has_hopopt_jumbo(const struct sk_buff *skb)
> return jhdr->nexthdr;
> }
>
> +/* Return 0 if HBH header is successfully removed
> + * Or if HBH removal is unnecessary (packet is not big TCP)
> + * Return error to indicate dropping the packet
> + */
> +static inline int ipv6_hopopt_jumbo_remove(struct sk_buff *skb)
> +{
> + const int hophdr_len = sizeof(struct hop_jumbo_hdr);
> + int nexthdr = ipv6_has_hopopt_jumbo(skb);
> + struct ipv6hdr *h6;
> +
> + if (!nexthdr)
> + return 0;
> +
> + if (skb_cow_head(skb, 0))
> + return -1;
> +
> + /* Remove the HBH header.
> + * Layout: [Ethernet header][IPv6 header][HBH][L4 Header]
> + */
> + memmove(skb_mac_header(skb) + hophdr_len, skb_mac_header(skb),
> + skb_network_header(skb) - skb_mac_header(skb) +
> + sizeof(struct ipv6hdr));
> +
> + if (unlikely(!pskb_may_pull(skb, hophdr_len)))
> + return -1;
ipv6_has_hopopt_jumbo() had a stronger condition already.
if (skb_network_offset(skb) +
sizeof(struct ipv6hdr) +
sizeof(struct hop_jumbo_hdr) > skb_headlen(skb))
return 0;
So this !pskb_may_pull(skb, hophdr_len) , especially if done after the
memmove(), is not needed.
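The argument can be checked with a toy model. This is only a sketch under
stated assumptions: struct toy_skb and both function names are hypothetical,
the network offset is assumed to be non-negative, and linear_pull_ok() models
only the fast path of pskb_may_pull(), which succeeds when the requested
length fits in the linear area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
	IP6_LEN = 40,	/* sizeof(struct ipv6hdr) */
	HBH_LEN = 8,	/* sizeof(struct hop_jumbo_hdr) */
};

/* Hypothetical stand-in for the two skb properties that matter here. */
struct toy_skb {
	size_t headlen;		/* bytes in the linear area */
	size_t network_offset;	/* offset of the IPv6 header from data */
};

/* Mirrors the length condition quoted from ipv6_has_hopopt_jumbo(). */
bool headlen_covers_jumbo_hbh(const struct toy_skb *skb)
{
	assert(skb != NULL);
	return skb->network_offset + IP6_LEN + HBH_LEN <= skb->headlen;
}

/* Fast path of a pull check: the bytes are already linear. */
bool linear_pull_ok(const struct toy_skb *skb, size_t len)
{
	assert(skb != NULL);
	return len <= skb->headlen;
}
```

Whenever headlen_covers_jumbo_hbh() holds, headlen is at least 48 bytes, so
linear_pull_ok(skb, HBH_LEN) holds as well and the second check cannot fail.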
> +
> + __skb_pull(skb, hophdr_len);
> + skb->network_header += hophdr_len;
> + skb->mac_header += hophdr_len;
> +
> + h6 = ipv6_hdr(skb);
> + h6->nexthdr = nexthdr;
> +
> + return 0;
> +}
> +
> static inline bool ipv6_accept_ra(struct inet6_dev *idev)
> {
> /* If forwarding is enabled, RA are not accepted unless the special
> diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
> index 3ee345672849..00dc2e3b0184 100644
> --- a/net/ipv6/ip6_offload.c
> +++ b/net/ipv6/ip6_offload.c
> @@ -77,7 +77,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
> struct sk_buff *segs = ERR_PTR(-EINVAL);
> struct ipv6hdr *ipv6h;
> const struct net_offload *ops;
> - int proto, nexthdr;
> + int proto, err;
> struct frag_hdr *fptr;
> unsigned int payload_len;
> u8 *prevhdr;
> @@ -87,28 +87,9 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
> bool gso_partial;
>
> skb_reset_network_header(skb);
> - nexthdr = ipv6_has_hopopt_jumbo(skb);
> - if (nexthdr) {
> - const int hophdr_len = sizeof(struct hop_jumbo_hdr);
> - int err;
> -
> - err = skb_cow_head(skb, 0);
> - if (err < 0)
> - return ERR_PTR(err);
> -
> - /* remove the HBH header.
> - * Layout: [Ethernet header][IPv6 header][HBH][TCP header]
> - */
> - memmove(skb_mac_header(skb) + hophdr_len,
> - skb_mac_header(skb),
> - ETH_HLEN + sizeof(struct ipv6hdr));
> - skb->data += hophdr_len;
> - skb->len -= hophdr_len;
> - skb->network_header += hophdr_len;
> - skb->mac_header += hophdr_len;
> - ipv6h = (struct ipv6hdr *)skb->data;
> - ipv6h->nexthdr = nexthdr;
> - }
> + err = ipv6_hopopt_jumbo_remove(skb);
> + if (err)
> + return ERR_PTR(err);
> nhoff = skb_network_header(skb) - skb_mac_header(skb);
> if (unlikely(!pskb_may_pull(skb, sizeof(*ipv6h))))
> goto out;
> --
> 2.39.0.rc0.267.gcb52ba06e7-goog
>
* Re: [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path
2022-12-07 22:54 ` [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path Coco Li
2022-12-08 0:32 ` Saeed Mahameed
@ 2022-12-08 19:54 ` Michael Chan
1 sibling, 0 replies; 6+ messages in thread
From: Michael Chan @ 2022-12-08 19:54 UTC (permalink / raw)
To: Coco Li
Cc: David S. Miller, Hideaki YOSHIFUJI, David Ahern, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
On Wed, Dec 7, 2022 at 2:54 PM Coco Li <lixiaoyan@google.com> wrote:
>
> Eric Dumazet implemented Big TCP that allowed bigger TSO/GRO packet sizes
> for IPv6 traffic. See patch series:
> 'commit 89527be8d8d6 ("net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes")'
>
> This reduces the number of packets traversing the networking stack and
> should usually improves performance. However, it also inserts a
> temporary Hop-by-hop IPv6 extension header.
>
> Using the HBH header removal method in the previous path, the extra header
> be removed in bnxt drivers to allow it to send big TCP packets (bigger
> TSO packets) as well.
>
> Tested:
> Compiled locally
>
> To further test functional correctness, update the GSO/GRO limit on the
> physical NIC:
>
> ip link set eth0 gso_max_size 181000
> ip link set eth0 gro_max_size 181000
>
> Note that if there are bonding or ipvan devices on top of the physical
> NIC, their GSO sizes need to be updated as well.
>
> Then, IPv6/TCP packets with sizes larger than 64k can be observed.
>
> Big TCP functionality is tested by Michael, feature checks not yet.
>
> Tested by Michael:
> I've confirmed with our hardware team that this is supported by our
> chips, and I've tested it up to gso_max_size of 524280. Thanks.
>
> Tested-by: Michael Chan <michael.chan@broadcom.com>
> Reviewed-by: Michael Chan <michael.chan@broadcom.com>
If you have made changes since the last version, please drop these
tags. Reviewers will provide new tags after reviewing the new
version.
> Signed-off-by: Coco Li <lixiaoyan@google.com>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 26 ++++++++++++++++++++++-
> 1 file changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 0fe164b42c5d..6ba1cd342a80 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -389,6 +389,9 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
> return NETDEV_TX_BUSY;
> }
>
> + if (unlikely(ipv6_hopopt_jumbo_remove(skb)))
> + goto tx_free;
> +
> length = skb->len;
> len = skb_headlen(skb);
> last_frag = skb_shinfo(skb)->nr_frags;
> @@ -11315,6 +11318,7 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
> u8 **nextp)
> {
> struct ipv6hdr *ip6h = (struct ipv6hdr *)(skb->data + nw_off);
> + struct hop_jumbo_hdr *jhdr;
> int hdr_count = 0;
> u8 *nexthdr;
> int start;
> @@ -11342,9 +11346,27 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
>
> if (hdrlen > 64)
> return false;
> +
> + /* The ext header may be a hop-by-hop header inserted for
> + * big TCP purposes. This will be removed before sending
> + * from NIC, so do not count it.
> + */
> + if (*nexthdr == NEXTHDR_HOP) {
> + if (likely(skb->len <= GRO_LEGACY_MAX_SIZE))
> + goto increment_hdr;
> +
> + jhdr = (struct hop_jumbo_hdr *)nexthdr;
I already explained when reviewing your last version that nexthdr
initially points to the next header field within the ipv6 header so
this won't work. If you cast it to jhdr, jhdr will be at offset 6 of
the ipv6 header. It won't be pointing to the extension header. You
need to do:
jhdr = (struct hop_jumbo_hdr *)hp
hp is pointing to the extension header.
Thanks.
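To make the offsets concrete, here is a small userspace sketch. The structs
toy_ipv6hdr and toy_hop_jumbo_hdr are hypothetical mirrors of the kernel
layouts, and the helper names are made up for illustration; the packet is
modelled as a flat byte array that starts at the IPv6 header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the fixed IPv6 header layout (40 bytes). */
struct toy_ipv6hdr {
	uint8_t  ver_tc_flow[4];
	uint16_t payload_len;
	uint8_t  nexthdr;
	uint8_t  hop_limit;
	uint8_t  saddr[16];
	uint8_t  daddr[16];
};

/* Hypothetical mirror of the temporary jumbo HBH header (8 bytes). */
struct toy_hop_jumbo_hdr {
	uint8_t  nexthdr;
	uint8_t  hdrlen;
	uint8_t  tlv_type;
	uint8_t  tlv_len;
	uint32_t jumbo_payload_len;
};

/* Where the header would start if &ip6h->nexthdr were cast to it. */
size_t base_from_nexthdr_field(void)
{
	return offsetof(struct toy_ipv6hdr, nexthdr);
}

/* Where the extension header really starts (what hp points to). */
size_t base_from_hp(void)
{
	return sizeof(struct toy_ipv6hdr);
}

/* Read the byte that jhdr->tlv_type would access for a given base. */
uint8_t tlv_type_at(const uint8_t *pkt, size_t base)
{
	assert(pkt != NULL);
	return pkt[base + offsetof(struct toy_hop_jumbo_hdr, tlv_type)];
}
```

With the base taken from the nexthdr field, the tlv_type read lands on byte 8
of the packet, which is the first byte of the source address, so the jumbo
check compares against address bytes instead of the option type.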
> + if (jhdr->tlv_type != IPV6_TLV_JUMBO || jhdr->hdrlen != 0 ||
> + jhdr->nexthdr != IPPROTO_TCP)
> + goto increment_hdr;
> +
> + goto next_hdr;
> + }
> +increment_hdr:
> + hdr_count++;
> +next_hdr:
> nexthdr = &hp->nexthdr;
> start += hdrlen;
> - hdr_count++;
> }
> if (nextp) {
> /* Caller will check inner protocol */
* Re: [RFC net-next v5 2/2] bnxt: Use generic HBH removal helper in tx path
2022-12-08 0:32 ` Saeed Mahameed
@ 2022-12-10 3:53 ` Coco Li
0 siblings, 0 replies; 6+ messages in thread
From: Coco Li @ 2022-12-10 3:53 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Hideaki YOSHIFUJI, David Ahern, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Michael Chan, netdev, linux-kernel
I agree that this function isn't efficient for drivers that already
copy headers; those can just copy over the needed parts of the header,
as you mentioned. However, for drivers that need the HBH header removed
in place, it would be a nice function to have (and it reduces code
duplication: see how the function is reused in the GSO path).
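For reference, the in-place variant boils down to sliding the Ethernet and
IPv6 headers forward over the HBH header. The following is a userspace
sketch of that idea on a flat byte buffer, not the kernel helper itself:
strip_jumbo_hbh and the constants are hypothetical names, an untagged
Ethernet header is assumed, and the skb-specific steps (skb_cow_head(),
header offset updates) as well as the packet size and TCP checks done by
ipv6_has_hopopt_jumbo() are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	ETH_LEN = 14,		/* Ethernet header, no VLAN tag (assumed) */
	IP6_LEN = 40,		/* fixed IPv6 header */
	HBH_LEN = 8,		/* temporary jumbo HBH header */
	IP6_NEXTHDR_OFF = 6,	/* offset of nexthdr inside the IPv6 header */
	NEXTHDR_HOP_VAL = 0,	/* IPv6 next-header value for hop-by-hop */
	TLV_JUMBO_VAL = 0xC2,	/* jumbo payload option type (194) */
};

/*
 * Layout on entry: [Ethernet][IPv6][HBH][L4 ...]
 * Layout on exit:  [8 stale bytes][Ethernet][IPv6][L4 ...]
 *
 * Returns how many bytes the start of the frame moved forward:
 * HBH_LEN if the header was removed, 0 if nothing had to be done.
 */
size_t strip_jumbo_hbh(uint8_t *frame, size_t len)
{
	uint8_t *ip6, *hbh;
	uint8_t inner;

	assert(frame != NULL);

	if (len < (size_t)(ETH_LEN + IP6_LEN + HBH_LEN))
		return 0;

	ip6 = frame + ETH_LEN;
	hbh = ip6 + IP6_LEN;

	if (ip6[IP6_NEXTHDR_OFF] != NEXTHDR_HOP_VAL)
		return 0;

	/* hbh[1] is hdrlen, hbh[2] is the option type. */
	if (hbh[1] != 0 || hbh[2] != TLV_JUMBO_VAL)
		return 0;

	inner = hbh[0];		/* protocol that follows the HBH header */

	/* Regions overlap, so memmove() rather than memcpy(). */
	memmove(frame + HBH_LEN, frame, ETH_LEN + IP6_LEN);

	/* Patch nexthdr in the relocated IPv6 header. */
	frame[HBH_LEN + ETH_LEN + IP6_NEXTHDR_OFF] = inner;

	return HBH_LEN;
}
```

A caller would treat frame + strip_jumbo_hbh(frame, len) as the new start
of the frame, which is the userspace analogue of pulling hophdr_len bytes
and advancing mac_header and network_header.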
On Wed, Dec 7, 2022 at 4:33 PM Saeed Mahameed <saeed@kernel.org> wrote:
>
>
> On 07 Dec 14:54, Coco Li wrote:
> >Eric Dumazet implemented Big TCP that allowed bigger TSO/GRO packet sizes
> >for IPv6 traffic. See patch series:
> >'commit 89527be8d8d6 ("net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes")'
> >
> >This reduces the number of packets traversing the networking stack and
> >should usually improves performance. However, it also inserts a
> >temporary Hop-by-hop IPv6 extension header.
> >
> >Using the HBH header removal method in the previous path, the extra header
> ^ patch
> >be removed in bnxt drivers to allow it to send big TCP packets (bigger
> >TSO packets) as well.
> >
>
> I think Eric didn't expose this function because it isn't efficient for
> drivers who are already processing the headers separately from payload for
> LSO packets .. the trick is to have an optimized copy method depending on
> your driver xmit function, usually you would just memcpy the TCP header over
> the HBH exactly at the point you copy/process those headers into the HW
> descriptor.
>
> >Tested:
> >Compiled locally
> >
> >To further test functional correctness, update the GSO/GRO limit on the
> >physical NIC:
> >
> >ip link set eth0 gso_max_size 181000
> >ip link set eth0 gro_max_size 181000
> >
> >Note that if there are bonding or ipvan devices on top of the physical
> >NIC, their GSO sizes need to be updated as well.
> >
> >Then, IPv6/TCP packets with sizes larger than 64k can be observed.
> >
> >Big TCP functionality is tested by Michael, feature checks not yet.
> >
> >Tested by Michael:
> >I've confirmed with our hardware team that this is supported by our
> >chips, and I've tested it up to gso_max_size of 524280. Thanks.
> >
> >Tested-by: Michael Chan <michael.chan@broadcom.com>
> >Reviewed-by: Michael Chan <michael.chan@broadcom.com>
> >Signed-off-by: Coco Li <lixiaoyan@google.com>
> >---
> > drivers/net/ethernet/broadcom/bnxt/bnxt.c | 26 ++++++++++++++++++++++-
> > 1 file changed, 25 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> >index 0fe164b42c5d..6ba1cd342a80 100644
> >--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> >+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> >@@ -389,6 +389,9 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
> > return NETDEV_TX_BUSY;
> > }
> >
> >+ if (unlikely(ipv6_hopopt_jumbo_remove(skb)))
> >+ goto tx_free;
> >+
> > length = skb->len;
> > len = skb_headlen(skb);
> > last_frag = skb_shinfo(skb)->nr_frags;
> >@@ -11315,6 +11318,7 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
> > u8 **nextp)
> > {
> > struct ipv6hdr *ip6h = (struct ipv6hdr *)(skb->data + nw_off);
> >+ struct hop_jumbo_hdr *jhdr;
> > int hdr_count = 0;
> > u8 *nexthdr;
> > int start;
> >@@ -11342,9 +11346,27 @@ static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
> >
> > if (hdrlen > 64)
> > return false;
> >+
> >+ /* The ext header may be a hop-by-hop header inserted for
> >+ * big TCP purposes. This will be removed before sending
> >+ * from NIC, so do not count it.
> >+ */
> >+ if (*nexthdr == NEXTHDR_HOP) {
> >+ if (likely(skb->len <= GRO_LEGACY_MAX_SIZE))
> >+ goto increment_hdr;
> >+
> >+ jhdr = (struct hop_jumbo_hdr *)nexthdr;
> >+ if (jhdr->tlv_type != IPV6_TLV_JUMBO || jhdr->hdrlen != 0 ||
> >+ jhdr->nexthdr != IPPROTO_TCP)
> >+ goto increment_hdr;
> >+
> >+ goto next_hdr;
> >+ }
> >+increment_hdr:
> >+ hdr_count++;
> >+next_hdr:
> > nexthdr = &hp->nexthdr;
> > start += hdrlen;
> >- hdr_count++;
> > }
> > if (nextp) {
> > /* Caller will check inner protocol */
> >@@ -13657,6 +13679,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> > dev->features &= ~NETIF_F_LRO;
> > dev->priv_flags |= IFF_UNICAST_FLT;
> >
> >+ netif_set_tso_max_size(dev, GSO_MAX_SIZE);
> >+
> > #ifdef CONFIG_BNXT_SRIOV
> > init_waitqueue_head(&bp->sriov_cfg_wait);
> > #endif
> >--
> >2.39.0.rc0.267.gcb52ba06e7-goog
> >