* [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller.
@ 2023-12-22 17:36 Linus Walleij
2023-12-22 17:36 ` [PATCH net v4 1/3] net: ethernet: cortina: Drop software checksum and TSO Linus Walleij
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Linus Walleij @ 2023-12-22 17:36 UTC (permalink / raw)
To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Vladimir Oltean, Household Cang
Cc: netdev, Linus Walleij, Maxime Chevallier
These fixes were developed on top of the earlier fixes.
Finding the right solution is hard because the Gemini checksumming
engine is completely undocumented in the datasheets.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
Changes in v4:
- Properly drop all MTU/TSO muckery in the TX function, the
whole approach is bogus.
- Make the raw ethertype retrieval return __be16, it is the
caller's job to deal with endianness (as per the pattern
from if_vlan.h)
- Use __vlan_get_protocol() instead of vlan_get_protocol()
- Only actively bypass the TSS if the frame is over a certain
size.
- Drop comment that no longer applies.
- Link to v3: https://lore.kernel.org/r/20231221-new-gemini-ethernet-regression-v3-0-a96b4374bfe8@linaro.org
Changes in v3:
- Fix a whitespace bug in the first patch.
- Add generic accessors to obtain the raw ethertype of an
ethernet frame. VLAN already has the right accessors.
- Link to v2: https://lore.kernel.org/r/20231216-new-gemini-ethernet-regression-v2-0-64c269413dfa@linaro.org
Changes in v2:
- Drop the TSO and length checks altogether, this was never
working properly.
- Plan to make a proper TSO implementation in the next kernel
cycle.
- Link to v1: https://lore.kernel.org/r/20231215-new-gemini-ethernet-regression-v1-0-93033544be23@linaro.org
---
Linus Walleij (3):
net: ethernet: cortina: Drop software checksum and TSO
if_ether: Add an accessor to read the raw ethertype
net: ethernet: cortina: Bypass checksumming engine of alien ethertypes
drivers/net/ethernet/cortina/gemini.c | 62 +++++++++++++++--------------------
include/linux/if_ether.h | 16 +++++++++
2 files changed, 42 insertions(+), 36 deletions(-)
---
base-commit: 33cc938e65a98f1d29d0a18403dbbee050dcad9a
change-id: 20231203-new-gemini-ethernet-regression-3c672de9cfd9
Best regards,
--
Linus Walleij <linus.walleij@linaro.org>
* [PATCH net v4 1/3] net: ethernet: cortina: Drop software checksum and TSO
2023-12-22 17:36 [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller Linus Walleij
@ 2023-12-22 17:36 ` Linus Walleij
2023-12-22 17:36 ` [PATCH net v4 2/3] if_ether: Add an accessor to read the raw ethertype Linus Walleij
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Linus Walleij @ 2023-12-22 17:36 UTC (permalink / raw)
To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Vladimir Oltean, Household Cang
Cc: netdev, Linus Walleij
The recent change to allow large frames without hardware checksumming
slotted in software checksumming in the driver if hardware could not
do it.
This will however upset TSO (TCP Segmentation Offload). Typical
error dumps include this:
skb len=2961 headroom=222 headlen=66 tailroom=0
(...)
WARNING: CPU: 0 PID: 956 at net/core/dev.c:3259 skb_warn_bad_offload+0x7c/0x108
gemini-ethernet-port: caps=(0x0000010000154813, 0x00002007ffdd7889)
And the packets do not go through.
After investigating, I narrowed it down to the introduction of the
software checksumming in the driver.
Since the segmenting of packets will be done by the hardware this
makes a bit of sense since in that case the hardware also needs to
be keeping track of the checksumming.
That raises the question of why large TCP or UDP packets also have to
bypass the checksumming (like e.g. ICMP does). If the hardware is
splitting it into smaller packets per-MTU setting, and checksumming
them, why is this happening then? I don't know. I know it is needed,
from tests: the OpenWrt webserver uhttpd starts sending big skb:s (up
to 2047 bytes, the max MTU) and above 1514 bytes it starts to fail
and hang unless the bypass bit is set: the frames are not getting
through.
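For reference, the 1514-byte threshold is the standard ethernet frame length
without FCS. A minimal userspace sketch of the arithmetic follows, assuming
the usual kernel values for ETH_HLEN, VLAN_HLEN and ETH_FRAME_LEN. The helper
names max_frame_len() and exceeds_standard_frame() are hypothetical and exist
only for illustration, they are not part of the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same values as the kernel definitions of these constants. */
#define ETH_HLEN        14	/* dst MAC + src MAC + ethertype */
#define VLAN_HLEN        4	/* one 802.1Q tag */
#define ETH_FRAME_LEN 1514	/* ETH_HLEN + 1500 bytes payload, no FCS */

/* Hypothetical helper: largest frame (without FCS) for a given MTU,
 * computed the same way the removed TX code computed its "mtu" value.
 */
static size_t max_frame_len(size_t mtu, bool vlan_tagged)
{
	assert(mtu > 0);
	return ETH_HLEN + mtu + (vlan_tagged ? VLAN_HLEN : 0);
}

/* Hypothetical helper: mirrors the removed "skb->len >= ETH_FRAME_LEN"
 * test that selected software checksumming plus bypass.
 */
static bool exceeds_standard_frame(size_t skb_len)
{
	return skb_len >= ETH_FRAME_LEN;
}
```

With the default MTU of 1500 this gives 1514 bytes untagged and 1518 bytes
with one VLAN tag, so the 2961-byte skb from the warning above is well past
the limit.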
Drop the size check and the offloading features for now: this
needs to be fixed up properly.
Suggested-by: Eric Dumazet <edumazet@google.com>
Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
drivers/net/ethernet/cortina/gemini.c | 35 ++++-------------------------------
1 file changed, 4 insertions(+), 31 deletions(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index 78287cfcbf63..5e399c6e095b 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -79,8 +79,7 @@ MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
#define GMAC0_IRQ4_8 (GMAC0_MIB_INT_BIT | GMAC0_RX_OVERRUN_INT_BIT)
#define GMAC_OFFLOAD_FEATURES (NETIF_F_SG | NETIF_F_IP_CSUM | \
- NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM | \
- NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)
+ NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM)
/**
* struct gmac_queue_page - page buffer per-page info
@@ -1143,39 +1142,13 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
struct gmac_txdesc *txd;
skb_frag_t *skb_frag;
dma_addr_t mapping;
- unsigned short mtu;
void *buffer;
- int ret;
-
- mtu = ETH_HLEN;
- mtu += netdev->mtu;
- if (skb->protocol == htons(ETH_P_8021Q))
- mtu += VLAN_HLEN;
+ /* TODO: implement proper TSO using MTU in word3 */
word1 = skb->len;
- word3 = SOF_BIT;
-
- if (word1 > mtu) {
- word1 |= TSS_MTU_ENABLE_BIT;
- word3 |= mtu;
- }
+ word3 = SOF_BIT | skb->len;
- if (skb->len >= ETH_FRAME_LEN) {
- /* Hardware offloaded checksumming isn't working on frames
- * bigger than 1514 bytes. A hypothesis about this is that the
- * checksum buffer is only 1518 bytes, so when the frames get
- * bigger they get truncated, or the last few bytes get
- * overwritten by the FCS.
- *
- * Just use software checksumming and bypass on bigger frames.
- */
- if (skb->ip_summed == CHECKSUM_PARTIAL) {
- ret = skb_checksum_help(skb);
- if (ret)
- return ret;
- }
- word1 |= TSS_BYPASS_BIT;
- } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
int tcp = 0;
/* We do not switch off the checksumming on non TCP/UDP
--
2.34.1
* [PATCH net v4 2/3] if_ether: Add an accessor to read the raw ethertype
2023-12-22 17:36 [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller Linus Walleij
2023-12-22 17:36 ` [PATCH net v4 1/3] net: ethernet: cortina: Drop software checksum and TSO Linus Walleij
@ 2023-12-22 17:36 ` Linus Walleij
2023-12-22 17:36 ` [PATCH net v4 3/3] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes Linus Walleij
2023-12-29 23:17 ` [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller Linus Walleij
3 siblings, 0 replies; 5+ messages in thread
From: Linus Walleij @ 2023-12-22 17:36 UTC (permalink / raw)
To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Vladimir Oltean, Household Cang
Cc: netdev, Linus Walleij, Maxime Chevallier
There are circumstances where skb->protocol cannot be
trusted, such as when using DSA switches that add a custom
ethertype to the ethernet packet, which is later on supposed
to be stripped by the switch hardware connected to the
conduit ethernet interface.
Ethernet drivers transmitting frames with such alien
ethertypes may have hardware that gets confused by them,
so they need a way to retrieve the ethertype actually in
the frame and act on it.
The new skb_eth_raw_ethertype() helper extracts the
ethertype directly from skb->data using the ethernet
and (if necessary) VLAN helper functions, and returns the
ethertype actually found inside the raw buffer.
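To illustrate the idea outside the kernel, here is a rough userspace sketch
that reads the ethertype straight out of a raw frame buffer. The names
read_be16() and raw_ethertype() are hypothetical. Unlike the kernel helper,
which returns __be16 and leaves byte order to the caller, this sketch returns
the value in host byte order for simplicity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN          14	/* dst MAC + src MAC + ethertype */
#define ETH_TYPE_OFFSET   12	/* ethertype follows the two 6-byte MACs */

/* Read a big-endian 16-bit field and return it in host byte order. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical userspace analogue of the accessor: look at the bytes in
 * the buffer instead of trusting out-of-band protocol metadata.
 * Returns 0 (an invalid ethertype) if the buffer is too short to hold
 * an ethernet header.
 */
static uint16_t raw_ethertype(const uint8_t *frame, size_t len)
{
	assert(frame != NULL);

	if (len < ETH_HLEN)
		return 0;

	return read_be16(frame + ETH_TYPE_OFFSET);
}
```

A frame whose bytes 12 and 13 are 0x08 0x00 yields 0x0800 (IPv4) regardless
of what any accompanying metadata claims.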
Suggested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
include/linux/if_ether.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/if_ether.h b/include/linux/if_ether.h
index 8a9792a6427a..264457f291eb 100644
--- a/include/linux/if_ether.h
+++ b/include/linux/if_ether.h
@@ -37,6 +37,22 @@ static inline struct ethhdr *inner_eth_hdr(const struct sk_buff *skb)
return (struct ethhdr *)skb_inner_mac_header(skb);
}
+/* This determines the ethertype encoded into the skb data without
+ * relying on skb->protocol which is not always identical.
+ */
+static inline __be16 skb_eth_raw_ethertype(struct sk_buff *skb)
+{
+ struct ethhdr *hdr;
+
+ /* If we can't extract a header, return invalid type */
+ if (!pskb_may_pull(skb, ETH_HLEN))
+ return 0x0000U;
+
+ hdr = skb_eth_hdr(skb);
+
+ return hdr->h_proto;
+}
+
int eth_header_parse(const struct sk_buff *skb, unsigned char *haddr);
extern ssize_t sysfs_format_mac(char *buf, const unsigned char *addr, int len);
--
2.34.1
* [PATCH net v4 3/3] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes
2023-12-22 17:36 [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller Linus Walleij
2023-12-22 17:36 ` [PATCH net v4 1/3] net: ethernet: cortina: Drop software checksum and TSO Linus Walleij
2023-12-22 17:36 ` [PATCH net v4 2/3] if_ether: Add an accessor to read the raw ethertype Linus Walleij
@ 2023-12-22 17:36 ` Linus Walleij
2023-12-29 23:17 ` [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller Linus Walleij
3 siblings, 0 replies; 5+ messages in thread
From: Linus Walleij @ 2023-12-22 17:36 UTC (permalink / raw)
To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Vladimir Oltean, Household Cang
Cc: netdev, Linus Walleij
We had workarounds where the ethernet checksumming engine would be bypassed
for larger frames. This fixed devices using DSA, but regressed devices
where the ethernet was connected directly to a PHY.
The devices with a PHY connected directly can't handle large frames
either way, with or without bypass. Looking at the size of the frame
is probably just wrong.
Rework the workaround such that we don't activate the checksumming engine if
the ethertype inside the actual frame is anything other than 0x0800
(IPv4) or 0x86dd (IPv6). These are the only frames the checksumming engine
can actually handle. VLAN framing (0x8100) also works fine.
We can't inspect skb->protocol because DSA frames will sometimes have a
custom ethertype even though skb->protocol is e.g. 0x0800.
If the frame is ALSO over the size of an ordinary ethernet frame,
we will actively bypass the checksumming engine. (Always doing this
makes the hardware unstable.)
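The resulting decision can be sketched in plain C as below. This is only an
illustration of the rule described above, not the driver code: the enum and
pick_csum_action() are hypothetical names, and the ethertype argument is
assumed to be in host byte order with any 802.1Q tag already skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP      0x0800
#define ETH_P_IPV6    0x86DD
#define ETH_FRAME_LEN 1514	/* standard frame length without FCS */

/* Hypothetical summary of what the TX path does with the descriptor. */
enum tx_csum_action {
	TX_CSUM_NONE,		/* do not touch the checksumming bits */
	TX_CSUM_BYPASS,		/* actively bypass the checksumming engine */
	TX_CSUM_OFFLOAD,	/* let the hardware compute the checksum */
};

static enum tx_csum_action pick_csum_action(uint16_t ethertype, size_t len,
					    bool csum_partial)
{
	assert(len > 0);

	/* Non-IP frames never get hardware checksumming, and oversized
	 * ones additionally need the bypass bit.
	 */
	if (ethertype != ETH_P_IP && ethertype != ETH_P_IPV6)
		return len > ETH_FRAME_LEN ? TX_CSUM_BYPASS : TX_CSUM_NONE;

	/* IP frames are offloaded only when the stack asked for it. */
	return csum_partial ? TX_CSUM_OFFLOAD : TX_CSUM_NONE;
}
```

Note that the bypass is tied to both conditions at once: a large IPv4 frame
is never bypassed, and a small frame with a custom ethertype is simply left
alone.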
After this both devices with direct ethernet attached such as D-Link
DNS-313 and devices with a DSA switch with a custom ethertype such as
D-Link DIR-685 work fine.
Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
drivers/net/ethernet/cortina/gemini.c | 33 +++++++++++++++++++++++++--------
1 file changed, 25 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index 5e399c6e095b..db828e4f258f 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -29,6 +29,7 @@
#include <linux/of_net.h>
#include <linux/of_platform.h>
#include <linux/etherdevice.h>
+#include <linux/if_ether.h>
#include <linux/if_vlan.h>
#include <linux/skbuff.h>
#include <linux/phy.h>
@@ -1142,22 +1143,38 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
struct gmac_txdesc *txd;
skb_frag_t *skb_frag;
dma_addr_t mapping;
+ u16 ethertype;
void *buffer;
/* TODO: implement proper TSO using MTU in word3 */
word1 = skb->len;
word3 = SOF_BIT | skb->len;
- if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ /* Dig out the ethertype actually in the buffer and not what the
+ * protocol claims to be. This is the raw data that the checksumming
+ * offload engine will have to deal with.
+ */
+ ethertype = ntohs(skb_eth_raw_ethertype(skb));
+ /* This is the only VLAN type supported by this hardware so check for
+ * that: the checksumming engine can handle IP and IPv6 inside 802.1Q.
+ */
+ if (ethertype == ETH_P_8021Q)
+ ethertype = ntohs(__vlan_get_protocol(skb, htons(ethertype), NULL));
+
+ if (ethertype != ETH_P_IP && ethertype != ETH_P_IPV6) {
+ /* Hardware offloaded checksumming isn't working on non-IP frames.
+ * This happens for example on some DSA switches using a custom
+ * ethertype. When a frame gets bigger than a standard ethernet
+ * frame, it also needs to actively bypass the checksumming engine.
+ * There is no clear explanation to why it is like this, the
+ * reference manual has left the TSS completely undocumented.
+ */
+ if (skb->len > ETH_FRAME_LEN)
+ word1 |= TSS_BYPASS_BIT;
+ } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
int tcp = 0;
- /* We do not switch off the checksumming on non TCP/UDP
- * frames: as is shown from tests, the checksumming engine
- * is smart enough to see that a frame is not actually TCP
- * or UDP and then just pass it through without any changes
- * to the frame.
- */
- if (skb->protocol == htons(ETH_P_IP)) {
+ if (ethertype == ETH_P_IP) {
word1 |= TSS_IP_CHKSUM_BIT;
tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
} else { /* IPv6 */
--
2.34.1
* Re: [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller.
2023-12-22 17:36 [PATCH net v4 0/3] Fix a regression in the Gemini ethernet controller Linus Walleij
` (2 preceding siblings ...)
2023-12-22 17:36 ` [PATCH net v4 3/3] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes Linus Walleij
@ 2023-12-29 23:17 ` Linus Walleij
3 siblings, 0 replies; 5+ messages in thread
From: Linus Walleij @ 2023-12-29 23:17 UTC (permalink / raw)
To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Vladimir Oltean, Household Cang
Cc: netdev, Maxime Chevallier
On Fri, Dec 22, 2023 at 6:36 PM Linus Walleij <linus.walleij@linaro.org> wrote:
> These fixes were developed on top of the earlier fixes.
>
> Finding the right solution is hard because the Gemini checksumming
> engine is completely undocumented in the datasheets.
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> Changes in v4:
If no-one is actively against the v4 patch can we merge this?
It's a regression.
Yours,
Linus Walleij