* Re: build_skb() and data corruption
From: Arnd Bergmann @ 2014-01-14 15:51 UTC (permalink / raw)
To: Jonas Jensen
Cc: netdev, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, alexander.h.duyck, Florian Fainelli,
Ben Hutchings
In-Reply-To: <CACmBeS1gTfdjwiCkHMViOnF6G2zjy3HSMm3NVq8Jiq5F22ynJA@mail.gmail.com>
On Tuesday 14 January 2014, Jonas Jensen wrote:
> Thanks for the replies, you led me to a new solution.
>
> I now think build_skb() is not the right choice. My motivation for
> using it in the first place was that I thought it meant getting away
> with not copying memory.
>
> build_skb() is replaced by netdev_alloc_skb_ip_align() and memcpy()
> (derived from drivers/net/ethernet/realtek/r8169.c).
>
> Read errors are gone, even without syncing DMA. Is it a good idea to
> do it anyway, i.e. leave calls to dma_sync_single_* in?
The calls to dma_sync_single_* in the moxart_rx() function are needed.
The call to arm_dma_ops.sync_single_for_device() in
moxart_mac_setup_desc_ring() is wrong, because the buffer is already
owned by the device at that point (just after dma_map_single), and
because you should use the official dma_* API rather than the
arm_dma_ops struct directly.
Arnd
^ permalink raw reply
* Re: [PATCH net-next v2 0/2] stmmac: fix kernel crashes for jumbo frames
From: Dinh Nguyen @ 2014-01-14 15:52 UTC (permalink / raw)
To: Vince Bridgers
Cc: devicetree, netdev, peppe.cavallaro, robh+dt, pawel.moll,
mark.rutland, ijc+devicetree, galak, rayagond
In-Reply-To: <1389710409-14106-1-git-send-email-vbridgers2013@gmail.com>
Hi Vince
On Tue, 2014-01-14 at 08:40 -0600, Vince Bridgers wrote:
> These patches address two kernel crashes seen when using jumbo frames on
> the Synopsys stmmac driver, and add device tree configurability for the
> maximum mtu. The Synopsys emac fifo sizes can be configured when a logic
> design is synthesized, but the core does not provide a way for a driver to
> query the exact fifo size.
>
> The crashes seen were due to two issues.
>
> 1) The dma buffer size was being set after the dma buffers were allocated.
> This caused a crash when changing the mtu since it was possible the buffers
> would subsequently be freed using an incorrect dma buffer size. This could
> also cause kernel panics due to memory corruption since a large mtu size could
> have been configured, but the dma buffers were not sized accordingly.
>
> 2) Jumbo frames were being enabled by default, but the dma buffers were not
> sized accordingly. This caused memory corruption in the context of certain
> types of network traffic, leading to kernel panics.
>
> I've tested these changes using automated, reproducible testware. I can
> demonstrate the panics described before the fixes and show that the fixes
> address the problems described.
>
> Testing and improvements continue through the use of the mentioned automated
> and reproducible testware.
>
> Vince Bridgers
>
> Vince Bridgers (2):
> dts: Add a binding for Synopsys emac max-frame-size
> stmmac: Fix kernel crashes for jumbo frames
>
> Documentation/devicetree/bindings/net/stmmac.txt | 5 +++++
> drivers/net/ethernet/stmicro/stmmac/common.h | 4 +++-
> drivers/net/ethernet/stmicro/stmmac/dwmac1000.h | 7 ++-----
> .../net/ethernet/stmicro/stmmac/dwmac1000_core.c | 7 ++++++-
> .../net/ethernet/stmicro/stmmac/dwmac100_core.c | 2 +-
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +++++++----
> .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 5 +++++
> include/linux/stmmac.h | 1 +
> 8 files changed, 30 insertions(+), 12 deletions(-)
What has changed since v1? A version log would be nice.
Thanks,
Dinh
^ permalink raw reply
* Re: [PATCH net-next v2 1/2] dts: Add a binding for Synopsys emac max-frame-size
From: Dinh Nguyen @ 2014-01-14 15:57 UTC (permalink / raw)
To: Vince Bridgers
Cc: devicetree, netdev, peppe.cavallaro, robh+dt, pawel.moll,
mark.rutland, ijc+devicetree, galak, rayagond
In-Reply-To: <1389710409-14106-2-git-send-email-vbridgers2013@gmail.com>
On Tue, 2014-01-14 at 08:40 -0600, Vince Bridgers wrote:
> This change adds a parameter for the Synopsys 10/100/1000
> stmmac Ethernet driver to configure the maximum frame
> size supported by the EMAC driver. Synopsys allows the FIFO
> sizes to be configured when the cores are built for a particular
> device, but does not provide a way for the driver to read
> information from the device about the maximum MTU size
> supported as limited by the device's FIFO size.
>
> Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
> ---
> Documentation/devicetree/bindings/net/stmmac.txt | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/net/stmmac.txt b/Documentation/devicetree/bindings/net/stmmac.txt
> index eba0e5e..26a0ba9 100644
> --- a/Documentation/devicetree/bindings/net/stmmac.txt
> +++ b/Documentation/devicetree/bindings/net/stmmac.txt
> @@ -30,6 +30,10 @@ Required properties:
>
> Optional properties:
> - mac-address: 6 bytes, mac address
> +- snps,max-frame-size: Maximum frame size permitted. This parameter is useful
I don't think max-frame-size should be a snps-only binding.
Dinh
> + since different implementations of the Synopsys MAC may
> + have different FIFO sizes depending on the selections
> + made in Synopsys Core Consultant.
>
> Examples:
>
> @@ -40,5 +44,6 @@ Examples:
> interrupts = <24 23>;
> interrupt-names = "macirq", "eth_wake_irq";
> mac-address = [000000000000]; /* Filled in by U-Boot */
> + snps,max-frame-size = <3800>;
> phy-mode = "gmii";
> };
^ permalink raw reply
* [PATCH net-next V4 2/3] net: Export gro_find_by_type helpers
From: Or Gerlitz @ 2014-01-14 16:00 UTC (permalink / raw)
To: davem; +Cc: netdev, hkchu, edumazet, herbert, yanb, shlomop, therbert,
Or Gerlitz
In-Reply-To: <1389715212-14504-1-git-send-email-ogerlitz@mellanox.com>
Export the gro_find_receive/complete_by_type helpers so they can be invoked
by the gro callbacks of encapsulation protocols such as vxlan.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
net/core/dev.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index aafc07a..03cab5f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3949,6 +3949,7 @@ struct packet_offload *gro_find_receive_by_type(__be16 type)
}
return NULL;
}
+EXPORT_SYMBOL(gro_find_receive_by_type);
struct packet_offload *gro_find_complete_by_type(__be16 type)
{
@@ -3962,6 +3963,7 @@ struct packet_offload *gro_find_complete_by_type(__be16 type)
}
return NULL;
}
+EXPORT_SYMBOL(gro_find_complete_by_type);
static gro_result_t napi_skb_finish(gro_result_t ret, struct sk_buff *skb)
{
--
1.7.1
^ permalink raw reply related
* [PATCH net-next V3 0/3] net: Add GRO support for UDP encapsulating protocols
From: Or Gerlitz @ 2014-01-14 16:00 UTC (permalink / raw)
To: davem; +Cc: netdev, hkchu, edumazet, herbert, yanb, shlomop, therbert,
Or Gerlitz
This series adds GRO handlers for protocols that do UDP encapsulation, with the
intent of being able to coalesce packets which encapsulate packets belonging to
the same TCP session.
For GRO purposes, the destination UDP port takes the role of the ether type
field in the ethernet header or the next protocol in the IP header.
The UDP GRO handler will only attempt to coalesce packets whose destination
port is registered to have a GRO handler.
The patches were done against net-next ae237b3ede64 "net: 3com: fix
warning for incorrect type in argument"
Or.
v3 --> v4 changes:
- applied feedback from Tom on some micro-optimizations that save
branches and goto directives in the udp gro logic
- applied feedback from Eric on correct RCU programming for the
add/remove flow of the upper protocols udp gro handlers
v2 --> v3 changes:
- moved to use linked list to store the udp gro handlers, this solves the
problem of consuming 512KB of memory for the handlers.
- use a mark on the skb GRO CB data to disallow running the udp gro_receive twice
on a packet, this solves the problem of udp encapsulated packets whose inner VM
packet is udp and happen to carry a port which has registered offloads - and flush it.
- invoke the udp offload protocol registration and de-registration from the vxlan driver
in a sleepable context
For unclear some reason I got this warning when the vxlan driver deletes the
udp offload structure
*** BLURB HERE ***
Or Gerlitz (3):
net: Add GRO support for UDP encapsulating protocols
net: Export gro_find_by_type helpers
net: Add GRO support for vxlan traffic
drivers/net/vxlan.c | 117 +++++++++++++++++++++++++++++++--
include/linux/netdevice.h | 10 +++-
include/net/protocol.h | 3 +
include/net/vxlan.h | 1 +
net/core/dev.c | 3 +
net/ipv4/udp_offload.c | 157 +++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 283 insertions(+), 8 deletions(-)
^ permalink raw reply
* [PATCH net-next V4 1/3] net: Add GRO support for UDP encapsulating protocols
From: Or Gerlitz @ 2014-01-14 16:00 UTC (permalink / raw)
To: davem; +Cc: netdev, hkchu, edumazet, herbert, yanb, shlomop, therbert,
Or Gerlitz
In-Reply-To: <1389715212-14504-1-git-send-email-ogerlitz@mellanox.com>
Add GRO handlers for protocols that do UDP encapsulation, with the intent of
being able to coalesce packets which encapsulate packets belonging to
the same TCP session.
For GRO purposes, the destination UDP port takes the role of the ether type
field in the ethernet header or the next protocol in the IP header.
The UDP GRO handler will only attempt to coalesce packets whose destination
port is registered to have a GRO handler.
Use a mark on the skb GRO CB data to disallow (flush) running the udp gro receive
code twice on a packet. This solves the problem of udp encapsulated packets whose
inner VM packet is udp and happens to carry a port which has registered offloads.
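The lookup and the once-only mark described above can be sketched in plain userspace C. This is only an illustrative model, not the code from this patch: every name here (struct demo_offload, demo_add, demo_del, demo_receive, demo_len_handler) is hypothetical, ports are kept in host byte order, and the spinlock/RCU protection the kernel version needs is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a port-keyed offload list.
 * The real patch protects the list with a spinlock and RCU. */
typedef int (*demo_handler)(const uint8_t *payload, size_t len);

struct demo_offload {
	uint16_t port;			/* destination UDP port, host order here */
	demo_handler handler;		/* stands in for callbacks.gro_receive */
	struct demo_offload *next;
};

struct demo_pkt {
	uint16_t dport;
	int udp_mark;			/* set once the packet passed demo_receive() */
	const uint8_t *payload;
	size_t len;
};

static struct demo_offload *demo_base;

/* Example handler: just reports the payload length it was given. */
static int demo_len_handler(const uint8_t *payload, size_t len)
{
	(void)payload;
	return (int)len;
}

/* Push a handler at the head of the list. */
static void demo_add(struct demo_offload *uo)
{
	assert(uo != NULL);
	uo->next = demo_base;
	demo_base = uo;
}

/* Unlink a handler; returns 0 on success, -1 if it was not registered. */
static int demo_del(struct demo_offload *uo)
{
	struct demo_offload **head;

	for (head = &demo_base; *head; head = &(*head)->next) {
		if (*head == uo) {
			*head = uo->next;
			uo->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Returns the handler's result, or -1 meaning "flush, do not coalesce". */
static int demo_receive(struct demo_pkt *pkt)
{
	struct demo_offload *uo;

	if (pkt->udp_mark)		/* second pass through the UDP layer */
		return -1;
	pkt->udp_mark = 1;

	for (uo = demo_base; uo; uo = uo->next)
		if (uo->port == pkt->dport && uo->handler)
			return uo->handler(pkt->payload, pkt->len);
	return -1;			/* no handler registered for this port */
}
```

In this model a packet whose destination port has a registered handler is dispatched once; a second call on the same packet, or a packet for an unregistered port, is flushed.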
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
include/linux/netdevice.h | 10 +++-
include/net/protocol.h | 3 +
net/core/dev.c | 1 +
net/ipv4/udp_offload.c | 157 +++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 170 insertions(+), 1 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a2a70cc..efb942f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1652,7 +1652,10 @@ struct napi_gro_cb {
unsigned long age;
/* Used in ipv6_gro_receive() */
- int proto;
+ u16 proto;
+
+ /* Used in udp_gro_receive */
+ u16 udp_mark;
/* used to support CHECKSUM_COMPLETE for tunneling protocols */
__wsum csum;
@@ -1691,6 +1694,11 @@ struct packet_offload {
struct list_head list;
};
+struct udp_offload {
+ __be16 port;
+ struct offload_callbacks callbacks;
+};
+
/* often modified stats are per cpu, other are shared (netdev->stats) */
struct pcpu_sw_netstats {
u64 rx_packets;
diff --git a/include/net/protocol.h b/include/net/protocol.h
index 0e5f866..a7e986b 100644
--- a/include/net/protocol.h
+++ b/include/net/protocol.h
@@ -108,6 +108,9 @@ int inet_del_offload(const struct net_offload *prot, unsigned char num);
void inet_register_protosw(struct inet_protosw *p);
void inet_unregister_protosw(struct inet_protosw *p);
+int udp_add_offload(struct udp_offload *prot);
+void udp_del_offload(struct udp_offload *prot);
+
#if IS_ENABLED(CONFIG_IPV6)
int inet6_add_protocol(const struct inet6_protocol *prot, unsigned char num);
int inet6_del_protocol(const struct inet6_protocol *prot, unsigned char num);
diff --git a/net/core/dev.c b/net/core/dev.c
index 87312dc..aafc07a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3858,6 +3858,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
NAPI_GRO_CB(skb)->same_flow = 0;
NAPI_GRO_CB(skb)->flush = 0;
NAPI_GRO_CB(skb)->free = 0;
+ NAPI_GRO_CB(skb)->udp_mark = 0;
pp = ptype->callbacks.gro_receive(&napi->gro_list, skb);
break;
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 79c62bd..11785ac 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -14,6 +14,16 @@
#include <net/udp.h>
#include <net/protocol.h>
+static DEFINE_SPINLOCK(udp_offload_lock);
+static struct udp_offload_priv *udp_offload_base __read_mostly;
+
+struct udp_offload_priv {
+ struct udp_offload *offload;
+ struct rcu_head rcu;
+ atomic_t refcount;
+ struct udp_offload_priv __rcu *next;
+};
+
static int udp4_ufo_send_check(struct sk_buff *skb)
{
if (!pskb_may_pull(skb, sizeof(struct udphdr)))
@@ -89,10 +99,157 @@ out:
return segs;
}
+int udp_add_offload(struct udp_offload *uo)
+{
+ struct udp_offload_priv **head = &udp_offload_base;
+ struct udp_offload_priv *new_offload = kzalloc(sizeof(*new_offload), GFP_KERNEL);
+
+ if (!new_offload)
+ return -ENOMEM;
+
+ new_offload->offload = uo;
+ atomic_set(&new_offload->refcount, 1);
+
+ spin_lock(&udp_offload_lock);
+ rcu_assign_pointer(new_offload->next, rcu_dereference(*head));
+ rcu_assign_pointer(*head, rcu_dereference(new_offload));
+ spin_unlock(&udp_offload_lock);
+
+ return 0;
+}
+EXPORT_SYMBOL(udp_add_offload);
+
+static void udp_offload_free_routine(struct rcu_head *head)
+{
+ struct udp_offload_priv *ou_priv = container_of(head, struct udp_offload_priv, rcu);
+ kfree(ou_priv);
+}
+
+static void udp_offload_put(struct udp_offload_priv *uo_priv)
+{
+ if (atomic_dec_and_test(&uo_priv->refcount))
+ call_rcu(&uo_priv->rcu, udp_offload_free_routine);
+}
+
+void udp_del_offload(struct udp_offload *uo)
+{
+ struct udp_offload_priv __rcu **head = &udp_offload_base;
+ struct udp_offload_priv *uo_priv;
+
+ spin_lock(&udp_offload_lock);
+
+ uo_priv = rcu_dereference(*head);
+ for (; uo_priv != NULL;
+ uo_priv = rcu_dereference(*head)) {
+
+ if (uo_priv->offload == uo) {
+ rcu_assign_pointer(*head, rcu_dereference(uo_priv->next));
+ udp_offload_put(uo_priv);
+ goto unlock;
+ }
+ head = &uo_priv->next;
+ }
+ pr_warn("udp_del_offload: didn't find offload for port %d\n", htons(uo->port));
+unlock:
+ spin_unlock(&udp_offload_lock);
+}
+EXPORT_SYMBOL(udp_del_offload);
+
+static struct sk_buff **udp_gro_receive(struct sk_buff **head, struct sk_buff *skb)
+{
+ struct udp_offload_priv *uo_priv;
+ struct sk_buff *p, **pp = NULL;
+ struct udphdr *uh, *uh2;
+ unsigned int hlen, off;
+ int flush = 1;
+
+ if (NAPI_GRO_CB(skb)->udp_mark ||
+ (!skb->encapsulation && skb->ip_summed != CHECKSUM_COMPLETE))
+ goto out;
+
+ /* mark that this skb passed once through the udp gro layer */
+ NAPI_GRO_CB(skb)->udp_mark = 1;
+
+ off = skb_gro_offset(skb);
+ hlen = off + sizeof(*uh);
+ uh = skb_gro_header_fast(skb, off);
+ if (skb_gro_header_hard(skb, hlen)) {
+ uh = skb_gro_header_slow(skb, hlen, off);
+ if (unlikely(!uh))
+ goto out;
+ }
+
+ rcu_read_lock();
+ uo_priv = rcu_dereference(udp_offload_base);
+ for (; uo_priv != NULL; uo_priv = rcu_dereference(uo_priv->next)) {
+ if (uo_priv->offload->port == uh->dest &&
+ uo_priv->offload->callbacks.gro_receive) {
+ atomic_inc(&uo_priv->refcount);
+ goto unflush;
+ }
+ }
+ rcu_read_unlock();
+ goto out;
+
+unflush:
+ rcu_read_unlock();
+ flush = 0;
+
+ for (p = *head; p; p = p->next) {
+ if (!NAPI_GRO_CB(p)->same_flow)
+ continue;
+
+ uh2 = (struct udphdr *)(p->data + off);
+ if ((*(u32 *)&uh->source != *(u32 *)&uh2->source)) {
+ NAPI_GRO_CB(p)->same_flow = 0;
+ continue;
+ }
+ }
+
+ skb_gro_pull(skb, sizeof(struct udphdr)); /* pull encapsulating udp header */
+ pp = uo_priv->offload->callbacks.gro_receive(head, skb);
+ udp_offload_put(uo_priv);
+
+out:
+ NAPI_GRO_CB(skb)->flush |= flush;
+ return pp;
+}
+
+static int udp_gro_complete(struct sk_buff *skb, int nhoff)
+{
+ struct udp_offload_priv *uo_priv;
+ __be16 newlen = htons(skb->len - nhoff);
+ struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
+ int err = -ENOSYS;
+
+ uh->len = newlen;
+
+ rcu_read_lock();
+
+ uo_priv = rcu_dereference(udp_offload_base);
+ for (; uo_priv != NULL; uo_priv = rcu_dereference(uo_priv->next)) {
+ if (uo_priv->offload->port == uh->dest &&
+ uo_priv->offload->callbacks.gro_complete)
+ goto found;
+ }
+
+ rcu_read_unlock();
+ return err;
+
+found:
+ atomic_inc(&uo_priv->refcount);
+ rcu_read_unlock();
+ err = uo_priv->offload->callbacks.gro_complete(skb, nhoff + sizeof(struct udphdr));
+ udp_offload_put(uo_priv);
+ return err;
+}
+
static const struct net_offload udpv4_offload = {
.callbacks = {
.gso_send_check = udp4_ufo_send_check,
.gso_segment = udp4_ufo_fragment,
+ .gro_receive = udp_gro_receive,
+ .gro_complete = udp_gro_complete,
},
};
--
1.7.1
^ permalink raw reply related
* [PATCH net-next V4 3/3] net: Add GRO support for vxlan traffic
From: Or Gerlitz @ 2014-01-14 16:00 UTC (permalink / raw)
To: davem; +Cc: netdev, hkchu, edumazet, herbert, yanb, shlomop, therbert,
Or Gerlitz
In-Reply-To: <1389715212-14504-1-git-send-email-ogerlitz@mellanox.com>
Add GRO handlers for vxlan, by using the UDP GRO infrastructure.
For a single TCP session that goes through vxlan tunneling I got a nice
improvement, from 6.8 Gb/s to 11.5 Gb/s
--> UDP/VXLAN GRO disabled
$ netperf -H 192.168.52.147 -c -C
$ netperf -t TCP_STREAM -H 192.168.52.147 -c -C
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 65536 65536 10.00 6799.75 12.54 24.79 0.604 1.195
--> UDP/VXLAN GRO enabled
$ netperf -t TCP_STREAM -H 192.168.52.147 -c -C
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 65536 65536 10.00 11562.72 24.90 20.34 0.706 0.577
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/vxlan.c | 117 +++++++++++++++++++++++++++++++++++++++++++++++---
include/net/vxlan.h | 1 +
2 files changed, 111 insertions(+), 7 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 481f85d..27a25ce 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -40,6 +40,7 @@
#include <net/net_namespace.h>
#include <net/netns/generic.h>
#include <net/vxlan.h>
+#include <net/protocol.h>
#if IS_ENABLED(CONFIG_IPV6)
#include <net/ipv6.h>
#include <net/addrconf.h>
@@ -554,13 +555,106 @@ static int vxlan_fdb_append(struct vxlan_fdb *f,
return 1;
}
+static struct sk_buff **vxlan_gro_receive(struct sk_buff **head, struct sk_buff *skb)
+{
+ struct sk_buff *p, **pp = NULL;
+ struct vxlanhdr *vh, *vh2;
+ struct ethhdr *eh, *eh2;
+ unsigned int hlen, off_vx, off_eth;
+ const struct packet_offload *ptype;
+ __be16 type;
+ int flush = 1;
+
+ off_vx = skb_gro_offset(skb);
+ hlen = off_vx + sizeof(*vh);
+ vh = skb_gro_header_fast(skb, off_vx);
+ if (skb_gro_header_hard(skb, hlen)) {
+ vh = skb_gro_header_slow(skb, hlen, off_vx);
+ if (unlikely(!vh))
+ goto out;
+ }
+ skb_gro_pull(skb, sizeof(struct vxlanhdr)); /* pull vxlan header */
+
+ off_eth = skb_gro_offset(skb);
+ hlen = off_eth + sizeof(*eh);
+ eh = skb_gro_header_fast(skb, off_eth);
+ if (skb_gro_header_hard(skb, hlen)) {
+ eh = skb_gro_header_slow(skb, hlen, off_eth);
+ if (unlikely(!eh))
+ goto out;
+ }
+
+ flush = 0;
+
+ for (p = *head; p; p = p->next) {
+ if (!NAPI_GRO_CB(p)->same_flow)
+ continue;
+
+ vh2 = (struct vxlanhdr *)(p->data + off_vx);
+ eh2 = (struct ethhdr *)(p->data + off_eth);
+ if (vh->vx_vni != vh2->vx_vni || compare_ether_header(eh, eh2)) {
+ NAPI_GRO_CB(p)->same_flow = 0;
+ continue;
+ }
+ goto found;
+ }
+
+found:
+ type = eh->h_proto;
+
+ rcu_read_lock();
+ ptype = gro_find_receive_by_type(type);
+ if (ptype == NULL) {
+ flush = 1;
+ goto out_unlock;
+ }
+
+ skb_gro_pull(skb, sizeof(*eh)); /* pull inner eth header */
+ pp = ptype->callbacks.gro_receive(head, skb);
+
+out_unlock:
+ rcu_read_unlock();
+out:
+ NAPI_GRO_CB(skb)->flush |= flush;
+
+ return pp;
+}
+
+static int vxlan_gro_complete(struct sk_buff *skb, int nhoff)
+{
+ struct ethhdr *eh;
+ struct packet_offload *ptype;
+ __be16 type;
+ int vxlan_len = sizeof(struct vxlanhdr) + sizeof(struct ethhdr);
+ int err = -ENOSYS;
+
+ eh = (struct ethhdr *)(skb->data + nhoff + sizeof(struct vxlanhdr));
+ type = eh->h_proto;
+
+ rcu_read_lock();
+ ptype = gro_find_complete_by_type(type);
+ if (ptype != NULL)
+ err = ptype->callbacks.gro_complete(skb, nhoff + vxlan_len);
+
+ rcu_read_unlock();
+ return err;
+}
+
/* Notify netdevs that UDP port started listening */
-static void vxlan_notify_add_rx_port(struct sock *sk)
+static void vxlan_notify_add_rx_port(struct vxlan_sock *vs)
{
struct net_device *dev;
+ struct sock *sk = vs->sock->sk;
struct net *net = sock_net(sk);
sa_family_t sa_family = sk->sk_family;
__be16 port = inet_sk(sk)->inet_sport;
+ int err;
+
+ if (sa_family == AF_INET) {
+ err = udp_add_offload(&vs->udp_offloads);
+ if (err)
+ pr_warn("vxlan: udp_add_offload failed with status %d\n", err);
+ }
rcu_read_lock();
for_each_netdev_rcu(net, dev) {
@@ -572,9 +666,10 @@ static void vxlan_notify_add_rx_port(struct sock *sk)
}
/* Notify netdevs that UDP port is no more listening */
-static void vxlan_notify_del_rx_port(struct sock *sk)
+static void vxlan_notify_del_rx_port(struct vxlan_sock *vs)
{
struct net_device *dev;
+ struct sock *sk = vs->sock->sk;
struct net *net = sock_net(sk);
sa_family_t sa_family = sk->sk_family;
__be16 port = inet_sk(sk)->inet_sport;
@@ -586,6 +681,9 @@ static void vxlan_notify_del_rx_port(struct sock *sk)
port);
}
rcu_read_unlock();
+
+ if (sa_family == AF_INET)
+ udp_del_offload(&vs->udp_offloads);
}
/* Add new entry to forwarding table -- assumes lock held */
@@ -964,7 +1062,7 @@ void vxlan_sock_release(struct vxlan_sock *vs)
spin_lock(&vn->sock_lock);
hlist_del_rcu(&vs->hlist);
rcu_assign_sk_user_data(vs->sock->sk, NULL);
- vxlan_notify_del_rx_port(sk);
+ vxlan_notify_del_rx_port(vs);
spin_unlock(&vn->sock_lock);
queue_work(vxlan_wq, &vs->del_work);
@@ -1125,8 +1223,8 @@ static void vxlan_rcv(struct vxlan_sock *vs,
* leave the CHECKSUM_UNNECESSARY, the device checksummed it
* for us. Otherwise force the upper layers to verify it.
*/
- if (skb->ip_summed != CHECKSUM_UNNECESSARY || !skb->encapsulation ||
- !(vxlan->dev->features & NETIF_F_RXCSUM))
+ if ((skb->ip_summed != CHECKSUM_UNNECESSARY && skb->ip_summed != CHECKSUM_PARTIAL) ||
+ !skb->encapsulation || !(vxlan->dev->features & NETIF_F_RXCSUM))
skb->ip_summed = CHECKSUM_NONE;
skb->encapsulation = 0;
@@ -2304,7 +2402,7 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port,
struct sock *sk;
unsigned int h;
- vs = kmalloc(sizeof(*vs), GFP_KERNEL);
+ vs = kzalloc(sizeof(*vs), GFP_KERNEL);
if (!vs)
return ERR_PTR(-ENOMEM);
@@ -2329,9 +2427,14 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port,
vs->data = data;
rcu_assign_sk_user_data(vs->sock->sk, vs);
+ /* Initialize the vxlan udp offloads structure */
+ vs->udp_offloads.port = port;
+ vs->udp_offloads.callbacks.gro_receive = vxlan_gro_receive;
+ vs->udp_offloads.callbacks.gro_complete = vxlan_gro_complete;
+
spin_lock(&vn->sock_lock);
hlist_add_head_rcu(&vs->hlist, vs_head(net, port));
- vxlan_notify_add_rx_port(sk);
+ vxlan_notify_add_rx_port(vs);
spin_unlock(&vn->sock_lock);
/* Mark socket as an encapsulation socket. */
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 6b6d180..5deef1a 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -21,6 +21,7 @@ struct vxlan_sock {
struct rcu_head rcu;
struct hlist_head vni_list[VNI_HASH_SIZE];
atomic_t refcnt;
+ struct udp_offload udp_offloads;
};
struct vxlan_sock *vxlan_sock_add(struct net *net, __be16 port,
--
1.7.1
^ permalink raw reply related
* Re: [PATCH net-next v2 1/2] dts: Add a binding for Synopsys emac max-frame-size
From: Florian Fainelli @ 2014-01-14 16:07 UTC (permalink / raw)
To: Dinh Nguyen
Cc: Vince Bridgers, devicetree, netdev, peppe.cavallaro, robh+dt,
pawel.moll, mark.rutland, ijc+devicetree, galak, rayagond
In-Reply-To: <1389715056.10673.9.camel@linux-builds1>
On Tuesday 14 January 2014, 09:57:36 Dinh Nguyen wrote:
> On Tue, 2014-01-14 at 08:40 -0600, Vince Bridgers wrote:
> > This change adds a parameter for the Synopsys 10/100/1000
> > stmmac Ethernet driver to configure the maximum frame
> > size supported by the EMAC driver. Synopsys allows the FIFO
> > sizes to be configured when the cores are built for a particular
> > device, but does not provide a way for the driver to read
> > information from the device about the maximum MTU size
> > supported as limited by the device's FIFO size.
> >
> > Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
> > ---
> >
> > Documentation/devicetree/bindings/net/stmmac.txt | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/net/stmmac.txt
> > b/Documentation/devicetree/bindings/net/stmmac.txt index eba0e5e..26a0ba9
> > 100644
> > --- a/Documentation/devicetree/bindings/net/stmmac.txt
> > +++ b/Documentation/devicetree/bindings/net/stmmac.txt
> >
> > @@ -30,6 +30,10 @@ Required properties:
> > Optional properties:
> > - mac-address: 6 bytes, mac address
> >
> > +- snps,max-frame-size: Maximum frame size permitted. This parameter is
> > useful
> I don't think max-frame-size should be a snps-only binding.
Right, and I already made that comment to Vince. Maybe this is just an
oversight, and v2 was mistakenly resubmitted instead of v3?
--
Florian
^ permalink raw reply
* Re: [PATCH net-next V3 0/3] net: Add GRO support for UDP encapsulating protocols
From: Or Gerlitz @ 2014-01-14 16:06 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <1389715212-14504-1-git-send-email-ogerlitz@mellanox.com>
On 14/01/2014 18:00, Or Gerlitz wrote:
> This series adds GRO handlers for protocols that do UDP encapsulation, with the
> intent of being able to coalesce packets which encapsulate packets belonging to
> the same TCP session.
>
> For GRO purposes, the destination UDP port takes the role of the ether type
> field in the ethernet header or the next protocol in the IP header.
>
> The UDP GRO handler will only attempt to coalesce packets whose destination
> port is registered to have a GRO handler.
>
> The patches were done against net-next ae237b3ede64 "net: 3com: fix
> warning for incorrect type in argument"
>
> Or.
>
>
> v3 --> v4 changes:
>
> - applied feedback from Tom on some micro-optimizations that save
> branches and goto directives in the udp gro logic
>
> - applied feedback from Eric on correct RCU programming for the
> add/remove flow of the upper protocols udp gro handlers
>
>
> v2 --> v3 changes:
>
> - moved to use linked list to store the udp gro handlers, this solves the
> problem of consuming 512KB of memory for the handlers.
>
> - use a mark on the skb GRO CB data to disallow running the udp gro_receive twice
> on a packet, this solves the problem of udp encapsulated packets whose inner VM
> packet is udp and happen to carry a port which has registered offloads - and flush it.
>
> - invoke the udp offload protocol registration and de-registration from the vxlan driver
> in a sleepable context
>
> For unclear some reason I got this warning when the vxlan driver deletes the
> udp offload structure
> *** BLURB HERE ***
Sorry for the spam. The above three lines are leftovers from the V3
cover letter, as is the subject line of this cover letter, which
carries "V3"; this *is* V4. I will make sure to avoid
such flushes (....) in the future.
Or.
>
> Or Gerlitz (3):
> net: Add GRO support for UDP encapsulating protocols
> net: Export gro_find_by_type helpers
> net: Add GRO support for vxlan traffic
>
> drivers/net/vxlan.c | 117 +++++++++++++++++++++++++++++++--
> include/linux/netdevice.h | 10 +++-
> include/net/protocol.h | 3 +
> include/net/vxlan.h | 1 +
> net/core/dev.c | 3 +
> net/ipv4/udp_offload.c | 157 +++++++++++++++++++++++++++++++++++++++++++++
> 6 files changed, 283 insertions(+), 8 deletions(-)
>
^ permalink raw reply
* Re: [PATCH net-next 09/10] can: use __dev_get_by_index instead of dev_get_by_index to find interface
From: Oliver Hartkopp @ 2014-01-14 16:11 UTC (permalink / raw)
To: Ying Xue, davem
Cc: vfalico, john.r.fastabend, stephen, antonio, dmitry.tarnyagin,
johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-10-git-send-email-ying.xue@windriver.com>
On 14.01.2014 08:41, Ying Xue wrote:
> As cgw_create_job() is always under rtnl_lock protection,
> __dev_get_by_index() instead of dev_get_by_index() should be used to
> find the interface handler in it, which avoids changing the interface
> reference counter.
>
> Cc: Oliver Hartkopp <socketcan@hartkopp.net>
> Signed-off-by: Ying Xue <ying.xue@windriver.com>
Thanks for the simplification!
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
> ---
> net/can/gw.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/net/can/gw.c b/net/can/gw.c
> index 88c8a39..ac31891 100644
> --- a/net/can/gw.c
> +++ b/net/can/gw.c
> @@ -839,21 +839,21 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh)
> if (!gwj->ccgw.src_idx || !gwj->ccgw.dst_idx)
> goto out;
>
> - gwj->src.dev = dev_get_by_index(&init_net, gwj->ccgw.src_idx);
> + gwj->src.dev = __dev_get_by_index(&init_net, gwj->ccgw.src_idx);
>
> if (!gwj->src.dev)
> goto out;
>
> if (gwj->src.dev->type != ARPHRD_CAN)
> - goto put_src_out;
> + goto out;
>
> - gwj->dst.dev = dev_get_by_index(&init_net, gwj->ccgw.dst_idx);
> + gwj->dst.dev = __dev_get_by_index(&init_net, gwj->ccgw.dst_idx);
>
> if (!gwj->dst.dev)
> - goto put_src_out;
> + goto out;
>
> if (gwj->dst.dev->type != ARPHRD_CAN)
> - goto put_src_dst_out;
> + goto out;
>
> gwj->limit_hops = limhops;
>
> @@ -862,11 +862,6 @@ static int cgw_create_job(struct sk_buff *skb, struct nlmsghdr *nlh)
> err = cgw_register_filter(gwj);
> if (!err)
> hlist_add_head_rcu(&gwj->list, &cgw_list);
> -
> -put_src_dst_out:
> - dev_put(gwj->dst.dev);
> -put_src_out:
> - dev_put(gwj->src.dev);
> out:
> if (err)
> kmem_cache_free(cgw_cache, gwj);
>
^ permalink raw reply
* [PATCH] [trivial] ixgbe: Fix format string in ixgbe_fcoe.c
From: Masanari Iida @ 2014-01-14 16:14 UTC (permalink / raw)
To: jeffrey.t.kirsher, jesse.brandeburg, bruce.w.allan, e1000-devel,
netdev, linux-kernel, trivial
Cc: Masanari Iida
cppcheck detected the following warning in ixgbe_fcoe.c:
(warning) %d in format string (no. 1) requires 'int' but the
argument type is 'unsigned int'.
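As a small standalone illustration of the warning (a sketch only; demo_pool_name is a hypothetical helper, not a function in the driver), the specifier is matched to the unsigned int argument. For small CPU numbers the output is the same either way, so the change is about type correctness rather than a visible behaviour change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper mirroring the pool-name formatting touched by the patch.
 * Returns what snprintf() returns: the length the full name needs. */
static int demo_pool_name(char *buf, size_t size, unsigned int cpu)
{
	assert(buf != NULL && size > 0);
	/* %u matches the unsigned int argument; %d would expect an int */
	return snprintf(buf, size, "ixgbe_fcoe_ddp_%u", cpu);
}
```

With a 32-byte buffer, as in the driver, even the largest 32-bit value fits: the 15-character prefix plus at most 10 digits and the terminating NUL.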
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
index f58db45..0872617 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
@@ -585,7 +585,7 @@ static int ixgbe_fcoe_dma_pool_alloc(struct ixgbe_fcoe *fcoe,
struct dma_pool *pool;
char pool_name[32];
- snprintf(pool_name, 32, "ixgbe_fcoe_ddp_%d", cpu);
+ snprintf(pool_name, 32, "ixgbe_fcoe_ddp_%u", cpu);
pool = dma_pool_create(pool_name, dev, IXGBE_FCPTR_MAX,
IXGBE_FCPTR_ALIGN, PAGE_SIZE);
--
1.8.5.2.309.ga25014b
^ permalink raw reply related
* [PATCH net-next] openvswitch: Pad OVS_PACKET_ATTR_PACKET if linear copy was performed
From: Thomas Graf @ 2014-01-14 16:16 UTC (permalink / raw)
To: Zoltan Kiss
Cc: Thomas Graf, Jesse Gross, David Miller, dev@openvswitch.org,
netdev
In-Reply-To: <52D52E0A.4050700@citrix.com>
The zerocopy method is correctly omitted if user space does not
support unaligned Netlink messages. However, the attribute is
still not padded correctly, as skb_zerocopy() will not ensure
padding and the attribute size is no longer precalculated
through nla_reserve(), which ensured padding previously.
This patch applies appropriate padding if a linear data copy
was performed in skb_zerocopy().
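The amount of padding involved can be worked out with a short userspace sketch. The DEMO_* macros and demo_* functions below are hypothetical stand-ins, assuming the usual 4-byte Netlink attribute alignment (NLA_ALIGNTO); they are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror NLA_ALIGNTO (4-byte Netlink attribute alignment). */
#define DEMO_NLA_ALIGNTO 4u
/* Round x up to the next multiple of a (a must be a power of two). */
#define DEMO_ALIGN(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Number of pad bytes needed to bring len up to the alignment boundary. */
static size_t demo_nla_padding(size_t len)
{
	return DEMO_ALIGN(len, DEMO_NLA_ALIGNTO) - len;
}

/* Message length after padding; always a multiple of the alignment. */
static size_t demo_padded_len(size_t len)
{
	size_t padded = len + demo_nla_padding(len);

	assert(padded % DEMO_NLA_ALIGNTO == 0);
	return padded;
}
```

For example, a 61-byte message needs 3 pad bytes to reach 64, while a 64-byte message needs none, which is why the padding is only applied when the computed value is greater than zero.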
Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
net/openvswitch/datapath.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index df46928..24e93f5 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -396,7 +396,7 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
.dst_sk = ovs_dp_get_net(dp)->genl_sock,
.snd_portid = upcall_info->portid,
};
- size_t len;
+ size_t len, plen;
unsigned int hlen;
int err, dp_ifindex;
@@ -466,6 +466,11 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
skb_zerocopy(user_skb, skb, skb->len, hlen);
+ /* Pad OVS_PACKET_ATTR_PACKET if linear copy was performed */
+ if (!(dp->user_features & OVS_DP_F_UNALIGNED) &&
+ (plen = (ALIGN(user_skb->len, NLA_ALIGNTO) - user_skb->len)) > 0)
+ skb_put(user_skb, plen);
+
((struct nlmsghdr *) user_skb->data)->nlmsg_len = user_skb->len;
err = genlmsg_unicast(ovs_dp_get_net(dp), user_skb, upcall_info->portid);
^ permalink raw reply related
* Re: [PATCH net-next] openvswitch: Pad OVS_PACKET_ATTR_PACKET if linear copy was performed
From: Thomas Graf @ 2014-01-14 16:19 UTC (permalink / raw)
To: Zoltan Kiss
Cc: Thomas Graf, Jesse Gross, David Miller, dev@openvswitch.org,
netdev
In-Reply-To: <20140114161640.GA24121@casper.infradead.org>
On 01/14/14 at 04:16pm, Thomas Graf wrote:
> While the zerocopy method is correctly omitted if user space
> does not support unaligned Netlink messages. The attribute is
> still not padded correctly as skb_zerocopy() will not ensure
> padding and the attribute size is no longer pre calculated
> though nla_reserve() which ensured padding previously.
>
> This patch applies appropriate padding if a linear data copy
> was performed in skb_zerocopy().
>
> Signed-off-by: Thomas Graf <tgraf@suug.ch>
> ---
> net/openvswitch/datapath.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index df46928..24e93f5 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -396,7 +396,7 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
> .dst_sk = ovs_dp_get_net(dp)->genl_sock,
> .snd_portid = upcall_info->portid,
> };
> - size_t len;
> + size_t len, plen;
> unsigned int hlen;
> int err, dp_ifindex;
>
> @@ -466,6 +466,11 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
>
> skb_zerocopy(user_skb, skb, skb->len, hlen);
>
> + /* Pad OVS_PACKET_ATTR_PACKET if linear copy was performed */
> + if (!(dp->user_features & OVS_DP_F_UNALIGNED) &&
> + (plen = (ALIGN(user_skb->len, NLA_ALIGNTO) - user_skb->len)) > 0)
> + skb_put(user_skb, plen);
While this fixes the padding issue, it leaves the padding
uninitialized. I will send v2 with an additional memset().
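As a rough user-space sketch of the padding step discussed above (not the
kernel code itself): nla_pad_len() and pad_with_zeros() are hypothetical
names, and a flat byte buffer stands in for the skb. NLA_ALIGNTO is taken
to be 4, as defined in <linux/netlink.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NLA_ALIGNTO 4 /* Netlink attribute alignment */

/* Number of bytes needed to round len up to the next NLA_ALIGNTO boundary. */
static size_t nla_pad_len(size_t len)
{
	size_t mask = (size_t)NLA_ALIGNTO - 1;

	return ((len + mask) & ~mask) - len;
}

/* Hypothetical stand-in for memset(skb_put(skb, plen), 0, plen):
 * extend the used length of a flat buffer and zero the new bytes.
 * Returns the new used length. */
static size_t pad_with_zeros(unsigned char *buf, size_t len, size_t cap)
{
	size_t plen = nla_pad_len(len);

	assert(len + plen <= cap);
	memset(buf + len, 0, plen);
	return len + plen;
}
```

Zeroing matters because the padding bytes are copied to user space along
with the rest of the message.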
^ permalink raw reply
* [PATCH net-next v2] openvswitch: Pad OVS_PACKET_ATTR_PACKET if linear copy was performed
From: Thomas Graf @ 2014-01-14 16:27 UTC (permalink / raw)
To: Zoltan Kiss
Cc: Thomas Graf, Jesse Gross, David Miller, dev@openvswitch.org,
netdev
In-Reply-To: <20140114161911.GB24121@casper.infradead.org>
While the zerocopy method is correctly omitted if user space
does not support unaligned Netlink messages, the attribute is
still not padded correctly, as skb_zerocopy() will not ensure
padding and the attribute size is no longer precalculated
through nla_reserve(), which ensured padding previously.
This patch applies appropriate padding if a linear data copy
was performed in skb_zerocopy().
Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
v2: initialize padding to 0's
net/openvswitch/datapath.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index df46928..3ca9121 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -396,7 +396,7 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
.dst_sk = ovs_dp_get_net(dp)->genl_sock,
.snd_portid = upcall_info->portid,
};
- size_t len;
+ size_t len, plen;
unsigned int hlen;
int err, dp_ifindex;
@@ -466,6 +466,11 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
skb_zerocopy(user_skb, skb, skb->len, hlen);
+ /* Pad OVS_PACKET_ATTR_PACKET if linear copy was performed */
+ if (!(dp->user_features & OVS_DP_F_UNALIGNED) &&
+ (plen = (ALIGN(user_skb->len, NLA_ALIGNTO) - user_skb->len)) > 0)
+ memset(skb_put(user_skb, plen), 0, plen);
+
((struct nlmsghdr *) user_skb->data)->nlmsg_len = user_skb->len;
err = genlmsg_unicast(ovs_dp_get_net(dp), user_skb, upcall_info->portid);
^ permalink raw reply related
* Re: [PATCH net-next v2 1/2] dts: Add a binding for Synopsys emac max-frame-size
From: Vince Bridgers @ 2014-01-14 16:30 UTC (permalink / raw)
To: Florian Fainelli
Cc: Dinh Nguyen, devicetree, netdev, Giuseppe CAVALLARO, robh+dt,
pawel.moll, mark.rutland, ijc+devicetree, Kumar Gala,
Rayagond Kokatanur
In-Reply-To: <1890114.hPeOerbe4b@lenovo>
I'll address comments and resubmit.
Vince
On Tue, Jan 14, 2014 at 10:07 AM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> Le mardi 14 janvier 2014, 09:57:36 Dinh Nguyen a écrit :
>> On Tue, 2014-01-14 at 08:40 -0600, Vince Bridgers wrote:
>> > This change adds a parameter for the Synopsys 10/100/1000
>> > stmmac Ethernet driver to configure the maximum frame
>> > size supported by the EMAC driver. Synopsys allows the FIFO
>> > sizes to be configured when the cores are built for a particular
>> > device, but do not provide a way for the driver to read
>> > information from the device about the maximum MTU size
>> > supported as limited by the device's FIFO size.
>> >
>> > Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
>> > ---
>> >
>> > Documentation/devicetree/bindings/net/stmmac.txt | 5 +++++
>> > 1 file changed, 5 insertions(+)
>> >
>> > diff --git a/Documentation/devicetree/bindings/net/stmmac.txt
>> > b/Documentation/devicetree/bindings/net/stmmac.txt index eba0e5e..26a0ba9
>> > 100644
>> > --- a/Documentation/devicetree/bindings/net/stmmac.txt
>> > +++ b/Documentation/devicetree/bindings/net/stmmac.txt
>> >
>> > @@ -30,6 +30,10 @@ Required properties:
>> > Optional properties:
>> > - mac-address: 6 bytes, mac address
>> >
>> > +- snps,max-frame-size: Maximum frame size permitted. This parameter is
>> > useful
>> I don't think max-frame-size should be a snps-only binding.
>
> Right, and I already made that comment to Vince. Maybe this is just an
> oversight and mistakenly resubmitted v2 instead of v3?
> --
> Florian
^ permalink raw reply
* [PATCH net-next 1/2] net: add sysfs helpers for netdev_adjacent logic
From: Veaceslav Falico @ 2014-01-14 16:35 UTC (permalink / raw)
To: netdev
Cc: Veaceslav Falico, Ding Tianhong, David S. Miller, Eric Dumazet,
Nicolas Dichtel, Cong Wang
In-Reply-To: <1389717360-13920-1-git-send-email-vfalico@redhat.com>
Add netdev_adjacent_sysfs_add() and netdev_adjacent_sysfs_del() helpers.
They clean up the code a bit and can be reused by later patches.
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
net/core/dev.c | 57 ++++++++++++++++++++++++++++++---------------------------
1 file changed, 30 insertions(+), 27 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 87312dc..c578d4e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4595,13 +4595,36 @@ struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev)
}
EXPORT_SYMBOL(netdev_master_upper_dev_get_rcu);
+int netdev_adjacent_sysfs_add(struct net_device *dev,
+ struct net_device *adj_dev,
+ struct list_head *dev_list)
+{
+ char linkname[IFNAMSIZ+7];
+ sprintf(linkname, dev_list == &dev->adj_list.upper ?
+ "upper_%s" : "lower_%s", adj_dev->name);
+ return sysfs_create_link(&(dev->dev.kobj), &(adj_dev->dev.kobj),
+ linkname);
+}
+void netdev_adjacent_sysfs_del(struct net_device *dev,
+ char *name,
+ struct list_head *dev_list)
+{
+ char linkname[IFNAMSIZ+7];
+ sprintf(linkname, dev_list == &dev->adj_list.upper ?
+ "upper_%s" : "lower_%s", name);
+ sysfs_remove_link(&(dev->dev.kobj), linkname);
+}
+
+#define netdev_adjacent_is_neigh_list(dev, dev_list) \
+ (dev_list == &dev->adj_list.upper || \
+ dev_list == &dev->adj_list.lower)
+
static int __netdev_adjacent_dev_insert(struct net_device *dev,
struct net_device *adj_dev,
struct list_head *dev_list,
void *private, bool master)
{
struct netdev_adjacent *adj;
- char linkname[IFNAMSIZ+7];
int ret;
adj = __netdev_find_adj(dev, adj_dev, dev_list);
@@ -4624,16 +4647,8 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
pr_debug("dev_hold for %s, because of link added from %s to %s\n",
adj_dev->name, dev->name, adj_dev->name);
- if (dev_list == &dev->adj_list.lower) {
- sprintf(linkname, "lower_%s", adj_dev->name);
- ret = sysfs_create_link(&(dev->dev.kobj),
- &(adj_dev->dev.kobj), linkname);
- if (ret)
- goto free_adj;
- } else if (dev_list == &dev->adj_list.upper) {
- sprintf(linkname, "upper_%s", adj_dev->name);
- ret = sysfs_create_link(&(dev->dev.kobj),
- &(adj_dev->dev.kobj), linkname);
+ if (netdev_adjacent_is_neigh_list(dev, dev_list)) {
+ ret = netdev_adjacent_sysfs_add(dev, adj_dev, dev_list);
if (ret)
goto free_adj;
}
@@ -4653,14 +4668,8 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
return 0;
remove_symlinks:
- if (dev_list == &dev->adj_list.lower) {
- sprintf(linkname, "lower_%s", adj_dev->name);
- sysfs_remove_link(&(dev->dev.kobj), linkname);
- } else if (dev_list == &dev->adj_list.upper) {
- sprintf(linkname, "upper_%s", adj_dev->name);
- sysfs_remove_link(&(dev->dev.kobj), linkname);
- }
-
+ if (netdev_adjacent_is_neigh_list(dev, dev_list))
+ netdev_adjacent_sysfs_del(dev, adj_dev->name, dev_list);
free_adj:
kfree(adj);
dev_put(adj_dev);
@@ -4673,7 +4682,6 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
struct list_head *dev_list)
{
struct netdev_adjacent *adj;
- char linkname[IFNAMSIZ+7];
adj = __netdev_find_adj(dev, adj_dev, dev_list);
@@ -4693,13 +4701,8 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
if (adj->master)
sysfs_remove_link(&(dev->dev.kobj), "master");
- if (dev_list == &dev->adj_list.lower) {
- sprintf(linkname, "lower_%s", adj_dev->name);
- sysfs_remove_link(&(dev->dev.kobj), linkname);
- } else if (dev_list == &dev->adj_list.upper) {
- sprintf(linkname, "upper_%s", adj_dev->name);
- sysfs_remove_link(&(dev->dev.kobj), linkname);
- }
+ if (netdev_adjacent_is_neigh_list(dev, dev_list))
+ netdev_adjacent_sysfs_del(dev, adj_dev->name, dev_list);
list_del_rcu(&adj->list);
pr_debug("dev_put for %s, because link removed from %s to %s\n",
--
1.8.4
^ permalink raw reply related
* [PATCH net-next 2/2] net: rename sysfs symlinks on device name change
From: Veaceslav Falico @ 2014-01-14 16:36 UTC (permalink / raw)
To: netdev
Cc: Veaceslav Falico, Ding Tianhong, David S. Miller, Eric Dumazet,
Nicolas Dichtel, Cong Wang
In-Reply-To: <1389717360-13920-1-git-send-email-vfalico@redhat.com>
Currently, we don't rename the upper/lower_ifc symlinks in
/sys/class/net/*/, which might result in stale/duplicate links/names.
Fix this by adding netdev_adjacent_rename_links(dev, oldname), which renames
all the upper/lower interfaces' links to dev from upper/lower_oldname
to the new name.
We don't need a rollback because only we control these symlinks, and if we
fail to rename them, sysfs will complain anyway.
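A small user-space sketch of the naming scheme involved, assuming the
"upper_<name>" / "lower_<name>" convention used by the sysfs helpers.
adj_linkname() is a hypothetical function for illustration, not a kernel
API; a rename then amounts to removing the link built from the old name
and adding the one built from the new name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ 16 /* interface name buffer size, as in <linux/if.h> */

/* Hypothetical analogue of the link-name formatting: writes
 * "upper_<name>" or "lower_<name>" into buf and returns its length. */
static int adj_linkname(char *buf, size_t size, int is_upper, const char *name)
{
	assert(buf != NULL && name != NULL);
	assert(strlen(name) < IFNAMSIZ);
	return snprintf(buf, size, "%s_%s", is_upper ? "upper" : "lower", name);
}
```

A buffer of IFNAMSIZ + 7 bytes, as used in the helpers, is enough for the
six-character prefix, the name and the terminating NUL.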
Reported-by: Ding Tianhong <dingtianhong@huawei.com>
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
include/linux/netdevice.h | 1 +
net/core/dev.c | 23 +++++++++++++++++++++++
2 files changed, 24 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a2a70cc..61f8338 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2937,6 +2937,7 @@ int netdev_master_upper_dev_link_private(struct net_device *dev,
void *private);
void netdev_upper_dev_unlink(struct net_device *dev,
struct net_device *upper_dev);
+void netdev_adjacent_rename_links(struct net_device *dev, char *oldname);
void *netdev_lower_dev_get_private(struct net_device *dev,
struct net_device *lower_dev);
int skb_checksum_help(struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index c578d4e..5bf0950 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1117,6 +1117,8 @@ rollback:
write_seqcount_end(&devnet_rename_seq);
+ netdev_adjacent_rename_links(dev, oldname);
+
write_lock_bh(&dev_base_lock);
hlist_del_rcu(&dev->name_hlist);
write_unlock_bh(&dev_base_lock);
@@ -1136,6 +1138,7 @@ rollback:
err = ret;
write_seqcount_begin(&devnet_rename_seq);
memcpy(dev->name, oldname, IFNAMSIZ);
+ memcpy(oldname, newname, IFNAMSIZ);
goto rollback;
} else {
pr_err("%s: name change rollback failed: %d\n",
@@ -4971,6 +4974,26 @@ void netdev_upper_dev_unlink(struct net_device *dev,
}
EXPORT_SYMBOL(netdev_upper_dev_unlink);
+void netdev_adjacent_rename_links(struct net_device *dev, char *oldname)
+{
+ struct netdev_adjacent *iter;
+
+ list_for_each_entry(iter, &dev->adj_list.upper, list) {
+ netdev_adjacent_sysfs_del(iter->dev, oldname,
+ &iter->dev->adj_list.lower);
+ netdev_adjacent_sysfs_add(iter->dev, dev,
+ &iter->dev->adj_list.lower);
+ }
+
+ list_for_each_entry(iter, &dev->adj_list.lower, list) {
+ netdev_adjacent_sysfs_del(iter->dev, oldname,
+ &iter->dev->adj_list.upper);
+ netdev_adjacent_sysfs_add(iter->dev, dev,
+ &iter->dev->adj_list.upper);
+ }
+}
+EXPORT_SYMBOL(netdev_adjacent_rename_links);
+
void *netdev_lower_dev_get_private(struct net_device *dev,
struct net_device *lower_dev)
{
--
1.8.4
^ permalink raw reply related
* [PATCH net-next 0/2] net: rename device's sysfs symlinks on name change
From: Veaceslav Falico @ 2014-01-14 16:35 UTC (permalink / raw)
To: netdev
Cc: Ding Tianhong, David S. Miller, Eric Dumazet, Nicolas Dichtel,
Cong Wang, Veaceslav Falico
The first patch only adds helper functions and cleans up the code a bit; the
second one does the actual renaming.
Reported-by: Ding Tianhong <dingtianhong@huawei.com>
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
CC: Cong Wang <amwang@redhat.com>
CC: netdev@vger.kernel.org
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
include/linux/netdevice.h | 1 +
net/core/dev.c | 80 +++++++++++++++++++++++++++++++----------------
2 files changed, 54 insertions(+), 27 deletions(-)
^ permalink raw reply
* Re: [PATCH net-next 06/10] vxlan: use __dev_get_by_index instead of dev_get_by_index to find interface
From: Stephen Hemminger @ 2014-01-14 16:51 UTC (permalink / raw)
To: Ying Xue
Cc: davem, vfalico, john.r.fastabend, antonio, dmitry.tarnyagin,
socketcan, johannes, netdev, linux-kernel
In-Reply-To: <1389685269-18600-7-git-send-email-ying.xue@windriver.com>
On Tue, 14 Jan 2014 15:41:05 +0800
Ying Xue <ying.xue@windriver.com> wrote:
> The following call chains indicate that vxlan_fdb_parse() is
> under rtnl_lock protection. So if we use __dev_get_by_index()
> instead of dev_get_by_index() to find interface handler in it,
> this would help us avoid to change interface reference counter.
>
> rtnetlink_rcv()
> rtnl_lock()
> netlink_rcv_skb()
> rtnl_fdb_add()
> vxlan_fdb_add()
> vxlan_fdb_parse()
> rtnl_unlock()
>
> rtnetlink_rcv()
> rtnl_lock()
> netlink_rcv_skb()
> rtnl_fdb_del()
> vxlan_fdb_del()
> vxlan_fdb_parse()
> rtnl_unlock()
>
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
^ permalink raw reply
* [RFC] sysfs_rename_link() and its usage
From: Veaceslav Falico @ 2014-01-14 17:17 UTC (permalink / raw)
To: linux-kernel; +Cc: netdev, gregkh, ebiederm
Hi,
I'm hitting a strange issue and/or I'm completely lost in sysfs internals.
Consider having two net_device *a, *b; which are registered normally.
Now, to create a link from /sys/class/net/a->name/linkname to b, one should
use:
sysfs_create_link(&(a->dev.kobj), &(b->dev.kobj), linkname);
To remove it, even simpler:
sysfs_remove_link(&(a->dev.kobj), linkname);
This works like a charm. However, if I want to use (obviously, with the
symlink present):
sysfs_rename_link(&(a->dev.kobj), &(b->dev.kobj), oldname, newname);
this fails with:
"sysfs: ns invalid in 'a->name' for 'oldname'"
in
struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
...
	if (!!sysfs_ns_type(parent_sd) != !!ns) {
		WARN(1, KERN_WARNING "sysfs: ns %s in '%s' for '%s'\n",
			sysfs_ns_type(parent_sd) ? "required" : "invalid",
			parent_sd->s_name, name);
		return NULL;
	}
Code path:
warn_slowpath_fmt+0x46/0x50
sysfs_get_dirent_ns+0x30/0x80
sysfs_find_dirent+0x84/0x110
sysfs_get_dirent_ns+0x3e/0x80
sysfs_rename_link_ns+0x54/0xd0
I have no idea what this code means. Is there any reason for it to
fail (i.e. am I doing something wrong?), or have I hit a bug?
I've tested the only user of it (bridge) - and it works fine, however it's
not using its own net_device's kobject but rather its own dir.
Thank you!
^ permalink raw reply
* [PATCH net-next v3 0/2] stmmac: fix kernel crashes for jumbo frames
From: Vince Bridgers @ 2014-01-14 17:17 UTC (permalink / raw)
To: devicetree, netdev
Cc: peppe.cavallaro, robh+dt, pawel.moll, mark.rutland,
ijc+devicetree, galak, dinguyen, rayagond, vbridgers2013
v3:
* change snps,max-frame-size to max-frame-size
v2:
* change snps,max-mtu to snps,max-frame-size
These patches address two kernel crashes seen when using jumbo frames on
the Synopsys stmmac driver, and add device tree configurability for the
maximum mtu. The Synopsys emac fifo sizes can be configured when a logic
design is synthesized, but the hardware does not provide a way for a driver
to query the exact fifo size.
The crashes seen were due to two issues.
1) The dma buffer size was being set after the dma buffers were allocated.
This caused a crash when changing the mtu since it was possible the buffers
would subsequently be freed using an incorrect dma buffer size. This could
also cause kernel panics due to memory corruption since a large mtu size could
have been configured, but the dma buffers were not sized accordingly.
2) Jumbo frames were being enabled by default, but the dma buffers were not
sized accordingly. This caused memory corruption in the context of certain
types of network traffic, leading to kernel panics.
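To make the sizing problem concrete, here is a minimal user-space sketch of
choosing a receive buffer size from the MTU. Everything in it is an
assumption for illustration: pick_dma_buf_size(), the tier values and the
22-byte overhead are hypothetical and are not taken from the stmmac driver.

```c
#include <assert.h>

/* Hypothetical buffer-size tiers; the driver's own constants may differ. */
enum { BUF_2K = 2048, BUF_4K = 4096, BUF_8K = 8192, BUF_16K = 16384 };

/* Pick the smallest tier that can hold a full frame for the given MTU. */
static int pick_dma_buf_size(int mtu)
{
	/* Assumed overhead: Ethernet header (14) + VLAN tag (4) + FCS (4). */
	const int overhead = 22;
	int need;

	assert(mtu > 0);
	need = mtu + overhead;

	if (need <= BUF_2K)
		return BUF_2K;
	if (need <= BUF_4K)
		return BUF_4K;
	if (need <= BUF_8K)
		return BUF_8K;
	return BUF_16K;
}
```

The point of both fixes is ordering and consistency: the size has to be
decided and recorded before any buffer is allocated, and the same value has
to be used when the buffers are later unmapped and freed.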
I've tested these changes using automated, reproducible testware. I can
demonstrate the panics described before the fixes and show that the fixes
address the problems described.
Testing and improvements continue through the use of the mentioned automated
and reproducible testware.
Vince Bridgers
Vince Bridgers (2):
dts: Add a binding for Synopsys emac max-frame-size
stmmac: Fix kernel crashes for jumbo frames
Documentation/devicetree/bindings/net/stmmac.txt | 5 +++++
drivers/net/ethernet/stmicro/stmmac/common.h | 4 +++-
drivers/net/ethernet/stmicro/stmmac/dwmac1000.h | 7 ++-----
.../net/ethernet/stmicro/stmmac/dwmac1000_core.c | 7 ++++++-
.../net/ethernet/stmicro/stmmac/dwmac100_core.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +++++++----
.../net/ethernet/stmicro/stmmac/stmmac_platform.c | 5 +++++
include/linux/stmmac.h | 1 +
8 files changed, 30 insertions(+), 12 deletions(-)
--
1.7.9.5
^ permalink raw reply
* [PATCH net-next v3 1/2] dts: Add a binding for Synopsys emac max-frame-size
From: Vince Bridgers @ 2014-01-14 17:17 UTC (permalink / raw)
To: devicetree, netdev
Cc: peppe.cavallaro, robh+dt, pawel.moll, mark.rutland,
ijc+devicetree, galak, dinguyen, rayagond, vbridgers2013
In-Reply-To: <1389719859-29071-1-git-send-email-vbridgers2013@gmail.com>
This change adds a parameter for the Synopsys 10/100/1000
stmmac Ethernet driver to configure the maximum frame
size supported by the EMAC driver. Synopsys allows the FIFO
sizes to be configured when the cores are built for a particular
device, but does not provide a way for the driver to read
information from the device about the maximum MTU size
supported as limited by the device's FIFO size.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
---
V3: change snps,max-frame-size to max-frame-size
V2: change snps,max-mtu to snps,max-frame-size
---
Documentation/devicetree/bindings/net/stmmac.txt | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/Documentation/devicetree/bindings/net/stmmac.txt b/Documentation/devicetree/bindings/net/stmmac.txt
index eba0e5e..323d310 100644
--- a/Documentation/devicetree/bindings/net/stmmac.txt
+++ b/Documentation/devicetree/bindings/net/stmmac.txt
@@ -30,6 +30,10 @@ Required properties:
Optional properties:
- mac-address: 6 bytes, mac address
+- max-frame-size: Maximum frame size permitted. This parameter is useful
+ since different implementations of the Synopsys MAC may
+ have different FIFO sizes depending on the selections
+ made in Synopsys Core Consultant.
Examples:
@@ -40,5 +44,6 @@ Examples:
interrupts = <24 23>;
interrupt-names = "macirq", "eth_wake_irq";
mac-address = [000000000000]; /* Filled in by U-Boot */
+ max-frame-size = <3800>;
phy-mode = "gmii";
};
--
1.7.9.5
^ permalink raw reply related
* [PATCH net-next v3 2/2] stmmac: Fix kernel crashes for jumbo frames
From: Vince Bridgers @ 2014-01-14 17:17 UTC (permalink / raw)
To: devicetree, netdev
Cc: peppe.cavallaro, robh+dt, pawel.moll, mark.rutland,
ijc+devicetree, galak, dinguyen, rayagond, vbridgers2013
In-Reply-To: <1389719859-29071-1-git-send-email-vbridgers2013@gmail.com>
These changes correct the following issues with jumbo frames on the
stmmac driver:
1) The Synopsys EMAC can be configured to support different FIFO
sizes at core configuration time. There's no way to query the
controller and know the FIFO size, so the driver needs to get this
information from the device tree in order to know how to correctly
handle MTU changes and setting up dma buffers. The default
max-frame-size is as currently used, which is the size of a jumbo
frame.
2) The driver was enabling Jumbo frames by default, but was not allocating
dma buffers of sufficient size to handle the maximum possible packet
size that could be received. This led to memory corruption since DMAs were
occurring beyond the extent of the allocated receive buffers for certain types
of network traffic.
kernel BUG at net/core/skbuff.c:126!
Internal error: Oops - BUG: 0 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 563 Comm: sockperf Not tainted 3.13.0-rc6-01523-gf7111b9 #31
task: ef35e580 ti: ef252000 task.ti: ef252000
PC is at skb_panic+0x60/0x64
LR is at skb_panic+0x60/0x64
pc : [<c03c7c3c>] lr : [<c03c7c3c>] psr: 60000113
sp : ef253c18 ip : 60000113 fp : 00000000
r10: ef3a5400 r9 : 00000ebc r8 : ef3a546c
r7 : ee59f000 r6 : ee59f084 r5 : ee59ff40 r4 : ee59f140
r3 : 000003e2 r2 : 00000007 r1 : c0b9c420 r0 : 0000007d
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c5387d Table: 2e8ac04a DAC: 00000015
Process sockperf (pid: 563, stack limit = 0xef252248)
Stack: (0xef253c18 to 0xef254000)
3c00: 00000ebc ee59f000
3c20: ee59f084 ee59ff40 ee59f140 c04a9cd8 ee8c50c0 00000ebc ee59ff40 00000000
3c40: ee59f140 c02d0ef0 00000056 ef1eda80 ee8c50c0 00000ebc 22bbef29 c0318f8c
3c60: 00000056 ef3a547c ffe2c716 c02c9c90 c0ba1298 ef3a5838 ef3a5838 ef3a5400
3c80: 000020c0 ee573840 000055cb ef3f2050 c053f0e0 c0319214 22b9b085 22d92813
3ca0: 00001c80 004b8e00 ef3a5400 ee573840 ef3f2064 22d92813 ef3f2064 000055cb
3cc0: ef3f2050 c031a19c ef252000 00000000 00000000 c0561bc0 00000000 ff00ffff
3ce0: c05621c0 ef3a5400 ef3f2064 ee573840 00000020 ef3f2064 000055cb ef3f2050
3d00: c053f0e0 c031cad0 c053e740 00000e60 00000000 00000000 ee573840 ef3a5400
3d20: ef0a6e00 00000000 ef3f2064 c032507c 00010000 00000020 c0561bc0 c0561bc0
3d40: ee599850 c032799c 00000000 ee573840 c055a380 ef3a5400 00000000 ef3f2064
3d60: ef3f2050 c032799c 0101c7c0 2b6755cb c059a280 c030e4d8 000055cb ffffffff
3d80: ee574fc0 c055a380 ee574000 ee573840 00002b67 ee573840 c03fe9c4 c053fa68
3da0: c055a380 00001f6f 00000000 ee573840 c053f0e0 c0304fdc ef0a6e01 ef3f2050
3dc0: ee573858 ef031000 ee573840 c03055d8 c0ba0c40 ef000f40 00100100 c053f0dc
3de0: c053ffdc c053f0f0 00000008 00000000 ef031000 c02da948 00001140 00000000
3e00: c0563c78 ef253e5f 00000020 ee573840 00000020 c053f0f0 ef313400 ee573840
3e20: c053f0e0 00000000 00000000 c05380c0 ef313400 00001000 00000015 c02df280
3e40: ee574000 ef001e00 00000000 00001080 00000042 005cd980 ef031500 ef031500
3e60: 00000000 c02df824 ef031500 c053e390 c0541084 f00b1e00 c05925e8 c02df864
3e80: 00001f5c ef031440 c053e390 c0278524 00000002 00000000 c0b9eb48 c02df280
3ea0: ee8c7180 00000100 c0542ca8 00000015 00000040 ef031500 ef031500 ef031500
3ec0: c027803c ef252000 00000040 000000ec c05380c0 c0b9eb40 c0b9eb48 c02df940
3ee0: ef060780 ffffa4dd c0564a9c c056343c 002e80a8 00000080 ef031500 00000001
3f00: c053808c ef252000 fffec100 00000003 00000004 002e80a8 0000000c c00258f0
3f20: 002e80a8 c005e704 00000005 00000100 c05634d0 c0538080 c05333e0 00000000
3f40: 0000000a c0565580 c05380c0 ffffa4dc c05434f4 00400100 00000004 c0534cd4
3f60: 00000098 00000000 fffec100 002e80a8 00000004 002e80a8 002a20e0 c0025da8
3f80: c0534cd4 c000f020 fffec10c c053ea60 ef253fb0 c0008530 0000ffe2 b6ef67f4
3fa0: 40000010 ffffffff 00000124 c0012f3c 0000ffe2 002e80f0 0000ffe2 00004000
3fc0: becb6338 becb6334 00000004 00000124 002e80a8 00000004 002e80a8 002a20e0
3fe0: becb6300 becb62f4 002773bb b6ef67f4 40000010 ffffffff 00000000 00000000
[<c03c7c3c>] (skb_panic+0x60/0x64) from [<c02d0ef0>] (skb_put+0x4c/0x50)
[<c02d0ef0>] (skb_put+0x4c/0x50) from [<c0318f8c>] (tcp_collapse+0x314/0x3ec)
[<c0318f8c>] (tcp_collapse+0x314/0x3ec) from [<c0319214>]
(tcp_try_rmem_schedule+0x1b0/0x3c4)
[<c0319214>] (tcp_try_rmem_schedule+0x1b0/0x3c4) from [<c031a19c>]
(tcp_data_queue+0x480/0xe6c)
[<c031a19c>] (tcp_data_queue+0x480/0xe6c) from [<c031cad0>]
(tcp_rcv_established+0x180/0x62c)
[<c031cad0>] (tcp_rcv_established+0x180/0x62c) from [<c032507c>]
(tcp_v4_do_rcv+0x13c/0x31c)
[<c032507c>] (tcp_v4_do_rcv+0x13c/0x31c) from [<c032799c>]
(tcp_v4_rcv+0x718/0x73c)
[<c032799c>] (tcp_v4_rcv+0x718/0x73c) from [<c0304fdc>]
(ip_local_deliver+0x98/0x274)
[<c0304fdc>] (ip_local_deliver+0x98/0x274) from [<c03055d8>]
(ip_rcv+0x420/0x758)
[<c03055d8>] (ip_rcv+0x420/0x758) from [<c02da948>]
(__netif_receive_skb_core+0x44c/0x5bc)
[<c02da948>] (__netif_receive_skb_core+0x44c/0x5bc) from [<c02df280>]
(netif_receive_skb+0x48/0xb4)
[<c02df280>] (netif_receive_skb+0x48/0xb4) from [<c02df824>]
(napi_gro_flush+0x70/0x94)
[<c02df824>] (napi_gro_flush+0x70/0x94) from [<c02df864>]
(napi_complete+0x1c/0x34)
[<c02df864>] (napi_complete+0x1c/0x34) from [<c0278524>]
(stmmac_poll+0x4e8/0x5c8)
[<c0278524>] (stmmac_poll+0x4e8/0x5c8) from [<c02df940>]
(net_rx_action+0xc4/0x1e4)
[<c02df940>] (net_rx_action+0xc4/0x1e4) from [<c00258f0>]
(__do_softirq+0x12c/0x2e8)
[<c00258f0>] (__do_softirq+0x12c/0x2e8) from [<c0025da8>] (irq_exit+0x78/0xac)
[<c0025da8>] (irq_exit+0x78/0xac) from [<c000f020>] (handle_IRQ+0x44/0x90)
[<c000f020>] (handle_IRQ+0x44/0x90) from [<c0008530>]
(gic_handle_irq+0x2c/0x5c)
[<c0008530>] (gic_handle_irq+0x2c/0x5c) from [<c0012f3c>]
(__irq_usr+0x3c/0x60)
3) The driver was setting the dma buffer size after allocating dma buffers,
which caused a system panic when changing the MTU.
BUG: Bad page state in process ifconfig pfn:2e850
page:c0b72a00 count:0 mapcount:0 mapping: (null) index:0x0
page flags: 0x200(arch_1)
Modules linked in:
CPU: 0 PID: 566 Comm: ifconfig Not tainted 3.13.0-rc6-01523-gf7111b9 #29
[<c001547c>] (unwind_backtrace+0x0/0xf8) from [<c00122dc>]
(show_stack+0x10/0x14)
[<c00122dc>] (show_stack+0x10/0x14) from [<c03c793c>] (dump_stack+0x70/0x88)
[<c03c793c>] (dump_stack+0x70/0x88) from [<c00b2620>] (bad_page+0xc8/0x118)
[<c00b2620>] (bad_page+0xc8/0x118) from [<c00b302c>]
(get_page_from_freelist+0x744/0x870)
[<c00b302c>] (get_page_from_freelist+0x744/0x870) from [<c00b40f4>]
(__alloc_pages_nodemask+0x118/0x86c)
[<c00b40f4>] (__alloc_pages_nodemask+0x118/0x86c) from [<c00b4858>]
(__get_free_pages+0x10/0x54)
[<c00b4858>] (__get_free_pages+0x10/0x54) from [<c00cba1c>]
(kmalloc_order_trace+0x24/0xa0)
[<c00cba1c>] (kmalloc_order_trace+0x24/0xa0) from [<c02d199c>]
(__kmalloc_reserve.isra.21+0x24/0x70)
[<c02d199c>] (__kmalloc_reserve.isra.21+0x24/0x70) from [<c02d240c>]
(__alloc_skb+0x68/0x13c)
[<c02d240c>] (__alloc_skb+0x68/0x13c) from [<c02d3930>]
(__netdev_alloc_skb+0x3c/0xe8)
[<c02d3930>] (__netdev_alloc_skb+0x3c/0xe8) from [<c0279378>]
(stmmac_open+0x63c/0x1024)
[<c0279378>] (stmmac_open+0x63c/0x1024) from [<c02e18cc>]
(__dev_open+0xa0/0xfc)
[<c02e18cc>] (__dev_open+0xa0/0xfc) from [<c02e1b40>]
(__dev_change_flags+0x94/0x158)
[<c02e1b40>] (__dev_change_flags+0x94/0x158) from [<c02e1c24>]
(dev_change_flags+0x18/0x48)
[<c02e1c24>] (dev_change_flags+0x18/0x48) from [<c0337bc0>]
(devinet_ioctl+0x638/0x700)
[<c0337bc0>] (devinet_ioctl+0x638/0x700) from [<c02c7aec>]
(sock_ioctl+0x64/0x290)
[<c02c7aec>] (sock_ioctl+0x64/0x290) from [<c0100890>]
(do_vfs_ioctl+0x78/0x5b8)
[<c0100890>] (do_vfs_ioctl+0x78/0x5b8) from [<c0100e0c>] (SyS_ioctl+0x3c/0x5c)
[<c0100e0c>] (SyS_ioctl+0x3c/0x5c) from [<c000e760>]
The fixes have been verified using reproducible, automated testing.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
---
V3: change snps,max-frame-size to max-frame-size
V2: change snps,max-mtu to snps,max-frame-size
---
drivers/net/ethernet/stmicro/stmmac/common.h | 4 +++-
drivers/net/ethernet/stmicro/stmmac/dwmac1000.h | 7 ++-----
.../net/ethernet/stmicro/stmmac/dwmac1000_core.c | 7 ++++++-
.../net/ethernet/stmicro/stmmac/dwmac100_core.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +++++++----
.../net/ethernet/stmicro/stmmac/stmmac_platform.c | 5 +++++
include/linux/stmmac.h | 1 +
7 files changed, 25 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index fc94f20..97bfb6b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -293,6 +293,8 @@ struct dma_features {
#define STMMAC_CHAIN_MODE 0x1
#define STMMAC_RING_MODE 0x2
+#define JUMBO_LEN 9000
+
struct stmmac_desc_ops {
/* DMA RX descriptor ring initialization */
void (*init_rx_desc) (struct dma_desc *p, int disable_rx_ic, int mode,
@@ -369,7 +371,7 @@ struct stmmac_dma_ops {
struct stmmac_ops {
/* MAC core initialization */
- void (*core_init) (void __iomem *ioaddr);
+ void (*core_init) (void __iomem *ioaddr, int mtu);
/* Enable and verify that the IPC module is supported */
int (*rx_ipc) (void __iomem *ioaddr);
/* Dump MAC registers */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index c12aabb..f37d90f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -126,11 +126,8 @@ enum power_event {
#define GMAC_ANE_PSE (3 << 7)
#define GMAC_ANE_PSE_SHIFT 7
- /* GMAC Configuration defines */
-#define GMAC_CONTROL_TC 0x01000000 /* Transmit Conf. in RGMII/SGMII */
-#define GMAC_CONTROL_WD 0x00800000 /* Disable Watchdog on receive */
-
/* GMAC Configuration defines */
+#define GMAC_CONTROL_2K 0x08000000 /* IEEE 802.3as 2K packets */
#define GMAC_CONTROL_TC 0x01000000 /* Transmit Conf. in RGMII/SGMII */
#define GMAC_CONTROL_WD 0x00800000 /* Disable Watchdog on receive */
#define GMAC_CONTROL_JD 0x00400000 /* Jabber disable */
@@ -156,7 +153,7 @@ enum inter_frame_gap {
#define GMAC_CONTROL_RE 0x00000004 /* Receiver Enable */
#define GMAC_CORE_INIT (GMAC_CONTROL_JD | GMAC_CONTROL_PS | GMAC_CONTROL_ACS | \
- GMAC_CONTROL_JE | GMAC_CONTROL_BE)
+ GMAC_CONTROL_BE)
/* GMAC Frame Filter defines */
#define GMAC_FRAME_FILTER_PR 0x00000001 /* Promiscuous Mode */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
index cdd9268..b3e148e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
@@ -32,10 +32,15 @@
#include <asm/io.h>
#include "dwmac1000.h"
-static void dwmac1000_core_init(void __iomem *ioaddr)
+static void dwmac1000_core_init(void __iomem *ioaddr, int mtu)
{
u32 value = readl(ioaddr + GMAC_CONTROL);
value |= GMAC_CORE_INIT;
+ if (mtu > 1500)
+ value |= GMAC_CONTROL_2K;
+ if (mtu > 2000)
+ value |= GMAC_CONTROL_JE;
+
writel(value, ioaddr + GMAC_CONTROL);
/* Mask GMAC interrupts */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
index 5857d67..2ff767b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
@@ -32,7 +32,7 @@
#include <asm/io.h>
#include "dwmac100.h"
-static void dwmac100_core_init(void __iomem *ioaddr)
+static void dwmac100_core_init(void __iomem *ioaddr, int mtu)
{
u32 value = readl(ioaddr + MAC_CONTROL);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b8e3a4c..4f5dfd7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -52,7 +52,6 @@
#include "stmmac.h"
#define STMMAC_ALIGN(x) L1_CACHE_ALIGN(x)
-#define JUMBO_LEN 9000
/* Module parameters */
#define TX_TIMEO 5000
@@ -91,7 +90,7 @@ static int tc = TC_DEFAULT;
module_param(tc, int, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(tc, "DMA threshold control value");
-#define DMA_BUFFER_SIZE BUF_SIZE_2KiB
+#define DMA_BUFFER_SIZE BUF_SIZE_4KiB
static int buf_sz = DMA_BUFFER_SIZE;
module_param(buf_sz, int, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(buf_sz, "DMA buffer size");
@@ -990,6 +989,8 @@ static int init_dma_desc_rings(struct net_device *dev)
if (bfsize < BUF_SIZE_16KiB)
bfsize = stmmac_set_bfsize(dev->mtu, priv->dma_buf_sz);
+ priv->dma_buf_sz = bfsize;
+
if (netif_msg_probe(priv))
pr_debug("%s: txsize %d, rxsize %d, bfsize %d\n", __func__,
txsize, rxsize, bfsize);
@@ -1079,7 +1080,6 @@ static int init_dma_desc_rings(struct net_device *dev)
}
priv->cur_rx = 0;
priv->dirty_rx = (unsigned int)(i - rxsize);
- priv->dma_buf_sz = bfsize;
buf_sz = bfsize;
/* Setup the chained descriptor addresses */
@@ -1642,7 +1642,7 @@ static int stmmac_open(struct net_device *dev)
priv->plat->bus_setup(priv->ioaddr);
/* Initialize the MAC Core */
- priv->hw->mac->core_init(priv->ioaddr);
+ priv->hw->mac->core_init(priv->ioaddr, dev->mtu);
/* Request the IRQ lines */
ret = request_irq(dev->irq, stmmac_interrupt,
@@ -2229,6 +2229,9 @@ static int stmmac_change_mtu(struct net_device *dev, int new_mtu)
else
max_mtu = SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN);
+ if (priv->plat->maxmtu < max_mtu)
+ max_mtu = priv->plat->maxmtu;
+
if ((new_mtu < 46) || (new_mtu > max_mtu)) {
pr_err("%s: invalid MTU, max MTU is: %d\n", dev->name, max_mtu);
return -EINVAL;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 38bd1f4..160bc18 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -51,6 +51,10 @@ static int stmmac_probe_config_dt(struct platform_device *pdev,
plat->mdio_bus_data = devm_kzalloc(&pdev->dev,
sizeof(struct stmmac_mdio_bus_data),
GFP_KERNEL);
+ /* Set the maxmtu to a default of JUMBO_LEN in case the
+ * parameter is not present in the device tree
+ */
+ plat->maxmtu = JUMBO_LEN;
/*
* Currently only the properties needed on SPEAr600
@@ -60,6 +64,7 @@ static int stmmac_probe_config_dt(struct platform_device *pdev,
if (of_device_is_compatible(np, "st,spear600-gmac") ||
of_device_is_compatible(np, "snps,dwmac-3.70a") ||
of_device_is_compatible(np, "snps,dwmac")) {
+ of_property_read_u32(np, "max-frame-size", &plat->maxmtu);
plat->has_gmac = 1;
plat->pmt = 1;
}
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index bb5deb0..9689706 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -110,6 +110,7 @@ struct plat_stmmacenet_data {
int force_sf_dma_mode;
int force_thresh_dma_mode;
int riwt_off;
+ int maxmtu;
void (*fix_mac_speed)(void *priv, unsigned int speed);
void (*bus_setup)(void __iomem *ioaddr);
int (*init)(struct platform_device *pdev);
--
1.7.9.5
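
For readers following the patch above, the MTU handling it introduces can be sketched in isolation roughly as follows. This is a standalone illustration, not driver code: the helper names (gmac_mtu_bits, clamp_max_mtu, mtu_is_valid) are hypothetical, and the GMAC_CONTROL_JE value is an assumption because the diff does not show it. Only GMAC_CONTROL_2K and the 1500/2000/46 thresholds are taken from the patch.

```c
#include <stdint.h>

/* GMAC_CONTROL_2K is the value added by the patch. GMAC_CONTROL_JE is not
 * shown in the diff, so the value below is an assumption for illustration.
 */
#define GMAC_CONTROL_2K 0x08000000u /* IEEE 802.3as 2K packets */
#define GMAC_CONTROL_JE 0x00100000u /* Jumbo frame enable (assumed value) */

/* Hypothetical helper: the extra bits the patched dwmac1000_core_init()
 * ORs into GMAC_CONTROL, on top of GMAC_CORE_INIT, for a given MTU.
 */
static uint32_t gmac_mtu_bits(int mtu)
{
	uint32_t bits = 0;

	if (mtu > 1500)
		bits |= GMAC_CONTROL_2K;
	if (mtu > 2000)
		bits |= GMAC_CONTROL_JE;

	return bits;
}

/* Hypothetical helper: mirrors the clamp added to stmmac_change_mtu(),
 * where the platform limit may only lower the driver's own maximum.
 */
static int clamp_max_mtu(int driver_max, int plat_maxmtu)
{
	return plat_maxmtu < driver_max ? plat_maxmtu : driver_max;
}

/* Hypothetical helper: the range check that follows the clamp. */
static int mtu_is_valid(int new_mtu, int max_mtu)
{
	return new_mtu >= 46 && new_mtu <= max_mtu;
}
```

Under these assumptions, an MTU of 1500 sets neither bit, 1501 through 2000 sets only the 2K bit, and anything larger sets both. A platform that supplies a "max-frame-size" smaller than the driver's limit would have larger requests rejected by the range check.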
^ permalink raw reply related
* RE: [PATCH] usbnet: Fix dma setup for fragmented packets that need a pad byte appended.
From: David Laight @ 2014-01-14 17:28 UTC (permalink / raw)
To: 'Bjørn Mork'; +Cc: 'Eric Dumazet', netdev
In-Reply-To: <87vbxm7eey.fsf@nemi.mork.no>
From: Bjørn Mork [mailto:bjorn@mork.no]
> David Laight <David.Laight@ACULAB.COM> writes:
>
> > I couldn't find the original patch anywhere!
> > I did do quite a lot of looking as well - if I'd found it I'd
> > have done something else.
>
> You obviously haven't looked in the one place you are supposed to always
> look before submitting *anything*:
> https://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=fdc3452cd2c7b2bfe0f378f92123f4f9a98fa2bd
>
> I am not impressed...
Without the commit id that is rather hard.
Does anything even get there without manual action?
USB patches sit in a hidden limbo for ages.
I looked through my mailboxes and searched for the thread - but only
found the messages that included the one that said you'd send a patch
soon.
What I still don't understand is how I found the patch before!
The fact that work makes me use Outlook makes searching mail locally
almost impossible.
I might have to subscribe at home as well.
David
^ permalink raw reply
* Re: [PATCH v2 net-next] bonding: handle slave's name change with primary_slave logic
From: Jay Vosburgh @ 2014-01-14 17:32 UTC (permalink / raw)
To: Veaceslav Falico; +Cc: netdev, Ding Tianhong, Andy Gospodarek
In-Reply-To: <1389700179-12723-1-git-send-email-vfalico@redhat.com>
Veaceslav Falico <vfalico@redhat.com> wrote:
>Currently, if a slave's name changes, we just pass it by. However, if the
>slave is the current primary_slave, then we end up using a slave whose
>name != params.primary as primary_slave. And vice versa, if we don't have
>a primary_slave but have params.primary set, we will not detect a new
>primary_slave.
>
>Fix this by catching the NETDEV_CHANGENAME event and setting primary_slave
>accordingly. Also, if the primary_slave was changed, issue a reselection of
>the active slave, because the priorities have changed.
>
>Reported-by: Ding Tianhong <dingtianhong@huawei.com>
>CC: Ding Tianhong <dingtianhong@huawei.com>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>---
> drivers/net/bonding/bond_main.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index e06c445..8077199 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2860,9 +2860,26 @@ static int bond_slave_netdev_event(unsigned long event,
> */
> break;
> case NETDEV_CHANGENAME:
>- /*
>- * TODO: handle changing the primary's name
>- */
>+ /* we don't care if we don't have primary set */
>+ if (!USES_PRIMARY(bond->params.mode) ||
>+ !bond->params.primary[0])
>+ break;
>+
>+ if (slave == bond->primary_slave) {
>+ /* slave's name changed - he's no longer primary */
>+ bond->primary_slave = NULL;
>+ } else if (!strcmp(slave_dev->name, bond->params.primary)) {
>+ /* we have a new primary slave */
>+ bond->primary_slave = slave;
>+ } else /* we didn't change primary - exit */
>+ break;
>+
>+ pr_info("%s: Primary slave changed to %s, re-electing.\n",
I suspect you mean "reselecting" here, not "re-electing." I'd
add a couple more words, e.g., "reselecting active slave" to make it
clearer.
-J
>+ bond->dev->name, bond->primary_slave ? slave_dev->name :
>+ "none");
>+ write_lock_bh(&bond->curr_slave_lock);
>+ bond_select_active_slave(bond);
>+ write_unlock_bh(&bond->curr_slave_lock);
> break;
> case NETDEV_FEAT_CHANGE:
> bond_compute_features(bond);
>--
>1.8.4
>
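
For reference, the decision the quoted NETDEV_CHANGENAME hunk makes can be sketched outside the kernel roughly like this. It is only an illustration under stated assumptions: primary_on_rename() and the rename_action enum are hypothetical names, and the arguments stand in for USES_PRIMARY(bond->params.mode), slave == bond->primary_slave, slave_dev->name and bond->params.primary.

```c
#include <string.h>

/* Hypothetical result type: what happens to bond->primary_slave. */
enum rename_action {
	KEEP_PRIMARY,	/* nothing changes, no reselection needed */
	CLEAR_PRIMARY,	/* renamed slave was primary, it no longer is */
	SET_PRIMARY	/* renamed slave now matches params.primary */
};

/* Hypothetical stand-in for the NETDEV_CHANGENAME branch in the patch.
 * The checks are kept in the same order as in the quoted hunk.
 */
static enum rename_action primary_on_rename(int uses_primary,
					    int is_current_primary,
					    const char *new_name,
					    const char *configured_primary)
{
	/* We don't care if the mode has no primary or none is configured. */
	if (!uses_primary || configured_primary[0] == '\0')
		return KEEP_PRIMARY;

	if (is_current_primary)
		return CLEAR_PRIMARY;

	if (strcmp(new_name, configured_primary) == 0)
		return SET_PRIMARY;

	return KEEP_PRIMARY;
}
```

In the patch, only the CLEAR_PRIMARY and SET_PRIMARY outcomes go on to call bond_select_active_slave() under curr_slave_lock; the KEEP_PRIMARY cases correspond to the early break statements.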
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply