* Re: [PATCH] carl9170: fix leaks at failure path in carl9170_usb_probe()
From: John W. Linville @ 2013-10-10 17:59 UTC (permalink / raw)
To: Alexey Khoroshilov
Cc: Fabio Estevam, Christian Lamparter, linux-wireless,
netdev@vger.kernel.org, linux-kernel, ldv-project
In-Reply-To: <52466624.4040106@ispras.ru>
On Sat, Sep 28, 2013 at 01:16:20AM -0400, Alexey Khoroshilov wrote:
> On 28.09.2013 00:17, Fabio Estevam wrote:
> >On Sat, Sep 28, 2013 at 12:51 AM, Alexey Khoroshilov
> ><khoroshilov@ispras.ru> wrote:
> >
> >>- return request_firmware_nowait(THIS_MODULE, 1, CARL9170FW_NAME,
> >>+ err = request_firmware_nowait(THIS_MODULE, 1, CARL9170FW_NAME,
> >> &ar->udev->dev, GFP_KERNEL, ar, carl9170_usb_firmware_step2);
> >>+ if (err) {
> >>+ usb_put_dev(udev);
> >>+ usb_put_dev(udev);
> >You are doing the same free twice.
> Yes, because it was get twice.
> >I guess you meant to also free: usb_put_dev(ar->udev)
> udev and ar->udev are equal, so technically the patch is correct.
>
> I agree that there is some inconsistency, but I would prefer to fix
> it at usb_get_dev() side with a comment about reasons for the double
> get.
What is the reason for the double get?
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply
* [PATCH] ipv6: Initialize ip6_tnl.hlen in gre tunnel even if no route is found
From: Oussama Ghorbel @ 2013-10-10 17:50 UTC (permalink / raw)
To: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy
Cc: netdev, linux-kernel, Hannes Frederic Sowa, Oussama Ghorbel
The ip6_tnl.hlen (gre and ipv6 headers length) is independent from the
outgoing interface, so it would be better to initialize it even when no
route is found, otherwise its value will be zero.
While I'm not sure if this could happen in real life, but doing that
will avoid to call the skb_push function with a zero in ip6gre_header
function.
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Oussama Ghorbel <ou.ghorbel@gmail.com>
---
net/ipv6/ip6_gre.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 90747f1..8e4d42e 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -978,6 +978,7 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu)
if (t->parms.o_flags&GRE_SEQ)
addend += 4;
}
+ t->hlen = addend;
if (p->flags & IP6_TNL_F_CAP_XMIT) {
int strict = (ipv6_addr_type(&p->raddr) &
@@ -1004,8 +1005,6 @@ static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu)
}
ip6_rt_put(rt);
}
-
- t->hlen = addend;
}
static int ip6gre_tnl_change(struct ip6_tnl *t,
--
1.7.9.5
^ permalink raw reply related
* Re: [PATCH RFC 00/77] Re-design MSI/MSI-X interrupts enablement pattern
From: H. Peter Anvin @ 2013-10-10 16:28 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Benjamin Herrenschmidt, linux-kernel, Bjorn Helgaas, Ralf Baechle,
Michael Ellerman, Martin Schwidefsky, Ingo Molnar, Tejun Heo,
Dan Williams, Andy King, Jon Mason, Matt Porter, linux-pci,
linux-mips, linuxppc-dev, linux390, linux-s390, x86, linux-ide,
iss_storagedev, linux-nvme, linux-rdma, netdev, e1000-devel,
linux-dr
In-Reply-To: <20131010101704.GC11874@dhcp-26-207.brq.redhat.com>
On 10/10/2013 03:17 AM, Alexander Gordeev wrote:
> On Wed, Oct 09, 2013 at 03:24:08PM +1100, Benjamin Herrenschmidt wrote:
>
> Ok, this suggestion sounded in one or another form by several people.
> What about name it pcim_enable_msix_range() and wrap in couple more
> helpers to complete an API:
>
> int pcim_enable_msix_range(pdev, msix_entries, nvec, minvec);
> <0 - error code
> >0 - number of MSIs allocated, where minvec >= result <= nvec
>
> int pcim_enable_msix(pdev, msix_entries, nvec);
> <0 - error code
> >0 - number of MSIs allocated, where 1 >= result <= nvec
>
> int pcim_enable_msix_exact(pdev, msix_entries, nvec);
> <0 - error code
> >0 - number of MSIs allocated, where result == nvec
>
> The latter's return value seems odd, but I can not help to make
> it consistent with the first two.
>
Is there a reason for the wrappers, as opposed to just specifying either
1 or nvec as the minimum?
-hpa
^ permalink raw reply
* Re: [PATCH net-next] tcp: tcp_transmit_skb() optimizations
From: Yuchung Cheng @ 2013-10-10 15:54 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, MF Nowlan, Neal Cardwell
In-Reply-To: <1381419780.4971.84.camel@edumazet-glaptop.roam.corp.google.com>
On Thu, Oct 10, 2013 at 8:43 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> 1) We need to take a timestamp only for skb that should be cloned.
>
> Other skbs are not in write queue and no rtt estimation is done on them.
>
> 2) the unlikely() hint is wrong for receivers (they send pure ACK)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: MF Nowlan <fitz@cs.yale.edu>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
Acked-By: Yuchung Cheng <ycheng@google.com>
> ---
> net/ipv4/tcp_output.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index faec813..c40f608 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -850,15 +850,15 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
>
> BUG_ON(!skb || !tcp_skb_pcount(skb));
>
> - /* If congestion control is doing timestamping, we must
> - * take such a timestamp before we potentially clone/copy.
> - */
> - if (icsk->icsk_ca_ops->flags & TCP_CONG_RTT_STAMP)
> - __net_timestamp(skb);
> -
> - if (likely(clone_it)) {
> + if (clone_it) {
> const struct sk_buff *fclone = skb + 1;
>
> + /* If congestion control is doing timestamping, we must
> + * take such a timestamp before we potentially clone/copy.
> + */
> + if (icsk->icsk_ca_ops->flags & TCP_CONG_RTT_STAMP)
> + __net_timestamp(skb);
> +
> if (unlikely(skb->fclone == SKB_FCLONE_ORIG &&
> fclone->fclone == SKB_FCLONE_CLONE))
> NET_INC_STATS_BH(sock_net(sk),
>
>
^ permalink raw reply
* [PATCH net-next] tcp: tcp_transmit_skb() optimizations
From: Eric Dumazet @ 2013-10-10 15:43 UTC (permalink / raw)
To: David Miller; +Cc: netdev, MF Nowlan, Yuchung Cheng, Neal Cardwell
From: Eric Dumazet <edumazet@google.com>
1) We need to take a timestamp only for skb that should be cloned.
Other skbs are not in write queue and no rtt estimation is done on them.
2) the unlikely() hint is wrong for receivers (they send pure ACK)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: MF Nowlan <fitz@cs.yale.edu>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
---
net/ipv4/tcp_output.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index faec813..c40f608 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -850,15 +850,15 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
BUG_ON(!skb || !tcp_skb_pcount(skb));
- /* If congestion control is doing timestamping, we must
- * take such a timestamp before we potentially clone/copy.
- */
- if (icsk->icsk_ca_ops->flags & TCP_CONG_RTT_STAMP)
- __net_timestamp(skb);
-
- if (likely(clone_it)) {
+ if (clone_it) {
const struct sk_buff *fclone = skb + 1;
+ /* If congestion control is doing timestamping, we must
+ * take such a timestamp before we potentially clone/copy.
+ */
+ if (icsk->icsk_ca_ops->flags & TCP_CONG_RTT_STAMP)
+ __net_timestamp(skb);
+
if (unlikely(skb->fclone == SKB_FCLONE_ORIG &&
fclone->fclone == SKB_FCLONE_CLONE))
NET_INC_STATS_BH(sock_net(sk),
^ permalink raw reply related
* Re: [PATCH RFC 50/77] mlx5: Update MSI/MSI-X interrupts enablement code
From: Eli Cohen @ 2013-10-10 15:29 UTC (permalink / raw)
To: Alexander Gordeev
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Bjorn Helgaas, Ralf Baechle,
Michael Ellerman, Benjamin Herrenschmidt, Martin Schwidefsky,
Ingo Molnar, Tejun Heo, Dan Williams, Andy King, Jon Mason,
Matt Porter, linux-pci-u79uwXL29TY76Z2rM5mHXA,
linux-mips-6z/3iImG2C8G8FEW9MqTrA,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
linux390-tA70FqPdS9bQT0dZR+AlfA,
linux-s390-u79uwXL29TY76Z2rM5mHXA, x86-DgEjT+Ai2ygdnm+yROfE0A,
linux-ide-u79uwXL29TY76Z2rM5mHXA, iss_storagedev-VXdhtT5mjnY,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
e1000-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
linux-driver-h88ZbnxC6KDQT0dZR+AlfA, Solarflare linux maintainers,
"VMware, Inc." <pv-dr
In-Reply-To: <20131003194837.GA27636-hdGaXg0bp3uRXgp2RCiI5R/sF2h8X+2i0E9HWUfgJXw@public.gmane.org>
On Thu, Oct 03, 2013 at 09:48:39PM +0200, Alexander Gordeev wrote:
>
> pci_enable_msix() may fail, but it can not return a positive number.
>
That is true according to the current logic but the comment on top of
pci_enable_msix() still says:
"A return of < 0 indicates a failure. Or a return of > 0 indicates
that driver request is exceeding the number of irqs or MSI-X vectors
available"
So you're counting on an implementation that may change in the future.
I think leaving the code as it is now is safer.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH net-next v3 2/5] xen-netback: add support for IPv6 checksum offload from guest
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381418733-20470-1-git-send-email-paul.durrant@citrix.com>
For performance of VM to VM traffic on a single host it is better to avoid
calculation of TCP/UDP checksum in the sending frontend. To allow this this
patch adds the code necessary to set up partial checksum for IPv6 packets
and xenstore flag feature-ipv6-csum-offload to advertise that fact to
frontends.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/netback.c | 243 +++++++++++++++++++++++++++++++------
drivers/net/xen-netback/xenbus.c | 9 ++
2 files changed, 213 insertions(+), 39 deletions(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index f3e591c..4c12030 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -109,15 +109,10 @@ static inline unsigned long idx_to_kaddr(struct xenvif *vif,
return (unsigned long)pfn_to_kaddr(idx_to_pfn(vif, idx));
}
-/*
- * This is the amount of packet we copy rather than map, so that the
- * guest can't fiddle with the contents of the headers while we do
- * packet processing on them (netfilter, routing, etc).
+/* This is a miniumum size for the linear area to avoid lots of
+ * calls to __pskb_pull_tail() as we set up checksum offsets.
*/
-#define PKT_PROT_LEN (ETH_HLEN + \
- VLAN_HLEN + \
- sizeof(struct iphdr) + MAX_IPOPTLEN + \
- sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
+#define PKT_PROT_LEN 128
static u16 frag_get_pending_idx(skb_frag_t *frag)
{
@@ -1118,61 +1113,74 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
return 0;
}
-static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+static inline void maybe_pull_tail(struct sk_buff *skb, unsigned int len)
+{
+ if (skb_is_nonlinear(skb) && skb_headlen(skb) < len) {
+ /* If we need to pullup then pullup to the max, so we
+ * won't need to do it again.
+ */
+ int target = min_t(int, skb->len, MAX_TCP_HEADER);
+ __pskb_pull_tail(skb, target - skb_headlen(skb));
+ }
+}
+
+static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
+ int recalculate_partial_csum)
{
- struct iphdr *iph;
+ struct iphdr *iph = (void *)skb->data;
+ unsigned int header_size;
+ unsigned int off;
int err = -EPROTO;
- int recalculate_partial_csum = 0;
- /*
- * A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
- * peers can fail to set NETRXF_csum_blank when sending a GSO
- * frame. In this case force the SKB to CHECKSUM_PARTIAL and
- * recalculate the partial checksum.
- */
- if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
- vif->rx_gso_checksum_fixup++;
- skb->ip_summed = CHECKSUM_PARTIAL;
- recalculate_partial_csum = 1;
- }
+ off = sizeof(struct iphdr);
- /* A non-CHECKSUM_PARTIAL SKB does not require setup. */
- if (skb->ip_summed != CHECKSUM_PARTIAL)
- return 0;
+ header_size = skb->network_header + off + MAX_IPOPTLEN;
+ maybe_pull_tail(skb, header_size);
- if (skb->protocol != htons(ETH_P_IP))
- goto out;
+ off = iph->ihl * 4;
- iph = (void *)skb->data;
switch (iph->protocol) {
case IPPROTO_TCP:
- if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+ if (!skb_partial_csum_set(skb, off,
offsetof(struct tcphdr, check)))
goto out;
if (recalculate_partial_csum) {
struct tcphdr *tcph = tcp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct tcphdr);
+ maybe_pull_tail(skb, header_size);
+
tcph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
- skb->len - iph->ihl*4,
+ skb->len - off,
IPPROTO_TCP, 0);
}
break;
case IPPROTO_UDP:
- if (!skb_partial_csum_set(skb, 4 * iph->ihl,
+ if (!skb_partial_csum_set(skb, off,
offsetof(struct udphdr, check)))
goto out;
if (recalculate_partial_csum) {
struct udphdr *udph = udp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct udphdr);
+ maybe_pull_tail(skb, header_size);
+
udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
- skb->len - iph->ihl*4,
+ skb->len - off,
IPPROTO_UDP, 0);
}
break;
default:
if (net_ratelimit())
netdev_err(vif->dev,
- "Attempting to checksum a non-TCP/UDP packet, dropping a protocol %d packet\n",
+ "Attempting to checksum a non-TCP/UDP packet, "
+ "dropping a protocol %d packet\n",
iph->protocol);
goto out;
}
@@ -1183,6 +1191,168 @@ out:
return err;
}
+static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
+ int recalculate_partial_csum)
+{
+ int err = -EPROTO;
+ struct ipv6hdr *ipv6h = (void *)skb->data;
+ u8 nexthdr;
+ unsigned int header_size;
+ unsigned int off;
+ bool done;
+ bool found;
+
+ done = false;
+ found = false;
+
+ off = sizeof(struct ipv6hdr);
+
+ header_size = skb->network_header + off;
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = ipv6h->nexthdr;
+
+ while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len)) &&
+ !done) {
+ switch (nexthdr) {
+ case IPPROTO_FRAGMENT: {
+ struct frag_hdr *hp = (void *)(skb->data + off);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct frag_hdr);
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = hp->nexthdr;
+ off += 8;
+ break;
+ }
+ case IPPROTO_DSTOPTS:
+ case IPPROTO_HOPOPTS:
+ case IPPROTO_ROUTING: {
+ struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct ipv6_opt_hdr);
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = hp->nexthdr;
+ off += ipv6_optlen(hp);
+ break;
+ }
+ case IPPROTO_AH: {
+ struct ip_auth_hdr *hp = (void *)(skb->data + off);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct ip_auth_hdr);
+ maybe_pull_tail(skb, header_size);
+
+ nexthdr = hp->nexthdr;
+ off += (hp->hdrlen+2)<<2;
+ break;
+ }
+ case IPPROTO_TCP:
+ case IPPROTO_UDP:
+ found = true;
+ /* Fall through */
+ default:
+ done = true;
+ break;
+ }
+ }
+
+ if (!done) {
+ if (net_ratelimit())
+ netdev_err(vif->dev, "Failed to parse packet header\n");
+ goto out;
+ }
+
+ if (!found) {
+ if (net_ratelimit())
+ netdev_err(vif->dev,
+ "Attempting to checksum a non-TCP/UDP packet, "
+ "dropping a protocol %d packet\n",
+ nexthdr);
+ goto out;
+ }
+
+ switch (nexthdr) {
+ case IPPROTO_TCP:
+ if (!skb_partial_csum_set(skb, off,
+ offsetof(struct tcphdr, check)))
+ goto out;
+
+ if (recalculate_partial_csum) {
+ struct tcphdr *tcph = tcp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct tcphdr);
+ maybe_pull_tail(skb, header_size);
+
+ tcph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+ &ipv6h->daddr,
+ skb->len - off,
+ IPPROTO_TCP, 0);
+ }
+ break;
+ case IPPROTO_UDP:
+ if (!skb_partial_csum_set(skb, off,
+ offsetof(struct udphdr, check)))
+ goto out;
+
+ if (recalculate_partial_csum) {
+ struct udphdr *udph = udp_hdr(skb);
+
+ header_size = skb->network_header +
+ off +
+ sizeof(struct udphdr);
+ maybe_pull_tail(skb, header_size);
+
+ udph->check = ~csum_ipv6_magic(&ipv6h->saddr,
+ &ipv6h->daddr,
+ skb->len - off,
+ IPPROTO_UDP, 0);
+ }
+ break;
+ }
+
+ err = 0;
+
+out:
+ return err;
+}
+
+static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
+{
+ int err = -EPROTO;
+ int recalculate_partial_csum = 0;
+
+ /* A GSO SKB must be CHECKSUM_PARTIAL. However some buggy
+ * peers can fail to set NETRXF_csum_blank when sending a GSO
+ * frame. In this case force the SKB to CHECKSUM_PARTIAL and
+ * recalculate the partial checksum.
+ */
+ if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb)) {
+ vif->rx_gso_checksum_fixup++;
+ skb->ip_summed = CHECKSUM_PARTIAL;
+ recalculate_partial_csum = 1;
+ }
+
+ /* A non-CHECKSUM_PARTIAL SKB does not require setup. */
+ if (skb->ip_summed != CHECKSUM_PARTIAL)
+ return 0;
+
+ if (skb->protocol == htons(ETH_P_IP))
+ err = checksum_setup_ip(vif, skb, recalculate_partial_csum);
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ err = checksum_setup_ipv6(vif, skb, recalculate_partial_csum);
+
+ return err;
+}
+
static bool tx_credit_exceeded(struct xenvif *vif, unsigned size)
{
unsigned long now = jiffies;
@@ -1428,12 +1598,7 @@ static int xenvif_tx_submit(struct xenvif *vif, int budget)
xenvif_fill_frags(vif, skb);
- /*
- * If the initial fragment was < PKT_PROT_LEN then
- * pull through some bytes from the other fragments to
- * increase the linear region to PKT_PROT_LEN bytes.
- */
- if (skb_headlen(skb) < PKT_PROT_LEN && skb_is_nonlinear(skb)) {
+ if (skb_is_nonlinear(skb) && skb_headlen(skb) < PKT_PROT_LEN) {
int target = min_t(int, skb->len, PKT_PROT_LEN);
__pskb_pull_tail(skb, target - skb_headlen(skb));
}
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 99343ca..9c9b37d 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -105,6 +105,15 @@ static int netback_probe(struct xenbus_device *dev,
goto abort_transaction;
}
+ /* We support partial checksum setup for IPv6 packets */
+ err = xenbus_printf(xbt, dev->nodename,
+ "feature-ipv6-csum-offload",
+ "%d", 1);
+ if (err) {
+ message = "writing feature-ipv6-csum-offload";
+ goto abort_transaction;
+ }
+
/* We support rx-copy path. */
err = xenbus_printf(xbt, dev->nodename,
"feature-rx-copy", "%d", 1);
--
1.7.10.4
^ permalink raw reply related
* [PATCH net-next v3 5/5] xen-netback: enable IPv6 TCP GSO to the guest
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381418733-20470-1-git-send-email-paul.durrant@citrix.com>
This patch adds code to handle SKB_GSO_TCPV6 skbs and construct appropriate
extra or prefix segments to pass the large packet to the frontend. New
xenstore flags, feature-gso-tcpv6 and feature-gso-tcpv6-prefix, are sampled
to determine if the frontend is capable of handling such packets.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/common.h | 9 +++++--
drivers/net/xen-netback/interface.c | 6 +++--
drivers/net/xen-netback/netback.c | 48 +++++++++++++++++++++++++++--------
drivers/net/xen-netback/xenbus.c | 29 +++++++++++++++++++--
include/xen/interface/io/netif.h | 1 +
5 files changed, 77 insertions(+), 16 deletions(-)
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index b4a9a3c..55b8dec 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -87,9 +87,13 @@ struct pending_tx_info {
struct xenvif_rx_meta {
int id;
int size;
+ int gso_type;
int gso_size;
};
+#define GSO_BIT(type) \
+ (1 << XEN_NETIF_GSO_TYPE_ ## type)
+
/* Discriminate from any valid pending_idx value. */
#define INVALID_PENDING_IDX 0xFFFF
@@ -150,9 +154,10 @@ struct xenvif {
u8 fe_dev_addr[6];
/* Frontend feature information. */
+ int gso_mask;
+ int gso_prefix_mask;
+
u8 can_sg:1;
- u8 gso:1;
- u8 gso_prefix:1;
u8 ip_csum:1;
u8 ipv6_csum:1;
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index cb0d8ea..e4aa267 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -214,8 +214,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
if (!vif->can_sg)
features &= ~NETIF_F_SG;
- if (!vif->gso && !vif->gso_prefix)
+ if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV4))
features &= ~NETIF_F_TSO;
+ if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV6))
+ features &= ~NETIF_F_TSO6;
if (!vif->ip_csum)
features &= ~NETIF_F_IP_CSUM;
if (!vif->ipv6_csum)
@@ -320,7 +322,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
dev->netdev_ops = &xenvif_netdev_ops;
dev->hw_features = NETIF_F_SG |
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
- NETIF_F_TSO;
+ NETIF_F_TSO | NETIF_F_TSO6;
dev->features = dev->hw_features | NETIF_F_RXCSUM;
SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index afc055c..bb31921 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -140,7 +140,7 @@ static int max_required_rx_slots(struct xenvif *vif)
int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
- if (vif->can_sg || vif->gso || vif->gso_prefix)
+ if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
return max;
@@ -312,6 +312,7 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct xenvif *vif,
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
+ meta->gso_type = 0;
meta->gso_size = 0;
meta->size = 0;
meta->id = req->id;
@@ -334,6 +335,7 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
struct gnttab_copy *copy_gop;
struct xenvif_rx_meta *meta;
unsigned long bytes;
+ int gso_type;
/* Data must not cross a page boundary. */
BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
@@ -392,7 +394,14 @@ static void xenvif_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
}
/* Leave a gap for the GSO descriptor. */
- if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
+ if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+ else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+ else
+ gso_type = XEN_NETIF_GSO_TYPE_NONE;
+
+ if (*head && ((1 << gso_type) & vif->gso_mask))
vif->rx.req_cons++;
*head = 0; /* There must be something in this buffer now. */
@@ -423,14 +432,28 @@ static int xenvif_gop_skb(struct sk_buff *skb,
unsigned char *data;
int head = 1;
int old_meta_prod;
+ int gso_type;
+ int gso_size;
old_meta_prod = npo->meta_prod;
+ if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+ gso_size = skb_shinfo(skb)->gso_size;
+ } else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) {
+ gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+ gso_size = skb_shinfo(skb)->gso_size;
+ } else {
+ gso_type = 0;
+ gso_size = 0;
+ }
+
/* Set up a GSO prefix descriptor, if necessary */
- if (skb_shinfo(skb)->gso_size && vif->gso_prefix) {
+ if ((1 << skb_shinfo(skb)->gso_type) & vif->gso_prefix_mask) {
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
- meta->gso_size = skb_shinfo(skb)->gso_size;
+ meta->gso_type = gso_type;
+ meta->gso_size = gso_size;
meta->size = 0;
meta->id = req->id;
}
@@ -438,10 +461,13 @@ static int xenvif_gop_skb(struct sk_buff *skb,
req = RING_GET_REQUEST(&vif->rx, vif->rx.req_cons++);
meta = npo->meta + npo->meta_prod++;
- if (!vif->gso_prefix)
- meta->gso_size = skb_shinfo(skb)->gso_size;
- else
+ if ((1 << gso_type) & vif->gso_mask) {
+ meta->gso_type = gso_type;
+ meta->gso_size = gso_size;
+ } else {
+ meta->gso_type = 0;
meta->gso_size = 0;
+ }
meta->size = 0;
meta->id = req->id;
@@ -587,7 +613,8 @@ void xenvif_rx_action(struct xenvif *vif)
vif = netdev_priv(skb->dev);
- if (vif->meta[npo.meta_cons].gso_size && vif->gso_prefix) {
+ if ((1 << vif->meta[npo.meta_cons].gso_type) &
+ vif->gso_prefix_mask) {
resp = RING_GET_RESPONSE(&vif->rx,
vif->rx.rsp_prod_pvt++);
@@ -624,7 +651,8 @@ void xenvif_rx_action(struct xenvif *vif)
vif->meta[npo.meta_cons].size,
flags);
- if (vif->meta[npo.meta_cons].gso_size && !vif->gso_prefix) {
+ if ((1 << vif->meta[npo.meta_cons].gso_type) &
+ vif->gso_mask) {
struct xen_netif_extra_info *gso =
(struct xen_netif_extra_info *)
RING_GET_RESPONSE(&vif->rx,
@@ -632,8 +660,8 @@ void xenvif_rx_action(struct xenvif *vif)
resp->flags |= XEN_NETRXF_extra_info;
+ gso->u.gso.type = vif->meta[npo.meta_cons].gso_type;
gso->u.gso.size = vif->meta[npo.meta_cons].gso_size;
- gso->u.gso.type = XEN_NETIF_GSO_TYPE_TCPV4;
gso->u.gso.pad = 0;
gso->u.gso.features = 0;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 7e4dcc9..6a005f1 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -577,15 +577,40 @@ static int connect_rings(struct backend_info *be)
val = 0;
vif->can_sg = !!val;
+ vif->gso_mask = 0;
+ vif->gso_prefix_mask = 0;
+
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
"%d", &val) < 0)
val = 0;
- vif->gso = !!val;
+ if (val)
+ vif->gso_mask |= GSO_BIT(TCPV4);
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-prefix",
"%d", &val) < 0)
val = 0;
- vif->gso_prefix = !!val;
+ if (val)
+ vif->gso_prefix_mask |= GSO_BIT(TCPV4);
+
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6",
+ "%d", &val) < 0)
+ val = 0;
+ if (val)
+ vif->gso_mask |= GSO_BIT(TCPV6);
+
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6-prefix",
+ "%d", &val) < 0)
+ val = 0;
+ if (val)
+ vif->gso_prefix_mask |= GSO_BIT(TCPV6);
+
+ if (vif->gso_mask & vif->gso_prefix_mask) {
+ xenbus_dev_fatal(dev, err,
+ "%s: gso and gso prefix flags are not "
+ "mutually exclusive",
+ dev->otherend);
+ return -EOPNOTSUPP;
+ }
/* Before feature-ipv6-csum-offload was introduced, checksum offload
* was turned on by a zero value in (or lack of)
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index d7dd8d7..41674bc 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -113,6 +113,7 @@ struct xen_netif_tx_request {
#define XEN_NETIF_EXTRA_FLAG_MORE (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
/* GSO types */
+#define XEN_NETIF_GSO_TYPE_NONE (0)
#define XEN_NETIF_GSO_TYPE_TCPV4 (1)
#define XEN_NETIF_GSO_TYPE_TCPV6 (2)
--
1.7.10.4
^ permalink raw reply related
* [PATCH net-next v3 4/5] xen-netback: handle IPv6 TCP GSO packets from the guest
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381418733-20470-1-git-send-email-paul.durrant@citrix.com>
This patch adds a xenstore feature flag, festure-gso-tcpv6, to advertise
that netback can handle IPv6 TCP GSO packets. It creates SKB_GSO_TCPV6 skbs
if the frontend passes an extra segment with the new type
XEN_NETIF_GSO_TYPE_TCPV6 added to netif.h.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/netback.c | 11 ++++++++---
drivers/net/xen-netback/xenbus.c | 7 +++++++
include/xen/interface/io/netif.h | 10 +++++++++-
3 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 4c12030..afc055c 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1096,15 +1096,20 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
return -EINVAL;
}
- /* Currently only TCPv4 S.O. is supported. */
- if (gso->u.gso.type != XEN_NETIF_GSO_TYPE_TCPV4) {
+ switch (gso->u.gso.type) {
+ case XEN_NETIF_GSO_TYPE_TCPV4:
+ skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
+ break;
+ case XEN_NETIF_GSO_TYPE_TCPV6:
+ skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
+ break;
+ default:
netdev_err(vif->dev, "Bad GSO type %d.\n", gso->u.gso.type);
xenvif_fatal_tx_err(vif);
return -EINVAL;
}
skb_shinfo(skb)->gso_size = gso->u.gso.size;
- skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
/* Header must be checked, and gso_segs computed. */
skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 9c9b37d..7e4dcc9 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -105,6 +105,13 @@ static int netback_probe(struct xenbus_device *dev,
goto abort_transaction;
}
+ err = xenbus_printf(xbt, dev->nodename, "feature-gso-tcpv6",
+ "%d", sg);
+ if (err) {
+ message = "writing feature-gso-tcpv6";
+ goto abort_transaction;
+ }
+
/* We support partial checksum setup for IPv6 packets */
err = xenbus_printf(xbt, dev->nodename,
"feature-ipv6-csum-offload",
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index d9fb44739..d7dd8d7 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -61,6 +61,13 @@
*/
/*
+ * "feature-gso-tcpv4" and "feature-gso-tcpv6" advertise the capability to
+ * handle large TCP packets (in IPv4 or IPv6 form respectively). Neither
+ * frontends nor backends are assumed to be capable unless the flags are
+ * present.
+ */
+
+/*
* This is the 'wire' format for packets:
* Request 1: xen_netif_tx_request -- XEN_NETTXF_* (any flags)
* [Request 2: xen_netif_extra_info] (only if request 1 has XEN_NETTXF_extra_info)
@@ -105,8 +112,9 @@ struct xen_netif_tx_request {
#define _XEN_NETIF_EXTRA_FLAG_MORE (0)
#define XEN_NETIF_EXTRA_FLAG_MORE (1U<<_XEN_NETIF_EXTRA_FLAG_MORE)
-/* GSO types - only TCPv4 currently supported. */
+/* GSO types */
#define XEN_NETIF_GSO_TYPE_TCPV4 (1)
+#define XEN_NETIF_GSO_TYPE_TCPV6 (2)
/*
* This structure needs to fit within both netif_tx_request and
--
1.7.10.4
^ permalink raw reply related
* [PATCH net-next v3 3/5] xen-netback: Unconditionally set NETIF_F_RXCSUM
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381418733-20470-1-git-send-email-paul.durrant@citrix.com>
There is no mechanism to insist that a guest always generates a packet
with good checksum (at least for IPv4) so we must handle checksum
offloading from the guest and hence should set NETIF_F_RXCSUM.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/interface.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 8e92783..cb0d8ea 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -321,7 +321,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
dev->hw_features = NETIF_F_SG |
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
NETIF_F_TSO;
- dev->features = dev->hw_features;
+ dev->features = dev->hw_features | NETIF_F_RXCSUM;
SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
dev->tx_queue_len = XENVIF_QUEUE_LENGTH;
--
1.7.10.4
^ permalink raw reply related
* [PATCH net-next v3 1/5] xen-netback: add support for IPv6 checksum offload to guest
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev; +Cc: Paul Durrant, Wei Liu, David Vrabel, Ian Campbell
In-Reply-To: <1381418733-20470-1-git-send-email-paul.durrant@citrix.com>
Check xenstore flag feature-ipv6-csum-offload to determine if a
guest is happy to accept IPv6 packets with only partial checksum.
Also check analogous feature-ip-csum-offload to determine if a
guest is happy to accept IPv4 packets with only partial checksum
as a replacement for a negated feature-no-csum-offload value and
add a comment to deprecate use of feature-no-csum-offload.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
drivers/net/xen-netback/common.h | 3 ++-
drivers/net/xen-netback/interface.c | 10 +++++++---
drivers/net/xen-netback/xenbus.c | 24 +++++++++++++++++++++++-
include/xen/interface/io/netif.h | 10 ++++++++++
4 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 5715318..b4a9a3c 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -153,7 +153,8 @@ struct xenvif {
u8 can_sg:1;
u8 gso:1;
u8 gso_prefix:1;
- u8 csum:1;
+ u8 ip_csum:1;
+ u8 ipv6_csum:1;
/* Internal feature information. */
u8 can_queue:1; /* can queue packets for receiver? */
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 01bb854..8e92783 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -216,8 +216,10 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
features &= ~NETIF_F_SG;
if (!vif->gso && !vif->gso_prefix)
features &= ~NETIF_F_TSO;
- if (!vif->csum)
+ if (!vif->ip_csum)
features &= ~NETIF_F_IP_CSUM;
+ if (!vif->ipv6_csum)
+ features &= ~NETIF_F_IPV6_CSUM;
return features;
}
@@ -306,7 +308,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
vif->domid = domid;
vif->handle = handle;
vif->can_sg = 1;
- vif->csum = 1;
+ vif->ip_csum = 1;
vif->dev = dev;
vif->credit_bytes = vif->remaining_credit = ~0UL;
@@ -316,7 +318,9 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
vif->credit_timeout.expires = jiffies;
dev->netdev_ops = &xenvif_netdev_ops;
- dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
+ dev->hw_features = NETIF_F_SG |
+ NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
+ NETIF_F_TSO;
dev->features = dev->hw_features;
SET_ETHTOOL_OPS(dev, &xenvif_ethtool_ops);
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 1b08d87..99343ca 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -571,10 +571,32 @@ static int connect_rings(struct backend_info *be)
val = 0;
vif->gso_prefix = !!val;
+ /* Before feature-ipv6-csum-offload was introduced, checksum offload
+ * was turned on by a zero value in (or lack of)
+ * feature-no-csum-offload. Therefore, for compatibility, assume
+ * that if feature-ip-csum-offload is missing that IP (v4) offload
+ * is turned on. If this is not the case then the flag will be
+ * cleared when we subsequently sample feature-no-csum-offload.
+ */
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ip-csum-offload",
+ "%d", &val) < 0)
+ val = 1;
+ vif->ip_csum = !!val;
+
+ if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ipv6-csum-offload",
+ "%d", &val) < 0)
+ val = 0;
+ vif->ipv6_csum = !!val;
+
+ /* This is deprecated. Frontends should set IP v4 and v6 checksum
+ * offload using feature-ip-csum-offload and
+ * feature-ipv6-csum-offload respectively.
+ */
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-no-csum-offload",
"%d", &val) < 0)
val = 0;
- vif->csum = !val;
+ if (val)
+ vif->ip_csum = 0;
/* Map the shared frame, irq etc. */
err = xenvif_connect(vif, tx_ring_ref, rx_ring_ref,
diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index eb262e3..d9fb44739 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -51,6 +51,16 @@
*/
/*
+ * "feature-no-csum-offload" was used to turn off IPv4 TCP/UDP checksum
+ * offload but is now deprecated. Two new feature flags should now be used
+ * to control checksum offload:
+ * "feature-ip-csum-offload" should be used to turn IPv4 TCP/UDP checksum
+ * offload on or off. If it is missing then the feature is assumed to be on.
+ * "feature-ipv6-csum-offload" should be used to turn IPv6 TCP/UDP checksum
+ * offload on or off. If it is missing then the feature is assumed to be off.
+ */
+
+/*
* This is the 'wire' format for packets:
* Request 1: xen_netif_tx_request -- XEN_NETTXF_* (any flags)
* [Request 2: xen_netif_extra_info] (only if request 1 has XEN_NETTXF_extra_info)
--
1.7.10.4
^ permalink raw reply related
* [PATCH net-next v3 0/5] xen-netback: IPv6 offload support
From: Paul Durrant @ 2013-10-10 15:25 UTC (permalink / raw)
To: xen-devel, netdev
This patch series adds support for checksum and large packet offloads into
xen-netback.
Testing has mainly been done using the Microsoft network hardware
certification suite running in Server 2008R2 VMs with Citrix PV frontends.
v2:
- Fixed Wei's email address in Cc lines
v3:
- Responded to Wei's comments:
- netif.h now updated with comments and a definition of
XEN_NETIF_GSO_TYPE_NONE.
- limited number of pullups
- Responded to Annie's comments:
- New GSO_BIT macro
^ permalink raw reply
* LOAN OFFER.
From: tenger @ 2013-10-10 12:53 UTC (permalink / raw)
To: Recipients
Apply for loan at 3% interest rate. Contact us with your requirement via Email; tenger.g@mail.com, for more information. Try us and be convinced.
^ permalink raw reply
* Re: [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
From: Vlad Yasevich @ 2013-10-10 14:11 UTC (permalink / raw)
To: Fan Du, nhorman, steffen.klassert; +Cc: davem, netdev
In-Reply-To: <1381384296-1821-1-git-send-email-fan.du@windriver.com>
On 10/10/2013 01:51 AM, Fan Du wrote:
> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
> and also IPsec is armed to protect sctp traffic, ugly things happened as
> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
> up and pack the 16bits result in the checksum field). The result is fail
> establishment of sctp communication.
>
> And I saw another point in this part of code, when IPsec is not armed,
> sctp communication is good, however setting setting CHECKSUM_PARTIAL will
> make xfrm_output compute dummy checksum values which will be overwritten by
> hardware lately.
>
> So this patch try to solve above two issues together.
>
> Signed-off-by: Fan Du <fan.du@windriver.com>
> ---
> note:
> igb/ixgbe hardware is not handy on my side, so just build test only.
>
> ---
> net/sctp/output.c | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 0ac3a65..f0b9cc5 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -372,6 +372,16 @@ static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
> atomic_inc(&sk->sk_wmem_alloc);
> }
>
> +static int is_xfrm_armed(struct dst_entry *dst)
> +{
> +#ifdef CONFIG_XFRM
> + /* If dst->xfrm is valid, this skb needs to be transformed */
> + return dst->xfrm != NULL;
> +#else
> + return 0;
> +#endif
> +}
> +
> /* All packets are sent to the network through this function from
> * sctp_outq_tail().
> *
> @@ -536,20 +546,21 @@ int sctp_packet_transmit(struct sctp_packet *packet)
> * by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
> */
> if (!sctp_checksum_disable) {
> - if (!(dst->dev->features & NETIF_F_SCTP_CSUM)) {
> + if ((!(dst->dev->features & NETIF_F_SCTP_CSUM)) ||
> + is_xfrm_armed(dst)) {
> +
> __u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
>
> /* 3) Put the resultant value into the checksum field in the
> * common header, and leave the rest of the bits unchanged.
> */
> sh->checksum = sctp_end_cksum(crc32);
> - } else {
> - /* no need to seed pseudo checksum for SCTP */
> - nskb->ip_summed = CHECKSUM_PARTIAL;
> - nskb->csum_start = (skb_transport_header(nskb) -
> - nskb->head);
> - nskb->csum_offset = offsetof(struct sctphdr, checksum);
> - }
> + } else
> + /* Mark skb as CHECKSUM_UNNECESSARY to let hardware compute
> + * the checksum, and also avoid xfrm_output to do unceccessary
> + * checksum.
> + */
> + nskb->ip_summed = CHECKSUM_UNNECESSARY;
> }
In addition to what Niel said, the use of CHECKSUM_UNNECESSARY is
incorrect as it will cause the nic to not compute the checksum.
The checksum offload depends on the use of CHECKSUM_PARTIAL.
-vlad
>
> /* IP layer ECN support
>
^ permalink raw reply
* LOAN OFFER.
From: tenger @ 2013-10-10 12:53 UTC (permalink / raw)
To: Recipients
Apply for loan at 3% interest rate. Contact us with your requirement via Email; tenger.g@mail.com, for more information. Try us and be convinced.
^ permalink raw reply
* [PATCH] l2tp: must disable bh before calling l2tp_xmit_skb()
From: Eric Dumazet @ 2013-10-10 13:30 UTC (permalink / raw)
To: François Cachereul; +Cc: James Chapman, David S. Miller, netdev
In-Reply-To: <5256592E.3080600@alphalink.fr>
From: Eric Dumazet <edumazet@google.com>
François Cachereul made a very nice bug report and suspected
the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from
process context was not good.
This problem was added by commit
("l2tp: Fix locking in l2tp_core.c").
l2tp_eth_dev_xmit() runs from BH context, so we must disable BH
from other l2tp_xmit_skb() users.
[ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662]
[ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox
ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod
virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
[ 452.064012] CPU 1
[ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643]
[ 452.080015] CPU 2
[ 452.080015]
[ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f
[ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293
[ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000
[ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110
[ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000
[ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286
[ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000
[ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000
[ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0
[ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0)
[ 452.080015] Stack:
[ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1
[ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e
[ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600
[ 452.080015] Call Trace:
[ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3
[ 452.080015] Call Trace:
[ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.064012]
[ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f
[ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297
[ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002
[ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110
[ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c
[ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18
[ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0
[ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000
[ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0
[ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410)
[ 452.064012] Stack:
[ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a
[ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62
[ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276
[ 452.064012] Call Trace:
[ 452.064012] <IRQ>
[ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb
[ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82
[ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c
[ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[ 452.064012] <EOI>
[ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48
[ 452.064012] Call Trace:
[ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb
[ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82
[ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c
[ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
Reported-by: François Cachereul <f.cachereul@alphalink.fr>
Tested-by: François Cachereul <f.cachereul@alphalink.fr>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
---
net/l2tp/l2tp_ppp.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index f0a7ada..ffda81e 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -353,7 +353,9 @@ static int pppol2tp_sendmsg(struct kiocb *iocb, struct socket *sock, struct msgh
goto error_put_sess_tun;
}
+ local_bh_disable();
l2tp_xmit_skb(session, skb, session->hdr_len);
+ local_bh_enable();
sock_put(ps->tunnel_sock);
sock_put(sk);
@@ -422,7 +424,9 @@ static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
skb->data[0] = ppph[0];
skb->data[1] = ppph[1];
+ local_bh_disable();
l2tp_xmit_skb(session, skb, session->hdr_len);
+ local_bh_enable();
sock_put(sk_tun);
sock_put(sk);
^ permalink raw reply related
* Re: [PATCH net] {xfrm, sctp} Stick to software crc32 even if hardware is capable of that
From: Neil Horman @ 2013-10-10 13:11 UTC (permalink / raw)
To: Fan Du; +Cc: vyasevich, steffen.klassert, davem, netdev
In-Reply-To: <1381384296-1821-1-git-send-email-fan.du@windriver.com>
On Thu, Oct 10, 2013 at 01:51:36PM +0800, Fan Du wrote:
> igb/ixgbe have hardware sctp checksum support, when this feature is enabled
> and also IPsec is armed to protect sctp traffic, ugly things happened as
> xfrm_output checks CHECKSUM_PARTIAL to do check sum operation(sum every thing
> up and pack the 16bits result in the checksum field). The result is fail
> establishment of sctp communication.
>
Shouldn't this be fixed in the xfrm code then? E.g. check the device features
for SCTP checksum offloading and and skip the checksum during xfrm output if its
available?
Or am I missing something?
Neil
^ permalink raw reply
* [PATCH net] bridge: allow receiption on disabled port
From: Felix Fietkau @ 2013-10-10 12:52 UTC (permalink / raw)
To: netdev; +Cc: stephen
When an ethernet device is enslaved to a bridge, and the bridge STP
detects loss of carrier (or operational state down), then normally
packet receiption is blocked.
This breaks control applications like WPA which maybe expecting to
receive packets to negotiate to bring link up. The bridge needs to
block forwarding packets from these disabled ports, but there is no
hard requirement to not allow local packet delivery.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
---
net/bridge/br_input.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index a2fd37e..0a8a8cd 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -146,9 +146,11 @@ static int br_handle_local_finish(struct sk_buff *skb)
struct net_bridge_port *p = br_port_get_rcu(skb->dev);
u16 vid = 0;
- br_vlan_get_tag(skb, &vid);
- if (p->flags & BR_LEARNING)
- br_fdb_update(p->br, p, eth_hdr(skb)->h_source, vid);
+ if (p->state != BR_STATE_DISABLED) {
+ br_vlan_get_tag(skb, &vid);
+ if (p->flags & BR_LEARNING)
+ br_fdb_update(p->br, p, eth_hdr(skb)->h_source, vid);
+ }
return 0; /* process further */
}
@@ -218,6 +220,18 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
forward:
switch (p->state) {
+ case BR_STATE_DISABLED:
+ if (ether_addr_equal(p->br->dev->dev_addr, dest))
+ skb->pkt_type = PACKET_HOST;
+
+ if (NF_HOOK(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
+ br_handle_local_finish))
+ break;
+
+ BR_INPUT_SKB_CB(skb)->brdev = p->br->dev;
+ br_pass_frame_up(skb);
+ break;
+
case BR_STATE_FORWARDING:
rhook = rcu_dereference(br_should_route_hook);
if (rhook) {
--
1.8.0.2
^ permalink raw reply related
* Re: [PATCH] usbnet: smsc95xx: Add device tree input for MAC address
From: Dan Murphy @ 2013-10-10 12:47 UTC (permalink / raw)
To: Ming Lei
Cc: Steve Glendinning, Network Development, Linux Kernel Mailing List,
linux-usb, mugunthanvnm
In-Reply-To: <CACVXFVP-KgqcNTnk+JZsLc188pYk6iUXvW6=kPXNaurSuevGeA@mail.gmail.com>
On 10/10/2013 07:39 AM, Ming Lei wrote:
> On Thu, Oct 10, 2013 at 8:08 PM, Dan Murphy <dmurphy@ti.com> wrote:
>> You are correct I don't expect a match for PnP devices only devices that are hard wired.
>>
>> After thinking of it I should move the OF code below the EEPROM code as the EEPROM should take preference over the DT code.
>>
>> I will need to post V2 for that.
> Your patch still doesn't deal with two smsc95xx devices built in
> one board.
Is there a board that has 2 built in smsc devices?
This is all dependent on what the dev.id is of udev.
>
> The problem is a generic one, looks it is better to do it when OF supports
> discoverable bus.
>
>
> Thanks,
--
------------------
Dan Murphy
^ permalink raw reply
* RE: [PATCH net-next 01/10] qlcnic: Print informational messages only once during driver load.
From: Sucheta Chakraborty @ 2013-10-10 12:41 UTC (permalink / raw)
To: Stephen Hemminger, Himanshu Madhani
Cc: David Miller, netdev, Dept-NX Linux NIC Driver
In-Reply-To: <20131004142938.4ab6c6aa@nehalam.linuxnetplumber.net>
> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Saturday, October 05, 2013 3:00 AM
> To: Himanshu Madhani
> Cc: David Miller; netdev; Dept-NX Linux NIC Driver; Sucheta Chakraborty
> Subject: Re: [PATCH net-next 01/10] qlcnic: Print informational
> messages only once during driver load.
>
> On Fri, 4 Oct 2013 14:30:48 -0400
> Himanshu Madhani <himanshu.madhani@qlogic.com> wrote:
>
> > From: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
> >
> > Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
> > Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
> > ---
> > drivers/net/ethernet/qlogic/qlcnic/qlcnic.h | 1 +
> > .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 12 -----------
> > .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_vnic.c | 25
> ++++++++++++++++++----
> > drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 1 +
> > 4 files changed, 23 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
> > b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
> > index 81bf836..a3c4379 100644
> > --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
> > +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
> > @@ -1199,6 +1199,7 @@ struct qlcnic_npar_info {
> > u8 promisc_mode;
> > u8 offload_flags;
> > u8 pci_func;
> > + u8 mac[ETH_ALEN];
> > };
> >
>
> >
>
> There is a field in netdevice which should probably be used for this
> perm_addr.
>
This information is logged from the management function in Virtual NIC mode.
Management function do not a have access to the netdev info structure of other functions.
So we cannot make the changes as suggested.
> And then this could be corrected:
>
> static void qlcnic_dcb_get_perm_hw_addr(struct net_device *netdev, u8
> *addr) {
> memcpy(addr, netdev->dev_addr, netdev->addr_len); }
Will incorporate this change in "dcb cleanup patch".
Thanks,
Sucheta.
^ permalink raw reply
* Re: [PATCH] usbnet: smsc95xx: Add device tree input for MAC address
From: Ming Lei @ 2013-10-10 12:39 UTC (permalink / raw)
To: Dan Murphy
Cc: Steve Glendinning, Network Development, Linux Kernel Mailing List,
linux-usb, mugunthanvnm
In-Reply-To: <525698CC.3060406@ti.com>
On Thu, Oct 10, 2013 at 8:08 PM, Dan Murphy <dmurphy@ti.com> wrote:
>
> You are correct I don't expect a match for PnP devices only devices that are hard wired.
>
> After thinking of it I should move the OF code below the EEPROM code as the EEPROM should take preference over the DT code.
>
> I will need to post V2 for that.
Your patch still doesn't deal with two smsc95xx devices built in
one board.
The problem is a generic one, looks it is better to do it when OF supports
discoverable bus.
Thanks,
--
Ming Lei
^ permalink raw reply
* Re: [PATCH] usbnet: smsc95xx: Add device tree input for MAC address
From: Dan Murphy @ 2013-10-10 12:08 UTC (permalink / raw)
To: Ming Lei
Cc: Steve Glendinning, Network Development, Linux Kernel Mailing List,
linux-usb, mugunthanvnm
In-Reply-To: <CACVXFVNAB3VnmdoP_LGJsBjQ17j1aoqKMsBE7LRt65OqROeRUQ@mail.gmail.com>
Ming
On 10/07/2013 06:42 AM, Ming Lei wrote:
> On Mon, Oct 7, 2013 at 1:31 AM, Dan Murphy <dmurphy@ti.com> wrote:
>> On 10/06/2013 10:05 AM, Ming Lei wrote:
>>> On Sat, Oct 5, 2013 at 2:25 AM, Dan Murphy <dmurphy@ti.com> wrote:
>>>> If the smsc95xx does not have a valid MAC address stored within
>>>> the eeprom then a random number is generated. The MAC can also
>>>> be set by uBoot but the smsc95xx does not have a way to read this.
>>>>
>>>> Create the binding for the smsc95xx so that uBoot can set the MAC
>>>> and the code can retrieve the MAC from the modified DTB file.
>>> Suppose there are two smsc95xx usbnet devices connected to usb bus, and
>>> one is built-in, another is hotplug device, can your patch handle the situation
>>> correctly?
>> Look at this line in the patch below
>>
>> sprintf(of_name, "%s%i", SMSC95XX_OF_NAME, dev->udev->dev.id);
>>
>> I am appending the dev ID of the device to the of_name here. As long as init_mac_address is called, the dev.id and the uBoot
>> entry match then yes.
> Currently, non-root-hub usb device is created and added into usb bus without
> any platform(device tree) knowledge, so you can't expect the match here.
>
> Also not mention the two smsc95xx devices may attach to two different
> usb host controllers(buses).
>
> Thanks,
You are correct I don't expect a match for PnP devices only devices that are hard wired.
After thinking of it I should move the OF code below the EEPROM code as the EEPROM should take preference over the DT code.
I will need to post V2 for that.
Dan
--
------------------
Dan Murphy
^ permalink raw reply
* Re: [PATCH v2] Stmmac: fix a bug when the clk_csr is euqal to 0x0
From: Giuseppe CAVALLARO @ 2013-10-10 12:05 UTC (permalink / raw)
To: Wan ZongShun; +Cc: David Miller, netdev
In-Reply-To: <CAKT61h8=HGL8TvX2LZamXntR4QhM_s8UVdHRhs--kVT2RQ-wDA@mail.gmail.com>
On 10/10/2013 6:40 AM, Wan ZongShun wrote:
> Hi Giuseppe,
>
> According to your advice, I add this new field comments in stmmac.txt.
> And I only make one patch for fix this issue and field comments.
thanks Wan but pls re-send the patch where the comment only reports
what the patch actually does and the fix you are providing.
Put me on CC and remove "Hi Giuseppe"
many thanks
Peppe
>
> Signed-off-by: Wan Zongshun <mcuos.com@gmail.com>
>
> ---
> Documentation/networking/stmmac.txt | 3 +++
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
> include/linux/stmmac.h | 1 +
> 3 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/networking/stmmac.txt
> b/Documentation/networking/stmmac.txt
> index 457b8bb..3ed0ea9 100644
> --- a/Documentation/networking/stmmac.txt
> +++ b/Documentation/networking/stmmac.txt
> @@ -116,6 +116,7 @@ struct plat_stmmacenet_data {
> struct stmmac_mdio_bus_data *mdio_bus_data;
> struct stmmac_dma_cfg *dma_cfg;
> int clk_csr;
> + unsigned int dynamic_mdc_clk_en;
> int has_gmac;
> int enh_desc;
> int tx_coe;
> @@ -148,6 +149,8 @@ Where:
> GMAC also enables the 4xPBL by default.
> o fixed_burst/mixed_burst/burst_len
> o clk_csr: fixed CSR Clock range selection.
> + o dynamic_mdc_clk_en: If it is set to >=1 MDC clk will be selected
> dynamically,
> + or else you must set a fixed CSR Clock range to clk_src.
> o has_gmac: uses the GMAC core.
> o enh_desc: if sets the MAC will use the enhanced descriptor structure.
> o tx_coe: core is able to perform the tx csum in HW.
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 8d4ccd3..93fc6bc 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -2741,7 +2741,7 @@ struct stmmac_priv *stmmac_dvr_probe(struct
> device *device,
> * set the MDC clock dynamically according to the csr actual
> * clock input.
> */
> - if (!priv->plat->clk_csr)
> + if (!!priv->plat->dynamic_mdc_clk_en)
> stmmac_clk_csr_set(priv);
> else
> priv->clk_csr = priv->plat->clk_csr;
> diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
> index bb5deb0..1b9f0b5 100644
> --- a/include/linux/stmmac.h
> +++ b/include/linux/stmmac.h
> @@ -101,6 +101,7 @@ struct plat_stmmacenet_data {
> struct stmmac_mdio_bus_data *mdio_bus_data;
> struct stmmac_dma_cfg *dma_cfg;
> int clk_csr;
> + unsigned int dynamic_mdc_clk_en;
> int has_gmac;
> int enh_desc;
> int tx_coe;
>
^ permalink raw reply
* [PATCH net-next v2 2/3] bonding: use RCU protection for alb xmit path
From: Ding Tianhong @ 2013-10-10 11:51 UTC (permalink / raw)
To: Jay Vosburgh, Andy Gospodarek, David S. Miller,
Nikolay Aleksandrov, Veaceslav Falico, Netdev
The commit 278b20837511776dc9d5f6ee1c7fabd5479838bb
(bonding: initial RCU conversion) has convert the roundrobin,
active-backup, broadcast and xor xmit path to rcu protection,
the performance will be better for these mode, so this time,
convert xmit path for alb mode.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Nikolay Aleksandrov <nikolay@redhat.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
---
drivers/net/bonding/bond_alb.c | 58 +++++++++++++++++++++++++++++++-----------
drivers/net/bonding/bonding.h | 14 ++++++++++
2 files changed, 57 insertions(+), 15 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 576ccea..0287240 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -230,7 +230,7 @@ static struct slave *tlb_get_least_loaded_slave(struct bonding *bond)
max_gap = LLONG_MIN;
/* Find the slave with the largest gap */
- bond_for_each_slave(bond, slave, iter) {
+ bond_for_each_slave_rcu(bond, slave, iter) {
if (SLAVE_IS_OK(slave)) {
long long gap = compute_gap(slave);
@@ -412,6 +412,39 @@ static struct slave *rlb_next_rx_slave(struct bonding *bond)
return rx_slave;
}
+/* Caller must hold rcu_read_lock() for read */
+static struct slave *__rlb_next_rx_slave(struct bonding *bond)
+{
+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+ struct slave *before = NULL, *rx_slave = NULL, *slave;
+ struct list_head *iter;
+ bool found = false;
+
+ bond_for_each_slave_rcu(bond, slave, iter) {
+ if (!SLAVE_IS_OK(slave))
+ continue;
+ if (!found) {
+ if (!before || before->speed < slave->speed)
+ before = slave;
+ } else {
+ if (!rx_slave || rx_slave->speed < slave->speed)
+ rx_slave = slave;
+ }
+ if (slave == bond_info->rx_slave)
+ found = true;
+ }
+ /* we didn't find anything after the current or we have something
+ * better before and up to the current slave
+ */
+ if (!rx_slave || (before && rx_slave->speed < before->speed))
+ rx_slave = before;
+
+ if (rx_slave)
+ bond_info->rx_slave = rx_slave;
+
+ return rx_slave;
+}
+
/* teach the switch the mac of a disabled slave
* on the primary for fault tolerance
*
@@ -628,12 +661,14 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct arp_pkt *arp = arp_pkt(skb);
- struct slave *assigned_slave;
+ struct slave *assigned_slave, *curr_active_slave;
struct rlb_client_info *client_info;
u32 hash_index = 0;
_lock_rx_hashtbl(bond);
+ curr_active_slave = rcu_dereference(bond->curr_active_slave);
+
hash_index = _simple_hash((u8 *)&arp->ip_dst, sizeof(arp->ip_dst));
client_info = &(bond_info->rx_hashtbl[hash_index]);
@@ -658,14 +693,14 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
* that the new client can be assigned to this entry.
*/
if (bond->curr_active_slave &&
- client_info->slave != bond->curr_active_slave) {
- client_info->slave = bond->curr_active_slave;
+ client_info->slave != curr_active_slave) {
+ client_info->slave = curr_active_slave;
rlb_update_client(client_info);
}
}
}
/* assign a new slave */
- assigned_slave = rlb_next_rx_slave(bond);
+ assigned_slave = __rlb_next_rx_slave(bond);
if (assigned_slave) {
if (!(client_info->assigned &&
@@ -728,7 +763,7 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
/* Don't modify or load balance ARPs that do not originate locally
* (e.g.,arrive via a bridge).
*/
- if (!bond_slave_has_mac(bond, arp->mac_src))
+ if (!bond_slave_has_mac_rcu(bond, arp->mac_src))
return NULL;
if (arp->op_code == htons(ARPOP_REPLY)) {
@@ -1343,11 +1378,6 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
skb_reset_mac_header(skb);
eth_data = eth_hdr(skb);
- /* make sure that the curr_active_slave do not change during tx
- */
- read_lock(&bond->lock);
- read_lock(&bond->curr_slave_lock);
-
switch (ntohs(skb->protocol)) {
case ETH_P_IP: {
const struct iphdr *iph = ip_hdr(skb);
@@ -1429,12 +1459,12 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
if (!tx_slave) {
/* unbalanced or unassigned, send through primary */
- tx_slave = bond->curr_active_slave;
+ tx_slave = rcu_dereference(bond->curr_active_slave);
bond_info->unbalanced_load += skb->len;
}
if (tx_slave && SLAVE_IS_OK(tx_slave)) {
- if (tx_slave != bond->curr_active_slave) {
+ if (tx_slave != rcu_dereference(bond->curr_active_slave)) {
memcpy(eth_data->h_source,
tx_slave->dev->dev_addr,
ETH_ALEN);
@@ -1449,8 +1479,6 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
}
}
- read_unlock(&bond->curr_slave_lock);
- read_unlock(&bond->lock);
if (res) {
/* no suitable interface, frame not sent */
kfree_skb(skb);
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 0bd04fb..3c3076e 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -464,6 +464,20 @@ static inline struct slave *bond_slave_has_mac(struct bonding *bond,
return NULL;
}
+/* Caller must hold rcu_read_lock() for read */
+static inline struct slave *bond_slave_has_mac_rcu(struct bonding *bond,
+ const u8 *mac)
+{
+ struct list_head *iter;
+ struct slave *tmp;
+
+ bond_for_each_slave_rcu(bond, tmp, iter)
+ if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr))
+ return tmp;
+
+ return NULL;
+}
+
/* Check if the ip is present in arp ip list, or first free slot if ip == 0
* Returns -1 if not found, index if found
*/
--
1.8.2.1
^ permalink raw reply related
* [PATCH net-next v2 3/3] bonding: add rtnl lock and remove read lock for bond sysfs
From: Ding Tianhong @ 2013-10-10 11:51 UTC (permalink / raw)
To: Jay Vosburgh, Andy Gospodarek, David S. Miller,
Nikolay Aleksandrov, Veaceslav Falico, Netdev
The bond_for_each_slave() will not be protected by read_lock(),
only protected by rtnl_lock(), so need to replace read_lock()
with rtnl_lock().
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
drivers/net/bonding/bond_sysfs.c | 30 +++++++++++++++++-------------
1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index e06c644..2ba1114 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -179,7 +179,9 @@ static ssize_t bonding_show_slaves(struct device *d,
struct slave *slave;
int res = 0;
- read_lock(&bond->lock);
+ if (!rtnl_trylock())
+ return restart_syscall();
+
bond_for_each_slave(bond, slave, iter) {
if (res > (PAGE_SIZE - IFNAMSIZ)) {
/* not enough space for another interface name */
@@ -190,7 +192,9 @@ static ssize_t bonding_show_slaves(struct device *d,
}
res += sprintf(buf + res, "%s ", slave->dev->name);
}
- read_unlock(&bond->lock);
+
+ rtnl_unlock();
+
if (res)
buf[res-1] = '\n'; /* eat the leftover space */
@@ -628,6 +632,9 @@ static ssize_t bonding_store_arp_targets(struct device *d,
unsigned long *targets_rx;
int ind, i, j, ret = -EINVAL;
+ if (!rtnl_trylock())
+ return restart_syscall();
+
targets = bond->params.arp_targets;
newtarget = in_aton(buf + 1);
/* look for adds */
@@ -701,6 +708,7 @@ static ssize_t bonding_store_arp_targets(struct device *d,
ret = count;
out:
+ rtnl_unlock();
return ret;
}
static DEVICE_ATTR(arp_ip_target, S_IRUGO | S_IWUSR , bonding_show_arp_targets, bonding_store_arp_targets);
@@ -1469,7 +1477,6 @@ static ssize_t bonding_show_queue_id(struct device *d,
if (!rtnl_trylock())
return restart_syscall();
- read_lock(&bond->lock);
bond_for_each_slave(bond, slave, iter) {
if (res > (PAGE_SIZE - IFNAMSIZ - 6)) {
/* not enough space for another interface_name:queue_id pair */
@@ -1481,9 +1488,9 @@ static ssize_t bonding_show_queue_id(struct device *d,
res += sprintf(buf + res, "%s:%d ",
slave->dev->name, slave->queue_id);
}
- read_unlock(&bond->lock);
if (res)
buf[res-1] = '\n'; /* eat the leftover space */
+
rtnl_unlock();
return res;
@@ -1532,8 +1539,6 @@ static ssize_t bonding_store_queue_id(struct device *d,
if (!sdev)
goto err_no_cmd;
- read_lock(&bond->lock);
-
/* Search for thes slave and check for duplicate qids */
update_slave = NULL;
bond_for_each_slave(bond, slave, iter) {
@@ -1544,23 +1549,20 @@ static ssize_t bonding_store_queue_id(struct device *d,
*/
update_slave = slave;
else if (qid && qid == slave->queue_id) {
- goto err_no_cmd_unlock;
+ goto err_no_cmd;
}
}
if (!update_slave)
- goto err_no_cmd_unlock;
+ goto err_no_cmd;
/* Actually set the qids for the slave */
update_slave->queue_id = qid;
- read_unlock(&bond->lock);
out:
rtnl_unlock();
return ret;
-err_no_cmd_unlock:
- read_unlock(&bond->lock);
err_no_cmd:
pr_info("invalid input for queue_id set for %s.\n",
bond->dev->name);
@@ -1593,6 +1595,9 @@ static ssize_t bonding_store_slaves_active(struct device *d,
struct list_head *iter;
struct slave *slave;
+ if (!rtnl_trylock())
+ return restart_syscall();
+
if (sscanf(buf, "%d", &new_value) != 1) {
pr_err("%s: no all_slaves_active value specified.\n",
bond->dev->name);
@@ -1612,7 +1617,6 @@ static ssize_t bonding_store_slaves_active(struct device *d,
goto out;
}
- read_lock(&bond->lock);
bond_for_each_slave(bond, slave, iter) {
if (!bond_is_active_slave(slave)) {
if (new_value)
@@ -1621,8 +1625,8 @@ static ssize_t bonding_store_slaves_active(struct device *d,
slave->inactive = 1;
}
}
- read_unlock(&bond->lock);
out:
+ rtnl_unlock();
return ret;
}
static DEVICE_ATTR(all_slaves_active, S_IRUGO | S_IWUSR,
--
1.8.2.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox