Netdev List


* Re: The check of upper MTU limit when changing it in ip6 gre tunnel seems incorrect.
From: Oussama Ghorbel @ 2013-10-04  9:58 UTC (permalink / raw)
  To: netdev
In-Reply-To: <CABfLueFJWv0OvcDmP7TMTF4wHjjv5QW_Qt+ULSDBciX==HsXQw@mail.gmail.com>

I have sent this patch to address the issue.

[PATCH] Fix the upper MTU limit in ipv6 GRE tunnel

On Wed, Oct 2, 2013 at 5:44 PM, Oussama Ghorbel <ou.ghorbel@gmail.com> wrote:
> The check of upper MTU limit when changing it in ip6 gre tunnel seems incorrect.
> The function in question is:
>
> static int ip6gre_tunnel_change_mtu(struct net_device *dev, int new_mtu)
> {
>     struct ip6_tnl *tunnel = netdev_priv(dev);
>
>     if (new_mtu < 68 ||
>         new_mtu > 0xFFF8 - dev->hard_header_len - tunnel->hlen)
>         return -EINVAL;
>     dev->mtu = new_mtu;
>     return 0;
> }
>
> However the dev->hard_header_len and tunnel->hlen are initialized in
> the following way in ip6gre_tnl_link_config():
>
> int addend = sizeof(struct ipv6hdr) + 4;
> ...
> dev->hard_header_len = rt->dst.dev->hard_header_len + addend;
> ...
> t->hlen = addend; // t is ip6_tnl pointer
>
> As you see the information t->hlen is already included in
> dev->hard_header_len, so why calculate it twice?
>
> Thanks

^ permalink raw reply
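
The double counting described in this message can be checked with a little
user-space arithmetic. The following is only a sketch, assuming an Ethernet
lower device with a 14-byte header; the macro and function names are
illustrative and are not kernel symbols.

```c
#include <assert.h>

#define TOY_IPV6_HDR_LEN 40      /* sizeof(struct ipv6hdr) */
#define TOY_GRE_BASE_LEN 4       /* the "+ 4" in addend */
#define TOY_MTU_CEILING  0xFFF8  /* the bare constant used in the check */

/* Upper limit as currently checked: tunnel->hlen is subtracted on top of
 * dev->hard_header_len, which already contains it. */
static int toy_limit_current(int lower_hard_header_len)
{
    int addend = TOY_IPV6_HDR_LEN + TOY_GRE_BASE_LEN;
    int hard_header_len = lower_hard_header_len + addend;
    int hlen = addend;

    assert(lower_hard_header_len >= 0);
    return TOY_MTU_CEILING - hard_header_len - hlen;
}

/* Upper limit when the tunnel overhead is counted only once. */
static int toy_limit_counted_once(int lower_hard_header_len)
{
    int addend = TOY_IPV6_HDR_LEN + TOY_GRE_BASE_LEN;
    int hard_header_len = lower_hard_header_len + addend;

    assert(lower_hard_header_len >= 0);
    return TOY_MTU_CEILING - hard_header_len;
}
```

With a 14-byte lower header the two limits differ by exactly the 44 bytes of
addend, which is the amount the message argues is subtracted twice.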

* Re: [PATCH net-next] dev: add support of flag IFF_NOPROC
From: Nicolas Dichtel @ 2013-10-04 12:07 UTC (permalink / raw)
  To: David Miller, stephen; +Cc: netdev
In-Reply-To: <20131003.150947.2179820478039260398.davem@davemloft.net>

On 03/10/2013 21:09, David Miller wrote:
> From: Stephen Hemminger <stephen@networkplumber.org>
> Date: Thu, 3 Oct 2013 10:46:27 -0700
>
>> What about speeding up proc or sysfs? Or providing a bulk create/destroy.
>
> +1 +1 +1
>
> This will benefit more people than just the envisioned users for
> this IFF_NOPROC thing.
>
> I really don't want to take the IFF_NOPROC approach.
>
Of course optimizing /proc and sysfs is a good option, but no optimization
will ever be as fast as disabling them for some well-known netdevices.

Note also that memory consumption is significantly lower with this flag.
For 20000 dummy interfaces:
without the flag: 463.84 MB
with the flag: 297.45 MB
The gain is 166 MB (35%).

^ permalink raw reply

* [PATCH 1/2] net: ethernet: cpsw: Search childs for slave nodes
From: Markus Pargmann @ 2013-10-04 12:44 UTC (permalink / raw)
  To: David S. Miller
  Cc: Florian Fainelli, Mugunthan V N, linux-arm-kernel, netdev, kernel,
	Markus Pargmann

The current implementation searches the whole DT for nodes named
"slave".

This patch changes it to search only child nodes for slaves.

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
---
 drivers/net/ethernet/ti/cpsw.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 79974e3..5df50a9 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1782,7 +1782,7 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
 	if (ret)
 		pr_warn("Doesn't have any child node\n");
 
-	for_each_node_by_name(slave_node, "slave") {
+	for_each_child_of_node(node, slave_node) {
 		struct cpsw_slave_data *slave_data = data->slave_data + i;
 		const void *mac_addr = NULL;
 		u32 phyid;
@@ -1791,6 +1791,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
 		struct device_node *mdio_node;
 		struct platform_device *mdio;
 
+		/* This is no slave child node, continue */
+		if (strcmp(slave_node->name, "slave"))
+			continue;
+
 		parp = of_get_property(slave_node, "phy_id", &lenp);
 		if ((parp == NULL) || (lenp != (sizeof(void *) * 2))) {
 			pr_err("Missing slave[%d] phy_id property\n", i);
-- 
1.8.4.rc3

^ permalink raw reply related
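
The difference between a tree-wide search by name and a search restricted to
direct children can be sketched in user space. This is a toy model under
stated assumptions: struct toy_node and both helpers are hypothetical
stand-ins, not the kernel's struct device_node or its iterator macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a device-tree node. */
struct toy_node {
    const char *name;
    struct toy_node *child;    /* first child */
    struct toy_node *sibling;  /* next sibling */
};

/* Old behaviour: count every node named @name at or below @root. */
static int toy_count_by_name(const struct toy_node *root, const char *name)
{
    const struct toy_node *c;
    int n = 0;

    if (!root)
        return 0;
    if (strcmp(root->name, name) == 0)
        n++;
    for (c = root->child; c; c = c->sibling)
        n += toy_count_by_name(c, name);
    return n;
}

/* New behaviour: visit only the direct children of @parent and skip
 * those that are not named @name. */
static int toy_count_children_by_name(const struct toy_node *parent,
                                      const char *name)
{
    const struct toy_node *c;
    int n = 0;

    assert(parent != NULL);
    for (c = parent->child; c; c = c->sibling) {
        if (strcmp(c->name, name))
            continue;   /* not a slave child node */
        n++;
    }
    return n;
}
```

In a tree that holds a second, unrelated node with its own "slave" child, the
tree-wide count picks up that foreign node while the child-only count does
not.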

* [PATCH 2/2] net/ethernet: cpsw: DT read bool dual_emac
From: Markus Pargmann @ 2013-10-04 12:44 UTC (permalink / raw)
  To: David S. Miller
  Cc: Florian Fainelli, Mugunthan V N, linux-arm-kernel, netdev, kernel,
	Markus Pargmann
In-Reply-To: <1380890680-30941-1-git-send-email-mpa@pengutronix.de>

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
---
 drivers/net/ethernet/ti/cpsw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 5df50a9..804846e 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1771,8 +1771,8 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
 	}
 	data->mac_control = prop;
 
-	if (!of_property_read_u32(node, "dual_emac", &prop))
-		data->dual_emac = prop;
+	if (of_property_read_bool(node, "dual_emac"))
+		data->dual_emac = 1;
 
 	/*
 	 * Populate all the child nodes here...
-- 
1.8.4.rc3

^ permalink raw reply related
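
The patch carries no description. One plausible reading, stated here as an
assumption, is that "dual_emac" is written in the device tree as an empty
flag-style property, so a 32-bit read finds no value while a presence test
succeeds. The sketch below models that in user space; struct toy_prop and the
toy_* helpers are hypothetical and only imitate the semantics of the kernel
accessors (byte-order conversion is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a device-tree property. */
struct toy_prop {
    const char *name;
    const void *value;  /* NULL for an empty (flag-style) property */
    size_t length;      /* value length in bytes */
};

static const struct toy_prop *toy_find(const struct toy_prop *props,
                                       size_t count, const char *name)
{
    size_t i;

    assert(props != NULL || count == 0);
    for (i = 0; i < count; i++)
        if (strcmp(props[i].name, name) == 0)
            return &props[i];
    return NULL;
}

/* Models a u32 read: fails unless the property exists and carries at
 * least four bytes of data. */
static int toy_read_u32(const struct toy_prop *props, size_t count,
                        const char *name, uint32_t *out)
{
    const struct toy_prop *p = toy_find(props, count, name);

    if (!p || !p->value || p->length < sizeof(*out))
        return -1;
    memcpy(out, p->value, sizeof(*out));
    return 0;
}

/* Models a boolean read: true as soon as the property exists. */
static bool toy_read_bool(const struct toy_prop *props, size_t count,
                          const char *name)
{
    return toy_find(props, count, name) != NULL;
}
```

Under that assumption the old code would silently leave data->dual_emac
unset, because the u32 read fails on an empty property.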

* Re: [PATCH 1/2] net: ethernet: cpsw: Search childs for slave nodes
From: Mugunthan V N @ 2013-10-04 13:27 UTC (permalink / raw)
  To: Markus Pargmann, David S. Miller
  Cc: kernel, netdev, Florian Fainelli, linux-arm-kernel
In-Reply-To: <1380890680-30941-1-git-send-email-mpa@pengutronix.de>

On Friday 04 October 2013 07:44 AM, Markus Pargmann wrote:
> The current implementation searches the whole DT for nodes named
> "slave".
>
> This patch changes it to search only child nodes for slaves.
>
> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>

Regards
Mugunthan V N

^ permalink raw reply

* Re: [PATCH 2/2] net/ethernet: cpsw: DT read bool dual_emac
From: Mugunthan V N @ 2013-10-04 13:28 UTC (permalink / raw)
  To: Markus Pargmann, David S. Miller
  Cc: Florian Fainelli, linux-arm-kernel, netdev, kernel
In-Reply-To: <1380890680-30941-2-git-send-email-mpa@pengutronix.de>

On Friday 04 October 2013 07:44 AM, Markus Pargmann wrote:
> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>

Regards
Mugunthan V N

^ permalink raw reply

* Re: [PATCHv2] IPv6: Allow the MTU of ipip6 tunnel to be set below 1280
From: Hannes Frederic Sowa @ 2013-10-04 13:32 UTC (permalink / raw)
  To: Oussama Ghorbel
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, linux-kernel
In-Reply-To: <1380808166-13280-1-git-send-email-ou.ghorbel@gmail.com>

On Thu, Oct 03, 2013 at 02:49:26PM +0100, Oussama Ghorbel wrote:
> The (inner) MTU of a ipip6 (IPv4-in-IPv6) tunnel cannot be set below 1280, which is the minimum MTU in IPv6.
> However, there should be no IPv6 on the tunnel interface at all, so the IPv6 rules should not apply.
> More info at https://bugzilla.kernel.org/show_bug.cgi?id=15530
> 
> This patch checks the minimum MTU of an ipv6 tunnel according to these rules:
> -In case the tunnel is configured with ipip6 mode the minimum MTU is 68.
> -In case the tunnel is configured with ip6ip6 or any mode the minimum MTU is 1280.
> 
> Signed-off-by: Oussama Ghorbel <ou.ghorbel@gmail.com>
> ---
>  net/ipv6/ip6_tunnel.c |   12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
> index 46ba243..4b51b03 100644
> --- a/net/ipv6/ip6_tunnel.c
> +++ b/net/ipv6/ip6_tunnel.c
> @@ -1429,9 +1429,17 @@ ip6_tnl_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
>  static int
>  ip6_tnl_change_mtu(struct net_device *dev, int new_mtu)
>  {
> -	if (new_mtu < IPV6_MIN_MTU) {
> -		return -EINVAL;
> +	struct ip6_tnl *tnl = netdev_priv(dev);
> +
> +	if (tnl->parms.proto == IPPROTO_IPIP) {
> +		if (new_mtu < 68)
> +			return -EINVAL;
> +	} else {
> +		if (new_mtu < IPV6_MIN_MTU)
> +			return -EINVAL;
>  	}
> +	if (new_mtu > 0xFFF8 - dev->hard_header_len)
> +		return -EINVAL;
>  	dev->mtu = new_mtu;
>  	return 0;
>  }

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Sometime soon we should replace every 0xFFF8 with a symbolic constant.

Thanks,

  Hannes

^ permalink raw reply
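
The checks in the quoted patch, together with the suggested symbolic
constant, can be sketched in user space as follows. TOY_TNL_MTU_CEILING is a
hypothetical name invented for this illustration; the thread does not propose
a specific one. The other TOY_* macros only mirror the values of
IPPROTO_IPIP and IPV6_MIN_MTU.

```c
#include <assert.h>
#include <errno.h>

#define TOY_IPPROTO_IPIP     4       /* same value as IPPROTO_IPIP */
#define TOY_IPV4_MIN_MTU     68      /* the literal 68 in the patch */
#define TOY_IPV6_MIN_MTU     1280    /* same value as IPV6_MIN_MTU */
#define TOY_TNL_MTU_CEILING  0xFFF8  /* hypothetical name for 0xFFF8 */

/* Returns 0 if @new_mtu is acceptable for a tunnel carrying @proto over a
 * device with @hard_header_len bytes of header, -EINVAL otherwise. */
static int toy_tnl_check_mtu(int proto, int hard_header_len, int new_mtu)
{
    int min_mtu = (proto == TOY_IPPROTO_IPIP) ? TOY_IPV4_MIN_MTU
                                              : TOY_IPV6_MIN_MTU;

    assert(hard_header_len >= 0);
    if (new_mtu < min_mtu)
        return -EINVAL;
    if (new_mtu > TOY_TNL_MTU_CEILING - hard_header_len)
        return -EINVAL;
    return 0;
}
```

Naming the ceiling once keeps the lower and upper bounds readable side by
side and leaves a single place to change if the limit is ever revised.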

* Re: [PATCH v3 net-next] fix unsafe set_memory_rw from softirq
From: Eric Dumazet @ 2013-10-04 13:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Heiko Carstens, Paul Mackerras, H. Peter Anvin, sparclinux,
	Nicolas Dichtel, Alexei Starovoitov, linux-s390, Russell King,
	x86, James Morris, Ingo Molnar, Alexey Kuznetsov,
	Paul E. McKenney, Xi Wang, Matt Evans, Thomas Gleixner,
	linux-arm-kernel, Stelian Nirlu, Nicolas Schichan,
	Hideaki YOSHIFUJI, netdev, linux-kernel, David S. Miller,
	Mircea Gherzan, Daniel Borkmann <
In-Reply-To: <20131004075133.GA12313@gmail.com>



> 1)
>
> I took a brief look at arch/x86/net/bpf_jit_comp.c while reviewing this
> patch.
>
> You need to split up bpf_jit_compile(), it's an obscenely large, ~600
> lines long function. We don't do that in modern, maintainable kernel code.
>
> 2)
>
> This 128 bytes extra padding:
>
>         /* Most of BPF filters are really small,
>          * but if some of them fill a page, allow at least
>          * 128 extra bytes to insert a random section of int3
>          */
>         sz = round_up(proglen + sizeof(*header) + 128, PAGE_SIZE);
>
> why is it done? It's not clear to me from the comment.
>

commit 314beb9bcabfd6b4542ccbced2402af2c6f6142a
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri May 17 16:37:03 2013 +0000

    x86: bpf_jit_comp: secure bpf jit against spraying attacks

    hpa bringed into my attention some security related issues
    with BPF JIT on x86.

    This patch makes sure the bpf generated code is marked read only,
    as other kernel text sections.

    It also splits the unused space (we vmalloc() and only use a fraction of
    the page) in two parts, so that the generated bpf code not starts at a
    known offset in the page, but a pseudo random one.

    Refs:

http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html

^ permalink raw reply
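
The role of the 128 extra bytes can be sketched with plain arithmetic: they
guarantee that some unused space is left in the allocation even when the
program nearly fills a page, so the start of the generated code can be moved
to a random offset. This is a simplified user-space model, not the code in
bpf_jit_comp.c; the function names and the 4096-byte page size are
assumptions, and the random number is passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

/* Round @x up to the next multiple of @align, which must be a power of two. */
static size_t toy_round_up(size_t x, size_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Allocation size for a program of @proglen bytes preceded by a header
 * of @hdrlen bytes, including the 128 bytes of guaranteed slack. */
static size_t toy_alloc_size(size_t proglen, size_t hdrlen)
{
    return toy_round_up(proglen + hdrlen + 128, TOY_PAGE_SIZE);
}

/* Offset of the program inside the allocation for random value @rnd. */
static size_t toy_image_offset(size_t proglen, size_t hdrlen,
                               unsigned int rnd)
{
    size_t sz = toy_alloc_size(proglen, hdrlen);
    size_t hole = sz - (proglen + hdrlen);  /* unused bytes */

    assert(hole >= 128);
    return hdrlen + rnd % hole;
}
```

Without the extra 128 bytes, a 4090-byte program with a 4-byte header would
fit in one page with only 2 bytes to spare, leaving almost no room to vary
the start offset.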

* RE: [PATCH] xen-netback: Don't destroy the netdev until the vif is shut
From: Paul Durrant @ 2013-10-04 14:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org, Wei Liu
In-Reply-To: <20131004.002248.789391156050342547.davem@davemloft.net>

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: 04 October 2013 05:23
> To: Paul Durrant
> Cc: netdev@vger.kernel.org; Wei Liu
> Subject: Re: [PATCH] xen-netback: Don't destroy the netdev until the vif is
> shut
> 
> 
> Can one of you do -stable backports of this change for v3.11 and any
> other active -stable branch where this fix is relevant?
> 

Ok. Will do.

  Paul

> I gave it a shot, and it got outside my realm of expertise.
> 
> Thanks!

^ permalink raw reply

* [PATCH] ipv4: fix ineffective source address selection
From: Jiri Benc @ 2013-10-04 15:04 UTC (permalink / raw)
  To: netdev

When sending out multicast messages, the source address in inet->mc_addr is
ignored and rewritten by an autoselected one. This is caused by a typo in
commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output
route lookups").

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/ipv4/route.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 727f436..6011615 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2072,7 +2072,7 @@ struct rtable *__ip_route_output_key(struct net *net, struct flowi4 *fl4)
 							      RT_SCOPE_LINK);
 			goto make_route;
 		}
-		if (fl4->saddr) {
+		if (!fl4->saddr) {
 			if (ipv4_is_multicast(fl4->daddr))
 				fl4->saddr = inet_select_addr(dev_out, 0,
 							      fl4->flowi4_scope);
-- 
1.7.6.5

^ permalink raw reply related
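
The effect of the inverted test can be shown with a reduced model of the
decision: a source address should only be selected automatically when the
caller did not supply one. Both functions below are hypothetical user-space
stand-ins; the real code path also depends on the destination address and
the output device.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed logic: keep the caller's source address when one was given,
 * otherwise fall back to the automatically selected one. */
static uint32_t toy_pick_saddr(uint32_t requested, uint32_t autoselected)
{
    assert(autoselected != 0);
    if (!requested)
        return autoselected;
    return requested;
}

/* Inverted test, kept for comparison: a requested address is overwritten
 * and an unset one stays unset. */
static uint32_t toy_pick_saddr_buggy(uint32_t requested, uint32_t autoselected)
{
    assert(autoselected != 0);
    if (requested)
        return autoselected;
    return requested;
}
```

The buggy variant reproduces the reported symptom: an explicitly chosen
source address is replaced by the autoselected one.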

* Re: [PATCH] veth: Showing peer of veth type dev in ip link (kernel side)
From: Nicolas Dichtel @ 2013-10-04 15:21 UTC (permalink / raw)
  To: Masatake YAMATO, netdev
In-Reply-To: <1380859529-32351-1-git-send-email-yamato@redhat.com>

On 04/10/2013 06:05, Masatake YAMATO wrote:
> ip link can show extra information about a network device if the
> kernel provides such information. With this patch the veth driver can
> provide its peer ifindex to the ip command via the netlink
> interface.
But the ifindex can be interpreted only when you know the related netns, and
the veth peer is probably in another netns.
How can the netdevice be found in userland in this case?

^ permalink raw reply

* Re: [PATCH 1/2] net: ethernet: cpsw: Search childs for slave nodes
From: Peter Korsgaard @ 2013-10-04 15:24 UTC (permalink / raw)
  To: Markus Pargmann
  Cc: David S. Miller, Mugunthan V N, netdev, kernel, Florian Fainelli,
	linux-arm-kernel
In-Reply-To: <1380890680-30941-1-git-send-email-mpa@pengutronix.de>

>>>>> "Markus" == Markus Pargmann <mpa@pengutronix.de> writes:

 Markus> The current implementation searches the whole DT for nodes named
 Markus> "slave".

 Markus> This patch changes it to search only child nodes for slaves.

 Markus> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>

Acked-by: Peter Korsgaard <jacmet@sunsite.dk>

-- 
Bye, Peter Korsgaard

^ permalink raw reply

* Re: [PATCH 2/2] net/ethernet: cpsw: DT read bool dual_emac
From: Peter Korsgaard @ 2013-10-04 15:24 UTC (permalink / raw)
  To: Markus Pargmann
  Cc: David S. Miller, Mugunthan V N, netdev, kernel, Florian Fainelli,
	linux-arm-kernel
In-Reply-To: <1380890680-30941-2-git-send-email-mpa@pengutronix.de>

>>>>> "Markus" == Markus Pargmann <mpa@pengutronix.de> writes:

 Markus> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>

Acked-by: Peter Korsgaard <jacmet@sunsite.dk>

-- 
Bye, Peter Korsgaard

^ permalink raw reply

* Re: [PATCH] ipv4: fix ineffective source address selection
From: Eric Dumazet @ 2013-10-04 15:49 UTC (permalink / raw)
  To: Jiri Benc; +Cc: netdev
In-Reply-To: <245895a0777442b56ecea1453be041aa1b31c5a2.1380898983.git.jbenc@redhat.com>

On Fri, 2013-10-04 at 17:04 +0200, Jiri Benc wrote:
> When sending out multicast messages, the source address in inet->mc_addr is
> ignored and rewritten by an autoselected one. This is caused by a typo in
> commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output
> route lookups").
> 
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---

Nice catch !

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [stable 3.0] add some CVE fixes
From: Greg KH @ 2013-10-04 16:03 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Jiri Slaby, stable, netdev
In-Reply-To: <1380860287.1441.101.camel@deadeye.wl.decadent.org.uk>

On Fri, Oct 04, 2013 at 05:18:07AM +0100, Ben Hutchings wrote:
> On Thu, 2013-10-03 at 21:11 +0200, Jiri Slaby wrote:
> > On 10/03/2013 08:44 PM, Greg KH wrote:
> > > On Thu, Oct 03, 2013 at 11:20:28AM +0200, Jiri Slaby wrote:
> > >> Plus the backports that are replied to this mail?
> > > 
> > > I don't see any backports, did you forget to send them?
> > 
> > I don't think so, they were sent and this is a log of one of them:
> > OK. Log says:
> > Sendmail: /usr/sbin/sendmail -f jslaby@suse.cz -i stable@vger.kernel.org
> > jslaby@suse.cz
> > From: Jiri Slaby <jslaby@suse.cz>
> > To: <stable@vger.kernel.org>
> > Cc: jslaby@suse.cz
> > Subject: [PATCH 4/4] Tools: hv: verify origin of netlink connector message
> > Date: Thu,  3 Oct 2013 11:23:50 +0200
> > Message-Id: <1380792230-27255-4-git-send-email-jslaby@suse.cz>
> > X-Mailer: git-send-email 1.8.4
> > In-Reply-To: <1380792230-27255-1-git-send-email-jslaby@suse.cz>
> > References: <524D36DC.5070506@suse.cz>
> >  <1380792230-27255-1-git-send-email-jslaby@suse.cz>
> > 
> > Result: OK
> > 
> > Could you check your spam folder? Or I can bounce them directly to you?
> 
> They didn't hit the list either.  If they're applicable to 3.2 as well
> then could you send them this way?

I did not find these anywhere in any spam filter, and as they didn't hit
the list either, I'd blame an email server on your end somewhere.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH v2.42 1/5] odp: Allow VLAN actions after MPLS actions
From: Ben Pfaff @ 2013-10-04 16:21 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev, Ravi K, netdev, Isaku Yamahata
In-Reply-To: <1380874200-8981-2-git-send-email-horms@verge.net.au>

On Fri, Oct 04, 2013 at 05:09:56PM +0900, Simon Horman wrote:
> From: Joe Stringer <joe@wand.net.nz>
> 
> OpenFlow 1.1, 1.2, and 1.3 differ in their handling of MPLS actions in the
> presence of VLAN tags. To allow correct behaviour to be committed in
> each situation, this patch adds a second round of VLAN tag action
> handling to commit_odp_actions(), which occurs after MPLS actions. This
> is implemented with a new field in 'struct xlate_in' called 'vlan_tci'.
> 
> When a push_mpls action is composed, the flow's current VLAN state is
> stored into xin->vlan_tci, and flow->vlan_tci is set to 0 (pop_vlan). If
> a VLAN tag is present, it is stripped; if not, then there is no change.
> Any later modifications to the VLAN state are written to xin->vlan_tci.
> When committing the actions, flow->vlan_tci is used before MPLS actions,
> and xin->vlan_tci is used afterwards. This retains the current datapath
> behaviour, but allows VLAN actions to be applied in a more flexible
> manner.
> 
> Both before and after this patch MPLS LSEs are pushed onto a packet after
> any VLAN tags that may be present. This is the behaviour described in
> OpenFlow 1.1 and 1.2. OpenFlow 1.3 specifies that MPLS LSEs should be
> pushed onto a packet before any VLAN tags that are present. Support
> for this will be added by a subsequent patch that makes use of
> the infrastructure added by this patch.
> 
> Signed-off-by: Joe Stringer <joe@wand.net.nz>
> Signed-off-by: Simon Horman <horms@verge.net.au>

I noticed a couple more minor points.

First, it seems to me that the "vlan_tci" member that this adds to
xlate_in could go in xlate_ctx just as well.  I would prefer that,
because (as far as I can tell) there is no reason for the client to use
any value other than flow->vlan_tci here, and putting it in xlate_ctx
hides it from the client.

Thanks for rearranging the code and updating the comment in
do_xlate_actions().  It makes the operation clearer.  But now that it's
clear I have an additional question.  Does it really make sense to have
'vlan_tci' as only a local variable in do_xlate_actions()?  Presumably,
MPLS and VLANs should interact the same way regardless of whether they
are separated by resubmits or goto_tables.  That is, I suspect that this
is xlate_ctx state, not local state.

Thanks,

Ben.

^ permalink raw reply

* [PATCH net-next] xen-netback: fix xenvif_count_skb_slots()
From: Paul Durrant @ 2013-10-04 16:26 UTC (permalink / raw)
  To: xen-devel, netdev
  Cc: Paul Durrant, Xi Xiong, Matt Wilson, Annie Li, Wei Liu,
	Ian Campbell

Commit 4f0581d25827d5e864bcf07b05d73d0d12a20a5c introduced an error into
xenvif_count_skb_slots() for skbs with a linear area spanning a page
boundary. The alignment of skb->data needs to be taken into account, not
just the head length. This patch fixes the issue by dry-running the code
from xenvif_gop_skb() (and adjusting the comment above the function to note
that).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Xi Xiong <xixiong@amazon.com>
Cc: Matt Wilson <msw@amazon.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>

---
 drivers/net/xen-netback/netback.c |   17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index d0b0feb..6f680f4 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -223,15 +223,28 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
 /*
  * Figure out how many ring slots we're going to need to send @skb to
  * the guest. This function is essentially a dry run of
- * xenvif_gop_frag_copy.
+ * xenvif_gop_skb.
  */
 unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
 {
+	unsigned char *data;
 	unsigned int count;
 	int i, copy_off;
 	struct skb_cb_overlay *sco;
 
-	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
+	count = 0;
+
+	data = skb->data;
+	while (data < skb_tail_pointer(skb)) {
+		unsigned int offset = offset_in_page(data);
+		unsigned int len = PAGE_SIZE - offset;
+
+		if (data + len > skb_tail_pointer(skb))
+			len = skb_tail_pointer(skb) - data;
+
+		count++;
+		data += len;
+	}
 
 	copy_off = skb_headlen(skb) % PAGE_SIZE;
 
-- 
1.7.10.4

^ permalink raw reply related
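
Why the start alignment matters can be seen by comparing the two ways of
counting in isolation. The sketch below is a user-space model under the
assumption of 4096-byte pages; the function names are illustrative, and the
walk mirrors the loop added by the patch rather than reusing any driver code.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096u

/* Previous estimate: depends on the length of the linear area only. */
static unsigned int toy_slots_by_length(unsigned int headlen)
{
    return (headlen + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/* Walk the linear area page by page. @offset is where the data starts
 * within its first page, @headlen is the number of bytes to cover. */
static unsigned int toy_slots_by_walk(unsigned int offset,
                                      unsigned int headlen)
{
    unsigned int count = 0;

    assert(offset < TOY_PAGE_SIZE);
    while (headlen > 0) {
        unsigned int len = TOY_PAGE_SIZE - offset;

        if (len > headlen)
            len = headlen;
        count++;
        headlen -= len;
        offset = 0;  /* every later chunk starts on a page boundary */
    }
    return count;
}
```

A 200-byte linear area that starts 4000 bytes into a page crosses a page
boundary and needs two chunks, while the length-only estimate reports one.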

* USB ethernet (smsc95xx) transmit timeout errors
From: David Laight @ 2013-10-04 16:28 UTC (permalink / raw)
  To: netdev

We are seeing some 'transmit timeout' errors from the smsc95xx driver.
I've found references to similar problems on the RPI, but no explicit
resolution.

Our systems are Intel i7 motherboards with the USB chip on a 'front panel'
board (together with three USB2 sockets) plugged into one of the
motherboard's USB3 sockets.
(The silicon has some kind of clever USB hub in it.)

Initial failures were with the 3.2 kernel from Ubuntu 12.04, but I've
updated one of the systems that failed to a 3.8 kernel and it still
fails.

Does this problem 'ring a bell' with anyone, and is it likely to have
been fixed in a more recent kernel?

Even with heavy traffic the systems might run for 24 hours before
failing - so proving a system is 'fixed' is difficult.

	David

^ permalink raw reply

* Re: [PATCH v2.42 2/5] ofp-actions: Add separate OpenFlow 1.3 action parser
From: Ben Pfaff @ 2013-10-04 16:31 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev, netdev, Jesse Gross, Pravin B Shelar, Ravi K, Isaku Yamahata,
	Joe Stringer
In-Reply-To: <1380874200-8981-3-git-send-email-horms@verge.net.au>

On Fri, Oct 04, 2013 at 05:09:57PM +0900, Simon Horman wrote:
> From: Joe Stringer <joe@wand.net.nz>
> 
> This patch adds new ofpact_from_openflow13() and
> ofpacts_from_openflow13() functions parallel to the existing ofpact
> handling code. In the OpenFlow 1.3 version, push_mpls is handled
> differently, but all other actions are handled by the existing code.
> 
> In the case of push_mpls for OpenFlow 1.3 the new mpls_before_vlan field of
> struct ofpact_push_mpls is set to true.  This will be used by a subsequent
> patch to allow the correct VLAN+MPLS datapath behaviour to be
> determined at odp translation time.
> 
> Signed-off-by: Joe Stringer <joe@wand.net.nz>
> Signed-off-by: Simon Horman <horms@verge.net.au>

The need for a huge comment on mpls_before_vlan suggests to me that
breaking it out as a separate type would be helpful.  How about this:

/* The position in the packet at which to insert an MPLS header.
 *
 * Used for NXAST_PUSH_MPLS, OFPAT11_PUSH_MPLS. */
enum ofpact_mpls_position {
    /* Add the MPLS LSE after the Ethernet header but before any VLAN tags.
     * OpenFlow 1.3+ requires this behavior. */
    OFPACT_MPLS_BEFORE_VLAN,

    /* Add the MPLS LSE after the Ethernet header and any VLAN tags.
     * OpenFlow 1.1 and 1.2 require this behavior. */
    OFPACT_MPLS_AFTER_VLAN
};

/* OFPACT_PUSH_VLAN/MPLS/PBB
 *
 * Used for NXAST_PUSH_MPLS, OFPAT11_PUSH_MPLS. */
struct ofpact_push_mpls {
    struct ofpact ofpact;
    ovs_be16 ethertype;
    enum ofpact_mpls_position position;
};

Thanks,

Ben.

^ permalink raw reply

* Re: [PATCH v2.42 3/5] lib: Support pushing of MPLS LSE before or after VLAN tag
From: Ben Pfaff @ 2013-10-04 16:32 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev, netdev, Jesse Gross, Pravin B Shelar, Ravi K, Isaku Yamahata,
	Joe Stringer
In-Reply-To: <1380874200-8981-4-git-send-email-horms@verge.net.au>

On Fri, Oct 04, 2013 at 05:09:58PM +0900, Simon Horman wrote:
> From: Joe Stringer <joe@wand.net.nz>
> 
> This patch modifies the push_mpls behaviour to allow
> pushing of an MPLS LSE either before or after any VLAN tag that may be
> present.
> 
> Pushing the MPLS LSE before any VLAN tag that is present is the
> behaviour specified in OpenFlow 1.3.
> 
> Pushing the MPLS LSE after any VLAN tag that is present is the
> behaviour specified in OpenFlow 1.1 and 1.2. This is the only behaviour
> that was supported prior to this patch.
> 
> When a push_mpls action has been inserted using OpenFlow 1.2 or earlier
> the behaviour of pushing the MPLS LSE before any VLAN tag that may be
> present is implemented by inserting VLAN actions around the MPLS push
> action during odp translation: VLAN tags are popped before committing MPLS
> actions, and the expected VLAN tag is pushed afterwards.
> 
> The trigger condition for the two different behaviours is the value of the
> mpls_before_vlan field of struct ofpact_push_mpls.  This field is set when
> parsing OpenFlow actions.
> 
> Signed-off-by: Joe Stringer <joe@wand.net.nz>
> Signed-off-by: Simon Horman <horms@verge.net.au>

I'm happy with this, I think.  It will need a trivial update if you take
my suggestion on patch 2.

^ permalink raw reply

* Re: [PATCH 0/3] net: mv643xx_eth: various small fixes for v3.12
From: Ezequiel Garcia @ 2013-10-04 16:53 UTC (permalink / raw)
  To: Sebastian Hesselbarth
  Cc: Jason Cooper, netdev, linux-kernel, Lennert Buytenhek,
	David Miller, linux-arm-kernel
In-Reply-To: <1380711442-24735-1-git-send-email-sebastian.hesselbarth@gmail.com>

On Wed, Oct 02, 2013 at 12:57:19PM +0200, Sebastian Hesselbarth wrote:
> This patch set comprises some one-liners to fix issues with repeated
> loading and unloading of a modular mv643xx_eth driver.
> 
> First two patches take care of the periodic port statistic timer, that
> updates statistics by reading port registers using add_timer/mod_timer.
> 
> Patch 1 moves timer re-schedule from mib_counters_update to the timer
> callback. As mib_counters_update is also called from non-timer context,
> this ensures the timer is reactivated from timer context only.
> 
> Patch 2 moves initial timer schedule from _probe() time to right before
> the port is actually started as the corresponding del_timer_sync is at
> _stop() time. This fixes a regression, where unloading the driver from a
> non-started eth device can cause the timer to access deallocated mem.
> 
> Patch 3 adds an assignment of the port's device_node to the corresponding
> self-created platform_device. This is required to allow fixups based on
> the device_node's compatible string later. Actually, it is also a potential
> regression because we already check the compatible string for Kirkwood, but
> do not (yet) rely on the fixup.
> 
> All patches are based on v3.12-rc3 and have been tested on Kirkwood-based
> Seagate Dockstar.
> 
> Patches 1 and 2 could also be queued up for -stable.
> 
> Sebastian Hesselbarth (3):
>   net: mv643xx_eth: update statistics timer from timer context only
>   net: mv643xx_eth: fix orphaned statistics timer crash
>   net: mv643xx_eth: fix missing device_node for port devices
> 
>  drivers/net/ethernet/marvell/mv643xx_eth.c |    7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> ---
> Cc: David Miller <davem@davemloft.net>
> Cc: Lennert Buytenhek <buytenh@wantstofly.org>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: netdev@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org

Sorry for missing this series, somehow my filters failed to catch it.

Anyway, I would expect these patches to have my Reported-by, but I'm
glad to see you've solved all the issues.

I see David has already applied them all, but I'll re-run my tests to
check there aren't any more issues if I can find some time next week.

Thanks for taking care of it!
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH net-next] dev: add support of flag IFF_NOPROC
From: David Miller @ 2013-10-04 17:29 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: stephen, netdev
In-Reply-To: <524EAF64.8000801@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Fri, 04 Oct 2013 14:07:00 +0200

> Of course optimizing /proc and /sysfs is a good option, but any
> optimizations
> will never be as fast as disabling them for some well known
> netdevices.
> 
> Note also that the memory consumption is significantly less with this
> flag:

It potentially breaks tools, it's a non-starter, sorry.

^ permalink raw reply

* [PATCH] tcp: do not forget FIN in tcp_shifted_skb()
From: Eric Dumazet @ 2013-10-04 17:31 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell, Ilpo Järvinen, Yuchung Cheng

From: Eric Dumazet <edumazet@google.com>

Yuchung found the following problem:

 There are bugs in the SACK processing code, in the merging part in
 tcp_shift_skb_data(), that incorrectly reset or ignore the SACKed
 skb's FIN flag. When a receiver first SACKs the FIN sequence and later
 throws away its ofo queue (e.g., SACK reneging), the sender stops
 retransmitting the FIN flag and hangs forever.

The following packetdrill test can be used to reproduce the bug.

$ cat sack-merge-bug.pkt
`sysctl -q net.ipv4.tcp_fack=0`

// Establish a connection and send 10 MSS.
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+.000 bind(3, ..., ...) = 0
+.000 listen(3, 1) = 0

+.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.001 < . 1:1(0) ack 1 win 1024
+.000 accept(3, ..., ...) = 4

+.100 write(4, ..., 12000) = 12000
+.000 shutdown(4, SHUT_WR) = 0
+.000 > . 1:10001(10000) ack 1
+.050 < . 1:1(0) ack 2001 win 257
+.000 > FP. 10001:12001(2000) ack 1
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
// SACK reneg
+.050 < . 1:1(0) ack 12001 win 257
+0 %{ print "unacked: ",tcpi_unacked }%
+5 %{ print "" }%


First, a typo swapped the left and right operands of one OR operation;
second, the code forgot to advance end_seq when the merged skb carried FIN.

Bug was added in 2.6.29 by commit 832d11c5cd076ab
("tcp: Try to restore large SKBs while SACK processing")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 25a89ea..113dc5f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1284,7 +1284,10 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
 		tp->lost_cnt_hint -= tcp_skb_pcount(prev);
 	}
 
-	TCP_SKB_CB(skb)->tcp_flags |= TCP_SKB_CB(prev)->tcp_flags;
+	TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
+	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
+		TCP_SKB_CB(prev)->end_seq++;
+
 	if (skb == tcp_highest_sack(sk))
 		tcp_advance_highest_sack(sk, skb);
 

^ permalink raw reply related
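
The two fixes can be illustrated with a reduced model of merging one segment
into its predecessor. struct toy_seg and toy_merge() are hypothetical
user-space stand-ins for the skb control block and the merge step; only the
handling of flags and end_seq is modelled, and the whole segment is assumed
to be merged.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_FLAG_FIN 0x01u  /* same bit value as TCPHDR_FIN */
#define TOY_FLAG_PSH 0x08u  /* same bit value as TCPHDR_PSH */

struct toy_seg {
    uint32_t seq;      /* first sequence number covered */
    uint32_t end_seq;  /* one past the last one covered, FIN included */
    uint8_t flags;
};

/* Merge all of @skb into @prev, which must end where @skb starts. */
static void toy_merge(struct toy_seg *prev, struct toy_seg *skb)
{
    uint32_t data_len = skb->end_seq - skb->seq;

    /* A FIN occupies one sequence number but carries no data. */
    if (skb->flags & TOY_FLAG_FIN)
        data_len--;

    assert(prev->end_seq == skb->seq);

    /* Shifting the payload moves the boundary by the data length only. */
    prev->end_seq += data_len;
    skb->seq += data_len;

    /* Fix 1: the flags flow from the merged segment into @prev. */
    prev->flags |= skb->flags;

    /* Fix 2: account for the sequence number taken by the FIN. */
    if (skb->flags & TOY_FLAG_FIN)
        prev->end_seq++;
}
```

Using the numbers from the packetdrill script, merging 11001:12002 (FIN set)
into 10001:11001 yields a segment that ends at 12002 and still carries FIN,
so the sender keeps track of the FIN it may need to retransmit.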

* Re: [PATCH 0/3] net: mv643xx_eth: various small fixes for v3.12
From: Sebastian Hesselbarth @ 2013-10-04 17:34 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: Jason Cooper, netdev, linux-kernel, Lennert Buytenhek,
	David Miller, linux-arm-kernel
In-Reply-To: <20131004165354.GA9065@localhost>

On 10/04/2013 06:53 PM, Ezequiel Garcia wrote:
> On Wed, Oct 02, 2013 at 12:57:19PM +0200, Sebastian Hesselbarth wrote:
>> This patch set comprises some one-liners to fix issues with repeated
>> loading and unloading of a modular mv643xx_eth driver.
>>
>> First two patches take care of the periodic port statistic timer, that
>> updates statistics by reading port registers using add_timer/mod_timer.
>>
>> Patch 1 moves timer re-schedule from mib_counters_update to the timer
>> callback. As mib_counters_update is also called from non-timer context,
>> this ensures the timer is reactivated from timer context only.
>>
>> Patch 2 moves initial timer schedule from _probe() time to right before
>> the port is actually started as the corresponding del_timer_sync is at
>> _stop() time. This fixes a regression, where unloading the driver from a
>> non-started eth device can cause the timer to access deallocated mem.
>>
>> Patch 3 adds an assignment of the ports device_node to the corresponding
>> self-created platform_device. This is required to allow fixups based on
> >> the device_node's compatible string later. Actually, it is also a potential
> >> regression because we already check the compatible string for Kirkwood, but
> >> do not (yet) rely on the fixup.
>>
>> All patches are based on v3.12-rc3 and have been tested on Kirkwood-based
>> Seagate Dockstar.
>>
> >> Patches 1 and 2 can also possibly be queued up for -stable.
>>
>> Sebastian Hesselbarth (3):
>>    net: mv643xx_eth: update statistics timer from timer context only
>>    net: mv643xx_eth: fix orphaned statistics timer crash
>>    net: mv643xx_eth: fix missing device_node for port devices
>>
>>   drivers/net/ethernet/marvell/mv643xx_eth.c |    7 +++----
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> ---
>> Cc: David Miller <davem@davemloft.net>
>> Cc: Lennert Buytenhek <buytenh@wantstofly.org>
>> Cc: Jason Cooper <jason@lakedaemon.net>
>> Cc: netdev@vger.kernel.org
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>
> Sorry for missing this series, somehow my filters failed to catch it.
>
> Anyway, I would expect these patches to have my Reported-by, but I'm
> glad to see you've solved all the issues.

True, thanks to Ezequiel for initially reporting the issues. I will
have to be more careful about the Reported-by next time.

> I see David has already applied them all, but I'll re-run my tests to
> check there aren't any more issues if I can find some time next week.
>
> Thanks for taking care of it!


* Re: [PATCH v3 net-next] fix unsafe set_memory_rw from softirq
From: Ingo Molnar @ 2013-10-04 17:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Benjamin Herrenschmidt, Heiko Carstens, Paul Mackerras,
	H. Peter Anvin, sparclinux, Nicolas Dichtel, Alexei Starovoitov,
	linux-s390, Russell King, x86, James Morris, Ingo Molnar,
	Alexey Kuznetsov, Paul E. McKenney, Xi Wang, Matt Evans,
	Thomas Gleixner, linux-arm-kernel, Stelian Nirlu,
	Nicolas Schichan, Hideaki YOSHIFUJI, netdev, linux-kernel,
	David S. Miller, Mi
In-Reply-To: <CANn89iKkbR0_HaofvC_OVvqRv_Hqj3rATx-Z_4xXeusOasa56g@mail.gmail.com>


* Eric Dumazet <edumazet@google.com> wrote:

> 1)
> >
> > I took a brief look at arch/x86/net/bpf_jit_comp.c while reviewing this
> > patch.
> >
> > You need to split up bpf_jit_compile(), it's an obscenely large, ~600
> > lines long function. We don't do that in modern, maintainable kernel code.
> >
> > 2)
> >
> > This 128 bytes extra padding:
> >
> >         /* Most of BPF filters are really small,
> >          * but if some of them fill a page, allow at least
> >          * 128 extra bytes to insert a random section of int3
> >          */
> >         sz = round_up(proglen + sizeof(*header) + 128, PAGE_SIZE);
> >
> > why is it done? It's not clear to me from the comment.
> >
> 
> commit 314beb9bcabfd6b4542ccbced2402af2c6f6142a
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Fri May 17 16:37:03 2013 +0000
> 
>     x86: bpf_jit_comp: secure bpf jit against spraying attacks
> 
>     hpa brought to my attention some security related issues
>     with BPF JIT on x86.
> 
>     This patch makes sure the bpf generated code is marked read only,
>     as other kernel text sections.
> 
>     It also splits the unused space (we vmalloc() and only use a fraction of
>     the page) in two parts, so that the generated bpf code does not start at a
>     known offset in the page, but at a pseudo random one.

Thanks for the explanation - that makes sense.
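
For reference, the layout described in that commit can be sketched in
user-space C roughly as follows. This is an illustration under stated
assumptions, not the code in arch/x86/net/bpf_jit_comp.c: malloc() stands
in for the kernel allocator, the random value is passed in by the caller,
and the sketch_* names and SKETCH_PAGE_SIZE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the sketch */

/* Hypothetical header placed at the start of the allocation. */
struct sketch_header {
	unsigned int pages;     /* size of the allocation in pages */
	unsigned char image[];  /* padding + generated code live here */
};

static size_t sketch_round_up(size_t n, size_t align)
{
	return (n + align - 1) / align * align;
}

/*
 * Allocate room for proglen bytes of generated code plus at least 128
 * spare bytes, fill everything with int3 (0xcc), and return a start
 * address inside the spare space chosen from 'rnd'.
 */
static unsigned char *sketch_alloc_image(size_t proglen, unsigned int rnd,
					 struct sketch_header **hdr_out)
{
	size_t sz = sketch_round_up(proglen + sizeof(struct sketch_header) + 128,
				    SKETCH_PAGE_SIZE);
	struct sketch_header *hdr = malloc(sz);
	size_t hole;

	if (!hdr)
		return NULL;

	memset(hdr, 0xcc, sz);  /* unused bytes trap if executed */
	hdr->pages = (unsigned int)(sz / SKETCH_PAGE_SIZE);

	/* Bytes left over once the header and the program are accounted for. */
	hole = sz - (proglen + sizeof(*hdr));

	*hdr_out = hdr;
	/* Offset is in [0, hole), so the program always fits in the buffer. */
	return &hdr->image[rnd % hole];
}
```

The extra 128 bytes guarantee that the hole is never empty, even when the
program alone would fill a page, so there is always some int3 padding to
split before and after the generated code.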

	Ingo

