Netdev List


* Re: [PATCH 10/11] tcp: Always set options to 0 before calling tcp_established_options
From: David Miller @ 2013-10-02 20:33 UTC (permalink / raw)
  To: andi; +Cc: linux-kernel, ak, netdev
In-Reply-To: <1380572952-30729-11-git-send-email-andi@firstfloor.org>

From: Andi Kleen <andi@firstfloor.org>
Date: Mon, 30 Sep 2013 13:29:11 -0700

> From: Andi Kleen <ak@linux.intel.com>
> 
> tcp_established_options assumes opts->options is 0 before calling,
> as it read modify writes it.
> 
> For the tcp_current_mss() case the opts structure is not zeroed,
> so this can be done with uninitialized values.
> 
> This is ok, because ->options is not read in this path.
> But it's still better to avoid the operation on the uninitialized
> field. This shuts up a static code analyzer, and presumably
> may help the optimizer.
> 
> Cc: netdev@vger.kernel.org
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

Applied.
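
A minimal userspace sketch of the pattern discussed above: a helper that ORs flags into a field reads whatever value was already there, so the caller clears the field first even when it only wants the returned size. The struct, flag names, and byte sizes here are hypothetical stand-ins, not the kernel's definitions.

```c
/* Hypothetical stand-in for the option struct; not the kernel's layout. */
struct out_options_sketch {
    unsigned int options;   /* bit flags, updated with |= */
};

#define OPT_A_SKETCH (1u << 0)  /* illustrative flag values */
#define OPT_B_SKETCH (1u << 1)

/* Read-modify-write on opts->options: only well defined if the
 * caller initialized the field before the call. */
static unsigned int fill_options_sketch(struct out_options_sketch *opts,
                                        int want_a, int want_b)
{
    unsigned int size = 0;

    if (want_a) {
        opts->options |= OPT_A_SKETCH;
        size += 12;     /* illustrative option length */
    }
    if (want_b) {
        opts->options |= OPT_B_SKETCH;
        size += 4;      /* illustrative option length */
    }
    return size;
}

/* A caller that only needs the size still zeroes the field, so the
 * |= above never touches an uninitialized value. */
static unsigned int header_size_sketch(int want_a, int want_b)
{
    struct out_options_sketch opts;

    opts.options = 0;
    return fill_options_sketch(&opts, want_a, want_b);
}
```

The result is the same with or without the store, since only the size is used; the store just keeps analyzers quiet about the uninitialized read.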

^ permalink raw reply

* Re: [net-next 2/3] udp: Add udp early demux
From: Shawn Bohrer @ 2013-10-02 20:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, tomk, netdev
In-Reply-To: <1380737365.19002.110.camel@edumazet-glaptop.roam.corp.google.com>

On Wed, Oct 02, 2013 at 11:09:25AM -0700, Eric Dumazet wrote:
> On Wed, 2013-10-02 at 12:34 -0500, Shawn Bohrer wrote:
> > On Tue, Oct 01, 2013 at 01:52:49PM -0700, Eric Dumazet wrote:
> > > On Tue, 2013-10-01 at 14:33 -0500, Shawn Bohrer wrote:
> > > > The removal of the routing cache introduced a performance regression for
> > > > some UDP workloads since a dst lookup must be done for each packet.
> > > > This change caches the dst per socket in a similar manner to what we do
> > > > for TCP by implementing early_demux.
> > > > 
> > > > For UDP multicast we can only cache the dst if there is only one
> > > > receiving socket on the host.  Since caching only works when there is
> > > > one receiving socket we do the multicast socket lookup using RCU.
> > > 
> > > For unicast, we should find a matching socket for early demux only if
> > > this is a connected socket.
> > > 
> > > Otherwise, forwarding setups will break.
> > > 
> > > You probably need to add a minimum score to __udp4_lib_lookup()
> > 
> > Perhaps I'm missing something but I don't think a minimum score would
> > work because compute_score() and compute_score2() have several ways of
> > returning a score of let's say 4 and I don't think they all mean the
> > socket is connected.
> 
> Just change how score is computed. The existing +4 values are not hard
> coded anywhere.
> 
> You want to compute a score so that a single compare against a threshold
> is enough to tell you what's going on, before even taking a refcount on
> the socket.

Sorry, I must be a little slow today.  I understand what you are
suggesting but I don't see how to implement it with a score.  Or at
least not without potentially changing existing behavior.  For example
I could make the inet->inet_daddr case add +100 to the score and I
would know that a score >= 100 was connected.  However, this would
unfairly favor that one case making a socket that only had a matching
inet_daddr be better than one that only had a matching inet_dport,
sk_bound_dev_if, and inet_rcv_saddr.

The other possibility I can think of would be to use a bit mask so I
could see which tests passed and I could compute a score by counting
the set bits.  This would probably work since most of the tests
currently add an equal weight of 4 except for the one
(sk->sk_family == PF_INET) test which I'm not sure how to handle using
this strategy.

Did you have something specific in mind with your suggestion?

Thanks,
Shawn
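
A rough sketch of the bit mask idea described above, assuming each equally weighted test gets one bit and the score is 4 times the number of set bits. The bit names are hypothetical, "connected" is assumed to mean that both the remote address and remote port tests passed, and the (sk->sk_family == PF_INET) test is deliberately left out since its handling is the open question.

```c
#include <stdbool.h>

/* Hypothetical bit names, one per equally weighted test. */
#define MATCH_RCV_SADDR (1u << 0)
#define MATCH_DADDR     (1u << 1)
#define MATCH_DPORT     (1u << 2)
#define MATCH_BOUND_DEV (1u << 3)

/* Count set bits by clearing the lowest set bit each iteration. */
static unsigned int count_bits_sketch(unsigned int mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;
        n++;
    }
    return n;
}

/* Every passed test keeps its weight of 4, so relative ordering
 * between sockets is unchanged. */
static unsigned int score_from_mask_sketch(unsigned int mask)
{
    return 4u * count_bits_sketch(mask);
}

/* Assumption: connected means remote address and port both matched. */
static bool mask_is_connected_sketch(unsigned int mask)
{
    unsigned int need = MATCH_DADDR | MATCH_DPORT;

    return (mask & need) == need;
}
```

With this layout a socket matching only inet_daddr scores 4, while one matching inet_dport, sk_bound_dev_if, and inet_rcv_saddr scores 12, so the connected check does not distort the ranking.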

-- 

---------------------------------------------------------------
This email, along with any attachments, is confidential. If you 
believe you received this message in error, please contact the 
sender immediately and delete all copies of the message.  
Thank you.

^ permalink raw reply

* Re: [PATCH V3 net-next] fib_trie: avoid a redundant bit judgement in inflate
From: David Miller @ 2013-10-02 20:37 UTC (permalink / raw)
  To: baker.kernel; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <1380584709-5877-1-git-send-email-baker.kernel@gmail.com>

From: baker.kernel@gmail.com
Date: Tue,  1 Oct 2013 07:45:09 +0800

> From: "baker.zhang" <baker.kernel@gmail.com>
> 
> Because 'node' is the i'st child of 'oldnode',
> thus, here 'i' equals
> tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits)
> 
> we just get 1 more bit,
> and need not care the detail value of this bits.
> 
> I apologize for the mistake.
> 
> I generated the patch on a branch version,
> and did not notice the put_child has been changed.
> 
> I have redone the test on HEAD version with my patch.

Applied, thanks for testing things so thoroughly.

^ permalink raw reply

* Re: [PATCH net] tc: export tc_defact.h to userspace
From: David Miller @ 2013-10-02 20:39 UTC (permalink / raw)
  To: stephen; +Cc: jhs, netdev
In-Reply-To: <20130930213022.073850bb@nehalam.linuxnetplumber.net>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Mon, 30 Sep 2013 21:30:22 -0700

> Jamal sent patch to add tc user simple actions to iproute2
> but required header was not being exported.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Applied, thanks Stephen.

^ permalink raw reply

* Re: [patch net] udp6: respect IPV6_DONTFRAG sockopt in case there are pending frames
From: Jiri Pirko @ 2013-10-02 20:45 UTC (permalink / raw)
  To: David Miller; +Cc: hannes, netdev, kuznet, jmorris, kaber, yoshfuji
In-Reply-To: <20131002.160028.1472026531076506821.davem@davemloft.net>

Wed, Oct 02, 2013 at 10:00:28PM CEST, davem@davemloft.net wrote:
>From: Hannes Frederic Sowa <hannes@stressinduktion.org>
>Date: Wed, 2 Oct 2013 00:03:54 +0200
>
>> Hi David!
>> 
>> On Mon, Sep 30, 2013 at 11:21:55PM +0200, Hannes Frederic Sowa wrote:
>>> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> 
>> (for patchwork correspondence)
>> 
>> I still think this patch is perfectly fine but it opens up a hole in the
>> ip6_append_data logic where it is possible to crash the kernel. Jiri and me
>> will look after it in the thread " ipv6: udp packets following an UFO enqueued
>> packet need also be handled by UFO".
>> 
>> So, for the time being, I would like to withdraw my Acked-by until this is
>> sorted out.
>
>Jiri please repost the IPV6_DONTFRAG patch once things are sorted, thanks.

All right.

^ permalink raw reply

* Re: [PATCH] tcp: sndbuf autotuning improvements
From: David Miller @ 2013-10-02 20:46 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, ncardwell, ycheng, maze
In-Reply-To: <1380648224.19002.32.camel@edumazet-glaptop.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 01 Oct 2013 10:23:44 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> tcp_fixup_sndbuf() is underestimating initial send buffer requirements.
> 
> It was not noticed because big GSO packets were escaping the limitation,
> but with smaller TSO packets (or TSO/GSO/SG off), application hits
> sk_sndbuf before having a chance to fill enough packets in socket write
> queue.
> 
> - initial cwnd can be bigger than 10 for specific routes
> 
> - SKB_TRUESIZE() is a bit under real needs in some cases,
>   because of power-of-two rounding in kmalloc()
> 
> - Fast Recovery (RFC 5681 3.2) : Cubic needs 70% factor
> 
> - Extra cushion (application might react slowly to POLLOUT)
> 
> tcp_v4_conn_req_fastopen() needs to call tcp_init_metrics() before
> calling tcp_init_buffer_space()
> 
> Then we realize tcp_new_space() should call tcp_fixup_sndbuf()
> instead of duplicating this stuff.
> 
> Rename tcp_fixup_sndbuf() to tcp_sndbuf_expand() to be more
> descriptive.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> Acked-by: Maciej Żenczykowski <maze@google.com>

I assume this is meant for net-next as that is where this applies
cleanly.

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] bonding: update MAINTAINERS
From: David Miller @ 2013-10-02 20:49 UTC (permalink / raw)
  To: andy; +Cc: netdev, vfalico
In-Reply-To: <1380677005-23745-1-git-send-email-andy@greyhouse.net>

From: Andy Gospodarek <andy@greyhouse.net>
Date: Tue,  1 Oct 2013 21:23:25 -0400

> From: Andy Gospodarek <gospo@redhat.com>
> 
> Veaceslav has been doing a significant amount of work on bonding lately and
> reached out to me about being a maintainer.  After discussing this with him, I
> think he would be a good fit as a bonding maintainer.
> 
> Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
> Acked-by: Veaceslav Falico <vfalico@redhat.com>

I've applied this to 'net' as it's important to have up to date
maintainer information upstream asap.

Thanks!

^ permalink raw reply

* Re: [PATCH v2 net-next] include/linux/skbuff.h: move CONFIG_XFRM check inside the skb_sec_path()
From: David Miller @ 2013-10-02 20:50 UTC (permalink / raw)
  To: kda; +Cc: netdev
In-Reply-To: <1380679112-2393-1-git-send-email-kda@linux-powerpc.org>

From: Denis Kirjanov <kda@linux-powerpc.org>
Date: Wed,  2 Oct 2013 05:58:32 +0400

> And thus we have only one function definition
> 
> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
> ---
> v1->v2: rebase to net-next head

Applied, thanks.

^ permalink raw reply

* Re: [PATCH RFC 40/77] ixgbevf: Update MSI/MSI-X interrupts enablement code
From: Keller, Jacob E @ 2013-10-02 20:50 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel@vger.kernel.org, Bjorn Helgaas, Ralf Baechle,
	Michael Ellerman, Benjamin Herrenschmidt, Martin Schwidefsky,
	Ingo Molnar, Tejun Heo, Williams, Dan J, Andy King, Mason, Jon,
	Matt Porter, linux-pci@vger.kernel.org, linux-mips@linux-mips.org,
	linuxppc-dev@lists.ozlabs.org, linux390@de.ibm.com,
	linux-s390@vger.kernel.org
In-Reply-To: <338c9012577acf694eb23622902185584987bd8f.1380703263.git.agordeev@redhat.com>

On Wed, 2013-10-02 at 12:48 +0200, Alexander Gordeev wrote:
> As a result of the recent redesign of the MSI/MSI-X interrupt enabling
> pattern, this driver has to be updated to use the new technique to
> obtain an optimal number of MSI/MSI-X interrupts required.
> 
> Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |   18 +++++++-----------
>  1 files changed, 7 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index fa0537a..d506a01 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -1749,8 +1749,7 @@ void ixgbevf_reset(struct ixgbevf_adapter *adapter)
>  static int ixgbevf_acquire_msix_vectors(struct ixgbevf_adapter *adapter,
>  					int vectors)
>  {
> -	int err = 0;
> -	int vector_threshold;
> +	int err, vector_threshold;
>  
>  	/* We'll want at least 2 (vector_threshold):
>  	 * 1) TxQ[0] + RxQ[0] handler
> @@ -1763,18 +1762,15 @@ static int ixgbevf_acquire_msix_vectors(struct ixgbevf_adapter *adapter,
>  	 * Right now, we simply care about how many we'll get; we'll
>  	 * set them up later while requesting irq's.
>  	 */
> -	while (vectors >= vector_threshold) {
> -		err = pci_enable_msix(adapter->pdev, adapter->msix_entries,
> -				      vectors);
> -		if (!err || err < 0) /* Success or a nasty failure. */
> -			break;
> -		else /* err == number of vectors we should try again with */
> -			vectors = err;
> -	}
> +	err = pci_msix_table_size(adapter->pdev);
I would prefer to use something other than "err" here since the value
isn't really an error if it's greater than 0. However, it's not really a
big issue, since you immediately convert to using vectors on the next
line of code. I think it's all right overall.

Regards,
Jake

> +	if (err < 0)
> +		return err;
>  
> +	vectors = min(vectors, err);
>  	if (vectors < vector_threshold)
> -		err = -ENOSPC;
> +		return -ENOSPC;
>  
> +	err = pci_enable_msix(adapter->pdev, adapter->msix_entries, vectors);
>  	if (err) {
>  		dev_err(&adapter->pdev->dev,
>  			"Unable to allocate MSI-X interrupts\n");
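
The vector selection in the quoted hunk can be sketched in plain C as below. This is a userspace illustration only: the function name and parameters are hypothetical, and the table size is passed in rather than obtained from any PCI call. It keeps the table size in its own variable rather than in "err", along the lines of the naming comment above.

```c
#include <errno.h>

/* Hypothetical helper: returns the number of vectors to request,
 * or a negative errno value. */
static int pick_msix_vectors_sketch(int requested, int table_size,
                                    int vector_threshold)
{
    int vectors;

    /* A negative table size is an error from the query; pass it on. */
    if (table_size < 0)
        return table_size;

    /* Never ask for more than the device's table can hold. */
    vectors = requested < table_size ? requested : table_size;

    /* Below the minimum the driver needs, give up early. */
    if (vectors < vector_threshold)
        return -ENOSPC;

    return vectors;
}
```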



^ permalink raw reply

* Re: [PATCH net-next] net:drivers/net: Miscellaneous conversions to ETH_ALEN
From: David Miller @ 2013-10-02 21:05 UTC (permalink / raw)
  To: joe
  Cc: brcm80211-dev-list, bridge, e1000-devel, netdev, linux-usb,
	linux-wireless, linux-kernel, ath10k, wil6210, netfilter-devel,
	b43-dev, linuxppc-dev
In-Reply-To: <1380679480.2081.24.camel@joe-AO722>

From: Joe Perches <joe@perches.com>
Date: Tue, 01 Oct 2013 19:04:40 -0700

> Convert the memset/memcpy uses of 6 to ETH_ALEN
> where appropriate.
> 
> Also convert some struct definitions and u8 array
> declarations of [6] to ETH_ALEN.
> 
> Signed-off-by: Joe Perches <joe@perches.com>

Looks fine, applied, thanks Joe.
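
For illustration, the kind of conversion described above looks roughly like this in userspace C. ETH_ALEN_SKETCH is a local stand-in for the kernel's ETH_ALEN (6, the length of an Ethernet address in bytes); the struct and function names are hypothetical.

```c
#include <string.h>

/* Local stand-in for ETH_ALEN: bytes in an Ethernet (MAC) address. */
#define ETH_ALEN_SKETCH 6

/* Array declarations of [6] become [ETH_ALEN_SKETCH]. */
struct frame_hdr_sketch {
    unsigned char dst[ETH_ALEN_SKETCH];
    unsigned char src[ETH_ALEN_SKETCH];
};

/* memset/memcpy lengths of 6 become the named constant, which makes
 * it obvious the buffers hold MAC addresses. */
static void fill_hdr_sketch(struct frame_hdr_sketch *h,
                            const unsigned char *src_mac)
{
    memset(h->dst, 0xff, ETH_ALEN_SKETCH);      /* broadcast address */
    memcpy(h->src, src_mac, ETH_ALEN_SKETCH);
}
```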

^ permalink raw reply

* Re: [PATCH] net: do not call sock_put() on TIMEWAIT sockets
From: David Miller @ 2013-10-02 21:06 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1380686651.19002.64.camel@edumazet-glaptop.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 01 Oct 2013 21:04:11 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU /
> hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets.
> 
> We should instead use inet_twsk_put()
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Ugh, applied and queued up for -stable, thanks Eric!

^ permalink raw reply

* Re: [PATCH] tcp: sndbuf autotuning improvements
From: Eric Dumazet @ 2013-10-02 21:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, ncardwell, ycheng, maze
In-Reply-To: <20131002.164600.347083935801850982.davem@davemloft.net>

On Wed, 2013-10-02 at 16:46 -0400, David Miller wrote:

> I assume this is meant for net-next as that is where this applies
> cleanly.
> 
> Applied, thanks.

Indeed, sorry for missing net-next tag.

Thanks !

^ permalink raw reply

* Re: [PATCH] netif_set_xps_queue: make cpu mask const
From: David Miller @ 2013-10-02 21:07 UTC (permalink / raw)
  To: mst
  Cc: jasowang, rusty, edumazet, jiri, vyasevic, amwang,
	alexander.h.duyck, netdev, linux-kernel
In-Reply-To: <20131002061406.GA8809@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Wed, 2 Oct 2013 09:14:06 +0300

> virtio wants to pass in cpumask_of(cpu), make parameter
> const to avoid build warnings.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Refresh my memory, is this to accommodate a change you want to submit to
'net' or 'net-next'?  This patch only applies to 'net' cleanly.

^ permalink raw reply

* Re: [PATCH net 1/2] sit: allow to use rtnl ops on fb tunnel
From: David Miller @ 2013-10-02 21:08 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: netdev, steffen.klassert, pshelar
In-Reply-To: <524BCCE2.8000700@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Wed, 02 Oct 2013 09:36:02 +0200

> Le 02/10/2013 09:15, Nicolas Dichtel a écrit :
>> Le 01/10/2013 18:59, David Miller a écrit :
>>> From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>>> Date: Tue,  1 Oct 2013 18:04:59 +0200
>>>
>>>> rtnl ops where introduced by ba3e3f50a0e5 ("sit: advertise tunnel
>>>> param via
>>>> rtnl"), but I forget to assign rtnl ops to fb tunnels.
>>>>
>>>> Now that it is done, we must remove the explicit call to
>>>> unregister_netdevice_queue(), because the fallback tunnel is added to
>>>> the queue
>>>> in sit_destroy_tunnels() when checking rtnl_link_ops of all netdevices
>>>> (this
>>>> is valid since commit 5e6700b3bf98 ("sit: add support of x-netns")).
>>>>
>>>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>>>
>>> Applied and queued up for -stable.
> Another thing about ipip: between 0974658da47c ("ipip: advertise
> tunnel param
> via rtnl", v3.8) and fd58156e456d ("IPIP: Use ip-tunneling code.",
> v3.10) the
> fb device of ipip module has the same problem.
> Should I send a patch?

Yes please do, thanks for noticing this.

^ permalink raw reply

* Re: [net-next 2/3] udp: Add udp early demux
From: Eric Dumazet @ 2013-10-02 21:08 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: David Miller, tomk, netdev
In-Reply-To: <20131002203507.GB7824@sbohrermbp13-local.rgmadvisors.com>

On Wed, 2013-10-02 at 15:35 -0500, Shawn Bohrer wrote:

> Sorry, I must be a little slow today.  I understand what you are
> suggesting but I don't see how to implement it with a score.  Or at
> least not without potentially changing existing behavior.  For example
> I could make the inet->inet_daddr case add +100 to the score and I
> would know that a score >= 100 was connected.  However, this would
> unfairly favor that one case making a socket that only had a matching
> inet_daddr be better than one that only had a matching inet_dport,
> sk_bound_dev_if, and inet_rcv_saddr.
> 

If early demux has to increment a socket refcount and then decrement it
because it found a non-connected socket, this will be too expensive.

Also, keep in mind UDP chains can be long, so you should limit the early
lookup to, say, a single socket.

TCP ehash is mostly empty (0 or 1 socket per bucket), so early demux
really makes sense, but for UDP, there is no such property.

^ permalink raw reply

* Re: [PATCH v3] l2tp: fix kernel panic when using IPv4-mapped IPv6 addresses
From: David Miller @ 2013-10-02 21:10 UTC (permalink / raw)
  To: f.cachereul; +Cc: jchapman, netdev
In-Reply-To: <524BD642.3020102@alphalink.fr>

From: François Cachereul <f.cachereul@alphalink.fr>
Date: Wed, 02 Oct 2013 10:16:02 +0200

> IPv4 mapped addresses cause kernel panic.
> The patch just checks whether the IPv6 address is an IPv4 mapped
> address. If so, use IPv4 API instead of IPv6.
 ...
> Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr>
> ---
> v3: add panic message
> Sorry for the dumb oversight

Applied and queued up for -stable, thank you.

^ permalink raw reply

* Re: [patch] net: heap overflow in __audit_sockaddr()
From: Ben Hutchings @ 2013-10-02 21:11 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: David S. Miller, netdev, security, Jüri Aedla
In-Reply-To: <20131002185811.GA23438@elgon.mountain>

On Wed, 2013-10-02 at 21:58 +0300, Dan Carpenter wrote:
> We need to cap ->msg_namelen or it leads to a buffer overflow when we
> do the memcpy() in __audit_sockaddr().  It requires CAP_AUDIT_CONTROL to
> exploit this bug.
> 
> The call tree is:
> ___sys_recvmsg()
>   move_addr_to_user()
>     audit_sockaddr()
>       __audit_sockaddr()
> 
> Reported-by: Jüri Aedla <juri.aedla@gmail.com>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> diff --git a/net/socket.c b/net/socket.c
> index ebed4b6..c226ace 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1964,6 +1964,16 @@ struct used_address {
>  	unsigned int name_len;
>  };
>  
> +static int copy_msghdr_from_user(struct msghdr *kmsg,
> +				 struct msghdr __user *umsg)
> +{
> +	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
> +		return -EFAULT;
> +	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
> +		return -EINVAL;
> +	return 0;
> +}
> +
>  static int ___sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
>  			 struct msghdr *msg_sys, unsigned int flags,
>  			 struct used_address *used_address)
> @@ -1982,8 +1992,11 @@ static int ___sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
>  	if (MSG_CMSG_COMPAT & flags) {
>  		if (get_compat_msghdr(msg_sys, msg_compat))
>  			return -EFAULT;
> -	} else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
> -		return -EFAULT;
> +	} else {
> +		err = copy_msghdr_from_user(msg_sys, msg);
> +		if (err)
> +			return err;
> +	}
[...]

This doesn't cover compat tasks, since get_compat_msghdr() has no such
check.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 0/3] net: mv643xx_eth: various small fixes for v3.12
From: David Miller @ 2013-10-02 21:12 UTC (permalink / raw)
  To: sebastian.hesselbarth
  Cc: buytenh, jason, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1380711442-24735-1-git-send-email-sebastian.hesselbarth@gmail.com>

From: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Date: Wed,  2 Oct 2013 12:57:19 +0200

> This patch set comprises some one-liners to fix issues with repeated
> loading and unloading of a modular mv643xx_eth driver.
> 
> First two patches take care of the periodic port statistic timer, that
> updates statistics by reading port registers using add_timer/mod_timer.
> 
> Patch 1 moves timer re-schedule from mib_counters_update to the timer
> callback. As mib_counters_update is also called from non-timer context,
> this ensures the timer is reactivated from timer context only.
> 
> Patch 2 moves initial timer schedule from _probe() time to right before
> the port is actually started as the corresponding del_timer_sync is at
> _stop() time. This fixes a regression, where unloading the driver from a
> non-started eth device can cause the timer to access deallocated mem.
> 
> Patch 3 adds an assignment of the ports device_node to the corresponding
> self-created platform_device. This is required to allow fixups based on
> the device_node's compatible string later. Actually, it is also a potential
> regression because we already check compatible string for Kirkwood, but
> does not (yet) rely on the fixup.
> 
> All patches are based on v3.12-rc3 and have been tested on Kirkwood-based
> Seagate Dockstar.
> 
> Patches 1 and 2 can also possibly be queued up for -stable.

Series applied, patch #1 and #2 queued up for -stable, thanks!

^ permalink raw reply

* Re: [net-next 2/3] udp: Add udp early demux
From: Shawn Bohrer @ 2013-10-02 21:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, tomk, netdev
In-Reply-To: <1380748118.19002.125.camel@edumazet-glaptop.roam.corp.google.com>

On Wed, Oct 02, 2013 at 02:08:38PM -0700, Eric Dumazet wrote:
> On Wed, 2013-10-02 at 15:35 -0500, Shawn Bohrer wrote:
> 
> > Sorry, I must be a little slow today.  I understand what you are
> > suggesting but I don't see how to implement it with a score.  Or at
> > least not without potentially changing existing behavior.  For example
> > I could make the inet->inet_daddr case add +100 to the score and I
> > would know that a score >= 100 was connected.  However, this would
> > unfairly favor that one case making a socket that only had a matching
> > inet_daddr be better than one that only had a matching inet_dport,
> > sk_bound_dev_if, and inet_rcv_saddr.
> > 
> 
> If early demux has to increment a socket refcount and then decrement it
> because it found a non-connected socket, this will be too expensive.
> 
> Also, keep in mind UDP chains can be long, so you should limit the early
> lookup to, say, a single socket.
> 
> TCP ehash is mostly empty (0 or 1 socket per bucket), so early demux
> really makes sense, but for UDP, there is no such property.

So... Are you suggesting that I just skip the early demux for unicast
UDP entirely?  That is fine by me since I only care about the
multicast case.

--
Shawn


^ permalink raw reply

* Re: [patch] net: heap overflow in __audit_sockaddr()
From: Dan Carpenter @ 2013-10-02 21:26 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David S. Miller, netdev, security, Jüri Aedla
In-Reply-To: <1380748306.1795.67.camel@bwh-desktop.uk.level5networks.com>

On Wed, Oct 02, 2013 at 10:11:46PM +0100, Ben Hutchings wrote:
> On Wed, 2013-10-02 at 21:58 +0300, Dan Carpenter wrote:
> > We need to cap ->msg_namelen or it leads to a buffer overflow when we
> > do the memcpy() in __audit_sockaddr().  It requires CAP_AUDIT_CONTROL to
> > exploit this bug.
> > 
> > The call tree is:
> > ___sys_recvmsg()
> >   move_addr_to_user()
> >     audit_sockaddr()
> >       __audit_sockaddr()
> > 
> > Reported-by: Jüri Aedla <juri.aedla@gmail.com>
> > Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> >
> > diff --git a/net/socket.c b/net/socket.c
> > index ebed4b6..c226ace 100644
> > --- a/net/socket.c
> > +++ b/net/socket.c
> > @@ -1964,6 +1964,16 @@ struct used_address {
> >  	unsigned int name_len;
> >  };
> >  
> > +static int copy_msghdr_from_user(struct msghdr *kmsg,
> > +				 struct msghdr __user *umsg)
> > +{
> > +	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
> > +		return -EFAULT;
> > +	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
> > +		return -EINVAL;
> > +	return 0;
> > +}
> > +
> >  static int ___sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
> >  			 struct msghdr *msg_sys, unsigned int flags,
> >  			 struct used_address *used_address)
> > @@ -1982,8 +1992,11 @@ static int ___sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
> >  	if (MSG_CMSG_COMPAT & flags) {
> >  		if (get_compat_msghdr(msg_sys, msg_compat))
> >  			return -EFAULT;
> > -	} else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
> > -		return -EFAULT;
> > +	} else {
> > +		err = copy_msghdr_from_user(msg_sys, msg);
> > +		if (err)
> > +			return err;
> > +	}
> [...]
> 
> This doesn't cover compat tasks, since get_compat_msghdr() has no such
> check.
> 

Oops.  Gar...  Thanks for catching that.  I forgot to add that chunk to
the commit.

regards,
dan carpenter

^ permalink raw reply

* [patch v2] net: heap overflow in __audit_sockaddr()
From: Dan Carpenter @ 2013-10-02 21:27 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, security, Jüri Aedla
In-Reply-To: <1380748306.1795.67.camel@bwh-desktop.uk.level5networks.com>

We need to cap ->msg_namelen or it leads to a buffer overflow when we
do the memcpy() in __audit_sockaddr().  It requires CAP_AUDIT_CONTROL to
exploit this bug.

The call tree is:
___sys_recvmsg()
  move_addr_to_user()
    audit_sockaddr()
      __audit_sockaddr()

Reported-by: Jüri Aedla <juri.aedla@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
v2: The limit check to the compat code was missing as pointed out by
Ben Hutchings.

diff --git a/net/socket.c b/net/socket.c
index ebed4b6..c226ace 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1964,6 +1964,16 @@ struct used_address {
 	unsigned int name_len;
 };
 
+static int copy_msghdr_from_user(struct msghdr *kmsg,
+				 struct msghdr __user *umsg)
+{
+	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
+		return -EFAULT;
+	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
+		return -EINVAL;
+	return 0;
+}
+
 static int ___sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
 			 struct msghdr *msg_sys, unsigned int flags,
 			 struct used_address *used_address)
@@ -1982,8 +1992,11 @@ static int ___sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
 	if (MSG_CMSG_COMPAT & flags) {
 		if (get_compat_msghdr(msg_sys, msg_compat))
 			return -EFAULT;
-	} else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
-		return -EFAULT;
+	} else {
+		err = copy_msghdr_from_user(msg_sys, msg);
+		if (err)
+			return err;
+	}
 
 	if (msg_sys->msg_iovlen > UIO_FASTIOV) {
 		err = -EMSGSIZE;
@@ -2191,8 +2204,11 @@ static int ___sys_recvmsg(struct socket *sock, struct msghdr __user *msg,
 	if (MSG_CMSG_COMPAT & flags) {
 		if (get_compat_msghdr(msg_sys, msg_compat))
 			return -EFAULT;
-	} else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
-		return -EFAULT;
+	} else {
+		err = copy_msghdr_from_user(msg_sys, msg);
+		if (err)
+			return err;
+	}
 
 	if (msg_sys->msg_iovlen > UIO_FASTIOV) {
 		err = -EMSGSIZE;
diff --git a/net/compat.c b/net/compat.c
index f0a1ba6..8903258 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -71,6 +71,8 @@ int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
 	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
 	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
 		return -EFAULT;
+	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
+		return -EINVAL;
 	kmsg->msg_name = compat_ptr(tmp1);
 	kmsg->msg_iov = compat_ptr(tmp2);
 	kmsg->msg_control = compat_ptr(tmp3);
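
As a userspace illustration of the check the patch adds: a caller-supplied length is rejected before it can reach a memcpy() into a fixed-size buffer. The struct and function names are hypothetical; only struct sockaddr_storage and the -EINVAL convention are taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the fixed-size buffer that the copy
 * ultimately lands in. */
struct addr_record_sketch {
    struct sockaddr_storage addr;
    size_t len;
};

/* Same test as in the patch: reject any length larger than the
 * biggest address the buffer can hold. */
static int check_namelen_sketch(size_t namelen)
{
    if (namelen > sizeof(struct sockaddr_storage))
        return -EINVAL;
    return 0;
}

/* The copy happens only after the length has been validated. */
static int record_addr_sketch(struct addr_record_sketch *rec,
                              const void *addr, size_t namelen)
{
    int err = check_namelen_sketch(namelen);

    if (err)
        return err;
    memcpy(&rec->addr, addr, namelen);
    rec->len = namelen;
    return 0;
}
```

Doing the check at the point where the header is copied in, as the patch does, means every later user of the length sees an already bounded value.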

^ permalink raw reply related

* Re: [net-next 2/3] udp: Add udp early demux
From: Eric Dumazet @ 2013-10-02 21:38 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: David Miller, tomk, netdev
In-Reply-To: <20131002212436.GC7824@sbohrermbp13-local.rgmadvisors.com>

On Wed, 2013-10-02 at 16:24 -0500, Shawn Bohrer wrote:

> So... Are you suggesting that I just skip the early demux for unicast
> UDP entirely?  That is fine by me since I only care about the
> multicast case.

Nope this is not what I suggested.

I suggested that for unicast, you do a limited lookup to the first
socket found in bucket.

If it's an exact match, you take the socket.

If not, you give up, and do not scan the whole chain.
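
The limited lookup described above can be sketched as follows. This is a simplified userspace model, not kernel code: the socket struct, its field names, and the singly linked bucket are assumptions made for illustration, and refcounting is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified socket entry in a hash bucket chain. */
struct sock_sketch {
    uint32_t local_addr, remote_addr;
    uint16_t local_port, remote_port;
    struct sock_sketch *next;
};

/* Look only at the first socket in the bucket. Return it on an exact
 * 4-tuple match; otherwise give up without walking the chain, leaving
 * the packet to the regular (full) lookup later. */
static struct sock_sketch *
early_lookup_sketch(struct sock_sketch *bucket_head,
                    uint32_t pkt_saddr, uint16_t pkt_sport,
                    uint32_t pkt_daddr, uint16_t pkt_dport)
{
    struct sock_sketch *sk = bucket_head;

    if (sk == NULL)
        return NULL;

    if (sk->remote_addr == pkt_saddr && sk->remote_port == pkt_sport &&
        sk->local_addr == pkt_daddr && sk->local_port == pkt_dport)
        return sk;

    return NULL;    /* not an exact match: do not scan further */
}
```

The cost is bounded at one comparison per packet regardless of chain length, at the price of missing a matching socket that is not first in its bucket.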

^ permalink raw reply

* Re: [PATCH net-next] net:drivers/net: Miscellaneous conversions to ETH_ALEN
From: Julia Lawall @ 2013-10-02 22:38 UTC (permalink / raw)
  To: Joe Perches; +Cc: Luis R. Rodriguez, Kalle Valo, netdev, ath10k
In-Reply-To: <1380736969.2081.72.camel@joe-AO722>



On Wed, 2 Oct 2013, Joe Perches wrote:

> On Wed, 2013-10-02 at 10:44 -0700, Luis R. Rodriguez wrote:
> > On Tue, Oct 1, 2013 at 11:40 PM, Joe Perches <joe@perches.com> wrote:
> > > Please include netdev.  (cc'd)
> > >
> > >> Joe Perches <joe@perches.com> writes:
> > >>
> > >> > Convert the memset/memcpy uses of 6 to ETH_ALEN
> > >> > where appropriate.
> > >
> > >> > Signed-off-by: Joe Perches <joe@perches.com>
> > 
> > I think these sorts of patches are good -- but once applied it'd be
> > good if we can get the SmPL grammar expressed for it and then have
> > developers / maintainers regularly doing:
> > 
> > make coccicheck MODE=patch M=path > path-cocci.patch
> > 
> > Unfortunately right now MODE=patch takes about 3 1/2 minutes for
> > ath9k, MODE=org takes ~10 minutes for ath9k (17 minutes for all of
> > ath/), and MODE=context takes ~8 minutes on ath9k -- I do believe its
> > a bit unreasonable to expect patch submitters to use this, but
> > certainly great practice. Some of the time differences on the reports
> > can be explained by the fact that some SmPL will only be used for some
> > modes.
> > 
> > Even though it takes a while right now it'd be great practice to use
> > coccicheck to prevent these type of changes from going in again,
> > things that checkpatch.pl won't be able to catch.
> 
> As far as I can tell, it's basically not possible for cocci to
> do this conversion.

I tried looking for memcpys and memsets that do use ETH_ALEN and then 
seeing what non-local functions the affected values flow to.  I then 
marked all of the calls to memcpy and memset that use 6 where an affected 
value flows to one of the functions identified in the first pass.  I get 
40 unique results on Linux 3.10.

The semantic patch is below.  It needs to be cleaned up to not return 
duplicate results.  It needs to be run with the argument --no-show-diff, 
and the result is printed in emacs org mode.

julia

virtual after_start

@initialize:ocaml@

let neighbors = ref []
let local_functions = ref []
let reset = ref false

let do_reset _ =
  if not !reset
  then
    begin
      reset := true;
      let it = new iteration() in
      it#add_virtual_rule After_start;
      it#register()
    end

@script:ocaml depends on !after_start@
@@

local_functions := []

@locl@
identifier f;
@@

f(...) { ... }

@script:ocaml@
f << locl.f;
@@

local_functions := f :: !local_functions

@r depends on !after_start@
identifier f;
expression e,e1,e2;
@@

(
memcpy(e,e1,ETH_ALEN)
|
memcpy(e1,e,ETH_ALEN)
|
memset(e,e1,ETH_ALEN)
)
... when any
    when != e = e2
f(...,e,...)

@script:ocaml@
f << r.f;
@@

if not (List.mem f !local_functions) && not (List.mem f !neighbors)
then begin neighbors := f :: !neighbors; do_reset() end

@r1 depends on !after_start@
identifier f;
expression e,e1,e2;
@@

f(...,e,...)
... when any
    when != e = e2
(
memcpy(e,e1,6)
|
memcpy(e1,e,6)
|
memset(e,e1,6)
)

@script:ocaml@
f << r1.f;
@@

if not (List.mem f !local_functions) && not (List.mem f !neighbors)
then begin neighbors := f :: !neighbors; do_reset() end

// --------------------------------------------------------------------------

@s depends on after_start@
identifier f;
expression e,e1,e2;
position p;
@@

(
memcpy@p(e,e1,6)
|
memcpy@p(e1,e,6)
|
memset@p(e,e1,6)
)
... when any
    when != e = e2
f(...,e,...)

@script:ocaml@
f << s.f;
p << s.p;
@@

if List.mem f !neighbors
then Coccilib.print_main "" p

@s1 depends on after_start@
identifier f;
expression e,e1,e2;
position p;
@@

f(...,e,...)
... when any
    when != e = e2
(
memcpy@p(e,e1,6)
|
memcpy@p(e1,e,6)
|
memset@p(e,e1,6)
)

@script:ocaml@
f << s1.f;
p << s1.p;
@@

if List.mem f !neighbors
then Coccilib.print_main "" p
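
For readers less familiar with SmPL, the following is a minimal C sketch of
the two code shapes the rules above are matching. It is an illustration only:
the function names register_addr, set_mac_named and set_mac_literal are
hypothetical and not taken from any driver, and ETH_ALEN is defined locally
to mirror the kernel's value of 6. Note also that register_addr is defined in
the same file here purely to keep the sketch self-contained; the semantic
patch itself only records functions that are not defined locally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6	/* mirrors the kernel's value; defined locally for this sketch */

/* Hypothetical consumer of a MAC address. In real use this would live in
 * another file, otherwise the semantic patch would treat it as local. */
static int register_addr(const unsigned char *addr)
{
	assert(addr != NULL);
	return addr[0] + addr[ETH_ALEN - 1];
}

/* First-pass shape: the copy uses ETH_ALEN and the buffer then flows to
 * register_addr(), so register_addr would be recorded as a neighbor. */
static int set_mac_named(const unsigned char *src)
{
	unsigned char mac[ETH_ALEN];

	memcpy(mac, src, ETH_ALEN);
	return register_addr(mac);
}

/* Second-pass shape: the copy uses a bare 6 and the buffer flows to the
 * same function, so this memcpy is the kind of call that gets reported. */
static int set_mac_literal(const unsigned char *src)
{
	unsigned char mac[6];

	memcpy(mac, src, 6);
	return register_addr(mac);
}
```

Both functions behave identically at run time; the only difference is whether
the length is spelled as ETH_ALEN or as 6, which is why the second one is a
candidate for conversion.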

^ permalink raw reply

* Re: [PATCH net-next] net:drivers/net: Miscellaneous conversions to ETH_ALEN
From: Joe Perches @ 2013-10-03  0:09 UTC (permalink / raw)
  To: Julia Lawall; +Cc: Luis R. Rodriguez, Kalle Valo, netdev, ath10k
In-Reply-To: <alpine.DEB.2.02.1310030034030.2162@localhost6.localdomain6>

On Thu, 2013-10-03 at 00:38 +0200, Julia Lawall wrote:
> 
> 
> On Wed, 2 Oct 2013, Joe Perches wrote:
> 
> > On Wed, 2013-10-02 at 10:44 -0700, Luis R. Rodriguez wrote:
> > > On Tue, Oct 1, 2013 at 11:40 PM, Joe Perches <joe@perches.com> wrote:
> > > > Please include netdev.  (cc'd)
> > > >
> > > >> Joe Perches <joe@perches.com> writes:
> > > >>
> > > >> > Convert the memset/memcpy uses of 6 to ETH_ALEN
> > > >> > where appropriate.
> > > >
> > > >> > Signed-off-by: Joe Perches <joe@perches.com>
> > > 
> > > I think these sorts of patches are good -- but once applied it'd be
> > > good if we can get the SmPL grammar expressed for it and then have
> > > developers / maintainers regularly doing:
> > > 
> > > make coccicheck MODE=patch M=path > path-cocci.patch
> > > 
> > > Unfortunately right now MODE=patch takes about 3 1/2 minutes for
> > > ath9k, MODE=org takes ~10 minutes for ath9k (17 minutes for all of
> > > ath/), and MODE=context takes ~8 minutes on ath9k -- I do believe it's
> > > a bit unreasonable to expect patch submitters to use this, but
> > > certainly great practice. Some of the time differences on the reports
> > > can be explained by the fact that some SmPL will only be used for some
> > > modes.
> > > 
> > > Even though it takes a while right now it'd be great practice to use
> > > coccicheck to prevent these types of changes from going in again,
> > > things that checkpatch.pl won't be able to catch.
> > 
> > As far as I can tell, it's basically not possible for cocci to
> > do this conversion.
> 
> I tried looking for memcpys and memsets that do use ETH_ALEN and then 
> seeing what non-local functions the affected values flow to.  I then 
> marked all of the calls to memcpy and memset that use 6 where an affected 
> value flows to one of the functions identified in the first pass.  I get 
> 40 unique results on Linux 3.10.
> 
> The semantic patch is below.  It needs to be cleaned up to not return 
> duplicate results.  It needs to be run with the argument --no-show-diff, 
> and the result is printed in emacs org mode.

This has been running a _long_ time (broken?) on
drivers/net/ethernet/mellanox/mlx4/en_netdev.c

$ spatch --version
spatch version 1.0.0-rc14 without Python support and with PCRE support

^ permalink raw reply

* Re: [PATCH v2.41 5/5] datapath: Add basic MPLS support to kernel
From: Simon Horman @ 2013-10-03  0:20 UTC (permalink / raw)
  To: Pravin Shelar
  Cc: dev@openvswitch.org, netdev, Jesse Gross, Ben Pfaff, Ravi K,
	Isaku Yamahata, Joe Stringer
In-Reply-To: <CALnjE+qOwkY0NJ=LbT0SgZFmnzvqRHcTS35xWVfMyPtjj4r0ZA@mail.gmail.com>

On Wed, Oct 02, 2013 at 11:03:57AM -0700, Pravin Shelar wrote:
> On Mon, Sep 30, 2013 at 11:47 PM, Simon Horman <horms@verge.net.au> wrote:
> > Allow datapath to recognize and extract MPLS labels into flow keys
> > and execute actions which push, pop, and set labels on packets.
> >
> > Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and Joe Stringer.
> >
> > Cc: Ravi K <rkerur@gmail.com>
> > Cc: Leo Alterman <lalterman@nicira.com>
> > Cc: Isaku Yamahata <yamahata@valinux.co.jp>
> > Cc: Joe Stringer <joe@wand.net.nz>
> > Signed-off-by: Simon Horman <horms@verge.net.au>
> >
> > ---
> >
> > +
> > +       /* this hack needed to get regular skb_gso_segment() */
> > +#ifdef HAVE___SKB_GSO_SEGMENT
> > +#undef __skb_gso_segment
> > +       skb_gso = __skb_gso_segment(skb, features, tx_path);
> > +#else
> > +#undef skb_gso_segment
> > +       skb_gso = skb_gso_segment(skb, features);
> > +#endif
> > +
> 
> We can get rid of #ifdefs by just using different name for
> rpl___skb_gso_segment(), something like mpls_vlan_skb_gso_segment().
> The way it is done for tnl-gso.

Thanks.

I arranged the code this way so that calls to __skb_gso_segment(),
in particular those outside of gso.c, would go via
rpl___skb_gso_segment() on kernels older than v3.11.

On closer examination the only such case is in ovs_dp_upcall().
Currently there should be no need to perform MPLS GSO segmentation in that
case because MPLS GSO segmentation can only be needed after actions are
applied.

However, I am concerned that it may become necessary later, when
recirculation is introduced, because an upcall may then occur
on a packet which has already had actions applied.
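
To make the trade-off concrete, here is a small stand-alone C sketch. It
is not datapath code: struct mock_skb, mock_skb_gso_segment(),
mock_output_path() and mock_upcall_path() are hypothetical stand-ins, and
only the wrapper name mpls_vlan_skb_gso_segment() comes from the suggestion
above. It assumes the wrapper is given a distinct name and no macro
redirects the stock helper to it.

```c
#include <assert.h>
#include <stddef.h>

/* Mock packet: records which code paths touched it. */
struct mock_skb {
	int segmented;		/* set by the stand-in for the stock helper */
	int mpls_handled;	/* set only by the distinctly named wrapper */
};

/* Stand-in for the stock segmentation helper. */
static void mock_skb_gso_segment(struct mock_skb *skb)
{
	assert(skb != NULL);
	skb->segmented = 1;
}

/* Distinctly named wrapper: it can call the stock helper directly,
 * so no #ifdef/#undef is needed inside it. */
static void mpls_vlan_skb_gso_segment(struct mock_skb *skb)
{
	assert(skb != NULL);
	skb->mpls_handled = 1;
	mock_skb_gso_segment(skb);
}

/* A caller that was explicitly converted to use the wrapper. */
static void mock_output_path(struct mock_skb *skb)
{
	mpls_vlan_skb_gso_segment(skb);
}

/* A caller that still uses the stock helper, as an upcall path might;
 * it bypasses the wrapper because nothing redirects the call. */
static void mock_upcall_path(struct mock_skb *skb)
{
	mock_skb_gso_segment(skb);
}
```

With a distinct name, every call site that needs the MPLS-aware behaviour
has to be converted by hand, which is the case I would need to revisit for
the upcall path if recirculation makes it necessary.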

^ permalink raw reply
