Netdev List
* Re: openconnect triggers soft lockup in __skb_get_rxhash
From: Kirill A. Shutemov @ 2012-12-17  1:46 UTC (permalink / raw)
  To: David Miller; +Cc: maxk, netdev, dwmw2
In-Reply-To: <20121216.172214.687979484434537200.davem@davemloft.net>

On Sun, Dec 16, 2012 at 05:22:14PM -0800, David Miller wrote:
> 
> Already fixed in Linus's tree by:
> 
> From 499744209b2cbca66c42119226e5470da3bb7040 Mon Sep 17 00:00:00 2001

No, it's not. I use up-to-date (2a74dbb) Linus tree with the patch in and
still see the issue.

-- 
 Kirill A. Shutemov


* XFRM: Could we change ESP padding?
From: RongQing Li @ 2012-12-17  3:28 UTC (permalink / raw)
  To: netdev

Hi:

setkey has the parameter below, but it does not seem to be implemented
in either the kernel or userspace:
	
     -f pad_option  defines the content of the ESP padding.
                    pad_option is one of following:
        zero-pad    All the paddings are zero.
        random-pad  A series of randomized values are used.
        seq-pad     A series of sequential increasing numbers
                    started from 1 are used.


The kernel does not seem to inspect the ESP padding content either, so
packets are not dropped even when their padding content is wrong
(not a monotonically increasing sequence).


Could anyone tell me why? Is it an unclear description in the RFC,
performance, lack of time, or some other reason? Thanks very much!


RFC4303:
If Padding bytes are needed but the encryption algorithm does not
specify the padding contents, then the following default processing
MUST be used.  The Padding bytes are initialized with a series of
(unsigned, 1-byte) integer values.  The first padding byte appended
to the plaintext is numbered 1, with subsequent padding bytes making
up a monotonically increasing sequence: 1, 2, 3, ....  When this
padding scheme is employed, the receiver SHOULD inspect the Padding
field.  (This scheme was selected because of its relative simplicity,
ease of implementation in hardware, and because it offers limited
protection against certain forms of "cut and paste" attacks in the
absence of other integrity measures, if the receiver checks the
padding values upon decryption.)
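
The default scheme quoted above can be sketched in a few lines of C. This
is only an illustration of the RFC text, not kernel code: the function
names are hypothetical, and the caller is assumed to know the pad length
already (in ESP it comes from the Pad Length field).

```c
#include <stddef.h>
#include <stdint.h>

/* Sender side: fill pad[0..padlen-1] with the sequence 1, 2, 3, ... */
void esp_fill_default_padding(uint8_t *pad, size_t padlen)
{
	for (size_t i = 0; i < padlen; i++)
		pad[i] = (uint8_t)(i + 1);
}

/*
 * Receiver side: return 1 if the padding is the sequence 1, 2, 3, ...
 * and 0 otherwise. A receiver that enforces the check would drop the
 * packet when this returns 0.
 */
int esp_check_default_padding(const uint8_t *pad, size_t padlen)
{
	for (size_t i = 0; i < padlen; i++)
		if (pad[i] != (uint8_t)(i + 1))
			return 0;
	return 1;
}
```

A receiver that skips the check accepts any padding content, which
matches the behaviour described above.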


Thanks

-RongQing


* Re: openconnect triggers soft lockup in __skb_get_rxhash
From: Eric Dumazet @ 2012-12-17  4:46 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: David Miller, maxk, netdev, dwmw2
In-Reply-To: <20121217014631.GA23101@shutemov.name>

On Mon, 2012-12-17 at 03:46 +0200, Kirill A. Shutemov wrote:
> On Sun, Dec 16, 2012 at 05:22:14PM -0800, David Miller wrote:
> > 
> > Already fixed in Linus's tree by:
> > 
> > From 499744209b2cbca66c42119226e5470da3bb7040 Mon Sep 17 00:00:00 2001
> 
> No, it's not. I use up-to-date (2a74dbb) Linus tree with the patch in and
> still see the issue.
> 

Could you try the following one-liner?

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 255a9f5..173acf5 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1199,6 +1199,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
 	}
 
+	skb_reset_network_header(skb);
 	rxhash = skb_get_rxhash(skb);
 	netif_rx_ni(skb);
 


* Re: [PATCH 2/2] bridge: add flags to distinguish permanent mdb entires
From: Cong Wang @ 2012-12-17  5:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bridge, herbert, shemminger
In-Reply-To: <20121215.171656.1197452765852503859.davem@davemloft.net>

On Sat, 2012-12-15 at 17:16 -0800, David Miller wrote:
> From: Cong Wang <amwang@redhat.com>
> Date: Sat, 15 Dec 2012 16:09:51 +0800
> 
> > This patch adds a flag to each mdb entry, so that we can distinguish
> > permanent entries with temporary entries.
> > 
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > Cc: Stephen Hemminger <shemminger@vyatta.com>
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Signed-off-by: Cong Wang <amwang@redhat.com>
> 
> Applied, but you _really_ need to lock down the interface and
> stop making changes to the user visible side of this _now_.
> 

OK. I think it is okay to break the ABI at this point, since the merge
window is not closed yet; who would develop applications against an
unstable kernel anyway? :-/


* Re: XFRM: Could we change ESP padding?
From: Steffen Klassert @ 2012-12-17  6:43 UTC (permalink / raw)
  To: RongQing Li; +Cc: netdev
In-Reply-To: <CAJFZqHzDDtUacnQzd-gcS8JBvPdgspozWkUFOogS4nDmvZz7rg@mail.gmail.com>

On Mon, Dec 17, 2012 at 11:28:05AM +0800, RongQing Li wrote:
> Hi:
> 
> setkey has the below parameter, but this parameter seems not be
> implemented in kernel and userspace,
> 	
>      -f pad_option  defines the content of the ESP padding.
> pad_option is one of following:
>         zero-pad    All the paddings are zero.
>         random-pad  A series of randomized values are used.
>         seq-pad     A series of sequential increasing numbers
>  started from 1 are used.
> 

We can not implement this. As you already mentioned, RFC 4303
makes strong statements on how the padding bytes are initialized.
An IPsec implementation that checks the padding bytes would drop our
packets if we don't use the padding method described in RFC 4303.

> 
> and kernel seems not inspect the ESP padding content too, the result
> is the packets are not dropped even if they are with a wrong pad
> content(not a monotonically increasing sequence).
> 
> 
> Could anyone tell me why, bad description in RFC, performance, lack time,
> or other reason? Thanks very much!
> 

RFC 4303 only says that the receiver SHOULD inspect the padding field,
so we are free to do it or not. You can find a comment that explains
why we don't do it in the esp_input_done2() function ;-)


* Re: [PATCH] tuntap: fix ambigious multiqueue API
From: Jason Wang @ 2012-12-17  6:46 UTC (permalink / raw)
  To: mst, davem, netdev, linux-kernel, pmoore; +Cc: wkevils, mprivozn
In-Reply-To: <1355478810-10144-1-git-send-email-jasowang@redhat.com>



----- Original Message -----
> The current multiqueue API is ambiguous, which may prevent both users
> and LSMs from doing things correctly:
> 
> - Both TUNSETIFF and TUNSETQUEUE can be used to create the queues of a
>   tuntap device.
> - TUNSETQUEUE is used to disable and enable a specific queue of the
>   device. But since the state of the tuntap device is completely removed
>   from the queue, it can be used to attach to another device (there is
>   no such requirement currently, and it would need a new kind of LSM
>   policy).
> - TUNSETQUEUE can be used to attach to a persistent device without any
>   queues. This kind of attaching bypasses the necessary checks done
>   during TUNSETIFF and may lead to unexpected results.
> 
> So this patch tries to make a cleaner and simpler API by:
> 
> - Only allowing TUNSETIFF to create queues.
> - Only allowing TUNSETQUEUE to disable and enable the queues of a
>   device. The state of the tuntap device is not detached from a queue
>   when it is disabled, so TUNSETQUEUE can only be used after TUNSETIFF
>   and with the same device.
> 
> This is done by introducing a list which keeps track of all queues
> that are disabled. A queue is moved between this list and the tfiles[]
> array when it is enabled/disabled. A pointer to the tun_struct is also
> introduced to track the device a queue belongs to while it is disabled.
> 
> After the change, management and application can be isolated as
> follows: TUNSETIFF is only called by management software and
> TUNSETQUEUE is only called by the application. For LSM/SELinux, what
> is left is to do proper checks during tun_set_queue() if needed.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/net/tun.c |   86 ++++++++++++++++++++++++++++++++++++++--------------
>  1 files changed, 63 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 2ac2164..6f2053d 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -138,6 +138,8 @@ struct tun_file {
>  	/* only used for fasnyc */
>  	unsigned int flags;
>  	u16 queue_index;
> +	struct list_head next;
> +	struct tun_struct *detached;
>  };
>  
>  struct tun_flow_entry {
> @@ -182,6 +184,8 @@ struct tun_struct {
>  	struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
>  	struct timer_list flow_gc_timer;
>  	unsigned long ageing_time;
> +	unsigned int numdisabled;
> +	struct list_head disabled;
>  };
>  
>  static inline u32 tun_hashfn(u32 rxhash)
> @@ -386,6 +390,23 @@ static void tun_set_real_num_queues(struct tun_struct *tun)
>  	netif_set_real_num_rx_queues(tun->dev, tun->numqueues);
>  }
>  
> +static void tun_disable_queue(struct tun_struct *tun, struct tun_file *tfile)
> +{
> +	tfile->detached = tun;
> +	list_add_tail(&tfile->next, &tun->disabled);
> +	++tun->numdisabled;
> +}
> +
> +struct tun_struct *tun_enable_queue(struct tun_file *tfile)
> +{
> +	struct tun_struct *tun = tfile->detached;
> +
> +	tfile->detached = NULL;
> +	list_del_init(&tfile->next);
> +	--tun->numdisabled;
> +	return tun;
> +}
> +
>  static void __tun_detach(struct tun_file *tfile, bool clean)
>  {
>  	struct tun_file *ntfile;
> @@ -407,20 +428,25 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>  		ntfile->queue_index = index;
>  
>  		--tun->numqueues;
> -		sock_put(&tfile->sk);
> +		if (clean)
> +			sock_put(&tfile->sk);
> +		else
> +			tun_disable_queue(tun, tfile);
>  
>  		synchronize_net();
>  		tun_flow_delete_by_queue(tun, tun->numqueues + 1);
>  		/* Drop read queue */
>  		skb_queue_purge(&tfile->sk.sk_receive_queue);
>  		tun_set_real_num_queues(tun);
> -
> -		if (tun->numqueues == 0 && !(tun->flags & TUN_PERSIST))
> -			if (dev->reg_state == NETREG_REGISTERED)
> -				unregister_netdevice(dev);
> -	}
> +	} else if (tfile->detached && clean)
> +		tun = tun_enable_queue(tfile);
>  
>  	if (clean) {
> +		if (tun && tun->numqueues == 0 && tun->numdisabled == 0 &&
> +		    !(tun->flags & TUN_PERSIST))
> +			if (tun->dev->reg_state == NETREG_REGISTERED)
> +				unregister_netdevice(tun->dev);
> +
>  		BUG_ON(!test_bit(SOCK_EXTERNALLY_ALLOCATED,
>  				 &tfile->socket.flags));
>  		sk_release_kernel(&tfile->sk);
> @@ -437,7 +463,7 @@ static void tun_detach(struct tun_file *tfile, bool clean)
>  static void tun_detach_all(struct net_device *dev)
>  {
>  	struct tun_struct *tun = netdev_priv(dev);
> -	struct tun_file *tfile;
> +	struct tun_file *tfile, *tmp;
>  	int i, n = tun->numqueues;
>  
>  	for (i = 0; i < n; i++) {
> @@ -458,6 +484,12 @@ static void tun_detach_all(struct net_device *dev)
>  		skb_queue_purge(&tfile->sk.sk_receive_queue);
>  		sock_put(&tfile->sk);
>  	}
> +	list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
> +		tun_enable_queue(tfile);
> +		skb_queue_purge(&tfile->sk.sk_receive_queue);
> +		sock_put(&tfile->sk);
> +	}
> +	BUG_ON(tun->numdisabled != 0);
>  }
>  
>  static int tun_attach(struct tun_struct *tun, struct file *file)
> @@ -474,7 +506,8 @@ static int tun_attach(struct tun_struct *tun, struct file *file)
>  		goto out;
>  
>  	err = -E2BIG;
> -	if (tun->numqueues == MAX_TAP_QUEUES)
> +	if (!tfile->detached &&
> +	    tun->numqueues + tun->numdisabled == MAX_TAP_QUEUES)
>  		goto out;
>  
>  	err = 0;
> @@ -488,9 +521,13 @@ static int tun_attach(struct tun_struct *tun, struct file *file)
>  	tfile->queue_index = tun->numqueues;
>  	rcu_assign_pointer(tfile->tun, tun);
>  	rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile);
> -	sock_hold(&tfile->sk);
>  	tun->numqueues++;
>  
> +	if (tfile->detached)
> +		tun_enable_queue(tfile);
> +	else
> +		sock_hold(&tfile->sk);
> +
>  	tun_set_real_num_queues(tun);
>  
>  	/* device is allowed to go away first, so no need to hold extra
> @@ -1348,6 +1385,7 @@ static void tun_free_netdev(struct net_device *dev)
>  {
>  	struct tun_struct *tun = netdev_priv(dev);
>  
> +	BUG_ON(!(list_empty(&tun->disabled)));
>  	tun_flow_uninit(tun);
>  	free_netdev(dev);
>  }
> @@ -1542,6 +1580,10 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>  		err = tun_attach(tun, file);
>  		if (err < 0)
>  			return err;
> +
> +		if (tun->flags & TUN_TAP_MQ &&
> +		    (tun->numqueues + tun->numdisabled > 1))
> +			return err;
>  	}
>  	else {
>  		char *name;
> @@ -1600,6 +1642,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>  			TUN_USER_FEATURES;
>  		dev->features = dev->hw_features;
>  
> +		INIT_LIST_HEAD(&tun->disabled);
>  		err = tun_attach(tun, file);
>  		if (err < 0)
>  			goto err_free_dev;
> @@ -1754,32 +1797,28 @@ static int tun_set_queue(struct file *file, struct ifreq *ifr)
>  {
>  	struct tun_file *tfile = file->private_data;
>  	struct tun_struct *tun;
> -	struct net_device *dev;
>  	int ret = 0;
>  
>  	rtnl_lock();
>  
>  	if (ifr->ifr_flags & IFF_ATTACH_QUEUE) {
> -		dev = __dev_get_by_name(tfile->net, ifr->ifr_name);
> -		if (!dev) {
> -			ret = -EINVAL;
> -			goto unlock;
> -		}
> -
> -		tun = netdev_priv(dev);
> -		if (dev->netdev_ops != &tap_netdev_ops &&
> -			dev->netdev_ops != &tun_netdev_ops)
> +		tun = tfile->detached;
> +		if (!tun)
>  			ret = -EINVAL;
>  		else if (tun_not_capable(tun))
>  			ret = -EPERM;
>  		else
>  			ret = tun_attach(tun, file);
> -	} else if (ifr->ifr_flags & IFF_DETACH_QUEUE)
> -		__tun_detach(tfile, false);
> -	else
> +	} else if (ifr->ifr_flags & IFF_DETACH_QUEUE) {
> +		tun = rcu_dereference_protected(tfile->tun,
> +						lockdep_rtnl_is_held());
> +		if (!tun || !(tun->flags & TUN_TAP_MQ))
> +			ret = -EINVAL;
> +		else
> +			__tun_detach(tfile, false);
> +	} else
>  		ret = -EINVAL;
>  
> -unlock:
>  	rtnl_unlock();
>  	return ret;
>  }
> @@ -2091,6 +2130,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
>  
>  	file->private_data = tfile;
>  	set_bit(SOCK_EXTERNALLY_ALLOCATED, &tfile->socket.flags);
> +	INIT_LIST_HEAD(&tfile->next);
>  
>  	return 0;
>  }
> --
> 1.7.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
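
From userspace, the split described in the patch (TUNSETIFF creates a
queue, TUNSETQUEUE only disables or re-enables it) might look roughly
like the sketch below. The helper names are hypothetical, and the use of
IFF_TAP, IFF_NO_PI and IFF_MULTI_QUEUE from <linux/if_tun.h> is an
assumption about how the caller requests a multiqueue device; error
handling is kept minimal.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Create (or add) one queue of a multiqueue tap device; returns fd or -1. */
int tun_open_queue(const char *name)
{
	struct ifreq ifr;
	int fd = open("/dev/net/tun", O_RDWR);

	if (fd < 0)
		return -1;

	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
	strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

	/* Only TUNSETIFF is allowed to create a queue. */
	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Flag for TUNSETQUEUE: re-enable (attach) or disable (detach). */
short tun_queue_flag(int enable)
{
	return enable ? IFF_ATTACH_QUEUE : IFF_DETACH_QUEUE;
}

/*
 * Disable or re-enable a queue that was created with TUNSETIFF.
 * No device name is passed: with this patch the queue remembers the
 * device it was detached from.
 */
int tun_set_queue(int fd, int enable)
{
	struct ifreq ifr;

	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = tun_queue_flag(enable);
	return ioctl(fd, TUNSETQUEUE, &ifr);
}
```

Management software would call tun_open_queue() once per queue and hand
the descriptors to the application, which then only needs
tun_set_queue(fd, 0) and tun_set_queue(fd, 1).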


* Re: XFRM: Could we change ESP padding?
From: RongQing Li @ 2012-12-17  6:56 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev
In-Reply-To: <20121217064302.GK18940@secunet.com>

2012/12/17 Steffen Klassert <steffen.klassert@secunet.com>:
> On Mon, Dec 17, 2012 at 11:28:05AM +0800, RongQing Li wrote:
>> Hi:
>>
>> setkey has the below parameter, but this parameter seems not be
>> implemented in kernel and userspace,
>>
>>      -f pad_option  defines the content of the ESP padding.
>> pad_option is one of following:
>>         zero-pad    All the paddings are zero.
>>         random-pad  A series of randomized values are used.
>>         seq-pad     A series of sequential increasing numbers
>>  started from 1 are used.
>>
>
> We can not implement this. As you already mentioned, RFC 4303
> makes strong statements on how the padding bytes are initialized.
> An IPsec implementation that checks the padding bytes would drop our
> packets if we don't use the padding method described in RFC 4303.
>
>>
>> and kernel seems not inspect the ESP padding content too, the result
>> is the packets are not dropped even if they are with a wrong pad
>> content(not a monotonically increasing sequence).
>>
>>
>> Could anyone tell me why, bad description in RFC, performance, lack time,
>> or other reason? Thanks very much!
>>
>
> RFC 4303 says that the receiver should inspect the padding field,
> so we are free to do it or not. You can find a comment that explains
> why we don't do it in the esp_input_done2() function ;-)
>
Thanks.

But I see that BSD has implemented it, and Cisco devices have a similar implementation.

http://fxr.watson.org/fxr/source/netipsec/xform_esp.c

-RongQing


* Re: XFRM: Could we change ESP padding?
From: Steffen Klassert @ 2012-12-17  7:25 UTC (permalink / raw)
  To: RongQing Li; +Cc: netdev
In-Reply-To: <CAJFZqHzCEJyvxc2NNh3_U8oT9Uh94N6EXLb4UA=twvVcVnEd5w@mail.gmail.com>

On Mon, Dec 17, 2012 at 02:56:47PM +0800, RongQing Li wrote:
> 2012/12/17 Steffen Klassert <steffen.klassert@secunet.com>:
> >
> > RFC 4303 says that the receiver should inspect the padding field,
> > so we are free to do it or not. You can find a comment that explains
> > why we don't do it in the esp_input_done2() function ;-)
> >
> Thanks.
> 
> But I see BSD has implemented it, and cisco device has similar implmentation.
> 

The comment at the place where the padding field inspection should be done
is rather old. I have always respected it when I came across this code, but
I would not mind having it implemented. I am not sure whether anybody still
remembers exactly why it was not implemented.


* Re: [PATCH v3] netfilter: nf_conntrack_sip: Handle Cisco 7941/7945 IP phones
From: Kevin Cernekee @ 2012-12-17  7:26 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: David Woodhouse, Eric Dumazet, Patrick McHardy, David S. Miller,
	Alexey Kuznetsov, Pekka Savola (ipv6), James Morris,
	Hideaki YOSHIFUJI, netfilter-devel, netfilter, coreteam,
	linux-kernel, netdev
In-Reply-To: <20121217004457.GA12234@1984>

On Sun, Dec 16, 2012 at 4:44 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>> What happened to this? OpenWRT is still carrying it, and it broke in
>> 3.7. Here's a completely untested update...
>
> I requested Kevin to resend a new version based on the current kernel
> tree while spinning on old pending patches since I have no access to
> that hardware, but no luck.
>
> So I'll review this and, since OpenWRT is carrying, I guess we can get
> this into net-next merge window.

Sorry, been putting it off since the OpenWRT version has worked flawlessly...

I just reassembled my test rig and I'll get you a working patch this week.

Is it OK to use
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git as the
baseline?


* Re: openconnect triggers soft lockup in __skb_get_rxhash
From: Kirill A. Shutemov @ 2012-12-17  8:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, maxk, netdev, dwmw2
In-Reply-To: <1355719589.10504.13.camel@edumazet-glaptop>

On Sun, Dec 16, 2012 at 08:46:29PM -0800, Eric Dumazet wrote:
> On Mon, 2012-12-17 at 03:46 +0200, Kirill A. Shutemov wrote:
> > On Sun, Dec 16, 2012 at 05:22:14PM -0800, David Miller wrote:
> > > 
> > > Already fixed in Linus's tree by:
> > > 
> > > From 499744209b2cbca66c42119226e5470da3bb7040 Mon Sep 17 00:00:00 2001
> > 
> > No, it's not. I use up-to-date (2a74dbb) Linus tree with the patch in and
> > still see the issue.
> > 
> 
> Coud you try the following one liner ?

Works for me. So far no problems.

Reported-and-tested-by: Kirill A. Shutemov <kirill@shutemov.name>

> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 255a9f5..173acf5 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1199,6 +1199,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  		skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
>  	}
>  
> +	skb_reset_network_header(skb);
>  	rxhash = skb_get_rxhash(skb);
>  	netif_rx_ni(skb);
>  
> 
> 

-- 
 Kirill A. Shutemov


* Re: [PATCH iproute2 v2] ip: use rtnelink to manage mroute
From: Nicolas Dichtel @ 2012-12-17  8:41 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20121214101004.56a1fb62@nehalam.linuxnetplumber.net>

Le 14/12/2012 19:10, Stephen Hemminger a écrit :
> On Thu, 13 Dec 2012 10:16:42 +0100
> Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
>
>> mroute was using /proc/net/ip_mr_[vif|cache] to display mroute entries. Hence,
>> only RT_TABLE_DEFAULT was displayed and only IPv4.
>> With rtnetlink, it is possible to display all tables for IPv4 and IPv6. The output
>> format is kept. Also, like before the patch, statistics are displayed when user specify
>> the '-s' argument.
>>
>> The patch also adds the support of 'ip monitor mroute', which is now possible.
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>
> Applied. I had to clean up some merge conflicts because of applying your
> patches out of order. It would help if you would recheck the version
> that I just pushed to git.
>
Your version is ok.


* Re: [PATCH iproute2 6/6] ip/link_iptnl: fix indentation
From: Nicolas Dichtel @ 2012-12-17  8:44 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20121214100212.00297856@nehalam.linuxnetplumber.net>

Le 14/12/2012 19:02, Stephen Hemminger a écrit :
> On Thu, 13 Dec 2012 14:42:54 +0100
> Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
>
>> Use tabs instead of space when possible.
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>
> Thanks applied all these.
>
Two patches are missing in your tree:
1/6 ip: update man pages and usage() for 'ip monitor'
2/6 ip: add man pages for netconf

Should I resend them?


* Re: net/usb device additions for -stable
From: Bjørn Mork @ 2012-12-17  9:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, jspurohit, valdis.kletnieks, jan.ceuleers, olof.ermis,
	tommy7765
In-Reply-To: <20121214.181241.739657750054924669.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:
> From: Bjørn Mork <bjorn@mork.no>
> Date: Tue, 13 Nov 2012 21:25:58 +0100
>
>> I looked quickly through the list of added devices in the range
>> v3.6..net/master and tried to cherry-pick them into the current 3.0,
>> 3.2, 3.4 and 3.6 stable trees.  There weren't really that many. The
>> result was:
>> 
>> # for stable-3.6:
>> af1b85e usb/ipheth: Add iPhone 5 support
>> c6846ee net: qmi_wwan: adding more ZTE devices
>> bbc8d92 net: cdc_ncm: add Huawei devices
>> 
>> # for stable-3.4:
>> af1b85e usb/ipheth: Add iPhone 5 support
>> 
>> # for stable-3.2:
>> af1b85e usb/ipheth: Add iPhone 5 support
>> 
>> # for stable-3.0:
>> af1b85e usb/ipheth: Add iPhone 5 support
>
> The iPhone 5 change applied cleanly in all cases so I added that one.
> The others did not.

Sorry about that. They did apply when I tested them, but the conditions
must have been different.

> Could you respin them for me and I'll queue them up for the next batch
> I send out?

I don't think there is much point anymore, as Greg has announced the
last 3.6 stable: https://lkml.org/lkml/2012/12/14/441

I'll come back with a set for 3.7 stable when there is something to add
instead.


Thanks,
Bjørn


* Re: [PATCH 4/4] FEC: Add time stamping code and a PTP hardware clock
From: Sascha Hauer @ 2012-12-17  9:13 UTC (permalink / raw)
  To: Frank Li
  Cc: lznua, richardcochran, shawn.guo, linux-arm-kernel, netdev, davem
In-Reply-To: <1351657531-25989-1-git-send-email-Frank.Li@freescale.com>

On Wed, Oct 31, 2012 at 12:25:31PM +0800, Frank Li wrote:
> This patch adds a driver for the FEC (MX6) that offers time
> stamping and a PTP hardware clock. Because FEC/ENET (MX6)
> hardware frequency adjustment is complex, we have implemented
> this in software by changing the multiplication factor of the
> timecounter.
> 
> Signed-off-by: Frank Li <Frank.Li@freescale.com>
> ---
>  drivers/net/ethernet/freescale/Kconfig   |    9 +
>  drivers/net/ethernet/freescale/Makefile  |    1 +
>  drivers/net/ethernet/freescale/fec.c     |   88 +++++++-
>  drivers/net/ethernet/freescale/fec.h     |   38 +++
>  drivers/net/ethernet/freescale/fec_ptp.c |  386 ++++++++++++++++++++++++++++++
>  5 files changed, 521 insertions(+), 1 deletions(-)
>  create mode 100644 drivers/net/ethernet/freescale/fec_ptp.c
> 
> diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig
> index feff516..ff3be53 100644
> --- a/drivers/net/ethernet/freescale/Kconfig
> +++ b/drivers/net/ethernet/freescale/Kconfig
> @@ -92,4 +92,13 @@ config GIANFAR
>  	  This driver supports the Gigabit TSEC on the MPC83xx, MPC85xx,
>  	  and MPC86xx family of chips, and the FEC on the 8540.
>  
> +config FEC_PTP
> +	bool "PTP Hardware Clock (PHC)"
> +	depends on FEC
> +	select PPS
> +	select PTP_1588_CLOCK
> +	--help---
> +	  Say Y here if you want to use PTP Hardware Clock (PHC) in the
> +	  driver.  Only the basic clock operations have been implemented.
> +
>  endif # NET_VENDOR_FREESCALE
> diff --git a/drivers/net/ethernet/freescale/Makefile b/drivers/net/ethernet/freescale/Makefile
> index 3d1839a..d4d19b3 100644
> --- a/drivers/net/ethernet/freescale/Makefile
> +++ b/drivers/net/ethernet/freescale/Makefile
> @@ -3,6 +3,7 @@
>  #
>  
>  obj-$(CONFIG_FEC) += fec.o
> +obj-$(CONFIG_FEC_PTP) += fec_ptp.o
>  obj-$(CONFIG_FEC_MPC52xx) += fec_mpc52xx.o
>  ifeq ($(CONFIG_FEC_MPC52xx_MDIO),y)
>  	obj-$(CONFIG_FEC_MPC52xx) += fec_mpc52xx_phy.o
> diff --git a/drivers/net/ethernet/freescale/fec.c b/drivers/net/ethernet/freescale/fec.c
> index d0e1b33..2665162 100644
> --- a/drivers/net/ethernet/freescale/fec.c
> +++ b/drivers/net/ethernet/freescale/fec.c
> @@ -280,6 +280,17 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>  			| BD_ENET_TX_LAST | BD_ENET_TX_TC);
>  	bdp->cbd_sc = status;
>  
> +#ifdef CONFIG_FEC_PTP

This ifdef desert in the fec driver currently breaks all SoCs except
i.MX6 in the imx_v6_v7_defconfig.

Most of these could be fixed with something like if (fec_use_ptp(fep)),


>  #if defined(CONFIG_M523x) || defined(CONFIG_M527x) || defined(CONFIG_M528x) || \
>      defined(CONFIG_M520x) || defined(CONFIG_M532x) || \
>      defined(CONFIG_ARCH_MXC) || defined(CONFIG_SOC_IMX28)
> @@ -88,6 +94,13 @@ struct bufdesc {
>  	unsigned short cbd_datlen;	/* Data length */
>  	unsigned short cbd_sc;	/* Control and status info */
>  	unsigned long cbd_bufaddr;	/* Buffer address */
> +#ifdef CONFIG_FEC_PTP
> +	unsigned long cbd_esc;
> +	unsigned long cbd_prot;
> +	unsigned long cbd_bdu;
> +	unsigned long ts;
> +	unsigned short res0[4];
> +#endif
>  };

This one changes the layout of the hardware buffer descriptor, which is
not so easy to fix.
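
One possible direction, sketched here as a userspace model rather than as
a tested driver change: keep the legacy descriptor layout untouched, wrap
it in an extended descriptor, and choose between the two at run time.
Only fec_use_ptp() is a name suggested above; struct bufdesc_ex,
struct fec_priv_model, fec_bd_size() and fec_next_bd() are hypothetical,
and the fixed-width types assume 32-bit descriptor words.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy descriptor: layout stays the same for every SoC. */
struct bufdesc {
	uint16_t cbd_datlen;	/* Data length */
	uint16_t cbd_sc;	/* Control and status info */
	uint32_t cbd_bufaddr;	/* Buffer address */
};

/* Extended descriptor, used only by hardware that supports it. */
struct bufdesc_ex {
	struct bufdesc desc;
	uint32_t cbd_esc;
	uint32_t cbd_prot;
	uint32_t cbd_bdu;
	uint32_t ts;
	uint16_t res0[4];
};

/* Minimal stand-in for the driver's private data. */
struct fec_priv_model {
	int bufdesc_ex;		/* set at probe time for capable hardware */
};

/* Run-time replacement for the compile-time #ifdef CONFIG_FEC_PTP. */
int fec_use_ptp(const struct fec_priv_model *fep)
{
	return fep->bufdesc_ex;
}

/* Size of one ring entry for this device. */
size_t fec_bd_size(const struct fec_priv_model *fep)
{
	return fec_use_ptp(fep) ? sizeof(struct bufdesc_ex)
				: sizeof(struct bufdesc);
}

/* Step to the next ring entry without assuming a fixed descriptor size. */
struct bufdesc *fec_next_bd(const struct fec_priv_model *fep,
			    struct bufdesc *bdp)
{
	return (struct bufdesc *)((char *)bdp + fec_bd_size(fep));
}
```

The cost of such an approach is that every place that walks the ring
with bdp++ would have to go through a helper like fec_next_bd().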

I don't know how to continue from here. Since the whole patch doesn't
seem to have been reviewed very much, I tend to say we should revert it
for now and let Frank redo it for the next merge window.

Other opinions?

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |


* Re: [PATCH v3] netfilter: nf_conntrack_sip: Handle Cisco 7941/7945 IP phones
From: Pablo Neira Ayuso @ 2012-12-17  9:55 UTC (permalink / raw)
  To: Kevin Cernekee
  Cc: David Woodhouse, Eric Dumazet, Patrick McHardy, David S. Miller,
	Alexey Kuznetsov, Pekka Savola (ipv6), James Morris,
	Hideaki YOSHIFUJI, netfilter-devel, netfilter, coreteam,
	linux-kernel, netdev
In-Reply-To: <CAJiQ=7BBquMQmQWp3=aD_s3-rSYr4Y+gke0GJKCkJV-mq5buug@mail.gmail.com>

On Sun, Dec 16, 2012 at 11:26:31PM -0800, Kevin Cernekee wrote:
> On Sun, Dec 16, 2012 at 4:44 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >> What happened to this? OpenWRT is still carrying it, and it broke in
> >> 3.7. Here's a completely untested update...
> >
> > I requested Kevin to resend a new version based on the current kernel
> > tree while spinning on old pending patches since I have no access to
> > that hardware, but no luck.
> >
> > So I'll review this and, since OpenWRT is carrying, I guess we can get
> > this into net-next merge window.
> 
> Sorry, been putting it off since the OpenWRT version has worked flawlessly...
> 
> I just reassembled my test rig and I'll get you a working patch this week.
> 
> Is it OK to use
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git as the
> baseline?

That's fine in this case because no recent changes went into that
code, but better if you use the netfilter next tree:

git://1984.lsi.us.es/nf-next

Thanks Kevin.


* RE: [tcpdump-workers] vlan tagged packets and libpcap breakage
From: David Laight @ 2012-12-17  9:50 UTC (permalink / raw)
  To: Daniel Borkmann, Ani Sinha
  Cc: Michael Richardson, netdev, tcpdump-workers, Francesco Ruggeri
In-Reply-To: <CAD6jFUTht82HOjGjDU7hFCEWyE3TOx_W4_j=SZK-DrcGfrio-A@mail.gmail.com>

> > I do agree that instead of a /proc entry, we should check for a kernel
> > version >= X where X is the upstream version that first started
> > supporting all the features needed by libpcap for vlan filtering. This
> > is not a compile time check but a run time one. Does anyone see any
> > issues with this? Is there any long term implications of this, like if
> > you backport patches to an older long term supported kernel? Are there
> > other better ways to do this, like may be returning feature bits from
> > an ioctl call? This is something we need to deal with on a continuous
> > basis as we keep supporting newer AUX fields and libpcap and other
> > user land code needs to make use of it. At the same time, they need to
> > handle backward compatibility issues with older kernels.
> 
> As Eric mentioned earlier, for now there seems not to be a reliable
> way to get to know which ops are present and which not. It's not
> really nice, but if you want to make use of those new (ANC*) features,
> probably checking kernel version might be the only way if I'm not
> missing something. Now net-next is closed, but if it reopens, I'll
> submit a version 2 of my patch where you've been CC'd to. If it gets
> in, then at least it's for sure that since kernel <xyz> this kind of
> feature test is present.

How are you going to tell whether a feature is present in a non-Linux
kernel ?

Testing kernel versions is somewhat suboptimal as support
could be patched into a much older kernel (maybe not for
this but ...)
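
For reference, the run-time version check being discussed might look like
the sketch below. It uses uname(2); the function names are hypothetical,
and the threshold is left as a parameter because the version "X" has not
been settled in this thread.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Parse "major.minor" from a release string such as "3.8.0-rc1". */
int parse_kernel_release(const char *release, int *major, int *minor)
{
	return sscanf(release, "%d.%d", major, minor) == 2 ? 0 : -1;
}

/*
 * Return 1 if release is at least want_major.want_minor, 0 if it is
 * older, and -1 if the string cannot be parsed.
 */
int kernel_release_at_least(const char *release, int want_major,
			    int want_minor)
{
	int major, minor;

	if (parse_kernel_release(release, &major, &minor) < 0)
		return -1;
	if (major != want_major)
		return major > want_major;
	return minor >= want_minor;
}

/* Same check against the running kernel. */
int running_kernel_at_least(int want_major, int want_minor)
{
	struct utsname u;

	if (uname(&u) < 0)
		return -1;
	return kernel_release_at_least(u.release, want_major, want_minor);
}
```

A kernel that reports an old release string but carries the feature as a
backport is reported as "too old" by such a check, which is exactly the
weakness mentioned above.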

	David


* RE: [PATCH] netlink: align attributes on 64-bits
From: David Laight @ 2012-12-17  9:59 UTC (permalink / raw)
  To: Nicolas Dichtel, tgraf; +Cc: netdev, davem
In-Reply-To: <1355491002-3931-1-git-send-email-nicolas.dichtel@6wind.com>

> -	if (unlikely(skb_tailroom(skb) < nla_total_size(attrlen)))
> +	int align = IS_ALIGNED((unsigned long)skb_tail_pointer(skb), sizeof(void *)) ? 0 : 4;
> +
> +	if (unlikely(skb_tailroom(skb) < nla_total_size(attrlen) + align))
>  		return -EMSGSIZE;
> 
> +	if (align) {
> +		/* Goal is to add an attribute with size 4. We know that
> +		 * NLA_HDRLEN is 4, hence payload is 0.
> +		 */
> +		__nla_reserve(skb, 0, 0);
> +	}
> +

Shouldn't the size of the dummy parameter be based on the value
of 'align' - and that be based on the amount of padding needed?

That aligns the write pointer, what guarantees the alignment of
the start of the buffer - so that the reader will find aligned data?

What guarantees that the reader will read the data into an
8-byte aligned buffer?

There is also the lurking issue of items that require more
than 8-byte alignment.
(x86/amd64 requires 16-byte alignment for 16-byte SSE2 regs and
32-byte alignment for the AVX regs.)

Will anyone ever want to put such items into a netlink message?
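
The arithmetic behind the first question can be sketched as follows. This
is not the code from the patch: it is a hypothetical helper that computes
how many bytes must be inserted before an attribute header so that the
attribute payload lands on a given boundary, assuming a 4-byte attribute
header (as the patch comment states for NLA_HDRLEN) and offsets measured
from the start of the buffer.

```c
#include <stddef.h>

#define NLA_HDRLEN_BYTES 4	/* assumed header size, per the patch comment */

/*
 * Padding needed before an attribute header placed at `offset` so that
 * the payload that follows the header starts on an `align`-byte
 * boundary. `align` must be a power of two.
 */
size_t nla_pad_for_payload(size_t offset, size_t align)
{
	size_t payload = offset + NLA_HDRLEN_BYTES;

	return (align - (payload & (align - 1))) & (align - 1);
}
```

With 4-byte aligned offsets the result is always a multiple of 4, so it
could be filled by one dummy attribute whose payload is the result minus
the header size. The result is only meaningful if the start of the
buffer itself is aligned, for the writer and for the reader.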

	David


* Re: vlan tagged packets and libpcap breakage
From: Guy Harris @ 2012-12-17 10:35 UTC (permalink / raw)
  To: David Laight
  Cc: Michael Richardson, netdev, Francesco Ruggeri, Daniel Borkmann,
	tcpdump-workers
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B70EF@saturn3.aculab.com>


On Dec 17, 2012, at 1:50 AM, "David Laight" <David.Laight@ACULAB.COM> wrote:

> How are you going to tell whether a feature is present in a non-Linux
> kernel ?

The Linux memory-mapped capture mechanism is not present in a non-Linux kernel, so all the libpcap work involved here would, if necessary on other platforms, have to be done differently on those platforms.  Those platforms would have to have their own mechanisms to indicate whether any changes to filter code, processing of VLAN tags supplied out of band, etc. would need to be done.

The same would apply to other additional features of the Linux memory-mapped capture mechanism that require changes in libpcap.  (Ideally, those changes would only require changes in order to use them, and would not break existing userland code, including but not limited to libpcap - your reply was to Daniel Borkmann, who is, I believe, the originator of netsniff-ng:

	http://netsniff-ng.org

which has its own code using PF_PACKET sockets.)

_______________________________________________
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers

^ permalink raw reply

* Re: vlan tagged packets and libpcap breakage
From: Daniel Borkmann @ 2012-12-17 11:08 UTC (permalink / raw)
  To: Guy Harris; +Cc: Michael Richardson, netdev, tcpdump-workers, Francesco Ruggeri
In-Reply-To: <DE6D5B28-FA1E-4F04-9BDF-F6D35878776E@alum.mit.edu>

On Mon, Dec 17, 2012 at 11:35 AM, Guy Harris <guy@alum.mit.edu> wrote:
> On Dec 17, 2012, at 1:50 AM, "David Laight" <David.Laight@ACULAB.COM> wrote:
>
>> How are you going to tell whether a feature is present in a non-Linux
>> kernel ?
>
> The Linux memory-mapped capture mechanism is not present in a non-Linux kernel, so all the libpcap work involved here would, if necessary on other platforms, have to be done differently on those platforms.  Those platforms would have to have their own mechanisms to indicate whether any changes to filter code, processing of VLAN tags supplied out of band, etc. would need to be done.
>
> The same would apply to other additional features of the Linux memory-mapped capture mechanism that require changes in libpcap.

Exactly.

^ permalink raw reply

* RFC  [PATCH] iproute2:  temporary solution to fix xt breakage
From: Jamal Hadi Salim @ 2012-12-17 12:30 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Hasan Chowdhury, Jan Engelhardt, Yury Stankevich,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <50CE3203.9080007@mojatatu.com>

[-- Attachment #1: Type: text/plain, Size: 704 bytes --]

On 12-12-16 03:41 PM, Jamal Hadi Salim wrote:
>
> There is an "intermediate solution" from Hasan which doesnt require
> the kernel change. It changes the kernel endpoint to "ipt". I am
> conflicted because it is a quick hack while otoh forcing people to
> upgrade kernel is a usability issue.
>


Attached. Author is Hasan - I didn't sign it because I am looking for
feedback, and I find it distasteful, but it solves the problem.
This is needed until a proper fix in the kernel has propagated.
Once that kernel change is ubiquitous, this change is noise and a
maintenance pain. I am making it hard to even turn it on
(i.e. someone knowledgeable will have to compile with CONFIG_XT_HACK).
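
The idea of the workaround can be sketched as a pair of name mappings: "xt"
is sent to an old kernel as "ipt", and "ipt" coming back from the kernel is
looked up as "xt". This is only an illustration of the mapping, not the
attached patch; kind_for_kernel() and kind_from_kernel() are hypothetical
helper names.

```c
#include <string.h>

/* Hypothetical helper: action kind to place in TCA_ACT_KIND for an old kernel. */
static const char *kind_for_kernel(const char *k)
{
	return strcmp(k, "xt") == 0 ? "ipt" : k;
}

/* Hypothetical helper: userspace module name for a kind reported by the kernel. */
static const char *kind_from_kernel(const char *kind)
{
	return strcmp(kind, "ipt") == 0 ? "xt" : kind;
}
```

Any other action name passes through both helpers unchanged.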

cheers,
jamal



[-- Attachment #2: p1 --]
[-- Type: text/plain, Size: 1092 bytes --]

diff --git a/tc/m_action.c b/tc/m_action.c
index 1fe2431..fa9a7c8 100644
--- a/tc/m_action.c
+++ b/tc/m_action.c
@@ -209,10 +209,17 @@ done0:
 
 			tail = NLMSG_TAIL(n);
 			addattr_l(n, MAX_MSG, ++prio, NULL, 0);
+			/*XXX: hack to work around old kernels, newer xtables */
+#ifdef CONFIG_XT_HACK
+			if (strcmp(k, "xt") == 0)
+				addattr_l(n, MAX_MSG, TCA_ACT_KIND, "ipt" , strlen("ipt") + 1);
+			else 
+				addattr_l(n, MAX_MSG, TCA_ACT_KIND, k, strlen(k) + 1);
+#else
 			addattr_l(n, MAX_MSG, TCA_ACT_KIND, k, strlen(k) + 1);
+#endif
 
 			ret = a->parse_aopt(a,&argc, &argv, TCA_ACT_OPTIONS, n);
-
 			if (ret < 0) {
 				fprintf(stderr,"bad action parsing\n");
 				goto bad_val;
@@ -259,7 +266,15 @@ tc_print_one_action(FILE * f, struct rtattr *arg)
 	}
 
 
+	/*XXX: hack to work around old kernels, newer xtables */
+#ifdef CONFIG_XT_HACK
+	if (strcmp(RTA_DATA(tb[TCA_ACT_KIND]), "ipt")==0)
+		a = get_action_kind("xt");
+	else 
+		a = get_action_kind(RTA_DATA(tb[TCA_ACT_KIND]));
+#else
 	a = get_action_kind(RTA_DATA(tb[TCA_ACT_KIND]));
+#endif
 	if (NULL == a)
 		return err;
 

^ permalink raw reply related

* bug? mac 00:00:00:00:00:00 with natsemi DP83815 after driver load
From: Roland Kletzing @ 2012-12-17 12:38 UTC (permalink / raw)
  To: netdev

Hello,
I recently experimented with my older Evo T20 / Wyse 3235LE thin clients and
flashed a Linux kernel onto them. There seems to be an issue with the natsemi
driver.

After the driver (natsemi.ko) loads, eth0 has no valid MAC address; dmesg and
ifconfig show only zeros: 00:00:00:00:00:00.

Despite that, the NIC works fine for me (in this test setup I set the
MAC manually: ifconfig eth0 hw ether de:ad:be:ef:be:ef).

Apparently the driver fails to read the proper MAC from the EEPROM, since
"natsemi-diag -ee" (from nictools-pci in Debian squeeze) shows that there
is a valid "Ethernet MAC Station Address" stored in the EEPROM (see
below).

This looks like a driver bug.
Does anybody have a clue what is going wrong here?

regards
roland

#lspci

00:00.0 Host bridge: Cyrix Corporation PCI Master
00:0f.0 Ethernet controller: National Semiconductor Corporation DP83815 (MacPhyter) Ethernet Controller
00:12.0 ISA bridge: Cyrix Corporation 5530 Legacy [Kahlua] (rev 30)
00:12.1 Bridge: Cyrix Corporation 5530 SMI [Kahlua]
00:12.2 IDE interface: Cyrix Corporation 5530 IDE [Kahlua]
00:12.3 Multimedia audio controller: Cyrix Corporation 5530 Audio [Kahlua]
00:12.4 VGA compatible controller: Cyrix Corporation 5530 Video [Kahlua]
00:13.0 USB Controller: Compaq Computer Corporation ZFMicro Chipset USB (rev 06)

#dmesg |egrep "natsemi|eth"
natsemi dp8381x driver, version 2.1, Sept 11, 2006
natsemi 0000:00:0f.0: setting latency timer to 64
natsemi eth0: NatSemi DP8381[56] at 0x4010000 (0000:00:0f.0), 00:00:00:00:00:00, IRQ 10, port TP.
eth0: DSPCFG accepted after 0 usec.
eth0: link up.
eth0: Setting full-duplex based on negotiated link capability.

#natsemi-diag -aa
natsemi-diag.c:v2.08 2/28/2005 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a NatSemi DP83815 adapter at 0xf800.
 Natsemi 83815 series with station address de:ad:be:ef:be:ef
 Transceiver setting Autonegotation advertise 10/100 Mbps half and full duplex.
 This device appears to be active, so some registers will not be read.
 To see all register values use the '-f' flag.
NatSemi DP83815 chip registers at 0xf800
 0x000: 00000004 e805e000 00000002 00000000 ******** 00f1cd65 00000001 00000000
 0x020: 03abd200 d0f01002 00000000 00000000 03abd000 18700010 00000000 00000000
 0x040: ******** 00200000 00000004 0000efbe ffff000b 30303030 00000403 00000000
 0x060: ******** ******** ******** ******** ******** ******** ******** ********
 0x080: 00003100 0000786d 00002000 00005c21 000005e1 000045e1 00000005 00002801
 0x0A0: ******** ******** ******** ******** ******** ******** ******** ********
 0x0C0: 00000615 00000002 00000000 00000000 00000000 00000000 00000100 00000030
 0x0E0: 00000000 000000bf 00000804 00008200 00000000 00000000 00000000 00000000
  Interrupt sources are pending (00000200).
   Tx queue emptied indication.
  Receive mode is 0xc8200000: Normal unicast and hashed multicast.
  Rx filter contents:   adde efbe efbe 0000 0000 0000 0000 0000

#natsemi-diag -ee
natsemi-diag.c:v2.08 2/28/2005 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a NatSemi DP83815 adapter at 0xf800.
 Natsemi 83815 series with station address de:ad:be:ef:be:ef
 Transceiver setting Autonegotation advertise 10/100 Mbps half and full duplex.
 EEPROM address length 6, 64 words.
EEPROM contents (64 words):
0x00:  100b 0020 0b34 41fb 0000 0000 0000 4000
0x08:  0d32 dff4 1905 aa48 0000 0000 129c 4c4c
0x10:  ca52 2ccc 0cb2 9c6c 0c6c 8c0c 2020 6080
0x18:  0800 0000 0000 0000 0000 0000 0000 0000
0x20:  0000 0000 0000 0000 0000 0000 0000 0000
0x28:  0000 0000 0000 0000 0000 0000 0000 0000
0x30:  0000 0000 0000 0000 0000 0000 0000 0000
0x38:  0000 0000 0000 0000 0000 0000 0000 e418
Decoded EEPROM contents:
  PCI Subsystem IDs -- Vendor 0x100b, Device 0x0020.
  PCI timer settings -- minimum grant 11, maximum latency 52.
  Ethernet MAC Station Address 00:80:64:1a:e8:bf.
  Wake-On-LAN password 00:00:00:00:00:00.
  Transceiver setting 0x--f-: advertise 10/100 Mbps half and full duplex.
   Flow control enabled.
  EEPROM active region checksum read as aa48, vs aa48 calculated value.
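
As a cross-check of the dump above, the following sketch decodes the station
address by hand from the raw EEPROM words. The bit layout is an assumption
(it is the one that reproduces the address natsemi-diag decodes), and
decode_mac() and eeprom_head are names made up for this illustration.

```c
#include <stdint.h>

/*
 * Assumed layout: the address is stored shifted by one bit. Bit 15 of
 * EEPROM word 6 supplies the lowest bit of the first byte, and words 7..9
 * supply the remaining bits, so every even byte straddles a word boundary.
 */
static void decode_mac(const uint16_t *ee, uint8_t mac[6])
{
	uint16_t prev = ee[6];

	for (int i = 0; i < 3; i++) {
		uint16_t cur = ee[7 + i];

		mac[2 * i]     = (uint8_t)((cur << 1) | (prev >> 15));
		mac[2 * i + 1] = (uint8_t)(cur >> 7);
		prev = cur;
	}
}

/* EEPROM words 0x00..0x09, copied from the dump above. */
static const uint16_t eeprom_head[10] = {
	0x100b, 0x0020, 0x0b34, 0x41fb, 0x0000,
	0x0000, 0x0000, 0x4000, 0x0d32, 0xdff4,
};
```

With the words from this dump the sketch yields 00:80:64:1a:e8:bf, matching
the decoded output, so the EEPROM contents themselves look intact.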



 

^ permalink raw reply

* Re: tc ipt action
From: Jamal Hadi Salim @ 2012-12-17 12:58 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc,
	netdev@vger.kernel.org, netfilter-devel
In-Reply-To: <alpine.LNX.2.01.1212162220270.20281@nerf07.vanv.qr>

On 12-12-16 04:21 PM, Jan Engelhardt wrote:

> If you have a preexisting clone of any linux tree, you can utilize
> `git remote add ...` to only grab the deltas.

It downloaded eventually.
So, looking at this quickly, a basic question: is xtables2 different API-wise
from what we do today in act_ipt?
Second: are chain names unique system-wide? I.e. at the moment we send
a hook and table selection?
The patch I currently have for the kernel tries to pursue an approach
that maximizes code reuse - depending on your answer I may take the
approach of having a separate act_xt and hope you can build on top of that.

cheers,
jamal


^ permalink raw reply

* [PATCH] bugfix: network namespace & device dummy
From: V. Lavrov @ 2012-12-17 13:01 UTC (permalink / raw)
  To: netdev

If a container has a network device dummyX (with lxc.network.type = phys), the device disappears from the system when the container is closed.
This patch returns the dummyX device to the initial network namespace after the container is closed.

Signed-off-by: Vitaly Lavrov <lve@guap.ru>
---
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index bab0158..efa990c 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -160,6 +160,41 @@ static struct rtnl_link_ops dummy_link_ops __read_mostly = {
  module_param(numdummies, int, 0);
  MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");

+
+static void __net_exit dummy_net_exit(struct net *net) {
+       struct net_device *dev, *aux;
+       int err;
+
+       if(net == &init_net) return;
+
+       rtnl_lock();
+       for_each_netdev_safe(net, dev, aux) {
+               if(dev->rtnl_link_ops == &dummy_link_ops) {
+                       err = dev_change_net_namespace(dev, &init_net, dev->name);
+                       if(err) {
+                               char fb_name[IFNAMSIZ];
+                               printk (KERN_INFO "%s: dev_change_net_namespace(init_net,%s) err: %d\n",
+                                       __func__,dev->name,err);
+                               snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
+                               err = dev_change_net_namespace(dev, &init_net, fb_name);
+                               if(err)
+                                       printk (KERN_INFO "%s: dev_change_net_namespace(%s,init_net,%s) err: %d\n",
+                                               __func__,dev->name,fb_name,err);
+                               else
+                                       printk (KERN_INFO "%s: %s rename to %s\n",
+                                               __func__,dev->name,fb_name);
+
+                       }
+               }
+       }
+       rtnl_unlock();
+}
+
+static struct pernet_operations __net_initdata dummy_net_ops = {
+       .exit = dummy_net_exit,
+};
+
+
  static int __init dummy_init_one(void)
  {
         struct net_device *dev_dummy;
@@ -184,6 +219,10 @@ static int __init dummy_init_module(void)
  {
         int i, err = 0;

+       err = register_pernet_device(&dummy_net_ops);
+       if(err)
+               return err;
+
         rtnl_lock();
         err = __rtnl_link_register(&dummy_link_ops);

@@ -191,8 +230,10 @@ static int __init dummy_init_module(void)
                 err = dummy_init_one();
                 cond_resched();
         }
-       if (err < 0)
+       if (err < 0) {
                 __rtnl_link_unregister(&dummy_link_ops);
+               unregister_pernet_device(&dummy_net_ops);
+       }
         rtnl_unlock();

         return err;
@@ -201,6 +242,7 @@ static int __init dummy_init_module(void)
  static void __exit dummy_cleanup_module(void)
  {
         rtnl_link_unregister(&dummy_link_ops);
+       unregister_pernet_device(&dummy_net_ops);
  }

  module_init(dummy_init_module);
--

^ permalink raw reply related

* [PATCH 0/3 v2] net/macb: RX path enhancement
From: Nicolas Ferre @ 2012-12-17 13:01 UTC (permalink / raw)
  To: David S. Miller, netdev
  Cc: linux-arm-kernel, linux-kernel, Joachim Eastwood,
	Jean-Christophe PLAGNIOL-VILLARD, Nicolas Ferre

Hi,

Here is the patch series modifying the RX path in the macb driver.
This change applies to the GEM variant of the Cadence IP and introduces
function pointers to match the path to the proper adapter. The move
to RX buffers that are sized to the MTU and can be DMAed directly into an
SKB is done in two steps, but these can be merged into a single patch.

v2: - gave up the idea of using non-coherent memory for
      rx buffers
    - addition of the struct macb layout optimization


Havard Skinnemoen (1):
  net/macb: Try to optimize struct macb layout

Nicolas Ferre (2):
  net/macb: increase RX buffer size for GEM
  net/macb: change RX path for GEM

 drivers/net/ethernet/cadence/macb.c | 323 +++++++++++++++++++++++++++++++-----
 drivers/net/ethernet/cadence/macb.h |  35 ++--
 2 files changed, 306 insertions(+), 52 deletions(-)

-- 
1.8.0

^ permalink raw reply

* [PATCH 1/3 v2] net/macb: increase RX buffer size for GEM
From: Nicolas Ferre @ 2012-12-17 13:01 UTC (permalink / raw)
  To: David S. Miller, netdev
  Cc: linux-arm-kernel, linux-kernel, Joachim Eastwood,
	Jean-Christophe PLAGNIOL-VILLARD, Nicolas Ferre
In-Reply-To: <cover.1355748676.git.nicolas.ferre@atmel.com>

The MACB Ethernet controller requires an RX buffer of 128 bytes. That is
highly sub-optimal for the Gigabit-capable GEM, which is able to use
a bigger DMA buffer. Replace this constant and its associated macros
with data stored in the private structure.
The RX DMA buffer size has to be a multiple of 64 bytes, as indicated in
the DMA Configuration Register specification.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
---
 drivers/net/ethernet/cadence/macb.c | 45 ++++++++++++++++++++++++++++++-------
 drivers/net/ethernet/cadence/macb.h |  1 +
 2 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index a9b0830..50f8669 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -32,7 +32,9 @@
 
 #include "macb.h"
 
-#define RX_BUFFER_SIZE		128
+#define MACB_RX_BUFFER_SIZE	128
+#define GEM_RX_BUFFER_SIZE	2048
+#define RX_BUFFER_MULTIPLE	64  /* bytes */
 #define RX_RING_SIZE		512 /* must be power of 2 */
 #define RX_RING_BYTES		(sizeof(struct macb_dma_desc) * RX_RING_SIZE)
 
@@ -92,7 +94,7 @@ static struct macb_dma_desc *macb_rx_desc(struct macb *bp, unsigned int index)
 
 static void *macb_rx_buffer(struct macb *bp, unsigned int index)
 {
-	return bp->rx_buffers + RX_BUFFER_SIZE * macb_rx_ring_wrap(index);
+	return bp->rx_buffers + bp->rx_buffer_size * macb_rx_ring_wrap(index);
 }
 
 void macb_set_hwaddr(struct macb *bp)
@@ -572,7 +574,7 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
 	skb_put(skb, len);
 
 	for (frag = first_frag; ; frag++) {
-		unsigned int frag_len = RX_BUFFER_SIZE;
+		unsigned int frag_len = bp->rx_buffer_size;
 
 		if (offset + frag_len > len) {
 			BUG_ON(frag != last_frag);
@@ -580,7 +582,7 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
 		}
 		skb_copy_to_linear_data_offset(skb, offset,
 				macb_rx_buffer(bp, frag), frag_len);
-		offset += RX_BUFFER_SIZE;
+		offset += bp->rx_buffer_size;
 		desc = macb_rx_desc(bp, frag);
 		desc->addr &= ~MACB_BIT(RX_USED);
 
@@ -860,6 +862,30 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
+static void macb_init_rx_buffer_size(struct macb *bp)
+{
+	if (!macb_is_gem(bp)) {
+		bp->rx_buffer_size = MACB_RX_BUFFER_SIZE;
+	} else {
+		bp->rx_buffer_size = GEM_RX_BUFFER_SIZE;
+
+		if (bp->rx_buffer_size > PAGE_SIZE) {
+			netdev_warn(bp->dev,
+				    "RX buffer cannot be bigger than PAGE_SIZE, shrinking\n");
+			bp->rx_buffer_size = PAGE_SIZE;
+		}
+		if (bp->rx_buffer_size % RX_BUFFER_MULTIPLE) {
+			netdev_warn(bp->dev,
+				    "RX buffer must be multiple of %d bytes, shrinking\n",
+				    RX_BUFFER_MULTIPLE);
+			bp->rx_buffer_size =
+				rounddown(bp->rx_buffer_size, RX_BUFFER_MULTIPLE);
+		}
+		bp->rx_buffer_size = max_t(size_t, RX_BUFFER_MULTIPLE, bp->rx_buffer_size);
+	}
+}
+
+
 static void macb_free_consistent(struct macb *bp)
 {
 	if (bp->tx_skb) {
@@ -878,7 +904,7 @@ static void macb_free_consistent(struct macb *bp)
 	}
 	if (bp->rx_buffers) {
 		dma_free_coherent(&bp->pdev->dev,
-				  RX_RING_SIZE * RX_BUFFER_SIZE,
+				  RX_RING_SIZE * bp->rx_buffer_size,
 				  bp->rx_buffers, bp->rx_buffers_dma);
 		bp->rx_buffers = NULL;
 	}
@@ -911,7 +937,7 @@ static int macb_alloc_consistent(struct macb *bp)
 		   "Allocated TX ring of %d bytes at %08lx (mapped %p)\n",
 		   size, (unsigned long)bp->tx_ring_dma, bp->tx_ring);
 
-	size = RX_RING_SIZE * RX_BUFFER_SIZE;
+	size = RX_RING_SIZE * bp->rx_buffer_size;
 	bp->rx_buffers = dma_alloc_coherent(&bp->pdev->dev, size,
 					    &bp->rx_buffers_dma, GFP_KERNEL);
 	if (!bp->rx_buffers)
@@ -936,7 +962,7 @@ static void macb_init_rings(struct macb *bp)
 	for (i = 0; i < RX_RING_SIZE; i++) {
 		bp->rx_ring[i].addr = addr;
 		bp->rx_ring[i].ctrl = 0;
-		addr += RX_BUFFER_SIZE;
+		addr += bp->rx_buffer_size;
 	}
 	bp->rx_ring[RX_RING_SIZE - 1].addr |= MACB_BIT(RX_WRAP);
 
@@ -1046,7 +1072,7 @@ static void macb_configure_dma(struct macb *bp)
 
 	if (macb_is_gem(bp)) {
 		dmacfg = gem_readl(bp, DMACFG) & ~GEM_BF(RXBS, -1L);
-		dmacfg |= GEM_BF(RXBS, RX_BUFFER_SIZE / 64);
+		dmacfg |= GEM_BF(RXBS, bp->rx_buffer_size / RX_BUFFER_MULTIPLE);
 		dmacfg |= GEM_BF(FBLDO, 16);
 		dmacfg |= GEM_BIT(TXPBMS) | GEM_BF(RXBMS, -1L);
 		gem_writel(bp, DMACFG, dmacfg);
@@ -1221,6 +1247,9 @@ static int macb_open(struct net_device *dev)
 	if (!bp->phy_dev)
 		return -EAGAIN;
 
+	/* RX buffers initialization */
+	macb_init_rx_buffer_size(bp);
+
 	err = macb_alloc_consistent(bp);
 	if (err) {
 		netdev_err(dev, "Unable to allocate DMA memory (error %d)\n",
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 570908b..2681455 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -544,6 +544,7 @@ struct macb {
 	unsigned int		rx_tail;
 	struct macb_dma_desc	*rx_ring;
 	void			*rx_buffers;
+	size_t			rx_buffer_size;
 
 	unsigned int		tx_head, tx_tail;
 	struct macb_dma_desc	*tx_ring;
-- 
1.8.0
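
The RX buffer sizing rule described in the commit message of patch 1/3 can
be sketched in userspace as follows. This is an illustration under stated
assumptions, not the driver code: the page size is passed in as a
parameter, and gem_rx_buffer_size() and rxbs_field() are hypothetical names.
Only RX_BUFFER_MULTIPLE is taken from the patch.

```c
#include <stddef.h>

#define RX_BUFFER_MULTIPLE 64	/* bytes, as in the patch */

/*
 * Hypothetical helper: clamp the wanted size to one page, round it down to
 * a multiple of 64 bytes, and never go below one 64-byte unit.
 */
static size_t gem_rx_buffer_size(size_t wanted, size_t page_size)
{
	size_t size = wanted;

	if (size > page_size)
		size = page_size;
	size -= size % RX_BUFFER_MULTIPLE;
	if (size < RX_BUFFER_MULTIPLE)
		size = RX_BUFFER_MULTIPLE;
	return size;
}

/* Hypothetical helper: value for the RXBS field, counted in 64-byte units. */
static unsigned int rxbs_field(size_t size)
{
	return (unsigned int)(size / RX_BUFFER_MULTIPLE);
}
```

With the patch's GEM_RX_BUFFER_SIZE of 2048 and a 4096-byte page, the size
stays at 2048, which corresponds to an RXBS value of 32.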

^ permalink raw reply related

