Netdev List


* Re: [patch net-next V3 0/4] net: allow to change carrier from userspace
From: Jiri Pirko @ 2012-12-23 10:12 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, bhutchings, mirqus, shemminger, greearb, fbl,
	john.r.fastabend
In-Reply-To: <1355909756-2389-1-git-send-email-jiri@resnulli.us>

Dave, I see that the patchwork status of this is "Deferred".
Should I repost the patchset?

thanks

Jiri


Wed, Dec 19, 2012 at 10:35:52AM CET, jiri@resnulli.us wrote:
>This is basically a V3 of a repost of my previous patchset:
>"[patch net-next-2.6 0/2] net: allow to change carrier via sysfs" from Aug 30
>
>The way net-sysfs stores values changed and this patchset reflects it.
>Also, I exposed carrier via rtnetlink iface.
>
>So far, only dummy driver uses carrier change ndo. In very near future
>team driver will use that as well.
>
>V2->V3:
> - updated ndo_change_carrier comment by Dan Williams
>
>V1->v2:
> - added bigger comment to ndo and also note to operstate.txt documentation
>   stating the clear purpose of this iface
>
>Jiri Pirko (4):
>  net: add change_carrier netdev op
>  net: allow to change carrier via sysfs
>  rtnl: expose carrier value with possibility to set it
>  dummy: implement carrier change
>
> Documentation/networking/operstates.txt |  4 ++++
> drivers/net/dummy.c                     | 10 ++++++++++
> include/linux/netdevice.h               | 12 ++++++++++++
> include/uapi/linux/if_link.h            |  1 +
> net/core/dev.c                          | 19 +++++++++++++++++++
> net/core/net-sysfs.c                    | 15 ++++++++++++++-
> net/core/rtnetlink.c                    | 10 ++++++++++
> 7 files changed, 70 insertions(+), 1 deletion(-)
>
>-- 
>1.8.0
>

* Re: Lockdep warning in vxlan
From: Yan Burman @ 2012-12-23  9:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Stephen Hemminger, netdev
In-Reply-To: <1356027360.21834.2973.camel@edumazet-glaptop>

On 20-Dec-12 20:16, Eric Dumazet wrote:
> On Thu, 2012-12-20 at 08:34 -0800, Stephen Hemminger wrote:
>> On Thu, 20 Dec 2012 16:00:32 +0200
>> Yan Burman <yanb@mellanox.com> wrote:
>>
>>> Hi.
>>>
>>> When working with vxlan from current net-next, I got a lockdep warning
>>> (below).
>>> It seems to happen when I have host B pinging host A and while the pings
>>> continue,
>>> I do "ip link del" on the vxlan interface on host A. The lockdep warning
>>> is on host A.
>>> Tell me if you need some more info.
>>>
>> Looks like the case of nested ARP requests, the initial request is coming
>> from neigh_timer (ARP retransmit), but inside neigh_probe the lock
>> is dropped?
> Bug is from arp_solicit(), releasing the lock after arp_send()
>
> Its used to protect neigh->ha
>
> We could instead copy neigh->ha, without taking n->lock but ha_lock
> seqlock, using neigh_ha_snapshot() helper
>
> Yan, could you test the following patch ?
>
> Thanks
> diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
> index ce6fbdf..1169ed4 100644
> --- a/net/ipv4/arp.c
> +++ b/net/ipv4/arp.c
> @@ -321,7 +321,7 @@ static void arp_error_report(struct neighbour *neigh, struct sk_buff *skb)
>   static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb)
>   {
>   	__be32 saddr = 0;
> -	u8  *dst_ha = NULL;
> +	u8 dst_ha[MAX_ADDR_LEN];
>   	struct net_device *dev = neigh->dev;
>   	__be32 target = *(__be32 *)neigh->primary_key;
>   	int probes = atomic_read(&neigh->probes);
> @@ -363,9 +363,9 @@ static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb)
>   	if (probes < 0) {
>   		if (!(neigh->nud_state & NUD_VALID))
>   			pr_debug("trying to ucast probe in NUD_INVALID\n");
> -		dst_ha = neigh->ha;
> -		read_lock_bh(&neigh->lock);
> +		neigh_ha_snapshot(dst_ha, neigh, dev);
>   	} else {
> +		memset(dst_ha, 0, dev->addr_len);
>   		probes -= neigh->parms->app_probes;
>   		if (probes < 0) {
>   #ifdef CONFIG_ARPD
> @@ -377,8 +377,6 @@ static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb)
>   
>   	arp_send(ARPOP_REQUEST, ETH_P_ARP, target, dev, saddr,
>   		 dst_ha, dev->dev_addr, NULL);
> -	if (dst_ha)
> -		read_unlock_bh(&neigh->lock);
>   }
>   
>   static int arp_ignore(struct in_device *in_dev, __be32 sip, __be32 tip)
>
>

I am not able to reproduce the problem now, either with or without
the patch...
I did get the warning twice when I first reported the issue.

* Re: [PATCH] bnx2x: use ARRAY_SIZE where possible
From: Eilon Greenstein @ 2012-12-23  8:49 UTC (permalink / raw)
  To: Sasha Levin; +Cc: netdev, linux-kernel
In-Reply-To: <1356030701-16284-16-git-send-email-sasha.levin@oracle.com>

On Thu, 2012-12-20 at 14:11 -0500, Sasha Levin wrote:
> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Hi Sasha,

I have no problem with this cosmetic change, but it should probably be
sent to net-next when it is open again. When you do resubmit it, you can
add my ACK:

Acked-by: Eilon Greenstein <eilong@broadcom.com>

Thanks,
Eilon

* Re: IPv6 over Firewire
From: YOSHIFUJI Hideaki @ 2012-12-23  8:23 UTC (permalink / raw)
  To: stephan.gatzka; +Cc: netdev, linux1394-devel, YOSHIFUJI Hideaki
In-Reply-To: <50D54ED9.6090908@gmail.com>

(2012-12-22 15:10), Stephan Gatzka wrote:
> 
>> Something like this:
>>
>>   static inline int ndisc_opt_addr_space(struct net_device *dev)
>>   {
>> -       return NDISC_OPT_SPACE(dev->addr_len + ndisc_addr_option_pad(dev->type));
>> +       switch (dev->type) {
>> +       case ARPHRD_IEEE1394:
>> +               return sizeof(struct ndisc_opt_ieee1394_llinfo);
>> +       default:
>> +               return NDISC_OPT_SPACE(dev->addr_len + ndisc_addr_option_pad(dev->type));
>> +       }
>>   }
>>
>> --yoshfuji
>>
> 
> O.k., this has the advantage that only ndisc packets get some more memory, but the question is if we are under such a hard memory pressure that we don't allow that.
> 
> Your solution has the disadvantage that now I have to publish struct ndisc_opt_ieee1394_llinfo to the ndisc stuff. Nobody in ndisc.c really wants to deal with that structure, only the size is of interest. So keeping this struct private is less invasive to the rest of linux. Just my two cents.

net/ipv6/ndisc.c SHOULD build full NDP messages for IPv6
over IEEE1394 as we do it for Infiniband.

Please, please do not try to mangle them in the driver.

--yoshfuji

* [3.8-rc1] Networking problems after pulling-in net.git#master
From: Sedat Dilek @ 2012-12-23  3:14 UTC (permalink / raw)
  To: netdev; +Cc: LKML, David Miller

Hi,

after reading the thread "Regression in 3.8-rc1: "BUG: sleeping
function called from invalid context"" [1] I decided to pull-in
net.git#master (up to commit 9b1536c490d5: "bridge: call
br_netpoll_disable in br_add_if") on top of Linus upstream.

With this kernel I can neither ping my router nor resolve DNS names.
IIRC there were 6 commits; I am not sure which of them is the culprit.

I tried the v3.8-rc1-raring from [2] which works nicely here on
Ubuntu/precise AMD64.
Can't say if NetworkManager is involved or not.

Any feedback appreciated, Thanks.

Regards,
- Sedat -

[1] https://lkml.org/lkml/2012/12/22/79
[2] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc1-raring/

* Re: Regression in 3.8-rc1: "BUG: sleeping function called from invalid context"
From: David Miller @ 2012-12-22 21:34 UTC (permalink / raw)
  To: bp; +Cc: Larry.Finger, linux-kernel, cl, penberg, netdev
In-Reply-To: <20121222180247.GA10250@liondog.tnic>

From: Borislav Petkov <bp@alien8.de>
Date: Sat, 22 Dec 2012 19:02:47 +0100

> Top-posting so that the rest can remain untouched.

This bug is fixed in the 'net' tree already by commit:

From 30e6c9fa93cf3dbc7cc6df1d748ad25e4264545a Mon Sep 17 00:00:00 2001
From: Eric Dumazet <edumazet@google.com>
Date: Thu, 20 Dec 2012 17:25:08 +0000
Subject: [PATCH 04/10] net: devnet_rename_seq should be a seqcount

Using a seqlock for devnet_rename_seq is not a good idea,
as device_rename() can sleep.

As we hold RTNL, we dont need a protection for writers,
and only need a seqcount so that readers can catch a change done
by a writer.

Bug added in commit c91f6df2db4972d3 (sockopt: Change getsockopt() of
SO_BINDTODEVICE to return an interface name)

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/netdevice.h |  2 +-
 net/core/dev.c            | 18 +++++++++---------
 net/core/sock.c           |  4 ++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 02e0f6b..c599e47 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1576,7 +1576,7 @@ extern int call_netdevice_notifiers(unsigned long val, struct net_device *dev);
 
 extern rwlock_t				dev_base_lock;		/* Device list lock */
 
-extern seqlock_t	devnet_rename_seq;	/* Device rename lock */
+extern seqcount_t	devnet_rename_seq;	/* Device rename seq */
 
 
 #define for_each_netdev(net, d)		\
diff --git a/net/core/dev.c b/net/core/dev.c
index d0cbc93..515473e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -203,7 +203,7 @@ static struct list_head offload_base __read_mostly;
 DEFINE_RWLOCK(dev_base_lock);
 EXPORT_SYMBOL(dev_base_lock);
 
-DEFINE_SEQLOCK(devnet_rename_seq);
+seqcount_t devnet_rename_seq;
 
 static inline void dev_base_seq_inc(struct net *net)
 {
@@ -1093,10 +1093,10 @@ int dev_change_name(struct net_device *dev, const char *newname)
 	if (dev->flags & IFF_UP)
 		return -EBUSY;
 
-	write_seqlock(&devnet_rename_seq);
+	write_seqcount_begin(&devnet_rename_seq);
 
 	if (strncmp(newname, dev->name, IFNAMSIZ) == 0) {
-		write_sequnlock(&devnet_rename_seq);
+		write_seqcount_end(&devnet_rename_seq);
 		return 0;
 	}
 
@@ -1104,7 +1104,7 @@ int dev_change_name(struct net_device *dev, const char *newname)
 
 	err = dev_get_valid_name(net, dev, newname);
 	if (err < 0) {
-		write_sequnlock(&devnet_rename_seq);
+		write_seqcount_end(&devnet_rename_seq);
 		return err;
 	}
 
@@ -1112,11 +1112,11 @@ rollback:
 	ret = device_rename(&dev->dev, dev->name);
 	if (ret) {
 		memcpy(dev->name, oldname, IFNAMSIZ);
-		write_sequnlock(&devnet_rename_seq);
+		write_seqcount_end(&devnet_rename_seq);
 		return ret;
 	}
 
-	write_sequnlock(&devnet_rename_seq);
+	write_seqcount_end(&devnet_rename_seq);
 
 	write_lock_bh(&dev_base_lock);
 	hlist_del_rcu(&dev->name_hlist);
@@ -1135,7 +1135,7 @@ rollback:
 		/* err >= 0 after dev_alloc_name() or stores the first errno */
 		if (err >= 0) {
 			err = ret;
-			write_seqlock(&devnet_rename_seq);
+			write_seqcount_begin(&devnet_rename_seq);
 			memcpy(dev->name, oldname, IFNAMSIZ);
 			goto rollback;
 		} else {
@@ -4180,7 +4180,7 @@ static int dev_ifname(struct net *net, struct ifreq __user *arg)
 		return -EFAULT;
 
 retry:
-	seq = read_seqbegin(&devnet_rename_seq);
+	seq = read_seqcount_begin(&devnet_rename_seq);
 	rcu_read_lock();
 	dev = dev_get_by_index_rcu(net, ifr.ifr_ifindex);
 	if (!dev) {
@@ -4190,7 +4190,7 @@ retry:
 
 	strcpy(ifr.ifr_name, dev->name);
 	rcu_read_unlock();
-	if (read_seqretry(&devnet_rename_seq, seq))
+	if (read_seqcount_retry(&devnet_rename_seq, seq))
 		goto retry;
 
 	if (copy_to_user(arg, &ifr, sizeof(struct ifreq)))
diff --git a/net/core/sock.c b/net/core/sock.c
index a692ef4..bc131d4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -583,7 +583,7 @@ static int sock_getbindtodevice(struct sock *sk, char __user *optval,
 		goto out;
 
 retry:
-	seq = read_seqbegin(&devnet_rename_seq);
+	seq = read_seqcount_begin(&devnet_rename_seq);
 	rcu_read_lock();
 	dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if);
 	ret = -ENODEV;
@@ -594,7 +594,7 @@ retry:
 
 	strcpy(devname, dev->name);
 	rcu_read_unlock();
-	if (read_seqretry(&devnet_rename_seq, seq))
+	if (read_seqcount_retry(&devnet_rename_seq, seq))
 		goto retry;
 
 	len = strlen(devname) + 1;
-- 
1.7.11.7
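
The reasoning in the commit message above (writers are already serialized by RTNL, so readers only need a sequence count to detect a concurrent rename) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct named_dev`, `dev_rename()`, `dev_get_name()` and the `seq_*` helpers are hypothetical names and not the kernel API, C11 atomics stand in for the kernel's memory barriers, and the plain copy of a buffer that another thread may be writing is formally a data race in C11, so treat it as a model rather than production code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define NAME_LEN 16

/* Hypothetical stand-in for a device whose name is guarded by a sequence count. */
struct named_dev {
	atomic_uint seq;	/* even: stable, odd: rename in progress */
	char name[NAME_LEN];
};

/*
 * Writer side. No lock is taken here: the caller is assumed to serialize
 * writers already (RTNL in the kernel case), so the writer may sleep
 * between begin and end without violating any locking rule.
 */
void seq_write_begin(struct named_dev *d)
{
	unsigned int old = atomic_fetch_add(&d->seq, 1);

	assert((old & 1u) == 0);	/* a second concurrent writer would trip this */
}

void seq_write_end(struct named_dev *d)
{
	atomic_fetch_add(&d->seq, 1);
}

/* Reader side: wait for an even count, then remember it. */
unsigned int seq_read_begin(struct named_dev *d)
{
	unsigned int s;

	while ((s = atomic_load(&d->seq)) & 1u)
		;	/* a writer is active, spin until it finishes */
	return s;
}

/* True if a writer ran since seq_read_begin() returned 'start'. */
bool seq_read_retry(struct named_dev *d, unsigned int start)
{
	return atomic_load(&d->seq) != start;
}

void dev_rename(struct named_dev *d, const char *newname)
{
	seq_write_begin(d);
	strncpy(d->name, newname, NAME_LEN - 1);
	d->name[NAME_LEN - 1] = '\0';
	seq_write_end(d);
}

/* Copy the name out, retrying if a rename overlapped with the copy. */
void dev_get_name(struct named_dev *d, char out[NAME_LEN])
{
	unsigned int s;

	do {
		s = seq_read_begin(d);
		memcpy(out, d->name, NAME_LEN);
	} while (seq_read_retry(d, s));
}
```

The difference from a seqlock is on the writer side only: a seqlock would additionally take a spinlock in the begin helper, which is what made sleeping inside device_rename() invalid.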

* [PATCH] brcmsmac: initialize morepending in brcms_b_recv()
From: Geert Uytterhoeven @ 2012-12-22 21:07 UTC (permalink / raw)
  To: Piotr Haber, Arend van Spriel, John W. Linville
  Cc: linux-wireless, brcm80211-dev-list, netdev, linux-kernel,
	Geert Uytterhoeven

drivers/net/wireless/brcm80211/brcmsmac/main.c: In function ‘brcms_b_recv’:
drivers/net/wireless/brcm80211/brcmsmac/main.c:7636: warning: ‘morepending’ may be used uninitialized in this function

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 drivers/net/wireless/brcm80211/brcmsmac/main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/brcm80211/brcmsmac/main.c b/drivers/net/wireless/brcm80211/brcmsmac/main.c
index 17594de..5f0f444 100644
--- a/drivers/net/wireless/brcm80211/brcmsmac/main.c
+++ b/drivers/net/wireless/brcm80211/brcmsmac/main.c
@@ -7633,7 +7633,7 @@ brcms_b_recv(struct brcms_hardware *wlc_hw, uint fifo, bool bound)
 
 	uint n = 0;
 	uint bound_limit = bound ? RXBND : -1;
-	bool morepending;
+	bool morepending = false;
 
 	skb_queue_head_init(&recv_frames);
 
-- 
1.7.0.4

* [PATCH 2/2] cpts: fix a run time warn_on.
From: Richard Cochran @ 2012-12-22 19:41 UTC (permalink / raw)
  To: netdev
  Cc: linux-arm-kernel, linux-omap, David Miller, Cyril Chemparathy,
	Mugunthan V N
In-Reply-To: <cover.1356204860.git.richardcochran@gmail.com>

This patch fixes a warning in clk_enable by calling clk_prepare first.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/ti/cpts.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 5ab8fb4..3e32a31 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -247,6 +247,7 @@ static void cpts_clk_init(struct cpts *cpts)
 		cpts->refclk = NULL;
 		return;
 	}
+	clk_prepare(cpts->refclk);
 	clk_enable(cpts->refclk);
 }
 
-- 
1.7.2.5
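
The ordering rule behind the one-line fix above (a clock has to be prepared before it is enabled) can be illustrated with a small userspace model. This is not the kernel's clock framework: `struct model_clk` and the `model_clk_*` functions are hypothetical and only mimic the prepare/enable counting that, as far as I understand it, makes clk_enable() warn when the prepare count is still zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a clock with separate prepare and enable counts. */
struct model_clk {
	int prepare_count;
	int enable_count;
};

/* Prepare step: in the real framework this part is allowed to sleep. */
int model_clk_prepare(struct model_clk *c)
{
	assert(c != NULL);
	c->prepare_count++;
	return 0;
}

/* Enable step: refuses to run (and warns) on an unprepared clock. */
int model_clk_enable(struct model_clk *c)
{
	assert(c != NULL);
	if (c->prepare_count == 0) {
		fprintf(stderr, "warning: enable called before prepare\n");
		return -1;
	}
	c->enable_count++;
	return 0;
}

/* Both steps in the right order, stopping at the first failure. */
int model_clk_prepare_enable(struct model_clk *c)
{
	int ret = model_clk_prepare(c);

	if (ret)
		return ret;
	return model_clk_enable(c);
}
```

In kernel code the combined helper clk_prepare_enable() from <linux/clk.h> does both steps and returns an error code, which may be worth checking in a follow-up.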

* [PATCH 1/2] cpts: fix build error by removing useless code.
From: Richard Cochran @ 2012-12-22 19:41 UTC (permalink / raw)
  To: netdev
  Cc: linux-arm-kernel, linux-omap, David Miller, Cyril Chemparathy,
	Mugunthan V N
In-Reply-To: <cover.1356204860.git.richardcochran@gmail.com>

The cpts driver tries to obtain the input clock frequency by calling the
clock's internal 'recalc' method. Since <plat/clock.h> has been removed,
this code can no longer compile.

However, the driver never makes use of the frequency value, so this patch
fixes the issue by removing the offending code altogether.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/net/ethernet/ti/cpts.c |    1 -
 drivers/net/ethernet/ti/cpts.h |    1 -
 2 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 5e62c1a..5ab8fb4 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -248,7 +248,6 @@ static void cpts_clk_init(struct cpts *cpts)
 		return;
 	}
 	clk_enable(cpts->refclk);
-	cpts->freq = cpts->refclk->recalc(cpts->refclk);
 }
 
 static void cpts_clk_release(struct cpts *cpts)
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index e1bba3a..fe993cd 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -120,7 +120,6 @@ struct cpts {
 	struct delayed_work overflow_work;
 	int phc_index;
 	struct clk *refclk;
-	unsigned long freq;
 	struct list_head events;
 	struct list_head pool;
 	struct cpts_event pool_data[CPTS_MAX_EVENTS];
-- 
1.7.2.5

* [PATCH 0/2] cpts fixes for v3.8-rc2
From: Richard Cochran @ 2012-12-22 19:41 UTC (permalink / raw)
  To: netdev
  Cc: linux-arm-kernel, linux-omap, David Miller, Cyril Chemparathy,
	Mugunthan V N

The new cpts driver has two small issues, but it otherwise seems to be
working in -rc1.

Thanks,
Richard

Richard Cochran (2):
  cpts: fix build error by removing useless code.
  cpts: fix a run time warn_on.

 drivers/net/ethernet/ti/cpts.c |    2 +-
 drivers/net/ethernet/ti/cpts.h |    1 -
 2 files changed, 1 insertions(+), 2 deletions(-)

-- 
1.7.2.5

* Re: Regression in 3.8-rc1:  "BUG: sleeping function called from invalid context"
From: Larry Finger @ 2012-12-22 19:04 UTC (permalink / raw)
  To: Borislav Petkov, Eric Dumazet, LKML, Christoph Lameter,
	Pekka Enberg, netdev
In-Reply-To: <20121222183044.GA10263@liondog.tnic>

On 12/22/2012 12:30 PM, Borislav Petkov wrote:
> On Sat, Dec 22, 2012 at 10:10:28AM -0800, Eric Dumazet wrote:
>> RTNL is a mutex, its perfectly valid to use GFP_KERNEL while holding a
>> mutex.
>
> Right, sorry. The check fires because we have preemption disabled.
>
>> As replied before your mail, fix for the problem is already in David
>> tree.
>
> Yep, saw that after hitting send.

Eric and Borislav,

The patch does fix my problem. I expect that it will be in mainline by -rc2.

Thanks,

Larry

* Re: IPv6 over Firewire
From: Stephan Gatzka @ 2012-12-22 18:33 UTC (permalink / raw)
  To: Stefan Richter; +Cc: YOSHIFUJI Hideaki, netdev, linux1394-devel
In-Reply-To: <20121222101521.08c783ac@stein>

>
> You could add another case to include/net/ndisc.h::ndisc_addr_option_pad()
> with a hardcoded size, couldn't you?
>
No, I think that is almost certainly not a good idea. The address space
option is handed over to the firewire_net driver like this:
type, length, source/target link address (GUID)

If I add another case in ndisc_addr_option_pad(), I think the option will
look like this:
pad, type, length, source/target link address (GUID)

Without the pad, type, length and GUID are already at the correct position
for the RFC 3146 link-layer option. With padding, I would have to copy them
to the correct position.

All I need is 8 bytes of additional tail room in the ndisc skb.
The first option is to specify needed_tailroom in the firewire netdevice
struct, at the expense that every allocated skb might get 8 more bytes.

The second option is yoshfuji's suggestion to extend ndisc_opt_addr_space()
a bit. His solution only allocates additional memory for ndisc packets, at
the expense of introducing a dependency on struct ndisc_opt_ieee1394_llinfo.

These are the two options we can go for. Personally, I think reserving a
bit more tail room looks cleaner, if nobody votes against it...

Regards,

Stephan
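
The 8 bytes of extra tail room mentioned above can be checked with a little arithmetic. The sketch below assumes that the generic option size follows the kernel's NDISC_OPT_SPACE() rounding (two header bytes plus the address, rounded up to a multiple of 8 octets), that the firewire link address handed to ndisc is the 8-byte GUID, and that the IEEE 1394 link-layer option is laid out as in my reading of RFC 3146. `struct llinfo_1394_sketch` and both functions are hypothetical names for illustration; the real struct ndisc_opt_ieee1394_llinfo is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror NDISC_OPT_SPACE(): type + length bytes, rounded up to 8. */
#define OPT_SPACE(len) (((len) + 2 + 7) & ~(size_t)7)

/* Hypothetical layout of the RFC 3146 link-layer address option. */
struct llinfo_1394_sketch {
	uint8_t type;
	uint8_t length;		/* in units of 8 octets: 3 */
	uint8_t guid[8];	/* node unique ID (EUI-64) */
	uint8_t max_rec;
	uint8_t sspd;
	uint8_t fifo[6];	/* unicast FIFO address */
	uint8_t reserved[6];
};

static_assert(sizeof(struct llinfo_1394_sketch) == 24,
	      "option must be 3 * 8 octets");

/* Size of a generic link-layer address option for a given device. */
size_t generic_opt_space(size_t addr_len, size_t pad)
{
	return OPT_SPACE(addr_len + pad);
}

/* Bytes missing when the generic size is used for the 1394 option. */
size_t ieee1394_tailroom_shortfall(size_t addr_len)
{
	return sizeof(struct llinfo_1394_sketch) - generic_opt_space(addr_len, 0);
}
```

With an 8-byte GUID the generic computation yields 16 bytes while the full option needs 24, which matches the 8 bytes quoted above. Both proposals close that gap: needed_tailroom for every skb, or a special case in ndisc_opt_addr_space() for ndisc packets only.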

* Re: Regression in 3.8-rc1:  "BUG: sleeping function called from invalid context"
From: Borislav Petkov @ 2012-12-22 18:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Larry Finger, LKML, Christoph Lameter, Pekka Enberg, netdev
In-Reply-To: <1356199828.21834.8500.camel@edumazet-glaptop>

On Sat, Dec 22, 2012 at 10:10:28AM -0800, Eric Dumazet wrote:
> RTNL is a mutex, its perfectly valid to use GFP_KERNEL while holding a
> mutex.

Right, sorry. The check fires because we have preemption disabled.

> As replied before your mail, fix for the problem is already in David
> tree.

Yep, saw that after hitting send.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: Regression in 3.8-rc1:  "BUG: sleeping function called from invalid context"
From: Eric Dumazet @ 2012-12-22 18:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Larry Finger, LKML, Christoph Lameter, Pekka Enberg, netdev
In-Reply-To: <20121222180247.GA10250@liondog.tnic>

On Sat, 2012-12-22 at 19:02 +0100, Borislav Petkov wrote:
> Top-posting so that the rest can remain untouched.
> 
> Right, so AFAICT, something is holding rtnl_mutex (probably some
> rtnetlink traffic) and device_rename() is doing kstrdup with
> GFP_KERNEL which, among others, has __GFP_WAIT and *that* triggers the
> might_sleep_if() check in slab_pre_alloc_hook():
> 
> static inline int slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
> {
>         flags &= gfp_allowed_mask;
>         lockdep_trace_alloc(flags);
>         might_sleep_if(flags & __GFP_WAIT);		<--- HERE
> 
> Adding Christoph and Pekka although the slub.c might_sleep stuff is from
> 2010. Still, they might have a better idea.
> 
> Oh well, let's add netdev while we're at it. :-)

RTNL is a mutex, its perfectly valid to use GFP_KERNEL while holding a
mutex.

As replied before your mail, fix for the problem is already in David
tree.

http://git.kernel.org/?p=linux/kernel/git/davem/net.git;a=commitdiff;h=30e6c9fa93cf3dbc7cc6df1d748ad25e4264545a


Bug was added in commit c91f6df2db4972d3cc983e6988b9abf1ad02f5f9 :

http://git.kernel.org/?p=linux/kernel/git/davem/net.git;a=commit;h=c91f6df2db4972d3cc983e6988b9abf1ad02f5f9


Thanks

* Re: Regression in 3.8-rc1:  "BUG: sleeping function called from invalid context"
From: Borislav Petkov @ 2012-12-22 18:02 UTC (permalink / raw)
  To: Larry Finger; +Cc: LKML, Christoph Lameter, Pekka Enberg, netdev
In-Reply-To: <50D5E9D9.3070904@lwfinger.net>

Top-posting so that the rest can remain untouched.

Right, so AFAICT, something is holding rtnl_mutex (probably some
rtnetlink traffic) and device_rename() is doing kstrdup with
GFP_KERNEL which, among others, has __GFP_WAIT and *that* triggers the
might_sleep_if() check in slab_pre_alloc_hook():

static inline int slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
{
        flags &= gfp_allowed_mask;
        lockdep_trace_alloc(flags);
        might_sleep_if(flags & __GFP_WAIT);		<--- HERE

Adding Christoph and Pekka although the slub.c might_sleep stuff is from
2010. Still, they might have a better idea.

Oh well, let's add netdev while we're at it. :-)

HTH.

On Sat, Dec 22, 2012 at 11:11:53AM -0600, Larry Finger wrote:
> With kernel 3.8-rc1, I get 2 "BUG: sleeping function called from
> invalid context" reports. These have been present got some time in
> the 3.7-git versions and I have tried twice to bisect the problem.
> Both times, I ended up at a merge commit. The most recent found
> commit a11da7d as the bad one, and commit d7460f4 as the last good
> one. I have not had time to make a third try.
> 
> My system is x86_64 running on an HP laptop with a dual-core AMD
> CPU. My configuration file is attached.
> 
> The logged details are as follows:
> 
> [   31.715016] BUG: sleeping function called from invalid context at mm/slub.c:925
> [   31.715022] in_atomic(): 1, irqs_disabled(): 0, pid: 2129, name: udevd
> [   31.715025] 2 locks held by udevd/2129:
> [   31.715028]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81386382>]
> rtnl_lock+0x12/0x20
> [   31.715041]  #1:  (devnet_rename_seq){+.+.+.}, at:
> [<ffffffff81376033>] dev_change_name+0x43/0x260
> [   31.715053] Pid: 2129, comm: udevd Not tainted 3.8.0-rc1 #56
> [   31.715056] Call Trace:
> [   31.715063]  [<ffffffff81076ca2>] __might_sleep+0x152/0x250
> [   31.715068]  [<ffffffff8114a353>] __kmalloc_track_caller+0x103/0x280
> [   31.715073]  [<ffffffff812d595d>] ? device_rename+0x4d/0xf0
> [   31.715078]  [<ffffffff8111d675>] kstrdup+0x35/0x70
> [   31.715082]  [<ffffffff812d595d>] device_rename+0x4d/0xf0
> [   31.715086]  [<ffffffff813760ca>] dev_change_name+0xda/0x260
> [   31.715091]  [<ffffffff81377f51>] dev_ifsioc+0x241/0x3a0
> [   31.715095]  [<ffffffff81378410>] dev_ioctl+0x360/0x830
> [   31.715101]  [<ffffffff810a54cd>] ? trace_hardirqs_on+0xd/0x10
> [   31.715106]  [<ffffffff8135b711>] sock_do_ioctl.constprop.41+0x41/0x50
> [   31.715109]  [<ffffffff8135b9c6>] sock_ioctl+0x66/0x2b0
> [   31.715115]  [<ffffffff811637b7>] do_vfs_ioctl+0x97/0x580
> [   31.715119]  [<ffffffff8116f27a>] ? fget_light+0x3da/0x4d0
> [   31.715124]  [<ffffffff81422a55>] ? sysret_check+0x22/0x5d
> [   31.715128]  [<ffffffff81163ceb>] sys_ioctl+0x4b/0x90
> [   31.715133]  [<ffffffff8121a69e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [   31.715136]  [<ffffffff81422a29>] system_call_fastpath+0x16/0x1b
> [   31.715764] BUG: scheduling while atomic: udevd/2129/0x00000002
> [   31.715768] 2 locks held by udevd/2129:
> [   31.715769]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81386382>]
> rtnl_lock+0x12/0x20
> [   31.715779]  #1:  (devnet_rename_seq){+.+.+.}, at:
> [<ffffffff81376033>] dev_change_name+0x43/0x260
> [   31.715787] Modules linked in: b43 arc4 rtl8723ae rtlwifi
> mac80211 snd_hda_codec_conexant snd_hda_intel snd_hda_codec cfg80211
> powernow_k8 kvm_amd snd_pcm_oss kvm snd_pcm snd_seq bcma rng_core
> ssb snd_timer mmc_core snd_seq_device pcmcia rfkill snd sr_mod cdrom
> soundcore ehci_pci battery pcmcia_core sg k8temp ac i2c_nforce2
> hwmon forcedeth video snd_page_alloc serio_raw i2c_core joydev wmi
> button ipv6 autofs4 ext4 mbcache jbd2 crc16 ohci_hcd ehci_hcd
> usbcore usb_common thermal processor scsi_dh_rdac scsi_dh_alua
> scsi_dh_emc scsi_dh_hp_sw scsi_dh ata_generic pata_amd
> [   31.715867] Pid: 2129, comm: udevd Not tainted 3.8.0-rc1 #56
> [   31.715868] Call Trace:
> [   31.715874]  [<ffffffff8141987c>] __schedule_bug+0x62/0x70
> [   31.715878]  [<ffffffff81420160>] __schedule+0x730/0xa30
> [   31.715883]  [<ffffffff810a3982>] ? __lock_acquire+0x12d2/0x1c50
> [   31.715888]  [<ffffffff81420744>] schedule+0x24/0x70
> [   31.715893]  [<ffffffff8141dc5c>] schedule_timeout+0x18c/0x2f0
> [   31.715897]  [<ffffffff810a52ac>] ? mark_held_locks+0x8c/0x110
> [   31.715902]  [<ffffffff81421b9b>] ? _raw_spin_unlock_irq+0x2b/0x50
> [   31.715906]  [<ffffffff810a5435>] ? trace_hardirqs_on_caller+0x105/0x190
> [   31.715911]  [<ffffffff810a54cd>] ? trace_hardirqs_on+0xd/0x10
> [   31.715915]  [<ffffffff814205d5>] wait_for_common+0xe5/0x180
> [   31.715919]  [<ffffffff8107d010>] ? try_to_wake_up+0x2d0/0x2d0
> [   31.715924]  [<ffffffff81420718>] wait_for_completion+0x18/0x20
> [   31.715929]  [<ffffffff8105f48c>] call_usermodehelper_exec+0x19c/0x1d0
> [   31.715935]  [<ffffffff8105f592>] call_usermodehelper_fns+0xd2/0x100
> [   31.715941]  [<ffffffff8121211d>] kobject_uevent_env+0x47d/0x4b0
> [   31.715946]  [<ffffffff812110df>] kobject_rename+0x12f/0x140
> [   31.715951]  [<ffffffff812d59db>] device_rename+0xcb/0xf0
> [   31.715955]  [<ffffffff813760ca>] dev_change_name+0xda/0x260
> [   31.715960]  [<ffffffff81377f51>] dev_ifsioc+0x241/0x3a0
> [   31.715965]  [<ffffffff81378410>] dev_ioctl+0x360/0x830
> [   31.715969]  [<ffffffff810a54cd>] ? trace_hardirqs_on+0xd/0x10
> [   31.715974]  [<ffffffff8135b711>] sock_do_ioctl.constprop.41+0x41/0x50
> [   31.715978]  [<ffffffff8135b9c6>] sock_ioctl+0x66/0x2b0
> [   31.715982]  [<ffffffff811637b7>] do_vfs_ioctl+0x97/0x580
> [   31.715987]  [<ffffffff8116f27a>] ? fget_light+0x3da/0x4d0
> [   31.715991]  [<ffffffff81422a55>] ? sysret_check+0x22/0x5d
> [   31.716092]  [<ffffffff81163ceb>] sys_ioctl+0x4b/0x90
> [   31.716097]  [<ffffffff8121a69e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [   31.716102]  [<ffffffff81422a29>] system_call_fastpath+0x16/0x1b



-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jamal Hadi Salim @ 2012-12-22 14:09 UTC (permalink / raw)
  To: Yury Stankevich
  Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <50D5BC96.9010602@gmail.com>

On 12-12-22 08:58 AM, Yury Stankevich wrote:

> i'm trying if this can work, alas. it can't.

Now i want it to work ;-> So dont give up yet.


cheers,
jamal


* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Florian Westphal @ 2012-12-22 14:04 UTC (permalink / raw)
  To: Yury Stankevich
  Cc: Jamal Hadi Salim, Hasan Chowdhury, Stephen Hemminger,
	Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <50D5BC96.9010602@gmail.com>

Yury Stankevich <urykhy@gmail.com> wrote:
> 22.12.2012 17:19, Jamal Hadi Salim wrote:
> > From the description Jan provided in his response, I dont
> > think this used to work at all. Are you saying it worked before?
> 
> no.
> i'm trying if this can work, alas. it can't.

Yury, what are you trying to accomplish?
Is there a particular reason why you want to use ingress shaping
instead of pure policing?

I ask, because you could try to use hashlimit match to do
rate policing via netfilter.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Yury Stankevich @ 2012-12-22 13:58 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <50D5B366.30005@mojatatu.com>

22.12.2012 17:19, Jamal Hadi Salim wrote:
> From the description Jan provided in his response, I dont
> think this used to work at all. Are you saying it worked before?

No.
I was trying to see whether this can work; alas, it can't.

> Having said that, what you are doing sounds so useful
> that we need to make it work ;-> But it appears like
> we need a brand new action for it, something like
> GetMarkFromConntrack. 

Maybe the ifb device can be made more friendly to iptables?
For example, it could run some (or all?) netfilter hooks before the
qdisc, like on a normal interface.


-- 
Linux registered user #402966 // pub 1024D/E99AF373 <pgp.mit.edu>

* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jamal Hadi Salim @ 2012-12-22 13:56 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Yury Stankevich, Hasan Chowdhury, Stephen Hemminger,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <alpine.LNX.2.01.1212221430280.9993@nerf07.vanv.qr>

On 12-12-22 08:43 AM, Jan Engelhardt wrote:

>
> Looking up CT before ingress would mean the entire "raw"
> table needs to be moved before ingress. But with classic
> ip_tables, calling a table requires a lot of setup
> (basically ip_rcv).

Scanning the code:
Would it not work if i only passed it IP packets (the tc
classifier can check) and then for v4 i do something like
ipv4_conntrack_in() with pre-routing as the hook to update
the skb?

> All new skbs (i.e. those that did not loop due to IPsec, for example)
> received through __netif_receive_skb should start out with
> skb->mark=0, which is why CONNMARK --restore-mark is needed
> to copy skb->mark=ct->mark.

I may be overthinking this: are you saying connmark should do the
copying to skb->mark instead of some action? Earlier you said
connmark depends on the presence of skb->nfct.

cheers,
jamal

* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jan Engelhardt @ 2012-12-22 13:43 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Yury Stankevich, Hasan Chowdhury, Stephen Hemminger,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <50D5B366.30005@mojatatu.com>

On Saturday 2012-12-22 14:19, Jamal Hadi Salim wrote:
>
> Having said that, what you are doing sounds so useful
> that we need to make it work ;-> But it appears like
> we need a brand new action for it, something like
> GetMarkFromConntrack. Jan, I am assuming (on ingress only)
> we need to call "something" to give us the nfct then
> grab the skb->mark from nfct.

Looking up CT before ingress would mean the entire "raw"
table needs to be moved before ingress. But with classic
ip_tables, calling a table requires a lot of setup
(basically ip_rcv).

> On egress,
> I am assuming the skb->mark is already set if connmark
> is to be used... Am i correct?

All new skbs (i.e. those that did not loop due to IPsec, for example)
received through __netif_receive_skb should start out with
skb->mark=0, which is why CONNMARK --restore-mark is needed
to copy skb->mark=ct->mark.
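
For readers following along, here is a rough userspace sketch of what the
restore-mark step amounts to. The struct names and the restore_mark()
helper are made up for illustration (they stand in for the skb and the
conntrack entry); the mask handling is modelled loosely on how CONNMARK's
restore mode combines the two marks and should be read as an assumption,
not as the kernel's code.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's sk_buff and nf_conn;
 * only the mark fields matter for this illustration. */
struct pkt  { uint32_t mark; };
struct conn { uint32_t mark; };

/*
 * Copy the connection mark into the packet mark, limited by two masks:
 * nfmask selects the packet-mark bits that get cleared first, ctmask
 * selects the connection-mark bits that are copied in.
 * With both masks set to 0xffffffff this reduces to pkt->mark = ct->mark.
 */
static void restore_mark(struct pkt *p, const struct conn *ct,
			 uint32_t ctmask, uint32_t nfmask)
{
	p->mark = (p->mark & ~nfmask) ^ (ct->mark & ctmask);
}
```

A freshly received packet starts with mark 0, so with full masks the
packet simply ends up carrying whatever mark was saved on its connection.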


* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jamal Hadi Salim @ 2012-12-22 13:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jan Engelhardt, Yury Stankevich, Hasan Chowdhury,
	Stephen Hemminger, netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <1356104720.21834.7497.camel@edumazet-glaptop>

On 12-12-21 10:45 AM, Eric Dumazet wrote:
> On Fri, 2012-12-21 at 15:35 +0100, Jan Engelhardt wrote:
>

> This reminds me this might be the reason we have
> skb_reset_transport_header(skb);
> in __netif_receive_skb(), while it's not very logical.
>

You seem to have nailed the egress part finally. That has
been a constant battle. At one point the standard answer
was "turn off TSO" ;->

> (Yes, sorry for being off topic, but I am referring to
> http://www.spinics.net/lists/netdev/msg214662.html )


I think the skb_reset_transport_header() was added when Acme made
a major overhaul to replace direct pointer access.
For this reason I think your second option seems preferable.

cheers,
jamal



* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jamal Hadi Salim @ 2012-12-22 13:19 UTC (permalink / raw)
  To: Yury Stankevich
  Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <50D46EC1.2040608@gmail.com>

On 12-12-21 09:14 AM, Yury Stankevich wrote:
>
> well.
> let me describe whole picture i want to achieve
>

I think I got what you are trying to do, Yury. Clever.
From the description Jan provided in his response, I don't
think this used to work at all. Are you saying it worked before?

Having said that, what you are doing sounds so useful
that we need to make it work ;-> But it appears like
we need a brand new action for it, something like
GetMarkFromConntrack. Jan, I am assuming (on ingress only)
we need to call "something" to give us the nfct then
grab the skb->mark from nfct. On egress,
I am assuming the skb->mark is already set if connmark
is to be used... Am I correct?
If yes, then this action will only be useful at ingress.

cheers,
jamal


* Re: IPv6 over Firewire
From: Stefan Richter @ 2012-12-22  9:15 UTC (permalink / raw)
  To: stephan.gatzka; +Cc: YOSHIFUJI Hideaki, netdev, linux1394-devel
In-Reply-To: <50D54ED9.6090908@gmail.com>

On Dec 22 Stephan Gatzka wrote:
> 
> > Something like this:
> >
> >   static inline int ndisc_opt_addr_space(struct net_device *dev)
> >   {
> > -       return NDISC_OPT_SPACE(dev->addr_len + ndisc_addr_option_pad(dev->type));
> > +       switch (dev->type) {
> > +       case ARPHRD_IEEE1394:
> > +               return sizeof(struct ndisc_opt_ieee1394_llinfo);
> > +       default:
> > +               return NDISC_OPT_SPACE(dev->addr_len + ndisc_addr_option_pad(dev->type));
> > +       }
> >   }
> >
> > --yoshfuji
> >
> 
> O.k., this has the advantage that only ndisc packets get some more 
> memory, but the question is if we are under such a hard memory pressure 
> that we don't allow that.
> 
> Your solution has the disadvantage that now I have to publish struct 
> ndisc_opt_ieee1394_llinfo to the ndisc stuff. Nobody in ndisc.c really 
> wants to deal with that structure, only the size is of interest. So 
> keeping this struct private is less invasive to the rest of linux. Just 
> my two cents.

You could add another case to include/net/ndisc.h::ndisc_addr_option_pad()
with a hardcoded size, couldn't you?
-- 
Stefan Richter
-=====-===-- ==-- =-==-
http://arcgraph.de/sr/
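
A minimal sketch of the "extra case in ndisc_addr_option_pad()" idea,
written as standalone C so it can be compiled outside the kernel. The
IEEE1394_PAD_PLACEHOLDER value is purely illustrative and is not taken
from the patch or from RFC 3146; the real number would have to come from
the option layout. The helper here takes the device type and address
length directly instead of a struct net_device, which is also a
simplification.

```c
#include <assert.h>

/* ARP hardware type values as defined in <linux/if_arp.h>. */
#define ARPHRD_ETHER       1
#define ARPHRD_IEEE1394   24
#define ARPHRD_INFINIBAND 32

/* Option size: payload plus the 2-byte type/length header,
 * rounded up to a multiple of 8 (mirrors NDISC_OPT_SPACE). */
#define NDISC_OPT_SPACE(len) (((len) + 2 + 7) & ~7)

/* ASSUMPTION: placeholder only, not the value a real patch would use. */
#define IEEE1394_PAD_PLACEHOLDER 8

/* Per-link-type padding added to the link-layer address option. */
static int ndisc_addr_option_pad(unsigned short type)
{
	switch (type) {
	case ARPHRD_INFINIBAND:
		return 2;
	case ARPHRD_IEEE1394:
		/* hypothetical new case, hardcoded as suggested above */
		return IEEE1394_PAD_PLACEHOLDER;
	default:
		return 0;
	}
}

/* Simplified: takes type and addr_len instead of struct net_device *. */
static int ndisc_opt_addr_space(unsigned short type, int addr_len)
{
	assert(addr_len >= 0);
	return NDISC_OPT_SPACE(addr_len + ndisc_addr_option_pad(type));
}
```

Done this way, struct ndisc_opt_ieee1394_llinfo would not need to be
exposed to the ndisc code; only a size-related constant would be.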


* Re: [PATCH] Drivers: network: more __dev* removal
From: David Miller @ 2012-12-22  8:03 UTC (permalink / raw)
  To: gregkh; +Cc: netdev, wfp5p
In-Reply-To: <20121221234215.GA13447@kroah.com>

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Fri, 21 Dec 2012 15:42:15 -0800

> Remove some __dev* markings that snuck in the 3.8-rc1 merge window in
> the drivers/net/* directory.
> 
> Cc: Bill Pemberton <wfp5p@virginia.edu>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Applied.


* Re: [PATCH] CONFIG_HOTPLUG removal from networking core
From: David Miller @ 2012-12-22  8:03 UTC (permalink / raw)
  To: gregkh; +Cc: netdev, wfp5p
In-Reply-To: <20121221234429.GB13447@kroah.com>

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Fri, 21 Dec 2012 15:44:29 -0800

> CONFIG_HOTPLUG is always enabled now, so remove the unused code that was
> trying to be compiled out when this option was disabled, in the
> networking core.
> 
> Cc: Bill Pemberton <wfp5p@virginia.edu>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Applied.

