Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] fib_trie: Send RTM_DELROUTE when link goes down
From: Joe Perches @ 2013-10-17 17:23 UTC (permalink / raw)
  To: Kristian Evensen; +Cc: netdev
In-Reply-To: <1382028894-14236-1-git-send-email-kristian.evensen@gmail.com>

On Thu, 2013-10-17 at 18:54 +0200, Kristian Evensen wrote:
> From: Kristian Evensen <kristian.evensen@gmail.com>
> 
> When a link is set as down using for example "ip link set dev X down", routes
> are deleted, but RTM_DELROUTE messages are not sent to RTNLGRP_IPV4_ROUTE. This
> patch makes trie_flush_list() send a RTM_DELROUTE for each route that is
> removed, and makes rtnelink route deletion behavior consistent across commands.
> The parameter lists for trie_flush_list() and trie_flush_leaf() are expanded to
> include required information otherwise unavailable in trie_flush_list().
[]
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
[]
> @@ -1698,15 +1698,23 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
>  	return 0;
>  }
>  
> -static int trie_flush_list(struct list_head *head)
> +static int trie_flush_list(struct list_head *head, u32 tb_id, t_key key,
> +			    int plen)
>  {
>  	struct fib_alias *fa, *fa_node;
>  	int found = 0;
> +	struct nl_info cfg;
> +
> +	memset(&cfg, 0, sizeof(cfg));

Perhaps this memset should be moved into the
list_for_each_entry_safe loop where cfg is used.
 
>  	list_for_each_entry_safe(fa, fa_node, head, fa_list) {
>  		struct fib_info *fi = fa->fa_info;
>  
>  		if (fi && (fi->fib_flags & RTNH_F_DEAD)) {

			memset(&cfg, 0, sizeof(cfg));

?

> +			cfg.nl_net = fi->fib_net;
> +			rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb_id,
> +				  &cfg, 0);

^ permalink raw reply

* [PATCH net-next] fib_trie: Send RTM_DELROUTE when link goes down
From: Kristian Evensen @ 2013-10-17 16:54 UTC (permalink / raw)
  To: netdev; +Cc: Kristian Evensen
In-Reply-To: <CAKfDRXizw1TaK-oHrAQv8Hi80gKiUHjQ6d21o3MOQpuVkcXSow@mail.gmail.com>

From: Kristian Evensen <kristian.evensen@gmail.com>

When a link is set as down using for example "ip link set dev X down", routes
are deleted, but RTM_DELROUTE messages are not sent to RTNLGRP_IPV4_ROUTE. This
patch makes trie_flush_list() send a RTM_DELROUTE for each route that is
removed, and makes rtnelink route deletion behavior consistent across commands.
The parameter lists for trie_flush_list() and trie_flush_leaf() are expanded to
include required information otherwise unavailable in trie_flush_list().

One use case that is simplified by this patch is IPv4 WAN connection monitoring.
Instead of listening for and handling both RTM_*ROUTE and RTM_*LINK-messages, it
is sufficient to handle RTM_*ROUTE.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>

---
 net/ipv4/fib_trie.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index ec9a9ef..629747a 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1698,15 +1698,23 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
 	return 0;
 }
 
-static int trie_flush_list(struct list_head *head)
+static int trie_flush_list(struct list_head *head, u32 tb_id, t_key key,
+			    int plen)
 {
 	struct fib_alias *fa, *fa_node;
 	int found = 0;
+	struct nl_info cfg;
+
+	memset(&cfg, 0, sizeof(cfg));
 
 	list_for_each_entry_safe(fa, fa_node, head, fa_list) {
 		struct fib_info *fi = fa->fa_info;
 
 		if (fi && (fi->fib_flags & RTNH_F_DEAD)) {
+			cfg.nl_net = fi->fib_net;
+			rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb_id,
+				  &cfg, 0);
+
 			list_del_rcu(&fa->fa_list);
 			fib_release_info(fa->fa_info);
 			alias_free_mem_rcu(fa);
@@ -1716,7 +1724,7 @@ static int trie_flush_list(struct list_head *head)
 	return found;
 }
 
-static int trie_flush_leaf(struct leaf *l)
+static int trie_flush_leaf(struct leaf *l, u32 tb_id)
 {
 	int found = 0;
 	struct hlist_head *lih = &l->list;
@@ -1724,7 +1732,7 @@ static int trie_flush_leaf(struct leaf *l)
 	struct leaf_info *li = NULL;
 
 	hlist_for_each_entry_safe(li, tmp, lih, hlist) {
-		found += trie_flush_list(&li->falh);
+		found += trie_flush_list(&li->falh, tb_id, l->key, li->plen);
 
 		if (list_empty(&li->falh)) {
 			hlist_del_rcu(&li->hlist);
@@ -1813,7 +1821,7 @@ int fib_table_flush(struct fib_table *tb)
 	int found = 0;
 
 	for (l = trie_firstleaf(t); l; l = trie_nextleaf(l)) {
-		found += trie_flush_leaf(l);
+		found += trie_flush_leaf(l, tb->tb_id);
 
 		if (ll && hlist_empty(&ll->list))
 			trie_leaf_remove(t, ll);
-- 
1.8.1.2

^ permalink raw reply related

* Re: [PATCH net-next] fib_trie: Send RTM_DELROUTE when link goes down
From: Kristian Evensen @ 2013-10-17 16:40 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Network Development
In-Reply-To: <525FE979.3010109@cogentembedded.com>

Hi,

>    The continuation line should start right under *struct* on the first
> line, according to the networking coding style.

Thank you very much for letting me know. Will resubmit the patch.

-Kristian

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: Wei Liu @ 2013-10-17 16:41 UTC (permalink / raw)
  To: annie li
  Cc: Jan Beulich, jianhai luan, david.vrabel, ian.campbell, wei.liu2,
	xen-devel, netdev
In-Reply-To: <52601274.5010008@oracle.com>

On Fri, Oct 18, 2013 at 12:38:12AM +0800, annie li wrote:
> 
> On 2013-10-17 17:26, Jan Beulich wrote:
> >>Yes, the issue only can be  reproduced in 32-bit Dom0 (Beyond
> >>MAX_ULONG/2 in 64-bit will need long long time)
> >>
> >>I think the gap should be think all environment even now extending 480+.
> >>if now fall in the gap,  one timer will be pending and replenish will be
> >>in time.  Please run the attachment test program.
> >Not sure what this is supposed to tell me. I recognize that there
> >are overflow conditions not handled properly, but (a) I have a
> >hard time thinking of a sensible guest that sits idle for over 240
> >days (host uptime usually isn't even coming close to that due to
> >maintenance requirements) and (b) if there is such a sensible
> >guest, then I can't see why dealing with one being idle for over
> >480 days should be required too.
> >
> 
> If the guest contains multiple NICs, that situation probably happens
> when one NIC keeps idle and others work under load. BTW, how do you
> get the 240?
> 

I think Jan got this number with HZ=100. It take ~240 days for jiffies
to overflow in 32 bit machine when HZ=100.

Wei.

> Thanks
> Annie

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: annie li @ 2013-10-17 16:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: jianhai luan, david.vrabel, ian.campbell, wei.liu2, xen-devel,
	netdev
In-Reply-To: <525FC98002000078000FBBB5@nat28.tlf.novell.com>


On 2013-10-17 17:26, Jan Beulich wrote:
>> Yes, the issue only can be  reproduced in 32-bit Dom0 (Beyond
>> MAX_ULONG/2 in 64-bit will need long long time)
>>
>> I think the gap should be think all environment even now extending 480+.
>> if now fall in the gap,  one timer will be pending and replenish will be
>> in time.  Please run the attachment test program.
> Not sure what this is supposed to tell me. I recognize that there
> are overflow conditions not handled properly, but (a) I have a
> hard time thinking of a sensible guest that sits idle for over 240
> days (host uptime usually isn't even coming close to that due to
> maintenance requirements) and (b) if there is such a sensible
> guest, then I can't see why dealing with one being idle for over
> 480 days should be required too.
>

If the guest contains multiple NICs, that situation probably happens 
when one NIC keeps idle and others work under load. BTW, how do you get 
the 240?

Thanks
Annie

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: annie li @ 2013-10-17 16:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Jason Luan, david.vrabel, ian.campbell, wei.liu2, xen-devel,
	netdev
In-Reply-To: <525FBB4F02000078000FBB30@nat28.tlf.novell.com>


On 2013-10-17 16:26, Jan Beulich wrote:
>>>> On 16.10.13 at 19:22, Jason Luan<jianhai.luan@oracle.com>  wrote:
>> time_after_eq() only works if the delta is < MAX_ULONG/2.
>>
>> If netfront sends at a very low rate, the time between subsequent calls
>> to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for
>> timer_after_eq() will be incorrect.  Credit will not be replenished and
>> the guest may become unable to send (e.g., if prior to the long gap, all
>> credit was exhausted).
>>
>> We should add the scenario which now beyond next_credit+MAX_UNLONG/2. Because
>> the fact now must be not before than expire, time_before(now, expire) == true
>> will verify the scenario.
>>      time_after_eq(now, next_credit) || time_before (now, expire)
>>      ==
>>      !time_in_range_open(now, expire, next_credit)
> So first of all this must be with a 32-bit netback. And the not
> coverable gap between activity is well over 240 days long. _If_
> this really needs dealing with, then why is extending this from
> 240+ to 480+ days sufficient?

I am not so sure your mean about extending from 240+ to 480+. Do you 
mean "now" wrapped case happens and falls into the range of from expires 
to next_credit? If this happens, the timer would be set with value based 
on next_credit, which is actually implements the rate control.

Thanks
Annie
> I.e. why don't you simply
> change to 64-bit jiffy values, and use time_after_eq64()?
>
> Jan
>
>> Signed-off-by: Jason Luan<jianhai.luan@oracle.com>
>> ---
>>   drivers/net/xen-netback/netback.c |    7 +++++--
>>   1 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/netback.c
>> b/drivers/net/xen-netback/netback.c
>> index f3e591c..31eedaf 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -1194,8 +1194,11 @@ static bool tx_credit_exceeded(struct xenvif *vif,
>> unsigned size)
>>   	if (timer_pending(&vif->credit_timeout))
>>   		return true;
>>   
>> -	/* Passed the point where we can replenish credit? */
>> -	if (time_after_eq(now, next_credit)) {
>> +	/* Credit should be replenished when now does not fall into the
>> +	 * range from expires to next_credit, and time_in_range_open()
>> +	 * is used to verify whether this case happens.
>> +	 */
>> +	if (!time_in_range_open(now, vif->credit_timeout.expires, next_credit)) {
>>   		vif->credit_timeout.expires = now;
>>   		tx_add_credit(vif);
>>   	}
>> -- 
>> 1.7.6.5
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org  
>> http://lists.xen.org/xen-devel  
>

^ permalink raw reply

* Re: [PATCH] veth: Showing peer of veth type dev in ip link (kernel side)
From: Nicolas Dichtel @ 2013-10-17 16:05 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Stephen Hemminger, David Miller, yamato, netdev
In-Reply-To: <87txghm1qw.fsf@xmission.com>

Le 16/10/2013 21:53, Eric W. Biederman a écrit :
> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>
>> Le 15/10/2013 22:34, Eric W. Biederman a écrit :
>
>>> For IFLA_NET_NS_FD not that I know of.
>>>
>>> Mostly it is doable but there are some silly cases.
>>> - Do we need to actually implement SCM_RIGHTS to prevent people
>>>     accepting file-descriptors unknowingly and hitting their file
>>>     descriptor limits.
>>>
>>>     In which case we need to call the attribute IFLA_NET_NS_SCM_FD
>>>     so we knew it was just an index into the passed file descriptors.n
>>>
>>> - Do we need an extra permission check to prevent keeping a network
>>>     namespace alive longer than necessary?  Aka there are some permission
>>>     checks opening and bind mounting /proc/<pid>/ns/net do we need
>>>     a similar check.  Perhaps we would need to require CAP_NET_ADMIN over
>>>     the target network namespace.
>>>
>>> Beyond that it is just the logistics to open what is essentially
>>> /proc/<pid>/ns/net and add it to the file descriptor table of the
>>> requesting process.  Exactly which mount of proc we are going to
>>> find the appropriate file to open I don't know.
>>>
>>> It isn't likely to be lots of code but it is code that the necessary
>>> infrastructure is not in place for, and a bunch of moderately hairy
>>> corner cases to deal with.
>> Got it. This doesn't seems the simpliest/best way to resolve this pb.
>> Can we not introduce another identifier (something like IFLA_NET_NS_ID),
>> which will not have such constraint?
>> inode is unique on the system, why not using it as an opaque value to
>> identitfy the netns (like 'ip netns identify' do)?
>
> The age old question why can't we have global identifiers for
> namespaces?
>
> The answer is that I don't want to implement a namespace for namespaces.
Sorry, but I don't understand the problem. This ID is owned by the kernel, like
the netns list (for_each_net()) is owned by it.

>
> While the proc inode does work today across different mounts of proc, I
> reserve the right at some future date (if it solves a technical problem)
> to give each namespace a different inode number in each different mount
> of proc.  So the inode number is not quite the unique identifier you
> want.  The inode number is a close as I am willing to get to a namespace
> of namespaces.
>
> I think the simplest solution is to just not worry about which namespace
> the other half of a veth pair is in.  But I have not encountered the
> problem where I need to know exactly which namespace we are worrying
> about.
Ok, let's start by explaining our usecase.

We are using namespaces only to implement virtual routers (VR), ie only
the networking stack is virtualized. We don't care about other namespaces, we
just want to run several network stacks and beeing able to manage them.

For example, providers use this feature to isolate clients, one VR is opened
for each client. You can have a large number of clients (+10 000) and thus the
same number of netns.
Considering these numbers, we don't want to run one instance per VR for all of
our network daemons, but have only one instance that manage all VR.

You also have daemons that monitor the system and synchronize network objects
(interfaces, routes, etc.) on another linux. Goal is to implement an high
availablity system: it's possible to switch to the other linux to avoid service
interruption.
This kind of daemon wants to have the full information about interfaces to be
able to build/configure them on the other linux.

>
> Global identifiers are easy until you hit the cases where they make
> things impossible.
I don't want specially to use ID, but I fear that the solution with file
descriptors will be a nightmare.


Nicolas

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: jianhai luan @ 2013-10-17 15:41 UTC (permalink / raw)
  To: David Vrabel
  Cc: Wei Liu, Jan Beulich, ian.campbell, xen-devel, annie.li, netdev
In-Reply-To: <5260015C.2030100@citrix.com>


On 2013-10-17 23:25, David Vrabel wrote:
> On 17/10/13 16:23, jianhai luan wrote:
>> On 2013-10-17 22:06, Wei Liu wrote:
>>> On Thu, Oct 17, 2013 at 09:59:30PM +0800, jianhai luan wrote:
>>> [...]
>>>>>>>> If use time_after_eq64(), expire ,next_credit and other member
>>>>>>>> will must
>>>>>>>> be u64.
>>>>>>> Yes, you'll need to store next_credit as a u64 in vif instead of
>>>>>>> calculating it in tx_credit_exceeded from expires (which is only an
>>>>>>> unsigned long).
>>>>>> I know that.  Even we use u64, time_after_eq()  will also do wrong
>>>>>> judge
>>>>>> in theory (not in reality because need long long time).
>>>>> If jiffies_64 has millisecond resolution that would be more than
>>>>> 500,000,000 years.
>>>> Yes, I agree the fact.
>>>>>> I think the two better fixed way is below:
>>>>>>     - By time_before() to judge if now beyond MAX_ULONG/2
>>>>> This is broken, so no.
>>>> Where is broken?  would you like to help me point it out.
>>> I think David means you didn't actually fix the problem. Your solution is
>>> merely a workaround.
>> I have think  about using u64, but more code need to be modified and
>> that is not all.  Key point is how to change the element of struct
>> time_list (expires)  and don't affect other thing?
> I already suggested a way that didn't require changing the timer
> structure -- calculate and store next_credit in advanced.
I think that  modify next_credit only  will not fix the issue. please 
think about:
   - If jiffies have beyond 32 bit. i assume expire  is 0, jiffies_64 
is  0x1000000ff.
     next_credit = 0 +  <always 32-bit value >

     time_after_eq64(jiffies_64, next_credit ) will always true. 
replenish will always do, rate control will lost their function.

Jason
>
> David

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: David Vrabel @ 2013-10-17 15:25 UTC (permalink / raw)
  To: jianhai luan
  Cc: Wei Liu, Jan Beulich, ian.campbell, xen-devel, annie.li, netdev
In-Reply-To: <526000F5.1090102@oracle.com>

On 17/10/13 16:23, jianhai luan wrote:
> 
> On 2013-10-17 22:06, Wei Liu wrote:
>> On Thu, Oct 17, 2013 at 09:59:30PM +0800, jianhai luan wrote:
>> [...]
>>>>>>> If use time_after_eq64(), expire ,next_credit and other member
>>>>>>> will must
>>>>>>> be u64.
>>>>>> Yes, you'll need to store next_credit as a u64 in vif instead of
>>>>>> calculating it in tx_credit_exceeded from expires (which is only an
>>>>>> unsigned long).
>>>>> I know that.  Even we use u64, time_after_eq()  will also do wrong
>>>>> judge
>>>>> in theory (not in reality because need long long time).
>>>> If jiffies_64 has millisecond resolution that would be more than
>>>> 500,000,000 years.
>>> Yes, I agree the fact.
>>>>> I think the two better fixed way is below:
>>>>>    - By time_before() to judge if now beyond MAX_ULONG/2
>>>> This is broken, so no.
>>> Where is broken?  would you like to help me point it out.
>> I think David means you didn't actually fix the problem. Your solution is
>> merely a workaround.
> 
> I have think  about using u64, but more code need to be modified and
> that is not all.  Key point is how to change the element of struct
> time_list (expires)  and don't affect other thing?

I already suggested a way that didn't require changing the timer
structure -- calculate and store next_credit in advanced.

David

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: jianhai luan @ 2013-10-17 15:23 UTC (permalink / raw)
  To: Wei Liu
  Cc: David Vrabel, Jan Beulich, ian.campbell, xen-devel, annie.li,
	netdev
In-Reply-To: <20131017140611.GM16371@zion.uk.xensource.com>


On 2013-10-17 22:06, Wei Liu wrote:
> On Thu, Oct 17, 2013 at 09:59:30PM +0800, jianhai luan wrote:
> [...]
>>>>>> If use time_after_eq64(), expire ,next_credit and other member will must
>>>>>> be u64.
>>>>> Yes, you'll need to store next_credit as a u64 in vif instead of
>>>>> calculating it in tx_credit_exceeded from expires (which is only an
>>>>> unsigned long).
>>>> I know that.  Even we use u64, time_after_eq()  will also do wrong judge
>>>> in theory (not in reality because need long long time).
>>> If jiffies_64 has millisecond resolution that would be more than
>>> 500,000,000 years.
>> Yes, I agree the fact.
>>>> I think the two better fixed way is below:
>>>>    - By time_before() to judge if now beyond MAX_ULONG/2
>>> This is broken, so no.
>> Where is broken?  would you like to help me point it out.
> I think David means you didn't actually fix the problem. Your solution is
> merely a workaround.

I have think  about using u64, but more code need to be modified and 
that is not all.  Key point is how to change the element of struct 
time_list (expires)  and don't affect other thing?

David && Wei, would you good idea about modification  of expire (which 
is element of struct time_list)?

Jason
>
>>> David

^ permalink raw reply

* RE: transmit lockup using smsc95xx ethernet on usb3
From: David Laight @ 2013-10-17 14:30 UTC (permalink / raw)
  To: Sarah Sharp; +Cc: netdev, linux-usb, Xenia Ragiadakou
In-Reply-To: <20131017000758.GL7082@xanatos>

> From: Sarah Sharp [mailto:sarah.a.sharp@linux.intel.com]

Thanks for taking an interest.

> On Tue, Oct 15, 2013 at 10:59:18AM +0100, David Laight wrote:
> > We are seeing complete lockups of the transmit side when using
> > the smsc95xx driver connected to a USB3 port on an i7 (Ivybridge) cpu.
> > These errors are very intermittent - less than once a day, and
> > it isn't actually clear that they are related to traffic load.
> >
> > Most of the systems are running the 3.2 kernel from Ubuntu 12.04
> > but I've seen the same problem when running a 3.4 kernel.
> 
> Have you tried the latest stable kernel or the latest -rc kernel?

No. Although I've written a lot of device driver and comms protocol
stack code over the years I've not had to work out how to build
linux kernels - this may be the time I start!

Given the difficulty (or rather the infrequency) of reproducing the
problem I'd like to sort out the failing code path before changing
kernels so that I can then verify that a more recent kernel fixes it.

...
> > We are also seeing similar problems if we connect to a USB2
> > header.
> 
> Do you mean a USB 2.0 port under the xHCI host controller?  What does
> `lsusb -t` show as the parent host controller in that case?

To clarify the fail trace below is from an xhci controller, but
I'm pretty sure we've seen a tx lockup when using ohci.

The usbmon trace when the tx locks up starts with:

> > Two Bo 'fail -71', 6 succeed, one fails -32 the rest fail -104.
> >    done:9871:6913:60 ffff88020ea16a80 293818155 C Bo:3:003:2 -71 EPROTO 512 > 
> >    done:9872:6927:59 ffff88020ea16f00 293818235 C Bo:3:003:2 -71 EPROTO 0
> >    done:9873:6875:58 ffff88020ea16480 293818313 C Bo:3:003:2 0 1514 >
> >    done:9874:6786:57 ffff88020c7c83c0 293818353 C Bo:3:003:2 0 1514 >
> >    done:9875:6794:56 ffff88020c7c80c0 293818470 C Bo:3:003:2 0 1514 >
> >    done:9876:6789:55 ffff88020c7c8e40 293818589 C Bo:3:003:2 0 1514 >
> >    done:9877:6775:54 ffff88020c7c8240 293818702 C Bo:3:003:2 0 1090 >
> >    done:9878:6751:53 ffff88020c7c8180 293818803 C Bo:3:003:2 0 1514 >
> >    done:9879:6735:52 ffff88020c7c89c0 293818885 C Bo:3:003:2 -32 EPIPE 0
> >    done:9880:6671:51 ffff88020c7c8900 293818925 C Bo:3:003:2 -104 ECONNRESET 0
> > ... the ring is cleared in a software loop
> >     done:9927:1292:4 ffff88020cf0c480 293819015 C Bo:3:003:2 -104 0
> >     done:9928:1170:3 ffff88020ea160c0 293819016 C Bo:3:003:2 -104 0
> > Something is known to be wrong...
> >   start:9931         ffff88020ea160c0 293819037 S Co:3:003:0
> >                          s 02 01 0000 0002 0000 0
> >     done:9929:1080:3 ffff88020ea16780 293819044 C Bo:3:003:2 -104 0
> >      done:9930:945:2 ffff88020ea16000 293819044 C Bo:3:003:2 -104 0
> >       done:9931:48:1 ffff88020ea160c0 293819085 C Co:3:003:0 0 0

I've also seen resets that start with an interrupt from device 1.
In this case the ring is cleared with ESHUTDOWN and dmesg traces what looks
like an unplug-plug action.

Last successful ethernet transmit
ffff88020c4870c0 701760986 C Bo:3:018:2 0 1090 >
ffff88020c4870c0 701760992 S Bo:3:018:2 -115 1090
                  = 3a340000 3a440000 22003200 00224d98
                    d8460002 1f0057d7 08004500 042879ca
Interrupt - I think from the root hub.
ffff88020c8570c0 701761038 C Ii:3:001:1 0:2048 1 = 02
ffff88020c8570c0 701761042 S Ii:3:001:1 -115:2048 4 <
ffff88020ea16840 701761046 C Ii:3:018:3 -71:1 0  EPROTO
ffff88020ea16840 701761047 S Ii:3:018:3 -115:1 16 <
ffff88020c53c480 701761051 C Bi:3:018:1 -71 0
ffff88020c487180 701761054 C Bo:3:018:2 -71 1024 >
ffff880210570240 701761063 S Ci:3:001:0 s a3 00 0000 0001 0004 4 <
ffff880210570240 701761071 C Ci:3:001:0 0 4 = 00010100
ffff880210570240 701761074 S Co:3:001:0 s 23 01 0010 0001 0000 0
ffff88020c53c540 701761076 C Bi:3:018:1 -71 0
ffff880210570240 701761078 C Co:3:001:0 0 0
ffff88020c487240 701761117 C Bo:3:018:2 -71 0
ffff88020c53cd80 701761156 C Bi:3:018:1 -108 0   ESHITDOWN
ffff88020c53c9c0 701761158 C Bi:3:018:1 -108 0
ffff88020c487840 701761196 C Bo:3:018:2 -108 0
ffff88020c487900 701761201 C Bo:3:018:2 -108 0
... lots of similar lines deleted
ffff88020c487540 701761299 C Bo:3:018:2 -108 0
ffff88020c4870c0 701761300 C Bo:3:018:2 -108 0
ffff88020ea16840 701761304 C Ii:3:018:3 -71:1 0
ffff88020c8570c0 701782179 C Ii:3:001:1 0:2048 1 = 02
ffff88020c8570c0 701782183 S Ii:3:001:1 -115:2048 4 <
ffff88020c95be40 703089906 C Ii:3:020:1 -108:8 0
device 17 will be the hub on the smsc95xx chip
ffff88020c95b240 703427572 C Ii:3:017:1 -108:2048 0
ffff88020c487540 703427788 S Ci:3:001:0 s a3 00 0000 0001 0004 4 <
ffff88020c487540 703427803 C Ci:3:001:0 0 4 = 01010100

...
> I would suggest you try with the latest stable kernel, with
> CONFIG_USB_DEBUG and CONFIG_USB_XHCI_HCD_DEBUGGING enabled.  If you try
> the latest 3.12-rc, you will only need the CONFIG_USB_DEBUG.  Or, if
> that output is too much (it will spew on short packets, which may be an
> issue with your ethernet adapter),

The only way I've got the above usbmon trace is by reading 1000000
lines (with dd) and saving the file if the last line output by dmesg
changes. I suspect the above might trace too much for that!
I think the kernel I have has the xhci_warn() compiled out - I'm not
seeing any of those reported by dmesg. I looked at the .h files and
couldn't see where xhci_warn() gets suppressed.

> with the 3.12-rc kernel, you can turn
> off CONFIG_USB_DEBUG and capture command trace events through the xHCI
> trace infrastructure.  Xenia can help you with that if necessary (I'm
> going to be at a conference starting Friday).
> 
> Without that dmesg, I really can't tell what's happening at an xHCI
> level.

Have you any idea which code loops are clearing the ring with ECONNRESET
and ESHUTDOWN and where they might be being called from?
I've been reading the code but couldn't see any obvious paths that would
lead to the ring being cleared with those codes.

	David

^ permalink raw reply

* Re: [PATCH 0/5] rfkill-gpio: ACPI support
From: Johannes Berg @ 2013-10-17 14:26 UTC (permalink / raw)
  To: Heikki Krogerus
  Cc: John W. Linville, Rhyland Klein, Rafael J. Wysocki, linux-acpi,
	linux-wireless, netdev
In-Reply-To: <1381920823-15403-1-git-send-email-heikki.krogerus@linux.intel.com>

On Wed, 2013-10-16 at 13:53 +0300, Heikki Krogerus wrote:
> Hi,
> 
> The first patches prepare the driver for the support. The last patch
> can then add the support quite easily. With these patches, adding DT
> support later will be quite straight forward if someone needs it.

Applied. I'm totally relying on all the reviewers though, since I have
very little idea of what's going on. If anyone else wants to maintain
the rfkill-gpio driver I'd welcome that :-)

johannes


^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: Wei Liu @ 2013-10-17 14:06 UTC (permalink / raw)
  To: jianhai luan
  Cc: David Vrabel, Jan Beulich, ian.campbell, wei.liu2, xen-devel,
	annie.li, netdev
In-Reply-To: <525FED42.4040608@oracle.com>

On Thu, Oct 17, 2013 at 09:59:30PM +0800, jianhai luan wrote:
[...]
> >>>>If use time_after_eq64(), expire ,next_credit and other member will must
> >>>>be u64.
> >>>Yes, you'll need to store next_credit as a u64 in vif instead of
> >>>calculating it in tx_credit_exceeded from expires (which is only an
> >>>unsigned long).
> >>I know that.  Even we use u64, time_after_eq()  will also do wrong judge
> >>in theory (not in reality because need long long time).
> >If jiffies_64 has millisecond resolution that would be more than
> >500,000,000 years.
> 
> Yes, I agree the fact.
> >
> >>I think the two better fixed way is below:
> >>   - By time_before() to judge if now beyond MAX_ULONG/2
> >This is broken, so no.
> 
> Where is broken?  would you like to help me point it out.

I think David means you didn't actually fix the problem. Your solution is
merely a workaround.

> >
> >David

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: jianhai luan @ 2013-10-17 13:59 UTC (permalink / raw)
  To: David Vrabel
  Cc: Jan Beulich, ian.campbell, wei.liu2, xen-devel, annie.li, netdev
In-Reply-To: <525FBC7F.9040800@citrix.com>


On 2013-10-17 18:31, David Vrabel wrote:
> On 17/10/13 11:19, jianhai luan wrote:
>> On 2013-10-17 17:15, David Vrabel wrote:
>>> On 17/10/13 10:02, jianhai luan wrote:
>>>> On 2013-10-17 16:26, Jan Beulich wrote:
>>>>>>>> On 16.10.13 at 19:22, Jason Luan <jianhai.luan@oracle.com> wrote:
>>>>>> time_after_eq() only works if the delta is < MAX_ULONG/2.
>>>>>>
>>>>>> If netfront sends at a very low rate, the time between subsequent
>>>>>> calls
>>>>>> to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for
>>>>>> timer_after_eq() will be incorrect.  Credit will not be replenished
>>>>>> and
>>>>>> the guest may become unable to send (e.g., if prior to the long
>>>>>> gap, all
>>>>>> credit was exhausted).
>>>>>>
>>>>>> We should add the scenario which now beyond next_credit+MAX_UNLONG/2.
>>>>>> Because
>>>>>> the fact now must be not before than expire, time_before(now, expire)
>>>>>> == true
>>>>>> will verify the scenario.
>>>>>>        time_after_eq(now, next_credit) || time_before (now, expire)
>>>>>>        ==
>>>>>>        !time_in_range_open(now, expire, next_credit)
>>>>> So first of all this must be with a 32-bit netback. And the not
>>>>> coverable gap between activity is well over 240 days long. _If_
>>>>> this really needs dealing with, then why is extending this from
>>>>> 240+ to 480+ days sufficient? I.e. why don't you simply
>>>>> change to 64-bit jiffy values, and use time_after_eq64()?
>>>> Yes, the issue only can be  reproduced in 32-bit Dom0 (Beyond
>>>> MAX_ULONG/2 in 64-bit will need long long time)
>>>>
>>>> I think the gap should be think all environment even now extending 480+.
>>>> if now fall in the gap,  one timer will be pending and replenish will be
>>>> in time.  Please run the attachment test program.
>>>>
>>>> If use time_after_eq64(), expire ,next_credit and other member will must
>>>> be u64.
>>> Yes, you'll need to store next_credit as a u64 in vif instead of
>>> calculating it in tx_credit_exceeded from expires (which is only an
>>> unsigned long).
>> I know that.  Even we use u64, time_after_eq()  will also do wrong judge
>> in theory (not in reality because need long long time).
> If jiffies_64 has millisecond resolution that would be more than
> 500,000,000 years.

Yes, I agree the fact.
>
>> I think the two better fixed way is below:
>>    - By time_before() to judge if now beyond MAX_ULONG/2
> This is broken, so no.

Where is broken?  would you like to help me point it out.
>
> David

^ permalink raw reply

* Re: PROBLEM: GRE forwarding not working with 3.10.x, WCCPv2 and Cisco 7200 Router showing IP0 bad-hlen 4 in tcpdump
From: Eric Dumazet @ 2013-10-17 13:44 UTC (permalink / raw)
  To: Peter Schmitt; +Cc: netdev@vger.kernel.org, Pravin B Shelar
In-Reply-To: <1382007190.35868.YahooMailNeo@web172006.mail.ir2.yahoo.com>

On Thu, 2013-10-17 at 11:53 +0100, Peter Schmitt wrote:
> Hi,
> 
> 
> > 
> > 3.10 is buggy (for ETH_P_WCCP support I mean), 3.11 has the probable
> > fix :
> > 
> > commit 3d7b46cd20e300bd6989fb1f43d46f1b9645816e
> > ("ip_tunnel: push generic protocol handling to ip_tunnel module.")
> > 
> > The problem is 3.10 do not pull the extra 4 bytes of WCCP
> > 
> > This is now properly done in iptunnel_pull_header()
> > 
> 
> First of all, many thanks for your help on this.
> 
> I tried to apply the fix to the 3.10.16 sources, as I would like to
> stay with the long-term line. Unfortunately it does not apply, as
> there are a lot of dependencies on other patches.
> 
> Do you think there will be a fix for the long-time 3.10 kernel line?
> Or can you guide me on how to apply this fix to the long-term kernel?

3.10 is right in the middle of GRE refactoring, and many bugs are in it.

It might be very painful to get a complete list of patches to backport.

^ permalink raw reply

* Re: [PATCH net-next] fib_trie: Send RTM_DELROUTE when link goes down
From: Sergei Shtylyov @ 2013-10-17 13:43 UTC (permalink / raw)
  To: Kristian Evensen; +Cc: netdev
In-Reply-To: <1382008395-9869-1-git-send-email-kristian.evensen@gmail.com>

Hello.

On 17-10-2013 15:13, Kristian Evensen wrote:

> From: Kristian Evensen <kristian.evensen@gmail.com>

> When a link is set as down using for example "ip link set dev X down", routes
> are deleted, but RTM_DELROUTE messages are not sent to RTNLGRP_IPV4_ROUTE. This
> patch makes trie_flush_list() send a RTM_DELROUTE for each route that is
> removed, and makes rtnelink route deletion behavior consistent across commands.
> The parameter lists for trie_flush_list() and trie_flush_leaf() are expanded to
> include required information otherwise unavailable in trie_flush_list().

> One use case that is simplified by this patch is IPv4 WAN connection monitoring.
> Instead of listening for and handling both RTM_*ROUTE and RTM_*LINK-messages, it
> is sufficient to handle RTM_*ROUTE.

> Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>

> ---
>   net/ipv4/fib_trie.c | 16 ++++++++++++----
>   1 file changed, 12 insertions(+), 4 deletions(-)

> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index ec9a9ef..acd5b3b 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -1698,15 +1698,23 @@ int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
>   	return 0;
>   }
>
> -static int trie_flush_list(struct list_head *head)
> +static int trie_flush_list(struct list_head *head, u32 tb_id, t_key key,
> +		int plen)

    The continuation line should start right under *struct* on the first line, 
according to the networking coding style.

>   {
>   	struct fib_alias *fa, *fa_node;
>   	int found = 0;
> +	struct nl_info cfg;
> +
> +	memset(&cfg, 0, sizeof(cfg));
>
>   	list_for_each_entry_safe(fa, fa_node, head, fa_list) {
>   		struct fib_info *fi = fa->fa_info;
>
>   		if (fi && (fi->fib_flags & RTNH_F_DEAD)) {
> +			cfg.nl_net = fi->fib_net;
> +			rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb_id,
> +					&cfg, 0);

    Here the line should start right under RTM_DELROUTE.

WBR, Sergei

^ permalink raw reply

* Re: [PATCH ipsec v3] xfrm: prevent ipcomp scratch buffer race condition
From: Herbert Xu @ 2013-10-17 13:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Michal Kubecek, Steffen Klassert, David S. Miller, netdev
In-Reply-To: <1382017117.2045.146.camel@edumazet-glaptop.roam.corp.google.com>

On Thu, Oct 17, 2013 at 06:38:37AM -0700, Eric Dumazet wrote:
> On Thu, 2013-10-17 at 15:07 +0200, Michal Kubecek wrote:
> > In ipcomp_compress(), sortirq is enabled too early, allowing the
> > per-cpu scratch buffer to be rewritten by ipcomp_decompress()
> > (called on the same CPU in softirq context) between populating
> > the buffer and copying the compressed data to the skb.
> > 
> > v2: as pointed out by Steffen Klassert, if we also move the
> > local_bh_disable() before reading the per-cpu pointers, we can
> > get rid of get_cpu()/put_cpu().
> > 
> > v3: removed ipcomp_decompress part (as explained by Herbert Xu,
> > it cannot be called from process context), get rid of cpu
> > variable (thanks to Eric Dumazet)
> > 
> > Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
> > ---
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH ipsec v3] xfrm: prevent ipcomp scratch buffer race condition
From: Eric Dumazet @ 2013-10-17 13:38 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: Steffen Klassert, Herbert Xu, David S. Miller, netdev
In-Reply-To: <20131017130740.D8741E8A5E@unicorn.suse.cz>

On Thu, 2013-10-17 at 15:07 +0200, Michal Kubecek wrote:
> In ipcomp_compress(), sortirq is enabled too early, allowing the
> per-cpu scratch buffer to be rewritten by ipcomp_decompress()
> (called on the same CPU in softirq context) between populating
> the buffer and copying the compressed data to the skb.
> 
> v2: as pointed out by Steffen Klassert, if we also move the
> local_bh_disable() before reading the per-cpu pointers, we can
> get rid of get_cpu()/put_cpu().
> 
> v3: removed ipcomp_decompress part (as explained by Herbert Xu,
> it cannot be called from process context), get rid of cpu
> variable (thanks to Eric Dumazet)
> 
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
> ---

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH ipsec] xfrm: prevent ipcomp scratch buffer race condition
From: Michal Kubecek @ 2013-10-17 13:13 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Steffen Klassert, David S. Miller, netdev
In-Reply-To: <20131017110420.GA21027@gondor.apana.org.au>

On Thu, Oct 17, 2013 at 07:04:20PM +0800, Herbert Xu wrote:
V> On Thu, Oct 17, 2013 at 01:01:49PM +0200, Michal Kubecek wrote:
> > 
> > Am I wrong?
> 
> I hope so :) Because otherwise xfrm_input will surely dead-lock since
> it uses spin_lock.
> 
> The cases you mentioned should all go through netif_rx which will
> then do the processing in BH context.

You are right. Thank you for the correction.

Michal Kubecek

^ permalink raw reply

* [PATCH ipsec v3] xfrm: prevent ipcomp scratch buffer race condition
From: Michal Kubecek @ 2013-10-17 13:07 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Herbert Xu, David S. Miller, netdev, Eric Dumazet
In-Reply-To: <20131017095502.GD7660@secunet.com>

In ipcomp_compress(), sortirq is enabled too early, allowing the
per-cpu scratch buffer to be rewritten by ipcomp_decompress()
(called on the same CPU in softirq context) between populating
the buffer and copying the compressed data to the skb.

v2: as pointed out by Steffen Klassert, if we also move the
local_bh_disable() before reading the per-cpu pointers, we can
get rid of get_cpu()/put_cpu().

v3: removed ipcomp_decompress part (as explained by Herbert Xu,
it cannot be called from process context), get rid of cpu
variable (thanks to Eric Dumazet)

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
---
 net/xfrm/xfrm_ipcomp.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index 2906d52..3be02b6 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -141,14 +141,14 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	const int plen = skb->len;
 	int dlen = IPCOMP_SCRATCH_SIZE;
 	u8 *start = skb->data;
-	const int cpu = get_cpu();
-	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
-	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
+	struct crypto_comp *tfm;
+	u8 *scratch;
 	int err;

 	local_bh_disable();
+	scratch = *this_cpu_ptr(ipcomp_scratches);
+	tfm = *this_cpu_ptr(ipcd->tfms);
 	err = crypto_comp_compress(tfm, start, plen, scratch, &dlen);
-	local_bh_enable();
 	if (err)
 		goto out;

@@ -158,13 +158,13 @@ static int ipcomp_compress(struct xfrm_state *x, struct sk_buff *skb)
 	}

 	memcpy(start + sizeof(struct ip_comp_hdr), scratch, dlen);
-	put_cpu();
+	local_bh_enable();

 	pskb_trim(skb, dlen + sizeof(struct ip_comp_hdr));
 	return 0;

 out:
-	put_cpu();
+	local_bh_enable();
 	return err;
 }

-- 
1.8.1.4

^ permalink raw reply related

* Re: [PATCH v2 net 4/4] bridge: Fix updating FDB entries when the PVID is applied
From: Toshiaki Makita @ 2013-10-17 12:52 UTC (permalink / raw)
  To: vyasevic
  Cc: Stephen Hemminger, David S . Miller, netdev, Toshiaki Makita,
	Fernando Luis Vazquez Cao
In-Reply-To: <525EBABF.6030304@redhat.com>

On Wed, 2013-10-16 at 12:11 -0400, Vlad Yasevich wrote:
> On 10/16/2013 11:57 AM, Stephen Hemminger wrote:
> > On Wed, 16 Oct 2013 17:07:16 +0900
> > Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >
> >> We currently set the value that variable vid is pointing, which will be
> >> used in FDB later, to 0 at br_allowed_ingress() when we receive untagged
> >> or priority-tagged frames, even though the PVID is valid.
> >> This leads to FDB updates in such a wrong way that they are learned with
> >> VID 0.
> >> Update the value to that of PVID if the PVID is applied.
> >>
> >> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> >> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
> >> ---
> >>   net/bridge/br_vlan.c | 1 +
> >>   1 file changed, 1 insertion(+)
> >>
> >> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> >> index 5a9c44a..53f0990 100644
> >> --- a/net/bridge/br_vlan.c
> >> +++ b/net/bridge/br_vlan.c
> >> @@ -217,6 +217,7 @@ bool br_allowed_ingress(struct net_bridge *br, struct net_port_vlans *v,
> >>   		/* PVID is set on this port.  Any untagged or priority-tagged
> >>   		 * ingress frame is considered to belong to this vlan.
> >>   		 */
> >> +		*vid = pvid;
> >>   		if (likely(err))
> >>   			/* Untagged Frame. */
> >>   			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), pvid);
> >
> >
> > Ok, but side-effects seem like an indication of poor code logic
> > flow design. Not your fault but part of the the per-vlan filtering code.
> >
> 
> I'll see if I can re-work the code to get rid of the side-effects.

I'm thinking br_allowed_ingress() might have too many roles.
- Determine whether an incoming frame is acceptable.
- Update skb->vlan_tci if PVID is applied.
- Update the argument 'vid'.

Besides, 'vid' is actually updated by not br_allowed_ingress() but
br_vlan_get_tag().

I think this complicated implementation could lead to missing expected
code for updating vid.

At least we can remove the third role from br_allowed_ingress() because
the required vid is recorded in skb->vlan_tci when we exit the function.

So, we can write the caller of br_allowed_ingress() like
	...
	if (!br_allowed_ingress(br, v, skb))
		goto drop;
	vid = br_vlan_get_tag(skb);

(Assuming br_vlan_get_tag() has been changed to return vid.)

However, this will require br_vlan_get_tag() to check br->vlan_enabled.
Does this change reduce complexity of current implementation?

BTW, some codes in mdb, such as br_multicast_ipv4_rcv(), seem to call
br_vlan_get_tag() without checking br->vlan_enabled.
Is this right way?
Or does br_vlan_get_tag() originally need to check br->vlan_enabled?

Thanks,

Toshiaki Makita

> 
> -vlad

^ permalink raw reply

* Re: sun4i_handle_irq: WARNING at net/ipv4/tcp_input.c:2711 tcp_fastretrans_alert
From: Maxime Ripard @ 2013-10-17 12:32 UTC (permalink / raw)
  To: Richard Genoud
  Cc: Stefan Roese, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <CACQ1gAinSoTCTtf9Y6TL=vtEm1jnJgis0n=y=frNtuPxue9-ng@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 981 bytes --]

Hi Richard

On Thu, Oct 17, 2013 at 01:46:49PM +0200, Richard Genoud wrote:
> > Hmmm, that seems pretty far from the network/interrupt drivers, and it
> > looks like other users have seen this on !ARM machines:
> > http://forums.gentoo.org/viewtopic-p-7379928.html
> > https://bugzilla.redhat.com/show_bug.cgi?id=989251
> >
> > There's a patch in the redhat's bugzilla, could you try to apply it and
> > see if it solves your problem?
> >
> > Maybe the netdev guys will have other ideas as well.
> For the record, I set:
> net.ipv4.tcp_early_retrans = 2
> net.ipv4.tcp_frto = 0
> And I didn't see the warning any more.
> It's seems that a patch is on it's way. I'll try it instead of setting
> the sysctl.
> http://thread.gmane.org/gmane.linux.network/286793

Ah, I wanted to ask you about the status of this.

Thanks for the follow-up,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH v2 net 2/4] bridge: Apply the PVID to priority-tagged frames
From: Toshiaki Makita @ 2013-10-17 12:14 UTC (permalink / raw)
  To: vyasevic
  Cc: Stephen Hemminger, David S . Miller, netdev, Toshiaki Makita,
	Fernando Luis Vazquez Cao
In-Reply-To: <525EBBC9.8050809@redhat.com>

On Wed, 2013-10-16 at 12:16 -0400, Vlad Yasevich wrote:
> On 10/16/2013 11:55 AM, Stephen Hemminger wrote:
> > On Wed, 16 Oct 2013 17:07:14 +0900
> > Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >
> >> IEEE 802.1Q says that when we receive priority-tagged (VID 0) frames
> >> use the PVID for the port as its VID.
> >> (See IEEE 802.1Q-2011 6.9.1 and Table 9-2)
> >>
> >> Apply the PVID to not only untagged frames but also priority-tagged frames.
> >>
> >> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> >> ---
> >>   net/bridge/br_vlan.c | 27 ++++++++++++++++++++-------
> >>   1 file changed, 20 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> >> index 21b6d21..5a9c44a 100644
> >> --- a/net/bridge/br_vlan.c
> >> +++ b/net/bridge/br_vlan.c
> >> @@ -189,6 +189,8 @@ out:
> >>   bool br_allowed_ingress(struct net_bridge *br, struct net_port_vlans *v,
> >>   			struct sk_buff *skb, u16 *vid)
> >>   {
> >> +	int err;
> >> +
> >>   	/* If VLAN filtering is disabled on the bridge, all packets are
> >>   	 * permitted.
> >>   	 */
> >> @@ -201,20 +203,31 @@ bool br_allowed_ingress(struct net_bridge *br, struct net_port_vlans *v,
> >>   	if (!v)
> >>   		return false;
> >>
> >> -	if (br_vlan_get_tag(skb, vid)) {
> >> +	err = br_vlan_get_tag(skb, vid);
> >> +	if (!*vid) {
> >>   		u16 pvid = br_get_pvid(v);
> >
> > Ok, but it looks like br_vlan_get_tag() could be cleaner if it just returned
> > the tag, and there was another br_vlan_tag_present() function.

Thank you for reviewing.
I agree with you.
I had been afraid that if it affects other codes because
br_vlan_get_tag() is used in many places else, but now I have decided
not to hesitate to change its signature and behavior.

> 
> I was just thinking about that as well.  If we make br_vlan_get_tag()
> return either the actual tag (if the packet is tagged), or the pvid
> if (untagged/prio_tagged), then we can skp most of this.

Hmm... maybe I don't fully understand you.

Is what you intend something like
	br_allowed_ingress(...) {
		...
		vid = br_vlan_get_tag(skb, v);
		if (!tagged(skb)) put_tag(skb, vid); /* untagged */
		else if (!get_vid(skb)) update_vid(skb, vid); /* prio_tagged */
		...
	}

	br_vlan_get_tag(skb, v) {
		if (tagged(skb)) {
			vid = get_vid(skb);
			if (!vid) return get_pvid(v); /* prio_tagged */
			return vid;
		}
		return get_pvid(v); /* untagged */
	}

This needs double check for prio_tagged at br_allowed_ingress() and
br_vlan_get_tag().

Or if we modify skb->vlan_tci at br_vlan_get_tag(), isn't it a little
dangerous to other codes that use this function in order to just get
vid?

I am thinking it makes things simple that br_vlan_get_tag() returns 0 if
(untagged/prio_tagged).

	br_allowed_ingress(...) {
		...
		vid = br_vlan_get_tag(skb);
		if (!vid) {
			vid = get_pvid(v);
			if (!tagged(skb)) put_tag(skb, vid);/* untagged */
			else update_vid(skb, vid); /* prio_tagged */
		}
		...
	}

	br_vlan_get_tag(skb) {
		if (tagged(skb)) return get_vid(skb);
		return 0;
	}

Thanks,

Toshiaki Makita

> 
> >
> > Also, does this still work if CONFIG_BRIDGE_VLAN_FILTERING is disabled?
> 
> Yes.  br_allowed_ingress becomes an inline if the config option is disabled.
> 
> -vlad

^ permalink raw reply

* Re: [PATCH] X.25: Fix address field length calculation
From: Kelleter, Günther @ 2013-10-17 12:09 UTC (permalink / raw)
  To: Andrew Hendry, David Laight
  Cc: Joe Perches, David Miller, linux-x25@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <CADo0ohh7jZhc_WJFkrYYxoYza8ZeSEadzwgwabJWwQ1TucdCcg@mail.gmail.com>


E.g. called address 7 digits and caller address 3 digits. Called DCE
answering without facilities
gives us this packet (hex):

37 12 34 56 71 23 00

then x25_parse_address_block()  tries to pull 1+7+3 = 11 bytes from the
packet (with pskb_may_pull())
which only has 7 bytes.
When facilities are included the wrong calculated length has no effect
since the facilities make this packet long enough to make pskb_may_pull
with wrong number
of bytes succeed. later x25_addr_ntoa() correctly pulls 6 bytes for
addresses from the packet.


Am 17.10.2013 13:02, schrieb Andrew Hendry:
> Sorry for the previous html mail.
> This appears to be correct, what length addresses are you getting back
> in the call accept when this happens?
>
> On Wed, Oct 16, 2013 at 7:56 PM, David Laight <David.Laight@aculab.com> wrote:
>>> On Tue, 2013-10-15 at 14:29 +0000, Kelleter, Günther wrote:
>>>> Addresses are BCD encoded, not ASCII. x25_addr_ntoa got it right.
>>> []
>>>> Wrong length calculation leads to rejection of CALL ACCEPT packets.
>>> []
>>>> diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
>>> []
>>>> @@ -98,7 +98,7 @@ int x25_parse_address_block(struct sk_buff *skb,
>>>>     }
>>>>     len = *skb->data;
>>>> -   needed = 1 + (len >> 4) + (len & 0x0f);
>>>> +   needed = 1 + ((len >> 4) + (len & 0x0f) + 1) / 2;
>>> This calculation looks odd.
>> Looks correct to me...
>> In X.25 the lengths (in digits) of the called and calling addresses
>> are encoded in the high and low nibbles of one byte and then
>> followed by both addresses with a digit in each nibble.
>> If the length of the first address is odd, the second one
>> isn't byte aligned.
>>
>>         David
>>
>>
>>


-- 

^ permalink raw reply

* Re: [PATCH 5/5] net: rfkill: gpio: add ACPI support
From: Rafael J. Wysocki @ 2013-10-17 11:48 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Heikki Krogerus, John W. Linville, Johannes Berg, Rhyland Klein,
	linux-acpi, linux-wireless, netdev
In-Reply-To: <20131017074426.GG3521@intel.com>

On Thursday, October 17, 2013 10:44:26 AM Mika Westerberg wrote:
> On Wed, Oct 16, 2013 at 10:55:01PM +0200, Rafael J. Wysocki wrote:
> > On Wednesday, October 16, 2013 01:53:43 PM Heikki Krogerus wrote:
> > > Including ACPI ID for Broadcom GPS receiver BCM4752.
> > > 
> > > Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> > > ---
> > >  net/rfkill/rfkill-gpio.c | 31 ++++++++++++++++++++++++++++++-
> > >  1 file changed, 30 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
> > > index 2dd78c6..5620d3c 100644
> > > --- a/net/rfkill/rfkill-gpio.c
> > > +++ b/net/rfkill/rfkill-gpio.c
> > > @@ -24,6 +24,8 @@
> > >  #include <linux/platform_device.h>
> > >  #include <linux/clk.h>
> > >  #include <linux/slab.h>
> > > +#include <linux/acpi.h>
> > > +#include <linux/acpi_gpio.h>
> > >  
> > >  #include <linux/rfkill-gpio.h>
> > >  
> > > @@ -70,6 +72,23 @@ static const struct rfkill_ops rfkill_gpio_ops = {
> > >  	.set_block = rfkill_gpio_set_power,
> > >  };
> > >  
> > > +static int rfkill_gpio_acpi_probe(struct device *dev,
> > > +				  struct rfkill_gpio_data *rfkill)
> > > +{
> > > +	const struct acpi_device_id *id;
> > > +
> > > +	id = acpi_match_device(dev->driver->acpi_match_table, dev);
> > > +	if (!id)
> > > +		return -ENODEV;
> > > +
> > > +	rfkill->name = dev_name(dev);
> > > +	rfkill->type = (unsigned)id->driver_data;
> > > +	rfkill->reset_gpio = acpi_get_gpio_by_index(dev, 0, NULL);
> > > +	rfkill->shutdown_gpio = acpi_get_gpio_by_index(dev, 1, NULL);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int rfkill_gpio_probe(struct platform_device *pdev)
> > >  {
> > >  	struct rfkill_gpio_platform_data *pdata = pdev->dev.platform_data;
> > > @@ -82,7 +101,11 @@ static int rfkill_gpio_probe(struct platform_device *pdev)
> > >  	if (!rfkill)
> > >  		return -ENOMEM;
> > >  
> > > -	if (pdata) {
> > > +	if (ACPI_HANDLE(&pdev->dev)) {
> > > +		ret = rfkill_gpio_acpi_probe(&pdev->dev, rfkill);
> > > +		if (ret)
> > > +			return ret;
> > > +	} else if (pdata) {
> > >  		clk_name = pdata->power_clk_name;
> > >  		rfkill->name = pdata->name;
> > >  		rfkill->type = pdata->type;
> > > @@ -170,12 +193,18 @@ static int rfkill_gpio_remove(struct platform_device *pdev)
> > >  	return 0;
> > >  }
> > >  
> > > +static const struct acpi_device_id rfkill_acpi_match[] = {
> > > +	{ "BCM4752", RFKILL_TYPE_GPS },
> > > +	{ },
> > > +};
> > > +
> > >  static struct platform_driver rfkill_gpio_driver = {
> > >  	.probe = rfkill_gpio_probe,
> > >  	.remove = rfkill_gpio_remove,
> > >  	.driver = {
> > >  		.name = "rfkill_gpio",
> > >  		.owner = THIS_MODULE,
> > > +		.acpi_match_table = ACPI_PTR(rfkill_acpi_match),
> > >  	},
> > >  };
> > 
> > Looks good to me.
> > 
> > Has Mika seen this?
> 
> Yes, saw it now and looks good to me as well.
> 
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> 
> for the whole series, for what it's worth.

OK, so

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

for the ACPI-related changes.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox