Netdev List
* Re: [PATCHSET net-next v2 00/07] Support for byte queue limits on various drivers
From: David Miller @ 2013-10-21 20:34 UTC (permalink / raw)
  To: milky-kernel; +Cc: netdev
In-Reply-To: <1382292803-18875-1-git-send-email-milky-kernel@mcmilk.de>


I'm not applying any patches that add module parameters for this, sorry.

^ permalink raw reply

* Re: [PATCH net-next] net: fix build warnings because of net_get_random_once merge
From: David Miller @ 2013-10-21 20:27 UTC (permalink / raw)
  To: hannes; +Cc: netdev, linux-kernel
In-Reply-To: <20131020042602.GC27787@order.stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Sun, 20 Oct 2013 06:26:02 +0200

> This patch fixes the following warning:
> 
>    In file included from include/linux/skbuff.h:27:0,
>                     from include/linux/netfilter.h:5,
>                     from include/net/netns/netfilter.h:5,
>                     from include/net/net_namespace.h:20,
>                     from include/linux/init_task.h:14,
>                     from init/init_task.c:1:
> include/linux/net.h:243:14: warning: 'struct static_key' declared inside parameter list [enabled by default]
>           struct static_key *done_key);
> 
> on x86_64 allnoconfig, um defconfig and ia64 allmodconfig and maybe others as well.
> 
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Applied, thanks Hannes.

^ permalink raw reply

* Re: [patch net v2 0/3] UFO fixes
From: David Miller @ 2013-10-21 20:26 UTC (permalink / raw)
  To: hannes
  Cc: jiri, netdev, eric.dumazet, jdmason, yoshfuji, kuznet, jmorris,
	kaber, herbert
In-Reply-To: <20131020032617.GA27787@order.stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Sun, 20 Oct 2013 05:26:17 +0200

> Hi David!
> 
> On Sat, Oct 19, 2013 at 07:21:47PM -0400, David Miller wrote:
>> From: Jiri Pirko <jiri@resnulli.us>
>> Date: Sat, 19 Oct 2013 12:29:14 +0200
>> 
>> > Couple of patches fixing UFO functionality in different situations.
>> > 
>> > v1->v2:
>> > - minor if{}else{} coding style adjustment suggested by Sergei Shtylyov
>> 
>> Series applied, thanks Jiri.
> 
> I would propose that the patches
> 
> "ip6_output: do skb ufo init for peeked non ufo skb as well"
> (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)
> 
> and
> 
> "ip_output: do skb ufo init for peeked non ufo skb as well"
> (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
> 
> should go to stable because they solve a possible memory corruption
> from userspace.

I suppose... the reason I didn't automatically queue these up for -stable
is that they are rather non-trivial.

^ permalink raw reply

* Re: [PATCH v2.44 2/5] ofp-actions: Add separate OpenFlow 1.3 action parser
From: Ben Pfaff @ 2013-10-21 20:19 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev, netdev, Jesse Gross, Pravin B Shelar, Ravi K, Isaku Yamahata,
	Joe Stringer
In-Reply-To: <1381972511-27221-3-git-send-email-horms@verge.net.au>

On Thu, Oct 17, 2013 at 10:15:08AM +0900, Simon Horman wrote:
> From: Joe Stringer <joe@wand.net.nz>
> 
> This patch adds new ofpact_from_openflow13() and
> ofpacts_from_openflow13() functions parallel to the existing ofpact
> handling code. In the OpenFlow 1.3 version, push_mpls is handled
> differently, but all other actions are handled by the existing code.
> 
> In the case of push_mpls for OpenFlow 1.3 the new mpls_before_vlan field of
> struct ofpact_push_mpls is set to true.  This will be used by a subsequent
> patch to allow the correct VLAN+MPLS datapath behaviour to be
> determined at odp translation time.
> 
> enum ofpact_mpls_position contributed by Ben Pfaff.
> 
> Signed-off-by: Joe Stringer <joe@wand.net.nz>
> Signed-off-by: Simon Horman <horms@verge.net.au>

I applied this commit to master, but I changed the commit message to:

From a7a2d006baae4152d338bd0bb4de1687084b1b07 Mon Sep 17 00:00:00 2001
From: Joe Stringer <joe@wand.net.nz>
Date: Thu, 17 Oct 2013 10:15:08 +0900
Subject: [PATCH] ofp-actions: Distinguish OF1.1/1.2 push_mpls from OF1.3+.

In OpenFlow 1.1 and 1.2, the push_mpls action pushes the MPLS label after
any existing VLAN tag.  In OpenFlow 1.3, it pushes the label before any
existing VLAN tag.  Until now, the action parser didn't distinguish these
cases.  This commit adds support.  Nothing yet actually changes the
behavior of push_mpls.

enum ofpact_mpls_position contributed by Ben Pfaff.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Pfaff <blp@nicira.com>

^ permalink raw reply
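
The version split described in the revised commit message can be written down in a few lines of C. This is only an illustrative sketch: the enumerator names and the helper push_mpls_position() are hypothetical and are not copied from the Open vSwitch tree. The helper assumes it is given the minor number of an OpenFlow 1.x version that has push_mpls at all (1 or higher).

```c
/* Where push_mpls places the new label relative to an existing VLAN tag.
 * Enumerator names are illustrative, not copied from Open vSwitch. */
enum mpls_position {
    MPLS_AFTER_VLAN,    /* OpenFlow 1.1 and 1.2 */
    MPLS_BEFORE_VLAN    /* OpenFlow 1.3 and later */
};

/* Hypothetical helper: map an OpenFlow 1.x minor version to the position. */
static enum mpls_position push_mpls_position(unsigned int of_minor)
{
    return of_minor >= 3 ? MPLS_BEFORE_VLAN : MPLS_AFTER_VLAN;
}
```

A parser could record the result once per action, so that later translation code does not need to know which protocol version the action came from.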

* Re: [PATCH net 0/3] ipv6: use rt6i_gateway as nexthop
From: Julian Anastasov @ 2013-10-21 20:02 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Miller, netdev, netfilter-devel, lvs-devel,
	Hideaki YOSHIFUJI
In-Reply-To: <20131021093541.GG28333@order.stressinduktion.org>


	Hello,

On Mon, 21 Oct 2013, Hannes Frederic Sowa wrote:

> Not related to the patch:
> 
> That reminds me that Yoshifuji had the idea to cache the results for
> ipv6_addr_type in IP6CB to avoid calling this function over and over again.
> Maybe we can do the same for rt6_infos to save some cycles here and there.

	Yes, ipv6_addr_type has a small price. Maybe only
DNAT and IPSec can complicate such caching.

> Also, what do you think about this site:
> 
> net/ipv6/ip6_output.c:
>     411 
>     412                 rt = (struct rt6_info *) dst;
>     413                 if (rt->rt6i_flags & RTF_GATEWAY)
>     414                         target = &rt->rt6i_gateway;
>     415                 else
>     416                         target = &hdr->daddr;
>     417 
> 
> Our provided skb_dst should come from ip6_route_input, thus ip6_pol_route. So
> I assume we have rt6i_gateway == hdr->daddr there, too. It is a bit more
> complicated because of possible routing extension headers. Maybe you have
> already looked at this?

	Yes, I checked every site that uses rt6i_gateway but
not with the perspective to use rt6_nexthop(). It seems
this is a rt6_nexthop() candidate.

> I just found it while searching which other code paths do emit packets
> while xt_TEE is processing (generation of redirects) and could also lead
> to stack exhaustion. But the path in ip6_forward seems fine.

	Yes. I found only one place more dangerous:
fib6_add_rt2node(). But I think its checks are still valid.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply
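
The if/else quoted from ip6_forward above is the pattern that a helper such as rt6_nexthop() would wrap. The real helper is defined in the patch series, not in this message, so the following is only a userspace model of the quoted logic. The struct model_rt6_info and the MODEL_ prefix are stand-ins invented for the sketch; only the two field names and the flag test come from the quoted snippet.

```c
#include <netinet/in.h>   /* struct in6_addr */

#define MODEL_RTF_GATEWAY 0x0002u   /* same value as the kernel's RTF_GATEWAY */

/* Stand-in for the two struct rt6_info fields used in the quoted snippet. */
struct model_rt6_info {
    unsigned int    rt6i_flags;
    struct in6_addr rt6i_gateway;
};

/* Return the address neighbour discovery should target: the gateway for
 * gatewayed routes, otherwise the packet's own destination. */
static const struct in6_addr *
model_rt6_nexthop(const struct model_rt6_info *rt, const struct in6_addr *daddr)
{
    if (rt->rt6i_flags & MODEL_RTF_GATEWAY)
        return &rt->rt6i_gateway;
    return daddr;
}
```

With such a helper the call site shrinks to a single assignment of the form target = model_rt6_nexthop(rt, &hdr->daddr).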

* [PATCH net] tcp: initialize passive-side sk_pacing_rate after 3WHS
From: Neal Cardwell @ 2013-10-21 19:40 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell, Eric Dumazet, Yuchung Cheng

For passive TCP connections, upon receiving the ACK that completes the
3WHS, make sure we set our pacing rate after we get our first RTT
sample.

On passive TCP connections, when we receive the ACK completing the
3WHS we do not take an RTT sample in tcp_ack(), but rather in
tcp_synack_rtt_meas(). So upon receiving the ACK that completes the
3WHS, tcp_ack() leaves sk_pacing_rate at its initial value.

Originally the initial sk_pacing_rate value was 0, so passive-side
connections defaulted to sysctl_tcp_min_tso_segs (2 segs) in skbuffs
made in the first RTT. With a default initial cwnd of 10 packets, this
happened to be correct for RTTs 5ms or bigger, so it was hard to
see problems in WAN or emulated WAN testing.

Since 7eec4174ff ("pkt_sched: fq: fix non TCP flows pacing"), the
initial sk_pacing_rate is 0xffffffff. So after that change, passive
TCP connections were keeping this value (and using large numbers of
segments per skbuff) until receiving an ACK for data.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_input.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 53974c7..a16b01b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5712,6 +5712,8 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 		} else
 			tcp_init_metrics(sk);
 
+		tcp_update_pacing_rate(sk);
+
 		/* Prevent spurious tcp_cwnd_restart() on first data packet */
 		tp->lsndtime = tcp_time_stamp;
 
-- 
1.8.4

^ permalink raw reply related
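
The "RTTs 5ms or bigger" remark in the changelog above can be checked with a little arithmetic. The sketch below is a simplified userspace model, not kernel code. It assumes the pacing rate is twice cwnd * mss per RTT and that the TSO size goal is half a millisecond's worth of bytes at that rate, never less than the minimum segment count, which is how I understand the stack of that era to behave. The function names, the 1448-byte MSS and the 65535-byte cap are assumptions made for the example.

```c
#include <stdint.h>

#define MSEC_PER_SEC 1000u

/* Model: pacing rate in bytes/sec, i.e. twice cwnd * mss per RTT. */
static uint32_t model_pacing_rate(uint32_t cwnd, uint32_t mss, uint32_t rtt_us)
{
    uint64_t rate;

    if (rtt_us == 0)
        return UINT32_MAX;      /* no RTT sample: treated as unlimited here */
    rate = 2ull * cwnd * mss * 1000000ull / rtt_us;
    return rate > UINT32_MAX ? UINT32_MAX : (uint32_t)rate;
}

/* Model: segments per skb, clamped to [min_tso_segs, gso_max_size / mss]. */
static uint32_t model_tso_segs(uint32_t pacing_rate, uint32_t mss,
                               uint32_t min_tso_segs, uint32_t gso_max_size)
{
    uint32_t bytes = pacing_rate / (2 * MSEC_PER_SEC);

    if (bytes < min_tso_segs * mss)
        bytes = min_tso_segs * mss;
    if (bytes > gso_max_size)
        bytes = gso_max_size;
    return bytes / mss;
}
```

In this model, cwnd = 10 and a 5 ms RTT give exactly 2 segments, which is the same answer a pacing rate of 0 produces through the minimum clamp. That is why the old initial value went unnoticed on WAN paths. A 1 ms RTT gives 10 segments, and a rate of 0xffffffff runs into the size cap.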

* Re: [PATCH] can: add Renesas R-Car CAN driver
From: Wolfgang Grandegger @ 2013-10-21 19:12 UTC (permalink / raw)
  To: Sergei Shtylyov, netdev, mkl, linux-can; +Cc: linux-sh, vksavl
In-Reply-To: <526061BE.7060204@cogentembedded.com>

Hi Sergei,

On 10/18/2013 12:16 AM, Sergei Shtylyov wrote:
> Hello.
> 
> On 10/02/2013 10:09 AM, Wolfgang Grandegger wrote:
> 
>    Sorry for the belated reply -- was on vacations.
> 
>> thanks for your contribution. The patch looks already quite good. Before
>> I find time for a detailed review could you please check error handling
>> and bus-off recovery by reporting the output of "$ candump -td -e
>> any,0:0,#FFFFFFFF" while sending messages to the device ...
> 
>> 1. ... without cable connected
> 
> Terminal 1:
> 
> root@10.0.0.101:/opt/can-utils# ./cangen -n 1 -g 1 can0
> root@10.0.0.101:/opt/can-utils#
> 
> Terminal 2:
> 
> root@10.0.0.101:/opt/can-utils# ./candump -td -e any,0:0,#FFFFFFFF
> (000.000000) can0 200000AC [8] 00 08 00 19 00 00 00 00 ERRORFRAME
> controller-problem{tx-error-warning}
> protocol-violation{{}{acknowledge-slot}}
> no-acknowledgement-on-tx
> bus-error
> (000.004496) can0 20000004 [8] 00 20 00 00 00 00 00 00 ERRORFRAME
> controller-problem{tx-error-passive}
> 
> So we get into and stay in the error-passive state:

Looks good.

> 
> root@10.0.0.101:/opt/can-utils# ip -details link show can0
> 2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> qlen 10 link/can
> can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
> bitrate 297619 sample-point 0.714

Strange, what bitrate did you configure?

> tq 480 prop-seg 2 phase-seg1 2 phase-seg2 2 sjw 1
> rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
> clock 49999999

Could you please try if the algorithm works better with 50000000.
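
For reference, the bitrate and sample point printed by "ip -details link show" follow directly from the time quantum and the segment lengths. The helpers below are a small sketch with made-up names. They assume the usual CAN bit layout of one sync-segment quantum followed by prop-seg, phase-seg1 and phase-seg2, and a non-zero tq.

```c
#include <stdint.h>

/* Bits per second for a time quantum given in nanoseconds and segment
 * lengths given in quanta; the sync segment is always one quantum. */
static uint32_t can_bitrate(uint32_t tq_ns, uint32_t prop_seg,
                            uint32_t phase_seg1, uint32_t phase_seg2)
{
    uint32_t tq_per_bit = 1 + prop_seg + phase_seg1 + phase_seg2;

    return 1000000000u / (tq_ns * tq_per_bit);
}

/* Sample point in tenths of a percent (714 means 71.4%). */
static uint32_t can_sample_point(uint32_t prop_seg, uint32_t phase_seg1,
                                 uint32_t phase_seg2)
{
    uint32_t tq_per_bit = 1 + prop_seg + phase_seg1 + phase_seg2;

    return 1000u * (1 + prop_seg + phase_seg1) / tq_per_bit;
}
```

With tq 480 and three segments of 2 quanta each, a bit is 7 quanta long, so the result is 297619 bit/s with the sample point at 71.4%, matching the output quoted above.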

> root@10.0.0.101:/opt/can-utils#
> 
> dmesg:
> rcar_can rcar_can.0 can0: bitrate error 0.7%
> rcar_can rcar_can.0 can0: Error warning interrupt
> rcar_can rcar_can.0 can0: Bus error interrupt:
> rcar_can rcar_can.0 can0: ACK Error
> rcar_can rcar_can.0 can0: Error passive interrupt
> 
>> 2. ... with short-circuited CAN high and low and doing some time later
>>         a manual recovery with "ip link set can0 type can restart"
> 
>    Now we have auto recovery only. Manual recovery was tested with the
> first driver version and worked.

What do you mean with "auto recovery"? Auto recovery by the hardware or
via "restart-ms <ms>"? How do you choose between "manual" and "auto"
recovery?

> Terminal 1:
> 
> root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
> root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
> root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
> root@10.0.0.104:/opt/can-utils#
> 
> Terminal 2:
> 
> root@10.0.0.104:/opt/can-utils# ./candump -td -e any,0:0,#FFFFFFFF
> (000.000000) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
> controller-problem{}
> protocol-violation{{tx-dominant-bit-error}{}}
> bus-error
> (000.021147) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
> controller-problem{}
> bus-off
> restarted-after-bus-off

Why does it get "restarted" directly after the bus-off?

> (011.738522) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
> controller-problem{}

What controller problem? data[1] is not set for some reason.

> protocol-violation{{tx-dominant-bit-error}{}}
> bus-error
> (000.021163) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
> controller-problem{}
> bus-off
> restarted-after-bus-off
> (001.666625) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
> controller-problem{}
> protocol-violation{{tx-dominant-bit-error}{}}
> bus-error
> (000.021157) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
> controller-problem{}
> bus-off
> restarted-after-bus-off
> 
> dmesg:
> rcar_can rcar_can.0 can0: Error warning interrupt
> rcar_can rcar_can.0 can0: Error passive interrupt
> rcar_can rcar_can.0 can0: Bus error interrupt:
> rcar_can rcar_can.0 can0: Bit Error (dominant)
> rcar_can rcar_can.0 can0: Error warning interrupt
> rcar_can rcar_can.0 can0: Error passive interrupt

Why are they reported again? You are already in error passive.

> rcar_can rcar_can.0 can0: Bus-off entry interrupt
> rcar_can rcar_can.0 can0: bus-off
> rcar_can rcar_can.0 can0: Bus-off recovery interrupt
> rcar_can rcar_can.0 can0: Error warning interrupt
> rcar_can rcar_can.0 can0: Error passive interrupt
> rcar_can rcar_can.0 can0: Bus error interrupt:
> rcar_can rcar_can.0 can0: Bit Error (dominant)
> rcar_can rcar_can.0 can0: Error warning interrupt
> rcar_can rcar_can.0 can0: Error passive interrupt
> rcar_can rcar_can.0 can0: Bus-off entry interrupt
> rcar_can rcar_can.0 can0: bus-off
> rcar_can rcar_can.0 can0: Bus-off recovery interrupt
> rcar_can rcar_can.0 can0: Error warning interrupt
> rcar_can rcar_can.0 can0: Error passive interrupt
> rcar_can rcar_can.0 can0: Bus error interrupt:
> rcar_can rcar_can.0 can0: Bit Error (dominant)
> rcar_can rcar_can.0 can0: Error warning interrupt
> rcar_can rcar_can.0 can0: Error passive interrupt
> rcar_can rcar_can.0 can0: Bus-off entry interrupt
> rcar_can rcar_can.0 can0: bus-off
> rcar_can rcar_can.0 can0: Bus-off recovery interrupt

>> I also wonder if the messages are always sent in order. You could use
>> the program "canfdtest" [1] from the can-utils for validation.
> 
>    This program is PITA. With the driver workaround it works:

What workaround?

Wolfgang.

^ permalink raw reply
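
The error frames in the candump output above can be decoded by hand from the CAN ID. The sketch below is a userspace illustration only: the bit values mirror the error-class definitions in linux/can.h and linux/can/error.h, while the table, the MODEL_ name and decode_can_error() are invented for the example. It covers only the classes that appear in this thread.

```c
#include <stdint.h>
#include <stdio.h>

#define MODEL_CAN_ERR_FLAG 0x20000000u   /* marks an error frame */

struct err_class {
    uint32_t    bit;
    const char *name;
};

/* Error classes seen in this thread, named as candump prints them. */
static const struct err_class err_classes[] = {
    { 0x004u, "controller-problem" },
    { 0x008u, "protocol-violation" },
    { 0x020u, "no-acknowledgement-on-tx" },
    { 0x040u, "bus-off" },
    { 0x080u, "bus-error" },
    { 0x100u, "restarted-after-bus-off" },
};

/* Write the space-separated class names into out. Returns the number of
 * classes written, or -1 if can_id is not an error frame. */
static int decode_can_error(uint32_t can_id, char *out, size_t len)
{
    size_t used = 0;
    int n = 0;

    if (!(can_id & MODEL_CAN_ERR_FLAG) || len == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof(err_classes) / sizeof(err_classes[0]); i++) {
        int w;

        if (!(can_id & err_classes[i].bit))
            continue;
        w = snprintf(out + used, len - used, "%s%s",
                     n ? " " : "", err_classes[i].name);
        if (w < 0 || (size_t)w >= len - used)
            break;              /* output buffer too small: stop here */
        used += (size_t)w;
        n++;
    }
    return n;
}
```

Decoding 0x200000AC gives four classes, including controller-problem, and data[1] of that frame (0x08) carries the tx-error-warning detail. In the 0x2000008C and 0x20000144 frames the controller-problem class is set as well, but data[1] is zero, which is what the question about the missing controller detail refers to.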

* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
From: Eric Dumazet @ 2013-10-21 18:49 UTC (permalink / raw)
  To: Jimmy PERCHET; +Cc: Giuseppe CAVALLARO, netdev
In-Reply-To: <1382380327.3284.77.camel@edumazet-glaptop.roam.corp.google.com>

On Mon, 2013-10-21 at 11:32 -0700, Eric Dumazet wrote:
> On Mon, 2013-10-21 at 15:10 +0200, Jimmy PERCHET wrote:
> > Hello Peppe,
> > 
> 
> > I can reproduce this problem by sending 9KiB jumbo frames on a 10Mbit/s link.
> > If the socket's wmemory size is about 500KiB (or less), the transfer stalls.
> > (I guess it is reproducible with 1500-octet frames by decreasing
> > the socket's wmemory to 90KB.)
> > Re-arming the timer fixes this behaviour.
> >
> > Here is my understanding of this issue:
> > With 9KiB frames and 500kiB of wmemory, only 60 frames can be
> > prepared in a row. It is below the tx coalescence threshold,
> > so there will be no interrupt. When the tx coalescence timer 
> > expires (40ms after), only five descriptors have to be
> > freed (9000*5 @ 10Mbit/s = 34ms), it is not enough to reach
> > the socket's wake-up threshold. We get into a deadlock :
> > *Socket is waiting for free buffers before performing new transfer.
> > *Driver is waiting for new transfer before performing cleanup.
> > 
> > Maybe, it is not a real life use-case, and is not worth
> > a patch. What do you think ?
> > 
> 
> I think there is probably a bug in the driver, a race of some sort,
> and it would be better to find it and fix it ;)
> 

coalesce params should not be hardcoded, but depend on link speed and
mtu.

On 10Mbits, and MTU=9000 there is really no point using coalescing !

^ permalink raw reply
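
To put numbers on the link-speed and MTU dependence mentioned above, here is a small sketch that computes how long one frame occupies the wire and how many frames can complete before a coalescing timer fires. The function names are made up, and the model ignores preamble, inter-frame gap and header overhead, so the real figures are slightly worse.

```c
#include <stdint.h>

/* Microseconds needed to serialize one frame on a link of link_mbps. */
static uint64_t frame_time_us(uint32_t frame_bytes, uint32_t link_mbps)
{
    if (link_mbps == 0)
        return 0;
    return (uint64_t)frame_bytes * 8 / link_mbps;
}

/* Whole frames that finish transmitting within timer_ms. */
static uint32_t frames_completed(uint32_t timer_ms, uint32_t frame_bytes,
                                 uint32_t link_mbps)
{
    uint64_t t = frame_time_us(frame_bytes, link_mbps);

    if (t == 0)
        return 0;
    return (uint32_t)((uint64_t)timer_ms * 1000 / t);
}
```

A 9000-byte frame takes 7.2 ms at 10 Mbit/s, so a 40 ms timer sees only 5 completed frames (about 36 ms of wire time under this model), while 1500-byte frames on the same link give 33. A fixed frame-count threshold therefore means very different latencies at different speeds and MTUs.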

* Re: [PATCH RFC 3/5] net:stmmac: ensure we reclaim all dirty descriptors.
From: Eric Dumazet @ 2013-10-21 18:32 UTC (permalink / raw)
  To: Jimmy PERCHET; +Cc: Giuseppe CAVALLARO, netdev
In-Reply-To: <526527DE.5060906@parrot.com>

On Mon, 2013-10-21 at 15:10 +0200, Jimmy PERCHET wrote:
> Hello Peppe,
> 

> I can reproduce this problem by sending 9KiB jumbo frames on a 10Mbit/s link.
> If the socket's wmemory size is about 500KiB (or less), the transfer stalls.
> (I guess it is reproducible with 1500-octet frames by decreasing
> the socket's wmemory to 90KB.)
> Re-arming the timer fixes this behaviour.
>
> Here is my understanding of this issue:
> With 9KiB frames and 500kiB of wmemory, only 60 frames can be
> prepared in a row. It is below the tx coalescence threshold,
> so there will be no interrupt. When the tx coalescence timer 
> expires (40ms after), only five descriptors have to be
> freed (9000*5 @ 10Mbit/s = 34ms), it is not enough to reach
> the socket's wake-up threshold. We get into a deadlock :
> *Socket is waiting for free buffers before performing new transfer.
> *Driver is waiting for new transfer before performing cleanup.
> 
> Maybe, it is not a real life use-case, and is not worth
> a patch. What do you think ?
> 

I think there is probably a bug in the driver, a race of some sort,
and it would be better to find it and fix it ;)

^ permalink raw reply

* Re: [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
From: Eric Dumazet @ 2013-10-21 18:24 UTC (permalink / raw)
  To: Jimmy PERCHET; +Cc: Giuseppe CAVALLARO, netdev
In-Reply-To: <52656CE1.1060703@parrot.com>

On Mon, 2013-10-21 at 20:05 +0200, Jimmy PERCHET wrote:

> I understand your point. Nevertheless I think it is still possible
> to avoid serialization, and therefore increase performance, even if
> completions must remain in softirq. What do you think ?

I think this will break over time. This kind of 'optimization' works on
a particular workload, and that's it. It reminds me of the 'skb recycle'
thing.

The workload you try to optimize might conflict with other workloads.

Have you tried to reduce the 64 value in netif_napi_add() ?

> 
> In my patch I tried to avoid any race condition. (by updating both
> descriptor's cursors only once, for instance)
> Could you explain the possible race you see ?

Well, take a look at netif_queue_stopped(), netif_wake_queue(),
netif_queue_stopped() calls for a start.

It's really hard to make start_xmit() lockless (versus TX completion).

Adding atomic_t won't be enough (and btw that's often not needed at all)

drivers/net/ethernet/broadcom/tg3.c is a good reference if you really
want to do that properly.

^ permalink raw reply
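
The stop/wake race referred to above is usually handled with a "stop, barrier, re-check" sequence on the transmit side and an "update, barrier, check" sequence on the completion side. The following is a userspace model of that pattern using C11 atomics. It is not driver code: the ring, its size, the threshold and all names are hypothetical, the stopped flag stands in for the netdev queue state, and the fences stand in for smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE   8u                  /* hypothetical ring size */
#define WAKE_THRESH (RING_SIZE / 4)     /* wake only above this many free slots */

struct tx_ring {
    atomic_uint prod;       /* advanced only by the xmit path */
    atomic_uint cons;       /* advanced only by the completion path */
    atomic_bool stopped;    /* stands in for the netdev queue state */
};

static unsigned int tx_avail(struct tx_ring *r)
{
    return RING_SIZE - (atomic_load(&r->prod) - atomic_load(&r->cons));
}

/* Xmit path: queue one descriptor. Returns false if the queue is stopped. */
static bool model_xmit(struct tx_ring *r)
{
    if (atomic_load(&r->stopped) || tx_avail(r) == 0)
        return false;
    atomic_fetch_add(&r->prod, 1);
    if (tx_avail(r) == 0) {
        atomic_store(&r->stopped, true);
        atomic_thread_fence(memory_order_seq_cst);      /* like smp_mb() */
        /* Re-check: a completion may have freed slots in the meantime. */
        if (tx_avail(r) > WAKE_THRESH)
            atomic_store(&r->stopped, false);
    }
    return true;
}

/* Completion path: release 'done' descriptors and wake the queue if needed. */
static void model_tx_clean(struct tx_ring *r, unsigned int done)
{
    atomic_fetch_add(&r->cons, done);
    atomic_thread_fence(memory_order_seq_cst);          /* like smp_mb() */
    if (atomic_load(&r->stopped) && tx_avail(r) > WAKE_THRESH)
        atomic_store(&r->stopped, false);
}
```

The point of the re-check after stopping is that, without it, a completion that runs between the availability test and the stop could free the whole ring and never see the stopped flag, leaving the queue stopped forever. A real driver additionally has to take the tx queue lock around the wake, as the reference driver named above does.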

* Re: [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
From: Jimmy PERCHET @ 2013-10-21 18:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Giuseppe CAVALLARO, netdev, Jimmy Perchet
In-Reply-To: <1382373005.3284.61.camel@edumazet-glaptop.roam.corp.google.com>

On 21/10/2013 18:30, Eric Dumazet wrote:
> On Mon, 2013-10-21 at 15:52 +0200, Giuseppe CAVALLARO wrote:
>> Hello Jimmy
>>
>> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>>> Tx descriptor's cleanup and preparation are serialized, which is not necessary
>>> and decrease performance.
>>> In addition TX descriptor's cleanup is performed on NET_RX softirq, this is
>>> confusing.
>>
>> hmm, here you are changing the logic behind the tx/rx processes.
>>
>> As done in many drivers, the stmmac cleans the tx resources in
>> NAPI context and this is not a confusing approach ;-).
>>
>> It gave me some performance improvements especially on TCP benchmarks.
>>
>>> This patch unserialize tx descriptor's cleanup and preparation
>>> and defer cleanup in workqueue.
>>
>> So you decide to use workqueue and I kindly ask you to give me more
>> details about the performance improvements (UDP/TCP) and cpu usage.
>>
>> I can try to do some tests on my side too. This could take a while
>> unfortunately.
> 
> Anyway this patch is buggy.
> 
> 1) Removing tx_lock spinlock in TX completion adds a race in
> stmmac_xmit()
> 
> 2) Generally speaking, we should not rely on a work queue to perform TX
> completions.
> 
> Think about being flooded by incoming frames.
> 
> Work queue could never be scheduled.
> 
> 
I understand your point. Nevertheless I think it is still possible
to avoid serialization, and therefore increase performance, even if
completions must remain in softirq. What do you think ?

In my patch I tried to avoid any race condition. (by updating both
descriptor's cursors only once, for instance)
Could you explain the possible race you see ?

Best Regards,
Jimmy

^ permalink raw reply

* Re: [PATCH] davinci_emac.c: Fix IFF_ALLMULTI setup
From: Mariusz Ceier @ 2013-10-21 17:59 UTC (permalink / raw)
  To: Mugunthan V N
  Cc: David S. Miller, Lad Prabhakar, Jingoo Han, Jiri Pirko, netdev,
	linux-kernel
In-Reply-To: <52656971.3020509@ti.com>

Yes of course ;) This was my first patch for the kernel. I will
remember about v2 tag next time.

Thanks,
Mariusz Ceier

On 21 October 2013 19:50, Mugunthan V N <mugunthanvnm@ti.com> wrote:
> On Monday 21 October 2013 11:15 PM, Mariusz Ceier wrote:
>> When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
>> emac_dev_mcast_set should only enable RX of multicasts and reset
>> MACHASH registers.
>>
>> It does this, but afterwards it either sets up multicast MACs
>> filtering or disables RX of multicasts and resets MACHASH registers
>> again, rendering IFF_ALLMULTI flag useless.
>>
>> This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
>> disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.
>>
>> Tested with kernel 2.6.37.
>>
>> Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
>> ---
> Can you add [PATCH v2] in your subject in future so that it will be
> easier for the maintainer to pick the latest version of the patch.
>
> Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
>
> Regards
> Mugunthan V N

^ permalink raw reply

* Re: [PATCH] davinci_emac.c: Fix IFF_ALLMULTI setup
From: Mugunthan V N @ 2013-10-21 17:50 UTC (permalink / raw)
  To: Mariusz Ceier
  Cc: David S. Miller, Lad Prabhakar, Jingoo Han, Jiri Pirko, netdev,
	linux-kernel
In-Reply-To: <1382377504-24688-1-git-send-email-mceier+kernel@gmail.com>

On Monday 21 October 2013 11:15 PM, Mariusz Ceier wrote:
> When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
> emac_dev_mcast_set should only enable RX of multicasts and reset
> MACHASH registers.
>
> It does this, but afterwards it either sets up multicast MACs
> filtering or disables RX of multicasts and resets MACHASH registers
> again, rendering IFF_ALLMULTI flag useless.
>
> This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
> disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.
>
> Tested with kernel 2.6.37.
>
> Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
> ---
Can you add [PATCH v2] in your subject in future so that it will be
easier for the maintainer to pick the latest version of the patch.

Acked-by: Mugunthan V N <mugunthanvnm@ti.com>

Regards
Mugunthan V N

^ permalink raw reply

* [PATCH] davinci_emac.c: Fix IFF_ALLMULTI setup
From: Mariusz Ceier @ 2013-10-21 17:45 UTC (permalink / raw)
  To: David S. Miller, Mugunthan V N, Lad Prabhakar, Jingoo Han,
	Jiri Pirko
  Cc: netdev, linux-kernel, Mariusz Ceier
In-Reply-To: <5265640C.4070201@cogentembedded.com>

When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
emac_dev_mcast_set should only enable RX of multicasts and reset
MACHASH registers.

It does this, but afterwards it either sets up multicast MACs
filtering or disables RX of multicasts and resets MACHASH registers
again, rendering IFF_ALLMULTI flag useless.

This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.

Tested with kernel 2.6.37.

Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
---
 drivers/net/ethernet/ti/davinci_emac.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index 67df09e..6a32ef9 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -876,8 +876,7 @@ static void emac_dev_mcast_set(struct net_device *ndev)
 		    netdev_mc_count(ndev) > EMAC_DEF_MAX_MULTICAST_ADDRESSES) {
 			mbp_enable = (mbp_enable | EMAC_MBP_RXMCAST);
 			emac_add_mcast(priv, EMAC_ALL_MULTI_SET, NULL);
-		}
-		if (!netdev_mc_empty(ndev)) {
+		} else if (!netdev_mc_empty(ndev)) {
 			struct netdev_hw_addr *ha;
 
 			mbp_enable = (mbp_enable | EMAC_MBP_RXMCAST);
-- 
1.8.4

^ permalink raw reply related
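
The effect of turning the second "if" into "else if" can be seen in a small model of the decision. This is a sketch based on the changelog and the diff context above, not a copy of the driver: the enum, the function names and the limit of 64 (a placeholder for EMAC_DEF_MAX_MULTICAST_ADDRESSES) are assumptions, and the final else branch is inferred from the changelog's description.

```c
#include <stdbool.h>

#define MODEL_MAX_MCAST 64u   /* placeholder for EMAC_DEF_MAX_MULTICAST_ADDRESSES */

enum mcast_mode { MCAST_DISABLED, MCAST_FILTERED, MCAST_ALL };

/* Before the patch: the second if/else always runs and overrides MCAST_ALL. */
static enum mcast_mode mcast_mode_before(bool allmulti, unsigned int mc_count)
{
    enum mcast_mode mode = MCAST_DISABLED;

    if (allmulti || mc_count > MODEL_MAX_MCAST)
        mode = MCAST_ALL;
    if (mc_count != 0)
        mode = MCAST_FILTERED;
    else
        mode = MCAST_DISABLED;
    return mode;
}

/* After the patch: the three cases are mutually exclusive. */
static enum mcast_mode mcast_mode_after(bool allmulti, unsigned int mc_count)
{
    if (allmulti || mc_count > MODEL_MAX_MCAST)
        return MCAST_ALL;
    else if (mc_count != 0)
        return MCAST_FILTERED;
    else
        return MCAST_DISABLED;
}
```

In the model, IFF_ALLMULTI with an empty multicast list ends up disabled before the patch and all-multicast after it, while behaviour without IFF_ALLMULTI and with a short list is unchanged.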

* Re: [PATCH] davinci_emac.c: Fix IFF_ALLMULTI setup
From: Sergei Shtylyov @ 2013-10-21 17:27 UTC (permalink / raw)
  To: Mariusz Ceier, David S. Miller, Mugunthan V N, Lad Prabhakar,
	Jingoo Han, Jiri Pirko
  Cc: netdev, linux-kernel
In-Reply-To: <1382374777-24200-1-git-send-email-mceier+kernel@gmail.com>

Hello.

On 10/21/2013 08:59 PM, Mariusz Ceier wrote:

> When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
> emac_dev_mcast_set should only enable RX of multicasts and reset
> MACHASH registers.

> It does this, but afterwards it either sets up multicast MACs
> filtering or disables RX of multicasts and resets MACHASH registers
> again, rendering IFF_ALLMULTI flag useless.

> This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
> disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.

> Tested with kernel 2.6.37.

> Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
> ---
>   drivers/net/ethernet/ti/davinci_emac.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

> diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
> index 67df09e..ff3bf0e 100644
> --- a/drivers/net/ethernet/ti/davinci_emac.c
> +++ b/drivers/net/ethernet/ti/davinci_emac.c
> @@ -876,7 +876,7 @@ static void emac_dev_mcast_set(struct net_device *ndev)
>   		    netdev_mc_count(ndev) > EMAC_DEF_MAX_MULTICAST_ADDRESSES) {
>   			mbp_enable = (mbp_enable | EMAC_MBP_RXMCAST);
>   			emac_add_mcast(priv, EMAC_ALL_MULTI_SET, NULL);
> -		}
> +		} else
>   		if (!netdev_mc_empty(ndev)) {

    It should be:

		} else if (!netdev_mc_empty(ndev)) {

WBR, Sergei

^ permalink raw reply

* [PATCH] mac802154: correct a typo in ieee802154_alloc_device() prototype
From: Alexandre Belloni @ 2013-10-21 17:09 UTC (permalink / raw)
  To: David S. Miller
  Cc: alex.bluesman.smirnov, netdev, linux-kernel, Alexandre Belloni

This has no other impact than a cosmetic one.

Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
---
 include/net/mac802154.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/mac802154.h b/include/net/mac802154.h
index d0d11df..807d6b7 100644
--- a/include/net/mac802154.h
+++ b/include/net/mac802154.h
@@ -133,7 +133,7 @@ struct ieee802154_ops {
 
 /* Basic interface to register ieee802154 device */
 struct ieee802154_dev *
-ieee802154_alloc_device(size_t priv_data_lex, struct ieee802154_ops *ops);
+ieee802154_alloc_device(size_t priv_data_len, struct ieee802154_ops *ops);
 void ieee802154_free_device(struct ieee802154_dev *dev);
 int ieee802154_register_device(struct ieee802154_dev *dev);
 void ieee802154_unregister_device(struct ieee802154_dev *dev);
-- 
1.8.3.2

^ permalink raw reply related

* [PATCH] davinci_emac.c: Fix IFF_ALLMULTI setup
From: Mariusz Ceier @ 2013-10-21 16:59 UTC (permalink / raw)
  To: David S. Miller, Mugunthan V N, Lad Prabhakar, Jingoo Han,
	Jiri Pirko
  Cc: netdev, linux-kernel, Mariusz Ceier

When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
emac_dev_mcast_set should only enable RX of multicasts and reset
MACHASH registers.

It does this, but afterwards it either sets up multicast MACs
filtering or disables RX of multicasts and resets MACHASH registers
again, rendering IFF_ALLMULTI flag useless.

This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.

Tested with kernel 2.6.37.

Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
---
 drivers/net/ethernet/ti/davinci_emac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index 67df09e..ff3bf0e 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -876,7 +876,7 @@ static void emac_dev_mcast_set(struct net_device *ndev)
 		    netdev_mc_count(ndev) > EMAC_DEF_MAX_MULTICAST_ADDRESSES) {
 			mbp_enable = (mbp_enable | EMAC_MBP_RXMCAST);
 			emac_add_mcast(priv, EMAC_ALL_MULTI_SET, NULL);
-		}
+		} else
 		if (!netdev_mc_empty(ndev)) {
 			struct netdev_hw_addr *ha;
 
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH RFC 5/5] net:stmmac: asynchronous tx_clean
From: Eric Dumazet @ 2013-10-21 16:30 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: Jimmy Perchet, netdev
In-Reply-To: <5265318C.5050307@st.com>

On Mon, 2013-10-21 at 15:52 +0200, Giuseppe CAVALLARO wrote:
> Hello Jimmy
> 
> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
> > Tx descriptor's cleanup and preparation are serialized, which is not necessary
> > and decrease performance.
> > In addition TX descriptor's cleanup is performed on NET_RX softirq, this is
> > confusing.
> 
> hmm, here you are changing the logic behind the tx/rx processes.
> 
> As done in many drivers, the stmmac cleans the tx resources in
> NAPI context and this is not a confusing approach ;-).
> 
> It gave me some performance improvements especially on TCP benchmarks.
> 
> > This patch unserialize tx descriptor's cleanup and preparation
> > and defer cleanup in workqueue.
> 
> So you decide to use workqueue and I kindly ask you to give me more
> details about the performance improvements (UDP/TCP) and cpu usage.
> 
> I can try to do some tests on my side too. This could take a while
> unfortunately.

Anyway this patch is buggy.

1) Removing tx_lock spinlock in TX completion adds a race in
stmmac_xmit()

2) Generally speaking, we should not rely on a work queue to perform TX
completions.

Think about being flooded by incoming frames.

Work queue could never be scheduled.

^ permalink raw reply

* Re: [PATCH RFC 4/5] net:stmmac: fix jumbo frame handling.
From: Jimmy PERCHET @ 2013-10-21 16:28 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: netdev, jimmy.perchet
In-Reply-To: <52652EE7.2060500@st.com>

On 21/10/2013 15:40, Giuseppe CAVALLARO wrote:
> On 10/16/2013 5:24 PM, Jimmy Perchet wrote:
>> This patch addresses several issues which prevent jumbo frames from working properly :
>> .jumbo frames' last descriptor was not closed
>> .several confusion regarding descriptor's max buffer size
>> .frags could not be jumbo
>>
>> Signed-off-by: Jimmy Perchet <jimmy.perchet@parrot.com>
> 
> 
> Jimmy, thx for this patch. Below are some of my first notes.

Thanks a lot for this first review.

> I'll continue to look at the patch to verify if I missed
> something. I kindly ask you, for the next version, to add
> more comments especially in the function to prepare the
> tx desc in order to help me on reviewing.

Sure ;)

I hope to do v2 by next week.

I'm OK with most of your comments. Some additional
notes below:

>>   }
>> @@ -81,7 +81,7 @@ static inline void ndesc_end_tx_desc_on_ring(struct dma_desc *p, int ter)
>>
>>   static inline void norm_set_tx_desc_len_on_ring(struct dma_desc *p, int len)
>>   {
>> -    if (unlikely(len > BUF_SIZE_2KiB)) {
>> +    if (unlikely(len >= BUF_SIZE_2KiB)) {
> 
> we cannot manage a size of 2048 on normal desc
> 
> Pls you should verify to not break the back-compatibility.

IMHO, this actually fixes the problem you think I create.
In current code, if len is equal to 2048, buffer1_size is set to 2048,
this is wrong because the max size is actually 2047...

> 
>>           p->des01.etx.buffer1_size = BUF_SIZE_2KiB - 1;
>>           p->des01.etx.buffer2_size = len - p->des01.etx.buffer1_size;
>>       } else



>>
>>   static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>>   {
>> @@ -103,13 +90,13 @@ static void stmmac_refill_desc3(void *priv_ptr, struct dma_desc *p)
>>       if (unlikely(priv->plat->has_gmac))
>>           /* Fill DES3 in case of RING mode */
>>           if (priv->dma_buf_sz >= BUF_SIZE_8KiB)
>> -            p->des3 = p->des2 + BUF_SIZE_8KiB;
>> +            p->des3 = p->des2 + BUF_SIZE_8KiB - 1;
> 
> is it correct? can you check?

The actual buffer's max size is 8191, so, in ring mode,
the second buffer must start at p->des2 + 8191.
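
A rough sketch of that layout, assuming the frame data is contiguous in
memory and that each buffer covers at most 8191 bytes. The names
struct ring_bufs, ring_layout() and MAX_BUF_8K are hypothetical, and unlike
the driver this helper only fills des3 when a second buffer is needed.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_SIZE_8KiB 8192
/* Assumption from the discussion: max size of one buffer is 8191. */
#define MAX_BUF_8K    (BUF_SIZE_8KiB - 1)

/* Hypothetical view of the address and size fields of one descriptor. */
struct ring_bufs {
	uint32_t des2;          /* address of buffer 1 */
	uint32_t des3;          /* address of buffer 2 (0 if unused) */
	uint32_t buffer1_size;
	uint32_t buffer2_size;
};

/* Lay out a contiguous region of len bytes starting at dma_addr. */
static struct ring_bufs ring_layout(uint32_t dma_addr, uint32_t len)
{
	struct ring_bufs r = { dma_addr, 0, 0, 0 };

	/* One descriptor cannot carry more than two full buffers. */
	assert(len <= 2u * MAX_BUF_8K);

	if (len > MAX_BUF_8K) {
		r.buffer1_size = MAX_BUF_8K;
		r.buffer2_size = len - MAX_BUF_8K;
		/* Buffer 2 starts at the first byte buffer 1 did not cover. */
		r.des3 = r.des2 + MAX_BUF_8K;
	} else {
		r.buffer1_size = len;
	}
	return r;
}
```

If des3 were des2 + 8192 instead, the byte at offset 8191 would belong to
neither buffer.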

>> -    priv->cur_tx++;
>> +    priv->cur_tx += nb_desc;
> 
> Can we avoid using nb_desc?

Actually, it is preparation for my 5th patch: I want to write cur_tx only once.
I can split this out.



Best Regards,
Jimmy

^ permalink raw reply

* Stale IPv6 address accumulation on linux 3.2.17
From: Templin, Fred L @ 2013-10-21 15:50 UTC (permalink / raw)
  To: netdev@vger.kernel.org
In-Reply-To: <loom.20131021T165114-650@post.gmane.org>

Hi,

On linux 3.2.17, I have a host that configures IPv6 addresses on
an eth0 interface based on Router Advertisements received from an
on-link linux box configured as an IPv6 router and running radvd.
When the host gets an RA, it configures both an EUI-64-based IPv6
address and an IPv6 privacy address, so it has two IPv6 addresses.
But, if I leave the host up for long periods of time, it seems to
accumulate additional IPv6 addresses - perhaps these are stale
IPv6 privacy addresses?

Is this known behavior, and if so is there a way to turn it off?
Or, perhaps this was a known bug that has been corrected in more
recent linux kernel versions?

Thanks - Fred
fred.l.templin@boeing.com

^ permalink raw reply

* Re: [BUG] 3.12.0-rcX IPv6 panic
From: Hannes Frederic Sowa @ 2013-10-21 15:52 UTC (permalink / raw)
  To: Bob Tracy; +Cc: linux-kernel, netdev
In-Reply-To: <20131021131846.GA5769@gherkin.frus.com>

Hi!

On Mon, Oct 21, 2013 at 08:18:46AM -0500, Bob Tracy wrote:
> Actually, a regression: the 3.11 kernel is rock-solid stable on my
> Alpha.
> 
> Beginning with 3.12.0-rc1, I can reliably trigger a kernel panic by
> executing the gogo6.net "gw6c" IPv6 client program.  If the networking
> layer is active, an "Oops" will eventually (within a day) occur regardless
> of whether I attempt to run "gw6c".  3.12.0-rcX is stable as long as I
> leave networking completely disabled.  The error has persisted up through
> -rc6.  Apologies for not mentioning this earlier, but the state of my
> PWS-433au has been questionable, and I wanted to make sure I had a
> legitimate bug sighting.
> 
> I'll have to transcribe the panic backtrace by hand: nothing makes it
> into any of the system logs :-(.  I *can* recall that every backtrace
> I've seen thus far has included one of the skb_copy() variants near the
> top of the list (first or second function).

Try to capture the panic via serial console. Otherwise a picture
would give us a first hint. Please watch out for lines like
skb_(over|under)_panic.

Is gw6c a tunnel client? Can you post the output of ip -6 tunnel ls?

(Also please send networking bugs to netdev@).

Greetings,

  Hannes

^ permalink raw reply

* Re: [PATCH nf-next] netfilter: xtables: lightweight process control group matching
From: Daniel Borkmann @ 2013-10-21 15:48 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Eric W. Biederman, pablo-Cap9r6Oaw4JrovVCs/uTlw,
	netfilter-devel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <526543A2.2040901-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>

On 10/21/2013 05:09 PM, Daniel Wagner wrote:
> On 10/19/2013 08:16 AM, Daniel Borkmann wrote:
>> On 10/19/2013 01:21 AM, Eric W. Biederman wrote:
>>
>>> I am coming to this late.  But two concrete suggestions.
>>>
>>> 1) process groups and sessions don't change as frequently as pids.
>>>
>>> 2) It is possible to put a set of processes in their own network
>>>     namespace and pipe just the packets you want those processes to
>>>     use into that network namespace.  Using an ingress queueing filter
>>>     makes that process very efficient even if you have to filter by port.
>>
>> Actually in our case we're filtering outgoing traffic, based on which
>> local socket it originated from; so you wouldn't need all of that
>> construct. Also, you wouldn't even need to have a priori knowledge of
>> the application internals regarding their use of particular ports
>> or protocols. I don't think that such a setup will have the same
>> efficiency, ease of use, and power to distinguish the application the
>> traffic came from in such a lightweight, protocol independent and easy way.
>
> Sorry for being late as well (and also for the stupid question)
>
> Couldn't you use something from the LSM? I mean you allow the
> application to create the socket etc and then block later
> the traffic originated from that socket. Wouldn't it make
> more sense to block early?

I gave one simple example for blocking in the commit message,
that's true, but it is not limited to that. Netfilter allows us
many more scenarios/policies than just blocking, e.g. fine-grained
settings for where applications are allowed to connect/send
traffic to, application traffic marking/conntracking,
application-specific packet mangling, and so on;
just think of the whole netfilter universe.

^ permalink raw reply

* Re: [PATCH nf-next] netfilter: xtables: lightweight process control group matching
From: Daniel Wagner @ 2013-10-21 15:09 UTC (permalink / raw)
  To: Daniel Borkmann, Eric W. Biederman
  Cc: pablo, netfilter-devel, netdev, Tejun Heo, cgroups
In-Reply-To: <526231E0.6060903@redhat.com>

Hi Daniel

On 10/19/2013 08:16 AM, Daniel Borkmann wrote:
> On 10/19/2013 01:21 AM, Eric W. Biederman wrote:
>
>> I am coming to this late.  But two concrete suggestions.
>>
>> 1) process groups and sessions don't change as frequently as pids.
>>
>> 2) It is possible to put a set of processes in their own network
>>     namespace and pipe just the packets you want those processes to
>>     use into that network namespace.  Using an ingress queueing filter
>>     makes that process very efficient even if you have to filter by port.
>
> Actually in our case we're filtering outgoing traffic, based on which
> local socket it originated from; so you wouldn't need all of that
> construct. Also, you wouldn't even need to have a priori knowledge of
> the application internals regarding their use of particular ports
> or protocols. I don't think that such a setup will have the same
> efficiency, ease of use, and power to distinguish the application the
> traffic came from in such a lightweight, protocol independent and easy way.

Sorry for being late as well (and also for the stupid question)

Couldn't you use something from the LSM? I mean you allow the
application to create the socket etc and then block later
the traffic originated from that socket. Wouldn't it make
more sense to block early?

cheers,
daniel

^ permalink raw reply

* Re: unmanaged L2TPv3 ethernet pseudowire Cisco <=> Linux
From: Pierre Desvaux @ 2013-10-21 15:08 UTC (permalink / raw)
  To: netdev
In-Reply-To: <51535D16.4080207@katalix.com>

James Chapman <jchapman <at> katalix.com> writes:

> 
> On 27/03/13 20:08, Tomas Agartz wrote:
> > On Tue, 26 Mar 2013, James Chapman wrote:
> > 
> >> The issue is that Linux and Cisco use a different default for the
> >> L2SpecificSublayer header setting and neither implementation provides
> >> a config option to change its setting.  The Linux default is to use
> >> the Default L2SpecificSublayer as defined in the RFC. Unfortunately
> >> the Cisco default is to use no L2SpecificSublayer.
> >>
> >> The kernel already has an API to allow the L2SpecificSublayer setting
> >> to be configured. The missing piece is an iproute2 l2tp config option
> >> to configure it. I'll work on an iproute2 patch now to allow this
> >> setting to be configured.
> > 
> > I patched my iproute2 with your patch and now my tunnel is working.
> > Thank you! :)
> 
> Great. Thanks for reporting back.
> 
> >> For unmanaged tunnels, these parameters must be manually configured
> >> consistently at each side. Both Cisco and Linux default to use no
> >> cookies and both already have config parameters to set cookie
> >> parameters, if needed. However, for L2SpecificSublayer this isn't the
> >> case. We need to add a config option on the Linux side to force the
> >> same setting as Cisco is using.
> > 
> > Does the API in the kernel allow you to set the cookie? In that case it
> > seems like a good idea to add that to iproute2 as well?
> 
> It is already supported. See the cookie and peer_cookie parameters of ip
> l2tp add session.
> 
> ip l2tp help
> or
> man ip-l2tp
> 
> James
> 
> 

Hi,

I have tried another solution to work around this issue.
I put a 4-byte cookie in the packets sent by the Cisco. It looks like this:
[IPv4][L2TPv3][Cookie][payload]

With value 0, the cookie is seen by Linux as an L2SpecificSublayer with
the S bit set to 0, which means the sequence number in the
L2SpecificSublayer is ignored, so Linux accepts it. Linux automatically
replies to the Cisco with the S bit set to 0.
The Cisco is also configured to accept a 4-byte cookie, so the
L2SpecificSublayer is now accepted as a cookie.
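
A small sketch of why an all-zero cookie parses cleanly, assuming the
Default L2-Specific Sublayer layout from RFC 3931 section 4.6 (one 32-bit
word, S flag in the second-most-significant bit, sequence number in the low
24 bits). struct l2spec and parse_l2spec() are hypothetical names and are
not the kernel's parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default L2-Specific Sublayer, per RFC 3931 section 4.6. */
#define L2SPEC_S_BIT    0x40000000u  /* sequence number is valid */
#define L2SPEC_SEQ_MASK 0x00ffffffu  /* 24-bit sequence number */

/* Hypothetical decoded view of the sublayer. */
struct l2spec {
	bool     seq_valid;
	uint32_t seq;
};

/* Interpret 4 bytes in network byte order as the default sublayer. */
static struct l2spec parse_l2spec(const uint8_t b[4])
{
	struct l2spec s;
	uint32_t w;

	assert(b);  /* caller must pass 4 readable bytes */

	w = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	    ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];

	s.seq_valid = (w & L2SPEC_S_BIT) != 0;
	s.seq = w & L2SPEC_SEQ_MASK;
	return s;
}
```

Feeding it the four zero bytes of the cookie gives seq_valid == false, so a
receiver following this layout has no sequence number to check.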

To configure Cisco:
xconnect 192.168.0.1 200 encapsulation l2tpv3 manual pw-class tlund
   l2tp id 200 200
   l2tp cookie local 4 0
   l2tp cookie remote 4 0

Pierre

^ permalink raw reply

