Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next] tcp: enhance tcp_collapse_retrans() with skb_shift()
From: Eric Dumazet @ 2016-11-15 20:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng

From: Eric Dumazet <edumazet@google.com>

In commit 2331ccc5b323 ("tcp: enhance tcp collapsing"),
we made a first step allowing copying right skb to left skb head.

Since all skbs in socket write queue are headless (but possibly the very
first one), this strategy often does not work.

This patch extends tcp_collapse_retrans() to perform frag shifting,
thanks to skb_shift() helper.

This helper needs to not BUG on non headless skbs, as callers are ok
with that.

Tested:

Following packetdrill test now passes :

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
   +0 bind(3, ..., ...) = 0
   +0 listen(3, 1) = 0

   +0 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 8>
   +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
+.100 < . 1:1(0) ack 1 win 257
   +0 accept(3, ..., ...) = 4

   +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
   +0 write(4, ..., 200) = 200
   +0 > P. 1:201(200) ack 1
+.001 write(4, ..., 200) = 200
   +0 > P. 201:401(200) ack 1
+.001 write(4, ..., 200) = 200
   +0 > P. 401:601(200) ack 1
+.001 write(4, ..., 200) = 200
   +0 > P. 601:801(200) ack 1
+.001 write(4, ..., 200) = 200
   +0 > P. 801:1001(200) ack 1
+.001 write(4, ..., 100) = 100
   +0 > P. 1001:1101(100) ack 1
+.001 write(4, ..., 100) = 100
   +0 > P. 1101:1201(100) ack 1
+.001 write(4, ..., 100) = 100
   +0 > P. 1201:1301(100) ack 1
+.001 write(4, ..., 100) = 100
   +0 > P. 1301:1401(100) ack 1

+.099 < . 1:1(0) ack 201 win 257
+.001 < . 1:1(0) ack 201 win 257 <nop,nop,sack 1001:1401>
   +0 > P. 201:1001(800) ack 1

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/core/skbuff.c     |    4 +++-
 net/ipv4/tcp_output.c |   22 +++++++++++-----------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0b2a6e94af2de73ed638634c47a0fb71e2cbc1cb..a9cb81a10c4ba895587727aa4cf098e9a38424ea 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2656,7 +2656,9 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
 	struct skb_frag_struct *fragfrom, *fragto;
 
 	BUG_ON(shiftlen > skb->len);
-	BUG_ON(skb_headlen(skb));	/* Would corrupt stream */
+
+	if (skb_headlen(skb))
+		return 0;
 
 	todo = shiftlen;
 	from = 0;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f57b5aa51b59cf0a58975fe34a7dcdb886ea8c50..19105b46a30436ebb85fe97ee43089e77aa028bb 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2514,7 +2514,7 @@ void tcp_skb_collapse_tstamp(struct sk_buff *skb,
 }
 
 /* Collapses two adjacent SKB's during retransmission. */
-static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
+static bool tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *next_skb = tcp_write_queue_next(sk, skb);
@@ -2525,14 +2525,17 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
 
 	BUG_ON(tcp_skb_pcount(skb) != 1 || tcp_skb_pcount(next_skb) != 1);
 
+	if (next_skb_size) {
+		if (next_skb_size <= skb_availroom(skb))
+			skb_copy_bits(next_skb, 0, skb_put(skb, next_skb_size),
+				      next_skb_size);
+		else if (!skb_shift(skb, next_skb, next_skb_size))
+			return false;
+	}
 	tcp_highest_sack_combine(sk, next_skb, skb);
 
 	tcp_unlink_write_queue(next_skb, sk);
 
-	if (next_skb_size)
-		skb_copy_bits(next_skb, 0, skb_put(skb, next_skb_size),
-			      next_skb_size);
-
 	if (next_skb->ip_summed == CHECKSUM_PARTIAL)
 		skb->ip_summed = CHECKSUM_PARTIAL;
 
@@ -2561,6 +2564,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
 	tcp_skb_collapse_tstamp(skb, next_skb);
 
 	sk_wmem_free_skb(sk, next_skb);
+	return true;
 }
 
 /* Check if coalescing SKBs is legal. */
@@ -2610,16 +2614,12 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
 
 		if (space < 0)
 			break;
-		/* Punt if not enough space exists in the first SKB for
-		 * the data in the second
-		 */
-		if (skb->len > skb_availroom(to))
-			break;
 
 		if (after(TCP_SKB_CB(skb)->end_seq, tcp_wnd_end(tp)))
 			break;
 
-		tcp_collapse_retrans(sk, to);
+		if (!tcp_collapse_retrans(sk, to))
+			break;
 	}
 }
 

^ permalink raw reply related

* Re: [net PATCH 1/2] ipv4: Restore fib_trie_flush_external function and fix call ordering
From: Duyck, Alexander H @ 2016-11-15 20:49 UTC (permalink / raw)
  To: jiri@resnulli.us
  Cc: netdev@vger.kernel.org, jiri@mellanox.com, davem@davemloft.net,
	edumazet@google.com
In-Reply-To: <20161115203140.GC1783@nanopsycho.orion>

On Tue, 2016-11-15 at 21:31 +0100, Jiri Pirko wrote:
> Tue, Nov 15, 2016 at 09:29:09PM CET, alexander.h.duyck@intel.com wrote:
> > 
> > On Tue, 2016-11-15 at 20:51 +0100, Jiri Pirko wrote:
> > > 
> > > Tue, Nov 15, 2016 at 11:46:06AM CET, alexander.h.duyck@intel.com wrote:
> > > > 
> > > > 
> > > > The patch that removed the FIB offload infrastructure was a bit too
> > > > aggressive and also removed code needed to clean up us splitting the table
> > > > if additional rules were added.  Specifically the function
> > > > fib_trie_flush_external was called at the end of a new rule being added to
> > > > flush the foreign trie entries from the main trie.
> > > > 
> > > > I updated the code so that we only call fib_trie_flush_external on the main
> > > > table so that we flush the entries for local from main.  This way we don't
> > > > call it for every rule change which is what was happening previously.
> > > 
> > > Well, the function was introduced by:
> > > 
> > > commit 104616e74e0b464d449fdd2ee2f547d2fad71610
> > > Author: Scott Feldman <sfeldma@gmail.com>
> > > Date:   Thu Mar 5 21:21:16 2015 -0800
> > > 
> > >     switchdev: don't support custom ip rules, for now
> > >     
> > >     Keep switchdev FIB offload model simple for now and don't allow custom ip
> > >     rules.
> > > 
> > > Why this was not needed before? What changed in between:
> > > 104616e74e0b464d449fdd2ee2f547d2fad71610 ("switchdev: don't support custom ip rules, for now")
> > > and
> > > 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")
> > 
> > We collapsed the two tables into one in commit 0ddcf43d5d4a ("ipv4: FIB
> > Local/MAIN table collapse") which was submitted the next day.  Scott
> > and I were working on things at the same time and the
> > fib_table_flush_external function was something we had worked out that
> > would allow him to take care of his use case and me to take care of
> > cleaning up the tables after unmerging.
> 
> Okay. But please name the fuction differently, as it does not flush
> external. Thanks!

You and I have different meanings for "external".

In my case I am flushing entries that belong to a foreign "external"
table from the table specified. So by "external" I am referring to
entries that don't actually live in main, but actually reside in local.
If you take a look at fib_table_flush that gets rid of all entries,
fib_table_flush_external simply clears the foreign ones.

Also I'd rather maintain naming since it makes it easier if we need to
backport fixes.

Finally, the flag RTNH_F_EXTERNAL was renamed over a year ago in commit
36583eb54d46c ("rename RTNH_F_EXTERNAL to RTNH_F_OFFLOAD") so there
isn't too much likelihood of this being confused for something that
handles offloaded entries.  If you take a look in net/ipv4/* after your
patch there isn't actually anything that references the word external
so the likelihood for any confusion is extremely low.

- Alex

^ permalink raw reply

* Re: linux-next: net->netns_ids is used after calling idr_destroy for it
From: Andrei Vagin @ 2016-11-15 20:48 UTC (permalink / raw)
  To: Cong Wang; +Cc: Nicolas Dichtel, Linux Kernel Network Developers
In-Reply-To: <CAM_iQpU=DXKEboYHi4yO3m6VuUm1RoDRD26ZK-pVY--yUE0wNA@mail.gmail.com>

On Tue, Nov 15, 2016 at 10:50 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 10:04 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Mon, Nov 14, 2016 at 10:23 PM, Andrei Vagin <avagin@gmail.com> wrote:
>>> Hi Nicolas,
>>>
>>> cleanup_net() calls idr_destroy(net->netns_ids) for network namespaces
>>> and then it calls unregister_netdevice_many() which calls
>>> idr_alloc(net0>netns_ids). It looks wrong, doesn't it?
>>>
>>
>> netns id is designed to allocate lazily, but yeah it makes no sense
>> to allocate id for the netns being destroyed, not to mention idr is freed.
>>
>> I will send a patch.
>
> Could you try the attached patch? I just did some quick netns creation/destroy
> tests.

Here is another fail:

unreferenced object 0xffff94153912a0c0 (size 2096):
  comm "ip", pid 29175, jiffies 4294954213 (age 137.624s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 b2 3b 1d 15 94 ff ff  ..........;.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffac865c1a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffffac243b38>] kmem_cache_alloc+0x128/0x280
    [<ffffffffac42f5ab>] idr_layer_alloc+0x2b/0x90
    [<ffffffffac42f9cd>] idr_get_empty_slot+0x34d/0x370
    [<ffffffffac42fa4e>] idr_alloc+0x5e/0x110
    [<ffffffffac70ac3d>] __peernet2id_alloc+0x6d/0x90
    [<ffffffffac70bda5>] peernet2id_alloc+0x55/0xb0
    [<ffffffffac731246>] rtnl_fill_ifinfo+0xaa6/0x10a0
    [<ffffffffac7330a3>] rtmsg_ifinfo_build_skb+0x73/0xd0
    [<ffffffffac7125e1>] rollback_registered_many+0x2a1/0x3a0
    [<ffffffffac712779>] __unregister_netdevice_many+0x29/0x80
    [<ffffffffac7127e3>] unregister_netdevice_many+0x13/0x20
    [<ffffffffc02dc4ce>] macvlan_device_event+0x13e/0x235 [macvlan]
    [<ffffffffac0bef2a>] notifier_call_chain+0x4a/0x70
    [<ffffffffac0bf066>] raw_notifier_call_chain+0x16/0x20
    [<ffffffffac710205>] call_netdevice_notifiers_info+0x35/0x60


What do you think about calling idr_destroy() at the final step in
cleanup_net()? In this case we can avoid this sort of problems in a
future.

@@ -422,21 +425,6 @@ static void cleanup_net(struct work_struct *work)
        list_for_each_entry(net, &net_kill_list, cleanup_list) {
                list_del_rcu(&net->list);
                list_add_tail(&net->exit_list, &net_exit_list);
-               for_each_net(tmp) {
-                       int id;
-
-                       spin_lock_irq(&tmp->nsid_lock);
-                       id = __peernet2id(tmp, net);
-                       if (id >= 0)
-                               idr_remove(&tmp->netns_ids, id);
-                       spin_unlock_irq(&tmp->nsid_lock);
-                       if (id >= 0)
-                               rtnl_net_notifyid(tmp, RTM_DELNSID, id);
-               }
-               spin_lock_irq(&net->nsid_lock);
-               idr_destroy(&net->netns_ids);
-               spin_unlock_irq(&net->nsid_lock);
-
        }
        rtnl_unlock();

@@ -464,6 +452,22 @@ static void cleanup_net(struct work_struct *work)

        /* Finally it is safe to free my network namespace structure */
        list_for_each_entry_safe(net, tmp, &net_exit_list, exit_list) {
+               rtnl_lock();
+               for_each_net(tmp) {
+                       int id;
+
+                       spin_lock_irq(&tmp->nsid_lock);
+                       id = __peernet2id(tmp, net);
+                       if (id >= 0)
+                               idr_remove(&tmp->netns_ids, id);
+                       spin_unlock_irq(&tmp->nsid_lock);
+                       if (id >= 0)
+                               rtnl_net_notifyid(tmp, RTM_DELNSID, id);
+               }
+               rtnl_unlock();
+
+               idr_destroy(&net->netns_ids);
+
                list_del_init(&net->exit_list);
                dec_net_namespaces(net->ucounts);
                put_user_ns(net->user_ns);
>
> Thanks!

^ permalink raw reply

* Re: [net-next 1/2] net: ethernet: slicoss: add slicoss gigabit ethernet driver
From: Lino Sanfilippo @ 2016-11-15 20:46 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: davem, charrer, liodot, gregkh, devel, linux-kernel, netdev
In-Reply-To: <20161113195544.GA18258@lunn.ch>

Hi,


On 13.11.2016 20:55, Andrew Lunn wrote:
>> +static const char slic_stats_strings[][ETH_GSTRING_LEN] = {
>> +	"rx_packets     ",
>> +	"rx_bytes       ",
>> +	"rx_multicasts  ",
>> +	"rx_errors      ",
>> +	"rx_buff_miss   ",
>> +	"rx_tp_csum     ",
>> +	"rx_tp_oflow    ",
>> +	"rx_tp_hlen     ",
>> +	"rx_ip_csum     ",
>> +	"rx_ip_len      ",
> 
> Are there any other drivers which pad the statistics strings?
> 

First off, thank you for the review!

I took a look into a few drivers and most of them do not pad the statistic strings.
Actually there are some that do (e.g. the chelsio drivers cxgb4, cxgb3, xcgb4v), 
but this seems to be rather the minority, so I will remove it. Thank you for the hint.

>> +static void slic_set_link_autoneg(struct slic_device *sdev)
>> +{
>> +	unsigned int subid = sdev->pdev->subsystem_device;
>> +	u32 val;
>> +
>> +	if (sdev->is_fiber) {
>> +		/* We've got a fiber gigabit interface, and register 4 is
>> +		 * different in fiber mode than in copper mode.
>> +		 */
>> +		/* advertise FD only @1000 Mb */
>> +		val = MII_ADVERTISE << 16 | SLIC_PAR_ADV1000XFD |
>> +		      SLIC_PAR_ASYMPAUSE_FIBER;
>> +		/* enable PAUSE frames */
>> +		slic_write(sdev, SLIC_REG_WPHY, val);
>> +		/* reset phy, enable auto-neg  */
>> +		val = MII_BMCR << 16 | SLIC_PCR_RESET | SLIC_PCR_AUTONEG |
>> +		      SLIC_PCR_AUTONEG_RST;
>> +		slic_write(sdev, SLIC_REG_WPHY, val);
>> +	} else {	/* copper gigabit */
>> +		/* We've got a copper gigabit interface, and register 4 is
>> +		 * different in copper mode than in fiber mode.
>> +		 */
>> +		/* advertise 10/100 Mb modes   */
>> +		val = MII_ADVERTISE << 16 | SLIC_PAR_ADV100FD |
>> +		      SLIC_PAR_ADV100HD | SLIC_PAR_ADV10FD | SLIC_PAR_ADV10HD;
>> +		/* enable PAUSE frames  */
>> +		val |= SLIC_PAR_ASYMPAUSE;
>> +		/* required by the Cicada PHY  */
>> +		val |= SLIC_PAR_802_3;
>> +		slic_write(sdev, SLIC_REG_WPHY, val);
>> +
>> +		/* advertise FD only @1000 Mb  */
>> +		val = MII_CTRL1000 << 16 | SLIC_PGC_ADV1000FD;
>> +		slic_write(sdev, SLIC_REG_WPHY, val);
>> +
>> +		if (subid != PCI_SUBDEVICE_ID_ALACRITECH_CICADA) {
>> +			 /* if a Marvell PHY enable auto crossover */
>> +			val = SLIC_MIICR_REG_16 | SLIC_MRV_REG16_XOVERON;
>> +			slic_write(sdev, SLIC_REG_WPHY, val);
>> +
>> +			/* reset phy, enable auto-neg  */
>> +			val = MII_BMCR << 16 | SLIC_PCR_RESET |
>> +			      SLIC_PCR_AUTONEG | SLIC_PCR_AUTONEG_RST;
>> +			slic_write(sdev, SLIC_REG_WPHY, val);
>> +		} else {
>> +			/* enable and restart auto-neg (don't reset)  */
>> +			val = MII_BMCR << 16 | SLIC_PCR_AUTONEG |
>> +			      SLIC_PCR_AUTONEG_RST;
>> +			slic_write(sdev, SLIC_REG_WPHY, val);
>> +		}
>> +	}
>> +	sdev->autoneg = true;
>> +}
> 
> Could this be pulled out into a standard PHY driver? All the SLIC
> SLIC_PCR_ defines seems to be the same as those in mii.h. This could
> be a standard PHY hidden behind a single register.
> 
>    Andrew

You are right, the driver should really use the defines in mii.h. I will fix this in
 a v2.

Concerning the use of the PHY API: What would be the advantage of using it? Note that the
 phy is always internal and not interchangeable. Is not the interchangeability of PHYs
the main reason for using this API?

Regards,
Lino

^ permalink raw reply

* Re: [net PATCH 1/2] ipv4: Restore fib_trie_flush_external function and fix call ordering
From: Jiri Pirko @ 2016-11-15 20:31 UTC (permalink / raw)
  To: Duyck, Alexander H
  Cc: netdev@vger.kernel.org, jiri@mellanox.com, davem@davemloft.net,
	edumazet@google.com
In-Reply-To: <1479241747.681.105.camel@intel.com>

Tue, Nov 15, 2016 at 09:29:09PM CET, alexander.h.duyck@intel.com wrote:
>On Tue, 2016-11-15 at 20:51 +0100, Jiri Pirko wrote:
>> Tue, Nov 15, 2016 at 11:46:06AM CET, alexander.h.duyck@intel.com wrote:
>> > 
>> > The patch that removed the FIB offload infrastructure was a bit too
>> > aggressive and also removed code needed to clean up us splitting the table
>> > if additional rules were added.  Specifically the function
>> > fib_trie_flush_external was called at the end of a new rule being added to
>> > flush the foreign trie entries from the main trie.
>> > 
>> > I updated the code so that we only call fib_trie_flush_external on the main
>> > table so that we flush the entries for local from main.  This way we don't
>> > call it for every rule change which is what was happening previously.
>> 
>> Well, the function was introduced by:
>> 
>> commit 104616e74e0b464d449fdd2ee2f547d2fad71610
>> Author: Scott Feldman <sfeldma@gmail.com>
>> Date:   Thu Mar 5 21:21:16 2015 -0800
>> 
>>     switchdev: don't support custom ip rules, for now
>>     
>>     Keep switchdev FIB offload model simple for now and don't allow custom ip
>>     rules.
>> 
>> Why this was not needed before? What changed in between:
>> 104616e74e0b464d449fdd2ee2f547d2fad71610 ("switchdev: don't support custom ip rules, for now")
>> and
>> 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")
>
>We collapsed the two tables into one in commit 0ddcf43d5d4a ("ipv4: FIB
>Local/MAIN table collapse") which was submitted the next day.  Scott
>and I were working on things at the same time and the
>fib_table_flush_external function was something we had worked out that
>would allow him to take care of his use case and me to take care of
>cleaning up the tables after unmerging.

Okay. But please name the fuction differently, as it does not flush
external. Thanks!

^ permalink raw reply

* Re: [net PATCH 1/2] ipv4: Restore fib_trie_flush_external function and fix call ordering
From: Duyck, Alexander H @ 2016-11-15 20:29 UTC (permalink / raw)
  To: jiri@resnulli.us
  Cc: netdev@vger.kernel.org, jiri@mellanox.com, davem@davemloft.net,
	edumazet@google.com
In-Reply-To: <20161115195107.GB1783@nanopsycho.orion>

On Tue, 2016-11-15 at 20:51 +0100, Jiri Pirko wrote:
> Tue, Nov 15, 2016 at 11:46:06AM CET, alexander.h.duyck@intel.com wrote:
> > 
> > The patch that removed the FIB offload infrastructure was a bit too
> > aggressive and also removed code needed to clean up us splitting the table
> > if additional rules were added.  Specifically the function
> > fib_trie_flush_external was called at the end of a new rule being added to
> > flush the foreign trie entries from the main trie.
> > 
> > I updated the code so that we only call fib_trie_flush_external on the main
> > table so that we flush the entries for local from main.  This way we don't
> > call it for every rule change which is what was happening previously.
> 
> Well, the function was introduced by:
> 
> commit 104616e74e0b464d449fdd2ee2f547d2fad71610
> Author: Scott Feldman <sfeldma@gmail.com>
> Date:   Thu Mar 5 21:21:16 2015 -0800
> 
>     switchdev: don't support custom ip rules, for now
>     
>     Keep switchdev FIB offload model simple for now and don't allow custom ip
>     rules.
> 
> Why this was not needed before? What changed in between:
> 104616e74e0b464d449fdd2ee2f547d2fad71610 ("switchdev: don't support custom ip rules, for now")
> and
> 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")

We collapsed the two tables into one in commit 0ddcf43d5d4a ("ipv4: FIB
Local/MAIN table collapse") which was submitted the next day.  Scott
and I were working on things at the same time and the
fib_table_flush_external function was something we had worked out that
would allow him to take care of his use case and me to take care of
cleaning up the tables after unmerging.

- Alex

^ permalink raw reply

* Re: [PATCH 03/15] net: bcm63xx_enet: Utilize phy_ethtool_nway_reset
From: Florian Fainelli @ 2016-11-15 20:15 UTC (permalink / raw)
  To: Heinrich Schuchardt, netdev
  Cc: davem, andrew, tremyfr,
	maintainer:BROADCOM BCM63XX ARM ARCHITECTURE, Arnd Bergmann,
	xypron.glpk@gmx.de, Jarod Wilson, Thierry Reding,
	moderated list:BROADCOM BCM63XX ARM ARCHITECTURE, open list
In-Reply-To: <3dc4471a-1afd-0e28-1de9-60558cece143@gmx.de>

On 11/15/2016 11:59 AM, Heinrich Schuchardt wrote:
> On 11/15/2016 07:06 PM, Florian Fainelli wrote:
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> 
> Each patch of your patch series should have a commit message.

It's pretty obvious what this does, really.

> 
> You can add the missing commit messages with
> git rebase -i HEAD~15
> choosing reword.
> 
> Please, add a cover letter message describing the patch series using
> 
> git format-patch -ns --cover-letter HEAD~15..HEAD
> 
> and provide a summary in the 0/15 message.

I did:

https://www.mail-archive.com/netdev@vger.kernel.org/maillist.html

and that is also apparent from:

In-Reply-To: <20161115180644.3941-1-f.fainelli@gmail.com>
References: <20161115180644.3941-1-f.fainelli@gmail.com>

And what I also did is to intentionally make sure that for each patch,
there would not be 50+ people CC'd, but just the relevant people for
each driver, which is exactly how cc-cmd=scripts/get_maintainer.pl work,
making my life simpler honestly.

It's not nearly my first patch submission, but I admit I have been semi
intentionally lazy with this patch set considering the number of patches.

Anyway, enough time wasted on this.
-- 
Florian

^ permalink raw reply

* Re: [PATCH 03/15] net: bcm63xx_enet: Utilize phy_ethtool_nway_reset
From: Heinrich Schuchardt @ 2016-11-15 19:59 UTC (permalink / raw)
  To: Florian Fainelli, netdev
  Cc: davem, andrew, tremyfr,
	maintainer:BROADCOM BCM63XX ARM ARCHITECTURE, Arnd Bergmann,
	xypron.glpk@gmx.de, Jarod Wilson, Thierry Reding,
	moderated list:BROADCOM BCM63XX ARM ARCHITECTURE, open list
In-Reply-To: <20161115180644.3941-4-f.fainelli@gmail.com>

On 11/15/2016 07:06 PM, Florian Fainelli wrote:
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Each patch of your patch series should have a commit message.

You can add the missing commit messages with
git rebase -i HEAD~15
choosing reword.

Please, add a cover letter message describing the patch series using

git format-patch -ns --cover-letter HEAD~15..HEAD

and provide a summary in the 0/15 message.

The cover letter and all patches should be in one mails thhread.
If you put all messages into one directory you can send the series as
one mail thread with

git send-email <directory>

> ---
>  drivers/net/ethernet/broadcom/bcm63xx_enet.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> index 5c7acef1de2e..a43ab90c051e 100644
> --- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> +++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> @@ -1434,11 +1434,8 @@ static int bcm_enet_nway_reset(struct net_device *dev)
>  	struct bcm_enet_priv *priv;
>  
>  	priv = netdev_priv(dev);
> -	if (priv->has_phy) {
> -		if (!dev->phydev)
> -			return -ENODEV;
> -		return genphy_restart_aneg(dev->phydev);
> -	}
> +	if (priv->has_phy)
> +		return phy_ethtool_nway_reset(dev),
>  
>  	return -EOPNOTSUPP;
>  }
> 

^ permalink raw reply

* Re: [net PATCH 1/2] ipv4: Restore fib_trie_flush_external function and fix call ordering
From: Jiri Pirko @ 2016-11-15 19:51 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, Eric Dumazet, Jiri Pirko, davem
In-Reply-To: <20161115104606.13711.37506.stgit@ahduyck-blue-test.jf.intel.com>

Tue, Nov 15, 2016 at 11:46:06AM CET, alexander.h.duyck@intel.com wrote:
>The patch that removed the FIB offload infrastructure was a bit too
>aggressive and also removed code needed to clean up us splitting the table
>if additional rules were added.  Specifically the function
>fib_trie_flush_external was called at the end of a new rule being added to
>flush the foreign trie entries from the main trie.
>
>I updated the code so that we only call fib_trie_flush_external on the main
>table so that we flush the entries for local from main.  This way we don't
>call it for every rule change which is what was happening previously.

Well, the function was introduced by:

commit 104616e74e0b464d449fdd2ee2f547d2fad71610
Author: Scott Feldman <sfeldma@gmail.com>
Date:   Thu Mar 5 21:21:16 2015 -0800

    switchdev: don't support custom ip rules, for now
    
    Keep switchdev FIB offload model simple for now and don't allow custom ip
    rules.

Why this was not needed before? What changed in between:
104616e74e0b464d449fdd2ee2f547d2fad71610 ("switchdev: don't support custom ip rules, for now")
and
347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")


?


>
>Fixes: 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")
>Reported-by: Eric Dumazet <edumazet@google.com>
>Cc: Jiri Pirko <jiri@mellanox.com>
>Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>---
> include/net/ip_fib.h    |    1 +
> net/ipv4/fib_frontend.c |   20 +++++++++++---
> net/ipv4/fib_trie.c     |   65 +++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 81 insertions(+), 5 deletions(-)
>
>diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
>index b9314b4..f390c3b 100644
>--- a/include/net/ip_fib.h
>+++ b/include/net/ip_fib.h
>@@ -243,6 +243,7 @@ int fib_table_dump(struct fib_table *table, struct sk_buff *skb,
> 		   struct netlink_callback *cb);
> int fib_table_flush(struct net *net, struct fib_table *table);
> struct fib_table *fib_trie_unmerge(struct fib_table *main_tb);
>+void fib_table_flush_external(struct fib_table *table);
> void fib_free_table(struct fib_table *tb);
> 
> #ifndef CONFIG_IP_MULTIPLE_TABLES
>diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
>index c3b8047..161fc0f 100644
>--- a/net/ipv4/fib_frontend.c
>+++ b/net/ipv4/fib_frontend.c
>@@ -151,7 +151,7 @@ static void fib_replace_table(struct net *net, struct fib_table *old,
> 
> int fib_unmerge(struct net *net)
> {
>-	struct fib_table *old, *new;
>+	struct fib_table *old, *new, *main_table;
> 
> 	/* attempt to fetch local table if it has been allocated */
> 	old = fib_get_table(net, RT_TABLE_LOCAL);
>@@ -162,11 +162,21 @@ int fib_unmerge(struct net *net)
> 	if (!new)
> 		return -ENOMEM;
> 
>+	/* table is already unmerged */
>+	if (new == old)
>+		return 0;
>+
> 	/* replace merged table with clean table */
>-	if (new != old) {
>-		fib_replace_table(net, old, new);
>-		fib_free_table(old);
>-	}
>+	fib_replace_table(net, old, new);
>+	fib_free_table(old);
>+
>+	/* attempt to fetch main table if it has been allocated */
>+	main_table = fib_get_table(net, RT_TABLE_MAIN);
>+	if (!main_table)
>+		return 0;
>+
>+	/* flush local entries from main table */
>+	fib_table_flush_external(main_table);
> 
> 	return 0;
> }
>diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
>index 4cff74d..735edc9 100644
>--- a/net/ipv4/fib_trie.c
>+++ b/net/ipv4/fib_trie.c
>@@ -1760,6 +1760,71 @@ struct fib_table *fib_trie_unmerge(struct fib_table *oldtb)
> 	return NULL;
> }
> 
>+/* Caller must hold RTNL */
>+void fib_table_flush_external(struct fib_table *tb)
>+{
>+	struct trie *t = (struct trie *)tb->tb_data;
>+	struct key_vector *pn = t->kv;
>+	unsigned long cindex = 1;
>+	struct hlist_node *tmp;
>+	struct fib_alias *fa;
>+
>+	/* walk trie in reverse order */
>+	for (;;) {
>+		unsigned char slen = 0;
>+		struct key_vector *n;
>+
>+		if (!(cindex--)) {
>+			t_key pkey = pn->key;
>+
>+			/* cannot resize the trie vector */
>+			if (IS_TRIE(pn))
>+				break;
>+
>+			/* resize completed node */
>+			pn = resize(t, pn);
>+			cindex = get_index(pkey, pn);
>+
>+			continue;
>+		}
>+
>+		/* grab the next available node */
>+		n = get_child(pn, cindex);
>+		if (!n)
>+			continue;
>+
>+		if (IS_TNODE(n)) {
>+			/* record pn and cindex for leaf walking */
>+			pn = n;
>+			cindex = 1ul << n->bits;
>+
>+			continue;
>+		}
>+
>+		hlist_for_each_entry_safe(fa, tmp, &n->leaf, fa_list) {
>+			/* if alias was cloned to local then we just
>+			 * need to remove the local copy from main
>+			 */
>+			if (tb->tb_id != fa->tb_id) {
>+				hlist_del_rcu(&fa->fa_list);
>+				alias_free_mem_rcu(fa);
>+				continue;
>+			}
>+
>+			/* record local slen */
>+			slen = fa->fa_slen;
>+		}
>+
>+		/* update leaf slen */
>+		n->slen = slen;
>+
>+		if (hlist_empty(&n->leaf)) {
>+			put_child_root(pn, n->key, NULL);
>+			node_free(n);
>+		}
>+	}
>+}
>+
> /* Caller must hold RTNL. */
> int fib_table_flush(struct net *net, struct fib_table *tb)
> {
>

^ permalink raw reply

* Re: [PATCH] icmp: Restore resistence to abnormal messages
From: Vicente Jiménez @ 2016-11-15 19:32 UTC (permalink / raw)
  To: Florian Westphal
  Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, netdev, linux-kernel
In-Reply-To: <20161115173024.GD30581@breakpoint.cc>

I agree that both patches try to solve the same problem in a very similar way.
Florian Westphal's patch do two more things:
1- add warning with pr_warn_ratelimited. I like this idea. I also
though about adding some message but I have no kernel experience and I
preferred to have just a working solution.
2- Check if the packet size is lower than (536 + 8). I think this is
not necessary because low values (even the zero case) is already
handled by the protocol. Also I don't understand why you choose this
value, it seems to be related to TCP MSS and the compared value is IP
packet size.

Finally, both patches decrement current packet by a value: Mine by 2
and Florian's by 8 bytes. Both arbitrary values. Personally I prefer
to go by small steps. If the small step fails, it just iterate again
and with 4 iterations, my patch also decrement the original value by 8
bytes (4x2).
Basically they are the same but my patch take smaller steps and miss
the warning message.

If David Miller thinks this could be a good addition, I'll add the
warning message to my patch.

We can also discuss the amount to subtract.

On Tue, Nov 15, 2016 at 6:30 PM, Florian Westphal <fw@strlen.de> wrote:
> David Miller <davem@redhat.com> wrote:
>> From: Vicente Jiménez <googuy@gmail.com>
>> Date: Tue, 15 Nov 2016 17:49:43 +0100
>>
>> > On Mon, Nov 14, 2016 at 7:36 PM, David Miller <davem@davemloft.net> wrote:
>> >> From: Vicente Jimenez Aguilar <googuy@gmail.com>
>> >> Date: Fri, 11 Nov 2016 21:20:18 +0100
>> >>
>> >>> @@ -819,6 +820,12 @@ static bool icmp_unreach(struct sk_buff *skb)
>> >>>                               /* fall through */
>> >>>                       case 0:
>> >>>                               info = ntohs(icmph->un.frag.mtu);
>> >>> +                             /* Handle weird case where next hop MTU is
>> >>> +                              * equal to or exceeding dropped packet size
>> >>> +                              */
>> >>> +                             old_mtu = ntohs(iph->tot_len);
>> >>> +                             if (info >= old_mtu)
>> >>> +                                     info = old_mtu - 2;
>> >>
>> >> This isn't something the old code did.
>> >>
>> >> The old code behaved much differently.
>> >>
>> > I don't wanted to restore old behavior just fix a strange case that
>> > was handle by this code where the next hop MTU reported by the router
>> > is equal or greater than the actual path MTU. Because router
>> > information is wrong, we need a way to guess a good packet size
>> > ignoring router data. The simplest strategy that avoid odd numbers is
>> > reducing dropped packet size by 2.
>>
>> This whole approach seems arbitrary.
>>
>> You haven't discussed in any way, what causes this in the first place.
>> And what about that cause makes simply subtracting by 2 work well or
>> not.
>>
>> You have a very locallized, specific, situation on your end you want
>> to fix.  But we must accept changes that handle things generically and
>> in a way that would help more than just your specific case.
>
> FWIW this is similar to the patch I sent a while ago:
>
> https://patchwork.ozlabs.org/patch/493997/
>
> I think in interest of robustness principle ("eat shit and don't die")
> one of these changes should go in :-|



-- 
saludos
vicente

^ permalink raw reply

* Re: [PATCH 04/15] net: mv643xx_eth: Utilize phy_ethtool_nway_reset
From: Andrew Lunn @ 2016-11-15 19:28 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev, davem, tremyfr, Sebastian Hesselbarth, open list
In-Reply-To: <20161115180644.3941-5-f.fainelli@gmail.com>

On Tue, Nov 15, 2016 at 10:06:33AM -0800, Florian Fainelli wrote:
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* [PATCH net-next] bpf: Fix compilation warning in __bpf_lru_list_rotate_inactive
From: Martin KaFai Lau @ 2016-11-15 19:00 UTC (permalink / raw)
  To: netdev, David Miller; +Cc: Alexei Starovoitov, Daniel Borkmann, Kernel Team

gcc-6.2.1 gives the following warning:
kernel/bpf/bpf_lru_list.c: In function ‘__bpf_lru_list_rotate_inactive.isra.3’:
kernel/bpf/bpf_lru_list.c:201:28: warning: ‘next’ may be used uninitialized in this function [-Wmaybe-uninitialized]

The "next" is currently initialized in the while() loop which must have >=1
iterations.

This patch initializes next to get rid of the compiler warning.

Fixes: 3a08c2fd7634 ("bpf: LRU List")
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 kernel/bpf/bpf_lru_list.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c
index bfebff0..89b7ef4 100644
--- a/kernel/bpf/bpf_lru_list.c
+++ b/kernel/bpf/bpf_lru_list.c
@@ -170,7 +170,7 @@ static void __bpf_lru_list_rotate_inactive(struct bpf_lru *lru,
 					   struct bpf_lru_list *l)
 {
 	struct list_head *inactive = &l->lists[BPF_LRU_LIST_T_INACTIVE];
-	struct list_head *cur, *next, *last;
+	struct list_head *cur, *last, *next = inactive;
 	struct bpf_lru_node *node;
 	unsigned int i = 0;
 
-- 
2.5.1

^ permalink raw reply related

* [PATCH net-next 0/5] net: Implenent ethtool::nway_reset for a few drivers
From: Florian Fainelli @ 2016-11-15 19:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, andrew, tremyfr, Florian Fainelli

Hi all,

This patch series depends on "net: phy: Centralize auto-negotation restart"
since it provides phy_ethtool_nway_reset as a helper function.

The drivers here already support PHYLIB, so there really is no reason why
restarting auto-negotiation would not be possible with these.

Florian Fainelli (5):
  net: stmmac: Implement ethtool::nway_reset
  net: ethoc: Implement ethtool::nway_reset
  net: ethernet: mvneta: Implement ethtool::nway_reset
  net: ethernet: mvpp2: Implement ethtool::nway_reset
  net: ethernet: marvell: pxa168_eth: Implement ethtool::nway_reset

 drivers/net/ethernet/ethoc.c                         | 1 +
 drivers/net/ethernet/marvell/mvneta.c                | 1 +
 drivers/net/ethernet/marvell/mvpp2.c                 | 1 +
 drivers/net/ethernet/marvell/pxa168_eth.c            | 1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 1 +
 5 files changed, 5 insertions(+)

-- 
2.9.3

^ permalink raw reply

* [PATCH net-next 5/5] net: ethernet: marvell: pxa168_eth: Implement ethtool::nway_reset
From: Florian Fainelli @ 2016-11-15 19:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, tremyfr, Florian Fainelli, Sergei Shtylyov,
	Jisheng Zhang, Peter Chen, Jarod Wilson, Florian Westphal,
	open list
In-Reply-To: <20161115191949.15361-1-f.fainelli@gmail.com>

Implement ethtool::nway_reset using phy_ethtool_nway_reset. We are
already using dev->phydev all over the place so this comes for free.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/marvell/pxa168_eth.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/marvell/pxa168_eth.c b/drivers/net/ethernet/marvell/pxa168_eth.c
index b78a838f306c..3af2814ada23 100644
--- a/drivers/net/ethernet/marvell/pxa168_eth.c
+++ b/drivers/net/ethernet/marvell/pxa168_eth.c
@@ -1393,6 +1393,7 @@ static void pxa168_get_drvinfo(struct net_device *dev,
 
 static const struct ethtool_ops pxa168_ethtool_ops = {
 	.get_drvinfo	= pxa168_get_drvinfo,
+	.nway_reset	= phy_ethtool_nway_reset,
 	.get_link	= ethtool_op_get_link,
 	.get_ts_info	= ethtool_op_get_ts_info,
 	.get_link_ksettings = pxa168_get_link_ksettings,
-- 
2.9.3

^ permalink raw reply related

* [PATCH net-next 4/5] net: ethernet: mvpp2: Implement ethtool::nway_reset
From: Florian Fainelli @ 2016-11-15 19:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, tremyfr, Florian Fainelli, Jisheng Zhang,
	Marcin Wojtas, Jarod Wilson, Peter Chen, open list
In-Reply-To: <20161115191949.15361-1-f.fainelli@gmail.com>

Implement ethtool::nway_reset using phy_ethtool_nway_reset. We are
already using dev->phydev all over the place so this comes for free.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index c8bf155cad94..ee857868bea5 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -5923,6 +5923,7 @@ static const struct net_device_ops mvpp2_netdev_ops = {
 };
 
 static const struct ethtool_ops mvpp2_eth_tool_ops = {
+	.nway_reset	= phy_ethtool_nway_reset,
 	.get_link	= ethtool_op_get_link,
 	.set_coalesce	= mvpp2_ethtool_set_coalesce,
 	.get_coalesce	= mvpp2_ethtool_get_coalesce,
-- 
2.9.3

^ permalink raw reply related

* [PATCH net-next 3/5] net: ethernet: mvneta: Implement ethtool::nway_reset
From: Florian Fainelli @ 2016-11-15 19:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, tremyfr, Florian Fainelli, Thomas Petazzoni,
	open list
In-Reply-To: <20161115191949.15361-1-f.fainelli@gmail.com>

Implement ethtool::nway_reset using phy_ethtool_nway_reset. We are
already using dev->phydev all over the place so this comes for free.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index b85819ea8eea..87274d4ab102 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3811,6 +3811,7 @@ static const struct net_device_ops mvneta_netdev_ops = {
 };
 
 const struct ethtool_ops mvneta_eth_tool_ops = {
+	.nway_reset	= phy_ethtool_nway_reset,
 	.get_link       = ethtool_op_get_link,
 	.set_coalesce   = mvneta_ethtool_set_coalesce,
 	.get_coalesce   = mvneta_ethtool_get_coalesce,
-- 
2.9.3

^ permalink raw reply related

* [PATCH net-next 2/5] net: ethoc: Implement ethtool::nway_reset
From: Florian Fainelli @ 2016-11-15 19:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, tremyfr, Florian Fainelli, Tobias Klauser,
	Colin Ian King, open list
In-Reply-To: <20161115191949.15361-1-f.fainelli@gmail.com>

Implement ethtool::nway_reset using phy_ethtool_nway_reset. We are
already using dev->phydev all over the place so this comes for free.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/ethoc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
index c044667a0a25..6456c180114b 100644
--- a/drivers/net/ethernet/ethoc.c
+++ b/drivers/net/ethernet/ethoc.c
@@ -966,6 +966,7 @@ static int ethoc_set_ringparam(struct net_device *dev,
 const struct ethtool_ops ethoc_ethtool_ops = {
 	.get_regs_len = ethoc_get_regs_len,
 	.get_regs = ethoc_get_regs,
+	.nway_reset = phy_ethtool_nway_reset,
 	.get_link = ethtool_op_get_link,
 	.get_ringparam = ethoc_get_ringparam,
 	.set_ringparam = ethoc_set_ringparam,
-- 
2.9.3

^ permalink raw reply related

* [PATCH net-next 1/5] net: stmmac: Implement ethtool::nway_reset
From: Florian Fainelli @ 2016-11-15 19:19 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, tremyfr, Florian Fainelli, Giuseppe Cavallaro,
	Alexandre Torgue, open list
In-Reply-To: <20161115191949.15361-1-f.fainelli@gmail.com>

Utilize the generic phy_ethtool_nway_reset() helper function to
implement an autonegotiation restart.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 3fe9340b748f..0290d52330da 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -870,6 +870,7 @@ static const struct ethtool_ops stmmac_ethtool_ops = {
 	.get_regs = stmmac_ethtool_gregs,
 	.get_regs_len = stmmac_ethtool_get_regs_len,
 	.get_link = ethtool_op_get_link,
+	.nway_reset = phy_ethtool_nway_reset,
 	.get_pauseparam = stmmac_get_pauseparam,
 	.set_pauseparam = stmmac_set_pauseparam,
 	.get_ethtool_stats = stmmac_get_ethtool_stats,
-- 
2.9.3

^ permalink raw reply related

* Re: [ovs-dev] [PATCH net-next v13 0/8] openvswitch: support for layer 3 encapsulated packets
From: Thadeu Lima de Souza Cascardo @ 2016-11-15 19:01 UTC (permalink / raw)
  To: Yang, Yi Y, Jiri Benc, netdev@vger.kernel.org
  Cc: dev@openvswitch.org, Simon Horman, egarver
In-Reply-To: <79BBBFE6CB6C9B488C1A45ACD284F5195F0EBFEA@SHSMSX103.ccr.corp.intel.com>

On November 15, 2016 11:57:21 AM GMT-02:00, "Yang, Yi Y" <yi.y.yang@intel.com> wrote:
>Hi, Jiri
>
>I'm very glad to see you're continuing this work :-), I asked Simon
>about this twice, but nobody replies. I also remember Cascardo has a
>patch set to collaborate with this patch set, I asked Cascardo, but
>nobody responds, will you continue to do Cascardo's " create tunnel
>devices using rtnetlink interface" patch set? I test the old one v3,
>that can work with vxlan module in kernel, but if I build ovs with
>option " --with-linux=/lib/modules/`uname -r`/build", ovs vxlan module
>is built in vport_vxlan module, when I create vxlan-gpe port, kernel
>will automatically load vxlan module in the kernel instead of using the
>APIs in vport_vxlan module. 
>
>Cascardo, are you still working on this?
>
>-----Original Message-----
>From: netdev-owner@vger.kernel.org
>[mailto:netdev-owner@vger.kernel.org] On Behalf Of Jiri Benc
>Sent: Thursday, November 10, 2016 11:28 PM
>To: netdev@vger.kernel.org
>Cc: dev@openvswitch.org; Pravin Shelar <pshelar@ovn.org>; Lorand Jakab
><lojakab@cisco.com>; Simon Horman <simon.horman@netronome.com>
>Subject: [PATCH net-next v13 0/8] openvswitch: support for layer 3
>encapsulated packets
>
>At the core of this patch set is removing the assumption in Open
>vSwitch datapath that all packets have Ethernet header.
>
>The implementation relies on the presence of pop_eth and push_eth
>actions in datapath flows to facilitate adding and removing Ethernet
>headers as appropriate. The construction of such flows is left up to
>user-space.
>
>This series is based on work by Simon Horman, Lorand Jakab, Thomas
>Morin and others. I kept Lorand's and Simon's s-o-b in the patches that
>are derived from v11 to record their authorship of parts of the code.
>
>Changes from v12 to v13:
>
>* Addressed Pravin's feedback.
>* Removed the GRE vport conversion patch; L3 GRE ports should be
>created by
>  rtnetlink instead.
>
>Main changes from v11 to v12:
>
>* The patches were restructured and split differently for easier
>review.
>* They were rebased and adjusted to the current net-next. Especially
>MPLS
>handling is different (and easier) thanks to the recent MPLS GSO
>rework.
>* Several bugs were discovered and fixed. The most notable is fragment
>handling: header adjustment for ARPHRD_NONE devices on tx needs to be
>done
>after refragmentation, not before it. This required significant changes
>in
>the patchset. Another one is stricter checking of attributes (match on
>L2
>  vs. L3 packet) at the kernel level.
>* Instead of is_layer3 bool, a mac_proto field is used.
>
>Jiri Benc (8):
>  openvswitch: use hard_header_len instead of hardcoded ETH_HLEN
>  openvswitch: add mac_proto field to the flow key
>  openvswitch: pass mac_proto to ovs_vport_send
>  openvswitch: support MPLS push and pop for L3 packets
>  openvswitch: add processing of L3 packets
>  openvswitch: netlink: support L3 packets
>  openvswitch: add Ethernet push and pop actions
>  openvswitch: allow L3 netdev ports
>
> include/uapi/linux/openvswitch.h |  15 ++++
> net/openvswitch/actions.c        | 111 +++++++++++++++++-------
> net/openvswitch/datapath.c       |  13 +--
> net/openvswitch/flow.c           | 105 +++++++++++++++++------
> net/openvswitch/flow.h           |  22 +++++
>net/openvswitch/flow_netlink.c   | 179
>++++++++++++++++++++++++++-------------
> net/openvswitch/vport-netdev.c   |   9 +-
> net/openvswitch/vport.c          |  31 +++++--
> net/openvswitch/vport.h          |   2 +-
> 9 files changed, 353 insertions(+), 134 deletions(-)
>
>--
>1.8.3.1
>
>_______________________________________________
>dev mailing list
>dev@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Hi.

I am still working on this. Just see my recent discussion with Pravin about the way to support out of tree drivers. If you have any opinion on that, please share on that thread, preferably with a patch. It's likely that Eric Garver will take this task as he was already working with me.

Cascardo.

^ permalink raw reply

* Re: [PATCH net] rtnetlink: fix rtnl message size computation for XDP
From: Brenden Blanco @ 2016-11-15 19:05 UTC (permalink / raw)
  To: Sabrina Dubroca; +Cc: netdev
In-Reply-To: <0169d2419991a4ba533087a4fda70420ccdaca30.1479204870.git.sd@queasysnail.net>

On Tue, Nov 15, 2016 at 11:16:35AM +0100, Sabrina Dubroca wrote:
> rtnl_xdp_size() only considers the size of the actual payload attribute,
> and misses the space taken by the attribute used for nesting (IFLA_XDP).
> 
> Fixes: d1fdd9138682 ("rtnl: add option for setting link xdp prog")
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Brenden Blanco <bblanco@plumgrid.com>
> ---
>  net/core/rtnetlink.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 5aeb61aadd32..5b4c3ba507d0 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -901,7 +901,8 @@ static size_t rtnl_port_size(const struct net_device *dev,
>  
>  static size_t rtnl_xdp_size(const struct net_device *dev)
>  {
> -	size_t xdp_size = nla_total_size(1);	/* XDP_ATTACHED */
> +	size_t xdp_size = nla_total_size(0) +	/* nest IFLA_XDP */
> +			  nla_total_size(1);	/* XDP_ATTACHED */
>  
>  	if (!dev->netdev_ops->ndo_xdp)
>  		return 0;
> -- 
> 2.10.2
> 

^ permalink raw reply

* Re: linux-next: net->netns_ids is used after calling idr_destroy for it
From: Cong Wang @ 2016-11-15 18:50 UTC (permalink / raw)
  To: Andrei Vagin; +Cc: Nicolas Dichtel, Linux Kernel Network Developers
In-Reply-To: <CAM_iQpUQOCsVkaRPU3OkmNvyOLDtmRRK1G3uzvi9URSVEbapJQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 637 bytes --]

On Tue, Nov 15, 2016 at 10:04 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Mon, Nov 14, 2016 at 10:23 PM, Andrei Vagin <avagin@gmail.com> wrote:
>> Hi Nicolas,
>>
>> cleanup_net() calls idr_destroy(net->netns_ids) for network namespaces
>> and then it calls unregister_netdevice_many() which calls
>> idr_alloc(net0>netns_ids). It looks wrong, doesn't it?
>>
>
> netns id is designed to allocate lazily, but yeah it makes no sense
> to allocate id for the netns being destroyed, not to mention idr is freed.
>
> I will send a patch.

Could you try the attached patch? I just did some quick netns creation/destroy
tests.

Thanks!

[-- Attachment #2: unregister-netdevice.diff --]
[-- Type: text/plain, Size: 2296 bytes --]

diff --git a/net/core/dev.c b/net/core/dev.c
index 6deba68..f9a2969 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6685,7 +6685,7 @@ static void net_set_todo(struct net_device *dev)
 	dev_net(dev)->dev_unreg_count++;
 }
 
-static void rollback_registered_many(struct list_head *head)
+static void rollback_registered_many(struct list_head *head, bool send_rtmsg)
 {
 	struct net_device *dev, *tmp;
 	LIST_HEAD(close_head);
@@ -6737,8 +6737,8 @@ static void rollback_registered_many(struct list_head *head)
 		*/
 		call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
 
-		if (!dev->rtnl_link_ops ||
-		    dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
+		if ((!dev->rtnl_link_ops ||
+		    dev->rtnl_link_state == RTNL_LINK_INITIALIZED) && send_rtmsg)
 			skb = rtmsg_ifinfo_build_skb(RTM_DELLINK, dev, ~0U,
 						     GFP_KERNEL);
 
@@ -6777,7 +6777,7 @@ static void rollback_registered(struct net_device *dev)
 	LIST_HEAD(single);
 
 	list_add(&dev->unreg_list, &single);
-	rollback_registered_many(&single);
+	rollback_registered_many(&single, true);
 	list_del(&single);
 }
 
@@ -7769,6 +7769,18 @@ void unregister_netdevice_queue(struct net_device *dev, struct list_head *head)
 }
 EXPORT_SYMBOL(unregister_netdevice_queue);
 
+static void __unregister_netdevice_many(struct list_head *head, bool send_rtmsg)
+{
+	struct net_device *dev;
+
+	if (!list_empty(head)) {
+		rollback_registered_many(head, send_rtmsg);
+		list_for_each_entry(dev, head, unreg_list)
+			net_set_todo(dev);
+		list_del(head);
+	}
+}
+
 /**
  *	unregister_netdevice_many - unregister many devices
  *	@head: list of devices
@@ -7778,14 +7790,7 @@ EXPORT_SYMBOL(unregister_netdevice_queue);
  */
 void unregister_netdevice_many(struct list_head *head)
 {
-	struct net_device *dev;
-
-	if (!list_empty(head)) {
-		rollback_registered_many(head);
-		list_for_each_entry(dev, head, unreg_list)
-			net_set_todo(dev);
-		list_del(head);
-	}
+	__unregister_netdevice_many(head, true);
 }
 EXPORT_SYMBOL(unregister_netdevice_many);
 
@@ -8239,7 +8244,7 @@ static void __net_exit default_device_exit_batch(struct list_head *net_list)
 				unregister_netdevice_queue(dev, &dev_kill_list);
 		}
 	}
-	unregister_netdevice_many(&dev_kill_list);
+	__unregister_netdevice_many(&dev_kill_list, false);
 	rtnl_unlock();
 }
 

^ permalink raw reply related

* Re: [PATCH v2 iproute2 1/1] tc: print raw qdisc handle.
From: Stephen Hemminger @ 2016-11-15 18:30 UTC (permalink / raw)
  To: Roman Mashak; +Cc: netdev, jhs
In-Reply-To: <1479164360-24188-1-git-send-email-mrv@mojatatu.com>

On Mon, 14 Nov 2016 17:59:20 -0500
Roman Mashak <mrv@mojatatu.com> wrote:

> This is v2 patch with fixed code indentation.
> 
> Signed-off-by: Roman Mashak <mrv@mojatatu.com>
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

Applied, but inverted the if() since it reads better if code
is:
   if (show_raw)
	...
   else
	...

^ permalink raw reply

* Re: [LKP] [net] 2ab9fb18c4: kernel BUG at include/linux/skbuff.h:1935!
From: Alexander Duyck @ 2016-11-15 18:30 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Ye Xiaolong, Eric Dumazet, Alexander Duyck, Willem de Bruijn,
	netdev, Alexei Starovoitov, Jojy Varghese, Tom Herbert,
	Yibin Yang, LKP, David Miller
In-Reply-To: <20161114031130.yjb4anou24ede4ue@wfg-t540p.sh.intel.com>

On Sun, Nov 13, 2016 at 7:11 PM, Fengguang Wu <fengguang.wu@intel.com> wrote:
>>> Hi guys.
>>>
>>> I took a look at the commit again and I do not see how this can happen.
>>>
>>> Are you sure patch was properly applied ?
>>>
>>> In particular, the following extract is obscure for me :
>>>
>>>
>>>> https://github.com/0day-ci/linux
>>>> Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839
>>>> commit 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb ("net:
>>>> __skb_flow_dissect() must cap its return value")
>>>>
>>
>> Hi,
>>
>> The above two lines means 0day repo setup a new branch
>>
>> "Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839"
>> which is based on net/master, then applied you patch on top of it,
>> commit id is 2ab9fb18c46b91b16a0f0f329336d3be9fc32deb.
>
>
> Xiaolong, it may be more helpful to show the base tree where we apply
> the patch to. And the final url:
>
> https://github.com/0day-ci/linux/tree/Eric-Dumazet/net-__skb_flow_dissect-must-cap-its-return-value/20161110-080839
>
> Thanks,
> Fengguang

Is there any chance you guys could send me the net/ethernet/eth.o and
drivers/net/ethernet/intel/igb/igb.o files for this build?  I'm still
not able to reproduce but I have a thought that this might be some
sort of compiler issue.

Specifically looking over the call trace it looks like we only pulled
8 bytes of the header instead of 14.  This is based on the values of
RAX and R15 differing by 6 which I believe represent skb->len and
skb->data_len  However eth_get_headlen is supposed to be using
sizeof(*eth) which should be 14.  I'm wondering if there is an issue
due to it being a packed structure that is resulting in 8 being
reported instead of 14.

Thanks.

- Alex

^ permalink raw reply

* Re: [net PATCH 1/2] ipv4: Restore fib_trie_flush_external function and fix call ordering
From: Eric Dumazet @ 2016-11-15 18:30 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, Eric Dumazet, Jiri Pirko, davem
In-Reply-To: <20161115104606.13711.37506.stgit@ahduyck-blue-test.jf.intel.com>

On Tue, 2016-11-15 at 05:46 -0500, Alexander Duyck wrote:
> The patch that removed the FIB offload infrastructure was a bit too
> aggressive and also removed code needed to clean up us splitting the table
> if additional rules were added.  Specifically the function
> fib_trie_flush_external was called at the end of a new rule being added to
> flush the foreign trie entries from the main trie.
> 
> I updated the code so that we only call fib_trie_flush_external on the main
> table so that we flush the entries for local from main.  This way we don't
> call it for every rule change which is what was happening previously.
> 
> Fixes: 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")
> Reported-by: Eric Dumazet <edumazet@google.com>
> Cc: Jiri Pirko <jiri@mellanox.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---


Acked-by: Eric Dumazet <edumazet@google.com>

Thanks for sorting this out Alex !

^ permalink raw reply

* Re: [net PATCH 2/2] ipv4: Fix memory leak in exception case for splitting tries
From: Eric Dumazet @ 2016-11-15 18:24 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, Eric Dumazet, davem
In-Reply-To: <20161115104612.13711.44592.stgit@ahduyck-blue-test.jf.intel.com>

On Tue, 2016-11-15 at 05:46 -0500, Alexander Duyck wrote:
> Fix a small memory leak that can occur where we leak a fib_alias in the
> event of us not being able to insert it into the local table.
> 
> Fixes: 0ddcf43d5d4a0 ("ipv4: FIB Local/MAIN table collapse")
> Reported-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>  net/ipv4/fib_trie.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox