Netdev List


* Re: [net-next, v2, 1/2] net: stmmac: Rework coalesce timer and fix multi-queue races
From: Neil Armstrong @ 2018-09-10 15:49 UTC (permalink / raw)
  To: Jose Abreu, netdev
  Cc: Jerome Brunet, Martin Blumenstingl, David S. Miller, Joao Pinto,
	Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <9283e03e-5167-0d04-5ad9-59593a46fa8a@synopsys.com>

Hi Jose,

On 10/09/2018 16:44, Jose Abreu wrote:
> On 10-09-2018 14:46, Neil Armstrong wrote:
>> hi Jose,
>>
>> On 10/09/2018 14:55, Jose Abreu wrote:
>>> On 10-09-2018 13:52, Jose Abreu wrote:
>>>> Can you please try attached follow-up patch ? 
>>> Oh, please apply the whole series otherwise this will not apply
>>> cleanly.
>> Indeed, it helps!
>>
>> With the fixups, it fails later, around 15s instead of 3, in RX and TX.
> 
> Thanks for testing Neil. What if we keep rearming the timer
> whilst there are pending packets ? Something like in the attach.
> (applies on top of previous one).

It fixes RX, but TX fails after ~13s.

Neil

> 
> Thanks and Best Regards,
> Jose Miguel Abreu
> 

* Re: [PATCH 2/2] erspan: fix error handling for erspan tunnel
From: William Tu @ 2018-09-10 15:52 UTC (permalink / raw)
  To: Haishuang Yan
  Cc: David Miller, Alexey Kuznetsov, Linux Kernel Network Developers,
	LKML
In-Reply-To: <1536589188-27550-2-git-send-email-yanhaishuang@cmss.chinamobile.com>

On Mon, Sep 10, 2018 at 7:20 AM Haishuang Yan
<yanhaishuang@cmss.chinamobile.com> wrote:
>
> When processing icmp unreachable message for erspan tunnel, tunnel id
> should be erspan_net_id instead of ipgre_net_id.
>
> Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
> Cc: William Tu <u9012063@gmail.com>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
> ---

Thanks for the fix.
Acked-by: William Tu <u9012063@gmail.com>

* Re: [PATCH 1/2] erspan: return PACKET_REJECT when the appropriate tunnel is not found
From: William Tu @ 2018-09-10 15:53 UTC (permalink / raw)
  To: Haishuang Yan
  Cc: David Miller, Alexey Kuznetsov, Linux Kernel Network Developers,
	LKML
In-Reply-To: <1536589188-27550-1-git-send-email-yanhaishuang@cmss.chinamobile.com>

On Mon, Sep 10, 2018 at 7:20 AM Haishuang Yan
<yanhaishuang@cmss.chinamobile.com> wrote:
>
> If erspan tunnel hasn't been established, we'd better send icmp port
> unreachable message after receive erspan packets.
>
> Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
> Cc: William Tu <u9012063@gmail.com>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
> ---

LGTM.
Acked-by: William Tu <u9012063@gmail.com>

* Corrupted sit-tunnelled packets when using skb_gso_segment() on an IFB interface?
From: Toke Høiland-Jørgensen @ 2018-09-10 16:04 UTC (permalink / raw)
  To: netdev; +Cc: cake

Hi everyone

While investigating a bug report on CAKE[0], I've run into the following
behaviour:

When running CAKE as an ingress shaper on an IFB interface, if the GSO
splitting feature is turned on, TCP throughput will drop dramatically on
6in4 (sit) tunnels running over the interface in question. Looking at a
traffic dump, I'm seeing ~15% packet loss on the encapsulated TCP
stream.

IPv4 traffic is fine on the same interface, as is native IPv6 traffic.
And turning off GSO splitting in CAKE makes the packet loss go away. The
issue only seems to appear on IFB interfaces. So I'm wondering if there
is some interaction that corrupts packets when they are being split in
this configuration?

Steps to reproduce (assuming the box you are running on has IP 10.0.0.2
on eth0, and has a peer at 10.0.0.1 with a suitably configured sit
tunnel):

# modprobe ifb
# ip link set dev ifb0 up
# tc qdisc add dev eth0 handle ffff: ingress
# tc filter add dev eth0 parent ffff: protocol all prio 10 matchall action mirred egress redirect dev ifb0
# tc qdisc replace dev ifb0 root cake
# ip link add type sit local 10.0.0.2 remote 10.0.0.1
# ip link set dev sit1 up
# netperf -H fe80::a00:1%sit1 -t TCP_MAERTS

Whereas, in the same setup, this will work fine:

# netperf -H 10.0.0.1 -t TCP_MAERTS

As will this:

# tc qdisc replace dev ifb0 root cake no-split-gso
# netperf -H fe80::a00:1%sit1 -t TCP_MAERTS

Does anyone have any ideas? :)

-Toke

[0] https://github.com/tohojo/sqm-scripts/issues/72

* [PATCH net-next] net/ipv6: Remove rt6i_prefsrc
From: dsahern @ 2018-09-10 16:11 UTC (permalink / raw)
  To: netdev; +Cc: lucien.xin, David Ahern

From: David Ahern <dsahern@gmail.com>

After the conversion to fib6_info, rt6i_prefsrc has a single user that
reads the value and otherwise it is only set. The one reader can be
converted to use rt->from so rt6i_prefsrc can be removed, reducing
rt6_info by another 20 bytes.

Signed-off-by: David Ahern <dsahern@gmail.com>
---
 drivers/scsi/cxgbi/libcxgbi.c |  4 ++--
 include/net/ip6_fib.h         |  1 -
 net/ipv6/route.c              | 27 ---------------------------
 3 files changed, 2 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index 3f3af5e74a07..6b3ea50c594e 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -784,7 +784,7 @@ cxgbi_check_route6(struct sockaddr *dst_addr, int ifindex)
 	csk->mtu = mtu;
 	csk->dst = dst;
 
-	if (ipv6_addr_any(&rt->rt6i_prefsrc.addr)) {
+	if (!rt->from || ipv6_addr_any(&rt->from->fib6_prefsrc.addr)) {
 		struct inet6_dev *idev = ip6_dst_idev((struct dst_entry *)rt);
 
 		err = ipv6_dev_get_saddr(&init_net, idev ? idev->dev : NULL,
@@ -795,7 +795,7 @@ cxgbi_check_route6(struct sockaddr *dst_addr, int ifindex)
 			goto rel_rt;
 		}
 	} else {
-		pref_saddr = rt->rt6i_prefsrc.addr;
+		pref_saddr = rt->from->fib6_prefsrc.addr;
 	}
 
 	csk->csk_family = AF_INET6;
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 3d4930528db0..c7496663f99a 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -182,7 +182,6 @@ struct rt6_info {
 	struct in6_addr			rt6i_gateway;
 	struct inet6_dev		*rt6i_idev;
 	u32				rt6i_flags;
-	struct rt6key			rt6i_prefsrc;
 
 	struct list_head		rt6i_uncached;
 	struct uncached_list		*rt6i_uncached_list;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 18e00ce1719a..41f04b966008 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -995,7 +995,6 @@ static void ip6_rt_copy_init(struct rt6_info *rt, struct fib6_info *ort)
 #ifdef CONFIG_IPV6_SUBTREES
 	rt->rt6i_src = ort->fib6_src;
 #endif
-	rt->rt6i_prefsrc = ort->fib6_prefsrc;
 }
 
 static struct fib6_node* fib6_backtrack(struct fib6_node *fn,
@@ -1449,11 +1448,6 @@ static int rt6_insert_exception(struct rt6_info *nrt,
 	if (ort->fib6_src.plen)
 		src_key = &nrt->rt6i_src.addr;
 #endif
-
-	/* Update rt6i_prefsrc as it could be changed
-	 * in rt6_remove_prefsrc()
-	 */
-	nrt->rt6i_prefsrc = ort->fib6_prefsrc;
 	/* rt6_mtu_change() might lower mtu on ort.
 	 * Only insert this exception route if its mtu
 	 * is less than ort's mtu value.
@@ -1635,25 +1629,6 @@ static void rt6_update_exception_stamp_rt(struct rt6_info *rt)
 	rcu_read_unlock();
 }
 
-static void rt6_exceptions_remove_prefsrc(struct fib6_info *rt)
-{
-	struct rt6_exception_bucket *bucket;
-	struct rt6_exception *rt6_ex;
-	int i;
-
-	bucket = rcu_dereference_protected(rt->rt6i_exception_bucket,
-					lockdep_is_held(&rt6_exception_lock));
-
-	if (bucket) {
-		for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) {
-			hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) {
-				rt6_ex->rt6i->rt6i_prefsrc.plen = 0;
-			}
-			bucket++;
-		}
-	}
-}
-
 static bool rt6_mtu_change_route_allowed(struct inet6_dev *idev,
 					 struct rt6_info *rt, int mtu)
 {
@@ -3795,8 +3770,6 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg)
 		spin_lock_bh(&rt6_exception_lock);
 		/* remove prefsrc entry */
 		rt->fib6_prefsrc.plen = 0;
-		/* need to update cache as well */
-		rt6_exceptions_remove_prefsrc(rt);
 		spin_unlock_bh(&rt6_exception_lock);
 	}
 	return 0;
-- 
2.11.0
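
The converted reader in cxgbi_check_route6() follows a guard-then-fallback
shape: use the preferred source stored on the originating FIB entry when
there is one, otherwise fall back to an address chosen some other way. A
minimal userspace sketch of that shape is below. The names
(addr_sketch, fib_entry_sketch, route_sketch, pick_saddr) are hypothetical
stand-ins, not kernel types, and the sketch does not model whatever
locking or RCU rules apply to the real rt->from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct addr_sketch { unsigned char b[16]; };
struct fib_entry_sketch { struct addr_sketch prefsrc; };
struct route_sketch { struct fib_entry_sketch *from; /* may be NULL */ };

/* True when the address is all zeroes (the "any" address). */
static bool addr_is_any(const struct addr_sketch *a)
{
	static const struct addr_sketch zero;

	return memcmp(a, &zero, sizeof(zero)) == 0;
}

/* Prefer the FIB entry's prefsrc; use the fallback when there is no
 * originating entry or its prefsrc is unset.
 */
static struct addr_sketch pick_saddr(const struct route_sketch *rt,
				     struct addr_sketch fallback)
{
	if (!rt->from || addr_is_any(&rt->from->prefsrc))
		return fallback;
	return rt->from->prefsrc;
}
```

The NULL test comes first so that the second operand never dereferences a
missing entry, which is the same ordering the patch uses.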

* Re: [PATCH net] ipv6: use rt6_info members when dst is set in rt6_fill_node
From: David Ahern @ 2018-09-10 16:13 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, davem, Roopa Prabhu
In-Reply-To: <CADvbK_fgFM+VZ=kew4QkuM1xP90T2rWetXo3Awu48AEjJ+nvkg@mail.gmail.com>

On 9/9/18 12:29 AM, Xin Long wrote:
>>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>>> index 18e00ce..e554922 100644
>>> --- a/net/ipv6/route.c
>>> +++ b/net/ipv6/route.c
>>> @@ -4670,20 +4670,33 @@ static int rt6_fill_node(struct net *net, struct sk_buff *skb,
>>>                        int iif, int type, u32 portid, u32 seq,
>>>                        unsigned int flags)
>>>  {
>>> -     struct rtmsg *rtm;
>>> +     struct rt6key *fib6_prefsrc, *fib6_dst, *fib6_src;
>>> +     struct rt6_info *rt6 = (struct rt6_info *)dst;
>>> +     u32 *pmetrics, table, fib6_flags;
>>>       struct nlmsghdr *nlh;
>>> +     struct rtmsg *rtm;
>>>       long expires = 0;
>>> -     u32 *pmetrics;
>>> -     u32 table;
>>>
>>>       nlh = nlmsg_put(skb, portid, seq, type, sizeof(*rtm), flags);
>>>       if (!nlh)
>>>               return -EMSGSIZE;
>>>
>>> +     if (rt6) {
>>> +             fib6_dst = &rt6->rt6i_dst;
>>> +             fib6_src = &rt6->rt6i_src;
>>> +             fib6_flags = rt6->rt6i_flags;
>>> +             fib6_prefsrc = &rt6->rt6i_prefsrc;
>>> +     } else {
>>> +             fib6_dst = &rt->fib6_dst;
>>> +             fib6_src = &rt->fib6_src;
>>> +             fib6_flags = rt->fib6_flags;
>>> +             fib6_prefsrc = &rt->fib6_prefsrc;
>>> +     }
>>
>> Unless I am missing something at the moment, an rt6_info can only have
>> the same dst, src and prefsrc as the fib6_info on which it is based.
>> Thus, only the flags is needed above. That simplifies this patch a lot.
> If dst, src and prefsrc in rt6_info are always the same as these in fib6_info,
> why do we need them in rt6_info? we could just get it by 'from'.
> 

I just sent a patch removing rt6i_prefsrc. It is set with only 1 reader
that can be converted.

rt6i_src is checked against the fib6_info to invalidate a dst if the src
has changed, so a valid rt will always have the same rt6i_src as the
rt->from.

rt6i_dst is set to the dest address / 128 in some cases, so it should be
used for rt6_info cases above.

* Re: [net-next, v2, 1/2] net: stmmac: Rework coalesce timer and fix multi-queue races
From: Jose Abreu @ 2018-09-10 16:21 UTC (permalink / raw)
  To: Neil Armstrong, Jose Abreu, netdev
  Cc: Jerome Brunet, Martin Blumenstingl, David S. Miller, Joao Pinto,
	Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <d190300c-269b-1bae-569e-f09e782e49cc@baylibre.com>

[-- Attachment #1: Type: text/plain, Size: 930 bytes --]

On 10-09-2018 16:49, Neil Armstrong wrote:
> Hi Jose,
>
> On 10/09/2018 16:44, Jose Abreu wrote:
>> On 10-09-2018 14:46, Neil Armstrong wrote:
>>> hi Jose,
>>>
>>> On 10/09/2018 14:55, Jose Abreu wrote:
>>>> On 10-09-2018 13:52, Jose Abreu wrote:
>>>>> Can you please try attached follow-up patch ? 
>>>> Oh, please apply the whole series otherwise this will not apply
>>>> cleanly.
>>> Indeed, it helps!
>>>
>>> With the fixups, it fails later, around 15s instead of 3, in RX and TX.
>> Thanks for testing Neil. What if we keep rearming the timer
>> whilst there are pending packets ? Something like in the attach.
>> (applies on top of previous one).
> It fixes RX, but TX fails after ~13s.

Ok :(

Can you please try the attached follow-up patch?

I'm so sorry about this back and forth, and I appreciate all your
help.

Thanks and Best Regards,
Jose Miguel Abreu


>
> Neil
>
>> Thanks and Best Regards,
>> Jose Miguel Abreu
>>


[-- Attachment #2: 0001-fixup_coalesce_3.patch --]
[-- Type: text/x-patch, Size: 3315 bytes --]

From 4f2ba5fca6c8858cfe640f3d466fd01904c451e3 Mon Sep 17 00:00:00 2001
Message-Id: <4f2ba5fca6c8858cfe640f3d466fd01904c451e3.1536596296.git.joabreu@synopsys.com>
From: Jose Abreu <joabreu@synopsys.com>
Date: Mon, 10 Sep 2018 18:18:10 +0200
Subject: [PATCH] fixup_coalesce_3

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 25 ++++++-----------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 76a6196b3263..f6587ee372ab 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2245,11 +2245,7 @@ static void stmmac_tx_timer_arm(struct stmmac_priv *priv, u32 queue)
 {
 	struct stmmac_tx_queue *tx_q = &priv->tx_queue[queue];
 
-	if (tx_q->tx_timer_active)
-		return;
-
 	mod_timer(&tx_q->txtimer, STMMAC_COAL_TIMER(priv->tx_coal_timer));
-	tx_q->tx_timer_active = true;
 }
 
 /**
@@ -2264,10 +2260,7 @@ static void stmmac_tx_timer(struct timer_list *t)
 	struct stmmac_priv *priv = tx_q->priv_data;
 	bool more;
 
-	tx_q->tx_timer_active = false;
 	stmmac_tx_clean(priv, ~0, tx_q->queue_index, &more);
-	if (more)
-		stmmac_tx_timer_arm(priv, tx_q->queue_index);
 }
 
 /**
@@ -2866,9 +2859,6 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* Compute header lengths */
 	proto_hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
 
-	/* Start coalesce timer earlier in case TX Queue is stopped */
-	stmmac_tx_timer_arm(priv, queue);
-
 	/* Desc availability based on threshold should be enough safe */
 	if (unlikely(stmmac_tx_avail(priv, queue) <
 		(((skb->len - proto_hdr_len) / TSO_MAX_BUFF_SIZE + 1)))) {
@@ -2975,6 +2965,8 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 		stmmac_set_tx_ic(priv, desc);
 		priv->xstats.tx_set_ic_bit++;
 		tx_q->tx_count_frames = 0;
+	} else {
+		stmmac_tx_timer_arm(priv, queue);
 	}
 
 	skb_tx_timestamp(skb);
@@ -3065,9 +3057,6 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 			return stmmac_tso_xmit(skb, dev);
 	}
 
-	/* Start coalesce timer earlier in case TX Queue is stopped */
-	stmmac_tx_timer_arm(priv, queue);
-
 	if (unlikely(stmmac_tx_avail(priv, queue) < nfrags + 1)) {
 		if (!netif_tx_queue_stopped(netdev_get_tx_queue(dev, queue))) {
 			netif_tx_stop_queue(netdev_get_tx_queue(priv->dev,
@@ -3186,6 +3175,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		stmmac_set_tx_ic(priv, desc);
 		priv->xstats.tx_set_ic_bit++;
 		tx_q->tx_count_frames = 0;
+	} else {
+		stmmac_tx_timer_arm(priv, queue);
 	}
 
 	skb_tx_timestamp(skb);
@@ -3572,16 +3563,12 @@ static int stmmac_tx_poll(struct napi_struct *napi, int budget)
 	struct stmmac_priv *priv = tx_q->priv_data;
 	u32 chan = tx_q->queue_index;
 	int work_done = 0;
-	bool more;
 
 	priv->xstats.napi_poll++;
 
-	work_done = stmmac_tx_clean(priv, budget, chan, &more);
-	if (work_done < budget) {
+	work_done = stmmac_tx_clean(priv, budget, chan, NULL);
+	if (work_done < budget)
 		napi_complete_done(napi, work_done);
-		if (more)
-			napi_reschedule(napi);
-	}
 
 	return min(work_done, budget);
 }
-- 
2.7.4
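
Read as a per-frame decision, the fixup above arms the coalesce timer only
for frames that do not request a completion interrupt. A rough userspace
sketch of that decision follows. The condition that selects the interrupt
branch is not visible in the diff context, so the frame-count threshold
used here is an assumption, and txq_sketch / xmit_one are hypothetical
names whose counters stand in for stmmac_set_tx_ic() and mod_timer().

```c
#include <assert.h>

/* Hypothetical per-queue state; counters replace the real side effects. */
struct txq_sketch {
	unsigned int count_frames; /* frames queued since the last IC request */
	unsigned int coal_frames;  /* assumed threshold for requesting an IC */
	unsigned int ic_requests;  /* stands in for stmmac_set_tx_ic() */
	unsigned int timer_arms;   /* stands in for mod_timer() */
};

/* Account for one transmitted frame. */
static void xmit_one(struct txq_sketch *q)
{
	q->count_frames++;
	if (q->count_frames >= q->coal_frames) {
		/* Completion interrupt requested: no timer needed. */
		q->ic_requests++;
		q->count_frames = 0;
	} else {
		/* No interrupt for this frame: (re)arm the timer instead. */
		q->timer_arms++;
	}
}
```

With an assumed threshold of 4, eight frames would request two interrupts
and arm the timer six times.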


* Re: [net-next, PATCH 2/2, v1] net: socionext: add AF_XDP support
From: Ilias Apalodimas @ 2018-09-10 16:21 UTC (permalink / raw)
  To: Toshiaki Makita
  Cc: netdev, jaswinder.singh, ard.biesheuvel, masami.hiramatsu, arnd,
	mykyta.iziumtsev, bjorn.topel, magnus.karlsson
In-Reply-To: <8bfd8219-acea-8b63-b6be-d17a7e3b6e24@lab.ntt.co.jp>

> > @@ -707,6 +731,26 @@ static int netsec_process_rx(struct netsec_priv *priv, int budget)
> >  		if (unlikely(!buf_addr))
> >  			break;
> >  
> > +		if (xdp_prog) {
> > +			xdp_result = netsec_run_xdp(desc, priv, xdp_prog,
> > +						    pkt_len);
> > +			if (xdp_result != NETSEC_XDP_PASS) {
> > +				xdp_flush |= xdp_result & NETSEC_XDP_REDIR;
> > +
> > +				dma_unmap_single_attrs(priv->dev,
> > +						       desc->dma_addr,
> > +						       desc->len, DMA_TO_DEVICE,
> > +						       DMA_ATTR_SKIP_CPU_SYNC);
> > +
> > +				desc->len = desc_len;
> > +				desc->dma_addr = dma_handle;
> > +				desc->addr = buf_addr;
> > +				netsec_rx_fill(priv, idx, 1);
> > +				nsetsec_adv_desc(&dring->tail);
> > +			}
> > +			continue;
> 
> Continue even on XDP_PASS? Is this really correct?
> 
> Also seems there is no handling of adjust_head/tail for XDP_PASS case.
> 
A question on this. Should XDP related frames be allocated using 1 page
per packet?

Thanks

Ilias
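
For reference, the control flow the quoted review comment seems to be
asking for could be sketched as below: only non-PASS verdicts recycle the
descriptor and skip to the next one, while PASS falls through to the
normal receive path. This is a userspace illustration under that
assumption, not the driver's code. verdict_sketch, rx_result_sketch and
process_rx are hypothetical names, and the handling of TX and REDIRECT
buffers (and of adjust_head/tail) is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts; the real driver uses its own NETSEC_XDP_* flags. */
enum verdict_sketch { V_PASS, V_DROP, V_TX, V_REDIRECT };

struct rx_result_sketch {
	int to_stack; /* frames handed on to the regular skb path */
	int recycled; /* descriptors refilled without building an skb */
};

static struct rx_result_sketch process_rx(const enum verdict_sketch *v,
					  size_t n)
{
	struct rx_result_sketch res = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (v[i] != V_PASS) {
			/* Consumed by XDP: recycle and move on. */
			res.recycled++;
			continue;
		}
		/* PASS: fall through to the normal receive path. */
		res.to_stack++;
	}
	return res;
}
```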

* Re: [PATCH] net: ipv4: Use BUG_ON directly instead of a if condition followed by BUG
From: kbuild test robot @ 2018-09-10 21:20 UTC (permalink / raw)
  To: zhong jiang
  Cc: kbuild-all, davem, edumazet, kuznet, yoshfuji, netdev,
	linux-kernel
In-Reply-To: <1536590282-23899-1-git-send-email-zhongjiang@huawei.com>

[-- Attachment #1: Type: text/plain, Size: 5670 bytes --]

Hi zhong,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]
[also build test ERROR on v4.19-rc3 next-20180910]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/zhong-jiang/net-ipv4-Use-BUG_ON-directly-instead-of-a-if-condition-followed-by-BUG/20180911-034152
config: mips-rt305x_defconfig (attached as .config)
compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=mips 

All errors (new ones prefixed by >>):

   net/ipv4/tcp_input.c: In function 'tcp_collapse':
>> net/ipv4/tcp_input.c:4924:5: error: too many arguments to function 'BUG'
        BUG(skb_copy_bits(skb, offset,
        ^~~
   In file included from include/linux/bug.h:5:0,
                    from include/linux/mmdebug.h:5,
                    from include/linux/mm.h:9,
                    from net/ipv4/tcp_input.c:67:
   arch/mips/include/asm/bug.h:12:31: note: declared here
    static inline void __noreturn BUG(void)
                                  ^~~
   net/ipv4/tcp_input.c: In function 'tcp_urg':
   net/ipv4/tcp_input.c:5318:4: error: too many arguments to function 'BUG'
       BUG(skb_copy_bits(skb, ptr, &tmp, 1));
       ^~~
   In file included from include/linux/bug.h:5:0,
                    from include/linux/mmdebug.h:5,
                    from include/linux/mm.h:9,
                    from net/ipv4/tcp_input.c:67:
   arch/mips/include/asm/bug.h:12:31: note: declared here
    static inline void __noreturn BUG(void)
                                  ^~~

vim +/BUG +4924 net/ipv4/tcp_input.c

  4838	
  4839	/* Collapse contiguous sequence of skbs head..tail with
  4840	 * sequence numbers start..end.
  4841	 *
  4842	 * If tail is NULL, this means until the end of the queue.
  4843	 *
  4844	 * Segments with FIN/SYN are not collapsed (only because this
  4845	 * simplifies code)
  4846	 */
  4847	static void
  4848	tcp_collapse(struct sock *sk, struct sk_buff_head *list, struct rb_root *root,
  4849		     struct sk_buff *head, struct sk_buff *tail, u32 start, u32 end)
  4850	{
  4851		struct sk_buff *skb = head, *n;
  4852		struct sk_buff_head tmp;
  4853		bool end_of_skbs;
  4854	
  4855		/* First, check that queue is collapsible and find
  4856		 * the point where collapsing can be useful.
  4857		 */
  4858	restart:
  4859		for (end_of_skbs = true; skb != NULL && skb != tail; skb = n) {
  4860			n = tcp_skb_next(skb, list);
  4861	
  4862			/* No new bits? It is possible on ofo queue. */
  4863			if (!before(start, TCP_SKB_CB(skb)->end_seq)) {
  4864				skb = tcp_collapse_one(sk, skb, list, root);
  4865				if (!skb)
  4866					break;
  4867				goto restart;
  4868			}
  4869	
  4870			/* The first skb to collapse is:
  4871			 * - not SYN/FIN and
  4872			 * - bloated or contains data before "start" or
  4873			 *   overlaps to the next one.
  4874			 */
  4875			if (!(TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)) &&
  4876			    (tcp_win_from_space(sk, skb->truesize) > skb->len ||
  4877			     before(TCP_SKB_CB(skb)->seq, start))) {
  4878				end_of_skbs = false;
  4879				break;
  4880			}
  4881	
  4882			if (n && n != tail &&
  4883			    TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(n)->seq) {
  4884				end_of_skbs = false;
  4885				break;
  4886			}
  4887	
  4888			/* Decided to skip this, advance start seq. */
  4889			start = TCP_SKB_CB(skb)->end_seq;
  4890		}
  4891		if (end_of_skbs ||
  4892		    (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)))
  4893			return;
  4894	
  4895		__skb_queue_head_init(&tmp);
  4896	
  4897		while (before(start, end)) {
  4898			int copy = min_t(int, SKB_MAX_ORDER(0, 0), end - start);
  4899			struct sk_buff *nskb;
  4900	
  4901			nskb = alloc_skb(copy, GFP_ATOMIC);
  4902			if (!nskb)
  4903				break;
  4904	
  4905			memcpy(nskb->cb, skb->cb, sizeof(skb->cb));
  4906	#ifdef CONFIG_TLS_DEVICE
  4907			nskb->decrypted = skb->decrypted;
  4908	#endif
  4909			TCP_SKB_CB(nskb)->seq = TCP_SKB_CB(nskb)->end_seq = start;
  4910			if (list)
  4911				__skb_queue_before(list, skb, nskb);
  4912			else
  4913				__skb_queue_tail(&tmp, nskb); /* defer rbtree insertion */
  4914			skb_set_owner_r(nskb, sk);
  4915	
  4916			/* Copy data, releasing collapsed skbs. */
  4917			while (copy > 0) {
  4918				int offset = start - TCP_SKB_CB(skb)->seq;
  4919				int size = TCP_SKB_CB(skb)->end_seq - start;
  4920	
  4921				BUG_ON(offset < 0);
  4922				if (size > 0) {
  4923					size = min(copy, size);
> 4924					BUG(skb_copy_bits(skb, offset,
  4925							  skb_put(nskb, size), size));
  4926					TCP_SKB_CB(nskb)->end_seq += size;
  4927					copy -= size;
  4928					start += size;
  4929				}
  4930				if (!before(start, TCP_SKB_CB(skb)->end_seq)) {
  4931					skb = tcp_collapse_one(sk, skb, list, root);
  4932					if (!skb ||
  4933					    skb == tail ||
  4934					    (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)))
  4935						goto end;
  4936	#ifdef CONFIG_TLS_DEVICE
  4937					if (skb->decrypted != nskb->decrypted)
  4938						goto end;
  4939	#endif
  4940				}
  4941			}
  4942		}
  4943	end:
  4944		skb_queue_walk_safe(&tmp, skb, n)
  4945			tcp_rbtree_insert(root, skb);
  4946	}
  4947	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
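
The errors above come from passing an expression to BUG(), which on this
configuration is declared as taking no arguments (see the note pointing at
arch/mips/include/asm/bug.h:12). The conditional form is BUG_ON(). A
userspace sketch of the distinction follows. It uses stand-in macros
rather than the kernel definitions: sketch_bug, SKETCH_BUG_ON,
copy_bits_stub and collapse_step are hypothetical, and the stand-in counts
hits instead of halting.

```c
#include <assert.h>

static int bug_hits;

/* Like BUG() on this config: takes no arguments. */
static void sketch_bug(void)
{
	bug_hits++;
}

/* Like BUG_ON(): evaluates a condition and only then reports a bug. */
#define SKETCH_BUG_ON(cond) do { if (cond) sketch_bug(); } while (0)

/* Stand-in for a call such as skb_copy_bits(): 0 on success. */
static int copy_bits_stub(int fail)
{
	return fail ? -14 : 0;
}

/* Returns how many times the bug path fired for this step. */
static int collapse_step(int fail)
{
	int before = bug_hits;

	SKETCH_BUG_ON(copy_bits_stub(fail));
	return bug_hits - before;
}
```

Writing sketch_bug(copy_bits_stub(fail)) instead would fail to compile for
the same reason as the report: too many arguments to a function declared
with a void parameter list.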

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 14248 bytes --]

* Re: [PATCH net-next] net/ipv6: Remove rt6i_prefsrc
From: David Miller @ 2018-09-10 17:02 UTC (permalink / raw)
  To: dsahern; +Cc: netdev, lucien.xin, dsahern
In-Reply-To: <20180910161128.25520-1-dsahern@kernel.org>

From: dsahern@kernel.org
Date: Mon, 10 Sep 2018 09:11:28 -0700

> From: David Ahern <dsahern@gmail.com>
> 
> After the conversion to fib6_info, rt6i_prefsrc has a single user that
> reads the value and otherwise it is only set. The one reader can be
> converted to use rt->from so rt6i_prefsrc can be removed, reducing
> rt6_info by another 20 bytes.
> 
> Signed-off-by: David Ahern <dsahern@gmail.com>

Applied, thanks David.

* Re: unexpected GRO/veth behavior
From: Eric Dumazet @ 2018-09-10 17:06 UTC (permalink / raw)
  To: Paolo Abeni, Eric Dumazet, netdev; +Cc: Toshiaki Makita
In-Reply-To: <c5be74086876ce96353cb79e6486df321d58d48d.camel@redhat.com>

On 09/10/2018 08:22 AM, Paolo Abeni wrote:
 in this already heavy cost engine.
> 
> Yup, even if I do not see any measurable cost added by the posted code.

Sure, microbenchmarks won't show anything.

Now, if GRO receives one packet every 100 usec, as many hosts in the wild do,
there is an additional cost because of icache being wasted.

* [PATCH net-next v1] net/tls: Fixed return value when tls_complete_pending_work() fails
From: Vakul Garg @ 2018-09-10 17:23 UTC (permalink / raw)
  To: netdev; +Cc: borisp, aviadye, davejwatson, davem, doronrk, Vakul Garg

In tls_sw_sendmsg() and tls_sw_sendpage(), the variable 'ret' has
been set to return value of tls_complete_pending_work(). This allows
return of proper error code if tls_complete_pending_work() fails.

Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
---
 net/tls/tls_sw.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index be4f2e990f9f..adab598bd6db 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -486,7 +486,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
-	int ret = 0;
+	int ret;
 	int required_size;
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	bool eor = !(msg->msg_flags & MSG_MORE);
@@ -502,7 +502,8 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	lock_sock(sk);
 
-	if (tls_complete_pending_work(sk, tls_ctx, msg->msg_flags, &timeo))
+	ret = tls_complete_pending_work(sk, tls_ctx, msg->msg_flags, &timeo);
+	if (ret)
 		goto send_end;
 
 	if (unlikely(msg->msg_controllen)) {
@@ -637,7 +638,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
-	int ret = 0;
+	int ret;
 	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
 	bool eor;
 	size_t orig_size = size;
@@ -657,7 +658,8 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 
 	sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
 
-	if (tls_complete_pending_work(sk, tls_ctx, flags, &timeo))
+	ret = tls_complete_pending_work(sk, tls_ctx, flags, &timeo);
+	if (ret)
 		goto sendpage_end;
 
 	/* Call the sk_stream functions to manage the sndbuf mem. */
-- 
2.13.6
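
The effect of the change can be illustrated with a small userspace sketch
of the two shapes. In the old shape the failure is noticed but never
stored, so the function leaves through the exit label with ret still 0.
In the new shape the error code is what gets returned. The names
pending_work_stub, send_old and send_new are hypothetical, and the real
exit paths in tls_sw.c do more work than the bare return modeled here.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for tls_complete_pending_work(): 0 or a negative errno. */
static int pending_work_stub(int fail)
{
	return fail ? -EAGAIN : 0;
}

/* Old shape: the error is tested but not kept. */
static int send_old(int fail)
{
	int ret = 0;

	if (pending_work_stub(fail))
		goto send_end;
	ret = 1; /* pretend one byte was sent */
send_end:
	return ret;
}

/* New shape: the error code is stored and propagated. */
static int send_new(int fail)
{
	int ret;

	ret = pending_work_stub(fail);
	if (ret)
		goto send_end;
	ret = 1; /* pretend one byte was sent */
send_end:
	return ret;
}
```

On the failure path send_old() reports 0 while send_new() reports the
negative errno. Both behave the same when the pending work completes.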

* Re: [PATCH net-next v2] net: sched: cls_flower: dump offload count value
From: David Miller @ 2018-09-10 17:35 UTC (permalink / raw)
  To: vladbu; +Cc: netdev, jakub.kicinski, jhs, xiyou.wangcong, jiri
In-Reply-To: <1536330141-10354-1-git-send-email-vladbu@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>
Date: Fri,  7 Sep 2018 17:22:21 +0300

> Change flower in_hw_count type to fixed-size u32 and dump it as
> TCA_FLOWER_IN_HW_COUNT. This change is necessary to properly test shared
> blocks and re-offload functionality.
> 
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>

Applied, thank you.

* Re: [Patch net-next] net_sched: remove redundant qdisc lock classes
From: David Miller @ 2018-09-10 17:44 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, jiri, jhs
In-Reply-To: <20180907202914.21331-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Fri,  7 Sep 2018 13:29:13 -0700

> We no longer take any spinlock on RX path for ingress qdisc,
> so this lockdep annotation is no longer needed.
> 
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied.

* Re: [Patch net-next] htb: use anonymous union for simplicity
From: David Miller @ 2018-09-10 17:44 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, jiri, jhs
In-Reply-To: <20180907202914.21331-2-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Fri,  7 Sep 2018 13:29:14 -0700

> cl->leaf.q is slightly more readable than cl->un.leaf.q.
> 
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied.

* Re: [PATCH net] qmi_wwan: Support dynamic config on Quectel EP06
From: David Miller @ 2018-09-10 17:49 UTC (permalink / raw)
  To: kristian.evensen; +Cc: netdev, bjorn
In-Reply-To: <20180908115048.12667-1-kristian.evensen@gmail.com>

From: Kristian Evensen <kristian.evensen@gmail.com>
Date: Sat,  8 Sep 2018 13:50:48 +0200

> Quectel EP06 (and EM06/EG06) supports dynamic configuration of USB
> interfaces, without the device changing VID/PID or configuration number.
> When the configuration is updated and interfaces are added/removed, the
> interface numbers change. This means that the current code for matching
> EP06 does not work.
> 
> This patch removes the current EP06 interface number match, and replaces
> it with a match on class, subclass and protocol. Unfortunately, matching
> on those three alone is not enough, as the diag interface exports the
> same values as QMI. The other serial interfaces + adb export different
> values and do not match.
> 
> The diag interface only has two endpoints, while the QMI interface has
> three. I have therefore added a check for the number of endpoints, and we
> ignore the interface if the number of endpoints equals two.
> 
> Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>

Applied, thanks.
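
The matching rule described in the commit message can be sketched in
userspace as follows: match on class, subclass and protocol, then reject
the interface when it has exactly two endpoints, since that is how the
diag interface differs from QMI. The descriptor struct, the function name
and the SKETCH_* values are assumptions made for illustration; the commit
message does not state the actual class, subclass or protocol values.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder match values; assumed, not taken from the patch. */
#define SKETCH_CLASS    0xff
#define SKETCH_SUBCLASS 0xff
#define SKETCH_PROTOCOL 0xff

/* Hypothetical, trimmed-down view of a USB interface descriptor. */
struct iface_sketch {
	unsigned char cls;
	unsigned char subcls;
	unsigned char proto;
	unsigned char num_endpoints;
};

static bool looks_like_qmi(const struct iface_sketch *d)
{
	if (d->cls != SKETCH_CLASS || d->subcls != SKETCH_SUBCLASS ||
	    d->proto != SKETCH_PROTOCOL)
		return false;
	/* diag exports the same triple but has only two endpoints. */
	return d->num_endpoints != 2;
}
```

Because the rule no longer depends on the interface number, it keeps
working when the device's interfaces are renumbered.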

* Re: [PATCH net] ipv6: use rt6_info members when dst is set in rt6_fill_node
From: Xin Long @ 2018-09-10 17:55 UTC (permalink / raw)
  To: David Ahern; +Cc: network dev, davem, Roopa Prabhu
In-Reply-To: <b13be408-a734-e959-3299-bdd49a1318e7@cumulusnetworks.com>

On Tue, Sep 11, 2018 at 12:13 AM David Ahern <dsa@cumulusnetworks.com> wrote:
>
> On 9/9/18 12:29 AM, Xin Long wrote:
> >>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> >>> index 18e00ce..e554922 100644
> >>> --- a/net/ipv6/route.c
> >>> +++ b/net/ipv6/route.c
> >>> @@ -4670,20 +4670,33 @@ static int rt6_fill_node(struct net *net, struct sk_buff *skb,
> >>>                        int iif, int type, u32 portid, u32 seq,
> >>>                        unsigned int flags)
> >>>  {
> >>> -     struct rtmsg *rtm;
> >>> +     struct rt6key *fib6_prefsrc, *fib6_dst, *fib6_src;
> >>> +     struct rt6_info *rt6 = (struct rt6_info *)dst;
> >>> +     u32 *pmetrics, table, fib6_flags;
> >>>       struct nlmsghdr *nlh;
> >>> +     struct rtmsg *rtm;
> >>>       long expires = 0;
> >>> -     u32 *pmetrics;
> >>> -     u32 table;
> >>>
> >>>       nlh = nlmsg_put(skb, portid, seq, type, sizeof(*rtm), flags);
> >>>       if (!nlh)
> >>>               return -EMSGSIZE;
> >>>
> >>> +     if (rt6) {
> >>> +             fib6_dst = &rt6->rt6i_dst;
> >>> +             fib6_src = &rt6->rt6i_src;
> >>> +             fib6_flags = rt6->rt6i_flags;
> >>> +             fib6_prefsrc = &rt6->rt6i_prefsrc;
> >>> +     } else {
> >>> +             fib6_dst = &rt->fib6_dst;
> >>> +             fib6_src = &rt->fib6_src;
> >>> +             fib6_flags = rt->fib6_flags;
> >>> +             fib6_prefsrc = &rt->fib6_prefsrc;
> >>> +     }
> >>
> >> Unless I am missing something at the moment, an rt6_info can only have
> >> the same dst, src and prefsrc as the fib6_info on which it is based.
> >> Thus, only the flags is needed above. That simplifies this patch a lot.
> > If dst, src and prefsrc in rt6_info are always the same as these in fib6_info,
> > why do we need them in rt6_info? we could just get it by 'from'.
> >
>
> I just sent a patch removing rt6i_prefsrc. It is set with only 1 reader
> that can be converted.
>
> rt6i_src is checked against the fib6_info to invalidate a dst if the src
> has changed, so a valid rt will always have the same rt6i_src as the
> rt->from.
>
> rt6i_dst is set to the dest address / 128 in some cases, so it should be
> used for rt6_info cases above.
So that means I should use rt6i_dst and rt6i_flags when dst is set?
How about I use rt6i_src there as well, just to keep it clear?
Plus the gw/nh dump fix in rt6_fill_node():
-        if (rt->fib6_nsiblings) {
+        if (rt6) {
+                if (fib6_flags & RTF_GATEWAY)
+                        if (nla_put_in6_addr(skb, RTA_GATEWAY,
+                                             &rt6->rt6i_gateway) < 0)
+                                goto nla_put_failure;
+
+                if (dst->dev && nla_put_u32(skb, RTA_OIF, dst->dev->ifindex))
+                        goto nla_put_failure;
+        } else if (rt->fib6_nsiblings) {
                 struct fib6_info *sibling, *next_sibling;
                 struct nlattr *mp;

Does that look good to you?

^ permalink raw reply

* Re: [PATCH can-next] can: ucan: remove duplicated include from ucan.c
From: Martin Elshuber @ 2018-09-10 18:10 UTC (permalink / raw)
  To: YueHaibing, Wolfgang Grandegger, Marc Kleine-Budde,
	David S. Miller, Jakob Unterwurzacher, Philipp Tomsich
  Cc: linux-can, netdev, kernel-janitors
In-Reply-To: <1535505945-143347-1-git-send-email-yuehaibing@huawei.com>


Thank you for the fix

Reviewed-by: Martin Elshuber <martin.elshuber@theobroma-systems.com>

On 29.08.18 at 03:25, YueHaibing wrote:
> Remove duplicated include.
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---
>  drivers/net/can/usb/ucan.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/net/can/usb/ucan.c b/drivers/net/can/usb/ucan.c
> index 0678a38..c6f4b41 100644
> --- a/drivers/net/can/usb/ucan.c
> +++ b/drivers/net/can/usb/ucan.c
> @@ -35,10 +35,6 @@
>  #include <linux/slab.h>
>  #include <linux/usb.h>
>  
> -#include <linux/can.h>
> -#include <linux/can/dev.h>
> -#include <linux/can/error.h>
> -
>  #define UCAN_DRIVER_NAME "ucan"
>  #define UCAN_MAX_RX_URBS 8
>  /* the CAN controller needs a while to enable/disable the bus */
> 




^ permalink raw reply

* Re: [net-next, v2, 1/2] net: stmmac: Rework coalesce timer and fix multi-queue races
From: Neil Armstrong @ 2018-09-10 18:15 UTC (permalink / raw)
  To: Jose Abreu, netdev
  Cc: Jerome Brunet, Martin Blumenstingl, David S. Miller, Joao Pinto,
	Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <5d7c2397-8ca9-1f9c-c394-f73a12456384@synopsys.com>

Hi Jose,

On 10/09/2018 18:21, Jose Abreu wrote:
> On 10-09-2018 16:49, Neil Armstrong wrote:
>> Hi Jose,
>>
>> On 10/09/2018 16:44, Jose Abreu wrote:
>>> On 10-09-2018 14:46, Neil Armstrong wrote:
>>>> hi Jose,
>>>>
>>>> On 10/09/2018 14:55, Jose Abreu wrote:
>>>>> On 10-09-2018 13:52, Jose Abreu wrote:
>>>>>> Can you please try the attached follow-up patch?
>>>>> Oh, please apply the whole series, otherwise this will not apply
>>>>> cleanly.
>>>> Indeed, it helps!
>>>>
>>>> With the fixups, it fails later, around 15s instead of 3, in RX and TX.
>>> Thanks for testing, Neil. What if we keep rearming the timer
>>> whilst there are pending packets? Something like in the attachment
>>> (applies on top of the previous one).
>> It fixes RX, but TX fails after ~13s.
> 
> Ok :(
> 
> Can you please try the attached follow-up patch?

RX is still ok but now TX fails almost immediately...

With a 100 ms report interval:

$ iperf3 -c 192.168.1.47 -t 0 -p 5202 -R -i 0.1
Connecting to host 192.168.1.47, port 5202
Reverse mode, remote host 192.168.1.47 is sending
[  4] local 192.168.1.45 port 45900 connected to 192.168.1.47 port 5202
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-0.10   sec  10.9 MBytes   913 Mbits/sec
[  4]   0.10-0.20   sec  11.0 MBytes   923 Mbits/sec
[  4]   0.20-0.30   sec  6.34 MBytes   532 Mbits/sec
[  4]   0.30-0.40   sec  0.00 Bytes  0.00 bits/sec
[  4]   0.40-0.50   sec  0.00 Bytes  0.00 bits/sec
[  4]   0.50-0.60   sec  0.00 Bytes  0.00 bits/sec
[  4]   0.60-0.70   sec  0.00 Bytes  0.00 bits/sec
[  4]   0.70-0.80   sec  0.00 Bytes  0.00 bits/sec
[  4]   0.80-0.90   sec  0.00 Bytes  0.00 bits/sec
[  4]   0.90-1.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   1.00-1.10   sec  0.00 Bytes  0.00 bits/sec
^C[  4]   1.10-1.10   sec  0.00 Bytes  0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.10   sec  0.00 Bytes  0.00 bits/sec                  sender
[  4]   0.00-1.10   sec  28.2 MBytes   214 Mbits/sec                  receiver
iperf3: interrupt - the client has terminated

Neil

> 
> I'm so sorry about this back and forth, and I appreciate all your
> help.
> 
> Thanks and Best Regards,
> Jose Miguel Abreu
> 
> 
>>
>> Neil
>>
>>> Thanks and Best Regards,
>>> Jose Miguel Abreu
>>>
> 

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] net: stmmac: Fixup the tail addr setting in xmit path
From: Florian Fainelli @ 2018-09-10 18:46 UTC (permalink / raw)
  To: Jose Abreu, netdev
  Cc: David S. Miller, Joao Pinto, Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <2b430bedf98176f052e1530004ab623d26d2c71a.1536570319.git.joabreu@synopsys.com>

On 09/10/2018 02:14 AM, Jose Abreu wrote:
> Currently we are always setting the tail address of descriptor list to
> the end of the pre-allocated list.
> 
> According to the databook this is not correct. The tail address should
> point to the last available descriptor + 1, which means we have to update
> the tail address every time we call the xmit function.
> 
> This should have no impact on older versions of the MAC, but in newer
> versions there are some DMA features which allow the IP to fetch
> descriptors in advance and in a non-sequential order, so it's critical
> that we set the tail address correctly.
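
To make the rule in the quoted commit message concrete, here is a minimal
sketch of the two ways of computing the tail address. It is not the
stmmac code: the descriptor layout, the helper names and the 32-bit bus
address are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor: four 32-bit words (16 bytes). */
struct tx_desc {
	uint32_t des0, des1, des2, des3;
};

/*
 * Behaviour described as incorrect: the tail always points to the end
 * of the pre-allocated ring, no matter how many descriptors are queued.
 */
static uint32_t tx_tail_addr_fixed(uint32_t ring_base, unsigned int ring_size)
{
	return ring_base + ring_size * (uint32_t)sizeof(struct tx_desc);
}

/*
 * Behaviour described as correct: the tail points one past the last
 * descriptor made available to the DMA. next_free is the (already
 * wrapped) index of the first descriptor not yet handed over.
 */
static uint32_t tx_tail_addr(uint32_t ring_base, unsigned int ring_size,
			     unsigned int next_free)
{
	assert(next_free < ring_size);
	return ring_base + next_free * (uint32_t)sizeof(struct tx_desc);
}
```

With the second helper the value has to be recomputed and written on
every xmit call, which is the per-call update the commit message refers to.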

Can you include the appropriate Fixes tag here so this can easily be
backported to relevant stable branches?
-- 
Florian

^ permalink raw reply

* Re: Corrupted sit-tunnelled packets when using skb_gso_segment() on an IFB interface?
From: Eric Dumazet @ 2018-09-10 18:52 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, netdev; +Cc: cake
In-Reply-To: <87in3djq45.fsf@toke.dk>



On 09/10/2018 09:04 AM, Toke Høiland-Jørgensen wrote:
> Hi everyone
> 
> While investigating a bug report on CAKE[0], I've run into the following
> behaviour:
> 
> When running CAKE as an ingress shaper on an IFB interface, if the GSO
> splitting feature is turned on, TCP throughput will drop dramatically on
> 6in4 (sit) tunnels running over the interface in question. Looking at a
> traffic dump, I'm seeing ~15% packet loss on the encapsulated TCP
> stream.
> 
> IPv4 traffic is fine on the same interface, as is native IPv6 traffic.
> And turning off GSO splitting in CAKE makes the packet loss go away. The
> issue only seems to appear on IFB interfaces. So I'm wondering if there
> is some interaction that corrupts packets when they are being split in
> this configuration?
> 
> Steps to reproduce (assuming the box you are running on has IP 10.0.0.2
> on eth0, and has a peer at 10.0.0.1 with a suitably configured sit
> tunnel):
> 
> # modprobe ifb
> # ip link set dev ifb0 up
> # tc qdisc add dev eth0 handle ffff: ingress
> # tc filter add dev eth0 parent ffff: protocol all prio 10 matchall action mirred egress redirect dev ifb0
> # tc qdisc replace dev ifb0 root cake
> # ip link add type sit local 10.0.0.2 remote 10.0.0.1
> # ip link set dev sit1 up
> # netperf -H fe80::a00:1%sit1 -t TCP_MAERTS
> 
> Whereas, in the same setup, this will work fine:
> 
> # netperf -H 10.0.0.1 -t TCP_MAERTS
> 
> As will this:
> 
> # tc qdisc replace dev ifb0 root cake no-split-gso
> # netperf -H fe80::a00:1%sit1 -t TCP_MAERTS
> 
> 
> Does anyone have any ideas? :)
> 

My guess is that skb->mac_len is not properly updated in the segments (compared to the original GSO packet)

^ permalink raw reply

* Fw: [Bug 201071] New: Creating a vxlan in state 'up' does not give proper RTM_NEWLINK message
From: Stephen Hemminger @ 2018-09-10 18:55 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev



Begin forwarded message:

Date: Mon, 10 Sep 2018 04:04:37 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 201071] New: Creating a vxlan in state 'up' does not give proper RTM_NEWLINK message


https://bugzilla.kernel.org/show_bug.cgi?id=201071

            Bug ID: 201071
           Summary: Creating a vxlan in state 'up' does not give proper
                    RTM_NEWLINK message
           Product: Networking
           Version: 2.5
    Kernel Version: 4.19-rc1
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: stephen@networkplumber.org
          Reporter: liam.mcbirnie@boeing.com
        Regression: Yes

If a vxlan is created with state 'up', the RTM_NEWLINK message shows the state
as down, and no other netlink messages are sent.
As a result, processes listening to netlink are never notified that the vxlan
link is up.

eg.
# ip link add test up type vxlan id 8 group 224.224.224.224 dev eth0

Output of ip monitor link
# 4: test: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default
      link/ether ee:cd:97:1a:cf:91 brd ff:ff:ff:ff:ff:ff

Output of ip link show (expected from netlink message)
# 4: test: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
      link/ether ee:cd:97:1a:cf:91 brd ff:ff:ff:ff:ff:ff

This is a regression introduced by the following patch series.
https://patchwork.ozlabs.org/patch/947181/

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply

* Fw: [Bug 201063] New: kernel panic on heavy network use
From: Stephen Hemminger @ 2018-09-10 18:56 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Sun, 09 Sep 2018 13:45:28 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 201063] New: kernel panic on heavy network use


https://bugzilla.kernel.org/show_bug.cgi?id=201063

            Bug ID: 201063
           Summary: kernel panic on heavy network use
           Product: Networking
           Version: 2.5
    Kernel Version: 4.19rc2
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: stephen@networkplumber.org
          Reporter: oyvinds@everdot.org
        Regression: No

Created attachment 278379
  --> https://bugzilla.kernel.org/attachment.cgi?id=278379&action=edit  
RIP: native_smp_send_rechedule, what did they mean by this

The kernel panics; it seems to happen when there is heavy network traffic going
through that box. There is nothing in the logs. I took a picture of the screen
showing the kernel panic; it is attached.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply

* Re: [PATCH net] ipv6: use rt6_info members when dst is set in rt6_fill_node
From: David Ahern @ 2018-09-10 19:07 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, davem, Roopa Prabhu
In-Reply-To: <CADvbK_fJCA9PUqzsuFjVYgMtfM1tvtq32=OU90PPykq8CPP=PA@mail.gmail.com>

On 9/10/18 11:55 AM, Xin Long wrote:
> On Tue, Sep 11, 2018 at 12:13 AM David Ahern <dsa@cumulusnetworks.com> wrote:
>>
>> On 9/9/18 12:29 AM, Xin Long wrote:
>>>>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>>>>> index 18e00ce..e554922 100644
>>>>> --- a/net/ipv6/route.c
>>>>> +++ b/net/ipv6/route.c
>>>>> @@ -4670,20 +4670,33 @@ static int rt6_fill_node(struct net *net, struct sk_buff *skb,
>>>>>                        int iif, int type, u32 portid, u32 seq,
>>>>>                        unsigned int flags)
>>>>>  {
>>>>> -     struct rtmsg *rtm;
>>>>> +     struct rt6key *fib6_prefsrc, *fib6_dst, *fib6_src;
>>>>> +     struct rt6_info *rt6 = (struct rt6_info *)dst;
>>>>> +     u32 *pmetrics, table, fib6_flags;
>>>>>       struct nlmsghdr *nlh;
>>>>> +     struct rtmsg *rtm;
>>>>>       long expires = 0;
>>>>> -     u32 *pmetrics;
>>>>> -     u32 table;
>>>>>
>>>>>       nlh = nlmsg_put(skb, portid, seq, type, sizeof(*rtm), flags);
>>>>>       if (!nlh)
>>>>>               return -EMSGSIZE;
>>>>>
>>>>> +     if (rt6) {
>>>>> +             fib6_dst = &rt6->rt6i_dst;
>>>>> +             fib6_src = &rt6->rt6i_src;
>>>>> +             fib6_flags = rt6->rt6i_flags;
>>>>> +             fib6_prefsrc = &rt6->rt6i_prefsrc;
>>>>> +     } else {
>>>>> +             fib6_dst = &rt->fib6_dst;
>>>>> +             fib6_src = &rt->fib6_src;
>>>>> +             fib6_flags = rt->fib6_flags;
>>>>> +             fib6_prefsrc = &rt->fib6_prefsrc;
>>>>> +     }
>>>>
>>>> Unless I am missing something at the moment, an rt6_info can only have
>>>> the same dst, src and prefsrc as the fib6_info on which it is based.
>>>> Thus, only the flags is needed above. That simplifies this patch a lot.
>>> If dst, src and prefsrc in rt6_info are always the same as those in fib6_info,
>>> why do we need them in rt6_info? We could just get them via 'from'.
>>>
>>
>> I just sent a patch removing rt6i_prefsrc. It is set with only 1 reader
>> that can be converted.
>>
>> rt6i_src is checked against the fib6_info to invalidate a dst if the src
>> has changed, so a valid rt will always have the same rt6i_src as the
>> rt->from.
>>
>> rt6i_dst is set to the dest address / 128 in some cases, so it should be used
>> for rt6_info cases above.
> So that means I will use rt6i_dst and rt6i_flags when dst is set?
> How about I use rt6i_src there as well, just to make it look clear?
> Plus the gw/nh dump fix in rt6_fill_node():
> -        if (rt->fib6_nsiblings) {
> +        if (rt6) {
> +                if (fib6_flags & RTF_GATEWAY)
> +                        if (nla_put_in6_addr(skb, RTA_GATEWAY,
> +                                             &rt6->rt6i_gateway) < 0)
> +                                goto nla_put_failure;
> +
> +                if (dst->dev && nla_put_u32(skb, RTA_OIF, dst->dev->ifindex))
> +                        goto nla_put_failure;
> +        } else if (rt->fib6_nsiblings) {
>                  struct fib6_info *sibling, *next_sibling;
>                  struct nlattr *mp;
> 
> Does that look good to you?
> 

sure

^ permalink raw reply

* Re: [PATCH iproute2 v2] tc/mqprio: Print extra info on invalid args.
From: Stephen Hemminger @ 2018-09-10 19:15 UTC (permalink / raw)
  To: Caleb Raitto; +Cc: netdev, jhs, xiyou.wangcong, jiri, Caleb Raitto
In-Reply-To: <20180906210117.203461-1-caleb.raitto@gmail.com>

On Thu,  6 Sep 2018 14:01:17 -0700
Caleb Raitto <caleb.raitto@gmail.com> wrote:

> From: Caleb Raitto <caraitto@google.com>
> 
> Print the name of the argument that wasn't understood.
> 
> Signed-off-by: Caleb Raitto <caraitto@google.com>

That is simpler, thanks. Applied

^ permalink raw reply
