Netdev List
* Re: [PATCH iproute2] lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg()
From: David Ahern @ 2019-02-14 17:34 UTC
  To: Michal Kubecek, netdev
  Cc: Eric Dumazet, Stephen Hemminger, Eric Dumazet, Hangbin Liu,
	Phil Sutter
In-Reply-To: <20190214134945.GJ25518@unicorn.suse.cz>

On 2/14/19 6:49 AM, Michal Kubecek wrote:
> On Tue, Feb 12, 2019 at 07:04:17PM -0700, David Ahern wrote:
>>
>> Do we know of any single message sizes > 32k? 2d34851cd341 cites
>> increasing VF's but at some point there is a limit. If not, the whole
>> PEEK thing should go away and we just malloc 32k (or 64k) buffers for
>> each recvmsg.
> 
> IFLA_VF_LIST is by far the biggest thing I have seen so far. I don't
> remember exact numbers but the issue with 32KB buffer (for the whole
> RTM_NEWLINK message) was encountered by some of our customers with NICs
> having 120 or 128 VFs.
> 
> There is a bigger issue with IFLA_VFINFO_LIST, though, as it's an
> attribute so that netlink limits its size to 64 KB. IIRC with current
> size of IFLA_VF_INFO this would be reached with 270-280 VFs (I'm sure
> the number was higher than 256 but not too much higher.)
> 
> This would mean unless we let something else grow too much, the whole
> message shouldn't get much bigger than 64 KB. And if we can find some
> other solution (e.g. passing VF information in separate messages if
> client declares support), even 32 KB would be more than enough.

That's what I was asking, thanks. So 32kB is sufficient today, and 64kB
leaves room for future growth. So this whole approach of PEEKing and
allocating the message size is overkill. The buffer could just as easily
have been bumped from 32kB to 64kB in the original patch and been good
for a while.
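
As a rough illustration of the two approaches discussed in this thread, the sketch below peeks at the size of the next datagram with MSG_PEEK | MSG_TRUNC and then allocates at least a fixed minimum, which is the idea named in the patch subject. It is a Linux-only sketch that uses an AF_UNIX datagram socketpair instead of a netlink socket so that it is self-contained; peek_len(), recv_sized(), demo_roundtrip() and MIN_BUF_LEN are hypothetical names, not libnetlink code.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MIN_BUF_LEN 32768	/* assumed floor, mirroring the patch subject */

/* Return the full length of the next datagram without consuming it. */
static ssize_t peek_len(int fd)
{
	char dummy;

	return recv(fd, &dummy, sizeof(dummy), MSG_PEEK | MSG_TRUNC);
}

/* Receive one datagram into a malloc'ed buffer of max(peeked, MIN_BUF_LEN). */
static ssize_t recv_sized(int fd, char **out, size_t *alloc_len)
{
	ssize_t need = peek_len(fd);
	ssize_t got;
	size_t len;
	char *buf;

	if (need < 0)
		return -1;
	len = (size_t)need > MIN_BUF_LEN ? (size_t)need : MIN_BUF_LEN;
	buf = malloc(len);
	if (!buf)
		return -1;
	got = recv(fd, buf, len, 0);
	if (got < 0) {
		free(buf);
		return -1;
	}
	*out = buf;
	*alloc_len = len;
	return got;
}

/*
 * Send msg_len zero bytes over a socketpair and receive them again.
 * Returns the buffer size that was allocated, or 0 on any failure.
 */
static size_t demo_roundtrip(size_t msg_len)
{
	int sv[2];
	char *msg, *buf = NULL;
	size_t alloc_len = 0;
	ssize_t got = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return 0;
	msg = calloc(1, msg_len);
	if (msg && send(sv[0], msg, msg_len, 0) == (ssize_t)msg_len)
		got = recv_sized(sv[1], &buf, &alloc_len);
	free(msg);
	free(buf);
	close(sv[0]);
	close(sv[1]);
	return got == (ssize_t)msg_len ? alloc_len : 0;
}
```

With a fixed 64kB buffer, as suggested above, the peek_len() call and the extra system call it costs would simply go away, at the price of truncating any message larger than the buffer.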


* Re: [PATCH] net: phy: at803x: disable delay only for RGMII mode
From: David Miller @ 2019-02-14 17:33 UTC
  To: marc.w.gonzalez
  Cc: vkoul, linux-arm-msm, bjorn.andersson, netdev, niklas.cassel,
	andrew, f.fainelli, nsekhar, peter.ujfalusi
In-Reply-To: <d566afe6-4c63-a59f-0141-cc979b29aedf@free.fr>

From: Marc Gonzalez <marc.w.gonzalez@free.fr>
Date: Thu, 14 Feb 2019 17:46:54 +0100

> On 14/02/2019 17:38, David Miller wrote:
> 
>> From: Vinod Koul <vkoul@kernel.org>
>> Date: Tue, 12 Feb 2019 19:49:22 +0530
>> 
>>> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
>>> index 8ff12938ab47..7b54b54e3316 100644
>>> --- a/drivers/net/phy/at803x.c
>>> +++ b/drivers/net/phy/at803x.c
>>> @@ -110,6 +110,18 @@ static int at803x_debug_reg_mask(struct phy_device *phydev, u16 reg,
>>>  	return phy_write(phydev, AT803X_DEBUG_DATA, val);
>>>  }
>>>  
>>> +static inline int at803x_enable_rx_delay(struct phy_device *phydev)
>>> +{
>>> +	return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_0, 0,
>>> +				     AT803X_DEBUG_RX_CLK_DLY_EN);
>>> +}
>>> +
>>> +static inline int at803x_enable_tx_delay(struct phy_device *phydev)
>>> +{
>>> +	return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_5, 0,
>>> +				     AT803X_DEBUG_TX_CLK_DLY_EN);
>>> +}
>>> +
>> 
>> Please do not use the inline directive in foo.c files, let the compiler
>> decide.
> 
> Isn't the compiler free to ignore the "inline" hint?

I'm not going into this, but our rules are to not use the inline keyword
except in header files, where it is required in order to not emit a
static copy of the function into every file that includes the header.
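
A minimal sketch of what the rule looks like in a .c file, assuming a read-modify-write helper in the spirit of the one quoted above. reg_mask(), demo_enable_delay() and DEMO_CLK_DLY_EN are hypothetical stand-ins that operate on a plain value instead of a PHY register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define DEMO_CLK_DLY_EN (1u << 15)

/*
 * In a .c file: plain "static", no "inline". The compiler is still free
 * to inline this small helper at its call sites when it sees fit.
 */
static uint16_t reg_mask(uint16_t val, uint16_t clear, uint16_t set)
{
	return (uint16_t)((val & (uint16_t)~clear) | set);
}

/* Also plain "static": sets the delay-enable bit, clears nothing. */
static uint16_t demo_enable_delay(uint16_t val)
{
	return reg_mask(val, 0, DEMO_CLK_DLY_EN);
}
```

In a header the same helper would be written as "static inline", so that translation units which include the header but never call it do not each carry an unused static copy.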


* Re: [PATCH] net: hns: Fix object reference leaks in hns_dsaf_roce_reset()
From: David Miller @ 2019-02-14 17:29 UTC
  To: huang.zijiang
  Cc: yisen.zhuang, salil.mehta, lipeng321, liuyonglong, yuehaibing,
	keescook, wangxi11, netdev, linux-kernel, wang.yi59
In-Reply-To: <1550126505-28394-1-git-send-email-huang.zijiang@zte.com.cn>

From: Huang Zijiang <huang.zijiang@zte.com.cn>
Date: Thu, 14 Feb 2019 14:41:45 +0800

> of_find_device_by_node() takes a reference to the underlying device
> structure; we should release that reference.
> 
> Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>

Applied, thank you.
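
The kind of leak described in the commit message can be modelled in user space as follows. This is only a toy model under stated assumptions: struct demo_dev, demo_find() and demo_put() are hypothetical stand-ins for a refcounted device, a lookup that returns it with a reference held, and the matching release call.

```c
#include <assert.h>

struct demo_dev {
	int refcount;
};

/* Lookup helper: returns the object with an extra reference held. */
static struct demo_dev *demo_find(struct demo_dev *d)
{
	d->refcount++;
	return d;
}

/* Drop a reference obtained from demo_find(). */
static void demo_put(struct demo_dev *d)
{
	d->refcount--;
}

/*
 * Look the device up, use it, and (optionally) release it again.
 * Returns the reference count left behind on an object that started
 * with a single reference held by its owner.
 */
static int demo_refs_after(int balanced)
{
	struct demo_dev dev = { .refcount = 1 };
	struct demo_dev *found = demo_find(&dev);

	/* ... use found ... */
	if (balanced)
		demo_put(found);
	return dev.refcount;
}
```

In the unbalanced case the count never returns to its starting value, so the object could never be freed.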


* Re: [PATCH 2/2] net: Replace dev_kfree_skb_any by dev_consume_skb_any
From: David Miller @ 2019-02-14 17:28 UTC
  To: huang.zijiang
  Cc: linux-net-drivers, ecree, bkenward, netdev, linux-kernel,
	wang.yi59
In-Reply-To: <1550126533-28462-1-git-send-email-huang.zijiang@zte.com.cn>

From: Huang Zijiang <huang.zijiang@zte.com.cn>
Date: Thu, 14 Feb 2019 14:42:13 +0800

> The skb should be freed by dev_consume_skb_any() in efx_tx_tso_fallback()
> when the skb is still used. The skb is replaced by segments, so the
> original skb should be consumed (not dropped).
> 
> Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>

Applied with Sergei's suggested commit message fixups.

Thanks.
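
The distinction between consuming and dropping can be sketched with a user-space model. The assumption here is that both paths free the buffer and differ only in which event they account, so that drop monitoring sees only genuine drops; demo_kfree_buf(), demo_consume_buf(), demo_replace() and the counters are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_stats {
	unsigned long dropped;	/* what a drop profile would count */
	unsigned long consumed;	/* normal end of life, not a drop */
};

static struct demo_stats demo_stats;

/* Free a buffer that could not be handled: counts as a drop. */
static void demo_kfree_buf(void *buf)
{
	demo_stats.dropped++;
	free(buf);
}

/* Free a buffer that was handled successfully: not a drop. */
static void demo_consume_buf(void *buf)
{
	demo_stats.consumed++;
	free(buf);
}

/* Replace a buffer by a new one; the old one was handled, so consume it. */
static void *demo_replace(void *old, size_t len)
{
	void *repl = malloc(len);

	if (!repl) {
		demo_kfree_buf(old);	/* genuine failure: account a drop */
		return NULL;
	}
	demo_consume_buf(old);
	return repl;
}

/* Run one successful replacement and report the number of drops seen. */
static unsigned long demo_drops_after_replace(void)
{
	void *buf = demo_replace(malloc(64), 128);

	free(buf);
	return demo_stats.dropped;
}
```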


* Re: [rdma-rc PATCH 2/2] iw_cxgb4: cq/qp mask depends on bar2 pages in a host page
From: Raju Rangoju @ 2019-02-14 17:28 UTC
  To: Jason Gunthorpe
  Cc: davem@davemloft.net, linux-rdma@vger.kernel.org,
	netdev@vger.kernel.org, swise@opengridcomputing.com
In-Reply-To: <20190214154127.GD1725@mellanox.com>

On Thursday, February 14, 2019 at 15:41:34 +0000, Jason Gunthorpe wrote:
> On Thu, Feb 14, 2019 at 05:40:54PM +0530, Raju Rangoju wrote:
> > Adjust the cq/qp mask based on the number of bar2 pages in a host page.
> > 
> > For user-mode rdma, the granularity of the BAR2 memory mapped
> > to a user rdma process during queue allocation must be based
> > on the host page size. The lld attributes udb_density and
> > ucq_density are used to figure out how many sge contexts are
> > in a bar2 page. So the rdev->qpmask and rdev->cqmask in
> > iw_cxgb4 need to now be adjusted based on how many sge bar2
> > pages are in a host page.
> 
> Why is this rc? Do certain arches fail to work or something?
>

Yes, this series fixes a regression that was introduced by commit
2391b0030e (v5.0-rc1~129^2~272)

> Jason


* Re: [PATCH] net:dl2k: Replace dev_kfree_skb_irq by dev_consume_skb_irq
From: David Miller @ 2019-02-14 17:27 UTC
  To: huang.zijiang
  Cc: quentin.monnet, jakub.kicinski, henrik, dsahern, netdev,
	linux-kernel, wang.yi59
In-Reply-To: <1550126456-28233-1-git-send-email-huang.zijiang@zte.com.cn>

From: Huang Zijiang <huang.zijiang@zte.com.cn>
Date: Thu, 14 Feb 2019 14:40:56 +0800

> dev_consume_skb_irq() should be called when the skb xmit is
> done. It makes drop profiles more friendly.
> 
> Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>

Applied.


* Re: [PATCH 1/2] net:ethernet:cadence: Replace dev_kfree_skb_any by dev_consume_skb_any
From: David Miller @ 2019-02-14 17:27 UTC
  To: huang.zijiang; +Cc: nicolas.ferre, netdev, linux-kernel, wang.yi59
In-Reply-To: <1550126478-28316-1-git-send-email-huang.zijiang@zte.com.cn>

From: Huang Zijiang <huang.zijiang@zte.com.cn>
Date: Thu, 14 Feb 2019 14:41:18 +0800

> The skb should be freed by dev_consume_skb_any() in macb_pad_and_fcs()
> when *skb is still used. The *skb is replaced by nskb, so the
> original *skb should be consumed (not dropped).
> 
> Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>

Applied.


* [PATCH net-next] sfc: ensure recovery after allocation failures
From: Bert Kenward @ 2019-02-14 17:27 UTC
  To: Dave Miller; +Cc: linux-net-drivers, netdev, Robert Stonehouse

From: Robert Stonehouse <rstonehouse@solarflare.com>

After failing to allocate a receive buffer the driver may fail to ever
request additional allocations. EF10 NICs require new receive buffers to
be pushed in batches of eight or more. The test for whether a slow fill
should be scheduled failed to take account of this. There is little
downside to *always* requesting a slow fill if we failed to allocate a
buffer, so the condition has been removed completely. The timer that
triggers the request for a refill has also been shortened.

Signed-off-by: Robert Stonehouse <rstonehouse@solarflare.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
---
 drivers/net/ethernet/sfc/efx.c | 2 +-
 drivers/net/ethernet/sfc/rx.c  | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 3643015a55cf..bc655ffc9e02 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -915,7 +915,7 @@ efx_realloc_channels(struct efx_nic *efx, u32 rxq_entries, u32 txq_entries)
 
 void efx_schedule_slow_fill(struct efx_rx_queue *rx_queue)
 {
-	mod_timer(&rx_queue->slow_fill, jiffies + msecs_to_jiffies(100));
+	mod_timer(&rx_queue->slow_fill, jiffies + msecs_to_jiffies(10));
 }
 
 static bool efx_default_channel_want_txqs(struct efx_channel *channel)
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index 396ff01298cd..8702ab44d80b 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -360,8 +360,7 @@ void efx_fast_push_rx_descriptors(struct efx_rx_queue *rx_queue, bool atomic)
 		rc = efx_init_rx_buffers(rx_queue, atomic);
 		if (unlikely(rc)) {
 			/* Ensure that we don't leave the rx queue empty */
-			if (rx_queue->added_count == rx_queue->removed_count)
-				efx_schedule_slow_fill(rx_queue);
+			efx_schedule_slow_fill(rx_queue);
 			goto out;
 		}
 	} while ((space -= batch_size) >= batch_size);
-- 
2.20.1
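
To make the failure mode in the commit message concrete, here is a small model of the counters. It assumes, as the message states, that buffers only reach the NIC in batches of eight; the helper names and the exact rounding are hypothetical, and counter wrap-around is ignored for simplicity.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PUSH_BATCH 8u	/* assumed batch granularity */

/* Buffers the NIC can actually use: only whole batches get pushed. */
static unsigned int demo_usable(unsigned int added, unsigned int removed)
{
	unsigned int pushed = added & ~(DEMO_PUSH_BATCH - 1);

	return pushed > removed ? pushed - removed : 0;
}

/* Old test: schedule a slow fill only if the ring looks empty. */
static bool demo_old_wants_refill(unsigned int added, unsigned int removed)
{
	return added == removed;
}

/* New behaviour: any allocation failure schedules a slow fill. */
static bool demo_new_wants_refill(unsigned int added, unsigned int removed)
{
	(void)added;
	(void)removed;
	return true;
}
```

With added = 13 and removed = 8, the model has five buffers allocated but none usable by the NIC, yet the old test does not schedule a refill because the two counters differ.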



* Re: [PATCH] net:dl2k: Modify the code style escaping the warning
From: David Miller @ 2019-02-14 17:27 UTC
  To: huang.zijiang
  Cc: quentin.monnet, jakub.kicinski, henrik, dsahern, netdev,
	linux-kernel, wang.yi59
In-Reply-To: <1550126431-28152-1-git-send-email-huang.zijiang@zte.com.cn>

From: Huang Zijiang <huang.zijiang@zte.com.cn>
Date: Thu, 14 Feb 2019 14:40:31 +0800

> Modify the code style in order to remove the following warning
> when executing the checkpatch.pl script:
> WARNING: space prohibited between function name and open parenthesis '('
> 
> Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>

Applied.


* [net-next] net: stmmac: handle endianness in dwmac4_get_timestamp
From: Alexandre Torgue @ 2019-02-14 17:26 UTC
  To: Giuseppe Cavallaro, Jose Abreu, davem
  Cc: netdev, linux-stm32, linux-arm-kernel, linux-kernel,
	Alexandre Torgue

The GMAC IP is little-endian and is used on several kinds of CPU (big or
little endian). The main callback functions of the stmmac driver take care
of this. That was not the case for the dwmac4_get_timestamp function.

Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index 20299f6..736e296 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -241,15 +241,18 @@ static inline void dwmac4_get_timestamp(void *desc, u32 ats, u64 *ts)
 static int dwmac4_rx_check_timestamp(void *desc)
 {
 	struct dma_desc *p = (struct dma_desc *)desc;
+	unsigned int rdes0 = le32_to_cpu(p->des0);
+	unsigned int rdes1 = le32_to_cpu(p->des1);
+	unsigned int rdes3 = le32_to_cpu(p->des3);
 	u32 own, ctxt;
 	int ret = 1;
 
-	own = p->des3 & RDES3_OWN;
-	ctxt = ((p->des3 & RDES3_CONTEXT_DESCRIPTOR)
+	own = rdes3 & RDES3_OWN;
+	ctxt = ((rdes3 & RDES3_CONTEXT_DESCRIPTOR)
 		>> RDES3_CONTEXT_DESCRIPTOR_SHIFT);
 
 	if (likely(!own && ctxt)) {
-		if ((p->des0 == 0xffffffff) && (p->des1 == 0xffffffff))
+		if ((rdes0 == 0xffffffff) && (rdes1 == 0xffffffff))
 			/* Corrupted value */
 			ret = -EINVAL;
 		else
-- 
2.7.4
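
A user-space sketch of why the conversion matters: the descriptor bytes are laid out little-endian in memory, so they have to be assembled into a host-order value before any bit is tested. demo_le32_to_cpu() is a portable stand-in for the kernel's le32_to_cpu(), and DEMO_OWN is a hypothetical bit position chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_OWN (1u << 31)	/* hypothetical OWN bit position */

/* Four descriptor bytes as the device would write them to memory. */
static const uint8_t demo_des3[4] = { 0x00, 0x00, 0x00, 0x80 };

/* Assemble a little-endian 32-bit value, independent of host endianness. */
static uint32_t demo_le32_to_cpu(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Test the OWN bit on the converted value, never on the raw bytes. */
static int demo_is_owned(const uint8_t des3[4])
{
	return (demo_le32_to_cpu(des3) & DEMO_OWN) != 0;
}
```

Reading the same four bytes as a native 32-bit word on a big-endian CPU would give 0x00000080, and the test against bit 31 would silently fail.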



* Re: [PATCH] isdn:hisax: Replace dev_kfree_skb_any by dev_consume_skb_any
From: David Miller @ 2019-02-14 17:22 UTC
  To: huang.zijiang
  Cc: isdn, natechancellor, yuehaibing, baijiaju1990, netdev,
	linux-kernel, wang.yi59
In-Reply-To: <1550126399-28063-1-git-send-email-huang.zijiang@zte.com.cn>

From: Huang Zijiang <huang.zijiang@zte.com.cn>
Date: Thu, 14 Feb 2019 14:39:59 +0800

> The skb should be freed by dev_consume_skb_any() in hfcpci_fill_fifo()
> when bcs->tx_skb is still used. The bcs->tx_skb is replaced by
> skb_dequeue(&bcs->squeue), so the original bcs->tx_skb should
> be consumed (not dropped).
> 
> Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>

Applied, thanks.


* Re: [PATCH net-next] net: ip6_gre: Give ERSPAN a fill_info link op of its own
From: Petr Machata @ 2019-02-14 17:17 UTC
  To: Lorenzo Bianconi; +Cc: netdev@vger.kernel.org
In-Reply-To: <20190214111002.GB16752@localhost.localdomain>


Lorenzo Bianconi <lorenzo.bianconi@redhat.com> writes:

> Does it fix reported issue?

It does. Can you please formally send this? I'll retest and add my
Tested-by.

Thanks,
Petr


* [PATCH net-next 2/2] neigh: hook tracepoints in neigh update code
From: Roopa Prabhu @ 2019-02-14 17:15 UTC
  To: davem; +Cc: netdev, dsa
In-Reply-To: <1550164511-21195-1-git-send-email-roopa@cumulusnetworks.com>

From: Roopa Prabhu <roopa@cumulusnetworks.com>

Hook tracepoints at the end of functions that
update a neigh entry. neigh_update gets an additional
tracepoint to trace the update flags and the old and new
neigh states.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 net/core/neighbour.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 4230400..30f6fd8 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -42,6 +42,8 @@
 #include <linux/inetdevice.h>
 #include <net/addrconf.h>
 
+#include <trace/events/neigh.h>
+
 #define DEBUG
 #define NEIGH_DEBUG 1
 #define neigh_dbg(level, fmt, ...)		\
@@ -102,6 +104,7 @@ static void neigh_cleanup_and_release(struct neighbour *neigh)
 	if (neigh->parms->neigh_cleanup)
 		neigh->parms->neigh_cleanup(neigh);
 
+	trace_neigh_cleanup_and_release(neigh, 0);
 	__neigh_notify(neigh, RTM_DELNEIGH, 0, 0);
 	call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
 	neigh_release(neigh);
@@ -1095,6 +1098,8 @@ static void neigh_timer_handler(struct timer_list *t)
 	if (notify)
 		neigh_update_notify(neigh, 0);
 
+	trace_neigh_timer_handler(neigh, 0);
+
 	neigh_release(neigh);
 }
 
@@ -1165,6 +1170,7 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 	else
 		write_unlock(&neigh->lock);
 	local_bh_enable();
+	trace_neigh_event_send_done(neigh, rc);
 	return rc;
 
 out_dead:
@@ -1172,6 +1178,7 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 		goto out_unlock_bh;
 	write_unlock_bh(&neigh->lock);
 	kfree_skb(skb);
+	trace_neigh_event_send_dead(neigh, 1);
 	return 1;
 }
 EXPORT_SYMBOL(__neigh_event_send);
@@ -1227,6 +1234,8 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 	struct net_device *dev;
 	int update_isrouter = 0;
 
+	trace_neigh_update(neigh, lladdr, new, flags, nlmsg_pid);
+
 	write_lock_bh(&neigh->lock);
 
 	dev    = neigh->dev;
@@ -1393,6 +1402,8 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 	if (notify)
 		neigh_update_notify(neigh, nlmsg_pid);
 
+	trace_neigh_update_done(neigh, err);
+
 	return err;
 }
 
-- 
2.1.4



* Re: [Bug reporting] kernel panic during handle the dst unreach icmp msg.
From: Eric Dumazet @ 2019-02-14 17:15 UTC
  To: soukjin.bae, netdev@vger.kernel.org
  Cc: 박종언, Steffen Klassert, Herbert Xu
In-Reply-To: <20190214074641epcms1p1db1c5589f96718a440a166328eec9ebd@epcms1p1>



On 02/13/2019 11:46 PM, 배석진 wrote:
> Dear all,
> 
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg256527.html
> 
> As we feared in the mail thread above,
> we hit a problem caused by a socket that was not removed.
> 
> (from now on, 'the socket' means the socket allocated at 0xFFFFFFC0051E5E00)
> 
> #1. The socket is in state FIN_WAIT1; its process has probably closed the socket.
>     Below is the memory dump information from Trace32.
> 
>   (struct sock *)0xFFFFFFC0051E5E00 = 0xFFFFFFC0051E5E00 = end+0x3FF9E4CE00 -> (
>     __sk_common = (
>        ...
>        skc_rcv_saddr = 0x0200A8C0,   ==> 192.168.0.2
>        ...
>        skc_state = 4,                ==> FIN_WAIT1
>        ...
>        skc_flags = 0x4301,           ==> SOCK_DEAD(0x01) set
> 
> 
> #2. The user switched to another Wi-Fi AP, so the previous netdevice was deleted and its sockets were destroyed.
> 
> [60392.948657][4:            netd] 02-13 00:39:32.095  5249  5323 I NetdDestroyed 30 sockets on 192.168.0.2 in 2.7 ms
> [60392.948705][4:            netd] 02-13 00:39:32.095  5249  5323 D Netdnotify() code: 614, msg: Address removed 192.168.0.2/24 wlan0 128 0
> 
>   --> The socket will still exist for a while,
>       because 'sock_diag_destroy() -> tcp_abort()' cannot call tcp_done() for the socket,
>       but it does clear the socket's sk_write_queue by calling tcp_write_queue_purge(sk).
> 
> 
> #3. An ICMP message (dst unreach) arrived for a packet sent by the socket.
>     To retransmit it, the handler looks up the sk and finds it (because the socket still exists),
>     but its sk_write_queue was already cleared, so there is no skb to send,
>     and this triggers the kernel BUG.
> 
> <4>[60392.948306] I[1:    ksoftirqd/1:   19] ------------[ cut here ]------------
> <0>[60392.948334] I[1:    ksoftirqd/1:   19] kernel BUG at net/ipv4/tcp_ipv4.c:519!
> <2>[60392.948344] I[1:    ksoftirqd/1:   19] sec_debug_set_extra_info_fault = BUG / 0xffffff80090351d0
> <0>[60392.948386] I[1:    ksoftirqd/1:   19] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> ...
> <4>[60392.950676] I[1:    ksoftirqd/1:   19] PC is at tcp_v4_err+0x4b0/0x4bc
> <4>[60392.950684] I[1:    ksoftirqd/1:   19] LR is at tcp_v4_err+0x3ac/0x4bc
> 
> 
> 370 void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
> 371 {
>         ...
> 516		icsk->icsk_rto = inet_csk_rto_backoff(icsk, TCP_RTO_MAX);
> 517
> 518		skb = tcp_write_queue_head(sk);
> 519		BUG_ON(!skb);
> 520
> 521		tcp_mstamp_refresh(tp);
> 
> 
> We know that line 519 has been removed in the latest tree; there, this would show up as a kernel panic instead.
> How about the change below? It does not retransmit packets when the socket was already closed.
> 
> best regards,
> 
> 
> 
> From: soukjin bae <soukjin.bae@samsung.com>
> Date: Wen, 14 Jan 2019 14:26:35 +0900
> Subject: net: Don't retransmit packets when socket was already closed
>  
> Signed-off-by: soukjin bae <soukjin.bae@samsung.com>
> Signed-off-by: jongeon park <jongeon.park@samsung.com>
> ---
>  net/ipv4/tcp_ipv4 | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/net/ipv4/tcp_ipv4 b/net/ipv4/tcp_ipv4
> index fe4daf6..654bd19 100755
> --- a/net/ipv4/tcp_ipv4
> +++ b/net/ipv4/tcp_ipv4
> 
> @@ -442,6 +465,10 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
>  		err = EPROTO;
>  		break;
>  	case ICMP_DEST_UNREACH:
> +		/* Don't retransmit packets when socket was already closed */
> +		if (sock_flag(sk, SOCK_DEAD))
> +			goto out;
> +
>  		if (code > NR_ICMP_UNREACH)
>  			goto out;
> 

I do not believe this patch is needed.

You probably hit another more serious bug, but since you do not post the full stack trace
it is hard to help.

Are you using vti tunnel ?

I just got a syzbot report that might give us a clue :

(I suspect commit 61220ab349485d911083d0b7990ccd3db6c63297 vti6: Enable namespace changing
was wrong, since vti tunnels have t->net assigned to a struct net without holding a reference)

So we end up freeing a struct net (and associated resources) too soon.


BUG: KASAN: slab-out-of-bounds in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
BUG: KASAN: slab-out-of-bounds in queued_spin_trylock include/asm-generic/qspinlock.h:69 [inline]
BUG: KASAN: slab-out-of-bounds in do_raw_spin_trylock+0x6a/0x180 kernel/locking/spinlock_debug.c:119
Read of size 4 at addr ffff888066405d9c by task syz-executor.4/10575

CPU: 0 PID: 10575 Comm: syz-executor.4 Not tainted 5.0.0-rc6+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 check_memory_region_inline mm/kasan/generic.c:185 [inline]
 check_memory_region+0x123/0x190 mm/kasan/generic.c:191
 kasan_check_read+0x11/0x20 mm/kasan/common.c:100
 atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
 queued_spin_trylock include/asm-generic/qspinlock.h:69 [inline]
 do_raw_spin_trylock+0x6a/0x180 kernel/locking/spinlock_debug.c:119
 __raw_spin_trylock include/linux/spinlock_api_smp.h:89 [inline]
 _raw_spin_trylock+0x1c/0x80 kernel/locking/spinlock.c:128
 spin_trylock include/linux/spinlock.h:339 [inline]
 icmp_xmit_lock net/ipv4/icmp.c:219 [inline]
 icmp_send+0x54c/0x1400 net/ipv4/icmp.c:665
 ipv4_link_failure+0x2c/0x210 net/ipv4/route.c:1187
 dst_link_failure include/net/dst.h:427 [inline]
 vti6_xmit net/ipv6/ip6_vti.c:514 [inline]
 vti6_tnl_xmit+0x10db/0x1c6e net/ipv6/ip6_vti.c:553
 __netdev_start_xmit include/linux/netdevice.h:4385 [inline]
 netdev_start_xmit include/linux/netdevice.h:4394 [inline]
 xmit_one net/core/dev.c:3278 [inline]
 dev_hard_start_xmit+0x1b2/0x980 net/core/dev.c:3294
 __dev_queue_xmit+0x26e5/0x2fe0 net/core/dev.c:3864
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3897
 neigh_direct_output+0x16/0x20 net/core/neighbour.c:1516
 neigh_output include/net/neighbour.h:508 [inline]
 ip_finish_output2+0x949/0x1740 net/ipv4/ip_output.c:229
 ip_finish_output+0x73c/0xd50 net/ipv4/ip_output.c:317
 NF_HOOK_COND include/linux/netfilter.h:278 [inline]
 ip_output+0x21f/0x670 net/ipv4/ip_output.c:405
 dst_output include/net/dst.h:444 [inline]
 ip_local_out+0xc4/0x1b0 net/ipv4/ip_output.c:124
 __ip_queue_xmit+0x86f/0x1bf0 net/ipv4/ip_output.c:505
 ip_queue_xmit+0x5a/0x70 include/net/ip.h:198
 __tcp_transmit_skb+0x1a5f/0x3680 net/ipv4/tcp_output.c:1160
 tcp_transmit_skb net/ipv4/tcp_output.c:1176 [inline]
 tcp_write_xmit+0xe89/0x5160 net/ipv4/tcp_output.c:2401
 __tcp_push_pending_frames+0xb4/0x350 net/ipv4/tcp_output.c:2577
 tcp_send_fin+0x149/0xbb0 net/ipv4/tcp_output.c:3122
 tcp_close+0xddf/0x10c0 net/ipv4/tcp.c:2405
 inet_release+0x105/0x1f0 net/ipv4/af_inet.c:428
 __sock_release+0xd3/0x250 net/socket.c:579
 sock_close+0x1b/0x30 net/socket.c:1139
 __fput+0x2df/0x8d0 fs/file_table.c:278
 ____fput+0x16/0x20 fs/file_table.c:309
 task_work_run+0x14a/0x1c0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_usermode_loop+0x273/0x2c0 arch/x86/entry/common.c:166
 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
 do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
 do_fast_syscall_32+0xa9d/0xc98 arch/x86/entry/common.c:397
 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7fe8869
Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:000000000845fdac EFLAGS: 00000216 ORIG_RAX: 0000000000000006
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
RDX: 0000000000000005 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Allocated by task 9609:
 save_stack+0x45/0xd0 mm/kasan/common.c:73
 set_track mm/kasan/common.c:85 [inline]
 __kasan_kmalloc mm/kasan/common.c:496 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469
 kasan_kmalloc mm/kasan/common.c:504 [inline]
 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:411
 kmem_cache_alloc_node+0x144/0x710 mm/slab.c:3633
 alloc_task_struct_node kernel/fork.c:158 [inline]
 dup_task_struct kernel/fork.c:845 [inline]
 copy_process.part.0+0x1d08/0x79a0 kernel/fork.c:1753
 copy_process kernel/fork.c:1710 [inline]
 _do_fork+0x257/0xfe0 kernel/fork.c:2227
 __do_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:240 [inline]
 __se_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:236 [inline]
 __ia32_compat_sys_x86_clone+0xbc/0x140 arch/x86/ia32/sys_ia32.c:236
 do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline]
 do_int80_syscall_32+0x14d/0x670 arch/x86/entry/common.c:349
 entry_INT80_compat+0x76/0x80 arch/x86/entry/entry_64_compat.S:413

Freed by task 9627:
 save_stack+0x45/0xd0 mm/kasan/common.c:73
 set_track mm/kasan/common.c:85 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466
 __cache_free mm/slab.c:3487 [inline]
 kmem_cache_free+0x86/0x260 mm/slab.c:3749
 free_task_struct kernel/fork.c:163 [inline]
 free_task+0xdd/0x120 kernel/fork.c:458
 __put_task_struct+0x20a/0x4e0 kernel/fork.c:731
 put_task_struct include/linux/sched/task.h:98 [inline]
 delayed_put_task_struct+0x1fd/0x350 kernel/exit.c:181
 __rcu_reclaim kernel/rcu/rcu.h:240 [inline]
 rcu_do_batch kernel/rcu/tree.c:2452 [inline]
 invoke_rcu_callbacks kernel/rcu/tree.c:2773 [inline]
 rcu_process_callbacks+0x928/0x1390 kernel/rcu/tree.c:2754
 __do_softirq+0x266/0x95a kernel/softirq.c:292

The buggy address belongs to the object at ffff888066404540
 which belongs to the cache task_struct(81:syz5) of size 6080
The buggy address is located 156 bytes to the right of
 6080-byte region [ffff888066404540, ffff888066405d00)
The buggy address belongs to the page:
page:ffffea0001990100 count:1 mapcount:0 mapping:ffff888092e85080 index:0x0 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea00026efe08 ffffea0002554f08 ffff888092e85080
raw: 0000000000000000 ffff888066404540 0000000100000001 ffff8880602fe480
page dumped because: kasan: bad access detected
page->mem_cgroup:ffff8880602fe480

Memory state around the buggy address:
 ffff888066405c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888066405d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888066405d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                            ^
 ffff888066405e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888066405e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
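
The figures in the report above can be cross-checked with a little address arithmetic. The constants are copied from the report; the function names are made up for this sketch, and the 8-byte granule is the one used by generic KASAN, where each shadow byte describes 8 bytes of memory.

```c
#include <assert.h>
#include <stdint.h>

#define OBJ_START     0xffff888066404540ULL	/* start of the task_struct object */
#define OBJ_END       0xffff888066405d00ULL	/* end of the 6080-byte region */
#define BAD_ADDR      0xffff888066405d9cULL	/* address of the bad read */
#define ROW_START     0xffff888066405d80ULL	/* shadow row marked with '>' */
#define KASAN_GRANULE 8ULL			/* bytes covered per shadow byte */

/* Size of the object the address belongs to. */
static uint64_t obj_size(void)
{
	return OBJ_END - OBJ_START;
}

/* Distance of the bad access past the end of the object. */
static uint64_t bytes_past_end(void)
{
	return BAD_ADDR - OBJ_END;
}

/* Zero-based index of the shadow byte within the marked row. */
static uint64_t shadow_index_in_row(void)
{
	return (BAD_ADDR - ROW_START) / KASAN_GRANULE;
}
```

The results match the report: a 6080-byte object, an access 156 bytes to the right of it, and a shadow byte in the fourth position of the marked row, which holds fc (a redzone), not a freed or valid byte.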


* [PATCH net-next 1/2] trace: events: add a few neigh tracepoints
From: Roopa Prabhu @ 2019-02-14 17:15 UTC
  To: davem; +Cc: netdev, dsa
In-Reply-To: <1550164511-21195-1-git-send-email-roopa@cumulusnetworks.com>

From: Roopa Prabhu <roopa@cumulusnetworks.com>

The goal here is to trace neigh state changes covering all possible
neigh update paths. Plus have a specific trace point in neigh_update
to cover flags sent to neigh_update.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 include/trace/events/neigh.h | 204 +++++++++++++++++++++++++++++++++++++++++++
 net/core/net-traces.c        |   8 ++
 2 files changed, 212 insertions(+)
 create mode 100644 include/trace/events/neigh.h

diff --git a/include/trace/events/neigh.h b/include/trace/events/neigh.h
new file mode 100644
index 0000000..ed10353
--- /dev/null
+++ b/include/trace/events/neigh.h
@@ -0,0 +1,204 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM neigh
+
+#if !defined(_TRACE_NEIGH_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NEIGH_H
+
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/tracepoint.h>
+#include <net/neighbour.h>
+
+#define neigh_state_str(state)				\
+	__print_symbolic(state,				\
+		{ NUD_INCOMPLETE, "incomplete" },	\
+		{ NUD_REACHABLE, "reachable" },		\
+		{ NUD_STALE, "stale" },			\
+		{ NUD_DELAY, "delay" },			\
+		{ NUD_PROBE, "probe" },			\
+		{ NUD_FAILED, "failed" })
+
+TRACE_EVENT(neigh_update,
+
+	TP_PROTO(struct neighbour *n, const u8 *lladdr, u8 new,
+		 u32 flags, u32 nlmsg_pid),
+
+	TP_ARGS(n, lladdr, new, flags, nlmsg_pid),
+
+	TP_STRUCT__entry(
+		__field(u32, family)
+		__string(dev, (n->dev ? n->dev->name : "NULL"))
+		__array(u8, lladdr, MAX_ADDR_LEN)
+		__field(u8, lladdr_len)
+		__field(u8, flags)
+		__field(u8, nud_state)
+		__field(u8, type)
+		__field(u8, dead)
+		__field(int, refcnt)
+		__array(__u8, primary_key4, 4)
+		__array(__u8, primary_key6, 16)
+		__field(unsigned long, confirmed)
+		__field(unsigned long, updated)
+		__field(unsigned long, used)
+		__array(u8, new_lladdr, MAX_ADDR_LEN)
+		__field(u8, new_state)
+		__field(u32, update_flags)
+		__field(u32, pid)
+	),
+
+	TP_fast_assign(
+		int lladdr_len = (n->dev ? n->dev->addr_len : MAX_ADDR_LEN);
+		struct in6_addr *pin6;
+		__be32 *p32;
+
+		__entry->family = n->tbl->family;
+		__assign_str(dev, (n->dev ? n->dev->name : "NULL"));
+		__entry->lladdr_len = lladdr_len;
+		memcpy(__entry->lladdr, n->ha, lladdr_len);
+		__entry->flags = n->flags;
+		__entry->nud_state = n->nud_state;
+		__entry->type = n->type;
+		__entry->dead = n->dead;
+		__entry->refcnt = refcount_read(&n->refcnt);
+		pin6 = (struct in6_addr *)__entry->primary_key6;
+		p32 = (__be32 *)__entry->primary_key4;
+
+		if (n->tbl->family == AF_INET)
+			*p32 = *(__be32 *)n->primary_key;
+		else
+			*p32 = 0;
+
+#if IS_ENABLED(CONFIG_IPV6)
+		if (n->tbl->family == AF_INET6) {
+			pin6 = (struct in6_addr *)__entry->primary_key6;
+			*pin6 = *(struct in6_addr *)n->primary_key;
+		} else
+#endif
+		{
+			ipv6_addr_set_v4mapped(*p32, pin6);
+		}
+		__entry->confirmed = n->confirmed;
+		__entry->updated = n->updated;
+		__entry->used = n->used;
+		if (lladdr)
+			memcpy(__entry->new_lladdr, lladdr, lladdr_len);
+		__entry->new_state = new;
+		__entry->update_flags = flags;
+		__entry->pid = nlmsg_pid;
+	),
+
+	TP_printk("family %d dev %s lladdr %s flags %02x nud_state %s type %02x "
+		  "dead %d refcnt %d primary_key4 %pI4 primary_key6 %pI6c "
+		  "confirmed %lu updated %lu used %lu new_lladdr %s "
+		  "new_state %02x update_flags %02x pid %d",
+		  __entry->family, __get_str(dev),
+		  __print_hex_str(__entry->lladdr, __entry->lladdr_len),
+		  __entry->flags, neigh_state_str(__entry->nud_state),
+		  __entry->type, __entry->dead, __entry->refcnt,
+		  __entry->primary_key4, __entry->primary_key6,
+		  __entry->confirmed, __entry->updated, __entry->used,
+		  __print_hex_str(__entry->new_lladdr, __entry->lladdr_len),
+		  __entry->new_state,
+		  __entry->update_flags, __entry->pid)
+);
+
+DECLARE_EVENT_CLASS(neigh__update,
+	TP_PROTO(struct neighbour *n, int err),
+	TP_ARGS(n, err),
+	TP_STRUCT__entry(
+		__field(u32, family)
+		__string(dev, (n->dev ? n->dev->name : "NULL"))
+		__array(u8, lladdr, MAX_ADDR_LEN)
+		__field(u8, lladdr_len)
+		__field(u8, flags)
+		__field(u8, nud_state)
+		__field(u8, type)
+		__field(u8, dead)
+		__field(int, refcnt)
+		__array(__u8, primary_key4, 4)
+		__array(__u8, primary_key6, 16)
+		__field(unsigned long, confirmed)
+		__field(unsigned long, updated)
+		__field(unsigned long, used)
+		__field(u32, err)
+	),
+
+	TP_fast_assign(
+		int lladdr_len = (n->dev ? n->dev->addr_len : MAX_ADDR_LEN);
+		struct in6_addr *pin6;
+		__be32 *p32;
+
+		__entry->family = n->tbl->family;
+		__assign_str(dev, (n->dev ? n->dev->name : "NULL"));
+		__entry->lladdr_len = lladdr_len;
+		memcpy(__entry->lladdr, n->ha, lladdr_len);
+		__entry->flags = n->flags;
+		__entry->nud_state = n->nud_state;
+		__entry->type = n->type;
+		__entry->dead = n->dead;
+		__entry->refcnt = refcount_read(&n->refcnt);
+		pin6 = (struct in6_addr *)__entry->primary_key6;
+		p32 = (__be32 *)__entry->primary_key4;
+
+		if (n->tbl->family == AF_INET)
+			*p32 = *(__be32 *)n->primary_key;
+		else
+			*p32 = 0;
+
+#if IS_ENABLED(CONFIG_IPV6)
+		if (n->tbl->family == AF_INET6) {
+			pin6 = (struct in6_addr *)__entry->primary_key6;
+			*pin6 = *(struct in6_addr *)n->primary_key;
+		} else
+#endif
+		{
+			ipv6_addr_set_v4mapped(*p32, pin6);
+		}
+
+		__entry->confirmed = n->confirmed;
+		__entry->updated = n->updated;
+		__entry->used = n->used;
+		__entry->err = err;
+	),
+
+	TP_printk("family %d dev %s lladdr %s flags %02x nud_state %s type %02x "
+		  "dead %d refcnt %d primary_key4 %pI4 primary_key6 %pI6c "
+		  "confirmed %lu updated %lu used %lu err %d",
+		  __entry->family, __get_str(dev),
+		  __print_hex_str(__entry->lladdr, __entry->lladdr_len),
+		  __entry->flags, neigh_state_str(__entry->nud_state),
+		  __entry->type, __entry->dead, __entry->refcnt,
+		  __entry->primary_key4, __entry->primary_key6,
+		  __entry->confirmed, __entry->updated, __entry->used,
+		  __entry->err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_update_done,
+	TP_PROTO(struct neighbour *neigh, int err),
+	TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_timer_handler,
+	TP_PROTO(struct neighbour *neigh, int err),
+	TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_event_send_done,
+	TP_PROTO(struct neighbour *neigh, int err),
+	TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_event_send_dead,
+	TP_PROTO(struct neighbour *neigh, int err),
+	TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_cleanup_and_release,
+	TP_PROTO(struct neighbour *neigh, int rc),
+	TP_ARGS(neigh, rc)
+);
+
+#endif /* _TRACE_NEIGH_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/net/core/net-traces.c b/net/core/net-traces.c
index 419af6d..470b179 100644
--- a/net/core/net-traces.c
+++ b/net/core/net-traces.c
@@ -43,6 +43,14 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(fdb_delete);
 EXPORT_TRACEPOINT_SYMBOL_GPL(br_fdb_update);
 #endif
 
+#include <trace/events/neigh.h>
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_update);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_update_done);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_timer_handler);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_event_send_done);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_event_send_dead);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_cleanup_and_release);
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(napi_poll);
-- 
2.1.4
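
For readers following the TP_fast_assign block above: when the neighbour is not an IPv6 entry, primary_key6 is filled with the IPv4-mapped form of the key (::ffff:a.b.c.d). The following is a minimal userspace sketch of that byte layout, not the kernel helper itself; struct key6 and key6_set_v4mapped() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for the 16-byte primary_key6 array. */
struct key6 {
	uint8_t b[16];
};

/*
 * Build an IPv4-mapped IPv6 address (::ffff:a.b.c.d) from four IPv4
 * address bytes in network order. This mirrors the layout that the
 * kernel's ipv6_addr_set_v4mapped() produces.
 */
static void key6_set_v4mapped(const uint8_t v4[4], struct key6 *out)
{
	assert(v4 && out);
	memset(out->b, 0, 10);		/* bytes 0..9 are zero */
	out->b[10] = 0xff;		/* bytes 10..11 are 0xffff */
	out->b[11] = 0xff;
	memcpy(&out->b[12], v4, 4);	/* bytes 12..15 carry the IPv4 address */
}
```

With this layout a single %pI6c field in the trace output can carry either address family, which is presumably why the event records both primary_key4 and primary_key6.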



* [PATCH net-next 0/2] tracepoints in neighbor subsystem
From: Roopa Prabhu @ 2019-02-14 17:15 UTC (permalink / raw)
  To: davem; +Cc: netdev, dsa

From: Roopa Prabhu <roopa@cumulusnetworks.com>

Roopa Prabhu (2):
  trace: events: add a few neigh tracepoints
  neigh: hook tracepoints in neigh update code

 include/trace/events/neigh.h | 213 +++++++++++++++++++++++++++++++++++++++++++
 net/core/neighbour.c         |  11 +++
 net/core/net-traces.c        |   8 ++
 3 files changed, 232 insertions(+)
 create mode 100644 include/trace/events/neigh.h

-- 
2.1.4



* Re: [RESEND PATCH net] mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
From: David Miller @ 2019-02-14 17:13 UTC (permalink / raw)
  To: jannh
  Cc: netdev, linux-mm, linux-kernel, mhocko, vbabka, pavel.tatashin,
	osalvador, mgorman, aaron.lu, alexander.h.duyck
In-Reply-To: <20190213214559.125666-1-jannh@google.com>

From: Jann Horn <jannh@google.com>
Date: Wed, 13 Feb 2019 22:45:59 +0100

> The basic idea behind ->pagecnt_bias is: If we pre-allocate the maximum
> number of references that we might need to create in the fastpath later,
> the bump-allocation fastpath only has to modify the non-atomic bias value
> that tracks the number of extra references we hold instead of the atomic
> refcount. The maximum number of allocations we can serve (under the
> assumption that no allocation is made with size 0) is nc->size, so that's
> the bias used.
> 
> However, even when all memory in the allocation has been given away, a
> reference to the page is still held; and in the `offset < 0` slowpath, the
> page may be reused if everyone else has dropped their references.
> This means that the necessary number of references is actually
> `nc->size+1`.
> 
> Luckily, from a quick grep, it looks like the only path that can call
> page_frag_alloc(fragsz=1) is TAP with the IFF_NAPI_FRAGS flag, which
> requires CAP_NET_ADMIN in the init namespace and is only intended to be
> used for kernel testing and fuzzing.
> 
> To test for this issue, put a `WARN_ON(page_ref_count(page) == 0)` in the
> `offset < 0` path, below the virt_to_page() call, and then repeatedly call
> writev() on a TAP device with IFF_TAP|IFF_NO_PI|IFF_NAPI_FRAGS|IFF_NAPI,
> with a vector consisting of 15 elements containing 1 byte each.
> 
> Signed-off-by: Jann Horn <jannh@google.com>

Applied and queued up for -stable.
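
The accounting described in the quoted commit message can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not the page_frag_alloc() code: struct frag_cache and the three helpers are hypothetical names, every allocation is one byte, and the `extra` parameter selects between a bias of size (extra = 0, the old behaviour) and size + 1 (extra = 1, the value the message says is needed).

```c
#include <assert.h>

/* Hypothetical model of a page fragment cache backed by one page. */
struct frag_cache {
	int refcount;	/* models the atomic page refcount */
	int bias;	/* references still owned by the cache itself */
	int offset;	/* bytes not yet handed out */
};

/* Take a fresh page and pre-allocate references for the fast path. */
static void cache_refill(struct frag_cache *c, int size, int extra)
{
	assert(c && size > 0);
	c->refcount = size + extra;	/* 1 from allocation plus the bulk add */
	c->bias = size + extra;
	c->offset = size;
}

/* Fast path for a 1-byte fragment: only the non-atomic bias changes. */
static int cache_alloc_byte(struct frag_cache *c)
{
	if (c->offset < 1)
		return -1;	/* the real code would enter the slow path here */
	c->offset -= 1;
	c->bias -= 1;		/* one pre-allocated reference now belongs to the caller */
	return 0;
}

/* A fragment owner drops its reference to the page. */
static void frag_free(struct frag_cache *c)
{
	c->refcount -= 1;
}
```

In this model, filling a 4-byte cache with extra = 0, handing out four fragments and freeing all of them leaves refcount at 0 while the cache still points at the page. With extra = 1 the cache keeps one reference of its own, so the slow path can safely inspect the page.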


* Re: [PATCH net-next] net: ip6_gre: Give ERSPAN a fill_info link op of its own
From: David Miller @ 2019-02-14 17:08 UTC (permalink / raw)
  To: petrm; +Cc: netdev, kuznet, yoshfuji, lorenzo.bianconi
In-Reply-To: <c14a9085e87ca9e36ba7f5feea46e5750a5baeeb.1550086179.git.petrm@mellanox.com>

From: Petr Machata <petrm@mellanox.com>
Date: Wed, 13 Feb 2019 19:31:32 +0000

> In commit c706863bc890 ("net: ip6_gre: always reports o_key to
> userspace"), ip6gre and ip6gretap tunnels started reporting a TUNNEL_KEY
> output flag even if one was not configured at the device.
> 
> When an okey-less ip6gre or ip6gretap netdevice is created, it initially
> encapsulates the packets without okey. But any configuration change
> (even a non-change such as setting TOS to an already-configured value)
> then causes the okey flag from the reported configuration to be
> circulated back to actual configuration. From that point on, the device
> encapsulates packets with output key of 0.
> 
> The intention was to implement this behavior for ERSPAN devices, not for
> all ip6gre devices. The ERSPAN netdevice should really have its own
> fill_info callback. Add one.
> 
> Fixes: c706863bc890 ("net: ip6_gre: always reports o_key to userspace")
> CC: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> Signed-off-by: Petr Machata <petrm@mellanox.com>

The commit you are fixing exists in the 'net' tree, therefore this is
a bug fix and should be targeted at 'net'.
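
The feedback loop described in the quoted commit message can be modelled in a few lines. This is a simplified sketch and not the ip6_gre code: MODEL_TUNNEL_KEY, struct model_tunnel, model_fill_info() and model_changelink() are all hypothetical, and it assumes the configuring tool does a read-modify-write, sending back whatever flags the kernel last reported.

```c
#include <assert.h>

#define MODEL_TUNNEL_KEY 0x4u	/* illustrative flag bit, not the kernel constant */

/* Hypothetical tunnel configuration holding only the output flags. */
struct model_tunnel {
	unsigned int o_flags;
};

/* Report the configuration; optionally always advertise the key flag. */
static unsigned int model_fill_info(const struct model_tunnel *t,
				    int always_report_key)
{
	assert(t);
	return t->o_flags | (always_report_key ? MODEL_TUNNEL_KEY : 0u);
}

/* Apply a configuration change built from previously reported flags. */
static void model_changelink(struct model_tunnel *t, unsigned int reported)
{
	assert(t);
	t->o_flags = reported;
}
```

Starting from o_flags = 0, one model_changelink(&t, model_fill_info(&t, 1)) round trip sets the key flag even though nothing was changed on purpose, which matches the "non-change" case in the message. With always_report_key = 0 the flags stay at 0.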


* Re: [PATCH net 0/2] net: phy: fix locking issue
From: David Miller @ 2019-02-14 17:05 UTC (permalink / raw)
  To: hkallweit1; +Cc: andrew, f.fainelli, linux, netdev
In-Reply-To: <2a39271d-3b9e-e425-98b4-b2a24074e806@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Wed, 13 Feb 2019 20:10:36 +0100

> Russell pointed out that the locking used in phy_is_started() isn't
> needed and is misleading. This locking also contributes to a race fixed
> with patch 2.

Series applied and queued up for -stable, thanks.


* Re: [PATCH 5/9] perf, bpf: save bpf_prog_info in a rbtree in perf_env
From: Song Liu @ 2019-02-14 17:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Netdev, linux-kernel, ast@kernel.org, daniel@iogearbox.net,
	Kernel Team, peterz@infradead.org, acme@redhat.com
In-Reply-To: <20190214123311.GA7465@krava>



> On Feb 14, 2019, at 4:33 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Fri, Feb 08, 2019 at 05:17:01PM -0800, Song Liu wrote:
>> bpf_prog_info contains information necessary to annotate bpf programs.
>> This patch saves bpf_prog_info for bpf programs loaded in the system.
>> 
>> perf-record saves bpf_prog_info information as headers to perf.data.
>> A new header type HEADER_BPF_PROG_INFO is introduced for this data.
> 
> please move those 2 changes into separate patches then

Do you mean one patch to save data in rbtree, then a separate patch 
to save data in perf.data file?

Thanks,
Song

> 
> it's hard to make comments when I don't see the rest of
> the patches on the list; please resend the patchset
> 
> thanks,
> jirka



* Re: [PATCH net] selftests: fix timestamping Makefile
From: David Miller @ 2019-02-14 17:03 UTC (permalink / raw)
  To: deepa.kernel; +Cc: shuah, willemb, netdev, linux-kselftest
In-Reply-To: <20190213170914.11991-1-deepa.kernel@gmail.com>

From: Deepa Dinamani <deepa.kernel@gmail.com>
Date: Wed, 13 Feb 2019 09:09:13 -0800

> The clean target in the makefile conflicts with the generic
> kselftests lib.mk, and fails to properly remove the compiled
> test programs.
> 
> Remove the redundant rule; the TEST_GEN_FILES will already be
> removed by the CLEAN macro in lib.mk.
> 
> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>

Applied, thank you.


* Re: [PATCH 5/9] perf, bpf: save bpf_prog_info in a rbtree in perf_env
From: Song Liu @ 2019-02-14 17:01 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Netdev, linux-kernel@vger.kernel.org, ast@kernel.org,
	daniel@iogearbox.net, Kernel Team, peterz@infradead.org,
	acme@redhat.com
In-Reply-To: <20190214122638.GD26714@krava>



> On Feb 14, 2019, at 4:26 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Fri, Feb 08, 2019 at 05:17:01PM -0800, Song Liu wrote:
> 
> SNIP
> 
>> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
>> index d01b8355f4ca..5894a177b7cf 100644
>> --- a/tools/perf/util/env.h
>> +++ b/tools/perf/util/env.h
>> @@ -3,7 +3,10 @@
>> #define __PERF_ENV_H
>> 
>> #include <linux/types.h>
>> +#include <linux/rbtree.h>
>> #include "cpumap.h"
>> +#include "rwsem.h"
>> +#include "bpf-event.h"
>> 
>> struct cpu_topology_map {
>> 	int	socket_id;
>> @@ -64,6 +67,8 @@ struct perf_env {
>> 	struct memory_node	*memory_nodes;
>> 	unsigned long long	 memory_bsize;
>> 	u64                     clockid_res_ns;
>> +	struct rw_semaphore	bpf_info_lock;
> 
> why's the lock needed?
> 
> jirka

It protects the rbtrees for bpf_prog_info and btf. For perf-top,
we will have one thread writing to the trees while the main
thread reads from them.

Let me add comments to clarify. 

Thanks,
Song


* Re: [PATCH net] net: stmmac: Fix NAPI poll in TX path when in multi-queue
From: David Miller @ 2019-02-14 17:01 UTC (permalink / raw)
  To: jose.abreu
  Cc: netdev, linux-kernel, joao.pinto, peppe.cavallaro,
	alexandre.torgue
In-Reply-To: <a264c48823687434e4d18aeb5830707e00c64250.1550077162.git.joabreu@synopsys.com>

From: Jose Abreu <jose.abreu@synopsys.com>
Date: Wed, 13 Feb 2019 18:00:43 +0100

> Commit 8fce33317023 introduced the concept of NAPI per-channel and
> independent cleaning of TX path.
> 
> This is currently breaking performance in some cases. The scenario
> happens when all packets are being received in Queue 0 but the TX is
> performed in Queue != 0.
> 
> I didn't look very deeply, but it seems that NAPI for Queue 0 will clean
> the RX path but, as TX is in a different NAPI, this last one is called at a
> slower rate, which kills performance in TX. I suspect this is because TX
> cleaning takes much longer than RX and because NAPI will get canceled
> once we return with 0 budget consumed (e.g. when TX is still not done it
> will return 0 budget).
> 
> Fix this by looking at all TX channels in NAPI poll function.
> 
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
> Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")

No this isn't right.

The TX interrupt events for Queue != 0 should clean up the TX packets
on those queues.

Furthermore you are breaking the locality of the TX processing.

I'm not applying this, sorry.


* RE: [PATCH net-next 2/3] arm64: dts: fsl: ls1028a-rdb: Add ENETC external eth ports for the LS1028A RDB board
From: Claudiu Manoil @ 2019-02-14 17:00 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Shawn Guo, Leo Li, David S . Miller, devicetree@vger.kernel.org,
	Alexandru Marginean, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org
In-Reply-To: <20190214162746.GI708@lunn.ch>

>-----Original Message-----
>From: Andrew Lunn <andrew@lunn.ch>
>Sent: Thursday, February 14, 2019 6:28 PM
>To: Claudiu Manoil <claudiu.manoil@nxp.com>
>Cc: Shawn Guo <shawnguo@kernel.org>; Leo Li <leoyang.li@nxp.com>; David S .
>Miller <davem@davemloft.net>; devicetree@vger.kernel.org; Alexandru
>Marginean <alexandru.marginean@nxp.com>; linux-kernel@vger.kernel.org;
>linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org
>Subject: Re: [PATCH net-next 2/3] arm64: dts: fsl: ls1028a-rdb: Add ENETC
>external eth ports for the LS1028A RDB board
>
>> Hi Andrew,
>>
>> The extra node for mdio seems to complicate things somewhat.
>> Just adding this node does not seem to be enough.  How can I easily
>> find out if a child of an enetc port node is an mdio node?
>
>You copy somebody else's code :-)
>

Provided you find the right thing to copy : ) . Thanks for the hint.


* Re: [PATCH -next] net: ipvlan_l3s: fix kconfig dependency warning
From: David Miller @ 2019-02-14 16:59 UTC (permalink / raw)
  To: rdunlap; +Cc: netdev, maheshb, daniel
In-Reply-To: <204a7785-a1d2-e714-653e-2cb19e36f279@infradead.org>

From: Randy Dunlap <rdunlap@infradead.org>
Date: Wed, 13 Feb 2019 08:55:02 -0800

> From: Randy Dunlap <rdunlap@infradead.org>
> 
> Fix the kconfig warning in IPVLAN_L3S when neither INET nor IPV6
> is enabled:
> 
> WARNING: unmet direct dependencies detected for NET_L3_MASTER_DEV
>   Depends on [n]: NET [=y] && (INET [=n] || IPV6 [=n])
>   Selected by [y]:
>   - IPVLAN_L3S [=y] && NETDEVICES [=y] && NET_CORE [=y] && NETFILTER [=y]
> 
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
> Cc: Mahesh Bandewar <maheshb@google.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> ---
> v2: simplify the dependency to IPVLAN

Applied, thanks Randy.


