Netdev List


* Re: [PATCH iproute2] lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg()
From: Eric Dumazet @ 2019-02-13  2:08 UTC (permalink / raw)
  To: David Ahern, Eric Dumazet, Stephen Hemminger
  Cc: netdev, Hangbin Liu, Phil Sutter
In-Reply-To: <b42f0dcb-3c8c-9797-a9f1-da71642e26cc@gmail.com>



On 02/12/2019 06:04 PM, David Ahern wrote:
> On 2/12/19 6:58 PM, Eric Dumazet wrote:
>> In the past, we tried to increase the buffer size up to 32 KB in order
>> to reduce number of syscalls per dump.
>>
>> Commit 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough")
>> brought the size back to 4KB because the kernel can not know the application
>> is ready to receive bigger requests.
>>
>> See kernel commits 9063e21fb026 ("netlink: autosize skb lengthes") and
>> d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()")
>> for more details.
>>
>> Fixes: 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough")
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>> Cc: Hangbin Liu <liuhangbin@gmail.com>
>> Cc: Phil Sutter <phil@nwl.cc>
>> ---
>>  lib/libnetlink.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
>> index 1892a02ab5d0d73776c9882ffc77edcd2c663d01..0d48a3d43cf03065dacbd419578ab10af56431a4 100644
>> --- a/lib/libnetlink.c
>> +++ b/lib/libnetlink.c
>> @@ -718,6 +718,8 @@ static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
>>  	if (len < 0)
>>  		return len;
>>  
>> +	if (len < 32768)
>> +		len = 32768;
>>  	buf = malloc(len);
>>  	if (!buf) {
>>  		fprintf(stderr, "malloc error: not enough buffer\n");
>>
> 
> I believe that negates the whole point of 2d34851cd341 - which I have no
> problem with. 2 recvmsg calls per message is overkill.
> 

It does not negate the point at all.

The main point was to eventually be able to allocate more than 32KB.

We need a minimum size of 32KB so that the kernel can cook reasonably sized skbs.

Because trying to allocate 4KB only in 2019 is kind of stupid...

( Especially considering ss currently buffers the whole thing before calling render() !!! )
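
For illustration, a minimal userspace sketch of the peek-then-allocate
pattern with a 32KB floor. clamp_recv_len() and recv_with_floor() are
hypothetical names, not iproute2 functions; the actual change is only the
two-line hunk in rtnl_recvmsg() quoted above. The sketch assumes a Linux
datagram-style socket where MSG_PEEK | MSG_TRUNC reports the full length
of the pending message.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define MIN_RECV_BUF 32768	/* the 32KB floor proposed in the patch */

/* Hypothetical helper: never size the buffer below MIN_RECV_BUF. */
static size_t clamp_recv_len(ssize_t peeked)
{
	return peeked < MIN_RECV_BUF ? MIN_RECV_BUF : (size_t)peeked;
}

/*
 * Hypothetical helper: peek at the next message to learn its length,
 * then read it into a buffer of at least MIN_RECV_BUF bytes.
 * Returns the number of bytes read and stores the malloc'ed buffer in
 * *answer, or returns a negative value on error.
 */
static ssize_t recv_with_floor(int fd, char **answer)
{
	struct iovec iov = { .iov_base = NULL, .iov_len = 0 };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	ssize_t len;
	size_t size;
	char *buf;

	/* First call: nothing is consumed, only the length is reported. */
	len = recvmsg(fd, &msg, MSG_PEEK | MSG_TRUNC);
	if (len < 0)
		return len;

	size = clamp_recv_len(len);
	buf = malloc(size);
	if (!buf)
		return -1;

	/* Second call: the buffer size passed here is what the kernel sees. */
	iov.iov_base = buf;
	iov.iov_len = size;
	len = recvmsg(fd, &msg, 0);
	if (len < 0) {
		free(buf);
		return len;
	}

	*answer = buf;
	return len;
}
```

The floor only changes the size handed to the second recvmsg(); messages
larger than 32KB still get a buffer of their peeked length.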

> Do we know of any single message sizes > 32k? 2d34851cd341 cites
> increasing VF's but at some point there is a limit. If not, the whole
> PEEK thing should go away and we just malloc 32k (or 64k) buffers for
> each recvmsg.
> 



* Re: [bpf-next 1/2] tcp: replace SOCK_DEBUG() with tcp_stats()
From: Yafang Shao @ 2019-02-13  2:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Daniel Borkmann, ast, Yonghong Song, brakmo, Eric Dumazet,
	David Miller, netdev, LKML, shaoyafang
In-Reply-To: <38d07cb3-b767-bfc4-9ae5-48367971d839@gmail.com>

On Tue, Feb 12, 2019 at 11:07 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 02/12/2019 03:31 AM, Yafang Shao wrote:
> > SOCK_DEBUG is a very ancient debugging interface, and it's not very useful
> > for debugging.
> > So this patch removes the SOCK_DEBUG() and introduce a new function
> > tcp_stats() to trace this kind of events.
> > Some MIBs are added for these events.
> >
> > Regarding the SO_DEBUG in sock_{s,g}etsockopt, I think it is better to
> > keep as-is, because if we return an errno to tell the application that
> > this optname isn't supported for TCP, it may break the application.
> > The application still can use this option but don't take any effect for
> > TCP.
> >
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > ---
> >  include/uapi/linux/snmp.h |  3 +++
> >  net/ipv4/proc.c           |  3 +++
> >  net/ipv4/tcp_input.c      | 26 +++++++++++---------------
> >  net/ipv6/tcp_ipv6.c       |  2 --
> >  4 files changed, 17 insertions(+), 17 deletions(-)
> >
> > diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
> > index 86dc24a..fd5c09c 100644
> > --- a/include/uapi/linux/snmp.h
> > +++ b/include/uapi/linux/snmp.h
> > @@ -283,6 +283,9 @@ enum
> >       LINUX_MIB_TCPACKCOMPRESSED,             /* TCPAckCompressed */
> >       LINUX_MIB_TCPZEROWINDOWDROP,            /* TCPZeroWindowDrop */
> >       LINUX_MIB_TCPRCVQDROP,                  /* TCPRcvQDrop */
> > +     LINUX_MIB_TCPINVALIDACK,                /* TCPInvalidAck */
> > +     LINUX_MIB_TCPOLDACK,                    /* TCPOldAck */
> > +     LINUX_MIB_TCPPARTIALPACKET,             /* TCPPartialPacket */
> >       __LINUX_MIB_MAX
> >  };
> >
> > diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
> > index c3610b3..1b0320a 100644
> > --- a/net/ipv4/proc.c
> > +++ b/net/ipv4/proc.c
> > @@ -291,6 +291,9 @@ static int sockstat_seq_show(struct seq_file *seq, void *v)
> >       SNMP_MIB_ITEM("TCPAckCompressed", LINUX_MIB_TCPACKCOMPRESSED),
> >       SNMP_MIB_ITEM("TCPZeroWindowDrop", LINUX_MIB_TCPZEROWINDOWDROP),
> >       SNMP_MIB_ITEM("TCPRcvQDrop", LINUX_MIB_TCPRCVQDROP),
> > +     SNMP_MIB_ITEM("TCPInvalidAck", LINUX_MIB_TCPINVALIDACK),
> > +     SNMP_MIB_ITEM("TCPOldAck", LINUX_MIB_TCPOLDACK),
> > +     SNMP_MIB_ITEM("TCPPartialPacket", LINUX_MIB_TCPPARTIALPACKET),
> >       SNMP_MIB_SENTINEL
> >  };
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index 7a027dec..88deb1f 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -3554,6 +3554,11 @@ static u32 tcp_newly_delivered(struct sock *sk, u32 prior_delivered, int flag)
> >       return delivered;
> >  }
> >
> > +static void tcp_stats(struct sock *sk, int mib_idx)
> > +{
> > +     NET_INC_STATS(sock_net(sk), mib_idx);
> > +}
>
> This is not a very descriptive name.
>
> Why is it static, and in net/ipv4/tcp_input.c ???
>

Because it is currently only called in net/ipv4/tcp_input.c, I
defined it as static in this file.
The reason I didn't define it as 'static inline' is that I think the
compiler can make a better decision than I can.

In the future it may be called from other files; then we can move it to
a more suitable file.

> > +
> >  /* This routine deals with incoming acks, but not outgoing ones. */
> >  static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
> >  {
> > @@ -3715,7 +3720,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
> >       return 1;
> >
> >  invalid_ack:
> > -     SOCK_DEBUG(sk, "Ack %u after %u:%u\n", ack, tp->snd_una, tp->snd_nxt);
> > +     tcp_stats(sk, LINUX_MIB_TCPINVALIDACK);
> >       return -1;
> >
> >  old_ack:
> > @@ -3731,7 +3736,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
> >               tcp_xmit_recovery(sk, rexmit);
> >       }
> >
> > -     SOCK_DEBUG(sk, "Ack %u before %u:%u\n", ack, tp->snd_una, tp->snd_nxt);
> > +     tcp_stats(sk, LINUX_MIB_TCPOLDACK);
> >       return 0;
> >  }
> >
>
>
> These counters will add noise to an already crowded MIB space.
>

I have another idea: we could define some TCP BPF events to replace
these MIB counters, something like:
    #define BPF_EVENT_TCP_OLDACK 1
    #define BPF_EVENT_TCP_PARTIALPACKET 2
    ...
Maybe we could also clean up some MIBs to make the space less crowded.

> What bug do you expect to track and fix with these ?
>

Let me explain the background.
I want to track abnormal TCP behavior in the TCP/IP stack, but I
found there is no good way to do it.
The current MIBs are per netns rather than per socket, which makes them
not very powerful.
The ancient SOCK_DEBUG is not good either.
So we thought: why not clean up this ancient SOCK_DEBUG() and introduce a
more powerful method?

> I see many TCP patches coming adding icache pressure, enabling companies to build their own modified
> TCP stack, but no real meat.
>

Thanks
Yafang


* Re: [PATCH iproute2] lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg()
From: David Ahern @ 2019-02-13  2:04 UTC (permalink / raw)
  To: Eric Dumazet, Stephen Hemminger
  Cc: netdev, Eric Dumazet, Hangbin Liu, Phil Sutter
In-Reply-To: <20190213015841.140383-1-edumazet@google.com>

On 2/12/19 6:58 PM, Eric Dumazet wrote:
> In the past, we tried to increase the buffer size up to 32 KB in order
> to reduce number of syscalls per dump.
> 
> Commit 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough")
> brought the size back to 4KB because the kernel can not know the application
> is ready to receive bigger requests.
> 
> See kernel commits 9063e21fb026 ("netlink: autosize skb lengthes") and
> d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()")
> for more details.
> 
> Fixes: 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Hangbin Liu <liuhangbin@gmail.com>
> Cc: Phil Sutter <phil@nwl.cc>
> ---
>  lib/libnetlink.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
> index 1892a02ab5d0d73776c9882ffc77edcd2c663d01..0d48a3d43cf03065dacbd419578ab10af56431a4 100644
> --- a/lib/libnetlink.c
> +++ b/lib/libnetlink.c
> @@ -718,6 +718,8 @@ static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
>  	if (len < 0)
>  		return len;
>  
> +	if (len < 32768)
> +		len = 32768;
>  	buf = malloc(len);
>  	if (!buf) {
>  		fprintf(stderr, "malloc error: not enough buffer\n");
> 

I believe that negates the whole point of 2d34851cd341 - which I have no
problem with. 2 recvmsg calls per message is overkill.

Do we know of any single message sizes > 32k? 2d34851cd341 cites
increasing VF's but at some point there is a limit. If not, the whole
PEEK thing should go away and we just malloc 32k (or 64k) buffers for
each recvmsg.
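
For illustration, a minimal sketch of the alternative described above: one
recvmsg() per message into a fixed-size buffer, with no MSG_PEEK pass.
recv_fixed() and FIXED_RECV_BUF are hypothetical names, not iproute2 code.
The sketch assumes a datagram-style socket, where the kernel sets
MSG_TRUNC in msg_flags if the message did not fit.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define FIXED_RECV_BUF 32768	/* 32k, as suggested; 64k would work the same way */

/*
 * Hypothetical helper: read one message into a caller-supplied buffer
 * with a single syscall. *truncated is set to 1 when the message was
 * larger than the buffer and its tail was discarded, 0 otherwise.
 * Returns the number of bytes stored, or a negative value on error.
 */
static ssize_t recv_fixed(int fd, char *buf, size_t size, int *truncated)
{
	struct iovec iov = { .iov_base = buf, .iov_len = size };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	ssize_t len;

	len = recvmsg(fd, &msg, 0);
	if (len < 0)
		return len;

	/* msg_flags is filled in by the kernel on return. */
	*truncated = (msg.msg_flags & MSG_TRUNC) != 0;
	return len;
}
```

The trade-off is visible in the truncated flag: this saves one syscall per
message, but a message larger than the fixed buffer is lost rather than
re-read, which is the case the PEEK approach was added to handle.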


* [PATCH -next] net: ipvlan_l3s: fix kconfig unmet dependency warning
From: Randy Dunlap @ 2019-02-13  2:02 UTC (permalink / raw)
  To: LKML, netdev@vger.kernel.org, Mahesh Bandewar, David Miller; +Cc: Andrew Morton

From: Randy Dunlap <rdunlap@infradead.org>

Fix the kconfig warning in IPVLAN_L3S when neither INET nor IPV6
is enabled:

WARNING: unmet direct dependencies detected for NET_L3_MASTER_DEV
  Depends on [n]: NET [=y] && (INET [=n] || IPV6 [=n])
  Selected by [y]:
  - IPVLAN_L3S [=y] && NETDEVICES [=y] && NET_CORE [=y] && NETFILTER [=y]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mahesh Bandewar <maheshb@google.com>
---
Seen in mmotm but applies to linux-next.

 drivers/net/Kconfig |    1 +
 1 file changed, 1 insertion(+)

--- mmotm-2019-0212-1537.orig/drivers/net/Kconfig
+++ mmotm-2019-0212-1537/drivers/net/Kconfig
@@ -147,6 +147,7 @@ config MACVTAP
 
 config IPVLAN_L3S
 	depends on NETFILTER
+	depends on INET || IPV6 # same as NET_L3_MASTER_DEV
 	def_bool y
 	select NET_L3_MASTER_DEV
 




* [PATCH iproute2] lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg()
From: Eric Dumazet @ 2019-02-13  1:58 UTC (permalink / raw)
  To: David Ahern, Stephen Hemminger
  Cc: netdev, Eric Dumazet, Eric Dumazet, Hangbin Liu, Phil Sutter

In the past, we tried to increase the buffer size up to 32 KB in order
to reduce number of syscalls per dump.

Commit 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough")
brought the size back to 4KB because the kernel can not know the application
is ready to receive bigger requests.

See kernel commits 9063e21fb026 ("netlink: autosize skb lengthes") and
d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()")
for more details.

Fixes: 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Phil Sutter <phil@nwl.cc>
---
 lib/libnetlink.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index 1892a02ab5d0d73776c9882ffc77edcd2c663d01..0d48a3d43cf03065dacbd419578ab10af56431a4 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -718,6 +718,8 @@ static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
 	if (len < 0)
 		return len;

+	if (len < 32768)
+		len = 32768;
 	buf = malloc(len);
 	if (!buf) {
 		fprintf(stderr, "malloc error: not enough buffer\n");
-- 
2.21.0.rc0.258.g878e2cd30e-goog


* [net-next PATCH V3 3/3] page_pool: use DMA_ATTR_SKIP_CPU_SYNC for DMA mappings
From: Jesper Dangaard Brouer @ 2019-02-13  1:55 UTC (permalink / raw)
  To: netdev, linux-mm
  Cc: Toke Høiland-Jørgensen, Ilias Apalodimas, willy,
	Saeed Mahameed, Alexander Duyck, Jesper Dangaard Brouer,
	Andrew Morton, mgorman, David S. Miller, Tariq Toukan
In-Reply-To: <155002290134.5597.6544755780651689517.stgit@firesoul>

As pointed out by Alexander Duyck, the DMA mapping done in page_pool needs
to use the DMA attribute DMA_ATTR_SKIP_CPU_SYNC.

The principle behind page_pool keeping the pages mapped is that the
driver takes over the DMA-sync steps.

Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---
 net/core/page_pool.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 897a69a1477e..5b2252c6d49b 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -141,9 +141,9 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	 * into page private data (i.e 32bit cpu with 64bit DMA caps)
 	 * This mapping is kept for lifetime of page, until leaving pool.
 	 */
-	dma = dma_map_page(pool->p.dev, page, 0,
-			   (PAGE_SIZE << pool->p.order),
-			   pool->p.dma_dir);
+	dma = dma_map_page_attrs(pool->p.dev, page, 0,
+				 (PAGE_SIZE << pool->p.order),
+				 pool->p.dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
 	if (dma_mapping_error(pool->p.dev, dma)) {
 		put_page(page);
 		return NULL;
@@ -184,8 +184,9 @@ static void __page_pool_clean_page(struct page_pool *pool,
 
 	dma = page->dma_addr;
 	/* DMA unmap */
-	dma_unmap_page(pool->p.dev, dma,
-		       PAGE_SIZE << pool->p.order, pool->p.dma_dir);
+	dma_unmap_page_attrs(pool->p.dev, dma,
+			     PAGE_SIZE << pool->p.order, pool->p.dma_dir,
+			     DMA_ATTR_SKIP_CPU_SYNC);
 	page->dma_addr = 0;
 }
 



* [net-next PATCH V3 2/3] net: page_pool: don't use page->private to store dma_addr_t
From: Jesper Dangaard Brouer @ 2019-02-13  1:55 UTC (permalink / raw)
  To: netdev, linux-mm
  Cc: Toke Høiland-Jørgensen, Ilias Apalodimas, willy,
	Saeed Mahameed, Alexander Duyck, Jesper Dangaard Brouer,
	Andrew Morton, mgorman, David S. Miller, Tariq Toukan
In-Reply-To: <155002290134.5597.6544755780651689517.stgit@firesoul>

From: Ilias Apalodimas <ilias.apalodimas@linaro.org>

As pointed out by David Miller the current page_pool implementation
stores dma_addr_t in page->private.
This won't work on 32-bit platforms with 64-bit DMA addresses since the
page->private is an unsigned long and the dma_addr_t a u64.

The previous patch adds a dma_addr_t to struct page to accommodate this.
This patch adapts the page_pool related functions to use the newly added
field for storing and retrieving DMA addresses from network drivers.

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/page_pool.c |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 43a932cb609b..897a69a1477e 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -136,7 +136,9 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	if (!(pool->p.flags & PP_FLAG_DMA_MAP))
 		goto skip_dma_map;
 
-	/* Setup DMA mapping: use page->private for DMA-addr
+	/* Setup DMA mapping: use 'struct page' area for storing DMA-addr
+	 * since dma_addr_t can be either 32 or 64 bits and does not always fit
+	 * into page private data (i.e 32bit cpu with 64bit DMA caps)
 	 * This mapping is kept for lifetime of page, until leaving pool.
 	 */
 	dma = dma_map_page(pool->p.dev, page, 0,
@@ -146,7 +148,7 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 		put_page(page);
 		return NULL;
 	}
-	set_page_private(page, dma); /* page->private = dma; */
+	page->dma_addr = dma;
 
 skip_dma_map:
 	/* When page just alloc'ed is should/must have refcnt 1. */
@@ -175,13 +177,16 @@ EXPORT_SYMBOL(page_pool_alloc_pages);
 static void __page_pool_clean_page(struct page_pool *pool,
 				   struct page *page)
 {
+	dma_addr_t dma;
+
 	if (!(pool->p.flags & PP_FLAG_DMA_MAP))
 		return;
 
+	dma = page->dma_addr;
 	/* DMA unmap */
-	dma_unmap_page(pool->p.dev, page_private(page),
+	dma_unmap_page(pool->p.dev, dma,
 		       PAGE_SIZE << pool->p.order, pool->p.dma_dir);
-	set_page_private(page, 0);
+	page->dma_addr = 0;
 }
 
 /* Return a page to the page allocator, cleaning up our state */



* [net-next PATCH V3 1/3] mm: add dma_addr_t to struct page
From: Jesper Dangaard Brouer @ 2019-02-13  1:55 UTC (permalink / raw)
  To: netdev, linux-mm
  Cc: Toke Høiland-Jørgensen, Ilias Apalodimas, willy,
	Saeed Mahameed, Alexander Duyck, Jesper Dangaard Brouer,
	Andrew Morton, mgorman, David S. Miller, Tariq Toukan
In-Reply-To: <155002290134.5597.6544755780651689517.stgit@firesoul>

The page_pool API is using page->private to store DMA addresses.
As pointed out by David Miller, we can't use that on 32-bit architectures
with 64-bit DMA.

This patch adds a new struct holding a dma_addr_t to struct page, to allow
storing DMA addresses.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/mm_types.h |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 2c471a2c43fa..0a36a22228e7 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -95,6 +95,13 @@ struct page {
 			 */
 			unsigned long private;
 		};
+		struct {	/* page_pool used by netstack */
+			/**
+			 * @dma_addr: might require a 64-bit value even on
+			 * 32-bit architectures.
+			 */
+			dma_addr_t dma_addr;
+		};
 		struct {	/* slab, slob and slub */
 			union {
 				struct list_head slab_list;	/* uses lru */



* [net-next PATCH V3 0/3] Fix page_pool API and dma address storage
From: Jesper Dangaard Brouer @ 2019-02-13  1:55 UTC (permalink / raw)
  To: netdev, linux-mm
  Cc: Toke Høiland-Jørgensen, Ilias Apalodimas, willy,
	Saeed Mahameed, Alexander Duyck, Jesper Dangaard Brouer,
	Andrew Morton, mgorman, David S. Miller, Tariq Toukan

As pointed out by David Miller in [1] the current page_pool implementation
stores dma_addr_t in page->private. This won't work on 32-bit platforms with
64-bit DMA addresses since the page->private is an unsigned long and the
dma_addr_t a u64.
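
For illustration, a minimal userspace sketch of the truncation described
above. The types are hypothetical stand-ins, not kernel definitions:
ulong32_t models 'unsigned long' on a 32-bit platform and dma_addr64_t
models a 64-bit dma_addr_t.

```c
#include <stdint.h>

typedef uint32_t ulong32_t;	/* stand-in: unsigned long on a 32-bit CPU */
typedef uint64_t dma_addr64_t;	/* stand-in: dma_addr_t with 64-bit DMA */

/* Models storing the DMA address in page->private. */
struct fake_page_private {
	ulong32_t private;
};

/* Models storing the DMA address in a dedicated dma_addr_t field. */
struct fake_page_dma {
	dma_addr64_t dma_addr;
};

/* Store and read back through the 32-bit field: upper 32 bits are lost. */
static dma_addr64_t roundtrip_private(dma_addr64_t dma)
{
	struct fake_page_private p;

	p.private = (ulong32_t)dma;
	return p.private;
}

/* Store and read back through the 64-bit field: the value is preserved. */
static dma_addr64_t roundtrip_dma_addr(dma_addr64_t dma)
{
	struct fake_page_dma p;

	p.dma_addr = dma;
	return p.dma_addr;
}
```

Any address at or above 4GB comes back wrong from roundtrip_private(),
so a later unmap would be handed a different address than the one that
was mapped.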

Since no driver is yet using the DMA mapping capabilities of the API, let's
fix this by storing the information in 'struct page' and using that to store
and retrieve DMA addresses from network drivers.

As long as the addresses returned from dma_map_page() are aligned, the first
bit, which is used by the compound pages code, should not be set.

Ilias tested the first two patches on Espressobin driver mvneta, for which
we have patches for using the DMA API of page_pool.

[1]: https://lore.kernel.org/netdev/20181207.230655.1261252486319967024.davem@davemloft.net/

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---

Ilias Apalodimas (1):
      net: page_pool: don't use page->private to store dma_addr_t

Jesper Dangaard Brouer (2):
      mm: add dma_addr_t to struct page
      page_pool: use DMA_ATTR_SKIP_CPU_SYNC for DMA mappings

 include/linux/mm_types.h |    7 +++++++
 net/core/page_pool.c     |   22 ++++++++++++++--------
 2 files changed, 21 insertions(+), 8 deletions(-)

--


* [PATCH] libertas_tf: remove set but not used variable 'flags'
From: YueHaibing @ 2019-02-13  1:49 UTC (permalink / raw)
  To: Kalle Valo, Colin Ian King
  Cc: YueHaibing, linux-wireless, kernel-janitors, netdev

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/wireless/marvell/libertas_tf/main.c: In function 'lbtf_rx':
drivers/net/wireless/marvell/libertas_tf/main.c:554:15: warning:
 variable 'flags' set but not used [-Wunused-but-set-variable]

It is never used and can be removed.
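
For illustration, a minimal sketch of the pattern gcc flags. All names
here are hypothetical and unrelated to the libertas_tf driver. The
"before" variant is kept behind SHOW_WARNING so the file builds cleanly by
default; building with -DSHOW_WARNING -Wall should reproduce the
'-Wunused-but-set-variable' diagnostic.

```c
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value, no side effects. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

#ifdef SHOW_WARNING
/* Before: 'flags' is assigned but its value is never read. */
static int frame_type_before(const uint8_t *data)
{
	uint32_t flags;

	flags = read_le32(data + 4);	/* set but not used */
	return data[0] & 0x0c;
}
#endif

/* After: the dead assignment and the variable are both gone. */
static int frame_type_after(const uint8_t *data)
{
	return data[0] & 0x0c;
}

/* Keeps read_le32() referenced when SHOW_WARNING is not defined. */
static uint32_t frame_word(const uint8_t *data)
{
	return read_le32(data + 4);
}
```

Dropping the assignment is only safe because the right-hand side is a
plain load with no side effects; the return value is identical in both
variants.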

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/wireless/marvell/libertas_tf/main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/wireless/marvell/libertas_tf/main.c b/drivers/net/wireless/marvell/libertas_tf/main.c
index a7cb7d06e5e6..d36962528838 100644
--- a/drivers/net/wireless/marvell/libertas_tf/main.c
+++ b/drivers/net/wireless/marvell/libertas_tf/main.c
@@ -551,7 +551,6 @@ int lbtf_rx(struct lbtf_private *priv, struct sk_buff *skb)
 	struct ieee80211_rx_status stats;
 	struct rxpd *prxpd;
 	int need_padding;
-	unsigned int flags;
 	struct ieee80211_hdr *hdr;
 
 	lbtf_deb_enter(LBTF_DEB_RX);
@@ -572,7 +571,6 @@ int lbtf_rx(struct lbtf_private *priv, struct sk_buff *skb)
 	skb_pull(skb, sizeof(struct rxpd));
 
 	hdr = (struct ieee80211_hdr *)skb->data;
-	flags = le32_to_cpu(*(__le32 *)(skb->data + 4));
 
 	need_padding = ieee80211_is_data_qos(hdr->frame_control);
 	need_padding ^= ieee80211_has_a4(hdr->frame_control);





* Re: [PATCH net-next 0/2] Remove unused variables
From: David Miller @ 2019-02-13  1:32 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev
In-Reply-To: <20190212234000.15796-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Tue, 12 Feb 2019 15:39:57 -0800

> This removes unused variables from mlxsw and ethsw after the recent
> removal of SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS, build scripts are now
> fixed to take care of those warnings :).

Series applied, thanks Florian.


* [PATCH bpf-next 2/2] tools: sync uapi/linux/if_link.h header
From: Andrii Nakryiko @ 2019-02-13  1:29 UTC (permalink / raw)
  To: andrii.nakryiko, netdev, kernel-team, yhs, ast, daniel,
	david.laight, acme
  Cc: Andrii Nakryiko
In-Reply-To: <20190213012941.2571769-1-andriin@fb.com>

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/include/uapi/linux/if_link.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
index d6533828123a..5b225ff63b48 100644
--- a/tools/include/uapi/linux/if_link.h
+++ b/tools/include/uapi/linux/if_link.h
@@ -925,6 +925,7 @@ enum {
 enum {
 	LINK_XSTATS_TYPE_UNSPEC,
 	LINK_XSTATS_TYPE_BRIDGE,
+	LINK_XSTATS_TYPE_BOND,
 	__LINK_XSTATS_TYPE_MAX
 };
 #define LINK_XSTATS_TYPE_MAX (__LINK_XSTATS_TYPE_MAX - 1)
-- 
2.17.1



* [PATCH bpf-next 0/2] tools/bpf: smaller clean ups
From: Andrii Nakryiko @ 2019-02-13  1:29 UTC (permalink / raw)
  To: andrii.nakryiko, netdev, kernel-team, yhs, ast, daniel,
	david.laight, acme
  Cc: Andrii Nakryiko

This patchset replaces bzero() with memset() and syncs the if_link.h header
to suppress the unsynchronized-headers warning.

Andrii Nakryiko (2):
  tools/bpf: replace bzero with memset
  tools: sync uapi/linux/if_link.h header

 tools/include/uapi/linux/if_link.h |  1 +
 tools/lib/bpf/bpf.c                | 48 +++++++++++++++---------------
 tools/lib/bpf/btf.c                |  5 ++--
 tools/lib/bpf/libbpf.c             |  5 ++--
 4 files changed, 29 insertions(+), 30 deletions(-)

-- 
2.17.1



* [PATCH bpf-next 1/2] tools/bpf: replace bzero with memset
From: Andrii Nakryiko @ 2019-02-13  1:29 UTC (permalink / raw)
  To: andrii.nakryiko, netdev, kernel-team, yhs, ast, daniel,
	david.laight, acme
  Cc: Andrii Nakryiko
In-Reply-To: <20190213012941.2571769-1-andriin@fb.com>

The bzero() call is deprecated and superseded by memset().
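
For illustration, a minimal sketch of the replacement on a stand-in type.
union demo_attr and demo_attr_init() are hypothetical, not libbpf code;
the union only mimics the "zero everything, then set a few fields"
pattern used with union bpf_attr in the hunks below.

```c
#include <string.h>	/* memset(); bzero() lived in <strings.h> */

/* Hypothetical stand-in for a syscall attribute union. */
union demo_attr {
	struct {
		unsigned int map_fd;
		unsigned long long key;
		unsigned long long value;
	} elem;
	unsigned char raw[64];
};

/* Zero the whole union, then fill in only the fields that matter. */
static void demo_attr_init(union demo_attr *attr, unsigned int fd)
{
	memset(attr, 0, sizeof(*attr));	/* was: bzero(attr, sizeof(*attr)); */
	attr->elem.map_fd = fd;
}
```

The two calls are equivalent here; memset() just takes the fill byte as
an explicit second argument, and the length moves to the third.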

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reported-by: David Laight <david.laight@aculab.com>
---
 tools/lib/bpf/bpf.c    | 48 +++++++++++++++++++++---------------------
 tools/lib/bpf/btf.c    |  5 ++---
 tools/lib/bpf/libbpf.c |  5 ++---
 3 files changed, 28 insertions(+), 30 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index a5261f39e2bd..9cd015574e83 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -22,7 +22,7 @@
  */
 
 #include <stdlib.h>
-#include <strings.h>
+#include <string.h>
 #include <memory.h>
 #include <unistd.h>
 #include <asm/unistd.h>
@@ -228,7 +228,7 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
 
 	name_len = load_attr->name ? strlen(load_attr->name) : 0;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.prog_type = load_attr->prog_type;
 	attr.expected_attach_type = load_attr->expected_attach_type;
 	attr.insn_cnt = (__u32)load_attr->insns_cnt;
@@ -340,7 +340,7 @@ int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns,
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.prog_type = type;
 	attr.insn_cnt = (__u32)insns_cnt;
 	attr.insns = ptr_to_u64(insns);
@@ -360,7 +360,7 @@ int bpf_map_update_elem(int fd, const void *key, const void *value,
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.map_fd = fd;
 	attr.key = ptr_to_u64(key);
 	attr.value = ptr_to_u64(value);
@@ -373,7 +373,7 @@ int bpf_map_lookup_elem(int fd, const void *key, void *value)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.map_fd = fd;
 	attr.key = ptr_to_u64(key);
 	attr.value = ptr_to_u64(value);
@@ -385,7 +385,7 @@ int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.map_fd = fd;
 	attr.key = ptr_to_u64(key);
 	attr.value = ptr_to_u64(value);
@@ -398,7 +398,7 @@ int bpf_map_lookup_and_delete_elem(int fd, const void *key, void *value)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.map_fd = fd;
 	attr.key = ptr_to_u64(key);
 	attr.value = ptr_to_u64(value);
@@ -410,7 +410,7 @@ int bpf_map_delete_elem(int fd, const void *key)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.map_fd = fd;
 	attr.key = ptr_to_u64(key);
 
@@ -421,7 +421,7 @@ int bpf_map_get_next_key(int fd, const void *key, void *next_key)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.map_fd = fd;
 	attr.key = ptr_to_u64(key);
 	attr.next_key = ptr_to_u64(next_key);
@@ -433,7 +433,7 @@ int bpf_obj_pin(int fd, const char *pathname)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.pathname = ptr_to_u64((void *)pathname);
 	attr.bpf_fd = fd;
 
@@ -444,7 +444,7 @@ int bpf_obj_get(const char *pathname)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.pathname = ptr_to_u64((void *)pathname);
 
 	return sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr));
@@ -455,7 +455,7 @@ int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type,
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.target_fd	   = target_fd;
 	attr.attach_bpf_fd = prog_fd;
 	attr.attach_type   = type;
@@ -468,7 +468,7 @@ int bpf_prog_detach(int target_fd, enum bpf_attach_type type)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.target_fd	 = target_fd;
 	attr.attach_type = type;
 
@@ -479,7 +479,7 @@ int bpf_prog_detach2(int prog_fd, int target_fd, enum bpf_attach_type type)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.target_fd	 = target_fd;
 	attr.attach_bpf_fd = prog_fd;
 	attr.attach_type = type;
@@ -493,7 +493,7 @@ int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
 	union bpf_attr attr;
 	int ret;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.query.target_fd	= target_fd;
 	attr.query.attach_type	= type;
 	attr.query.query_flags	= query_flags;
@@ -514,7 +514,7 @@ int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
 	union bpf_attr attr;
 	int ret;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.test.prog_fd = prog_fd;
 	attr.test.data_in = ptr_to_u64(data);
 	attr.test.data_out = ptr_to_u64(data_out);
@@ -539,7 +539,7 @@ int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr)
 	if (!test_attr->data_out && test_attr->data_size_out > 0)
 		return -EINVAL;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.test.prog_fd = test_attr->prog_fd;
 	attr.test.data_in = ptr_to_u64(test_attr->data_in);
 	attr.test.data_out = ptr_to_u64(test_attr->data_out);
@@ -559,7 +559,7 @@ int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id)
 	union bpf_attr attr;
 	int err;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.start_id = start_id;
 
 	err = sys_bpf(BPF_PROG_GET_NEXT_ID, &attr, sizeof(attr));
@@ -574,7 +574,7 @@ int bpf_map_get_next_id(__u32 start_id, __u32 *next_id)
 	union bpf_attr attr;
 	int err;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.start_id = start_id;
 
 	err = sys_bpf(BPF_MAP_GET_NEXT_ID, &attr, sizeof(attr));
@@ -588,7 +588,7 @@ int bpf_prog_get_fd_by_id(__u32 id)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.prog_id = id;
 
 	return sys_bpf(BPF_PROG_GET_FD_BY_ID, &attr, sizeof(attr));
@@ -598,7 +598,7 @@ int bpf_map_get_fd_by_id(__u32 id)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.map_id = id;
 
 	return sys_bpf(BPF_MAP_GET_FD_BY_ID, &attr, sizeof(attr));
@@ -608,7 +608,7 @@ int bpf_btf_get_fd_by_id(__u32 id)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.btf_id = id;
 
 	return sys_bpf(BPF_BTF_GET_FD_BY_ID, &attr, sizeof(attr));
@@ -619,7 +619,7 @@ int bpf_obj_get_info_by_fd(int prog_fd, void *info, __u32 *info_len)
 	union bpf_attr attr;
 	int err;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.info.bpf_fd = prog_fd;
 	attr.info.info_len = *info_len;
 	attr.info.info = ptr_to_u64(info);
@@ -635,7 +635,7 @@ int bpf_raw_tracepoint_open(const char *name, int prog_fd)
 {
 	union bpf_attr attr;
 
-	bzero(&attr, sizeof(attr));
+	memset(&attr, 0, sizeof(attr));
 	attr.raw_tracepoint.name = ptr_to_u64(name);
 	attr.raw_tracepoint.prog_fd = prog_fd;
 
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 6953fedb88ff..ade1c32fb083 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -4,7 +4,6 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
-#include <strings.h>
 #include <unistd.h>
 #include <errno.h>
 #include <linux/err.h>
@@ -484,7 +483,7 @@ int btf__get_from_id(__u32 id, struct btf **btf)
 		goto exit_free;
 	}
 
-	bzero(ptr, last_size);
+	memset(ptr, 0, last_size);
 	btf_info.btf = ptr_to_u64(ptr);
 	err = bpf_obj_get_info_by_fd(btf_fd, &btf_info, &len);
 
@@ -498,7 +497,7 @@ int btf__get_from_id(__u32 id, struct btf **btf)
 			goto exit_free;
 		}
 		ptr = temp_ptr;
-		bzero(ptr, last_size);
+		memset(ptr, 0, last_size);
 		btf_info.btf = ptr_to_u64(ptr);
 		err = bpf_obj_get_info_by_fd(btf_fd, &btf_info, &len);
 	}
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e3c39edfb9d3..6ef7e6e4cbd3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -18,7 +18,6 @@
 #include <libgen.h>
 #include <inttypes.h>
 #include <string.h>
-#include <strings.h>
 #include <unistd.h>
 #include <fcntl.h>
 #include <errno.h>
@@ -308,7 +307,7 @@ bpf_program__init(void *data, size_t size, char *section_name, int idx,
 		return -EINVAL;
 	}
 
-	bzero(prog, sizeof(*prog));
+	memset(prog, 0, sizeof(*prog));
 
 	prog->section_name = strdup(section_name);
 	if (!prog->section_name) {
@@ -1577,7 +1576,7 @@ bpf_program__load(struct bpf_program *prog,
 		struct bpf_prog_prep_result result;
 		bpf_program_prep_t preprocessor = prog->preprocessor;
 
-		bzero(&result, sizeof(result));
+		memset(&result, 0, sizeof(result));
 		err = preprocessor(prog, i, prog->insns,
 				   prog->insns_cnt, &result);
 		if (err) {
-- 
2.17.1



* [PATCH net-next] net: sched: remove duplicated include from cls_api.c
From: YueHaibing @ 2019-02-13  1:42 UTC (permalink / raw)
  To: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
  Cc: YueHaibing, netdev, kernel-janitors

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 net/sched/cls_api.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 02cf6d2fa0e1..774b663e07f1 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -38,7 +38,6 @@
 #include <net/tc_act/tc_csum.h>
 #include <net/tc_act/tc_gact.h>
 #include <net/tc_act/tc_skbedit.h>
-#include <net/tc_act/tc_mirred.h>

 extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];


* Re: linux-next: build warning after merge of the net-next tree
From: Florian Fainelli @ 2019-02-13  0:58 UTC (permalink / raw)
  To: Stephen Rothwell, David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190213115141.376058e8@canb.auug.org.au>

On 2/12/19 4:51 PM, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the net-next tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
> 
> drivers/staging/fsl-dpaa2/ethsw/ethsw.c: In function 'swdev_port_attr_get':
> drivers/staging/fsl-dpaa2/ethsw/ethsw.c:646:26: warning: unused variable 'port_priv' [-Wunused-variable]
>   struct ethsw_port_priv *port_priv = netdev_priv(netdev);
>                           ^~~~~~~~~
> 
> Introduced by commit
> 
>   1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting PORT_BRIDGE_FLAGS")
> 

Also fixed with:

http://patchwork.ozlabs.org/project/netdev/list/?series=91603
-- 
Florian


* Re: linux-next: build warning after merge of the net-next tree
From: Florian Fainelli @ 2019-02-13  0:57 UTC (permalink / raw)
  To: Stephen Rothwell, David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190213114916.1a83a363@canb.auug.org.au>

On 2/12/19 4:49 PM, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the net-next tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
> 
> drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c: In function 'mlxsw_sp_port_attr_get':
> drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:438:19: warning: unused variable 'mlxsw_sp' [-Wunused-variable]
>   struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
>                    ^~~~~~~~
> 
> Introduced by commit
> 
>   1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting PORT_BRIDGE_FLAGS")

Fixed with:

http://patchwork.ozlabs.org/project/netdev/list/?series=91603
-- 
Florian


* linux-next: build warning after merge of the net-next tree
From: Stephen Rothwell @ 2019-02-13  0:51 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Florian Fainelli

[-- Attachment #1: Type: text/plain, Size: 538 bytes --]

Hi all,

After merging the net-next tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/staging/fsl-dpaa2/ethsw/ethsw.c: In function 'swdev_port_attr_get':
drivers/staging/fsl-dpaa2/ethsw/ethsw.c:646:26: warning: unused variable 'port_priv' [-Wunused-variable]
  struct ethsw_port_priv *port_priv = netdev_priv(netdev);
                          ^~~~~~~~~

Introduced by commit

  1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting PORT_BRIDGE_FLAGS")

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* linux-next: build warning after merge of the net-next tree
From: Stephen Rothwell @ 2019-02-13  0:49 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Florian Fainelli

[-- Attachment #1: Type: text/plain, Size: 562 bytes --]

Hi all,

After merging the net-next tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c: In function 'mlxsw_sp_port_attr_get':
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:438:19: warning: unused variable 'mlxsw_sp' [-Wunused-variable]
  struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
                   ^~~~~~~~

Introduced by commit

  1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting PORT_BRIDGE_FLAGS")

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [PATCH iproute2 net-next v2 3/4] ss: Buffer raw fields first, then render them as a table
From: Eric Dumazet @ 2019-02-13  0:42 UTC (permalink / raw)
  To: Stefano Brivio, Stephen Hemminger; +Cc: netdev, Sabrina Dubroca
In-Reply-To: <fef5eed38c3f47f334136fa2d857b6478d497d03.1513039237.git.sbrivio@redhat.com>



On 12/11/2017 04:46 PM, Stefano Brivio wrote:
> This allows us to measure the maximum field length for each
> column before printing fields and will permit us to apply
> optimal field spacing and distribution. Structure of the output
> buffer with chunked allocation is described in comments.
> 
> Output is still unchanged, original spacing is used.
> 
> Running over one million sockets with -tul options by simply
> modifying main() to loop 50,000 times over the *_show()
> functions, buffering the whole output and rendering it at the
> end, with 10 UDP sockets, 10 TCP sockets, while throwing
> output away, doesn't show significant changes in execution time
> on my laptop with an Intel i7-6600U CPU:
> 
> - before this patch:
> $ time ./ss -tul > /dev/null
> real	0m29.899s
> user	0m2.017s
> sys	0m27.801s
> 
> - after this patch:
> $ time ./ss -tul > /dev/null
> real	0m29.827s
> user	0m1.942s
> sys	0m27.812s
> 

I do not get it.

"ss -emoi" uses almost 1KB per socket.

10,000,000 sockets -> we need about 10GB of memory  ???

This is a serious regression.
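
The arithmetic behind that estimate can be sketched as below. This is only
an illustration: the helper name buffered_output_bytes() is hypothetical,
and the 1 KB per socket figure is the rough number quoted above, taken here
as 1024 bytes.

```c
#include <stdint.h>

/* Hypothetical helper: bytes held in memory if the output for every
 * socket is buffered before anything is rendered.
 */
static uint64_t buffered_output_bytes(uint64_t sockets,
				      uint64_t bytes_per_socket)
{
	return sockets * bytes_per_socket;
}
```

With 10,000,000 sockets at 1024 bytes each this comes to 10,240,000,000
bytes, i.e. roughly 10GB held before the first line is printed.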




* Re: linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2019-02-13  0:33 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Cong Wang,
	Vlad Buslov
In-Reply-To: <20190213111325.30cfc931@canb.auug.org.au>

[-- Attachment #1: Type: text/plain, Size: 2344 bytes --]

Hi all,

On Wed, 13 Feb 2019 11:13:25 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> Today's linux-next merge of the net-next tree got a conflict in:
> 
>   net/sched/cls_tcindex.c
> 
> between commits:
> 
>   8015d93ebd27 ("net_sched: fix a race condition in tcindex_destroy()")
>   033b228e7f26 ("net_sched: fix a memory leak in cls_tcindex")
> 
> from the net tree and commit:
> 
>   12db03b65c2b ("net: sched: extend proto ops to support unlocked classifiers")
> 
> from the net-next tree.
> 
> I fixed it up (see the final resolution when linux-next is published)
> and can carry the fix as necessary. This is now fixed as far as
> linux-next is concerned, but any non trivial conflicts should be
> mentioned to your upstream maintainer when your tree is submitted for
> merging.  You may also want to consider cooperating with the maintainer
> of the conflicting tree to minimise any particularly complex conflicts.

Actually, see the below resolution.

-- 
Cheers,
Stephen Rothwell

diff --cc net/sched/cls_tcindex.c
index 38bb882bb958,14d6b4058045..e6cf20bc8e80
--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c
@@@ -559,34 -563,15 +560,34 @@@ static void tcindex_destroy(struct tcf_
  			    struct netlink_ext_ack *extack)
  {
  	struct tcindex_data *p = rtnl_dereference(tp->root);
 -	struct tcf_walker walker;
 +	int i;
  
  	pr_debug("tcindex_destroy(tp %p),p %p\n", tp, p);
 -	walker.count = 0;
 -	walker.skip = 0;
 -	walker.fn = tcindex_destroy_element;
 -	tcindex_walk(tp, &walker, true);
  
 -	call_rcu(&p->rcu, __tcindex_destroy);
 +	if (p->perfect) {
 +		for (i = 0; i < p->hash; i++) {
 +			struct tcindex_filter_result *r = p->perfect + i;
 +
 +			tcf_unbind_filter(tp, &r->res);
 +			if (tcf_exts_get_net(&r->exts))
 +				tcf_queue_work(&r->rwork,
 +					       tcindex_destroy_rexts_work);
 +			else
 +				__tcindex_destroy_rexts(r);
 +		}
 +	}
 +
 +	for (i = 0; p->h && i < p->hash; i++) {
 +		struct tcindex_filter *f, *next;
 +		bool last;
 +
 +		for (f = rtnl_dereference(p->h[i]); f; f = next) {
 +			next = rtnl_dereference(f->next);
- 			tcindex_delete(tp, &f->result, &last, NULL);
++			tcindex_delete(tp, &f->result, &last, false, NULL);
 +		}
 +	}
 +
 +	tcf_queue_work(&p->rwork, tcindex_destroy_work);
  }
  
  

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* [PATCH net-next 1/1] flow_offload: fix block stats
From: John Hurley @ 2019-02-13  0:23 UTC (permalink / raw)
  To: jiri, davem; +Cc: netdev, pablo, oss-drivers, John Hurley

With the introduction of flow_stats_update(), drivers now update the stats
fields of the passed tc_cls_flower_offload struct, rather than call
tcf_exts_stats_update() directly to update the stats of offloaded TC
flower rules. However, if multiple qdiscs are registered to a TC shared
block and a flower rule is applied, then, when getting stats for the rule,
multiple callbacks may be made.

Take this into consideration by modifying flow_stats_update to gather the
stats from all callbacks. Currently, the values in tc_cls_flower_offload
only account for the last stats callback in the list.

Fixes: 3b1903ef97c0 ("flow_offload: add statistics retrieval infrastructure and use it")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/net/flow_offload.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index a307ccb..d035183 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -195,9 +195,9 @@ struct flow_stats {
 static inline void flow_stats_update(struct flow_stats *flow_stats,
 				     u64 bytes, u64 pkts, u64 lastused)
 {
-	flow_stats->pkts	= pkts;
-	flow_stats->bytes	= bytes;
-	flow_stats->lastused	= lastused;
+	flow_stats->pkts	+= pkts;
+	flow_stats->bytes	+= bytes;
+	flow_stats->lastused	= max_t(u64, flow_stats->lastused, lastused);
 }

 #endif /* _NET_FLOW_OFFLOAD_H */
-- 
2.7.4
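
For readers following along, the effect of the change can be sketched in
userspace as below. This is a minimal illustration, not the kernel code:
struct flow_stats_sketch, flow_stats_accumulate() and
two_callbacks_example() are hypothetical names, and the counter values are
made up.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct flow_stats. */
struct flow_stats_sketch {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t lastused;
};

/* Accumulate one callback's counters instead of overwriting them, and
 * keep the most recent lastused timestamp seen so far.
 */
static void flow_stats_accumulate(struct flow_stats_sketch *s,
				  uint64_t bytes, uint64_t pkts,
				  uint64_t lastused)
{
	s->pkts += pkts;
	s->bytes += bytes;
	if (lastused > s->lastused)
		s->lastused = lastused;
}

/* Simulate two stats callbacks firing for the same rule on a shared block. */
static struct flow_stats_sketch two_callbacks_example(void)
{
	struct flow_stats_sketch s = { 0, 0, 0 };

	flow_stats_accumulate(&s, 1500, 10, 100);
	flow_stats_accumulate(&s, 3000, 20, 90);
	return s;
}
```

With plain assignment the second callback would leave pkts = 20 and
lastused = 90; with accumulation the result covers both callbacks.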


* linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2019-02-13  0:13 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Cong Wang,
	Vlad Buslov

[-- Attachment #1: Type: text/plain, Size: 1009 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  net/sched/cls_tcindex.c

between commits:

  8015d93ebd27 ("net_sched: fix a race condition in tcindex_destroy()")
  033b228e7f26 ("net_sched: fix a memory leak in cls_tcindex")

from the net tree and commit:

  12db03b65c2b ("net: sched: extend proto ops to support unlocked classifiers")

from the net-next tree.

I fixed it up (see the final resolution when linux-next is published)
and can carry the fix as necessary. This is now fixed as far as
linux-next is concerned, but any non trivial conflicts should be
mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc net/sched/cls_tcindex.c
index 38bb882bb958,14d6b4058045..000000000000
--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [PATCHv5 iproute2 net-next 2/2] lib/libnetlink: re malloc buff if size is not enough
From: Eric Dumazet @ 2019-02-12 23:43 UTC (permalink / raw)
  To: Hangbin Liu, netdev; +Cc: Stephen Hemminger, Michal Kubecek, Phil Sutter
In-Reply-To: <244174ae-3ab4-68c0-6783-f8c91840a7e1@gmail.com>



On 02/12/2019 03:32 PM, Eric Dumazet wrote:
> 

> 
> This patch brings a serious performance penalty.
> 
> ss command now uses two system calls per ~4KB worth of data
> 
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000120>
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\1\1\0\250\253\276@&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000108>
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000086>
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\10\2\0002A\266S&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000121>
> 
> 
> So we are back to a very pessimistic situation.
> 

I guess this patch will solve the issue:

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index ced33728777a17e0905e76acb904ac4709707488..309b5b3787e3d8f8c47f035d270ae2b4df01703e 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -442,6 +442,8 @@ static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
        if (len < 0)
                return len;
 
+       if (len < 32768)
+               len = 32768;
        buf = malloc(len);
        if (!buf) {
                fprintf(stderr, "malloc error: not enough buffer\n");
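
The peek-then-read pattern with a size floor can be sketched standalone as
below. This is a rough illustration rather than the libnetlink code:
recv_with_min_buf() and MIN_RECV_BUF are hypothetical names, and the sketch
assumes a Linux datagram-style socket on which MSG_PEEK | MSG_TRUNC reports
the real message length (netlink behaves this way, as the strace output
above shows).

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define MIN_RECV_BUF 32768	/* floor used in the patch above */

/* Peek at the next message to learn its length without consuming it,
 * then allocate at least MIN_RECV_BUF bytes and read it for real.
 * Returns the message length, or -1 on error. On success *out owns the
 * buffer and *alloc_len holds the number of bytes allocated.
 */
static ssize_t recv_with_min_buf(int fd, char **out, size_t *alloc_len)
{
	struct iovec iov = { .iov_base = NULL, .iov_len = 0 };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	size_t want;
	ssize_t len;
	char *buf;

	/* First syscall: zero-length peek, returns the real length. */
	len = recvmsg(fd, &msg, MSG_PEEK | MSG_TRUNC);
	if (len < 0)
		return -1;

	/* Never hand the kernel a buffer smaller than the floor. */
	want = (size_t)len < MIN_RECV_BUF ? MIN_RECV_BUF : (size_t)len;
	buf = malloc(want);
	if (!buf)
		return -1;

	/* Second syscall: consume the message into the real buffer. */
	iov.iov_base = buf;
	iov.iov_len = want;
	len = recvmsg(fd, &msg, 0);
	if (len < 0) {
		free(buf);
		return -1;
	}

	*out = buf;
	*alloc_len = want;
	return len;
}
```

The caller frees *out. The sketch still issues two recvmsg() calls per
message; the floor only changes how large a buffer the second call offers.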


* [PATCH bpf 2/2] selftests/bpf: make sure signal interrupts BPF_PROG_TEST_RUN
From: Stanislav Fomichev @ 2019-02-12 23:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, ast, daniel, Stanislav Fomichev
In-Reply-To: <20190212234239.174386-1-sdf@google.com>

Simple test that I used to reproduce the issue in the previous commit:
do BPF_PROG_TEST_RUN with max iterations, where each program is 4096
simple move instructions. Fire an alarm after 0.1 second and check that
bpf_prog_test_run is interrupted (i.e. the test doesn't hang).

Feel free to ignore it if you feel like that's just a one-off fix and it
doesn't require a test going forward.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/test_progs.c | 44 ++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 25f0083a9b2e..7842e3749b19 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -11,6 +11,7 @@
 #include <assert.h>
 #include <stdlib.h>
 #include <time.h>
+#include <signal.h>
 
 #include <linux/types.h>
 typedef __u16 __sum16;
@@ -27,6 +28,7 @@ typedef __u16 __sum16;
 #include <sys/ioctl.h>
 #include <sys/wait.h>
 #include <sys/types.h>
+#include <sys/time.h>
 #include <fcntl.h>
 
 #include <linux/bpf.h>
@@ -1912,6 +1914,47 @@ static void test_queue_stack_map(int type)
 	bpf_object__close(obj);
 }
 
+static void sigalrm_handler(int s) {}
+static struct sigaction sigalrm_action = {
+	.sa_handler = sigalrm_handler,
+};
+
+static void test_signal_pending(void)
+{
+	struct bpf_insn prog[4096];
+	struct itimerval timeo = {
+		.it_value.tv_usec = 100000, /* 100ms */
+	};
+	__u32 duration, retval;
+	int prog_fd;
+	int err;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(prog); i++)
+		prog[i] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0);
+	prog[ARRAY_SIZE(prog) - 1] = BPF_EXIT_INSN();
+
+	prog_fd = bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER,
+				   prog, ARRAY_SIZE(prog),
+				   "GPL", 0, NULL, 0);
+	CHECK(prog_fd < 0, "test-run", "errno %d\n", errno);
+
+	err = sigaction(SIGALRM, &sigalrm_action, NULL);
+	CHECK(err, "test-run-signal-sigaction", "errno %d\n", errno);
+
+	err = setitimer(ITIMER_REAL, &timeo, NULL);
+	CHECK(err, "test-run-signal-timer", "errno %d\n", errno);
+
+	err = bpf_prog_test_run(prog_fd, 0xffffffff, &pkt_v4, sizeof(pkt_v4),
+				NULL, NULL, &retval, &duration);
+	CHECK(err != -1 || errno != EINTR || duration > 1000000000,
+	      "test-run-signal-run",
+	      "err %d errno %d retval %d\n",
+	      err, errno, retval);
+
+	signal(SIGALRM, SIG_DFL);
+}
+
 int main(void)
 {
 	srand(time(NULL));
@@ -1939,6 +1982,7 @@ int main(void)
 	test_reference_tracking();
 	test_queue_stack_map(QUEUE);
 	test_queue_stack_map(STACK);
+	test_signal_pending();
 
 	printf("Summary: %d PASSED, %d FAILED\n", pass_cnt, error_cnt);
 	return error_cnt ? EXIT_FAILURE : EXIT_SUCCESS;
-- 
2.20.1.791.gb4d0f1c61a-goog


