Netdev List
* Re: [PATCH net-next v2] dsa:mv88e6xxx: dispose irq mapping for chip->irq
From: Andrew Lunn @ 2016-12-09 18:00 UTC (permalink / raw)
  To: Volodymyr Bendiuga
  Cc: vivien.didelot, f.fainelli, netdev, volodymyr.bendiuga, Rob Herring,
	devicetree
In-Reply-To: <4388a669-b425-97e0-346b-6b20f7f47f86-qeDNsGSBLoYwFerOooGFRg@public.gmane.org>

On Wed, Dec 07, 2016 at 05:40:12PM +0100, Volodymyr Bendiuga wrote:
> Yes, most of the users of of_irq_get() do not use irq_dispose_mapping().
>
> But some of them do (some irq chips), and I believe the correct way
> of doing this is to dispose of the irq mapping, as the description for
> this function says that it unmaps the irq, which is mapped by
> of_irq_parse_and_map(). Not disposing of the irq might not have any
> effect on most drivers, but some, which get an EPROBE_DEFER error, do
> need to dispose of it.
>
> This is what I get when I run the code.
>
> of_irq_put() could be implemented, and it would be a wrapper for
> irq_dispose_mapping() as I can see it. Should I do it this way?

Hi Volodymyr

Yes, I think having of_irq_put() would be good. It gives some symmetry
to the API.

   Andrew
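
A minimal userspace sketch of the symmetry discussed above, not kernel
code. of_irq_put() is only the helper proposed in this thread, and
mock_of_irq_get(), mock_irq_dispose_mapping(), live_mappings and
demo_probe() are hypothetical stand-ins so the example compiles outside
the kernel. It models a probe path that maps an IRQ, hits EPROBE_DEFER
in a later step, and releases the mapping before returning.

```c
#include <assert.h>
#include <errno.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace errno.h */
#endif

int live_mappings;	/* mock state: number of IRQ mappings still held */

/* Mock of the "get" side: pretends to create a mapping. */
static int mock_of_irq_get(void)
{
	live_mappings++;
	return 42;	/* arbitrary virq number for the demo */
}

/* Mock of the kernel's irq_dispose_mapping(): drops one mapping. */
static void mock_irq_dispose_mapping(unsigned int virq)
{
	if (!virq)
		return;
	assert(live_mappings > 0);
	live_mappings--;
}

/* Hypothetical helper from the thread: a thin wrapper that gives
 * the "get" side a matching "put". */
static void of_irq_put(unsigned int virq)
{
	mock_irq_dispose_mapping(virq);
}

/* Probe that maps an IRQ, then may have to defer in a later step. */
int demo_probe(int later_step_ready)
{
	int irq = mock_of_irq_get();

	if (!later_step_ready) {
		of_irq_put(irq);	/* undo the mapping before deferring */
		return -EPROBE_DEFER;
	}
	return 0;
}
```

Calling demo_probe(0) returns -EPROBE_DEFER and leaves live_mappings at
zero, so a later retry of the probe starts from a clean state.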


* Re: [PATCHv3 perf/core 5/7] samples/bpf: Switch over to libbpf
From: Joe Stringer @ 2016-12-09 17:59 UTC (permalink / raw)
  To: Wangnan (F); +Cc: LKML, ast, Daniel Borkmann, Arnaldo Carvalho de Melo, netdev
In-Reply-To: <77ff1746-6271-0eac-a921-bb852c14a602@huawei.com>

On 8 December 2016 at 21:18, Wangnan (F) <wangnan0@huawei.com> wrote:
>
>
> On 2016/12/9 13:04, Wangnan (F) wrote:
>>
>>
>>
>> On 2016/12/9 10:46, Joe Stringer wrote:
>>
>> [SNIP]
>>
>>>   diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
>>> index 62d89d50fcbd..616bd55f3be8 100644
>>> --- a/tools/lib/bpf/Makefile
>>> +++ b/tools/lib/bpf/Makefile
>>> @@ -149,6 +149,8 @@ CMD_TARGETS = $(LIB_FILE)
>>>     TARGETS = $(CMD_TARGETS)
>>>   +libbpf: all
>>> +
>>
>>
>> Why do we need this? I tested this patch without it and it seems to work, and
>> this line causes an extra error:
>>  $ pwd
>>  /home/wn/kernel/tools/lib/bpf
>>  $ make libbpf
>>  ...
>>  gcc -g -Wall -DHAVE_LIBELF_MMAP_SUPPORT -DHAVE_ELF_GETPHDRNUM_SUPPORT
>> -Wbad-function-cast -Wdeclaration-after-statement -Wformat-security
>> -Wformat-y2k -Winit-self -Wmissing-declarations -Wmissing-prototypes
>> -Wnested-externs -Wno-system-headers -Wold-style-definition -Wpacked
>> -Wredundant-decls -Wshadow -Wstrict-aliasing=3 -Wstrict-prototypes
>> -Wswitch-default -Wswitch-enum -Wundef -Wwrite-strings -Wformat -Werror
>> -Wall -fPIC -I. -I/home/wn/kernel-hydrogen/tools/include
>> -I/home/wn/kernel-hydrogen/tools/arch/x86/include/uapi
>> -I/home/wn/kernel-hydrogen/tools/include/uapi    libbpf.c all   -o libbpf
>>  gcc: error: all: No such file or directory
>>  make: *** [libbpf] Error 1
>>
>> Thank you.
>
>
> It is not 'caused' by your patch. 'make libbpf' fails without
> your change because it tries to build an executable from
> libbpf.c, but main() is missing.
>
> I think libbpf should never be used as a make target. Your
> new dependency looks strange.

Thanks for the feedback, I sent a patch to address this on top of perf/core:

https://lkml.org/lkml/2016/12/9/518


* Re: [PATCH V2 00/22] Broadcom RoCE Driver (bnxt_re)
From: Selvin Xavier @ 2016-12-09 17:52 UTC (permalink / raw)
  To: David Miller; +Cc: Doug Ledford, linux-rdma, netdev
In-Reply-To: <20161209.102602.181161966975624956.davem@davemloft.net>

On Fri, Dec 9, 2016 at 8:56 PM, David Miller <davem@davemloft.net> wrote:
> From: Selvin Xavier <selvin.xavier@broadcom.com>
> Date: Thu,  8 Dec 2016 22:47:54 -0800
>
>> This series introduces the RoCE driver for the Broadcom
>> NetXtreme-E 10/25/40/50 gigabit RoCE HCAs.
>> This driver is dependent on the bnxt_en NIC driver and is
>> based on the bnxt_re branch in Doug's repository. The bnxt_en changes
>> required for this patch series are already available in this branch.
>>
>> I am preparing a git repository with these changes as per Jason's
>> comment and will share the details later today.
>
> If this is targeted at the net-next tree, it is too late, as I
> closed the net-next tree two nights ago.
>

This patch series targets the linux-rdma tree. netdev is copied since
this series depends on bnxt_en.

Thanks
Selvin


* [PATCH perf/core] samples/bpf: Drop unnecessary build targets.
From: Joe Stringer @ 2016-12-09 17:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: wangnan0, ast, daniel, acme, netdev

Commit f72179ef11db ("samples/bpf: Switch over to libbpf") added these
two makefile changes that were unnecessary for switching samples to use
libbpf. The extra make is already handled by the build dependency, and
libbpf target doesn't build because it lacks main(). Remove these.

Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
---
 samples/bpf/Makefile   | 1 -
 tools/lib/bpf/Makefile | 2 --
 2 files changed, 3 deletions(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 9ffa6a2c061d..60ffc8115b67 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -127,7 +127,6 @@ CLANG ?= clang
 
 # Trick to allow make to be run from this directory
 all:
-	$(MAKE) -C ../../ tools/lib/bpf/
 	$(MAKE) -C ../../ $$PWD/
 
 clean:
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index 616bd55f3be8..62d89d50fcbd 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -149,8 +149,6 @@ CMD_TARGETS = $(LIB_FILE)
 
 TARGETS = $(CMD_TARGETS)
 
-libbpf: all
-
 all: fixdep $(VERSION_FILES) all_cmd
 
 all_cmd: $(CMD_TARGETS)
-- 
2.10.2


* Re: [PATCH] uio-hv-generic: store physical addresses instead of virtual
From: Stephen Hemminger @ 2016-12-09 17:28 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Greg Kroah-Hartman, Stephen Hemminger, netdev, Haiyang Zhang,
	linux-kernel, devel
In-Reply-To: <20161209114456.3719619-1-arnd@arndb.de>

On Fri,  9 Dec 2016 12:44:40 +0100
Arnd Bergmann <arnd@arndb.de> wrote:

> gcc warns about the newly added driver when phys_addr_t is wider than
> a pointer:
> 
> drivers/uio/uio_hv_generic.c: In function 'hv_uio_mmap':
> drivers/uio/uio_hv_generic.c:71:17: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
>     virt_to_phys((void *)info->mem[mi].addr) >> PAGE_SHIFT,
> drivers/uio/uio_hv_generic.c: In function 'hv_uio_probe':
> drivers/uio/uio_hv_generic.c:140:5: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>    = (phys_addr_t)dev->channel->ringbuffer_pages;
> drivers/uio/uio_hv_generic.c:147:3: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>    (phys_addr_t)vmbus_connection.int_page;
> drivers/uio/uio_hv_generic.c:153:3: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
>    (phys_addr_t)vmbus_connection.monitor_pages[1];
> 
> I can't see why we store a virtual address in a phys_addr_t here,
> as the only user of that variable converts it into a physical
> address anyway, so this moves the conversion to where it logically
> fits according to the types.
> 
> Fixes: 95096f2fbd10 ("uio-hv-generic: new userspace i/o driver for VMBus")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Thanks, the code was inherited from outside, and only tested on x86_64.
Not sure which platform and GCC version generates the warning, was this just W=1?

Acked-by: Stephen Hemminger <sthemmin@microsoft.com>
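
For readers wondering why the casts quoted above warn at all, here is a
small userspace sketch. It is not what the patch does (the patch stores
the physical address so no pointer cast is needed); it only illustrates
the size mismatch. demo_phys_addr_t, ptr_to_wide(), wide_to_ptr() and
demo_round_trip() are hypothetical names, and the 64-bit typedef is an
assumption standing in for a phys_addr_t that is wider than a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: model phys_addr_t as a 64-bit integer, as on a 32-bit
 * kernel configured for a wider physical address space. */
typedef uint64_t demo_phys_addr_t;

/* A direct (demo_phys_addr_t)p cast is what GCC flags when pointers
 * are 32 bits wide. Going through uintptr_t keeps the pointer/integer
 * cast size-matched; the widening is then a plain integer conversion. */
demo_phys_addr_t ptr_to_wide(const void *p)
{
	return (demo_phys_addr_t)(uintptr_t)p;
}

void *wide_to_ptr(demo_phys_addr_t v)
{
	return (void *)(uintptr_t)v;
}

/* Usage: a pointer survives the round trip unchanged. */
void demo_round_trip(void)
{
	int x = 0;

	assert(wide_to_ptr(ptr_to_wide(&x)) == (void *)&x);
}
```

Note that this only silences the cast; the value stored is still a
virtual address, which is why keeping a real physical address in the
phys_addr_t field is the cleaner fix.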


* Re: [PATCH v2 net-next 0/4] udp: receive path optimizations
From: Tom Herbert @ 2016-12-09 17:13 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Eric Dumazet, David S . Miller, netdev,
	Paolo Abeni
In-Reply-To: <1481302391.4930.201.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, Dec 9, 2016 at 8:53 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2016-12-09 at 08:43 -0800, Tom Herbert wrote:
>>
>>
>
>
>> Are you thinking of allowing unconnected socket to have multiple input
>> queues? Sort of an automatic and transparent SO_REUSEPORT...
>
> It all depends if the user application is using a single thread or
> multiple threads to drain the queue.
>
If they're using multiple threads, hopefully there's no reason they
can't use SO_REUSEPORT. Since we should always assume DDoS is a
possibility, it seems like that should be a general recommendation: if
you have multiple threads listening on a port, use SO_REUSEPORT.

> Since we used to grab socket lock in udp_recvmsg(), I guess nobody uses
> multiple threads to read packets from a single socket.
>
That's the hope! So the problem at hand is multiple producer CPUs and
one consumer CPU.

> So heavy users must use SO_REUSEPORT already, not sure what we would
> gain trying to go to a single socket, with the complexity of mem
> charging.
>
I think you're making a good point about the possibility that any
unconnected UDP socket could be subject to an attack, so any use of
unconnected UDP has the potential to become a "heavy user" (in fact
we've seen this bring down whole networks before in production). Therefore
the single-thread reader case is relevant to consider.

Tom



* Re: [PATCH v2 net-next 0/4] udp: receive path optimizations
From: Eric Dumazet @ 2016-12-09 16:53 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Jesper Dangaard Brouer, Eric Dumazet, David S . Miller, netdev,
	Paolo Abeni
In-Reply-To: <CALx6S35roMkor_0maXk-SwdXeF4GxBfbxXLEXLGnn6mRRaut6g@mail.gmail.com>

On Fri, 2016-12-09 at 08:43 -0800, Tom Herbert wrote:
> 
> 


> Are you thinking of allowing unconnected socket to have multiple input
> queues? Sort of an automatic and transparent SO_REUSEPORT... 

It all depends if the user application is using a single thread or
multiple threads to drain the queue.

Since we used to grab socket lock in udp_recvmsg(), I guess nobody uses
multiple threads to read packets from a single socket.

So heavy users must use SO_REUSEPORT already, not sure what we would
gain trying to go to a single socket, with the complexity of mem
charging.


> 


* Re: [PATCH net-next 1/2] net: phy: add extension of phy-mode for XLGMII
From: Andrew Lunn @ 2016-12-09 16:39 UTC (permalink / raw)
  To: Jie Deng
  Cc: Florian Fainelli, davem, netdev, linux-kernel, CARLOS.PALMINHA,
	lars.persson, thomas.lendacky
In-Reply-To: <d42cbc77-1409-281a-161f-cf9c85443369@synopsys.com>

On Fri, Dec 09, 2016 at 01:19:07PM +0800, Jie Deng wrote:
> 
> 
> On 2016/12/9 6:15, Florian Fainelli wrote:
> > On 12/06/2016 07:57 PM, Jie Deng wrote:
> >> This patch adds phy-mode support for Synopsys XLGMAC
> > The functional changes look good, but I would like to see some
> > description of what the XL part stands for here.
> >
> > While you are modifying this, do you also mind submitting a Device Tree
> > specification change:
> >
> > https://www.devicetree.org/specifications/
> >
> > Thanks!
> Thank you for the information.
> 
> Currently, the XLGMAC is a new IP from Synopsys.

I think Florian wants to know about the IEEE standard, or whatever
defines what the phy-mode XLGMAC is, in the same way there are
standards for RGMII, SGMII, etc.

	  Andrew


* Re: [PATCH v2 net-next 0/4] udp: receive path optimizations
From: Eric Dumazet @ 2016-12-09 16:26 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Eric Dumazet, David S . Miller, netdev, Paolo Abeni
In-Reply-To: <20161209170509.25347c9b@redhat.com>

On Fri, 2016-12-09 at 17:05 +0100, Jesper Dangaard Brouer wrote:
> On Thu, 08 Dec 2016 13:13:15 -0800
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > On Thu, 2016-12-08 at 21:48 +0100, Jesper Dangaard Brouer wrote:
> > > On Thu,  8 Dec 2016 09:38:55 -0800
> > > Eric Dumazet <edumazet@google.com> wrote:
> > >   
> > > > This patch series provides about 100 % performance increase under flood.   
> > > 
> > > Could you please explain a bit more about what kind of testing you are
> > > doing that can show 100% performance improvement?
> > > 
> > > I've tested this patchset and my tests show *huge* speed-ups, but
> > > reaping the performance benefit depends heavily on the setup and on
> > > enabling the right UDP socket settings, and most importantly on where
> > > the performance bottleneck is: ksoftirqd (producer) or udp_sink (consumer).
> > 
> > Right.
> > 
> > So here at Google we do not try (yet) to downgrade our expensive
> > Multiqueue Nics into dumb NICS from last decade by using a single queue
> > on them. Maybe it will happen when we can process 10Mpps per core,
> > but we are not there yet  ;)
> > 
> > So my test is using a NIC, programmed with 8 queues, on a dual-socket
> > machine. (2 physical packages)
> > 
> > 4 queues are handled by 4 cpus on socket0 (NUMA node 0)
> > 4 queues are handled by 4 cpus on socket1 (NUMA node 1)
> 
> Interesting setup; it will be good for catching cache-line bouncing and
> false sharing, as the streak of recent patches shows ;-) (Hopefully
> such setups are avoided in production).

Well, if you have 100Gbit NIC, and 2 NUMA nodes, what do you suggest
exactly, when jobs run on both nodes ?

If you suggest removing one package, or forcing jobs to run on Socket0,
just because the NIC is attached to it, that won't be an option.

Most of the traffic is TCP, so RSS comes nicely here to affine traffic
on one RX queue of the NIC.

Now, if for some reason an innocent UDP socket is the target of a flood,
we need to not make all cpus blocked in a spinlock to eventually queue a
packet.

Be assured that high performance UDP servers use kernel bypass, or
SO_REUSEPORT already. My effort is not targeting these special users,
since they already have good performance.

My effort is to provide some isolation, a bit like the effort I did for
SYN flood attacks (Cpus were all spinning on listener spinlock)




> 
> 
> > So I explicitly put my poor single thread UDP application in the worst
> > condition, having skbs produced on two NUMA nodes. 
> 
> On which CPU do you place the single thread UDP application?

It does not matter in this case. You can either force it to run on a
group of cpus, or let the scheduler choose.

If you let the scheduler choose, then it might help in the single tuple
flood attack case, since the user thread will be moved to a different cpu
than ksoftirqd.

> 
> E.g. do you allow it to run on a CPU that also processes ksoftirq?
> My experience is that performance is approximately halved if ksoftirq and
> the UDP thread share a CPU (after you fixed the softirq issue).

Well, this is exactly what I said earlier. Your choices about cpu
pinning might help or might hurt in different scenarios.

> 
> 
> > Then my load generator use trafgen, with spoofed UDP source addresses,
> > like a UDP flood would use. Or typical DNS traffic, malicious or not.
> 
> I also like trafgen
>  https://github.com/netoptimizer/network-testing/tree/master/trafgen
> 
> > So I have 8 cpus all trying to queue packets in a single UDP socket.
> > 
> > Of course, a real high performance server would use 8 UDP sockets, and
> > SO_REUSEPORT with nice eBPF filter to spread the packets based on the
> > queue/cpu they arrived.
> 
> Once the ksoftirq and UDP-threads are silo'ed like that, it should
> basically correspond to the benchmarks of my single queue test,
> multiplied by the number of CPUs/UDP-threads.

Well, if one cpu is shared by the producer and consumer then packets are
hot in caches, so trying to avoid cache line misses as I did is not
really helping.

I optimized the case where we do not assume both parties run on the same
cpu. If you let the process scheduler do its job, then your throughput can
be doubled ;)

Now if for some reason you are stuck with a single CPU, this is a very
different problem, and af_packet might be better.


> 
> I think it might be a good idea (for me) to implement such a
> UDP-multi-threaded sink example program (with SO_REUSEPORT and eBPF
> filter) to demonstrate and make sure the stack scales (and every
> time we/I improve single queue performance, the numbers should multiply
> with the scaling). Maybe you already have such an example program?


Well, I do have something using SO_REUSEPORT, but not yet BPF, so not in
a state I can share at this moment.


* Re: [PATCH V2  00/22] Broadcom RoCE Driver (bnxt_re)
From: Leon Romanovsky @ 2016-12-09 16:27 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: dledford, linux-rdma, netdev
In-Reply-To: <1481266096-23331-1-git-send-email-selvin.xavier@broadcom.com>


On Thu, Dec 08, 2016 at 10:47:54PM -0800, Selvin Xavier wrote:
> 

...

>  create mode 100644 include/uapi/rdma/bnxt_re_uverbs_abi.h

Please use already established naming format for this file.
It will simplify our future integration with rdma-core library.

Thanks

➜  linux-rdma git:(master) ls -l include/uapi/rdma/*-abi.h 
-rw-r--r-- 1 leonro leonro 2291 Dec  7 13:07 include/uapi/rdma/cxgb3-abi.h
-rw-r--r-- 1 leonro leonro 2488 Dec  7 13:07 include/uapi/rdma/cxgb4-abi.h
-rw-r--r-- 1 leonro leonro 2864 Dec  7 13:07 include/uapi/rdma/mlx4-abi.h
-rw-r--r-- 1 leonro leonro 6103 Dec  8 12:52 include/uapi/rdma/mlx5-abi.h
-rw-r--r-- 1 leonro leonro 2932 Dec  7 13:07 include/uapi/rdma/mthca-abi.h
-rw-r--r-- 1 leonro leonro 3380 Dec  7 13:07 include/uapi/rdma/nes-abi.h
-rw-r--r-- 1 leonro leonro 3918 Dec  7 13:07 include/uapi/rdma/ocrdma-abi.h
-rw-r--r-- 1 leonro leonro 2559 Dec  7 13:07 include/uapi/rdma/qedr-abi.h

> 
> -- 
> 2.5.5
> 



* Re: stmmac DT property snps,axi_all
From: Alexandre Torgue @ 2016-12-09 16:06 UTC (permalink / raw)
  To: Niklas Cassel, Giuseppe Cavallaro; +Cc: netdev
In-Reply-To: <e0362693-4ae9-b3d6-3955-c72df7a1b0c0@axis.com>

Hi Niklas

On 12/09/2016 10:53 AM, Niklas Cassel wrote:
> On 12/09/2016 10:20 AM, Niklas Cassel wrote:
>> On 12/08/2016 02:36 PM, Alexandre Torgue wrote:
>>> Hi Niklas,
>>>
>>> On 12/05/2016 05:18 PM, Niklas Cassel wrote:
>>>> Hello Giuseppe
>>>>
>>>>
>>>> I'm trying to figure out what snps,axi_all is supposed to represent.
>>>>
>>>> It appears that the value is saved, but never used in the code.
>>>>
>>>> Looking at the register specification, I'm guessing that it represents
>>>> Address-Aligned Beats, but there is already the property snps,aal
>>>> for that.
>>> IMO, it is not useful. Indeed AXI_AAL is a read only bit (in AXI bus mode register) and reflects the aal bit in DMA bus register.
>>> As you know we use "snps,aal" to set aal bit in DMA bus register.
>>> So "snps,axi_all" entry seems useless. Let's see with Peppe.
>> Ok, I see. GMAC and GMAC4 are different here.
>>
>> For GMAC4 AAL only exists in DMA_SYS_BUS_MODE.
>> It's not reflected anywhere else.
>>
>> The code is correct in the driver.
>>
>> If snps,axi_all is just created for a read-only register,
>> and it is currently never used in the code,
>> while we have snps,aal, which is correct and works,
>> I guess it should be ok to remove snps,axi_all.
>>
>> I can cook up a patch.
>>
>
> Here we go :)
>
> I will send it as a real patch once net-next reopens.

Thanks ;). Just check with Peppe next week (as he added this property
in the past).

Regards
Alex

>
>
> From defc01cb7c22611b89d9cf1fcae72544092bd62c Mon Sep 17 00:00:00 2001
> From: Niklas Cassel <niklas.cassel@axis.com>
> Date: Fri, 9 Dec 2016 10:27:00 +0100
> Subject: [PATCH net-next] net: stmmac: remove unused duplicate property
>  snps,axi_all
>
> For core revision 3.x Address-Aligned Beats is available in two registers.
> The DT property snps,aal was created for AAL in the DMA bus register,
> which is a read/write bit.
> The DT property snps,axi_all was created for AXI_AAL in the AXI bus mode
> register, which is a read only bit that reflects the value of AAL in the
> DMA bus register.
>
> Since the value of snps,axi_all is never used in the driver,
> and since the property was created for a bit that is read only,
> it should be safe to remove the property.
>
> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
> ---
>  Documentation/devicetree/bindings/net/stmmac.txt      | 1 -
>  drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 1 -
>  include/linux/stmmac.h                                | 1 -
>  3 files changed, 3 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/net/stmmac.txt b/Documentation/devicetree/bindings/net/stmmac.txt
> index 128da752fec9..c3d2fd480a1b 100644
> --- a/Documentation/devicetree/bindings/net/stmmac.txt
> +++ b/Documentation/devicetree/bindings/net/stmmac.txt
> @@ -65,7 +65,6 @@ Optional properties:
>      - snps,wr_osr_lmt: max write outstanding req. limit
>      - snps,rd_osr_lmt: max read outstanding req. limit
>      - snps,kbbe: do not cross 1KiB boundary.
> -    - snps,axi_all: align address
>      - snps,blen: this is a vector of supported burst length.
>      - snps,fb: fixed-burst
>      - snps,mb: mixed-burst
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> index 082cd48db6a7..60ba8993c650 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> @@ -121,7 +121,6 @@ static struct stmmac_axi *stmmac_axi_setup(struct platform_device *pdev)
>      axi->axi_lpi_en = of_property_read_bool(np, "snps,lpi_en");
>      axi->axi_xit_frm = of_property_read_bool(np, "snps,xit_frm");
>      axi->axi_kbbe = of_property_read_bool(np, "snps,axi_kbbe");
> -    axi->axi_axi_all = of_property_read_bool(np, "snps,axi_all");
>      axi->axi_fb = of_property_read_bool(np, "snps,axi_fb");
>      axi->axi_mb = of_property_read_bool(np, "snps,axi_mb");
>      axi->axi_rb =  of_property_read_bool(np, "snps,axi_rb");
> diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
> index 266dab9ad782..889e0e9a3f1c 100644
> --- a/include/linux/stmmac.h
> +++ b/include/linux/stmmac.h
> @@ -103,7 +103,6 @@ struct stmmac_axi {
>      u32 axi_wr_osr_lmt;
>      u32 axi_rd_osr_lmt;
>      bool axi_kbbe;
> -    bool axi_axi_all;
>      u32 axi_blen[AXI_BLEN];
>      bool axi_fb;
>      bool axi_mb;
>


* Re: [PATCH v2 net-next 0/4] udp: receive path optimizations
From: Jesper Dangaard Brouer @ 2016-12-09 16:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Eric Dumazet, David S . Miller, netdev, Paolo Abeni, brouer
In-Reply-To: <1481231595.4930.142.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, 08 Dec 2016 13:13:15 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Thu, 2016-12-08 at 21:48 +0100, Jesper Dangaard Brouer wrote:
> > On Thu,  8 Dec 2016 09:38:55 -0800
> > Eric Dumazet <edumazet@google.com> wrote:
> >   
> > > This patch series provides about 100 % performance increase under flood.   
> > 
> > Could you please explain a bit more about what kind of testing you are
> > doing that can show 100% performance improvement?
> > 
> > I've tested this patchset and my tests show *huge* speed-ups, but
> > reaping the performance benefit depends heavily on the setup and on
> > enabling the right UDP socket settings, and most importantly on where
> > the performance bottleneck is: ksoftirqd (producer) or udp_sink (consumer).
> 
> Right.
> 
> So here at Google we do not try (yet) to downgrade our expensive
> Multiqueue Nics into dumb NICS from last decade by using a single queue
> on them. Maybe it will happen when we can process 10Mpps per core,
> but we are not there yet  ;)
> 
> So my test is using a NIC, programmed with 8 queues, on a dual-socket
> machine. (2 physical packages)
> 
> 4 queues are handled by 4 cpus on socket0 (NUMA node 0)
> 4 queues are handled by 4 cpus on socket1 (NUMA node 1)

Interesting setup; it will be good for catching cache-line bouncing and
false sharing, as the streak of recent patches shows ;-) (Hopefully
such setups are avoided in production).


> So I explicitly put my poor single thread UDP application in the worst
> condition, having skbs produced on two NUMA nodes. 

On which CPU do you place the single thread UDP application?

E.g. do you allow it to run on a CPU that also processes ksoftirq?
My experience is that performance is approximately halved if ksoftirq and
the UDP thread share a CPU (after you fixed the softirq issue).


> Then my load generator use trafgen, with spoofed UDP source addresses,
> like a UDP flood would use. Or typical DNS traffic, malicious or not.

I also like trafgen
 https://github.com/netoptimizer/network-testing/tree/master/trafgen

> So I have 8 cpus all trying to queue packets in a single UDP socket.
> 
> Of course, a real high performance server would use 8 UDP sockets, and
> SO_REUSEPORT with nice eBPF filter to spread the packets based on the
> queue/cpu they arrived.

Once the ksoftirq and UDP-threads are silo'ed like that, it should
basically correspond to the benchmarks of my single queue test,
multiplied by the number of CPUs/UDP-threads.

I think it might be a good idea (for me) to implement such a
UDP-multi-threaded sink example program (with SO_REUSEPORT and eBPF
filter) to demonstrate and make sure the stack scales (and every
time we/I improve single queue performance, the numbers should multiply
with the scaling). Maybe you already have such an example program?


> In the case you have one cpu that you need to share between ksoftirq and
> all user threads, then your test results depend on process scheduler
> decisions more than anything we can code in network land.

Yes, that is also my experience; the scheduler has a large influence.
 
> It is actually easy for user space to get more than 50% of the cycles,
> and 'starve' ksoftirqd.

FYI, Paolo recently added an option for parsing of pktgen payload in
the udp_sink.c program, this way we can simulate the app doing something.

I've started testing with 4 CPUs doing ksoftirq, multiple flows
(pktgen_sample04_many_flows.sh), and then incrementally adding udp_sink
--reuse-port programs on the other 4 CPUs, and it looks like it scales
nicely :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* [PATCH net-next] net: skb_condense() can also deal with empty skbs
From: Eric Dumazet @ 2016-12-09 16:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

It seems attackers can also send UDP packets with no payload at all.

skb_condense() can still be a win in this case.

It will be possible to replace the custom code in tcp_add_backlog()
to get full benefit from skb_condense()

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/skbuff.c |   22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 84151cf40aebb973bad5bee3ee4be0758084d83c..b1451e66d570269252ce628b2dc1714b860e1ca4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4946,16 +4946,20 @@ EXPORT_SYMBOL(pskb_extract);
  */
 void skb_condense(struct sk_buff *skb)
 {
-	if (!skb->data_len ||
-	    skb->data_len > skb->end - skb->tail ||
-	    skb_cloned(skb))
-		return;
-
-	/* Nice, we can free page frag(s) right now */
-	__pskb_pull_tail(skb, skb->data_len);
+	if (skb->data_len) {
+		if (skb->data_len > skb->end - skb->tail ||
+		    skb_cloned(skb))
+			return;
 
-	/* Now adjust skb->truesize, since __pskb_pull_tail() does
-	 * not do this.
+		/* Nice, we can free page frag(s) right now */
+		__pskb_pull_tail(skb, skb->data_len);
+	}
+	/* At this point, skb->truesize might be over estimated,
+	 * because skb had a fragment, and fragments do not tell
+	 * their truesize.
+	 * When we pulled its content into skb->head, fragment
+	 * was freed, but __pskb_pull_tail() could not possibly
+	 * adjust skb->truesize, not knowing the frag truesize.
 	 */
 	skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
 }
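
To make the control-flow change in the diff above easier to see, here is
a toy userspace model, not the kernel code. struct demo_skb, its fields
and the sizes used are all hypothetical; head_size stands in for
SKB_TRUESIZE(skb_end_offset(skb)) and demo_pull_tail() for
__pskb_pull_tail(). The point is only that the old version returned
early for an skb with no frag data, while the new version still
recomputes truesize for it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct sk_buff; all fields and sizes are hypothetical. */
struct demo_skb {
	unsigned int data_len;	/* bytes still held in page frags */
	unsigned int tailroom;	/* models skb->end - skb->tail */
	unsigned int head_size;	/* models SKB_TRUESIZE(skb_end_offset(skb)) */
	unsigned int truesize;	/* memory currently charged for this skb */
	bool cloned;
};

/* Models __pskb_pull_tail(): move frag bytes into the linear area. */
static void demo_pull_tail(struct demo_skb *skb)
{
	skb->tailroom -= skb->data_len;
	skb->data_len = 0;
}

/* Before the patch: an skb without frag data returns early. */
void demo_condense_old(struct demo_skb *skb)
{
	if (!skb->data_len || skb->data_len > skb->tailroom || skb->cloned)
		return;
	demo_pull_tail(skb);
	skb->truesize = skb->head_size;
}

/* After the patch: only the pull is conditional on data_len. */
void demo_condense_new(struct demo_skb *skb)
{
	if (skb->data_len) {
		if (skb->data_len > skb->tailroom || skb->cloned)
			return;
		demo_pull_tail(skb);
	}
	skb->truesize = skb->head_size;
}

/* Usage: an empty-payload skb whose truesize is over-estimated. */
void demo_empty_skb(void)
{
	struct demo_skb a = { .data_len = 0, .tailroom = 128,
			      .head_size = 768, .truesize = 4864 };
	struct demo_skb b = a;

	demo_condense_old(&a);
	demo_condense_new(&b);
	assert(a.truesize == 4864);	/* old code left it untouched */
	assert(b.truesize == 768);	/* new code trims it to the head size */
}
```

For an skb that does carry frag data and has enough tailroom, both
versions behave the same in this model.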


* Re: Synopsys Ethernet QoS
From: Joao Pinto @ 2016-12-09 15:54 UTC (permalink / raw)
  To: David Miller, Joao.Pinto
  Cc: peppe.cavallaro, lars.persson, rabin.vincent, netdev,
	andy.shevchenko, CARLOS.PALMINHA
In-Reply-To: <20161209.104152.1969880574279771010.davem@davemloft.net>

At 3:41 PM on 12/9/2016, David Miller wrote:
> From: Joao Pinto <Joao.Pinto@synopsys.com>
> Date: Fri, 9 Dec 2016 15:36:38 +0000
> 
>> Of course, I started a general discussion about the subject and
>> those were the conclusions, but I would like to know if you as the
>> subsystem maintainer also support the approach or have any
>> suggestion.
> 
> Generally, I support whatever the interested parties agree to.
> 
> But one thing I am against is changing the driver name for existing
> users.  If an existing chip is supported by the stmmac driver for
> existing users, they should still continue to use the "stmmac" driver.
> 
> Therefore, if consolidation changes the driver module name for
> existing users, then that is not a good plan at all.
> 

Of course, 100% with you! Backward compatibility for existing drivers is a
must-have. The consolidation is going to be done with extreme care.

Joao


* Re: Synopsys Ethernet QoS
From: David Miller @ 2016-12-09 15:41 UTC (permalink / raw)
  To: Joao.Pinto
  Cc: peppe.cavallaro, lars.persson, rabin.vincent, netdev,
	andy.shevchenko, CARLOS.PALMINHA
In-Reply-To: <93b73b79-36aa-56b8-f975-b890b7a48bd1@synopsys.com>

From: Joao Pinto <Joao.Pinto@synopsys.com>
Date: Fri, 9 Dec 2016 15:36:38 +0000

> Of course, I started a general discussion about the subject and
> those were the conclusions, but I would like to know if you as the
> subsystem maintainer also support the approach or have any
> suggestion.

Generally, I support whatever the interested parties agree to.

But one thing I am against is changing the driver name for existing
users.  If an existing chip is supported by the stmmac driver for
existing users, they should still continue to use the "stmmac" driver.

Therefore, if consolidation changes the driver module name for
existing users, then that is not a good plan at all.


* Re: Synopsys Ethernet QoS
From: Joao Pinto @ 2016-12-09 15:36 UTC (permalink / raw)
  To: David Miller, Joao.Pinto
  Cc: peppe.cavallaro, lars.persson, rabin.vincent, netdev,
	andy.shevchenko, CARLOS.PALMINHA
In-Reply-To: <20161209.103327.1742213347114742435.davem@davemloft.net>

Hi David,

Of course, I started a general discussion about the subject and those were the
conclusions, but I would like to know if you as the subsystem maintainer also
support the approach or have any suggestion.

Thanks,
Joao

At 3:33 PM on 12/9/2016, David Miller wrote:
> From: Joao Pinto <Joao.Pinto@synopsys.com>
> Date: Fri, 9 Dec 2016 11:29:02 +0000
> 
>> Dear David Miller,
>  ...
>> I would like to know if you support this plan.
> 
> This is not how this works.
> 
> You need to discuss and work out a plan with the other people
> with a direct interest in the existing drivers and maintenance.
> 
> Not me.
> 


* Re: Synopsys Ethernet QoS
From: David Miller @ 2016-12-09 15:33 UTC (permalink / raw)
  To: Joao.Pinto
  Cc: peppe.cavallaro, lars.persson, rabin.vincent, netdev,
	andy.shevchenko, CARLOS.PALMINHA
In-Reply-To: <2df7a6dd-1128-d1d6-bf61-891f76cf7200@synopsys.com>

From: Joao Pinto <Joao.Pinto@synopsys.com>
Date: Fri, 9 Dec 2016 11:29:02 +0000

> Dear David Miller,
 ...
> I would like to know if you support this plan.

This is not how this works.

You need to discuss and work out a plan with the other people
with a direct interest in the existing drivers and maintenance.

Not me.

^ permalink raw reply

* Re: [PATCHv3 perf/core 0/7] Reuse libbpf from samples/bpf
From: Daniel Borkmann @ 2016-12-09 15:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Joe Stringer
  Cc: linux-kernel, netdev, wangnan0, ast
In-Reply-To: <20161209150907.GM8257@kernel.org>

Hi Arnaldo,

On 12/09/2016 04:09 PM, Arnaldo Carvalho de Melo wrote:
> On Thu, Dec 08, 2016 at 06:46:13PM -0800, Joe Stringer wrote:
>> (Was "libbpf: Synchronize implementations")
>>
>> Update tools/lib/bpf to provide the remaining bpf wrapper pieces needed by the
>> samples/bpf/ code, then get rid of all of the duplicate BPF libraries in
>> samples/bpf/libbpf.[ch].
>>
>> ---
>> v3: Add ack for first patch.
>>      Split out second patch from v2 into separate changes for remaining diff.
>>      Add patches to switch samples/bpf over to using tools/lib/.
>> v2: https://www.mail-archive.com/netdev@vger.kernel.org/msg135088.html
>>      Don't shift non-bpf code into libbpf.
>>      Drop the patch to synchronize ELF definitions with tc.
>> v1: https://www.mail-archive.com/netdev@vger.kernel.org/msg135088.html
>>      First post.
>
> Thanks, applied after addressing the -I$(objtree) issue raised by Wang,

[ Sorry for late reply. ]

First of all, glad to see us getting rid of the duplicate lib eventually! :)

Please note that this might result in hopefully just a minor merge issue
with net-next. Looks like patch 4/7 touches test_maps.c and test_verifier.c,
which moved to a new bpf selftest suite [1] this net-next cycle. Seems it's
just log buffer and some renames there, which can be discarded for both
files sitting in selftests.

Thanks,
Daniel

   [1] https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/tree/tools/testing/selftests/bpf

^ permalink raw reply

* Re: [PATCH V2 00/22] Broadcom RoCE Driver (bnxt_re)
From: David Miller @ 2016-12-09 15:26 UTC (permalink / raw)
  To: selvin.xavier; +Cc: dledford, linux-rdma, netdev
In-Reply-To: <1481266096-23331-1-git-send-email-selvin.xavier@broadcom.com>

From: Selvin Xavier <selvin.xavier@broadcom.com>
Date: Thu,  8 Dec 2016 22:47:54 -0800

> This series introduces the RoCE driver for the Broadcom
> NetXtreme-E 10/25/40/50 gigabit RoCE HCAs. 
> This driver is dependent on the bnxt_en NIC driver and is 
> based on the bnxt_re branch in Doug's repository. bnxt_en changes
> required for this patch series are already available in this branch.
> 
> I am preparing a git repository with these changes as per Jason's
> comment and will share the details later today.

If this is targeted at the net-next tree, it is too late, as I
closed the net-next tree two nights ago.

Please resubmit this after the upcoming merge window closes.

Thanks.

^ permalink raw reply

* Re: [PATCH 37/50] netfilter: nf_tables: atomic dump and reset for stateful objects
From: Eric Dumazet @ 2016-12-09 15:22 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Paul Gortmaker, netfilter-devel, David Miller, netdev,
	linux-next@vger.kernel.org
In-Reply-To: <1481293492.4930.168.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, 2016-12-09 at 06:24 -0800, Eric Dumazet wrote:

> It looks like you want a seqcount, even on 64bit arches,
> so that CPU 2 can restart its loop, and more importantly you need
> to not accumulate the values you read, because they might be old/invalid.

Untested patch to give the general idea. I can polish it a bit later today.

 net/netfilter/nft_counter.c |   59 +++++++++++++---------------------
 1 file changed, 23 insertions(+), 36 deletions(-)

diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
index f6a02c5071c2aeafca7635da3282a809aa04d6ab..57ed95b024473a2aa76298fe5bb5013bf709801b 100644
--- a/net/netfilter/nft_counter.c
+++ b/net/netfilter/nft_counter.c
@@ -31,18 +31,25 @@ struct nft_counter_percpu_priv {
 	struct nft_counter_percpu __percpu *counter;
 };
 
+static DEFINE_PER_CPU(seqcount_t, nft_counter_seq);
+
 static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
 				       struct nft_regs *regs,
 				       const struct nft_pktinfo *pkt)
 {
 	struct nft_counter_percpu *this_cpu;
+	seqcount_t *myseq;
 
 	local_bh_disable();
 	this_cpu = this_cpu_ptr(priv->counter);
-	u64_stats_update_begin(&this_cpu->syncp);
+	myseq = this_cpu_ptr(&nft_counter_seq);
+
+	write_seqcount_begin(myseq);
+
 	this_cpu->counter.bytes += pkt->skb->len;
 	this_cpu->counter.packets++;
-	u64_stats_update_end(&this_cpu->syncp);
+
+	write_seqcount_end(myseq);
 	local_bh_enable();
 }
 
@@ -110,52 +117,30 @@ static void nft_counter_fetch(struct nft_counter_percpu __percpu *counter,
 
 	memset(total, 0, sizeof(*total));
 	for_each_possible_cpu(cpu) {
+		seqcount_t *seqp = per_cpu_ptr(&nft_counter_seq, cpu);
+
 		cpu_stats = per_cpu_ptr(counter, cpu);
 		do {
-			seq	= u64_stats_fetch_begin_irq(&cpu_stats->syncp);
+			seq	= read_seqcount_begin(seqp);
 			bytes	= cpu_stats->counter.bytes;
 			packets	= cpu_stats->counter.packets;
-		} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, seq));
+		} while (read_seqcount_retry(seqp, seq));
 
 		total->packets += packets;
 		total->bytes += bytes;
 	}
 }
 
-static u64 __nft_counter_reset(u64 *counter)
-{
-	u64 ret, old;
-
-	do {
-		old = *counter;
-		ret = cmpxchg64(counter, old, 0);
-	} while (ret != old);
-
-	return ret;
-}
-
 static void nft_counter_reset(struct nft_counter_percpu __percpu *counter,
 			      struct nft_counter *total)
 {
 	struct nft_counter_percpu *cpu_stats;
-	u64 bytes, packets;
-	unsigned int seq;
-	int cpu;
 
-	memset(total, 0, sizeof(*total));
-	for_each_possible_cpu(cpu) {
-		bytes = packets = 0;
-
-		cpu_stats = per_cpu_ptr(counter, cpu);
-		do {
-			seq	= u64_stats_fetch_begin_irq(&cpu_stats->syncp);
-			packets	+= __nft_counter_reset(&cpu_stats->counter.packets);
-			bytes	+= __nft_counter_reset(&cpu_stats->counter.bytes);
-		} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, seq));
-
-		total->packets += packets;
-		total->bytes += bytes;
-	}
+	local_bh_disable();
+	cpu_stats = this_cpu_ptr(counter);
+	cpu_stats->counter.packets -= total->packets;
+	cpu_stats->counter.bytes -= total->bytes;
+	local_bh_enable();
 }
 
 static int nft_counter_do_dump(struct sk_buff *skb,
@@ -164,10 +149,9 @@ static int nft_counter_do_dump(struct sk_buff *skb,
 {
 	struct nft_counter total;
 
+	nft_counter_fetch(priv->counter, &total);
 	if (reset)
 		nft_counter_reset(priv->counter, &total);
-	else
-		nft_counter_fetch(priv->counter, &total);
 
 	if (nla_put_be64(skb, NFTA_COUNTER_BYTES, cpu_to_be64(total.bytes),
 			 NFTA_COUNTER_PAD) ||
@@ -285,7 +269,10 @@ static struct nft_expr_type nft_counter_type __read_mostly = {
 
 static int __init nft_counter_module_init(void)
 {
-	int err;
+	int err, cpu;
+
+	for_each_possible_cpu(cpu)
+		seqcount_init(per_cpu_ptr(&nft_counter_seq, cpu));
 
 	err = nft_register_obj(&nft_counter_obj);
 	if (err < 0)
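
The retry loop in nft_counter_fetch() above can be illustrated outside the kernel. The following is a minimal userspace sketch using C11 atomics, not the kernel's seqcount_t implementation; the names struct seq_counter, counter_add() and counter_fetch() are hypothetical, and it assumes a single writer per counter, as with per-CPU data. The reader copies values into locals and adds them to the totals only after the sequence check succeeds, so a snapshot that raced with a writer is never accumulated.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for one per-CPU counter plus its seqcount. */
struct seq_counter {
	atomic_uint seq;		/* odd while an update is in progress */
	_Atomic uint64_t bytes;
	_Atomic uint64_t packets;
};

/* Writer: assumes exactly one writer per counter, as with per-CPU data. */
static void counter_add(struct seq_counter *c, uint64_t len)
{
	unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);
	uint64_t b = atomic_load_explicit(&c->bytes, memory_order_relaxed);
	uint64_t p = atomic_load_explicit(&c->packets, memory_order_relaxed);

	/* Mark the update as in progress (sequence becomes odd). */
	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&c->bytes, b + len, memory_order_relaxed);
	atomic_store_explicit(&c->packets, p + 1, memory_order_relaxed);

	/* Publish the update (sequence becomes even again). */
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader: sum all counters, retrying any snapshot that raced with a writer. */
static void counter_fetch(struct seq_counter *c, int n,
			  uint64_t *bytes, uint64_t *packets)
{
	*bytes = 0;
	*packets = 0;

	for (int i = 0; i < n; i++) {
		unsigned int s1, s2;
		uint64_t b, p;

		do {
			s1 = atomic_load_explicit(&c[i].seq, memory_order_acquire);
			b = atomic_load_explicit(&c[i].bytes, memory_order_relaxed);
			p = atomic_load_explicit(&c[i].packets, memory_order_relaxed);
			atomic_thread_fence(memory_order_acquire);
			s2 = atomic_load_explicit(&c[i].seq, memory_order_relaxed);
		} while ((s1 & 1) || s1 != s2);

		/* Only a validated snapshot is added to the totals. */
		*bytes += b;
		*packets += p;
	}
}
```

A caller would keep one zero-initialized struct seq_counter per writer, call counter_add() from that writer only, and call counter_fetch() from any thread.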

^ permalink raw reply related

* Re: [PATCH] linux/types.h: enable endian checks for all sparse builds
From: Bart Van Assche @ 2016-12-09 15:18 UTC (permalink / raw)
  To: Madhani, Himanshu, Michael S. Tsirkin
  Cc: kvm@vger.kernel.org, Neil Armstrong, David Airlie,
	linux-remoteproc@vger.kernel.org, dri-devel@lists.freedesktop.org,
	virtualization@lists.linux-foundation.org,
	linux-s390@vger.kernel.org, James E.J. Bottomley, Herbert Xu,
	linux-scsi@vger.kernel.org, Christoph Hellwig,
	v9fs-developer@lists.sourceforge.net, Asias He, Arnd Bergmann,
	linux-kbuild@vger.kernel.org, Jens Axboe, Michal Marek,
	Stefan Hajnoczi <stef
In-Reply-To: <6199215E-2AA4-4705-9552-5D61FE03F866@cavium.com>

On 12/08/16 22:40, Madhani, Himanshu wrote:
> We’ll take a look and send patches to resolve these warnings.

Thanks!

Bart.

^ permalink raw reply

* Re: [PATCH net-next 0/2] Initial driver for Synopsys DWC XLGMAC
From: Carlos Palminha @ 2016-12-09 15:15 UTC (permalink / raw)
  To: Jie Deng, davem@davemloft.net, f.fainelli@gmail.com,
	netdev@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, lars.persson@axis.com,
	thomas.lendacky@amd.com
In-Reply-To: <cover.1481075763.git.jiedeng@synopsys.com>

Hi Jie,

I don't think we need to create the "dwc" subdirectory under "synopsys".
It's preferable to have the files directly under drivers/net/ethernet/synopsys.

Regards,
C.Palminha

On 07-12-2016 03:57, Jie Deng wrote:
> This series provides the support for 25/40/50/100 GbE
> devices using Synopsys DWC Enterprise Ethernet (XLGMAC).
> 
> The first patch adds support for Synopsys XLGMII.
> The second patch provides the initial driver for Synopsys XLGMAC
> 
> The driver has three layers by refactoring AMD XGBE.
> 
> dwc-eth-xxx.x
>   The DWC ethernet core layer (DWC ECL). This layer contains code
> that can be shared by different DWC series ethernet cores
> 
> dwc-xxx.x (e.g. dwc-xlgmac.c)
>   The DWC MAC HW adapter layer (DWC MHAL). This layer contains
> special support for a specific MAC. e.g. currently, XLGMAC.
> 
> xxx-xxx-pci.c xxx-xxx-plat.c (e.g. dwc-xlgmac-pci.c)
>   The glue adapter layer (GAL). Vendors who adopt Synopsys Ethernet
> cores can develop a glue driver for their platform.
> 
> Jie Deng (2):
>   net: phy: add extension of phy-mode for XLGMII
>   net: ethernet: Initial driver for Synopsys DWC XLGMAC
> 
>  Documentation/devicetree/bindings/net/ethernet.txt |    1 +
>  MAINTAINERS                                        |    6 +
>  drivers/net/ethernet/synopsys/Kconfig              |    2 +
>  drivers/net/ethernet/synopsys/Makefile             |    1 +
>  drivers/net/ethernet/synopsys/dwc/Kconfig          |   37 +
>  drivers/net/ethernet/synopsys/dwc/Makefile         |    9 +
>  drivers/net/ethernet/synopsys/dwc/dwc-eth-dcb.c    |  228 ++
>  .../net/ethernet/synopsys/dwc/dwc-eth-debugfs.c    |  328 +++
>  drivers/net/ethernet/synopsys/dwc/dwc-eth-desc.c   |  715 +++++
>  .../net/ethernet/synopsys/dwc/dwc-eth-ethtool.c    |  567 ++++
>  drivers/net/ethernet/synopsys/dwc/dwc-eth-hw.c     | 3098 ++++++++++++++++++++
>  drivers/net/ethernet/synopsys/dwc/dwc-eth-mdio.c   |  252 ++
>  drivers/net/ethernet/synopsys/dwc/dwc-eth-net.c    | 2319 +++++++++++++++
>  drivers/net/ethernet/synopsys/dwc/dwc-eth-ptp.c    |  216 ++
>  drivers/net/ethernet/synopsys/dwc/dwc-eth-regacc.h | 1115 +++++++
>  drivers/net/ethernet/synopsys/dwc/dwc-eth.h        |  738 +++++
>  drivers/net/ethernet/synopsys/dwc/dwc-xlgmac-pci.c |  538 ++++
>  drivers/net/ethernet/synopsys/dwc/dwc-xlgmac.c     |  135 +
>  drivers/net/ethernet/synopsys/dwc/dwc-xlgmac.h     |   85 +
>  include/linux/phy.h                                |    3 +
>  20 files changed, 10393 insertions(+)
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/Kconfig
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/Makefile
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-dcb.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-debugfs.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-desc.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-ethtool.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-hw.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-mdio.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-net.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-ptp.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth-regacc.h
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-eth.h
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-xlgmac-pci.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-xlgmac.c
>  create mode 100644 drivers/net/ethernet/synopsys/dwc/dwc-xlgmac.h
> 
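
The three-layer split described in the quoted cover letter can be pictured with a small ops-table sketch. This is an illustration only and is not taken from the posted driver: every name below (struct eth_hw_ops, struct eth_core, xlgmac_ops, eth_core_start(), glue_probe()) is hypothetical. The shared core calls through a table of function pointers, a MAC-specific adapter fills that table, and a glue layer binds the two at probe time.

```c
#include <stddef.h>

/* Hypothetical: operations a MAC-specific adapter supplies to the shared core. */
struct eth_hw_ops {
	int (*init)(void *hw);
	unsigned int (*max_speed_gbps)(void);
};

/* Hypothetical: state owned by the shared core layer. */
struct eth_core {
	const struct eth_hw_ops *ops;
	void *hw;
};

/* MAC adapter layer: the parts specific to one MAC (here, an XLGMAC stand-in). */
static int xlgmac_init(void *hw)
{
	(void)hw;	/* a real adapter would program MAC registers here */
	return 0;
}

static unsigned int xlgmac_max_speed_gbps(void)
{
	return 100;	/* the cover letter lists 25/40/50/100 GbE devices */
}

static const struct eth_hw_ops xlgmac_ops = {
	.init = xlgmac_init,
	.max_speed_gbps = xlgmac_max_speed_gbps,
};

/* Core layer: shared logic that only talks to the hardware through ops. */
static int eth_core_start(struct eth_core *core)
{
	if (!core || !core->ops || !core->ops->init)
		return -1;
	return core->ops->init(core->hw);
}

/* Glue layer: a bus-specific probe that selects the adapter and starts the core. */
static int glue_probe(struct eth_core *core)
{
	core->ops = &xlgmac_ops;
	core->hw = NULL;
	return eth_core_start(core);
}
```

With this shape, supporting another MAC would mean adding another ops table, while the core and the probe path stay the same.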

^ permalink raw reply

* Re: [PATCHv3 perf/core 0/7] Reuse libbpf from samples/bpf
From: Arnaldo Carvalho de Melo @ 2016-12-09 15:09 UTC (permalink / raw)
  To: Joe Stringer; +Cc: linux-kernel, netdev, wangnan0, ast, daniel
In-Reply-To: <20161209024620.31660-1-joe@ovn.org>

On Thu, Dec 08, 2016 at 06:46:13PM -0800, Joe Stringer wrote:
> (Was "libbpf: Synchronize implementations")
> 
> Update tools/lib/bpf to provide the remaining bpf wrapper pieces needed by the
> samples/bpf/ code, then get rid of all of the duplicate BPF libraries in
> samples/bpf/libbpf.[ch].
> 
> ---
> v3: Add ack for first patch.
>     Split out second patch from v2 into separate changes for remaining diff.
>     Add patches to switch samples/bpf over to using tools/lib/.
> v2: https://www.mail-archive.com/netdev@vger.kernel.org/msg135088.html
>     Don't shift non-bpf code into libbpf.
>     Drop the patch to synchronize ELF definitions with tc.
> v1: https://www.mail-archive.com/netdev@vger.kernel.org/msg135088.html
>     First post.

Thanks, applied after addressing the -I$(objtree) issue raised by Wang,

- Arnaldo

^ permalink raw reply

* Re: [PATCHv3 perf/core 6/7] samples/bpf: Remove perf_event_open() declaration
From: Arnaldo Carvalho de Melo @ 2016-12-09 14:59 UTC (permalink / raw)
  To: Joe Stringer; +Cc: linux-kernel, wangnan0, ast, daniel, netdev
In-Reply-To: <20161209024620.31660-7-joe@ovn.org>

On Thu, Dec 08, 2016 at 06:46:19PM -0800, Joe Stringer wrote:
> This declaration was made in samples/bpf/libbpf.c for convenience, but
> there's already one in tools/perf/perf-sys.h. Reuse that one.
> 
> Signed-off-by: Joe Stringer <joe@ovn.org>
> ---
> v3: First post.
> ---
>  samples/bpf/Makefile            | 3 ++-
>  samples/bpf/bpf_load.c          | 3 ++-
>  samples/bpf/libbpf.c            | 7 -------
>  samples/bpf/libbpf.h            | 3 ---
>  samples/bpf/sampleip_user.c     | 3 ++-
>  samples/bpf/trace_event_user.c  | 9 +++++----
>  samples/bpf/trace_output_user.c | 3 ++-
>  samples/bpf/tracex6_user.c      | 3 ++-
>  8 files changed, 15 insertions(+), 19 deletions(-)
> 
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index c8f7ed37b2de..0adc47e67e65 100644
> --- a/samples/bpf/Makefile
> +++ b/samples/bpf/Makefile
> @@ -92,7 +92,8 @@ always += test_current_task_under_cgroup_kern.o
>  always += trace_event_kern.o
>  always += sampleip_kern.o
>  
> -HOSTCFLAGS += -I$(objtree)/usr/include -I$(objtree)/tools/lib/
> +HOSTCFLAGS += -I$(objtree)/usr/include -I$(objtree)/tools/lib/ \
> +	      -I$(objtree)/tools/include -I$(objtree)/tools/perf

Switching these to $(srctree) as well, to support building it like:

  make -j4 O=../build/v4.9.0-rc8+ samples/bpf/

>  
>  HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable
>  HOSTLOADLIBES_fds_example += -lelf
> diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
> index f8e3c58a0897..d683bd278171 100644
> --- a/samples/bpf/bpf_load.c
> +++ b/samples/bpf/bpf_load.c
> @@ -19,6 +19,7 @@
>  #include <ctype.h>
>  #include "libbpf.h"
>  #include "bpf_load.h"
> +#include "perf-sys.h"
>  
>  #define DEBUGFS "/sys/kernel/debug/tracing/"
>  
> @@ -168,7 +169,7 @@ static int load_and_attach(const char *event, struct bpf_insn *prog, int size)
>  	id = atoi(buf);
>  	attr.config = id;
>  
> -	efd = perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
> +	efd = sys_perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
>  	if (efd < 0) {
>  		printf("event %d fd %d err %s\n", id, efd, strerror(errno));
>  		return -1;
> diff --git a/samples/bpf/libbpf.c b/samples/bpf/libbpf.c
> index d9af876b4a2c..bee473a494f1 100644
> --- a/samples/bpf/libbpf.c
> +++ b/samples/bpf/libbpf.c
> @@ -34,10 +34,3 @@ int open_raw_sock(const char *name)
>  
>  	return sock;
>  }
> -
> -int perf_event_open(struct perf_event_attr *attr, int pid, int cpu,
> -		    int group_fd, unsigned long flags)
> -{
> -	return syscall(__NR_perf_event_open, attr, pid, cpu,
> -		       group_fd, flags);
> -}
> diff --git a/samples/bpf/libbpf.h b/samples/bpf/libbpf.h
> index cc815624aacf..09aedc320009 100644
> --- a/samples/bpf/libbpf.h
> +++ b/samples/bpf/libbpf.h
> @@ -188,7 +188,4 @@ struct bpf_insn;
>  /* create RAW socket and bind to interface 'name' */
>  int open_raw_sock(const char *name);
>  
> -struct perf_event_attr;
> -int perf_event_open(struct perf_event_attr *attr, int pid, int cpu,
> -		    int group_fd, unsigned long flags);
>  #endif
> diff --git a/samples/bpf/sampleip_user.c b/samples/bpf/sampleip_user.c
> index 09ab620b324c..476a11947180 100644
> --- a/samples/bpf/sampleip_user.c
> +++ b/samples/bpf/sampleip_user.c
> @@ -21,6 +21,7 @@
>  #include <sys/ioctl.h>
>  #include "libbpf.h"
>  #include "bpf_load.h"
> +#include "perf-sys.h"
>  
>  #define DEFAULT_FREQ	99
>  #define DEFAULT_SECS	5
> @@ -50,7 +51,7 @@ static int sampling_start(int *pmu_fd, int freq)
>  	};
>  
>  	for (i = 0; i < nr_cpus; i++) {
> -		pmu_fd[i] = perf_event_open(&pe_sample_attr, -1 /* pid */, i,
> +		pmu_fd[i] = sys_perf_event_open(&pe_sample_attr, -1 /* pid */, i,
>  					    -1 /* group_fd */, 0 /* flags */);
>  		if (pmu_fd[i] < 0) {
>  			fprintf(stderr, "ERROR: Initializing perf sampling\n");
> diff --git a/samples/bpf/trace_event_user.c b/samples/bpf/trace_event_user.c
> index de8fd0266d78..ccb0cba8324a 100644
> --- a/samples/bpf/trace_event_user.c
> +++ b/samples/bpf/trace_event_user.c
> @@ -20,6 +20,7 @@
>  #include <sys/resource.h>
>  #include "libbpf.h"
>  #include "bpf_load.h"
> +#include "perf-sys.h"
>  
>  #define SAMPLE_FREQ 50
>  
> @@ -126,9 +127,9 @@ static void test_perf_event_all_cpu(struct perf_event_attr *attr)
>  
>  	/* open perf_event on all cpus */
>  	for (i = 0; i < nr_cpus; i++) {
> -		pmu_fd[i] = perf_event_open(attr, -1, i, -1, 0);
> +		pmu_fd[i] = sys_perf_event_open(attr, -1, i, -1, 0);
>  		if (pmu_fd[i] < 0) {
> -			printf("perf_event_open failed\n");
> +			printf("sys_perf_event_open failed\n");
>  			goto all_cpu_err;
>  		}
>  		assert(ioctl(pmu_fd[i], PERF_EVENT_IOC_SET_BPF, prog_fd[0]) == 0);
> @@ -147,9 +148,9 @@ static void test_perf_event_task(struct perf_event_attr *attr)
>  	int pmu_fd;
>  
>  	/* open task bound event */
> -	pmu_fd = perf_event_open(attr, 0, -1, -1, 0);
> +	pmu_fd = sys_perf_event_open(attr, 0, -1, -1, 0);
>  	if (pmu_fd < 0) {
> -		printf("perf_event_open failed\n");
> +		printf("sys_perf_event_open failed\n");
>  		return;
>  	}
>  	assert(ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd[0]) == 0);
> diff --git a/samples/bpf/trace_output_user.c b/samples/bpf/trace_output_user.c
> index 9c38f7aa4515..64e692fd7d51 100644
> --- a/samples/bpf/trace_output_user.c
> +++ b/samples/bpf/trace_output_user.c
> @@ -21,6 +21,7 @@
>  #include <signal.h>
>  #include "libbpf.h"
>  #include "bpf_load.h"
> +#include "perf-sys.h"
>  
>  static int pmu_fd;
>  
> @@ -160,7 +161,7 @@ static void test_bpf_perf_event(void)
>  	};
>  	int key = 0;
>  
> -	pmu_fd = perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
> +	pmu_fd = sys_perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
>  
>  	assert(pmu_fd >= 0);
>  	assert(bpf_map_update_elem(map_fd[0], &key, &pmu_fd, BPF_ANY) == 0);
> diff --git a/samples/bpf/tracex6_user.c b/samples/bpf/tracex6_user.c
> index 7a3b4a4b19f3..1681cb7cd713 100644
> --- a/samples/bpf/tracex6_user.c
> +++ b/samples/bpf/tracex6_user.c
> @@ -10,6 +10,7 @@
>  #include <linux/bpf.h>
>  #include "libbpf.h"
>  #include "bpf_load.h"
> +#include "perf-sys.h"
>  
>  #define SAMPLE_PERIOD  0x7fffffffffffffffULL
>  
> @@ -32,7 +33,7 @@ static void test_bpf_perf_event(void)
>  	};
>  
>  	for (i = 0; i < nr_cpus; i++) {
> -		pmu_fd[i] = perf_event_open(&attr_insn_pmu, -1/*pid*/, i/*cpu*/, -1/*group_fd*/, 0);
> +		pmu_fd[i] = sys_perf_event_open(&attr_insn_pmu, -1/*pid*/, i/*cpu*/, -1/*group_fd*/, 0);
>  		if (pmu_fd[i] < 0) {
>  			printf("event syscall failed\n");
>  			goto exit;
> -- 
> 2.10.2
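
For readers unfamiliar with the call being renamed, here is a minimal standalone sketch of the same kind of thin syscall wrapper that the patch removes from samples/bpf/libbpf.c, plus one caller. It assumes a Linux system with the UAPI header linux/perf_event.h installed; the names example_perf_event_open() and open_task_clock_counter() are hypothetical and are not part of perf-sys.h.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper with the same argument order as the function
 * removed by the patch: attr, pid, cpu, group_fd, flags.
 */
static int example_perf_event_open(struct perf_event_attr *attr, pid_t pid,
				   int cpu, int group_fd, unsigned long flags)
{
	return (int)syscall(__NR_perf_event_open, attr, pid, cpu,
			    group_fd, flags);
}

/*
 * Open a software task-clock counter for the calling task on any CPU.
 * Returns a file descriptor, or -1 with errno set (for example when
 * perf events are not permitted in the current environment).
 */
static int open_task_clock_counter(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_SOFTWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_SW_TASK_CLOCK;
	attr.exclude_kernel = 1;

	/* pid 0 = calling task, cpu -1 = any CPU, no group, no flags */
	return example_perf_event_open(&attr, 0, -1, -1, 0);
}
```

The pid/cpu pair here matches the "task bound event" call in trace_event_user.c (pid 0, cpu -1); the per-CPU loops in the patch instead pass pid -1 and a CPU index.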

^ permalink raw reply

* Re: 4.9.0-rc8: tg3 dead after resume
From: Billy Shuman @ 2016-12-09 14:29 UTC (permalink / raw)
  To: Siva Reddy Kallam; +Cc: Michael Chan, Netdev
In-Reply-To: <CAMet4B6t9neFPcGstZw6ebhFCBQzRsesStXZ8bjSaC5ggcuKxw@mail.gmail.com>

On Thu, Dec 8, 2016 at 4:03 AM, Siva Reddy Kallam
<siva.kallam@broadcom.com> wrote:
> On Thu, Dec 8, 2016 at 12:14 AM, Billy Shuman <wshuman3@gmail.com> wrote:
>> On Wed, Dec 7, 2016 at 12:37 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>>> On Wed, Dec 7, 2016 at 7:20 AM, Billy Shuman <wshuman3@gmail.com> wrote:
>>>> After resume on 4.9.0-rc8 tg3 is dead.
>>>>
>>>> In logs I see:
>>>> kernel: tg3 0000:44:00.0: phy probe failed, err -19
>>>> kernel: tg3 0000:44:00.0: Problem fetching invariants of chip, aborting
>>>
>>> -19 is -ENODEV which means tg3 cannot read the PHY ID.
>>>
>>> If it's a true suspend/resume operation, the driver does not have to
>>> go through probe during resume.  Please explain how you do
>>> suspend/resume.
>>>
>>
>> Sorry, my previous message was accidentally sent too early.
>>
>> I used systemd (systemctl suspend) to suspend.
>>
> We need more information to proceed further.
> Without suspend, are you able to use the tg3 port?

Yes the port works fine without suspend.

> Which Broadcom card do you have in the laptop?

The nic is a NetXtreme BCM57762 Gigabit Ethernet PCIe in a thunderbolt3 dock.

> Please provide complete tg3 specific logs in dmesg.
>

[   32.084010] tg3.c:v3.137 (May 11, 2014)
[   32.124695] tg3 0000:44:00.0 eth0: Tigon3 [partno(BCM957762) rev 57766001] (PCI Express) MAC address 98:e7:f4:8b:13:19
[   32.124698] tg3 0000:44:00.0 eth0: attached PHY is 57765 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
[   32.124699] tg3 0000:44:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[   32.124700] tg3 0000:44:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit]
[   32.219764] tg3 0000:44:00.0 enp68s0: renamed from eth0
[   36.219245] tg3 0000:44:00.0 enp68s0: Link is up at 1000 Mbps, full duplex
[   36.219250] tg3 0000:44:00.0 enp68s0: Flow control is on for TX and on for RX
[   36.219251] tg3 0000:44:00.0 enp68s0: EEE is disabled

after resume
[   92.292838] tg3 0000:44:00.0 enp68s0: No firmware running
[   93.521744] tg3 0000:44:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
[  106.704655] tg3 0000:44:00.0 enp68s0: Link is down
[  108.370356] tg3 0000:44:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff

after rmmod, modprobe
[  570.933636] tg3 0000:44:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
[  604.847215] tg3.c:v3.137 (May 11, 2014)
[  605.010075] tg3 0000:44:00.0: phy probe failed, err -19
[  605.010077] tg3 0000:44:00.0: Problem fetching invariants of chip, aborting




>>> Did this work before?  There have been very few changes to tg3 recently.
>>>
>>
>> This is a new laptop for me, but the same behavior is seen on 4.4.36 and 4.8.12.
>>
>>>>
>>>> rmmod and modprobe do not fix the problem; only a reboot resolves the issue.
>>>>
>>>> Billy
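
The mapping from "err -19" in the log to -ENODEV, as noted earlier in the thread, can be checked from userspace. A minimal sketch, assuming a Linux system where kernel error numbers match the userspace errno values; the helper name kernel_err_to_text() is hypothetical.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: kernel code reports errors as negative errno
 * values (e.g. -19), so negate before asking libc for the description.
 */
static const char *kernel_err_to_text(int err)
{
	return strerror(err < 0 ? -err : err);
}
```

Calling kernel_err_to_text(-19) returns the same description as strerror(ENODEV).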

^ permalink raw reply

