Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] libbpf: Remove the duplicate checking of function storage
From: Taeung Song @ 2018-08-22  2:54 UTC (permalink / raw)
  To: Daniel Borkmann, Jakub Kicinski
  Cc: Alexei Starovoitov, Linux Netdev List, LKML
In-Reply-To: <e0799ee4-875d-de62-e6a1-cc461fe38be7@iogearbox.net>



On 08/22/2018 05:11 AM, Daniel Borkmann wrote:
> On 08/21/2018 06:46 PM, Jakub Kicinski wrote:
>> On Tue, Aug 21, 2018 at 6:12 PM, Taeung Song <treeze.taeung@gmail.com> wrote:
>>> After the commit eac7d84519a3 ("tools: libbpf: don't return '.text'
>>> as a program for multi-function programs"), bpf_program__next()
>>> in bpf_object__for_each_program skips the function storage such as .text,
>>> so eliminate the duplicate checking.
>>>
>>> Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
>>> Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
>>
>> Looks reasonable, but you may need to repost once bpf-next is open:
>>
>> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
>>
>> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> 
> Agree, please resubmit once bpf-next opens up again. Thanks!
> 

OK ! I will.

Thanks,
Taeung

^ permalink raw reply

* Re: [PATCH] strparser: remove any offset before parsing messages
From: Doron Roberts-Kedes @ 2018-08-22  2:33 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel
In-Reply-To: <20180822004647.GA10656@nautica>

On Wed, Aug 22, 2018 at 02:46:47AM +0200, Dominique Martinet wrote:
> Yes, the rcv_msg callback itself can get the offset easily, and it's not
> that which needs an extra parameter but the bpf function kcm/sockmap are
> calling which would need either an extra parameter or changing to get
> that value themselves.

Ah cool. Thanks for explaining. 

> For what it's worth, I don't think either are acceptable solutions, I'm
> just stating what would a "fix in bpf" would be.

Agreed that the discussion should be about whether to fix it up in
strparser or sockmap. bpf seems inappropriate.

> strparser logic in that case -- it might work to pull in the parser
> function but it might not work in rcv for all I know, or the next user
> might think that since pull is ok some other operation on the skb is as
> well...

Just to make sure I understand, is it possible you meant to say that the
other way around? Surely the rcv callback can do whatever it wants with
the skb. Its the parse callback that may need to be a little more
careful with the skb.

For the parse case, why not just clone and pull? 

> As I wrote above, I think it should not be possible, so we're not
> even talking about a small percentage here.
> The reason I didn't use skb_pull (the head-only variant) is that I'd
> rather have the overhead than a BUG() if I'm wrong on this...

A printk in that section when (orig_offset + eaten > skb_headlen(head)) 
confirms that this case is not uncommon or impossible. Would have to do
more work to see how many hundreds of times per second, but it is not a
philosophical concern.

^ permalink raw reply

* Re: [PATCH] datapath.c: fix missing return value check of nla_nest_start()
From: Pravin Shelar @ 2018-08-21 22:38 UTC (permalink / raw)
  To: jasonwood2031; +Cc: Linux Kernel Network Developers
In-Reply-To: <20180817081508.7104-1-jasonwood2031@gmail.com>

On Fri, Aug 17, 2018 at 1:15 AM Jiecheng Wu <jasonwood2031@gmail.com> wrote:
>
> Function queue_userspace_packet() defined in net/openvswitch/datapath.c calls nla_nest_start() to allocate memory for struct nlattr which is dereferenced immediately. As nla_nest_start() may return NULL on failure, this code piece may cause NULL pointer dereference bug.
> ---
>  net/openvswitch/datapath.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 0f5ce77..ff4457d 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -460,6 +460,8 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
>
>         if (upcall_info->egress_tun_info) {
>                 nla = nla_nest_start(user_skb, OVS_PACKET_ATTR_EGRESS_TUN_KEY);
> +               if (!nla)
> +                       return -EMSGSIZE;
It is not possible, since user_skb is allocated to accommodate all
netlink attributes.

>                 err = ovs_nla_put_tunnel_info(user_skb,
>                                               upcall_info->egress_tun_info);
>                 BUG_ON(err);
> @@ -468,6 +470,8 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *skb,
>
>         if (upcall_info->actions_len) {
>                 nla = nla_nest_start(user_skb, OVS_PACKET_ATTR_ACTIONS);
> +               if (!nla)
> +                       return -EMSGSIZE;
same as above, the check is not required.

>                 err = ovs_nla_put_actions(upcall_info->actions,
>                                           upcall_info->actions_len,
>                                           user_skb);
> --
> 2.6.4
>

^ permalink raw reply

* [PATCH 00/11] tree-wide: use SPDX identifier for Renesas drivers
From: Wolfram Sang @ 2018-08-21 22:02 UTC (permalink / raw)
  To: linux-renesas-soc
  Cc: Kuninori Morimoto, Wolfram Sang, dmaengine, linux-can, linux-clk,
	linux-i2c, linux-kernel, linux-mmc, linux-pwm, linux-spi,
	linux-watchdog, netdev

Here is my first batch of SPDX conversion patches. Mostly low hanging fruits,
only a handful of drivers needed corrections in MODULE_LICENSE, too.

checkpatch is happy as well.

I think an ack from Morimoto-san with his Renesas email address would be nice
to make it more official? Still, I *am* contracted by Renesas.

Thanks,

   Wolfram

Wolfram Sang (11):
  clk: renesas: use SPDX identifier for Renesas drivers
  dmaengine: use SPDX identifier for Renesas drivers
  i2c: use SPDX identifier for Renesas drivers
  mmc: use SPDX identifier for Renesas drivers
  can: rcar: use SPDX identifier for Renesas drivers
  net: ethernet: renesas: use SPDX identifier for Renesas drivers
  phy: renesas: use SPDX identifier for Renesas drivers
  pwm: use SPDX identifier for Renesas drivers
  soc: renesas: use SPDX identifier for Renesas drivers
  spi: use SPDX identifier for Renesas drivers
  watchdog: use SPDX identifier for Renesas drivers

 drivers/clk/renesas/clk-div6.c                |  5 +----
 drivers/clk/renesas/r8a7795-cpg-mssr.c        |  5 +----
 drivers/clk/renesas/r8a7796-cpg-mssr.c        |  5 +----
 drivers/clk/renesas/r8a77995-cpg-mssr.c       |  5 +----
 drivers/clk/renesas/rcar-gen3-cpg.c           |  5 +----
 drivers/clk/renesas/rcar-usb2-clock-sel.c     |  5 +----
 drivers/clk/renesas/renesas-cpg-mssr.c        |  5 +----
 drivers/clk/renesas/renesas-cpg-mssr.h        |  5 +----
 drivers/dma/nbpfaxi.c                         |  5 +----
 drivers/dma/sh/shdma-arm.h                    |  5 +----
 drivers/dma/sh/shdma-base.c                   |  5 +----
 drivers/dma/sh/shdma-of.c                     |  5 +----
 drivers/dma/sh/shdma-r8a73a4.c                |  5 +----
 drivers/dma/sh/shdma.h                        |  6 +-----
 drivers/dma/sh/shdmac.c                       |  6 +-----
 drivers/dma/sh/sudmac.c                       |  5 +----
 drivers/dma/sh/usb-dmac.c                     |  5 +----
 drivers/i2c/busses/i2c-emev2.c                |  5 +----
 drivers/i2c/busses/i2c-highlander.c           |  5 +----
 drivers/i2c/busses/i2c-rcar.c                 | 10 +---------
 drivers/i2c/busses/i2c-riic.c                 |  5 +----
 drivers/i2c/busses/i2c-sh_mobile.c            | 10 +---------
 drivers/mmc/host/renesas_sdhi.h               |  5 +----
 drivers/mmc/host/renesas_sdhi_core.c          |  5 +----
 drivers/mmc/host/renesas_sdhi_internal_dmac.c |  5 +----
 drivers/mmc/host/renesas_sdhi_sys_dmac.c      |  5 +----
 drivers/mmc/host/sh_mmcif.c                   |  7 ++-----
 drivers/mmc/host/tmio_mmc.c                   |  5 +----
 drivers/mmc/host/tmio_mmc.h                   |  6 +-----
 drivers/mmc/host/tmio_mmc_core.c              |  5 +----
 drivers/mmc/host/usdhi6rol0.c                 |  5 +----
 drivers/net/can/rcar/rcar_can.c               |  6 +-----
 drivers/net/can/rcar/rcar_canfd.c             |  6 +-----
 drivers/net/ethernet/renesas/ravb.h           |  5 +----
 drivers/net/ethernet/renesas/ravb_main.c      |  5 +----
 drivers/net/ethernet/renesas/sh_eth.c         | 13 +------------
 drivers/net/ethernet/renesas/sh_eth.h         | 13 +------------
 drivers/phy/renesas/phy-rcar-gen2.c           |  5 +----
 drivers/phy/renesas/phy-rcar-gen3-usb2.c      |  5 +----
 drivers/phy/renesas/phy-rcar-gen3-usb3.c      |  5 +----
 drivers/pwm/pwm-rcar.c                        |  5 +----
 drivers/pwm/pwm-renesas-tpu.c                 | 10 +---------
 drivers/soc/renesas/r8a7743-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7745-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7779-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7790-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7791-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7792-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7794-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7795-sysc.c            |  5 +----
 drivers/soc/renesas/r8a7796-sysc.c            |  5 +----
 drivers/soc/renesas/r8a77970-sysc.c           |  5 +----
 drivers/soc/renesas/r8a77995-sysc.c           |  5 +----
 drivers/soc/renesas/rcar-sysc.h               |  5 +----
 drivers/soc/renesas/renesas-soc.c             | 10 +---------
 drivers/spi/spi-rspi.c                        | 10 +---------
 drivers/spi/spi-sh-hspi.c                     | 12 ++----------
 drivers/spi/spi-sh-msiof.c                    |  6 +-----
 drivers/spi/spi-sh.c                          | 12 ++----------
 drivers/watchdog/renesas_wdt.c                |  5 +----
 60 files changed, 63 insertions(+), 300 deletions(-)

-- 
2.11.0

^ permalink raw reply

* [PATCH bpf] bpf, sockmap: fix sock_hash_alloc and reject zero-sized keys
From: Daniel Borkmann @ 2018-08-21 22:02 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: john.fastabend, netdev, Daniel Borkmann

Currently, it is possible to create a sock hash map with key size
of 0 and have the kernel return a fd back to user space. This is
invalid for hash maps (and kernel also hasn't been tested for zero
key size support in general at this point). Thus, reject such
configuration.

Fixes: 81110384441a ("bpf: sockmap, add hash map support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/sockmap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 98e621a..60ceb0e 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -2140,7 +2140,9 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 		return ERR_PTR(-EPERM);
 
 	/* check sanity of attributes */
-	if (attr->max_entries == 0 || attr->value_size != 4 ||
+	if (attr->max_entries == 0 ||
+	    attr->key_size == 0 ||
+	    attr->value_size != 4 ||
 	    attr->map_flags & ~SOCK_CREATE_FLAG_MASK)
 		return ERR_PTR(-EINVAL);
 
-- 
2.9.5

^ permalink raw reply related

* Re: Experimental fix for MSI-X issue on r8169
From: Steve Dodd @ 2018-08-21 21:48 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: Jian-Hong Pan, Lou Reed, netdev@vger.kernel.org
In-Reply-To: <98df6b0a-44df-9c0d-ddf6-11b892ccd990@gmail.com>

On 21 August 2018 at 22:19, Heiner Kallweit <hkallweit1@gmail.com> wrote:

> Below is a patch printing the MSI-X table entry in different contexts,
> it's not supposed to fix anything. Could you please let me know
> what the output is on your system?
> I want to get an idea whether the issue clears the complete entry or
> just corrupts certain parts.

Here we go..

[    2.865434] r8169 0000:01:00.0: MSI-X entry: context probe: fee01004 0 40ef 1
[  269.747608] r8169 0000:01:00.0: MSI-X entry: context probe: fee01004 0 40ef 1
[  295.454099] r8169 0000:01:00.0: MSI-X entry: context suspend:
fee08004 0 4028 0

S.

^ permalink raw reply

* Re: [PATCH] strparser: remove any offset before parsing messages
From: Dominique Martinet @ 2018-08-22  0:46 UTC (permalink / raw)
  To: Doron Roberts-Kedes
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel
In-Reply-To: <20180821233549.GA96607@doronrk-mbp.dhcp.thefacebook.com>

Doron Roberts-Kedes wrote on Tue, Aug 21, 2018:
> On Wed, Aug 22, 2018 at 12:51:13AM +0200, Dominique Martinet wrote:
> > That's maybe three more lines than the current patch, which is also
> > pretty simple, I'm not sure what you're expecting from alternative
> > solutions to call that overly complicated...
> 
> The line count is not the source of the complexity. The undue complexity
> is having strparser operate in two modes: one for clients that properly
> use the API by respecting the value of offset, and another for clients 
> that do not. 

I might sound stubborn at this point but that still does not sound
complex compared to the alternatives I can think of.


> > I don't think bpf itself needs to be changed here -- the offset is
> > stored in a strparser specific struct so short of such a skb_pull I
> > think we'd need to change the type of the bpf function, pass it it the
> > extra parameter, and make it a user visible change breaking the kcm
> > API... And I have no idea for sockmap but probably something similar.
> 
> I'm not sure I follow you here. Any rcv_msg callback implementation
> receives an skb. Calling strp_msg() on the skb gives you the strp_msg
> which has the offset value. Can you explain why passing an extra
> parameter is necessary to get the offset?


Yes, the rcv_msg callback itself can get the offset easily, and it's not
that which needs an extra parameter but the bpf function kcm/sockmap are
calling which would need either an extra parameter or changing to get
that value themselves -- the later is probably not very difficult and
won't break the API, but at the same time pushes the responsability to
kcm/sockmap users who will get that wrong and waste time trying to
understand how kcm works like I did.

Breaking the API has the advantage that users will get an error telling
them to update their bpf program instead of randomly failing when tcp
aggregation happens.

For what it's worth, I don't think either are acceptable solutions, I'm
just stating what would a "fix in bpf" would be.


A "fix in kcm/sockmap" would be to pull just before calling the bpf
program, which could be a middle ground, but I think that's more
perverse than adding a flag to strparser for new users because we'd
basically be saying users should modify the skb that strparser users
under its feet, and there is no guarantee what would happen to the
strparser logic in that case -- it might work to pull in the parser
function but it might not work in rcv for all I know, or the next user
might think that since pull is ok some other operation on the skb is as
well...


> > (Also, note that pskb_pull will not copy any data or allocate memory
> > unless we're pulling past the end of the skb, which seems pretty
> > unlikely in that situation as we should have consumed any fully "eaten"
> > skb before getting to a new one here -- so in practice this patch just
> > adds a skb->data += offset with safety guards "just in case")
> 
> Yes, no data will be copied if the you don't pull beyond the linear
> buffer. Adding overhead even in a small percentage of cases still
> requires a good justification.

As I wrote above, I think it should not be possible, so we're not
even talking about a small percentage here.
The reason I didn't use skb_pull (the head-only variant) is that I'd
rather have the overhead than a BUG() if I'm wrong on this...

> In this particular case, I think a good justification would be
> demonstrating that it is impractical for the buggy strparser users
> you've pointed out to use the existing API and respect the value of
> offset.

That's what the previous paragraph about changing the API intended to
do.

> You have indicated that you are not super familiar with the bpf code,
> which is fine (I'm not either), but this isn't a good reason to make a
> change to strparser instead of bpf. 

I didn't even know kcm/strparser existed a few weeks ago, not being
familiar doesn't mean I didn't look at how it works.

I'd be glad to be proven wrong here but I really do not think there is a
possibility within the bpf framework to give a fake skb with an external
offset to the bpf program transparently.

Assuming there is, it would have to be more expensive than a pull (it'd
basically need to do a pull then restore the original data somehow)...

And if there isn't I certainly do not want to add one, not because I
don't want to mess with something I'm not too familiar with (that'd
actually be an argument to do it that way to me), but because I do not
think it belongs there; bpf should not need to know what a skb is or how
it works.


All being said I'm still convinced having two modes in strparser is the
simplest solution.

-- 
Dominique

^ permalink raw reply

* Re: Experimental fix for MSI-X issue on r8169
From: Heiner Kallweit @ 2018-08-21 21:19 UTC (permalink / raw)
  To: Jian-Hong Pan; +Cc: Steve Dodd, Lou Reed, netdev@vger.kernel.org
In-Reply-To: <CAPpJ_ecsMOL23VYM2juZ9R8JLrSh1bjCet16XCSpv0mDaSYu6w@mail.gmail.com>

On 20.08.2018 05:47, Jian-Hong Pan wrote:
> 2018-08-20 4:34 GMT+08:00 Heiner Kallweit <hkallweit1@gmail.com>:
>> The three of you reported an MSI-X-related error when the system
>> resumes from suspend. This has been fixed for now by disabling MSI-X
>> on certain chip versions. However more versions may be affected.
>>
>> I checked with Realtek and they confirmed that on certain chip
>> versions a MSIX-related value in PCI config space is reset when
>> resuming from S3.
>>
>> I would appreciate if you could test the following experimental patch
>> and whether warning "MSIX address lost, re-configuring" appears in
>> your dmesg output after resume from suspend.
>>
>> Thanks a lot for your efforts.
> 
> Tested with the experiment patch on ASUS X441UAR.
> 
> This is the information before suspend:
> 
> dev@endless:~$ dmesg | grep r8169
> [   10.279565] libphy: r8169: probed
> [   10.279947] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4,
> XID 44900000, IRQ 127
> [   10.445952] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [   15.676229] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [   17.455392] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> 
> dev@endless:~$ ip addr show enp2s0
> 4: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP group default qlen 1000
>     link/ether 0c:9d:92:32:67:b4 brd ff:ff:ff:ff:ff:ff
>     inet 10.100.13.152/24 brd 10.100.13.255 scope global noprefixroute
> dynamic enp2s0
>        valid_lft 86347sec preferred_lft 86347sec
>     inet6 fe80::2873:a2a9:6ca1:c79d/64 scope link noprefixroute
>        valid_lft forever preferred_lft forever
> 
> This is the information after resume:
> 
> dev@endless:~$ dmesg | grep r8169
> [   10.279565] libphy: r8169: probed
> [   10.279947] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4,
> XID 44900000, IRQ 127
> [   10.445952] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [   15.676229] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [   17.455392] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> [   95.594265] r8169 0000:02:00.0 enp2s0: Link is Down
> [   96.242074] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> 
> dev@endless:~$ ip addr show enp2s0
> 4: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc
> pfifo_fast state DOWN group default qlen 1000
>     link/ether 0c:9d:92:32:67:b4 brd ff:ff:ff:ff:ff:ff
> 
> There is no "MSIX address lost, re-configuring" in dmesg.
> The ethernet interface is still down after resume.
> 

Thanks a lot for testing. Unfortunately I don't have test hardware
affected by this MSI-X issue, so maybe you can help me to understand
the issue a little better.

Below is a patch printing the MSI-X table entry in different contexts,
it's not supposed to fix anything. Could you please let me know
what the output is on your system?
I want to get an idea whether the issue clears the complete entry or
just corrupts certain parts.

That's what I get on my system (RTL8168E-VL). In your case you'll come
only till the first suspend.

[    3.743404] r8169 0000:03:00.0: MSI-X entry: context probe: fee01004 0 40ef 1
[   29.539250] r8169 0000:03:00.0: MSI-X entry: context suspend: fee02004 0 4028 0
[   29.837457] r8169 0000:03:00.0: MSI-X entry: context resume: fee01004 0 402b 0
[   36.921370] r8169 0000:03:00.0: MSI-X entry: context suspend: fee01004 0 402b 0
[   37.239407] r8169 0000:03:00.0: MSI-X entry: context resume: fee01004 0 402b 0

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 54f53c8c0..f32645119 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/moduleparam.h>
 #include <linux/pci.h>
+#include <linux/msi.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
 #include <linux/delay.h>
@@ -6822,6 +6823,20 @@ rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	pm_runtime_put_noidle(&pdev->dev);
 }
 
+static void rtl_print_msix_entry(struct rtl8169_private *tp, const char *context)
+{
+	struct msi_desc *desc = first_pci_msi_entry(tp->pci_dev);
+	u32 data[4];
+
+	data[0] = readl(desc->mask_base + PCI_MSIX_ENTRY_LOWER_ADDR);
+	data[1] = readl(desc->mask_base + PCI_MSIX_ENTRY_UPPER_ADDR);
+	data[2] = readl(desc->mask_base + PCI_MSIX_ENTRY_DATA);
+	data[3] = readl(desc->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL);
+
+	dev_info(tp_to_dev(tp), "MSI-X entry: context %s: %x %x %x %x\n",
+		 context, data[0], data[1], data[2], data[3]);
+}
+
 static void rtl8169_net_suspend(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -6846,9 +6861,12 @@ static int rtl8169_suspend(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = netdev_priv(dev);
 
 	rtl8169_net_suspend(dev);
 
+	rtl_print_msix_entry(tp, "suspend");
+
 	return 0;
 }
 
@@ -6875,6 +6893,9 @@ static int rtl8169_resume(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = netdev_priv(dev);
+
+	rtl_print_msix_entry(tp, "resume");
 
 	if (netif_running(dev))
 		__rtl8169_resume(dev);
@@ -7075,11 +7096,6 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
 		RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~MSIEnable);
 		RTL_W8(tp, Cfg9346, Cfg9346_Lock);
 		flags = PCI_IRQ_LEGACY;
-	} else if (tp->mac_version == RTL_GIGA_MAC_VER_40) {
-		/* This version was reported to have issues with resume
-		 * from suspend when using MSI-X
-		 */
-		flags = PCI_IRQ_LEGACY | PCI_IRQ_MSI;
 	} else {
 		flags = PCI_IRQ_ALL_TYPES;
 	}
@@ -7354,6 +7370,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		return rc;
 	}
 
+	rtl_print_msix_entry(tp, "probe");
+
 	tp->saved_wolopts = __rtl8169_get_wol(tp);
 
 	mutex_init(&tp->wk.mutex);
-- 
2.18.0

^ permalink raw reply related

* Re: [PATCH] strparser: remove any offset before parsing messages
From: Doron Roberts-Kedes @ 2018-08-21 21:15 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel
In-Reply-To: <20180821193655.GA15354@nautica>

On Tue, Aug 21, 2018 at 09:36:55PM +0200, Dominique Martinet wrote:

> One of the solutions I had suggested was adding a flag at strparser
> setup time to only do that pull for users which cannot handle offset,
> but nobody seemed interested two weeks ago. I can still do that.

This seems overly complicated. 

> That's still suboptimal, but I don't have any better idea.
> To properly fix the users, I'd really need help with how bpf works to
> even know if passing an offset would be possible in the first place, as
> I do not see how at this time.

Thanks for clarifying Dominique.
It seems like we mainly agree that the proposed patch is suboptimal for
existing clients of the library that use offset correctly (tls). 

It also seems like you've identified that the proper fix is in bpf. 
Regrettably, I cannot help you understand how bpf works because I'm not
familiar with that code. 

As an aside, I would recommend reaching the netdev FAQ page:
https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

It contains helpful hints about how to format email subjects (specifying
net vs. net-next) and determining when trees are closed (currently
closed). 

^ permalink raw reply

* Re: [PATCH] strparser: remove any offset before parsing messages
From: Doron Roberts-Kedes @ 2018-08-21 23:35 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel
In-Reply-To: <20180821225113.GA6515@nautica>

On Wed, Aug 22, 2018 at 12:51:13AM +0200, Dominique Martinet wrote:
> That's maybe three more lines than the current patch, which is also
> pretty simple, I'm not sure what you're expecting from alternative
> solutions to call that overly complicated...

The line count is not the source of the complexity. The undue complexity
is having strparser operate in two modes: one for clients that properly
use the API by respecting the value of offset, and another for clients 
that do not. 

> I don't think bpf itself needs to be changed here -- the offset is
> stored in a strparser specific struct so short of such a skb_pull I
> think we'd need to change the type of the bpf function, pass it it the
> extra parameter, and make it a user visible change breaking the kcm
> API... And I have no idea for sockmap but probably something similar.

I'm not sure I follow you here. Any rcv_msg callback implementation
receives an skb. Calling strp_msg() on the skb gives you the strp_msg
which has the offset value. Can you explain why passing an extra
parameter is necessary to get the offset?

> I can't think of that as better than adding a flag to strparser.
> 
> (Also, note that pskb_pull will not copy any data or allocate memory
> unless we're pulling past the end of the skb, which seems pretty
> unlikely in that situation as we should have consumed any fully "eaten"
> skb before getting to a new one here -- so in practice this patch just
> adds a skb->data += offset with safety guards "just in case")

Yes, no data will be copied if the you don't pull beyond the linear
buffer. Adding overhead even in a small percentage of cases still
requires a good justification. In this particular case, I think a good
justification would be demonstrating that it is impractical for the 
buggy strparser users you've pointed out to use the existing API and
respect the value of offset. You have indicated that you are not super
familiar with the bpf code, which is fine (I'm not either), but this
isn't a good reason to make a change to strparser instead of bpf.

^ permalink raw reply

* Re: [PATCH bpf] xsk: fix return value of xdp_umem_assign_dev()
From: Daniel Borkmann @ 2018-08-21 20:12 UTC (permalink / raw)
  To: Björn Töpel, bhole_prashant_q7
  Cc: ast, Björn Töpel, Karlsson, Magnus, David Miller,
	Netdev
In-Reply-To: <CAJ+HfNh88W3gODra0Uh2BSSAVhg=bka_MyA9=9J8QuEep16VWA@mail.gmail.com>

On 08/20/2018 10:31 AM, Björn Töpel wrote:
> Den mån 20 aug. 2018 kl 02:58 skrev Prashant Bhole
> <bhole_prashant_q7@lab.ntt.co.jp>:
>>
>> s/ENOTSUPP/EOPNOTSUPP/ in function umem_assign_dev().
>> This function's return value is directly returned by xsk_bind().
>> EOPNOTSUPP is bind()'s possible return value.
>>
>> Fixes: f734607e819b ("xsk: refactor xdp_umem_assign_dev()")
>> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>

Applied to bpf, thanks!

^ permalink raw reply

* Re: [Patch net 0/9] net_sched: pending clean up and bug fixes
From: David Miller @ 2018-08-21 20:00 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, jhs
In-Reply-To: <20180819192213.14196-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sun, 19 Aug 2018 12:22:04 -0700

> This patchset aims to clean up and fixes some bugs in current
> merge window, this is why it is targeting -net.
> 
> Patch 1-5 are clean up Vlad's patches merged in current merge
> window, patch 6 is just a trivial cleanup.
> 
> Patch 7 reverts a lockdep warning fix and patch 8 provides a better
> fix for it.
> 
> Patch 9 fixes a potential deadlock found by me during code review.
> 
> Please see each patch for details.
> 
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Series applied and patches #8 and #9 queued up for -stable.

^ permalink raw reply

* Re: [Patch net 8/9] act_ife: move tcfa_lock down to where necessary
From: David Miller @ 2018-08-21 19:43 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, jhs, vladbu
In-Reply-To: <CAM_iQpXR56EUwF77QETJvKAvA6mHR3kNOEdA0CuF+uyhr+_Frg@mail.gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 20 Aug 2018 16:57:46 -0700

> Passing 'exists' as 'atomic' is prior to my change. With my change,
> they are separated as two parameters:

I mis-read the patch, thanks for explaining :)

^ permalink raw reply

* Re: [PATCH] strparser: remove any offset before parsing messages
From: Dominique Martinet @ 2018-08-21 22:51 UTC (permalink / raw)
  To: Doron Roberts-Kedes
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel
In-Reply-To: <20180821211504.GA76892@doronrk-mbp.dhcp.thefacebook.com>

Doron Roberts-Kedes wrote on Tue, Aug 21, 2018:
> On Tue, Aug 21, 2018 at 09:36:55PM +0200, Dominique Martinet wrote:
> > One of the solutions I had suggested was adding a flag at strparser
> > setup time to only do that pull for users which cannot handle offset,
> > but nobody seemed interested two weeks ago. I can still do that.
> 
> This seems overly complicated. 

That sounds much simpler than adding a similar feature to bpf if it
doesn't already support it -- we already have an user-provided struct at
strparser init timer (strp->cb) in which it would be easy to add a flag
saying whether the callbacks support offset or not.

That's maybe three more lines than the current patch, which is also
pretty simple, I'm not sure what you're expecting from alternative
solutions to call that overly complicated...


> It seems like we mainly agree that the proposed patch is suboptimal for
> existing clients of the library that use offset correctly (tls). 
> 
> It also seems like you've identified that the proper fix is in bpf. 

I don't think bpf itself needs to be changed here -- the offset is
stored in a strparser specific struct so short of such a skb_pull I
think we'd need to change the type of the bpf function, pass it it the
extra parameter, and make it a user visible change breaking the kcm
API... And I have no idea for sockmap but probably something similar.

I can't think of that as better than adding a flag to strparser.

(Also, note that pskb_pull will not copy any data or allocate memory
unless we're pulling past the end of the skb, which seems pretty
unlikely in that situation as we should have consumed any fully "eaten"
skb before getting to a new one here -- so in practice this patch just
adds a skb->data += offset with safety guards "just in case")


> As an aside, I would recommend reaching the netdev FAQ page:
> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> 
> It contains helpful hints about how to format email subjects (specifying
> net vs. net-next) and determining when trees are closed (currently
> closed). 

Thank you for pointing out to that document.
While this bug has been around from day one of kcm it is still a bug fix
so I'd rather let maintainers decide which tree it goes to.

I wasn't aware that net-next closes during the Linus merge window, but
if you want to split hairs, the FAQ says "do not send new net-next
content to netdev [while closed]" and this isn't new content as the v0
of the patch was mostly the same and sent before the 4.18 release, (and,
ironically, did not get any comment).
Anyway, sure, I'll wait until next week for a v2 if that matters.


Thanks,
-- 
Dominique

^ permalink raw reply

* Re: [PATCH 05/11] can: rcar: use SPDX identifier for Renesas drivers
From: Fabio Estevam @ 2018-08-21 22:45 UTC (permalink / raw)
  To: Wolfram Sang, Philippe Ombredanne
  Cc: Linux-Renesas, Kuninori Morimoto, Wolfgang Grandegger,
	Marc Kleine-Budde, David S. Miller, linux-can, netdev,
	linux-kernel
In-Reply-To: <20180821220233.9202-6-wsa+renesas@sang-engineering.com>

Hi Wolfram,

On Tue, Aug 21, 2018 at 7:02 PM, Wolfram Sang
<wsa+renesas@sang-engineering.com> wrote:
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> ---
>
> To be applied individually per subsystem tree. Morimoto-san, could you maybe
> ack this with your Renesas address?
>
>  drivers/net/can/rcar/rcar_can.c   | 6 +-----
>  drivers/net/can/rcar/rcar_canfd.c | 6 +-----
>  2 files changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/can/rcar/rcar_can.c b/drivers/net/can/rcar/rcar_can.c
> index 11662f479e76..051bf4ef4be2 100644
> --- a/drivers/net/can/rcar/rcar_can.c
> +++ b/drivers/net/can/rcar/rcar_can.c
> @@ -1,12 +1,8 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later

According to Documentation/process/license-rules.rst the format should
be like this instead:

// SPDX-License-Identifier: GPL-2.0+

^ permalink raw reply

* Re: [PATCH net] hv_netvsc: ignore devices that are not PCI
From: David Miller @ 2018-08-21 19:03 UTC (permalink / raw)
  To: stephen; +Cc: kys, netdev, sthemmin
In-Reply-To: <20180821174038.7942-1-sthemmin@microsoft.com>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Tue, 21 Aug 2018 10:40:38 -0700

> Registering another device with same MAC address (such as TAP, VPN or
> DPDK KNI) will confuse the VF autobinding logic.  Restrict the search
> to only run if the device is known to be a PCI attached VF.
> 
> Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] selftests: net: move fragment forwarding/config up a level
From: Ido Schimmel @ 2018-08-21 18:56 UTC (permalink / raw)
  To: Anders Roxell; +Cc: davem, shuah, linux-kernel, netdev, linux-kselftest
In-Reply-To: <20180821161212.5750-1-anders.roxell@linaro.org>

On Tue, Aug 21, 2018 at 06:12:12PM +0200, Anders Roxell wrote:
> 'make kselftest-merge' assumes that the config files for the tests are
> located under the 'main' tet dir, like tools/testing/selftests/net/ and
> not in a subdir to net.

The tests under tools/testing/selftests/net/forwarding/ aren't executed
as part of the Makefile. The config file is there mainly so that people
will know which config options they need in order to run the tests.

The tests can be added to the Makefile, but some of them take a few
minutes to complete which is probably against "Don't take too long;"
mentioned in Documentation/dev-tools/kselftest.rst.

^ permalink raw reply

* [PATCHv2 iproute2 3/3] iproute: make clang happy with iproute2 package
From: Mahesh Bandewar @ 2018-08-21 17:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Mahesh Bandewar

From: Mahesh Bandewar <maheshb@google.com>

These are primarily fixes for "string is not string literal" warnings
/ errors (with -Werror -Wformat-nonliteral). This should be a no-op
change. I had to replace couple of print helper functions with the
code they call as it was becoming harder to eliminate these warnings,
however these helpers were used only at couple of places, so no
major change as such.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
---
 include/json_writer.h |  3 +--
 ip/iplink_can.c       | 19 ++++++++++++-------
 lib/color.c           |  1 +
 lib/json_print.c      |  1 +
 lib/json_writer.c     | 15 +--------------
 misc/ss.c             |  3 ++-
 tc/m_ematch.c         |  1 +
 tc/m_ematch.h         |  1 +
 8 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/include/json_writer.h b/include/json_writer.h
index 9ab88e1dbdd9..0c8831c1136d 100644
--- a/include/json_writer.h
+++ b/include/json_writer.h
@@ -29,6 +29,7 @@ void jsonw_pretty(json_writer_t *self, bool on);
 void jsonw_name(json_writer_t *self, const char *name);
 
 /* Add value  */
+__attribute__((format(printf, 2, 3)))
 void jsonw_printf(json_writer_t *self, const char *fmt, ...);
 void jsonw_string(json_writer_t *self, const char *value);
 void jsonw_bool(json_writer_t *self, bool value);
@@ -59,8 +60,6 @@ void jsonw_luint_field(json_writer_t *self, const char *prop,
 			unsigned long int num);
 void jsonw_lluint_field(json_writer_t *self, const char *prop,
 			unsigned long long int num);
-void jsonw_float_field_fmt(json_writer_t *self, const char *prop,
-			   const char *fmt, double val);
 
 /* Collections */
 void jsonw_start_object(json_writer_t *self);
diff --git a/ip/iplink_can.c b/ip/iplink_can.c
index 587413da15c4..c0deeb1f1fcf 100644
--- a/ip/iplink_can.c
+++ b/ip/iplink_can.c
@@ -316,11 +316,14 @@ static void can_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
 		struct can_bittiming *bt = RTA_DATA(tb[IFLA_CAN_BITTIMING]);
 
 		if (is_json_context()) {
+			json_writer_t *jw;
+
 			open_json_object("bittiming");
 			print_int(PRINT_ANY, "bitrate", NULL, bt->bitrate);
-			jsonw_float_field_fmt(get_json_writer(),
-					      "sample_point", "%.3f",
-					      (float) bt->sample_point / 1000.);
+			jw = get_json_writer();
+			jsonw_name(jw, "sample_point");
+			jsonw_printf(jw, "%.3f",
+				     (float) bt->sample_point / 1000);
 			print_int(PRINT_ANY, "tq", NULL, bt->tq);
 			print_int(PRINT_ANY, "prop_seg", NULL, bt->prop_seg);
 			print_int(PRINT_ANY, "phase_seg1",
@@ -415,12 +418,14 @@ static void can_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
 			RTA_DATA(tb[IFLA_CAN_DATA_BITTIMING]);
 
 		if (is_json_context()) {
+			json_writer_t *jw;
+
 			open_json_object("data_bittiming");
 			print_int(PRINT_JSON, "bitrate", NULL, dbt->bitrate);
-			jsonw_float_field_fmt(get_json_writer(),
-					      "sample_point",
-					      "%.3f",
-					      (float) dbt->sample_point / 1000.);
+			jw = get_json_writer();
+			jsonw_name(jw, "sample_point");
+			jsonw_printf(jw, "%.3f",
+				     (float) dbt->sample_point / 1000.);
 			print_int(PRINT_JSON, "tq", NULL, dbt->tq);
 			print_int(PRINT_JSON, "prop_seg", NULL, dbt->prop_seg);
 			print_int(PRINT_JSON, "phase_seg1",
diff --git a/lib/color.c b/lib/color.c
index eaf69e74d673..e5406294dfc4 100644
--- a/lib/color.c
+++ b/lib/color.c
@@ -132,6 +132,7 @@ void set_color_palette(void)
 		is_dark_bg = 1;
 }
 
+__attribute__((format(printf, 3, 4)))
 int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...)
 {
 	int ret = 0;
diff --git a/lib/json_print.c b/lib/json_print.c
index 5dc41bfabfd4..77902824a738 100644
--- a/lib/json_print.c
+++ b/lib/json_print.c
@@ -100,6 +100,7 @@ void close_json_array(enum output_type type, const char *str)
  * functions handling different types
  */
 #define _PRINT_FUNC(type_name, type)					\
+	__attribute__((format(printf, 4, 0)))				\
 	void print_color_##type_name(enum output_type t,		\
 				     enum color_attr color,		\
 				     const char *key,			\
diff --git a/lib/json_writer.c b/lib/json_writer.c
index aa9ce1c65e51..68890b34ee92 100644
--- a/lib/json_writer.c
+++ b/lib/json_writer.c
@@ -152,6 +152,7 @@ void jsonw_name(json_writer_t *self, const char *name)
 		putc(' ', self->out);
 }
 
+__attribute__((format(printf, 2, 3)))
 void jsonw_printf(json_writer_t *self, const char *fmt, ...)
 {
 	va_list ap;
@@ -205,11 +206,6 @@ void jsonw_null(json_writer_t *self)
 	jsonw_printf(self, "null");
 }
 
-void jsonw_float_fmt(json_writer_t *self, const char *fmt, double num)
-{
-	jsonw_printf(self, fmt, num);
-}
-
 void jsonw_float(json_writer_t *self, double num)
 {
 	jsonw_printf(self, "%g", num);
@@ -274,15 +270,6 @@ void jsonw_float_field(json_writer_t *self, const char *prop, double val)
 	jsonw_float(self, val);
 }
 
-void jsonw_float_field_fmt(json_writer_t *self,
-			   const char *prop,
-			   const char *fmt,
-			   double val)
-{
-	jsonw_name(self, prop);
-	jsonw_float_fmt(self, fmt, val);
-}
-
 void jsonw_uint_field(json_writer_t *self, const char *prop, unsigned int num)
 {
 	jsonw_name(self, prop);
diff --git a/misc/ss.c b/misc/ss.c
index 41e7762bb61f..93b1baf5dc40 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -976,6 +976,7 @@ static int buf_update(int len)
 }
 
 /* Append content to buffer as part of the current field */
+__attribute__((format(printf, 1, 2)))
 static void out(const char *fmt, ...)
 {
 	struct column *f = current_field;
@@ -1093,7 +1094,7 @@ static void print_header(void)
 {
 	while (!field_is_last(current_field)) {
 		if (!current_field->disabled)
-			out(current_field->header);
+			out("%s", current_field->header);
 		field_next();
 	}
 }
diff --git a/tc/m_ematch.c b/tc/m_ematch.c
index ace4b3dd738b..a524b520b276 100644
--- a/tc/m_ematch.c
+++ b/tc/m_ematch.c
@@ -277,6 +277,7 @@ static int flatten_tree(struct ematch *head, struct ematch *tree)
 	return count;
 }
 
+__attribute__((format(printf, 5, 6)))
 int em_parse_error(int err, struct bstr *args, struct bstr *carg,
 		   struct ematch_util *e, char *fmt, ...)
 {
diff --git a/tc/m_ematch.h b/tc/m_ematch.h
index 80b02cfad6cc..95515a074624 100644
--- a/tc/m_ematch.h
+++ b/tc/m_ematch.h
@@ -107,6 +107,7 @@ static inline int parse_layer(struct bstr *b)
 		return INT_MAX;
 }
 
+__attribute__((format(printf, 5, 6)))
 int em_parse_error(int err, struct bstr *args, struct bstr *carg,
 		   struct ematch_util *, char *fmt, ...);
 int print_ematch(FILE *, const struct rtattr *);
-- 
2.18.0.865.gffc8e1a3cd6-goog

^ permalink raw reply related

* Re: [PATCHv2 iproute2 2/3] tc: remove extern from prototype declarations
From: Mahesh Bandewar (महेश बंडेवार) @ 2018-08-21 18:44 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Mahesh Bandewar, netdev
In-Reply-To: <20180821111911.796b5c9f@xeon-e3>

On Tue, Aug 21, 2018 at 11:19 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Tue, 21 Aug 2018 10:48:54 -0700
> Mahesh Bandewar <mahesh@bandewar.net> wrote:
>
>> From: Mahesh Bandewar <maheshb@google.com>
>>
>> Signed-off-by: Mahesh Bandewar <maheshb@google.com>
>
> I already did this yesterday. Patch was on mailing list.
Hmm, I thought I did mention that I would take care in v2. In any
case, we can remove this from the patch-series if remaining patches
are fine. Does that make sense?

^ permalink raw reply

* linux-next: build warning after merge of the net tree
From: Stephen Rothwell @ 2018-08-21 22:04 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, Cong Wang

[-- Attachment #1: Type: text/plain, Size: 433 bytes --]

Hi all,

After merging the net tree, today's linux-next build (KCONFIG_NAME)
produced this warning:

drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c: In function 'tc_fill_actions':
drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c:64:6: warning: unused variable 'i' [-Wunused-variable]
  int i;
      ^

Introduced by commit

  244cd96adb5f ("net_sched: remove list_head from tc_action")

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* [PATCH 06/11] net: ethernet: renesas: use SPDX identifier for Renesas drivers
From: Wolfram Sang @ 2018-08-21 22:02 UTC (permalink / raw)
  To: linux-renesas-soc
  Cc: Kuninori Morimoto, Wolfram Sang, Sergei Shtylyov, David S. Miller,
	netdev, linux-kernel
In-Reply-To: <20180821220233.9202-1-wsa+renesas@sang-engineering.com>

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

To be applied individually per subsystem tree. Morimoto-san, could you maybe
ack this with your Renesas address?

 drivers/net/ethernet/renesas/ravb.h      |  5 +----
 drivers/net/ethernet/renesas/ravb_main.c |  5 +----
 drivers/net/ethernet/renesas/sh_eth.c    | 13 +------------
 drivers/net/ethernet/renesas/sh_eth.h    | 13 +------------
 4 files changed, 4 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
index b81f4faf7b10..1470fc12282b 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /* Renesas Ethernet AVB device driver
  *
  * Copyright (C) 2014-2015 Renesas Electronics Corporation
@@ -5,10 +6,6 @@
  * Copyright (C) 2015-2016 Cogent Embedded, Inc. <source@cogentembedded.com>
  *
  * Based on the SuperH Ethernet driver
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License version 2,
- * as published by the Free Software Foundation.
  */
 
 #ifndef __RAVB_H__
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index c06f2df895c2..aff5516b781e 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /* Renesas Ethernet AVB device driver
  *
  * Copyright (C) 2014-2015 Renesas Electronics Corporation
@@ -5,10 +6,6 @@
  * Copyright (C) 2015-2016 Cogent Embedded, Inc. <source@cogentembedded.com>
  *
  * Based on the SuperH Ethernet driver
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License version 2,
- * as published by the Free Software Foundation.
  */
 
 #include <linux/cache.h>
diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 5573199c4536..ad4433d59237 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*  SuperH Ethernet device driver
  *
  *  Copyright (C) 2014 Renesas Electronics Corporation
@@ -5,18 +6,6 @@
  *  Copyright (C) 2008-2014 Renesas Solutions Corp.
  *  Copyright (C) 2013-2017 Cogent Embedded, Inc.
  *  Copyright (C) 2014 Codethink Limited
- *
- *  This program is free software; you can redistribute it and/or modify it
- *  under the terms and conditions of the GNU General Public License,
- *  version 2, as published by the Free Software Foundation.
- *
- *  This program is distributed in the hope it will be useful, but WITHOUT
- *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- *  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- *  more details.
- *
- *  The full GNU General Public License is included in this distribution in
- *  the file called "COPYING".
  */
 
 #include <linux/module.h>
diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
index f94be99cf400..0c18650bbfe6 100644
--- a/drivers/net/ethernet/renesas/sh_eth.h
+++ b/drivers/net/ethernet/renesas/sh_eth.h
@@ -1,19 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*  SuperH Ethernet device driver
  *
  *  Copyright (C) 2006-2012 Nobuhiro Iwamatsu
  *  Copyright (C) 2008-2012 Renesas Solutions Corp.
- *
- *  This program is free software; you can redistribute it and/or modify it
- *  under the terms and conditions of the GNU General Public License,
- *  version 2, as published by the Free Software Foundation.
- *
- *  This program is distributed in the hope it will be useful, but WITHOUT
- *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- *  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- *  more details.
- *
- *  The full GNU General Public License is included in this distribution in
- *  the file called "COPYING".
  */
 
 #ifndef __SH_ETH_H__
-- 
2.11.0

^ permalink raw reply related

* [PATCH 05/11] can: rcar: use SPDX identifier for Renesas drivers
From: Wolfram Sang @ 2018-08-21 22:02 UTC (permalink / raw)
  To: linux-renesas-soc
  Cc: Kuninori Morimoto, Wolfram Sang, Wolfgang Grandegger,
	Marc Kleine-Budde, David S. Miller, linux-can, netdev,
	linux-kernel
In-Reply-To: <20180821220233.9202-1-wsa+renesas@sang-engineering.com>

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

To be applied individually per subsystem tree. Morimoto-san, could you maybe
ack this with your Renesas address?

 drivers/net/can/rcar/rcar_can.c   | 6 +-----
 drivers/net/can/rcar/rcar_canfd.c | 6 +-----
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/net/can/rcar/rcar_can.c b/drivers/net/can/rcar/rcar_can.c
index 11662f479e76..051bf4ef4be2 100644
--- a/drivers/net/can/rcar/rcar_can.c
+++ b/drivers/net/can/rcar/rcar_can.c
@@ -1,12 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
 /* Renesas R-Car CAN device driver
  *
  * Copyright (C) 2013 Cogent Embedded, Inc. <source@cogentembedded.com>
  * Copyright (C) 2013 Renesas Solutions Corp.
- *
- * This program is free software; you can redistribute  it and/or modify it
- * under  the terms of  the GNU General  Public License as published by the
- * Free Software Foundation;  either version 2 of the  License, or (at your
- * option) any later version.
  */
 
 #include <linux/module.h>
diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index 602c19e23f05..09a5b038a9f0 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1,11 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
 /* Renesas R-Car CAN FD device driver
  *
  * Copyright (C) 2015 Renesas Electronics Corp.
- *
- * This program is free software; you can redistribute  it and/or modify it
- * under  the terms of  the GNU General  Public License as published by the
- * Free Software Foundation;  either version 2 of the  License, or (at your
- * option) any later version.
  */
 
 /* The R-Car CAN FD controller can operate in either one of the below two modes
-- 
2.11.0

^ permalink raw reply related

* Fw: [Bug 200879] New: Poor network performance using CX-5 Mellanox card
From: Stephen Hemminger @ 2018-08-21 18:39 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Tue, 21 Aug 2018 18:37:01 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 200879] New: Poor network performance using CX-5 Mellanox card


https://bugzilla.kernel.org/show_bug.cgi?id=200879

            Bug ID: 200879
           Summary: Poor network performance using CX-5 Mellanox card
           Product: Networking
           Version: 2.5
    Kernel Version: 4.18
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: IPV4
          Assignee: stephen@networkplumber.org
          Reporter: kolga@netapp.com
        Regression: No

I'm having issues with the latest kernel (4.18) and using Mellanox CX-5 cards.
Tested using a direct connection between the machines and via a 40G link. There
is an asymmetric throughput.

I have tested some kernels in between 4.15-rc4 and 4.18. What I notice is that
I have asymmetric flow performance where "good" direction gets 20+G and the bad
direction gets 2+G. I measure performance over doing multiple runs (10). I have
noticed that while 4.15-rc4 was not perfect at getting symmetric 20+G
performance all the time (details below), the poor performance is more
prevalent starting from 4.16 kernel.

In 4.15-rc4 6 out of 10 runs show good performance 20+G (in the bad
direction). performance in other direction is mostly 28+G (7 out of 10
runs. where 3 runs it goes down to 15+G)
In 4.16 3 out of 10 runs show good performance 20+G. performance in
other direction is mostly 20+G (7 out of 10 runs where 3 runs goes
down to 15G)
In 4.17 0 out of 10 runs show good performance. performance in other
direction is mostly 20+G (7 out of 10 runs where 3 runs it goes down
to 5G)
In 4.18 0 out of 10 runs show good performance in the "bad
direction". the "good direction" is now also pretty bad where 7 out of
10 runs had throughput between 7-10G and 3 runs had 11-17G.

Let me try to do with a description of the setup. I have 2machines:
sti-rx200-231 and sti-rx200-232. I have been running iperf3 testing
bandwidth where at first the server was on -231 and then when the
server was on -232. The "bad" direction is when -232 is the server and
-231 is the client.

> can you share some configuration details:
> CPU numa and affinity details:  

[kolga@sti-rx200-232 ~]$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 32155 MB
node 0 free: 30394 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 32231 MB
node 1 free: 30486 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

[kolga@sti-rx200-231 ~]$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 32161 MB
node 0 free: 31100 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 32231 MB
node 1 free: 29774 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

> ethtool -l  
[kolga@sti-rx200-231 ~]$ sudo ethtool -l enp4s0
Channel parameters for enp4s0:
Pre-set maximums:
RX:             0
TX:             0
Other:          0
Combined:       24
Current hardware settings:
RX:             0
TX:             0
Other:          0
Combined:       24

(same for the other machine)

> ethtool -g  
[kolga@sti-rx200-231 ~]$ sudo ethtool -g enp4s0
Ring parameters for enp4s0:
Pre-set maximums:
RX:             8192
RX Mini:        0
RX Jumbo:       0
TX:             8192
Current hardware settings:
RX:             1024
RX Mini:        0
RX Jumbo:       0
TX:             1024

(same for the other machine)

> ethtool -x  
[kolga@sti-rx200-231 ~]$ sudo ethtool -x enp4s0
RX flow hash indirection table for enp4s0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:      0     1     2     3     4     5     6     7
   32:      8     9    10    11    12    13    14    15
   40:     16    17    18    19    20    21    22    23
   48:      0     1     2     3     4     5     6     7
   56:      8     9    10    11    12    13    14    15
   64:     16    17    18    19    20    21    22    23
   72:      0     1     2     3     4     5     6     7
   80:      8     9    10    11    12    13    14    15
   88:     16    17    18    19    20    21    22    23
   96:      0     1     2     3     4     5     6     7
  104:      8     9    10    11    12    13    14    15
  112:     16    17    18    19    20    21    22    23
  120:      0     1     2     3     4     5     6     7
RSS hash key:
5e:1c:93:e2:ec:b6:44:b1:e4:ec:b1:20:57:ab:90:f6:0c:1a:46:13:b8:19:66:c8:56:0c:06:b2:d5:53:a6:4d:89:6b:0b:b1:d4:30:90:31

(same for the other machine but the RSS hash key is different)

> ethtool -k  
[kolga@sti-rx200-231 ~]$ sudo ethtool -k enp4s0
Features for enp4s0:
Cannot get device udp-fragmentation-offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: on [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]

(i think it's the same for the other machine but just in case here's the
output)
[kolga@sti-rx200-232 ~]$ sudo ethtool -k enp4s0
Features for enp4s0:
Cannot get device udp-fragmentation-offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: on [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]

> ethtool --show-priv-flags  
[kolga@sti-rx200-231 ~]$ sudo ethtool --show-priv-flags enp4s0
Private flags for enp4s0:
rx_cqe_moder   : on
tx_cqe_moder   : off
rx_cqe_compress: off
rx_striding_rq : on

(same for the other machine)

> ethool -S //before and after the good and bad runs"
> perf report/top while running the test.  

This is a run where -232 is a server and -231 is a client.

[kolga@sti-rx200-232 src]$ ./iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 172.20.35.189, port 37302
[  5] local 172.20.35.191 port 5201 connected to 172.20.35.189 port 37304
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   236 MBytes  1.98 Gbits/sec
[  5]   1.00-2.00   sec   233 MBytes  1.95 Gbits/sec
[  5]   2.00-3.00   sec   235 MBytes  1.97 Gbits/sec
[  5]   3.00-4.00   sec   231 MBytes  1.94 Gbits/sec
[  5]   4.00-5.00   sec   243 MBytes  2.04 Gbits/sec
[  5]   5.00-6.00   sec   238 MBytes  1.99 Gbits/sec
[  5]   6.00-7.00   sec   230 MBytes  1.93 Gbits/sec
[  5]   7.00-8.00   sec   232 MBytes  1.94 Gbits/sec
[  5]   8.00-9.00   sec   272 MBytes  2.28 Gbits/sec
[  5]   9.00-10.00  sec   249 MBytes  2.09 Gbits/sec
[  5]  10.00-10.05  sec  10.4 MBytes  1.90 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.05  sec  2.35 GBytes  2.01 Gbits/sec                  receiver

[kolga@sti-rx200-231 src]$ sudo ./iperf3 -c 172.20.35.191
Connecting to host 172.20.35.191, port 5201
[  5] local 172.20.35.189 port 37304 connected to 172.20.35.191 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   249 MBytes  2.09 Gbits/sec    4   1.93 MBytes
[  5]   1.00-2.00   sec   232 MBytes  1.95 Gbits/sec    1   1.69 MBytes
[  5]   2.00-3.00   sec   234 MBytes  1.96 Gbits/sec    0   2.24 MBytes
[  5]   3.00-4.00   sec   232 MBytes  1.95 Gbits/sec    0   2.66 MBytes
[  5]   4.00-5.00   sec   242 MBytes  2.03 Gbits/sec   18   1.32 MBytes
[  5]   5.00-6.00   sec   238 MBytes  1.99 Gbits/sec    1   1.63 MBytes
[  5]   6.00-7.00   sec   230 MBytes  1.93 Gbits/sec    0   2.18 MBytes
[  5]   7.00-8.00   sec   232 MBytes  1.95 Gbits/sec    8   1.38 MBytes
[  5]   8.00-9.00   sec   272 MBytes  2.29 Gbits/sec    2   1.45 MBytes
[  5]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec    2   1.66 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.35 GBytes  2.02 Gbits/sec   36             sender
[  5]   0.00-10.05  sec  2.35 GBytes  2.01 Gbits/sec                  receiver

iperf Done.

Top output from -231
Tasks: 412 total,   1 running, 209 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.4 si,  0.0 st
KiB Mem : 65938632 total, 63195184 free,  2191948 used,   551500 buff/cache
KiB Swap: 33030140 total, 33030140 free,        0 used. 63105388 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  162 root      20   0       0      0      0 I   3.3  0.0   0:00.51 kworker/12+
 2530 root      20   0       0      0      0 I   1.7  0.0   0:00.39 kworker/u4+
 2742 root      20   0       0      0      0 I   1.3  0.0   0:00.17 kworker/u4+
 2394 root      20   0       0      0      0 I   1.0  0.0   0:00.16 kworker/17+
 2774 kolga     20   0  158000   4640   3676 R   0.3  0.0   0:00.03 top
    1 root      20   0  191980   6436   3900 S   0.0  0.0   0:26.21 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    5 root      20   0       0      0      0 I   0.0  0.0   0:00.02 kworker/0:+
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    8 root      20   0       0      0      0 I   0.0  0.0   0:00.00 kworker/u4+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0
   11 root      20   0       0      0      0 I   0.0  0.0   0:00.17 rcu_sched
   12 root      20   0       0      0      0 I   0.0  0.0   0:00.00 rcu_bh
   13 root      rt   0       0      0      0 S   0.0  0.0   0:00.06 migration/0

Output from -232
Tasks: 400 total,   2 running, 202 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  1.5 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
KiB Mem : 65932576 total, 63205228 free,  2180492 used,   546856 buff/cache
KiB Swap: 28901372 total, 28901372 free,        0 used. 63114220 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2467 kolga     20   0   43716   4144   3608 R  37.9  0.0   0:03.32 lt-iperf3
 1199 root      20   0       0      0      0 D   2.7  0.0   0:00.18 kworker/23+
  565 root      20   0       0      0      0 I   1.7  0.0   0:00.23 kworker/2:+
 2360 root      20   0       0      0      0 I   1.3  0.0   0:00.16 kworker/u4+
  153 root      20   0       0      0      0 I   0.7  0.0   0:00.11 kworker/23+
 2273 root      20   0       0      0      0 I   0.7  0.0   0:01.13 kworker/u4+
  691 root      20   0       0      0      0 S   0.3  0.0   0:00.20 xfsaild/dm+
 2448 root      20   0       0      0      0 I   0.3  0.0   0:00.08 kworker/u4+
    1 root      20   0  191820   6108   3804 S   0.0  0.0   0:26.12 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0
   11 root      20   0       0      0      0 I   0.0  0.0   0:00.13 rcu_sched
   12 root      20   0       0      0      0 I   0.0  0.0   0:00.00 rcu_bh


Here's a run where -231 is the server and -232 is the client. A "good"
direction
[kolga@sti-rx200-231 src]$ ./iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 172.20.35.191, port 35060
[  5] local 172.20.35.189 port 5201 connected to 172.20.35.191 port 35062
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  1.81 GBytes  15.5 Gbits/sec
[  5]   1.00-2.00   sec  1.90 GBytes  16.3 Gbits/sec
[  5]   2.00-3.00   sec  1.25 GBytes  10.8 Gbits/sec
[  5]   3.00-4.00   sec   826 MBytes  6.93 Gbits/sec
[  5]   4.00-5.00   sec   819 MBytes  6.87 Gbits/sec
[  5]   5.00-6.00   sec  1.47 GBytes  12.6 Gbits/sec
[  5]   6.00-7.00   sec  1.79 GBytes  15.4 Gbits/sec
[  5]   7.00-8.00   sec  1.14 GBytes  9.75 Gbits/sec
[  5]   8.00-9.00   sec  1.81 GBytes  15.6 Gbits/sec
[  5]   9.00-10.00  sec  1.77 GBytes  15.2 Gbits/sec
[  5]  10.00-10.04  sec  78.2 MBytes  16.3 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.04  sec  14.6 GBytes  12.5 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

./iperf3 -c 172.20.35.189
Connecting to host 172.20.35.189, port 5201
[  5] local 172.20.35.191 port 35062 connected to 172.20.35.189 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.89 GBytes  16.2 Gbits/sec  100   1.45 MBytes
[  5]   1.00-2.00   sec  1.90 GBytes  16.4 Gbits/sec   86    638 KBytes
[  5]   2.00-3.00   sec  1.21 GBytes  10.4 Gbits/sec   57   1.15 MBytes
[  5]   3.00-4.00   sec   830 MBytes  6.96 Gbits/sec    7   1.54 MBytes
[  5]   4.00-5.00   sec   815 MBytes  6.84 Gbits/sec   15   1.10 MBytes
[  5]   5.00-6.00   sec  1.51 GBytes  12.9 Gbits/sec  362    690 KBytes
[  5]   6.00-7.00   sec  1.72 GBytes  14.8 Gbits/sec  665    690 KBytes
[  5]   7.00-8.00   sec  1.20 GBytes  10.4 Gbits/sec  639    778 KBytes
[  5]   8.00-9.00   sec  1.81 GBytes  15.6 Gbits/sec  879    708 KBytes
[  5]   9.00-10.00  sec  1.77 GBytes  15.2 Gbits/sec  865    577 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  14.6 GBytes  12.6 Gbits/sec  3675             sender
[  5]   0.00-10.04  sec  14.6 GBytes  12.5 Gbits/sec                  receiver

iperf Done.

-232 top
Tasks: 391 total,   1 running, 202 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  1.4 sy,  0.0 ni, 97.2 id,  0.0 wa,  0.0 hi,  1.4 si,  0.0 st
KiB Mem : 65932576 total, 63203164 free,  2181920 used,   547492 buff/cache
KiB Swap: 28901372 total, 28901372 free,        0 used. 63111764 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2637 kolga     20   0   43716   4476   3944 S  23.6  0.0   0:02.09 lt-iperf3
 2510 root      20   0       0      0      0 I   4.3  0.0   0:00.46 kworker/10+
 2653 root      20   0       0      0      0 I   3.0  0.0   0:00.16 kworker/1:+
 2630 root      20   0       0      0      0 I   1.7  0.0   0:00.14 kworker/u4+
  565 root      20   0       0      0      0 I   1.0  0.0   0:00.32 kworker/2:+
 2609 root      20   0       0      0      0 I   1.0  0.0   0:00.48 kworker/u4+
 2273 root      20   0       0      0      0 I   0.7  0.0   0:01.67 kworker/u4+
   25 root      20   0       0      0      0 S   0.3  0.0   0:00.01 ksoftirqd/2
   74 root      20   0       0      0      0 S   0.3  0.0   0:00.03 ksoftirqd/+
 2651 kolga     20   0  158000   4568   3640 R   0.3  0.0   0:00.03 top
    1 root      20   0  191820   6108   3804 S   0.0  0.0   0:26.12 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0

-231 top
Tasks: 413 total,   3 running, 209 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  3.1 sy,  0.0 ni, 94.7 id,  0.0 wa,  0.0 hi,  2.1 si,  0.0 st
KiB Mem : 65938632 total, 63202896 free,  2183572 used,   552164 buff/cache
KiB Swap: 33030140 total, 33030140 free,        0 used. 63113436 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2859 kolga     20   0   43716   4108   3572 R  70.8  0.0   0:05.75 lt-iperf3
 2875 root      20   0       0      0      0 I   7.3  0.0   0:00.36 kworker/9:+
  185 root      20   0       0      0      0 I   6.6  0.0   0:00.47 kworker/9:+
   68 root      20   0       0      0      0 S   5.0  0.0   0:00.25 ksoftirqd/9
 2421 root      20   0       0      0      0 I   3.0  0.0   0:00.10 kworker/13+
 2832 root      20   0       0      0      0 I   1.7  0.0   0:00.28 kworker/u4+
 2742 root      20   0       0      0      0 I   1.0  0.0   0:00.63 kworker/u4+
 2530 root      20   0       0      0      0 I   0.7  0.0   0:01.41 kworker/u4+
 2877 kolga     20   0  158000   4712   3740 R   0.3  0.0   0:00.02 top
    1 root      20   0  191980   6440   3900 S   0.0  0.0   0:26.23 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    5 root      20   0       0      0      0 I   0.0  0.0   0:00.02 kworker/0:+
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    8 root      20   0       0      0      0 I   0.0  0.0   0:00.00 kworker/u4+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+

I could attach outputs from ethtool -S (too long to include) 

From google about CPU states and disabling them
[kolga@sti-rx200-232 ~]$ sudo cat
/sys/module/intel_idle/parameters/max_cstate  9

adding processor.max_cstate=0 and intel_idle.max_cstate=0 to the
kernel boot parameters made is so max_cstate stated at 0.

I re-did the re-experiments and it made no difference.

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply

* Re: [PATCHv2 iproute2 2/3] tc: remove extern from prototype declarations
From: Stephen Hemminger @ 2018-08-21 18:19 UTC (permalink / raw)
  To: Mahesh Bandewar; +Cc: netdev, Mahesh Bandewar
In-Reply-To: <20180821174854.208561-1-mahesh@bandewar.net>

On Tue, 21 Aug 2018 10:48:54 -0700
Mahesh Bandewar <mahesh@bandewar.net> wrote:

> From: Mahesh Bandewar <maheshb@google.com>
> 
> Signed-off-by: Mahesh Bandewar <maheshb@google.com>

I already did this yesterday. Patch was on mailing list.

^ permalink raw reply

* Re: ixgbe hangs when XDP_TX is enabled
From: Alexander Duyck @ 2018-08-21 18:13 UTC (permalink / raw)
  To: tehnerd; +Cc: Netdev, Jeff Kirsher
In-Reply-To: <20180821165858.GA1507@maindev>

On Tue, Aug 21, 2018 at 9:59 AM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
>
> On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> > On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
> > >
> > > we are getting such errors:
> > >
> > > [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> > >                  Tx Queue             <46>
> > >                  TDH, TDT             <0>, <2>
> > >                  next_to_use          <2>
> > >                  next_to_clean        <0>
> > >                tx_buffer_info[next_to_clean]
> > >                  time_stamp           <0>
> > >                  jiffies              <1000197c0>
> > > [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> > > [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > > [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > > [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> > > [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > > [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> > >
> > > while running XDP prog on ixgbe nic.
> > > right now i'm seing this on bpfnext kernel
> > > (latest commit from Wed Aug 15 15:04:25 2018 -0700 ;
> > > 9a76aba02a37718242d7cdc294f0a3901928aa57)
> > >
> > > looks like this is the same issue as reported by Brenden in
> > > https://www.spinics.net/lists/netdev/msg439438.html
> > >
> > > --
> > > Nikita V. Shirokov
> >
> > Could you provide some additional information about your setup.
> > Specifically useful would be "ethtool -i", "ethtool -l", and lspci
> > -vvv info for your device. The total number of CPUs on the system
> > would be useful to know as well. In addition could you try
> > reproducing
> sure:
>
> ethtool -l eth0
> Channel parameters for eth0:
> Pre-set maximums:
> RX:             0
> TX:             0
> Other:          1
> Combined:       63
> Current hardware settings:
> RX:             0
> TX:             0
> Other:          1
> Combined:       48
>
> # ethtool -i eth0
> driver: ixgbe
> version: 5.1.0-k
> firmware-version: 0x800006f1
> expansion-rom-version:
> bus-info: 0000:03:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
>
> # nproc
> 48
>
> lspci:
>
> 03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>         Subsystem: Intel Corporation Device 000d
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 30
>         NUMA node: 0
>         Region 0: Memory at c7d00000 (64-bit, non-prefetchable) [size=1M]
>         Region 2: I/O ports at 6000 [size=32]
>         Region 4: Memory at c7e80000 (64-bit, non-prefetchable) [size=16K]
>         Expansion ROM at c7e00000 [disabled] [size=512K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>                 Address: 0000000000000000  Data: 0000
>                 Masking: 00000000  Pending: 00000000
>         Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>                 Vector table: BAR=4 offset=00000000
>                 PBA: BAR=4 offset=00002000
>         Capabilities: [a0] Express (v2) Endpoint, MSI 00
>                 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
>                 LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 <8us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>         Capabilities: [100 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-ff-b6-b2-60
>         Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
>                 ARICap: MFVC- ACS-, Next Function: 0
>                 ARICtl: MFVC- ACS-, Function Group: 0
>         Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
>                 IOVCap: Migration-, Interrupt Message Number: 000
>                 IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
>                 IOVSta: Migration-
>                 Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
>                 VF offset: 128, stride: 2, Device ID: 10ed
>                 Supported Page Size: 00000553, System Page Size: 00000001
>                 Region 0: Memory at 00000000c7c00000 (64-bit, prefetchable)
>                 Region 3: Memory at 00000000c7b00000 (64-bit, prefetchable)
>                 VF Migration: offset: 00000000, BIR: 0
>         Kernel driver in use: ixgbe
>
>
>
>
> workaround for now is to do the same, as Brenden did in his original
> finding: make sure that combined + xdp queues < max_tx_queues
> (e.g. w/ combined == 14 the issue goes away).
>
> > the issue with one of the sample XDP programs provided with the kernel
> > such as the xdp2 which I believe uses the XDP_TX function. We need to
> > try and create a similar setup in our own environment for
> > reproduction and debugging.
>
> will try but this could take a while, because i'm not sure that we have
> ixgbe in our test lab (and it would be hard to run such test in prod)
>
> >
> > Thanks.
> >
> > - Alex
>
> --
> Nikita V. Shirokov

So I have been reading the datasheet
(https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82599-10-gbe-controller-datasheet.pdf)
and it looks like the assumption that Brenden came to in the earlier
referenced link is probably correct. From what I can tell there is a
limit of 64 queues in the base RSS mode of the device, so while it
supports more than 64 queues you can only make use of 64 as per table
7-25.

For now I think the workaround you are using is probably the only
viable solution. I myself don't have time to work on resolving this,
but I am sure on of the maintainers for ixgbe will be responding
shortly.

One possible solution we may want to look at would be to make use of
the 32 pool/VF mode in the MTQC register. That should enable us to
make use of all 128 queues but I am sure there would be other side
effects such as having to set the bits in the PFVFTE register in order
to enable the extra Tx queues.

Thanks.

- Alex

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox