Netdev List
* [net-next v2 00/13][pull request] Intel Wired LAN Driver Updates 2019-09-11
From: Jeff Kirsher @ 2019-09-11 16:50 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series contains updates to i40e, ixgbe/vf and iavf.

Wenwen Wang fixes a potential memory leak where 3 allocated variables
are not properly cleaned up on failure for ixgbe.

Stefan Assmann fixes a potential kernel panic found when repeatedly
spawning and destroying VFs in i40e, where a NULL pointer is dereferenced
due to a race condition.  He also fixes the i40e driver to clear the
__I40E_VIRTCHNL_OP_PENDING bit before returning after an invalid
minimum transmit rate is requested, and updates the iavf driver to
apply the MAC address change only when the PF ACKs the requested change.

Tonghao Zhang updates ixgbe to use the skb_get_queue_mapping() API call
instead of the driver accessing the queue mapping directly.

Jake updates i40e to use ktime_get_real_ts64() instead of
ktime_to_timespec64().  He removes the define for bit 0x0001 for cloud
filters, since it is a reserved bit and not a valid type, and adds
code comments that clearly state which bits are reserved and should not
be used or defined for the cloud filter adminq command.  He also converts
the macros that specify the cloud filter fields to the BIT() macro, to
make clear that they are individual bits.
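
As a rough illustration of the BIT() conversion described above, here is a
minimal userspace sketch. The EXAMPLE_* names are hypothetical and are not the
real i40e adminq definitions; BIT() is redefined locally as a stand-in for the
kernel helper. Bit 0 is left undefined to model a reserved bit.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() helper. */
#define BIT(nr) (1UL << (nr))

/*
 * Hypothetical field flags, for illustration only. Bit 0 is deliberately
 * not defined, modelling a reserved bit that must not be used.
 */
#define EXAMPLE_FILTER_IMAC	BIT(1)	/* same value as 0x0002 */
#define EXAMPLE_FILTER_OMAC	BIT(2)	/* same value as 0x0004 */
#define EXAMPLE_FILTER_IVLAN	BIT(3)	/* same value as 0x0008 */

/* Return nonzero if 'flags' uses only the defined (non-reserved) bits. */
static int example_flags_valid(uint16_t flags)
{
	const uint16_t known = (uint16_t)(EXAMPLE_FILTER_IMAC |
					  EXAMPLE_FILTER_OMAC |
					  EXAMPLE_FILTER_IVLAN);

	return (flags & ~known) == 0;
}
```

Writing each field as BIT(n) makes it visible at a glance that the values are
independent bits and that position 0 has been skipped on purpose.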

Aleksandr fixes up i40e_print_link_message() to include the "negotiated"
FEC status for i40e.

Czeslaw also adds an additional log message for devices without FEC in
i40e_print_link_message() for i40e.

Colin Ian King reduces the object code size in ixgbevf by making the
array 'api' static const.

Magnus fixes a potential receive buffer starvation issue for AF_XDP by
kicking the NAPI context of any queue with an attached AF_XDP zero-copy
socket.

v2: Removed patch 11 from the original series (Alex Duyck's ITR fix), 
    so that it can be sent to the net tree.

The following are changes since commit c1609946b8b6485e1d405663004867ea9e92178a:
  Merge branch 'qed-Fix-series'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 40GbE

Aleksandr Loktionov (1):
  i40e: fix missed "Negotiated" string in i40e_print_link_message()

Colin Ian King (1):
  net/ixgbevf: make array api static const, makes object smaller

Czeslaw Zagorski (1):
  i40e: Fix message for other card without FEC.

Jacob Keller (4):
  i40e: use ktime_get_real_ts64 instead of ktime_to_timespec64
  i40e: remove I40E_AQC_ADD_CLOUD_FILTER_OIP
  i40e: mark additional missing bits as reserved
  i40e: use BIT macro to specify the cloud filter field flags

Magnus Karlsson (1):
  i40e: fix potential RX buffer starvation for AF_XDP

Stefan Assmann (3):
  i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask
  i40e: clear __I40E_VIRTCHNL_OP_PENDING on invalid min Tx rate
  iavf: fix MAC address setting for VFs when filter is rejected

Tonghao Zhang (1):
  ixgbe: use skb_get_queue_mapping in tx path

Wenwen Wang (1):
  ixgbe: fix memory leaks

 drivers/net/ethernet/intel/i40e/i40e.h        | 10 +++----
 .../net/ethernet/intel/i40e/i40e_adminq_cmd.h |  5 +++-
 drivers/net/ethernet/intel/i40e/i40e_main.c   | 30 ++++++++++++-------
 drivers/net/ethernet/intel/i40e/i40e_ptp.c    |  2 +-
 .../ethernet/intel/i40e/i40e_virtchnl_pf.c    |  3 +-
 drivers/net/ethernet/intel/i40e/i40e_xsk.c    |  5 ++++
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  1 -
 .../net/ethernet/intel/iavf/iavf_virtchnl.c   |  7 +++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  6 +++-
 .../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 14 +++++----
 10 files changed, 57 insertions(+), 26 deletions(-)

-- 
2.21.0



* Re: [PATCH 04/11] net: phylink: switch to using fwnode_gpiod_get_index()
From: Andy Shevchenko @ 2019-09-11 16:52 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Dmitry Torokhov, Linus Walleij, Mika Westerberg, linux-kernel,
	linux-gpio, Andrew Lunn, David S. Miller, Florian Fainelli,
	Heiner Kallweit, netdev
In-Reply-To: <20190911101016.GW13294@shell.armlinux.org.uk>

On Wed, Sep 11, 2019 at 11:10:16AM +0100, Russell King - ARM Linux admin wrote:
> On Wed, Sep 11, 2019 at 02:55:11AM -0700, Dmitry Torokhov wrote:
> > On Wed, Sep 11, 2019 at 10:49:29AM +0100, Russell King - ARM Linux admin wrote:
> > > On Wed, Sep 11, 2019 at 12:46:19PM +0300, Andy Shevchenko wrote:
> > > > On Wed, Sep 11, 2019 at 10:39:14AM +0100, Russell King - ARM Linux admin wrote:
> > > > > On Wed, Sep 11, 2019 at 12:25:14PM +0300, Andy Shevchenko wrote:
> > > > > > On Wed, Sep 11, 2019 at 12:52:08AM -0700, Dmitry Torokhov wrote:
> > > > > > > Instead of fwnode_get_named_gpiod() that I plan to hide away, let's use
> > > > > > > the new fwnode_gpiod_get_index() that mimics gpiod_get_index(), but
> > > > > > > works with arbitrary firmware node.
> > > > > >
> > > > > > I'm wondering if it's possible to step forward and replace
> > > > > > fwnode_gpiod_get_index() by gpiod_get() / gpiod_get_index() here and
> > > > > > in other cases in this series.
> > > > > 
> > > > > No, those require a struct device, but we have none.  There are network
> > > > > drivers where there is a struct device for the network complex, but only
> > > > > DT nodes for the individual network interfaces.  So no, gpiod_* really
> > > > > doesn't work.
> > > > 
> > > > In the following patch the node is derived from struct device. So, I believe
> > > > some cases can be handled differently.

> Referring back to my comment, notice that I said we have none for the
> phylink case, so it's not possible there.
> 
> I'm not sure why Andy replied the way he did, unless he mis-read my
> comment.

It is the first patch that makes the change. My reply was mostly to Dmitry, and
your comment clarifies the case with this patch, thanks!

-- 
With Best Regards,
Andy Shevchenko




* Re: VRF Issue Since kernel 5
From: David Ahern @ 2019-09-11 16:53 UTC (permalink / raw)
  To: Alexis Bauvin, Gowen; +Cc: netdev@vger.kernel.org
In-Reply-To: <9E920DE7-9CC9-493C-A1D2-957FE1AED897@online.net>

On 9/9/19 10:28 AM, Alexis Bauvin wrote:
> Also, your `unreachable default metric 4278198272` route looks odd to me.
> 

New recommendation from FRR group. See
https://www.kernel.org/doc/Documentation/networking/vrf.txt and search
for 4278198272


* Re: VRF Issue Since kernel 5
From: David Ahern @ 2019-09-11 17:02 UTC (permalink / raw)
  To: Gowen, netdev@vger.kernel.org
In-Reply-To: <CWLP265MB1554308A1373D9ECE68CB854FDB70@CWLP265MB1554.GBRP265.PROD.OUTLOOK.COM>

At LPC this week and just now getting a chance to process the data you sent.

On 9/9/19 8:46 AM, Gowen wrote:
> the production traffic is all in the 10.0.0.0/8 network (eth1 global VRF) except for a few subnets (DNS) which are routed out eth0 (mgmt-vrf)
> 
> 
> Admin@NETM06:~$ ip route show
> default via 10.24.12.1 dev eth0
> 10.0.0.0/8 via 10.24.12.1 dev eth1
> 10.24.12.0/24 dev eth1 proto kernel scope link src 10.24.12.9
> 10.24.65.0/24 via 10.24.12.1 dev eth0
> 10.25.65.0/24 via 10.24.12.1 dev eth0
> 10.26.0.0/21 via 10.24.12.1 dev eth0
> 10.26.64.0/21 via 10.24.12.1 dev eth0

Interesting route table. This is the default VRF, but you have routes
leaking through eth0, which is in mgmt-vrf.

> 
> 
> Admin@NETM06:~$ ip route show vrf mgmt-vrf
> default via 10.24.12.1 dev eth0
> unreachable default metric 4278198272
> 10.24.12.0/24 dev eth0 proto kernel scope link src 10.24.12.10
> 10.24.65.0/24 via 10.24.12.1 dev eth0
> 10.25.65.0/24 via 10.24.12.1 dev eth0
> 10.26.0.0/21 via 10.24.12.1 dev eth0
> 10.26.64.0/21 via 10.24.12.1 dev eth0

The DNS servers are 10.24.65.203 or 10.24.64.203, which you want to
reach via mgmt-vrf. Correct?

10.24.65.203 should hit the route "10.24.65.0/24 via 10.24.12.1 dev
eth0" for both default VRF and mgmt-vrf.

10.24.64.203 will NOT hit a route leak entry, so it will traverse the
VRF associated with the context of the command (mgmt-vrf or default). Is
that intentional? (Verify with: `ip ro get 10.24.64.203 fibmatch` and
`ip ro get 10.24.64.203 vrf mgmt-vrf fibmatch`)
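
The difference between the two DNS addresses can be checked with a small
longest-prefix-match sketch in C. The table below is copied from the quoted
`ip route show` output for the default VRF; the struct, macro and function
names are hypothetical, and a linear scan is used for clarity, which is not
how the kernel FIB is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct example_route {
	uint32_t prefix;	/* network address, host byte order */
	int plen;		/* prefix length, 0..32 */
	const char *dev;	/* egress device */
};

#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Entries taken from the quoted default-VRF route table. */
static const struct example_route default_vrf_routes[] = {
	{ IP4(0, 0, 0, 0),      0, "eth0" },
	{ IP4(10, 0, 0, 0),     8, "eth1" },
	{ IP4(10, 24, 12, 0),  24, "eth1" },
	{ IP4(10, 24, 65, 0),  24, "eth0" },
	{ IP4(10, 25, 65, 0),  24, "eth0" },
	{ IP4(10, 26, 0, 0),   21, "eth0" },
	{ IP4(10, 26, 64, 0),  21, "eth0" },
};

#define EXAMPLE_NROUTES \
	(sizeof(default_vrf_routes) / sizeof(default_vrf_routes[0]))

/* Return the egress device of the most specific matching route. */
static const char *example_lookup(const struct example_route *tbl, size_t n,
				  uint32_t dst)
{
	const char *best = NULL;
	int best_len = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		/* A /0 route matches everything; avoid shifting by 32. */
		uint32_t mask = tbl[i].plen ?
				~(uint32_t)0 << (32 - tbl[i].plen) : 0;

		if ((dst & mask) == tbl[i].prefix && tbl[i].plen > best_len) {
			best = tbl[i].dev;
			best_len = tbl[i].plen;
		}
	}
	return best;
}
```

With this table, 10.24.65.203 matches the leaked 10.24.65.0/24 entry and leaves
via eth0, while 10.24.64.203 only matches 10.0.0.0/8 and leaves via eth1.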


> 
> 
> 
> The strange activity occurs when I enter the command “sudo apt update” as I can resolve the DNS request (10.24.65.203 or 10.24.64.203, verified with tcpdump) out eth0 but for the actual update traffic there is no activity:
> 
> 
> sudo tcpdump -i eth0 '(host 10.24.65.203 or host 10.25.65.203) and port 53' -n
> <OUTPUT OMITTED FOR BREVITY>
> 10:06:05.268735 IP 10.24.12.10.39963 > 10.24.65.203.53: 48798+ [1au] A? security.ubuntu.com. (48)
> <OUTPUT OMITTED FOR BREVITY>
> 10:06:05.284403 IP 10.24.65.203.53 > 10.24.12.10.39963: 48798 13/0/1 A 91.189.91.23, A 91.189.88.24, A 91.189.91.26, A 91.189.88.162, A 91.189.88.149, A 91.189.91.24, A 91.189.88.173, A 91.189.88.177, A 91.189.88.31, A 91.189.91.14, A 91.189.88.176, A 91.189.88.175, A 91.189.88.174 (256)
> 
> 
> 
> You can see that the update traffic is returned but is not accepted by the stack and a RST is sent
> 
> 
> Admin@NETM06:~$ sudo tcpdump -i eth0 '(not host 168.63.129.16 and port 80)' -n
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
> 10:17:12.690658 IP 10.24.12.10.40216 > 91.189.88.175.80: Flags [S], seq 2279624826, win 64240, options [mss 1460,sackOK,TS val 2029365856 ecr 0,nop,wscale 7], length 0
> 10:17:12.691929 IP 10.24.12.10.52362 > 91.189.95.83.80: Flags [S], seq 1465797256, win 64240, options [mss 1460,sackOK,TS val 3833463674 ecr 0,nop,wscale 7], length 0
> 10:17:12.696270 IP 91.189.88.175.80 > 10.24.12.10.40216: Flags [S.], seq 968450722, ack 2279624827, win 28960, options [mss 1418,sackOK,TS val 81957103 ecr 2029365856,nop,wscale 7], length 0                                                                                                                            
> 10:17:12.696301 IP 10.24.12.10.40216 > 91.189.88.175.80: Flags [R], seq 2279624827, win 0, length 0
> 10:17:12.697884 IP 91.189.95.83.80 > 10.24.12.10.52362: Flags [S.], seq 4148330738, ack 1465797257, win 28960, options [mss 1418,sackOK,TS val 2257624414 ecr 3833463674,nop,wscale 8], length 0                                                                                                                         
> 10:17:12.697909 IP 10.24.12.10.52362 > 91.189.95.83.80: Flags [R], seq 1465797257, win 0, length 0
> 
> 
> 
> 
> I can emulate the DNS lookup using netcat in the vrf:
> 
> 
> sudo ip vrf exec mgmt-vrf nc -u 10.24.65.203 53
> 

`ip vrf exec mgmt-vrf <COMMAND>` means that every IPv4 and IPv6 socket
opened by <COMMAND> is automatically bound to mgmt-vrf which causes
route lookups to hit the mgmt-vrf table.

Just running <COMMAND> (without binding to any vrf) means no socket is
bound to anything unless the command does a bind. In that case the
routing lookups determine which egress device is used.

When the response comes back, if the ingress interface is in a VRF, then
the socket lookup wants to match on a device.

Now, a later response shows this for DNS lookups:

  isc-worker0000 20261 [000]  2215.013849: fib:fib_table_lookup: table
10 oif 0 iif 0 proto 0 0.0.0.0/0 -> 127.0.0.1/0 tos 0 scope 0 flags 0
==> dev eth0 gw 10.24.12.1 src 10.24.12.10 err 0
  isc-worker0000 20261 [000]  2215.013915: fib:fib_table_lookup: table
10 oif 4 iif 1 proto 17 0.0.0.0/52138 -> 127.0.0.53/53 tos 0 scope 0
flags 4 ==> dev eth0 gw 10.24.12.1 src 10.24.12.10 err 0
  isc-worker0000 20261 [000]  2220.014006: fib:fib_table_lookup: table
10 oif 4 iif 1 proto 17 0.0.0.0/52138 -> 127.0.0.53/53 tos 0 scope 0
flags 4 ==> dev eth0 gw 10.24.12.1 src 10.24.12.10 err 0

which suggests your process is passing off the DNS lookup to a local
process (isc-worker) and it hits the default route for mgmt-vrf when it
is trying to connect to a localhost address.

For mgmt-vrf I suggest always adding 127.0.0.1/8 to the mgmt vrf device
(and ::1/128 for IPv6 starting with 5.x kernels - I forget the exact
kernel version).

That might solve your problem; it might not.

(BTW: Cumulus uses fib rules for DNS servers to force DNS packets out
the mgmt-vrf interface.)


* Re: [PATCH 00/11] Add support for software nodes to gpiolib
From: Andy Shevchenko @ 2019-09-11 17:13 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Linus Walleij, Mika Westerberg, linux-kernel, linux-gpio,
	Andrew Lunn, Andrzej Hajda, Bartosz Golaszewski, Daniel Vetter,
	David Airlie, David S. Miller, Florian Fainelli, Heiner Kallweit,
	Jernej Skrabec, Jonas Karlman, Laurent Pinchart, Neil Armstrong,
	Russell King, dri-devel, linux-acpi, netdev
In-Reply-To: <20190911075215.78047-1-dmitry.torokhov@gmail.com>

On Wed, Sep 11, 2019 at 12:52:04AM -0700, Dmitry Torokhov wrote:
> This series attempts to add support for software nodes to gpiolib, using
> software node references that were introduced recently. This allows us
> to convert more drivers to the generic device properties and drop
> support for custom platform data:
> 
> static const struct software_node gpio_bank_b_node = {
> |-------.name = "B",
> };
> 
> static const struct property_entry simone_key_enter_props[] = {
> |-------PROPERTY_ENTRY_U32("linux,code", KEY_ENTER),
> |-------PROPERTY_ENTRY_STRING("label", "enter"),
> |-------PROPERTY_ENTRY_REF("gpios", &gpio_bank_b_node, 123, GPIO_ACTIVE_LOW),
> |-------{ }
> };
> 
> If we agree in principle, I would like to have the very first 3 patches
> in an immutable branch off maybe -rc8 so that it can be pulled into
> individual subsystems so that patches switching various drivers to
> fwnode_gpiod_get_index() could be applied.

FWIW,
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

for patches 1-8 after addressing minor issues.
I'll review the rest later on.

> 
> Thanks,
> Dmitry
> 
> Dmitry Torokhov (11):
>   gpiolib: of: add a fallback for wlf,reset GPIO name
>   gpiolib: introduce devm_fwnode_gpiod_get_index()
>   gpiolib: introduce fwnode_gpiod_get_index()
>   net: phylink: switch to using fwnode_gpiod_get_index()
>   net: mdio: switch to using fwnode_gpiod_get_index()
>   drm/bridge: ti-tfp410: switch to using fwnode_gpiod_get_index()
>   gpiolib: make fwnode_get_named_gpiod() static
>   gpiolib: of: tease apart of_find_gpio()
>   gpiolib: of: tease apart acpi_find_gpio()
>   gpiolib: consolidate fwnode GPIO lookups
>   gpiolib: add support for software nodes
> 
>  drivers/gpio/Makefile              |   1 +
>  drivers/gpio/gpiolib-acpi.c        | 153 ++++++++++++++----------
>  drivers/gpio/gpiolib-acpi.h        |  21 ++--
>  drivers/gpio/gpiolib-devres.c      |  33 ++----
>  drivers/gpio/gpiolib-of.c          | 159 ++++++++++++++-----------
>  drivers/gpio/gpiolib-of.h          |  26 ++--
>  drivers/gpio/gpiolib-swnode.c      |  92 +++++++++++++++
>  drivers/gpio/gpiolib-swnode.h      |  13 ++
>  drivers/gpio/gpiolib.c             | 184 ++++++++++++++++-------------
>  drivers/gpu/drm/bridge/ti-tfp410.c |   4 +-
>  drivers/net/phy/mdio_bus.c         |   4 +-
>  drivers/net/phy/phylink.c          |   4 +-
>  include/linux/gpio/consumer.h      |  53 ++++++---
>  13 files changed, 471 insertions(+), 276 deletions(-)
>  create mode 100644 drivers/gpio/gpiolib-swnode.c
>  create mode 100644 drivers/gpio/gpiolib-swnode.h
> 
> -- 
> 2.23.0.162.g0b9fbb3734-goog
> 

-- 
With Best Regards,
Andy Shevchenko




* [PATCH bpf-next 1/3] i40e: fix xdp handle calculations
From: Ciara Loftus @ 2019-09-11 17:24 UTC (permalink / raw)
  To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
  Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus

Commit 4c5d9a7fa149 ("i40e: fix xdp handle calculations") reintroduced
the addition of the umem headroom to the xdp handle in the i40e_zca_free,
i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc functions. However,
the headroom is already added to the handle in the function i40e_run_xdp_zc.
This commit removes the latter addition and fixes the case where the
headroom is non-zero.
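
The double addition can be modelled with a small userspace sketch. This is a
simplified model that assumes aligned mode, where adjusting a handle is treated
as plain addition; the example_* helpers are hypothetical and do not exist in
the driver. The data_off argument stands for xdp->data - xdp->data_hard_start.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's XDP_PACKET_HEADROOM value. */
#define EXAMPLE_XDP_PACKET_HEADROOM 256u

/* Allocation path: the stored handle already includes the umem headroom. */
static uint64_t example_alloc_handle(uint64_t chunk_base,
				     uint32_t umem_headroom)
{
	return chunk_base + umem_headroom;
}

/* Run path before the fix: the headroom is added a second time. */
static uint64_t example_run_old(uint64_t handle, uint32_t umem_headroom,
				uint64_t data_off)
{
	uint64_t offset = umem_headroom;

	offset += data_off;
	return handle + offset;
}

/* Run path after the fix: only the data offset is added. */
static uint64_t example_run_new(uint64_t handle, uint64_t data_off)
{
	return handle + data_off;
}
```

With a headroom of zero both variants agree, which matches the commit message:
only the non-zero headroom case was affected.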

Fixes: 4c5d9a7fa149 ("i40e: fix xdp handle calculations")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 0373bc6c7e61..5f285ba1f1f9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -192,7 +192,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
 {
 	struct xdp_umem *umem = rx_ring->xsk_umem;
 	int err, result = I40E_XDP_PASS;
-	u64 offset = umem->headroom;
+	u64 offset;
 	struct i40e_ring *xdp_ring;
 	struct bpf_prog *xdp_prog;
 	u32 act;
@@ -203,7 +203,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
 	 */
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
-	offset += xdp->data - xdp->data_hard_start;
+	offset = xdp->data - xdp->data_hard_start;
 
 	xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset);
 
-- 
2.17.1



* [PATCH bpf-next 2/3] ixgbe: fix xdp handle calculations
From: Ciara Loftus @ 2019-09-11 17:24 UTC (permalink / raw)
  To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
  Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
In-Reply-To: <20190911172435.21042-1-ciara.loftus@intel.com>

Commit 7cbbf9f1fa23 ("ixgbe: fix xdp handle calculations") reintroduced
the addition of the umem headroom to the xdp handle in the ixgbe_zca_free,
ixgbe_alloc_buffer_slow_zc and ixgbe_alloc_buffer_zc functions. However,
the headroom is already added to the handle in the function
ixgbe_run_xdp_zc. This commit removes the latter addition and fixes the
case where the headroom is non-zero.

Fixes: 7cbbf9f1fa23 ("ixgbe: fix xdp handle calculations")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index ad802a8909e0..5ed8b5a257cf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -145,7 +145,7 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter,
 {
 	struct xdp_umem *umem = rx_ring->xsk_umem;
 	int err, result = IXGBE_XDP_PASS;
-	u64 offset = umem->headroom;
+	u64 offset;
 	struct bpf_prog *xdp_prog;
 	struct xdp_frame *xdpf;
 	u32 act;
@@ -153,7 +153,7 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter,
 	rcu_read_lock();
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
-	offset += xdp->data - xdp->data_hard_start;
+	offset = xdp->data - xdp->data_hard_start;
 
 	xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset);
 
-- 
2.17.1



* [PATCH bpf-next 3/3] samples/bpf: fix xdpsock l2fwd tx for unaligned mode
From: Ciara Loftus @ 2019-09-11 17:24 UTC (permalink / raw)
  To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
  Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
In-Reply-To: <20190911172435.21042-1-ciara.loftus@intel.com>

Preserve the offset of the address of the received descriptor, and include
it in the address set for the tx descriptor, so the kernel can correctly
locate the start of the packet data.
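
For illustration, the following sketch shows why dropping the offset breaks tx.
It assumes the unaligned-mode descriptor address keeps the chunk base in the
low 48 bits and an offset in the upper 16 bits. The example_* helpers are local
stand-ins written for this note, not the libbpf functions used in the sample.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 48 bits = base address, upper 16 bits = offset. */
#define EXAMPLE_OFFSET_SHIFT	48
#define EXAMPLE_ADDR_MASK	((1ULL << EXAMPLE_OFFSET_SHIFT) - 1)

/* Build a descriptor address from a base address and an offset. */
static uint64_t example_make_addr(uint64_t base, uint64_t offset)
{
	return (offset << EXAMPLE_OFFSET_SHIFT) | (base & EXAMPLE_ADDR_MASK);
}

/* Keep only the base address; the offset is discarded. */
static uint64_t example_extract_addr(uint64_t addr)
{
	return addr & EXAMPLE_ADDR_MASK;
}

/* Return only the offset part. */
static uint64_t example_extract_offset(uint64_t addr)
{
	return addr >> EXAMPLE_OFFSET_SHIFT;
}

/* Resolve the descriptor address to where the packet data starts. */
static uint64_t example_add_offset_to_addr(uint64_t addr)
{
	return example_extract_addr(addr) + example_extract_offset(addr);
}
```

If the tx descriptor is filled with the result of example_extract_addr(), the
receiver of that descriptor resolves it to the chunk base rather than to the
packet data. Passing the received address through unchanged keeps the offset.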

Fixes: 03895e63ff97 ("samples/bpf: add buffer recycling for unaligned chunks to xdpsock")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 samples/bpf/xdpsock_user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 102eace22956..df011ac33402 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -685,7 +685,7 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
 	for (i = 0; i < rcvd; i++) {
 		u64 addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr;
 		u32 len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx++)->len;
-		u64 orig = xsk_umem__extract_addr(addr);
+		u64 orig = addr;
 
 		addr = xsk_umem__add_offset_to_addr(addr);
 		char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
-- 
2.17.1



* Re: [PATCH net] tcp: remove empty skb from write queue in error cases
From: Christoph Paasch @ 2019-09-11 17:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, netdev, Soheil Hassas Yeganeh, Neal Cardwell,
	Eric Dumazet, Jason Baron, Vladimir Rutsky
In-Reply-To: <20190826161915.81676-1-edumazet@google.com>

Hello,

On Mon, Aug 26, 2019 at 11:04 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Vladimir Rutsky reported stuck TCP sessions after memory pressure
> events. Edge Trigger epoll() user would never receive an EPOLLOUT
> notification allowing them to retry a sendmsg().
>
> Jason tested the case of sk_stream_alloc_skb() returning NULL,
> but there are other paths that could lead both sendmsg() and sendpage()
> to return -1 (EAGAIN), with an empty skb queued on the write queue.
>
> This patch makes sure we remove this empty skb so that
> Jason's code can detect that the queue is empty, and
> call sk->sk_write_space(sk) accordingly.
>
> Fixes: ce5ec440994b ("tcp: ensure epoll edge trigger wakeup when write queue is empty")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jason Baron <jbaron@akamai.com>
> Reported-by: Vladimir Rutsky <rutsky@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> ---
>  net/ipv4/tcp.c | 30 ++++++++++++++++++++----------
>  1 file changed, 20 insertions(+), 10 deletions(-)

I got syzkaller complaining now on 4.14.143 with the following reproducer:

# {Threaded:true Collide:true Repeat:true RepeatTimes:0 Procs:1
Sandbox: Fault:false FaultCall:-1 FaultNth:0 EnableTun:false
UseTmpDir:false EnableCgroups:false EnableNetdev:false ResetNet:false
HandleSegv:false Repro:false Trace:false}
r0 = socket$inet_tcp(0x2, 0x1, 0x0)
setsockopt$inet_tcp_TCP_REPAIR(r0, 0x6, 0x13, &(0x7f0000000040)=0x1, 0x4)
setsockopt$inet_tcp_TCP_REPAIR_QUEUE(r0, 0x6, 0x14, &(0x7f00000012c0)=0x2, 0x4)
setsockopt$inet_tcp_int(r0, 0x6, 0x19, &(0x7f0000000000)=0x9, 0x4)
setsockopt$inet_tcp_TCP_MD5SIG(r0, 0x6, 0xe,
&(0x7f00000001c0)={@in={{0x2, 0x0, @empty}}, 0x0, 0x2, 0x0,
"c157cf4809151e5e89cfd6d934fbe981ec8ff6afc252ccf486c325c7ff3d35f3a89412a5cb6430e169092617df2ba65bf0ab844572e4e7dd4ece8ec1de5ac1ccd870067b018cb3b1f05f2391d872b67d"},
0xd8)
connect$inet(r0, &(0x7f0000000080)={0x2, 0x0, @dev={0xac, 0x14, 0x14,
0x1d}}, 0x10)
sendto(r0, 0x0, 0x87, 0x0, 0x0, 0x391)

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 2529 Comm: syz-executor709 Not tainted 4.14.143 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
task: ffff8880677fdc00 task.stack: ffff8880642b0000
RIP: 0010:tcp_sendmsg_locked+0x6b4/0x4390 net/ipv4/tcp.c:1350
RSP: 0018:ffff8880642bf718 EFLAGS: 00010206
RAX: 0000000000000014 RBX: 0000000000000087 RCX: ffff88806a794f50
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000a0
RBP: ffff8880642bfaa8 R08: 0000000000000006 R09: ffff8880677fe3a0
R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
R13: ffff88806a794f50 R14: ffff88806a794d00 R15: 0000000000000087
FS:  00007f644b697700(0000) GS:ffff88806cf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffcd37370b0 CR3: 00000000679f2006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 tcp_sendmsg+0x2a/0x40 net/ipv4/tcp.c:1533
 inet_sendmsg+0x173/0x4e0 net/ipv4/af_inet.c:784
 sock_sendmsg_nosec net/socket.c:646 [inline]
 sock_sendmsg+0xc3/0x100 net/socket.c:656
 SYSC_sendto+0x35d/0x5e0 net/socket.c:1766
 do_syscall_64+0x241/0x680 arch/x86/entry/common.c:292
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f644afc6469
RSP: 002b:00007f644b696f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000602130 RCX: 00007f644afc6469
RDX: 0000000000000087 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000602138 R08: 0000000000000000 R09: 0000000000000391
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000060213c
R13: 00007ffcd373700f R14: 00007f644b677000 R15: 0000000000000003
Code: 74 08 3c 03 0f 8e f1 32 00 00 8b 85 98 fd ff ff 89 85 60 fd ff
ff 48 8b 85 70 fd ff ff 48 8d b8 a0 00 00 00 48 89 f8 48 c1 e8 03 <42>
0f b6 04 20 84 c0 74 06 0f 8e d2 32 00 00 4c 8b bd 70 fd ff
RIP: tcp_sendmsg_locked+0x6b4/0x4390 net/ipv4/tcp.c:1350 RSP: ffff8880642bf718
---[ end trace 70f07f242cd3b9d8 ]---


It's because skb is NULL in tcp_sendmsg_locked at:
                  skb = tcp_write_queue_tail(sk);
                  if (tcp_send_head(sk)) {
                          if (skb->ip_summed == CHECKSUM_NONE)


I think we need this here on pre-rb-tree kernels :

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5ce069ce2a97..efe767e20d01 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -924,8 +924,7 @@ static void tcp_remove_empty_skb(struct sock *sk, struct sk_buff *skb)
 {
 	if (skb && !skb->len) {
 		tcp_unlink_write_queue(skb, sk);
-		if (tcp_write_queue_empty(sk))
-			tcp_chrono_stop(sk, TCP_CHRONO_BUSY);
+		tcp_check_send_head(sk, skb);
 		sk_wmem_free_skb(sk, skb);
 	}
 }

Does that look good?


Thanks,
Christoph


* Re: [PATCH] ftgmac100: Disable HW checksum generation on AST2500
From: Vijay Khemka @ 2019-09-11 17:44 UTC (permalink / raw)
  To: Joel Stanley, Florian Fainelli, Benjamin Herrenschmidt
  Cc: David S. Miller, YueHaibing, Andrew Lunn, Kate Stewart,
	Mauro Carvalho Chehab, Luis Chamberlain, Thomas Gleixner,
	netdev@vger.kernel.org, Linux Kernel Mailing List,
	openbmc @ lists . ozlabs . org, linux-aspeed, Sai Dasari
In-Reply-To: <CACPK8XcS4iKfKigPbPg0BFbmjbT-kdyjiPDXjk1k5XaS5bCdAA@mail.gmail.com>



On 9/11/19, 7:49 AM, "Joel Stanley" <joel@jms.id.au> wrote:

    Hi Ben,
    
    On Tue, 10 Sep 2019 at 22:05, Florian Fainelli <f.fainelli@gmail.com> wrote:
    >
    > On 9/10/19 2:37 PM, Vijay Khemka wrote:
    > > HW checksum generation is not working for AST2500, especially with IPV6
    > > over NCSI. All TCP packets with IPv6 get dropped. By disabling this
    > > it works perfectly fine with IPV6.
    > >
    > > Verified with IPV6 enabled and can do ssh.
    >
    > How about IPv4, do these packets have problem? If not, can you continue
    > advertising NETIF_F_IP_CSUM but take out NETIF_F_IPV6_CSUM?
    >
    > >
    > > Signed-off-by: Vijay Khemka <vijaykhemka@fb.com>
    > > ---
    > >  drivers/net/ethernet/faraday/ftgmac100.c | 5 +++--
    > >  1 file changed, 3 insertions(+), 2 deletions(-)
    > >
    > > diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
    > > index 030fed65393e..591c9725002b 100644
    > > --- a/drivers/net/ethernet/faraday/ftgmac100.c
    > > +++ b/drivers/net/ethernet/faraday/ftgmac100.c
    > > @@ -1839,8 +1839,9 @@ static int ftgmac100_probe(struct platform_device *pdev)
    > >       if (priv->use_ncsi)
    > >               netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
    > >
    > > -     /* AST2400  doesn't have working HW checksum generation */
    > > -     if (np && (of_device_is_compatible(np, "aspeed,ast2400-mac")))
    > > +     /* AST2400  and AST2500 doesn't have working HW checksum generation */
    > > +     if (np && (of_device_is_compatible(np, "aspeed,ast2400-mac") ||
    > > +                of_device_is_compatible(np, "aspeed,ast2500-mac")))
    
    Do you recall under what circumstances we need to disable hardware checksumming?
Mainly, TCP packets over IPv6 were getting dropped. After disabling it, everything was working.
    
    Cheers,
    
    Joel
    
    > >               netdev->hw_features &= ~NETIF_F_HW_CSUM;
    > >       if (np && of_get_property(np, "no-hw-checksum", NULL))
    > >               netdev->hw_features &= ~(NETIF_F_HW_CSUM | NETIF_F_RXCSUM);
    > >
    >
    >
    > --
    > Florian
    



* Re: [PATCH net-next 5/5] sctp: add spt_pathcpthld in struct sctp_paddrthlds
From: Xin Long @ 2019-09-11 17:47 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: David Laight, network dev, linux-sctp@vger.kernel.org,
	Neil Horman, davem@davemloft.net
In-Reply-To: <20190911125609.GC3499@localhost.localdomain>

On Wed, Sep 11, 2019 at 8:56 PM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Wed, Sep 11, 2019 at 05:38:33PM +0800, Xin Long wrote:
> > On Wed, Sep 11, 2019 at 5:21 PM Xin Long <lucien.xin@gmail.com> wrote:
> > >
> > > On Wed, Sep 11, 2019 at 5:03 PM David Laight <David.Laight@aculab.com> wrote:
> > > >
> > > > From: Xin Long [mailto:lucien.xin@gmail.com]
> > > > > Sent: 11 September 2019 09:52
> > > > > On Tue, Sep 10, 2019 at 9:19 PM David Laight <David.Laight@aculab.com> wrote:
> > > > > >
> > > > > > From: Xin Long
> > > > > > > Sent: 09 September 2019 08:57
> > > > > > > Section 7.2 of rfc7829: "Peer Address Thresholds (SCTP_PEER_ADDR_THLDS)
> > > > > > > Socket Option" extends 'struct sctp_paddrthlds' with 'spt_pathcpthld'
> > > > > > > added to allow a user to change ps_retrans per sock/asoc/transport, as
> > > > > > > other 2 paddrthlds: pf_retrans, pathmaxrxt.
> > > > > > >
> > > > > > > > Note that ps_retrans is not allowed to be smaller than pf_retrans.
> > > > > > >
> > > > > > > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > > > > > > ---
> > > > > > >  include/uapi/linux/sctp.h |  1 +
> > > > > > >  net/sctp/socket.c         | 10 ++++++++++
> > > > > > >  2 files changed, 11 insertions(+)
> > > > > > >
> > > > > > > diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
> > > > > > > index a15cc28..dfd81e1 100644
> > > > > > > --- a/include/uapi/linux/sctp.h
> > > > > > > +++ b/include/uapi/linux/sctp.h
> > > > > > > @@ -1069,6 +1069,7 @@ struct sctp_paddrthlds {
> > > > > > >       struct sockaddr_storage spt_address;
> > > > > > >       __u16 spt_pathmaxrxt;
> > > > > > >       __u16 spt_pathpfthld;
> > > > > > > +     __u16 spt_pathcpthld;
> > > > > > >  };
> > > > > > >
> > > > > > >  /*
> > > > > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > > > > > index 5e2098b..5b9774d 100644
> > > > > > > --- a/net/sctp/socket.c
> > > > > > > +++ b/net/sctp/socket.c
> > > > > > > @@ -3954,6 +3954,9 @@ static int sctp_setsockopt_paddr_thresholds(struct sock *sk,
> > > > > >
> > > > > > This code does:
> > > > > >         if (optlen < sizeof(struct sctp_paddrthlds))
> > > > > >                 return -EINVAL;
> > > > > here will become:
> > > > >
> > > > > >         if (optlen >= sizeof(struct sctp_paddrthlds)) {
> > > > > >                 optlen = sizeof(struct sctp_paddrthlds);
> > > > > >         } else if (optlen >= ALIGN(offsetof(struct sctp_paddrthlds,
> > > > > >                                             spt_pathcpthld), 4)) {
> > > > > >                 optlen = ALIGN(offsetof(struct sctp_paddrthlds,
> > > > > >                                         spt_pathcpthld), 4);
> > > > > >                 val.spt_pathcpthld = 0xffff;
> > > > > >         } else {
> > > > > >                 return -EINVAL;
> > > > > >         }
> > > >
> > > > Hmmm...
> > > > If the kernel has to default 'val.spt_pathcpthld = 0xffff'
> > > > then recompiling an existing application with the new uapi
> > > > header is going to lead to very unexpected behaviour.
> > > >
> > > > The best you can hope for is that the application memset the
> > > > structure to zero.
> > > > But more likely it is 'random' on-stack data.
> > > 0xffff is a value to disable the feature 'Primary Path Switchover'.
> > > You're right that users might unexpectedly set it to zero when their
> > > old application is rebuilt.
> > >
> > > A safer way is to introduce "sysctl net.sctp.ps_retrans", it won't
> > > matter if users set spt_pathcpthld properly when they're not aware
> > > of this feature, like "sysctl net.sctp.pF_retrans". Looks better?
> > Sorry for the confusion, "sysctl net.sctp.ps_retrans" is already there
> > (its value is 0xffff by default),
> > we just need to do this in sctp_setsockopt_paddr_thresholds():
> >
> >         if (copy_from_user(&val, (struct sctp_paddrthlds __user *)optval,
> >                            optlen))
> >                 return -EFAULT;
> >
> >         if (sock_net(sk)->sctp.ps_retrans == 0xffff)
> >                 val.spt_pathcpthld = 0xffff;
>
> I'm confused with the snippets, but if I got them right, this is after
> dealing with proper len and could leave val.spt_pathcpthld
> uninitialized if the application used the old format and sysctl is !=
> 0xffff.
right, how about this in sctp_setsockopt_paddr_thresholds():

        offset = ALIGN(offsetof(struct sctp_paddrthlds, spt_pathcpthld), 4);
        if (optlen < offset)
                return -EINVAL;
        if (optlen < sizeof(val) || sock_net(sk)->sctp.ps_retrans == 0xffff) {
                optlen = offset;
                val.spt_pathcpthld = 0xffff;
        } else {
                optlen = sizeof(val);
        }

        if (copy_from_user(&val, (struct sctp_paddrthlds __user *)optval,
                           optlen))
                return -EFAULT;

        if (val.spt_pathpfthld > val.spt_pathcpthld)
                return -EINVAL;

This means we will 'skip' spt_pathcpthld if the caller is using the old
format, or if ps_retrans is disabled and the caller is using the new format.
Note that ps_retrans < pf_retrans is not allowed by RFC 7829.

and in sctp_getsockopt_paddr_thresholds():

        offset = ALIGN(offsetof(struct sctp_paddrthlds, spt_pathcpthld), 4);
        if (len < offset)
                return -EINVAL;
        if (len < sizeof(val) || sock_net(sk)->sctp.ps_retrans == 0xffff)
                len = offset;
        else
                len = sizeof(val);

        if (copy_from_user(&val, (struct sctp_paddrthlds __user *)optval, len))
                return -EFAULT;
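
For illustration only, the length handling proposed above for the setsockopt
path can be modelled in plain userspace C. This is a minimal sketch, not the
kernel code: struct thlds is a hypothetical stand-in for struct
sctp_paddrthlds (the real layout differs), memcpy() stands in for
copy_from_user(), and ALIGN4, PS_DISABLED and parse_thlds() are names made up
for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sctp_paddrthlds; the real layout differs. */
struct thlds {
	uint32_t assoc_id;
	uint16_t pathmaxrxt;
	uint16_t pathpfthld;
	uint16_t pathcpthld;	/* field added by the new uapi header */
};

#define ALIGN4(x)	(((x) + 3u) & ~(size_t)3u)
#define PS_DISABLED	0xffff

/* Returns 0 or a negative errno; memcpy() models copy_from_user(). */
int parse_thlds(struct thlds *val, const void *optval, size_t optlen,
		uint16_t ps_retrans)
{
	/* Size of the old format: everything before the new field. */
	size_t offset = ALIGN4(offsetof(struct thlds, pathcpthld));

	assert(val != NULL && optval != NULL);
	if (optlen < offset)
		return -EINVAL;

	memset(val, 0, sizeof(*val));
	if (optlen < sizeof(*val) || ps_retrans == PS_DISABLED) {
		/* Old caller or feature disabled: ignore the new field. */
		optlen = offset;
		val->pathcpthld = PS_DISABLED;
	} else {
		optlen = sizeof(*val);
	}
	/* Copies only optlen bytes, so the default above survives. */
	memcpy(val, optval, optlen);

	if (val->pathpfthld > val->pathcpthld)
		return -EINVAL;
	return 0;
}
```

The point of the sketch is that the default is written before the copy and the
copy is truncated to the old size, so an old caller can never leave the new
field uninitialized.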


>
> >
> >         if (val.spt_pathpfthld > val.spt_pathcpthld)
> >                 return -EINVAL;
> >
> > >
> > > >
> > > >         David
> > > >
> > > > -
> > > > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> > > > Registration No: 1397386 (Wales)
> >


* Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())
From: Kalle Valo @ 2019-09-11 18:10 UTC
  To: Linus Torvalds
  Cc: Johannes Berg, David S. Miller, linux-wireless, Netdev,
	Linux List Kernel Mailing, ath10k
In-Reply-To: <CAHk-=wgBuu8PiYpD7uWgxTSY8aUOJj6NJ=ivNQPYjAKO=cRinA@mail.gmail.com>

+ ath10k list

Linus Torvalds <torvalds@linux-foundation.org> writes:

> So I'm at LCA, reading email, using my laptop more than I normally do,
> and with different networking than I normally do.
>
> And I just had a 802.11 WARN_ON() trigger, followed by essentially a
> dead machine due to some lock held (maybe rtnl_lock).
>
> It's possible that the lock held thing happened before, and is the
> _reason_ for the delay, I don't know. I had to reboot the machine, but
> I gathered as much information as made sense and was obvious before I
> did so. That's appended.

Some notes while investigating this:

> But wait!
>
> ... then 10+ minutes later:
>
>    ath10k_pci 0000:02:00.0: wmi command 16387 timeout, restarting hardware
>    ath10k_pci 0000:02:00.0: failed to set 5g txpower 23: -11
>    ath10k_pci 0000:02:00.0: failed to setup tx power 23: -11
>    ath10k_pci 0000:02:00.0: failed to recalc tx power: -11
>    ath10k_pci 0000:02:00.0: failed to set inactivity time for vdev 0: -108
>    ath10k_pci 0000:02:00.0: failed to setup powersave: -108
>
> That certainly looks like something did try to set a power limit, but
> eventually failed.

I suspect the failing WMI command is called from:

ath10k_bss_info_changed()
ath10k_mac_txpower_recalc()
ath10k_mac_txpower_setup()
ath10k_wmi_pdev_set_param()
ath10k_wmi_cmd_send()
ath10k_wmi_cmd_send_nowait()
ath10k_htc_send()

-11 is -EAGAIN, which would mean that the HTC credits have run out for
some reason for the WMI command:

if (ep->tx_credits < credits) {
        ath10k_dbg(ar, ATH10K_DBG_HTC,
                "htc insufficient credits ep %d required %d available %d\n",
                eid, credits, ep->tx_credits);
        spin_unlock_bh(&htc->tx_lock);
        ret = -EAGAIN;
        goto err_pull;
}

Credits can run out, for example, if there is a lot of WMI command/event
activity and the credits are not returned during the 3s wait, if the
firmware has crashed, or if there are problems with the PCI bus. But when
the WMI command timeout happens, ath10k is supposed to restart the firmware
and everything should be usable again.
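
As a rough illustration of the credit mechanism described above, here is a
minimal userspace sketch. It is not the ath10k implementation: struct
endpoint, ep_send() and ep_credit_report() are hypothetical names, and
locking is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical endpoint; only models the credit counter, not real HTC state. */
struct endpoint {
	int tx_credits;
};

/* Consume credits for one command, or fail with -EAGAIN if too few remain. */
int ep_send(struct endpoint *ep, int credits)
{
	assert(ep != NULL && credits > 0);
	if (ep->tx_credits < credits)
		return -EAGAIN;
	ep->tx_credits -= credits;
	return 0;
}

/* Called when the firmware reports that it has consumed earlier commands. */
void ep_credit_report(struct endpoint *ep, int credits)
{
	assert(ep != NULL);
	ep->tx_credits += credits;
}
```

If the firmware stops reporting credits (crash, bus problem), every later
ep_send() in this model keeps failing with -EAGAIN, which matches the -11
errors in the log.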
                                             
> Immediately after that:
>
>    wlp2s0: deauthenticating from 54:ec:2f:05:70:2c by local choice
> (Reason: 3=DEAUTH_LEAVING)
>    ath10k_pci 0000:02:00.0: failed to read hi_board_data address: -16
>    ath10k_pci 0000:02:00.0: failed to receive initialized event from
> target: 00000000
>    ath10k_pci 0000:02:00.0: failed to receive initialized event from
> target: 00000000
>    ath10k_pci 0000:02:00.0: failed to wait for target init: -110

I suspect here ath10k tries to reset the target during stop operation,
"failed to receive initialized event from target" comes from:

ath10k_pci_hif_stop()
ath10k_pci_safe_chip_reset()
ath10k_pci_warm_reset()
ath10k_pci_wait_for_target_init()

It shouldn't fail like that, which makes me suspect either a low level
problem or a bug in qca6174 firmware restart code. To check the latter,
could you please try to force a firmware crash and see if firmware
restart is working for you?

To crash the firmware you need to write either "hard" or "assert" (I
forgot which one QCA6174 firmware supports) to
/sys/kernel/debug/ieee80211/phy*/ath10k/simulate_fw_crash. And what
should happen is that the firmware crashes, ath10k prints a big pile of
warnings, restarts it and in few seconds everything resumes to normal
without user space even noticing it.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* [PATCH bpf] bpf: respect CAP_IPC_LOCK in RLIMIT_MEMLOCK check
From: Christian Barcenas @ 2019-09-11 18:18 UTC
  To: Alexei Starovoitov, Daniel Borkmann, netdev
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, Christian Barcenas,
	bpf

A process can lock memory addresses into physical RAM explicitly
(via mlock, mlockall, shmctl, etc.) or implicitly (via VFIO,
perf ring-buffers, bpf maps, etc.), subject to RLIMIT_MEMLOCK limits.

CAP_IPC_LOCK allows a process to exceed these limits, and throughout
the kernel this capability is checked before allowing/denying an attempt
to lock memory regions into RAM.

Because bpf locks its programs and maps into RAM, it should respect
CAP_IPC_LOCK. Previously, bpf would return EPERM when RLIMIT_MEMLOCK was
exceeded by a privileged process, which is contrary to documented
RLIMIT_MEMLOCK+CAP_IPC_LOCK behavior.

Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
Signed-off-by: Christian Barcenas <christian@cbarcenas.com>
---
 kernel/bpf/syscall.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 272071e9112f..e551961f364b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -183,8 +183,9 @@ void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr)
 static int bpf_charge_memlock(struct user_struct *user, u32 pages)
 {
 	unsigned long memlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+	unsigned long locked = atomic_long_add_return(pages, &user->locked_vm);
 
-	if (atomic_long_add_return(pages, &user->locked_vm) > memlock_limit) {
+	if (locked > memlock_limit && !capable(CAP_IPC_LOCK)) {
 		atomic_long_sub(pages, &user->locked_vm);
 		return -EPERM;
 	}
@@ -1231,7 +1232,7 @@ int __bpf_prog_charge(struct user_struct *user, u32 pages)
 
 	if (user) {
 		user_bufs = atomic_long_add_return(pages, &user->locked_vm);
-		if (user_bufs > memlock_limit) {
+		if (user_bufs > memlock_limit && !capable(CAP_IPC_LOCK)) {
 			atomic_long_sub(pages, &user->locked_vm);
 			return -EPERM;
 		}
-- 
2.23.0
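
For readers who want the accounting rule from the patch above in isolation,
here is a minimal userspace sketch. It is an illustration under assumptions:
struct user and charge_memlock() are hypothetical, a plain long replaces the
atomic counter, and the cap_ipc_lock flag stands in for
capable(CAP_IPC_LOCK).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-user accounting; the kernel uses an atomic counter. */
struct user {
	long locked_pages;
};

/* Charge pages against limit; a capable caller may exceed the limit. */
int charge_memlock(struct user *u, long pages, long limit, bool cap_ipc_lock)
{
	assert(u != NULL && pages >= 0);
	u->locked_pages += pages;
	if (u->locked_pages > limit && !cap_ipc_lock) {
		u->locked_pages -= pages;	/* undo the charge on failure */
		return -EPERM;
	}
	return 0;
}
```

Note that in this model a privileged caller is still charged, so the counter
can sit above the limit afterwards; only the rejection is skipped.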


* Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())
From: Kalle Valo @ 2019-09-11 18:19 UTC
  To: Johannes Berg
  Cc: Linus Torvalds, David S. Miller, Kalle Valo, linux-wireless,
	Netdev, Linux List Kernel Mailing, ath10k
In-Reply-To: <feecebfcceba521703f13c8ee7f5bb9016924cb6.camel@sipsolutions.net>

Johannes Berg <johannes@sipsolutions.net> writes:

>>    ath10k_pci 0000:02:00.0: wmi command 16387 timeout, restarting hardware
>>    ath10k_pci 0000:02:00.0: failed to set 5g txpower 23: -11
>>    ath10k_pci 0000:02:00.0: failed to setup tx power 23: -11
>>    ath10k_pci 0000:02:00.0: failed to recalc tx power: -11
>>    ath10k_pci 0000:02:00.0: failed to set inactivity time for vdev 0: -108
>>    ath10k_pci 0000:02:00.0: failed to setup powersave: -108
>> 
>> That certainly looks like something did try to set a power limit, but
>> eventually failed.
>
> Yeah, that does seem a bit fishy. Kalle would have to comment for
> ath10k.
>
>> Immediately after that:
>> 
>>    wlp2s0: deauthenticating from 54:ec:2f:05:70:2c by local choice
>> (Reason: 3=DEAUTH_LEAVING)
>
> I don't _think_ any of the above would be a reason to disconnect, but it
> clearly looks like the device got stuck at this point, since everything
> just fails afterwards.

Yeah, to me it looks like anything ath10k tries to do with the device fails,
even resetting the device.

> Looks like indeed the driver gives the device at least *3 seconds* for
> every command, see ath10k_wmi_cmd_send(), so most likely this would
> eventually have finished, but who knows how many firmware commands it
> would still have attempted to send...

3 seconds is a bit short but in normal cases it should be enough. Of
course we could increase the delay but I'm skeptical it would help here.

> Perhaps the driver should mark the device as dead and fail quickly once
> it timed out once, or so, but I'll let Kalle comment on that.

Actually we do try to restart the device when a timeout happens in
ath10k_wmi_cmd_send():

        if (ret == -EAGAIN) {
                ath10k_warn(ar, "wmi command %d timeout, restarting hardware\n",
                            cmd_id);
                queue_work(ar->workqueue, &ar->restart_work);
        }
                        

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())
From: Johannes Berg @ 2019-09-11 18:23 UTC
  To: Kalle Valo
  Cc: Linus Torvalds, David S. Miller, linux-wireless, Netdev,
	Linux List Kernel Mailing, ath10k
In-Reply-To: <87ef0mlmqg.fsf@tynnyri.adurom.net>

On Wed, 2019-09-11 at 21:19 +0300, Kalle Valo wrote:
> > Looks like indeed the driver gives the device at least *3 seconds* for
> > every command, see ath10k_wmi_cmd_send(), so most likely this would
> > eventually have finished, but who knows how many firmware commands it
> > would still have attempted to send...
> 
> 3 seconds is a bit short but in normal cases it should be enough. Of
> course we could increase the delay but I'm skeptic it would help here.

I was thinking 3 seconds is way too long :-)

> > Perhaps the driver should mark the device as dead and fail quickly once
> > it timed out once, or so, but I'll let Kalle comment on that.
> 
> Actually we do try to restart the device when a timeout happens in
> ath10k_wmi_cmd_send():
> 
>         if (ret == -EAGAIN) {
>                 ath10k_warn(ar, "wmi command %d timeout, restarting hardware\n",
>                             cmd_id);
>                 queue_work(ar->workqueue, &ar->restart_work);
>         }

Yeah, and this is the problem, in a sense, I'd think. It seems to me
that at this point the code needs to tag the device as "dead" and
immediately return from any further commands submitted to it with an
error (e.g. -EIO). You can actually see in the initial report that
while the restart was triggered, it too is waiting to acquire the RTNL:

>    Workqueue: events_freezable ieee80211_restart_work [mac80211]
>    Call Trace:
>     schedule+0x39/0xa0
>     schedule_preempt_disabled+0xa/0x10
>     __mutex_lock.isra.0+0x263/0x4b0
>     ieee80211_restart_work+0x54/0xe0 [mac80211]
>     process_one_work+0x1cf/0x370
>     worker_thread+0x4a/0x3c0
>     kthread+0xfb/0x130
>     ret_from_fork+0x35/0x40


So basically all this delay is mac80211 and the driver doing stuff with
the device, but every single thing has to time out and probably some
stuff loops etc., and then it just takes long enough with the RTNL held
that everything goes south.
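
To make the "tag the device as dead" idea concrete, here is a minimal
userspace sketch. It is only a model of the suggestion, not driver code:
struct dev, dev_cmd() and dev_restarted() are hypothetical, the hw_ok
argument stands in for "the hardware answered before the timeout", and
-EAGAIN on timeout simply mirrors the quoted snippet.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state for the sketch. */
struct dev {
	bool dead;
	int attempts;	/* commands that actually reached the hardware */
};

/* hw_ok models whether the hardware answers before the timeout expires. */
int dev_cmd(struct dev *d, bool hw_ok)
{
	assert(d != NULL);
	if (d->dead)
		return -EIO;	/* fail fast: no further multi-second wait */
	d->attempts++;
	if (!hw_ok) {
		d->dead = true;	/* the first timeout marks the device dead */
		return -EAGAIN;
	}
	return 0;
}

/* Called once a restart has completed and the device is usable again. */
void dev_restarted(struct dev *d)
{
	assert(d != NULL);
	d->dead = false;
}
```

In this model only the first failing command pays for a timeout; every later
one returns immediately, so a lock such as the RTNL would not be held across
a long series of timeouts.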

johannes



* Re: [PATCH] ftgmac100: Disable HW checksum generation on AST2500
From: Vijay Khemka @ 2019-09-11 18:30 UTC
  To: Florian Fainelli, David S. Miller, YueHaibing, Andrew Lunn,
	Kate Stewart, Mauro Carvalho Chehab, Luis Chamberlain,
	Thomas Gleixner, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: openbmc @ lists . ozlabs . org, Sai Dasari,
	linux-aspeed@lists.ozlabs.org
In-Reply-To: <D79D04CC-4A02-4E51-8FDF-48B7C7EB6CC2@fb.com>



On 9/10/19, 4:08 PM, "Linux-aspeed on behalf of Vijay Khemka" <linux-aspeed-bounces+vijaykhemka=fb.com@lists.ozlabs.org on behalf of vijaykhemka@fb.com> wrote:

    
    
    On 9/10/19, 3:50 PM, "Linux-aspeed on behalf of Vijay Khemka" <linux-aspeed-bounces+vijaykhemka=fb.com@lists.ozlabs.org on behalf of vijaykhemka@fb.com> wrote:
    
        
        
        On 9/10/19, 3:05 PM, "Florian Fainelli" <f.fainelli@gmail.com> wrote:
        
            On 9/10/19 2:37 PM, Vijay Khemka wrote:
            > HW checksum generation is not working for AST2500, specially with IPV6
            > over NCSI. All TCP packets with IPv6 get dropped. By disabling this
            > it works perfectly fine with IPV6.
            > 
            > Verified with IPV6 enabled and can do ssh.
            
            How about IPv4, do these packets have problem? If not, can you continue
            advertising NETIF_F_IP_CSUM but take out NETIF_F_IPV6_CSUM?
        
        I changed the code from (netdev->hw_features &= ~NETIF_F_HW_CSUM) to
        (netdev->hw_features &= ~NETIF_F_IPV6_CSUM), and it does not work.
        I don't know why. IPv4 works without any change, but IPv6 needs HW_CSUM
        disabled.
    
    Now I changed to
    netdev->hw_features &= (~NETIF_F_HW_CSUM) | NETIF_F_IP_CSUM;
    And it works.

I investigated these features further and found that we cannot set NETIF_F_IP_CSUM
while NETIF_F_HW_CSUM is set. So I disabled NETIF_F_HW_CSUM first and enabled
NETIF_F_IP_CSUM in the next statement, and it works fine.

But as per line 166 in include/linux/skbuff.h:
 *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
 *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
 *   checksum offload capability.

Please suggest which of the 2 options below I should choose, as both work for me.
1. Disable NETIF_F_HW_CSUM completely and do nothing else. This is the original patch.
2. Enable NETIF_F_IP_CSUM in addition to 1. I can send a v2 if this is accepted.
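
As a side note on the two variants discussed above, here is a small userspace
sketch of the bit arithmetic. The bit positions are hypothetical and chosen
only for illustration (they are not the kernel's values), and mask_only() and
clear_then_set() are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions for illustration; not the kernel's values. */
#define F_IP_CSUM	(UINT64_C(1) << 0)
#define F_HW_CSUM	(UINT64_C(1) << 1)

/* Single statement: the AND mask clears F_HW_CSUM but cannot set F_IP_CSUM. */
uint64_t mask_only(uint64_t features)
{
	features &= ~F_HW_CSUM | F_IP_CSUM;
	return features;
}

/* Two statements: clear generic offload, then advertise IPv4-only offload. */
uint64_t clear_then_set(uint64_t features)
{
	assert((F_HW_CSUM & F_IP_CSUM) == 0);	/* the two bits are distinct */
	features &= ~F_HW_CSUM;
	features |= F_IP_CSUM;
	return features;
}
```

With F_HW_CSUM as the only bit set, mask_only() returns 0 while
clear_then_set() returns F_IP_CSUM. So in this model the single "&="
statement behaves like option 1, and only the two-statement form corresponds
to option 2.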
            
            > 
            > Signed-off-by: Vijay Khemka <vijaykhemka@fb.com>
            > ---
            >  drivers/net/ethernet/faraday/ftgmac100.c | 5 +++--
            >  1 file changed, 3 insertions(+), 2 deletions(-)
            > 
            > diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
            > index 030fed65393e..591c9725002b 100644
            > --- a/drivers/net/ethernet/faraday/ftgmac100.c
            > +++ b/drivers/net/ethernet/faraday/ftgmac100.c
            > @@ -1839,8 +1839,9 @@ static int ftgmac100_probe(struct platform_device *pdev)
            >  	if (priv->use_ncsi)
            >  		netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
            >  
            > -	/* AST2400  doesn't have working HW checksum generation */
            > -	if (np && (of_device_is_compatible(np, "aspeed,ast2400-mac")))
            > +	/* AST2400  and AST2500 doesn't have working HW checksum generation */
            > +	if (np && (of_device_is_compatible(np, "aspeed,ast2400-mac") ||
            > +		   of_device_is_compatible(np, "aspeed,ast2500-mac")))
            >  		netdev->hw_features &= ~NETIF_F_HW_CSUM;
            >  	if (np && of_get_property(np, "no-hw-checksum", NULL))
            >  		netdev->hw_features &= ~(NETIF_F_HW_CSUM | NETIF_F_RXCSUM);
            > 
            
            
            -- 
            Florian
            
        
        
    
    



* Re: [PATCH] ftgmac100: Disable HW checksum generation on AST2500
From: Florian Fainelli @ 2019-09-11 18:34 UTC
  To: Vijay Khemka, David S. Miller, YueHaibing, Andrew Lunn,
	Kate Stewart, Mauro Carvalho Chehab, Luis Chamberlain,
	Thomas Gleixner, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: openbmc @ lists . ozlabs . org, Sai Dasari,
	linux-aspeed@lists.ozlabs.org
In-Reply-To: <8A8392C8-5E5E-444D-AB1B-E0FAD3C29425@fb.com>

On 9/11/19 11:30 AM, Vijay Khemka wrote:
> 
> 
> On 9/10/19, 4:08 PM, "Linux-aspeed on behalf of Vijay Khemka" <linux-aspeed-bounces+vijaykhemka=fb.com@lists.ozlabs.org on behalf of vijaykhemka@fb.com> wrote:
> 
>     
>     
>     On 9/10/19, 3:50 PM, "Linux-aspeed on behalf of Vijay Khemka" <linux-aspeed-bounces+vijaykhemka=fb.com@lists.ozlabs.org on behalf of vijaykhemka@fb.com> wrote:
>     
>         
>         
>         On 9/10/19, 3:05 PM, "Florian Fainelli" <f.fainelli@gmail.com> wrote:
>         
>             On 9/10/19 2:37 PM, Vijay Khemka wrote:
>             > HW checksum generation is not working for AST2500, specially with IPV6
>             > over NCSI. All TCP packets with IPv6 get dropped. By disabling this
>             > it works perfectly fine with IPV6.
>             > 
>             > Verified with IPV6 enabled and can do ssh.
>             
>             How about IPv4, do these packets have problem? If not, can you continue
>             advertising NETIF_F_IP_CSUM but take out NETIF_F_IPV6_CSUM?
>         
>         I changed code from (netdev->hw_features &= ~NETIF_F_HW_CSUM) to 
>         (netdev->hw_features &= ~NETIF_F_ IPV6_CSUM). And it is not working. 
>         Don't know why. IPV4 works without any change but IPv6 needs HW_CSUM
>         Disabled.
>     
>     Now I changed to
>     netdev->hw_features &= (~NETIF_F_HW_CSUM) | NETIF_F_IP_CSUM;
>     And it works.
> 
> I investigated more on these features and found that we cannot set NETIF_F_IP_CSUM 
> While NETIF_F_HW_CSUM is set. So I disabled NETIF_F_HW_CSUM first and enabled
> NETIF_F_IP_CSUM in next statement. And it works fine.
> 
> But as per line 166 in include/linux/skbuff.h,  
> *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
>  *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
>  *   checksum offload capability.
> 
> Please suggest which of below 2 I should do. As both works for me.
> 1. Disable completely NETIF_F_HW_CSUM and do nothing. This is original patch.
> 2. Enable NETIF_F_IP_CSUM in addition to 1. I can have v2 if this is accepted.

Sounds like 2 would leave the option of IPv4 checksum offload, so that
would be a better middle ground than flat out disabling checksum offload
for both IPv4 and IPv6, no?
-- 
Florian


* [Patch net] sch_sfb: fix a crash in sfb_destroy()
From: Cong Wang @ 2019-09-11 18:34 UTC
  To: netdev
  Cc: Cong Wang, syzbot+d5870a903591faaca4ae, Linus Torvalds,
	Jamal Hadi Salim, Jiri Pirko

When tcf_block_get() fails in sfb_init(), q->qdisc is still a NULL
pointer which leads to a crash in sfb_destroy().

Linus suggested three solutions for this problem. The simplest fix
is to move the noop_qdisc assignment before tcf_block_get()
so that qdisc_put() becomes a nop.

Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
Reported-by: syzbot+d5870a903591faaca4ae@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/sch_sfb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c
index 1dff8506a715..db1c8eb521a2 100644
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -552,11 +552,11 @@ static int sfb_init(struct Qdisc *sch, struct nlattr *opt,
 	struct sfb_sched_data *q = qdisc_priv(sch);
 	int err;
 
+	q->qdisc = &noop_qdisc;
+
 	err = tcf_block_get(&q->block, &q->filter_list, sch, extack);
 	if (err)
 		return err;
-
-	q->qdisc = &noop_qdisc;
 	return sfb_change(sch, opt, extack);
 }
 
-- 
2.21.0
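
The ordering fix in the patch above can be sketched in userspace C as
follows. This is a simplified model under assumptions: struct sched, struct
child, noop_child and the helper functions are hypothetical stand-ins, and
child_put() is modelled as requiring a non-NULL pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical child object with a toy reference count. */
struct child {
	int refs;
};

/* Shared placeholder child, playing the role noop_qdisc plays in the patch. */
struct child noop_child = { .refs = 0 };

struct sched {
	struct child *child;
};

/* In this model the put helper requires non-NULL and skips the placeholder. */
void child_put(struct child *c)
{
	assert(c != NULL);
	if (c != &noop_child)
		c->refs--;
}

/* block_err models the result of the fallible setup step. */
int sched_init(struct sched *q, int block_err)
{
	q->child = &noop_child;	/* assign first so destroy is always safe */
	if (block_err)
		return block_err;
	return 0;
}

/* Runs even when sched_init() failed, as destroy does in the report. */
void sched_destroy(struct sched *q)
{
	child_put(q->child);
}
```

Because the placeholder is assigned before anything can fail, the destroy
path never sees a NULL child, whichever step of init returned the error.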



* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
From: Jeff Kirsher @ 2019-09-11 18:45 UTC
  To: Michael Marley, Steffen Klassert; +Cc: Shannon Nelson, netdev
In-Reply-To: <12d6d2313eeb61a51731a2ba9b1fa9bf@michaelmarley.com>


On Wed, 2019-09-11 at 10:50 -0400, Michael Marley wrote:
> On 2019-09-11 02:15, Steffen Klassert wrote:
> > On Tue, Sep 10, 2019 at 06:53:30PM -0400, Michael Marley wrote:
> > > StrongSwan has hardware offload disabled by default, and I didn't
> > > enable it explicitly.  I also already tried turning off all those
> > > switches with ethtool and it has no effect.  This doesn't surprise
> > > me though, because as I said, I don't actually have the IPSec
> > > connection running over the ixgbe device.  The IPSec connection
> > > runs over another network adapter that doesn't support IPSec
> > > offload at all.  The problem comes when traffic received over the
> > > IPSec interface is then routed back out (unencrypted) through the
> > > ixgbe device into the local network.
> > 
> > Seems like the ixgbe driver tries to use the sec_path
> > from RX to setup an offload at the TX side.
> > 
> > Can you please try this (completely untested) patch?

Steffen, can you send your patch to intel-wired-lan@lists.osuosl.org
mailing list?

> > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > index 9bcae44e9883..ae31bd57127c 100644
> > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > @@ -36,6 +36,7 @@
> >  #include <net/vxlan.h>
> >  #include <net/mpls.h>
> >  #include <net/xdp_sock.h>
> > +#include <net/xfrm.h>
> > 
> >  #include "ixgbe.h"
> >  #include "ixgbe_common.h"
> > @@ -8696,7 +8697,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
> >  #endif /* IXGBE_FCOE */
> > 
> >  #ifdef CONFIG_IXGBE_IPSEC
> > -	if (secpath_exists(skb) &&
> > +	if (xfrm_offload(skb) &&
> >  	    !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
> >  		goto out_drop;
> >  #endif
> With the patch, the problem is gone.  Thanks!
> 
> Michael
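
For illustration, the difference between the two conditions in the patch
above can be modelled roughly like this. It is a heavily simplified sketch
and an assumption about the helpers' intent: struct sec_path_model, struct
pkt_model and both functions are hypothetical, and the real struct sec_path
and xfrm_offload() are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of a packet's security path. */
struct sec_path_model {
	int len;	/* transforms applied to the packet */
	int olen;	/* how many of those carry hardware offload state */
};

struct pkt_model {
	const struct sec_path_model *sp;	/* NULL for plain traffic */
};

/* Old TX test: true for any packet that carries a security path at all. */
bool pkt_has_secpath(const struct pkt_model *p)
{
	assert(p != NULL);
	return p->sp != NULL;
}

/* New TX test: true only when offload state is actually attached. */
bool pkt_has_offload(const struct pkt_model *p)
{
	assert(p != NULL);
	return p->sp != NULL && p->sp->olen > 0;
}
```

A packet that was decrypted in software on another adapter and then routed
out through this device has a security path but no offload state, so in this
model only the old test would send it down the IPsec TX path.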




* [v2 1/3] samples: pktgen: make variable consistent with option
From: Daniel T. Lee @ 2019-09-11 18:48 UTC
  To: Jesper Dangaard Brouer, David S . Miller; +Cc: netdev

This commit changes variable names that can cause confusion.

For example, variable DST_MIN is quite confusing since both the
keyword 'udp_dst_min' and the keyword 'dst_min' are used with pg_ctrl.

In the following commit, 'dst_min' will be used to set destination IP,
so the existing variable name DST_MIN should be changed.

Variable names are matched to the exact keyword used with pg_ctrl.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
---
 .../pktgen_bench_xmit_mode_netif_receive.sh      |  8 ++++----
 .../pktgen/pktgen_bench_xmit_mode_queue_xmit.sh  |  8 ++++----
 samples/pktgen/pktgen_sample01_simple.sh         | 16 ++++++++--------
 samples/pktgen/pktgen_sample02_multiqueue.sh     | 16 ++++++++--------
 .../pktgen/pktgen_sample03_burst_single_flow.sh  |  8 ++++----
 samples/pktgen/pktgen_sample04_many_flows.sh     |  8 ++++----
 .../pktgen/pktgen_sample05_flow_per_thread.sh    |  8 ++++----
 ...en_sample06_numa_awared_queue_irq_affinity.sh | 16 ++++++++--------
 8 files changed, 44 insertions(+), 44 deletions(-)

diff --git a/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh b/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
index e14b1a9144d9..9b74502c58f7 100755
--- a/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
+++ b/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
@@ -42,8 +42,8 @@ fi
 [ -z "$BURST" ] && BURST=1024
 [ -z "$COUNT" ] && COUNT="10000000" # Zero means indefinitely
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # Base Config
@@ -76,8 +76,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
 	pg_set $dev "flag UDPDST_RND"
-	pg_set $dev "udp_dst_min $DST_MIN"
-	pg_set $dev "udp_dst_max $DST_MAX"
+	pg_set $dev "udp_dst_min $UDP_DST_MIN"
+	pg_set $dev "udp_dst_max $UDP_DST_MAX"
     fi
 
     # Inject packet into RX path of stack
diff --git a/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh b/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh
index 82c3e504e056..0f332555b40d 100755
--- a/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh
+++ b/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh
@@ -25,8 +25,8 @@ if [[ -n "$BURST" ]]; then
 fi
 [ -z "$COUNT" ] && COUNT="10000000" # Zero means indefinitely
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # Base Config
@@ -59,8 +59,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
 	pg_set $dev "flag UDPDST_RND"
-	pg_set $dev "udp_dst_min $DST_MIN"
-	pg_set $dev "udp_dst_max $DST_MAX"
+	pg_set $dev "udp_dst_min $UDP_DST_MIN"
+	pg_set $dev "udp_dst_max $UDP_DST_MAX"
     fi
 
     # Inject packet into TX qdisc egress path of stack
diff --git a/samples/pktgen/pktgen_sample01_simple.sh b/samples/pktgen/pktgen_sample01_simple.sh
index d1702fdde8f3..063ec0998906 100755
--- a/samples/pktgen/pktgen_sample01_simple.sh
+++ b/samples/pktgen/pktgen_sample01_simple.sh
@@ -23,16 +23,16 @@ fi
 [ -z "$DST_MAC" ] && usage && err 2 "Must specify -m dst_mac"
 [ -z "$COUNT" ]   && COUNT="100000" # Zero means indefinitely
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # Base Config
 DELAY="0"        # Zero means max speed
 
 # Flow variation random source port between min and max
-UDP_MIN=9
-UDP_MAX=109
+UDP_SRC_MIN=9
+UDP_SRC_MAX=109
 
 # General cleanup everything since last run
 # (especially important if other threads were configured by other scripts)
@@ -66,14 +66,14 @@ pg_set $DEV "dst$IP6 $DEST_IP"
 if [ -n "$DST_PORT" ]; then
     # Single destination port or random port range
     pg_set $DEV "flag UDPDST_RND"
-    pg_set $DEV "udp_dst_min $DST_MIN"
-    pg_set $DEV "udp_dst_max $DST_MAX"
+    pg_set $DEV "udp_dst_min $UDP_DST_MIN"
+    pg_set $DEV "udp_dst_max $UDP_DST_MAX"
 fi
 
 # Setup random UDP port src range
 pg_set $DEV "flag UDPSRC_RND"
-pg_set $DEV "udp_src_min $UDP_MIN"
-pg_set $DEV "udp_src_max $UDP_MAX"
+pg_set $DEV "udp_src_min $UDP_SRC_MIN"
+pg_set $DEV "udp_src_max $UDP_SRC_MAX"
 
 # start_run
 echo "Running... ctrl^C to stop" >&2
diff --git a/samples/pktgen/pktgen_sample02_multiqueue.sh b/samples/pktgen/pktgen_sample02_multiqueue.sh
index 7f7a9a27548f..a4726fb50197 100755
--- a/samples/pktgen/pktgen_sample02_multiqueue.sh
+++ b/samples/pktgen/pktgen_sample02_multiqueue.sh
@@ -21,8 +21,8 @@ DELAY="0"        # Zero means max speed
 [ -z "$CLONE_SKB" ] && CLONE_SKB="0"
 
 # Flow variation random source port between min and max
-UDP_MIN=9
-UDP_MAX=109
+UDP_SRC_MIN=9
+UDP_SRC_MAX=109
 
 # (example of setting default params in your script)
 if [ -z "$DEST_IP" ]; then
@@ -30,8 +30,8 @@ if [ -z "$DEST_IP" ]; then
 fi
 [ -z "$DST_MAC" ] && DST_MAC="90:e2:ba:ff:ff:ff"
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # General cleanup everything since last run
@@ -67,14 +67,14 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
 	pg_set $dev "flag UDPDST_RND"
-	pg_set $dev "udp_dst_min $DST_MIN"
-	pg_set $dev "udp_dst_max $DST_MAX"
+	pg_set $dev "udp_dst_min $UDP_DST_MIN"
+	pg_set $dev "udp_dst_max $UDP_DST_MAX"
     fi
 
     # Setup random UDP port src range
     pg_set $dev "flag UDPSRC_RND"
-    pg_set $dev "udp_src_min $UDP_MIN"
-    pg_set $dev "udp_src_max $UDP_MAX"
+    pg_set $dev "udp_src_min $UDP_SRC_MIN"
+    pg_set $dev "udp_src_max $UDP_SRC_MAX"
 done
 
 # start_run
diff --git a/samples/pktgen/pktgen_sample03_burst_single_flow.sh b/samples/pktgen/pktgen_sample03_burst_single_flow.sh
index b520637817ce..dfea91a09ccc 100755
--- a/samples/pktgen/pktgen_sample03_burst_single_flow.sh
+++ b/samples/pktgen/pktgen_sample03_burst_single_flow.sh
@@ -34,8 +34,8 @@ fi
 [ -z "$CLONE_SKB" ] && CLONE_SKB="0" # No need for clones when bursting
 [ -z "$COUNT" ]     && COUNT="0" # Zero means indefinitely
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # Base Config
@@ -67,8 +67,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
 	pg_set $dev "flag UDPDST_RND"
-	pg_set $dev "udp_dst_min $DST_MIN"
-	pg_set $dev "udp_dst_max $DST_MAX"
+	pg_set $dev "udp_dst_min $UDP_DST_MIN"
+	pg_set $dev "udp_dst_max $UDP_DST_MAX"
     fi
 
     # Setup burst, for easy testing -b 0 disable bursting
diff --git a/samples/pktgen/pktgen_sample04_many_flows.sh b/samples/pktgen/pktgen_sample04_many_flows.sh
index 5b6e9d9cb5b5..7ea9b4a3acf6 100755
--- a/samples/pktgen/pktgen_sample04_many_flows.sh
+++ b/samples/pktgen/pktgen_sample04_many_flows.sh
@@ -18,8 +18,8 @@ source ${basedir}/parameters.sh
 [ -z "$CLONE_SKB" ] && CLONE_SKB="0"
 [ -z "$COUNT" ]     && COUNT="0" # Zero means indefinitely
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # NOTICE:  Script specific settings
@@ -63,8 +63,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
 	pg_set $dev "flag UDPDST_RND"
-	pg_set $dev "udp_dst_min $DST_MIN"
-	pg_set $dev "udp_dst_max $DST_MAX"
+	pg_set $dev "udp_dst_min $UDP_DST_MIN"
+	pg_set $dev "udp_dst_max $UDP_DST_MAX"
     fi
 
     # Randomize source IP-addresses
diff --git a/samples/pktgen/pktgen_sample05_flow_per_thread.sh b/samples/pktgen/pktgen_sample05_flow_per_thread.sh
index 0c06e63fbe97..fbfafe029e11 100755
--- a/samples/pktgen/pktgen_sample05_flow_per_thread.sh
+++ b/samples/pktgen/pktgen_sample05_flow_per_thread.sh
@@ -23,8 +23,8 @@ source ${basedir}/parameters.sh
 [ -z "$BURST" ]     && BURST=32
 [ -z "$COUNT" ]     && COUNT="0" # Zero means indefinitely
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # Base Config
@@ -56,8 +56,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
 	pg_set $dev "flag UDPDST_RND"
-	pg_set $dev "udp_dst_min $DST_MIN"
-	pg_set $dev "udp_dst_max $DST_MAX"
+	pg_set $dev "udp_dst_min $UDP_DST_MIN"
+	pg_set $dev "udp_dst_max $UDP_DST_MAX"
     fi
 
     # Setup source IP-addresses based on thread number
diff --git a/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh b/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
index 97f0266c0356..755e662183f1 100755
--- a/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
+++ b/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
@@ -20,8 +20,8 @@ DELAY="0"        # Zero means max speed
 [ -z "$CLONE_SKB" ] && CLONE_SKB="0"
 
 # Flow variation random source port between min and max
-UDP_MIN=9
-UDP_MAX=109
+UDP_SRC_MIN=9
+UDP_SRC_MAX=109
 
 node=`get_iface_node $DEV`
 irq_array=(`get_iface_irqs $DEV`)
@@ -36,8 +36,8 @@ if [ -z "$DEST_IP" ]; then
 fi
 [ -z "$DST_MAC" ] && DST_MAC="90:e2:ba:ff:ff:ff"
 if [ -n "$DST_PORT" ]; then
-    read -r DST_MIN DST_MAX <<< $(parse_ports $DST_PORT)
-    validate_ports $DST_MIN $DST_MAX
+    read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
+    validate_ports $UDP_DST_MIN $UDP_DST_MAX
 fi
 
 # General cleanup everything since last run
@@ -84,14 +84,14 @@ for ((i = 0; i < $THREADS; i++)); do
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
 	pg_set $dev "flag UDPDST_RND"
-	pg_set $dev "udp_dst_min $DST_MIN"
-	pg_set $dev "udp_dst_max $DST_MAX"
+	pg_set $dev "udp_dst_min $UDP_DST_MIN"
+	pg_set $dev "udp_dst_max $UDP_DST_MAX"
     fi
 
     # Setup random UDP port src range
     pg_set $dev "flag UDPSRC_RND"
-    pg_set $dev "udp_src_min $UDP_MIN"
-    pg_set $dev "udp_src_max $UDP_MAX"
+    pg_set $dev "udp_src_min $UDP_SRC_MIN"
+    pg_set $dev "udp_src_max $UDP_SRC_MAX"
 done
 
 # start_run
-- 
2.20.1


^ permalink raw reply related

* Re: WARNING at net/mac80211/sta_info.c:1057 (__sta_info_destroy_part2())
From: Kalle Valo @ 2019-09-11 18:48 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Linus Torvalds, David S. Miller, linux-wireless, Netdev,
	Linux List Kernel Mailing, ath10k
In-Reply-To: <383b145b608e0fe3a35ffb0ceb99fdf938d4e2bb.camel@sipsolutions.net>

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-09-11 at 21:19 +0300, Kalle Valo wrote:
>> > Looks like indeed the driver gives the device at least *3 seconds* for
>> > every command, see ath10k_wmi_cmd_send(), so most likely this would
>> > eventually have finished, but who knows how many firmware commands it
>> > would still have attempted to send...
>> 
>> 3 seconds is a bit short but in normal cases it should be enough. Of
>> course we could increase the delay but I'm skeptical it would help here.
>
> I was thinking 3 seconds is way too long :-)

Heh :)

>> > Perhaps the driver should mark the device as dead and fail quickly once
>> > it timed out once, or so, but I'll let Kalle comment on that.
>> 
>> Actually we do try to restart the device when a timeout happens in
>> ath10k_wmi_cmd_send():
>> 
>>         if (ret == -EAGAIN) {
>>                 ath10k_warn(ar, "wmi command %d timeout, restarting hardware\n",
>>                             cmd_id);
>>                 queue_work(ar->workqueue, &ar->restart_work);
>>         }
>
> Yeah, and this is the problem, in a sense, I'd think. It seems to me
> that at this point the code needs to tag the device as "dead" and
> immediately return from any further commands submitted to it with an
> error (e.g. -EIO).

Yeah, ath10k_core_restart() is supposed to change the state to
ATH10K_STATE_RESTARTING, but very few mac80211 ops in ath10k_ops check
for it, and to me even that looks quite late. I think a proper fix is
for ops which can sleep to check that ar->state is ATH10K_STATE_ON, and
for ops which cannot sleep to check ATH10K_FLAG_CRASH_FLUSH.

But of course this only fixes the symptoms; the root cause of the
timeouts needs to be found as well.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* [v2 2/3] samples: pktgen: add helper functions for IP(v4/v6) CIDR parsing
From: Daniel T. Lee @ 2019-09-11 18:48 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, David S . Miller; +Cc: netdev
In-Reply-To: <20190911184807.21770-1-danieltimlee@gmail.com>

This commit adds CIDR parsing and IP validation helper functions that
parse a single IP address or a range of IP addresses given in CIDR
notation (e.g. 198.18.0.0/15).

The helpers will be used to parse the target address before it is set
in the samples/pktgen scripts.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
---
 samples/pktgen/functions.sh | 122 ++++++++++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)

diff --git a/samples/pktgen/functions.sh b/samples/pktgen/functions.sh
index 4af4046d71be..8be5a6b6c097 100644
--- a/samples/pktgen/functions.sh
+++ b/samples/pktgen/functions.sh
@@ -163,6 +163,128 @@ function get_node_cpus()
 	echo $node_cpu_list
 }
 
+# Extend shrunken IPv6 address.
+# fe80::42:bcff:fe84:e10a => fe80:0:0:0:42:bcff:fe84:e10a
+function extend_addr6()
+{
+    local addr=$1
+    local sep=: sep2=::
+    local sep_cnt=$(tr -cd $sep <<< $1 | wc -c)
+    local shrink
+
+    # separator count : should be between 2, 7.
+    if [[ $sep_cnt -lt 2 || $sep_cnt -gt 7 ]]; then
+        err 5 "Invalid IP6 address sep: $1"
+    fi
+
+    # if shrink '::' occurs multiple, it's malformed.
+    shrink=( $(egrep -o "$sep{2,}" <<< $addr) )
+    if [[ ${#shrink[@]} -ne 0 ]]; then
+        if [[ ${#shrink[@]} -gt 1 || ( ${shrink[0]} != $sep2 ) ]]; then
+            err 5 "Invalid IP$IP6 address shr: $1"
+        fi
+    fi
+
+    # add 0 at begin & end, and extend addr by adding :0
+    [[ ${addr:0:1} == $sep ]] && addr=0${addr}
+    [[ ${addr: -1} == $sep ]] && addr=${addr}0
+    echo "${addr/$sep2/$(printf ':0%.s' $(seq $[8-sep_cnt])):}"
+}
+
+
+# Given a single IP(v4/v6) address, whether it is valid.
+function validate_addr()
+{
+    # check function is called with (funcname)6
+    [[ ${FUNCNAME[1]: -1} == 6 ]] && local IP6=6
+    local len=$[ IP6 ? 8 : 4 ]
+    local max=$[ 2**(len*2)-1 ]
+    local addr sep
+
+    # set separator for each IP(v4/v6)
+    [[ $IP6 ]] && sep=: || sep=.
+    IFS=$sep read -a addr <<< $1
+
+    # array length
+    if [[ ${#addr[@]} != $len ]]; then
+        err 5 "Invalid IP$IP6 address: $1"
+    fi
+
+    # check each digit between 0, $max
+    for digit in "${addr[@]}"; do
+        [[ $IP6 ]] && digit=$[ 16#$digit ]
+        if [[ $digit -lt 0 || $digit -gt $max ]]; then
+            err 5 "Invalid IP$IP6 address: $1"
+        fi
+    done
+
+    return 0
+}
+
+function validate_addr6() { validate_addr $@ ; }
+
+# Given a single IP(v4/v6) or CIDR, return minimum and maximum IP addr.
+function parse_addr()
+{
+    # check function is called with (funcname)6
+    [[ ${FUNCNAME[1]: -1} == 6 ]] && local IP6=6
+    local bitlen=$[ IP6 ? 128 : 32 ]
+    local octet=$[ IP6 ? 16 : 8 ]
+
+    local addr=$1
+    local net prefix
+    local min_ip max_ip
+
+    IFS='/' read net prefix <<< $addr
+    [[ $IP6 ]] && net=$(extend_addr6 $net)
+    validate_addr$IP6 $net
+
+    if [[ $prefix -gt $bitlen ]]; then
+        err 5 "Invalid prefix: $prefix"
+    elif [[ -z $prefix ]]; then
+        min_ip=$net
+        max_ip=$net
+    else
+        # defining array for converting Decimal 2 Binary
+        # 00000000 00000001 00000010 00000011 00000100 ...
+        local d2b='{0..1}{0..1}{0..1}{0..1}{0..1}{0..1}{0..1}{0..1}'
+        [[ $IP6 ]] && d2b+=$d2b
+        eval local D2B=($d2b)
+
+        local shift=$[ bitlen-prefix ]
+        local min_mask max_mask
+        local min max
+        local ip_bit
+        local ip sep
+
+        # set separator for each IP(v4/v6)
+        [[ $IP6 ]] && sep=: || sep=.
+        IFS=$sep read -ra ip <<< $net
+
+        min_mask="$(printf '1%.s' $(seq $prefix))$(printf '0%.s' $(seq $shift))"
+        max_mask="$(printf '0%.s' $(seq $prefix))$(printf '1%.s' $(seq $shift))"
+
+        # calculate min/max ip with &,| operator
+        for i in "${!ip[@]}"; do
+            digit=$[ IP6 ? 16#${ip[$i]} : ${ip[$i]} ]
+            ip_bit=${D2B[$digit]}
+
+            idx=$[ octet*i ]
+            min[$i]=$[ 2#$ip_bit & 2#${min_mask:$idx:$octet} ]
+            max[$i]=$[ 2#$ip_bit | 2#${max_mask:$idx:$octet} ]
+            [[ $IP6 ]] && { min[$i]=$(printf '%X' ${min[$i]});
+                            max[$i]=$(printf '%X' ${max[$i]}); }
+        done
+
+        min_ip=$(IFS=$sep; echo "${min[*]}")
+        max_ip=$(IFS=$sep; echo "${max[*]}")
+    fi
+
+    echo $min_ip $max_ip
+}
+
+function parse_addr6() { parse_addr $@ ; }
+
 # Given a single or range of port(s), return minimum and maximum port number.
 function parse_ports()
 {
-- 
2.20.1
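
For readers who want the arithmetic behind the min/max computation without the string-based bit masks used in the patch, here is a minimal sketch for the IPv4 case only. It is not the code from the patch: `cidr4_range` and `to_quad` are hypothetical names, the sketch assumes a shell with at least 64-bit arithmetic, and it skips the octet validation that validate_addr() performs.

```shell
#!/bin/sh
# Hypothetical helper (not part of samples/pktgen): print the lowest and
# highest IPv4 address covered by ADDR or ADDR/PREFIX.
cidr4_range() {
    net=${1%/*}
    prefix=${1#*/}
    # No "/prefix" given: treat the argument as a single host.
    [ "$prefix" = "$1" ] && prefix=32

    # Split the dotted quad into four positional parameters.
    old_ifs=$IFS; IFS=.
    set -- $net
    IFS=$old_ifs
    [ $# -eq 4 ] || { echo "invalid address: $net" >&2; return 1; }

    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    # Host bits set to 1, e.g. /15 gives 0x0001ffff.
    hostmask=$(( (1 << (32 - prefix)) - 1 ))
    min=$(( ip & ~hostmask & 0xffffffff ))
    max=$(( ip | hostmask ))

    echo "$(to_quad $min) $(to_quad $max)"
}

# Convert a 32-bit integer back to dotted-quad notation.
to_quad() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

cidr4_range 198.18.0.0/15
```

The call at the end prints "198.18.0.0 198.19.255.255", the same pair as the 198.18.0.0/15 example used in this series. A bare address such as 198.18.0.42 yields that address twice, which mirrors the "no prefix" branch of parse_addr().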


^ permalink raw reply related

* [v2 3/3] samples: pktgen: allow to specify destination IP range (CIDR)
From: Daniel T. Lee @ 2019-09-11 18:48 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, David S . Miller; +Cc: netdev
In-Reply-To: <20190911184807.21770-1-danieltimlee@gmail.com>

Currently, kernel pktgen can send packets to a range of destination
addresses (e.g. pgset "dst_min/dst_max").

However, none of the sample scripts has an option to use this.

This commit adds an option to specify the destination address range in
CIDR notation.

    -d : ($DEST_IP)   destination IP. CIDR (e.g. 198.18.0.0/15) is also allowed

    # ./pktgen_sample01_simple.sh -6 -d fe80::20/126 -p 3000 -n 4
    # tcpdump ip6 and udp
    05:14:18.082285 IP6 fe80::99.71 > fe80::23.3000: UDP, length 16
    05:14:18.082564 IP6 fe80::99.43 > fe80::23.3000: UDP, length 16
    05:14:18.083366 IP6 fe80::99.107 > fe80::22.3000: UDP, length 16
    05:14:18.083585 IP6 fe80::99.97 > fe80::21.3000: UDP, length 16

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
---
 samples/pktgen/README.rst                             |  2 +-
 samples/pktgen/parameters.sh                          |  2 +-
 .../pktgen/pktgen_bench_xmit_mode_netif_receive.sh    |  4 +++-
 samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh   |  4 +++-
 samples/pktgen/pktgen_sample01_simple.sh              |  4 +++-
 samples/pktgen/pktgen_sample02_multiqueue.sh          |  4 +++-
 samples/pktgen/pktgen_sample03_burst_single_flow.sh   |  4 +++-
 samples/pktgen/pktgen_sample04_many_flows.sh          | 11 ++++++++---
 samples/pktgen/pktgen_sample05_flow_per_thread.sh     |  4 +++-
 .../pktgen_sample06_numa_awared_queue_irq_affinity.sh |  4 +++-
 10 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/samples/pktgen/README.rst b/samples/pktgen/README.rst
index fd39215db508..3f6483e8b2df 100644
--- a/samples/pktgen/README.rst
+++ b/samples/pktgen/README.rst
@@ -18,7 +18,7 @@ across the sample scripts.  Usage example is printed on errors::
  Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
   -i : ($DEV)       output interface/device (required)
   -s : ($PKT_SIZE)  packet size
-  -d : ($DEST_IP)   destination IP
+  -d : ($DEST_IP)   destination IP. CIDR (e.g. 198.18.0.0/15) is also allowed
   -m : ($DST_MAC)   destination MAC-addr
   -p : ($DST_PORT)  destination PORT range (e.g. 433-444) is also allowed
   -t : ($THREADS)   threads to start
diff --git a/samples/pktgen/parameters.sh b/samples/pktgen/parameters.sh
index a06b00a0c7b6..ff0ed474fee9 100644
--- a/samples/pktgen/parameters.sh
+++ b/samples/pktgen/parameters.sh
@@ -8,7 +8,7 @@ function usage() {
     echo "Usage: $0 [-vx] -i ethX"
     echo "  -i : (\$DEV)       output interface/device (required)"
     echo "  -s : (\$PKT_SIZE)  packet size"
-    echo "  -d : (\$DEST_IP)   destination IP"
+    echo "  -d : (\$DEST_IP)   destination IP. CIDR (e.g. 198.18.0.0/15) is also allowed"
     echo "  -m : (\$DST_MAC)   destination MAC-addr"
     echo "  -p : (\$DST_PORT)  destination PORT range (e.g. 433-444) is also allowed"
     echo "  -t : (\$THREADS)   threads to start"
diff --git a/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh b/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
index 9b74502c58f7..da6cb711b7f4 100755
--- a/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
+++ b/samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
@@ -41,6 +41,7 @@ fi
 [ -z "$DST_MAC" ] && DST_MAC="90:e2:ba:ff:ff:ff"
 [ -z "$BURST" ] && BURST=1024
 [ -z "$COUNT" ] && COUNT="10000000" # Zero means indefinitely
+[ -n "$DEST_IP" ] && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -71,7 +72,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
 
     # Destination
     pg_set $dev "dst_mac $DST_MAC"
-    pg_set $dev "dst$IP6 $DEST_IP"
+    pg_set $dev "dst${IP6}_min $DST_MIN"
+    pg_set $dev "dst${IP6}_max $DST_MAX"
 
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
diff --git a/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh b/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh
index 0f332555b40d..355937787364 100755
--- a/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh
+++ b/samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh
@@ -24,6 +24,7 @@ if [[ -n "$BURST" ]]; then
     err 1 "Bursting not supported for this mode"
 fi
 [ -z "$COUNT" ] && COUNT="10000000" # Zero means indefinitely
+[ -n "$DEST_IP" ] && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -54,7 +55,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
 
     # Destination
     pg_set $dev "dst_mac $DST_MAC"
-    pg_set $dev "dst$IP6 $DEST_IP"
+    pg_set $dev "dst${IP6}_min $DST_MIN"
+    pg_set $dev "dst${IP6}_max $DST_MAX"
 
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
diff --git a/samples/pktgen/pktgen_sample01_simple.sh b/samples/pktgen/pktgen_sample01_simple.sh
index 063ec0998906..08995fa70025 100755
--- a/samples/pktgen/pktgen_sample01_simple.sh
+++ b/samples/pktgen/pktgen_sample01_simple.sh
@@ -22,6 +22,7 @@ fi
 # Example enforce param "-m" for dst_mac
 [ -z "$DST_MAC" ] && usage && err 2 "Must specify -m dst_mac"
 [ -z "$COUNT" ]   && COUNT="100000" # Zero means indefinitely
+[ -n "$DEST_IP" ] && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -61,7 +62,8 @@ pg_set $DEV "flag NO_TIMESTAMP"
 
 # Destination
 pg_set $DEV "dst_mac $DST_MAC"
-pg_set $DEV "dst$IP6 $DEST_IP"
+pg_set $DEV "dst${IP6}_min $DST_MIN"
+pg_set $DEV "dst${IP6}_max $DST_MAX"
 
 if [ -n "$DST_PORT" ]; then
     # Single destination port or random port range
diff --git a/samples/pktgen/pktgen_sample02_multiqueue.sh b/samples/pktgen/pktgen_sample02_multiqueue.sh
index a4726fb50197..9b806e41c23a 100755
--- a/samples/pktgen/pktgen_sample02_multiqueue.sh
+++ b/samples/pktgen/pktgen_sample02_multiqueue.sh
@@ -29,6 +29,7 @@ if [ -z "$DEST_IP" ]; then
     [ -z "$IP6" ] && DEST_IP="198.18.0.42" || DEST_IP="FD00::1"
 fi
 [ -z "$DST_MAC" ] && DST_MAC="90:e2:ba:ff:ff:ff"
+[ -n "$DEST_IP" ] && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -62,7 +63,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
 
     # Destination
     pg_set $dev "dst_mac $DST_MAC"
-    pg_set $dev "dst$IP6 $DEST_IP"
+    pg_set $dev "dst${IP6}_min $DST_MIN"
+    pg_set $dev "dst${IP6}_max $DST_MAX"
 
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
diff --git a/samples/pktgen/pktgen_sample03_burst_single_flow.sh b/samples/pktgen/pktgen_sample03_burst_single_flow.sh
index dfea91a09ccc..cb067788ceb3 100755
--- a/samples/pktgen/pktgen_sample03_burst_single_flow.sh
+++ b/samples/pktgen/pktgen_sample03_burst_single_flow.sh
@@ -33,6 +33,7 @@ fi
 [ -z "$BURST" ]     && BURST=32
 [ -z "$CLONE_SKB" ] && CLONE_SKB="0" # No need for clones when bursting
 [ -z "$COUNT" ]     && COUNT="0" # Zero means indefinitely
+[ -n "$DEST_IP" ]   && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -62,7 +63,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
 
     # Destination
     pg_set $dev "dst_mac $DST_MAC"
-    pg_set $dev "dst$IP6 $DEST_IP"
+    pg_set $dev "dst${IP6}_min $DST_MIN"
+    pg_set $dev "dst${IP6}_max $DST_MAX"
 
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
diff --git a/samples/pktgen/pktgen_sample04_many_flows.sh b/samples/pktgen/pktgen_sample04_many_flows.sh
index 7ea9b4a3acf6..626e33016869 100755
--- a/samples/pktgen/pktgen_sample04_many_flows.sh
+++ b/samples/pktgen/pktgen_sample04_many_flows.sh
@@ -17,6 +17,7 @@ source ${basedir}/parameters.sh
 [ -z "$DST_MAC" ]   && DST_MAC="90:e2:ba:ff:ff:ff"
 [ -z "$CLONE_SKB" ] && CLONE_SKB="0"
 [ -z "$COUNT" ]     && COUNT="0" # Zero means indefinitely
+[ -n "$DEST_IP" ]   && read -r DST_MIN DST_MAX <<< $(parse_addr $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -37,6 +38,9 @@ if [[ -n "$BURST" ]]; then
     err 1 "Bursting not supported for this mode"
 fi
 
+# 198.18.0.0 / 198.19.255.255
+read -r SRC_MIN SRC_MAX <<< $(parse_addr 198.18.0.0/15)
+
 # General cleanup everything since last run
 pg_ctrl "reset"
 
@@ -58,7 +62,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
 
     # Single destination
     pg_set $dev "dst_mac $DST_MAC"
-    pg_set $dev "dst $DEST_IP"
+    pg_set $dev "dst_min $DST_MIN"
+    pg_set $dev "dst_max $DST_MAX"
 
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
@@ -69,8 +74,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
 
     # Randomize source IP-addresses
     pg_set $dev "flag IPSRC_RND"
-    pg_set $dev "src_min 198.18.0.0"
-    pg_set $dev "src_max 198.19.255.255"
+    pg_set $dev "src_min $SRC_MIN"
+    pg_set $dev "src_max $SRC_MAX"
 
     # Limit number of flows (max 65535)
     pg_set $dev "flows $FLOWS"
diff --git a/samples/pktgen/pktgen_sample05_flow_per_thread.sh b/samples/pktgen/pktgen_sample05_flow_per_thread.sh
index fbfafe029e11..cb79de073e9d 100755
--- a/samples/pktgen/pktgen_sample05_flow_per_thread.sh
+++ b/samples/pktgen/pktgen_sample05_flow_per_thread.sh
@@ -22,6 +22,7 @@ source ${basedir}/parameters.sh
 [ -z "$CLONE_SKB" ] && CLONE_SKB="0"
 [ -z "$BURST" ]     && BURST=32
 [ -z "$COUNT" ]     && COUNT="0" # Zero means indefinitely
+[ -n "$DEST_IP" ]   && read -r DST_MIN DST_MAX <<< $(parse_addr $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -51,7 +52,8 @@ for ((thread = $F_THREAD; thread <= $L_THREAD; thread++)); do
 
     # Single destination
     pg_set $dev "dst_mac $DST_MAC"
-    pg_set $dev "dst $DEST_IP"
+    pg_set $dev "dst_min $DST_MIN"
+    pg_set $dev "dst_max $DST_MAX"
 
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
diff --git a/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh b/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
index 755e662183f1..739adcda5b5f 100755
--- a/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
+++ b/samples/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
@@ -35,6 +35,7 @@ if [ -z "$DEST_IP" ]; then
     [ -z "$IP6" ] && DEST_IP="198.18.0.42" || DEST_IP="FD00::1"
 fi
 [ -z "$DST_MAC" ] && DST_MAC="90:e2:ba:ff:ff:ff"
+[ -n "$DEST_IP" ] && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
 if [ -n "$DST_PORT" ]; then
     read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
     validate_ports $UDP_DST_MIN $UDP_DST_MAX
@@ -79,7 +80,8 @@ for ((i = 0; i < $THREADS; i++)); do
 
     # Destination
     pg_set $dev "dst_mac $DST_MAC"
-    pg_set $dev "dst$IP6 $DEST_IP"
+    pg_set $dev "dst${IP6}_min $DST_MIN"
+    pg_set $dev "dst${IP6}_max $DST_MAX"
 
     if [ -n "$DST_PORT" ]; then
 	# Single destination port or random port range
-- 
2.20.1
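
To illustrate the command strings the scripts now build, here is a small sketch. `dst_range_cmds` is a hypothetical helper that only prints the strings; the real scripts pass them to pg_set, and the addresses are example values.

```shell
#!/bin/sh
# Hypothetical helper, not part of samples/pktgen: print the two pktgen
# commands that select a destination range.
#   $1: address family suffix, "" for IPv4 or "6" for IPv6
#   $2: lowest address, $3: highest address
dst_range_cmds() {
    suffix=$1
    # Braces are required: "$suffix_min" would expand a variable named
    # suffix_min instead of appending "_min" to $suffix.
    echo "dst${suffix}_min $2"
    echo "dst${suffix}_max $3"
}

dst_range_cmds ""  198.18.0.0 198.19.255.255
dst_range_cmds "6" fe80::20   fe80::23
```

When only a single address is given, the lowest and highest address are identical, so the range collapses to one destination as with the previous "dst" command.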


^ permalink raw reply related

* Re: [PATCH] ftgmac100: Disable HW checksum generation on AST2500
From: Vijay Khemka @ 2019-09-11 18:50 UTC (permalink / raw)
  To: Florian Fainelli, David S. Miller, YueHaibing, Andrew Lunn,
	Kate Stewart, Mauro Carvalho Chehab, Luis Chamberlain,
	Thomas Gleixner, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: openbmc @ lists . ozlabs . org, Sai Dasari,
	linux-aspeed@lists.ozlabs.org
In-Reply-To: <c9876340-c8d0-ba8b-2ae1-9900958f1834@gmail.com>



On 9/11/19, 11:34 AM, "Florian Fainelli" <f.fainelli@gmail.com> wrote:

    On 9/11/19 11:30 AM, Vijay Khemka wrote:
    > 
    > 
    > On 9/10/19, 4:08 PM, "Linux-aspeed on behalf of Vijay Khemka" <linux-aspeed-bounces+vijaykhemka=fb.com@lists.ozlabs.org on behalf of vijaykhemka@fb.com> wrote:
    > 
    >     
    >     
    >     On 9/10/19, 3:50 PM, "Linux-aspeed on behalf of Vijay Khemka" <linux-aspeed-bounces+vijaykhemka=fb.com@lists.ozlabs.org on behalf of vijaykhemka@fb.com> wrote:
    >     
    >         
    >         
    >         On 9/10/19, 3:05 PM, "Florian Fainelli" <f.fainelli@gmail.com> wrote:
    >         
    >             On 9/10/19 2:37 PM, Vijay Khemka wrote:
    >             > HW checksum generation is not working for AST2500, especially with IPV6
    >             > over NCSI. All TCP packets with IPv6 get dropped. By disabling this
    >             > it works perfectly fine with IPV6.
    >             > 
    >             > Verified with IPV6 enabled and can do ssh.
    >             
    >             How about IPv4, do these packets have problem? If not, can you continue
    >             advertising NETIF_F_IP_CSUM but take out NETIF_F_IPV6_CSUM?
    >         
    >         I changed the code from (netdev->hw_features &= ~NETIF_F_HW_CSUM) to
    >         (netdev->hw_features &= ~NETIF_F_IPV6_CSUM), and it is not working.
    >         I don't know why. IPv4 works without any change but IPv6 needs HW_CSUM
    >         disabled.
    >     
    >     Now I changed to
    >     netdev->hw_features &= (~NETIF_F_HW_CSUM) | NETIF_F_IP_CSUM;
    >     And it works.
    > 
    > I investigated these features further and found that we cannot set NETIF_F_IP_CSUM
    > while NETIF_F_HW_CSUM is set. So I disabled NETIF_F_HW_CSUM first and enabled
    > NETIF_F_IP_CSUM in the next statement. And it works fine.
    > 
    > But as per line 166 in include/linux/skbuff.h,  
    > *   NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are being deprecated in favor of
    >  *   NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to indicate
    >  *   checksum offload capability.
    > 
    > Please suggest which of the 2 options below I should take, as both work for me.
    > 1. Disable NETIF_F_HW_CSUM completely and do nothing else. This is the original patch.
    > 2. Enable NETIF_F_IP_CSUM in addition to 1. I can send a v2 if this is accepted.
    
    Sounds like 2 would leave the option of offloading the IPv4 checksum,
    so that would be a better middle ground than flat out disabling
    checksum offload for both IPv4 and IPv6, no?
Sounds good, I will send a v2 after a lot more testing.
    -- 
    Florian
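
The two forms discussed in this thread are not equivalent as bit operations, which a short sketch can make concrete. The bit positions below are illustrative assumptions, not the kernel's NETIF_F_* definitions, and `mask_form` and `clear_then_set` are hypothetical names. Whether NETIF_F_IP_CSUM was already set in hw_features for this driver is not shown in the thread.

```shell
#!/bin/sh
# Illustrative bit values only; the real NETIF_F_* definitions live in
# the kernel headers and may differ.
IP_CSUM=$(( 1 << 1 ))
HW_CSUM=$(( 1 << 3 ))
OTHER=$(( 1 << 8 ))   # stand-in for any unrelated feature bit

# One statement: features &= (~HW_CSUM) | IP_CSUM
# Clears HW_CSUM and leaves IP_CSUM as it was; it never sets IP_CSUM.
mask_form() {
    echo $(( $1 & (~HW_CSUM | IP_CSUM) ))
}

# Two statements: features &= ~HW_CSUM; features |= IP_CSUM
# Clears HW_CSUM and then sets IP_CSUM unconditionally.
clear_then_set() {
    echo $(( ($1 & ~HW_CSUM) | IP_CSUM ))
}

start=$(( HW_CSUM | OTHER ))
printf 'mask form:      0x%x\n' "$(mask_form $start)"
printf 'clear then set: 0x%x\n' "$(clear_then_set $start)"
```

Starting from HW_CSUM plus one unrelated bit, the mask form prints 0x100 (only the unrelated bit survives), while the two-statement form prints 0x102 (the unrelated bit plus IP_CSUM). If IP_CSUM is not already set, the single-statement version therefore behaves like option 1, and only the two-statement version implements option 2.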
    


^ permalink raw reply

* [bpf-next,v3] samples: bpf: add max_pckt_size option at xdp_adjust_tail
From: Daniel T. Lee @ 2019-09-11 19:02 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev, bpf

Currently, in xdp_adjust_tail_kern.c, MAX_PCKT_SIZE is fixed at 600.
To make this size configurable, a new map 'pcktsz' is added.

When userland writes a new packet size to this map,
xdp_adjust_tail_kern.o uses that value as the new max_pckt_size.

If the '-P <MAX_PCKT_SIZE>' option is not used, the maximum packet size
defaults to 600.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>

---
Changes in v2:
    - Change the helper to fetch map from 'bpf_map__next' to 
    'bpf_object__find_map_fd_by_name'.
 
 samples/bpf/xdp_adjust_tail_kern.c | 23 +++++++++++++++++++----
 samples/bpf/xdp_adjust_tail_user.c | 28 ++++++++++++++++++++++------
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/samples/bpf/xdp_adjust_tail_kern.c b/samples/bpf/xdp_adjust_tail_kern.c
index 411fdb21f8bc..d6d84ffe6a7a 100644
--- a/samples/bpf/xdp_adjust_tail_kern.c
+++ b/samples/bpf/xdp_adjust_tail_kern.c
@@ -25,6 +25,13 @@
 #define ICMP_TOOBIG_SIZE 98
 #define ICMP_TOOBIG_PAYLOAD_SIZE 92
 
+struct bpf_map_def SEC("maps") pcktsz = {
+	.type = BPF_MAP_TYPE_ARRAY,
+	.key_size = sizeof(__u32),
+	.value_size = sizeof(__u32),
+	.max_entries = 1,
+};
+
 struct bpf_map_def SEC("maps") icmpcnt = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(__u32),
@@ -64,7 +71,8 @@ static __always_inline void ipv4_csum(void *data_start, int data_size,
 	*csum = csum_fold_helper(*csum);
 }
 
-static __always_inline int send_icmp4_too_big(struct xdp_md *xdp)
+static __always_inline int send_icmp4_too_big(struct xdp_md *xdp,
+					      __u32 max_pckt_size)
 {
 	int headroom = (int)sizeof(struct iphdr) + (int)sizeof(struct icmphdr);
 
@@ -92,7 +100,7 @@ static __always_inline int send_icmp4_too_big(struct xdp_md *xdp)
 	orig_iph = data + off;
 	icmp_hdr->type = ICMP_DEST_UNREACH;
 	icmp_hdr->code = ICMP_FRAG_NEEDED;
-	icmp_hdr->un.frag.mtu = htons(MAX_PCKT_SIZE-sizeof(struct ethhdr));
+	icmp_hdr->un.frag.mtu = htons(max_pckt_size - sizeof(struct ethhdr));
 	icmp_hdr->checksum = 0;
 	ipv4_csum(icmp_hdr, ICMP_TOOBIG_PAYLOAD_SIZE, &csum);
 	icmp_hdr->checksum = csum;
@@ -118,14 +126,21 @@ static __always_inline int handle_ipv4(struct xdp_md *xdp)
 {
 	void *data_end = (void *)(long)xdp->data_end;
 	void *data = (void *)(long)xdp->data;
+	__u32 max_pckt_size = MAX_PCKT_SIZE;
+	__u32 *pckt_sz;
+	__u32 key = 0;
 	int pckt_size = data_end - data;
 	int offset;
 
-	if (pckt_size > MAX_PCKT_SIZE) {
+	pckt_sz = bpf_map_lookup_elem(&pcktsz, &key);
+	if (pckt_sz && *pckt_sz)
+		max_pckt_size = *pckt_sz;
+
+	if (pckt_size > max_pckt_size) {
 		offset = pckt_size - ICMP_TOOBIG_SIZE;
 		if (bpf_xdp_adjust_tail(xdp, 0 - offset))
 			return XDP_PASS;
-		return send_icmp4_too_big(xdp);
+		return send_icmp4_too_big(xdp, max_pckt_size);
 	}
 	return XDP_PASS;
 }
diff --git a/samples/bpf/xdp_adjust_tail_user.c b/samples/bpf/xdp_adjust_tail_user.c
index a3596b617c4c..aef6c69a48a7 100644
--- a/samples/bpf/xdp_adjust_tail_user.c
+++ b/samples/bpf/xdp_adjust_tail_user.c
@@ -23,6 +23,7 @@
 #include "libbpf.h"
 
 #define STATS_INTERVAL_S 2U
+#define MAX_PCKT_SIZE 600
 
 static int ifindex = -1;
 static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
@@ -72,6 +73,7 @@ static void usage(const char *cmd)
 	printf("Usage: %s [...]\n", cmd);
 	printf("    -i <ifname|ifindex> Interface\n");
 	printf("    -T <stop-after-X-seconds> Default: 0 (forever)\n");
+	printf("    -P <MAX_PCKT_SIZE> Default: %u\n", MAX_PCKT_SIZE);
 	printf("    -S use skb-mode\n");
 	printf("    -N enforce native mode\n");
 	printf("    -F force loading prog\n");
@@ -85,13 +87,14 @@ int main(int argc, char **argv)
 		.prog_type	= BPF_PROG_TYPE_XDP,
 	};
 	unsigned char opt_flags[256] = {};
-	const char *optstr = "i:T:SNFh";
+	const char *optstr = "i:T:P:SNFh";
 	struct bpf_prog_info info = {};
 	__u32 info_len = sizeof(info);
+	__u32 max_pckt_size = 0;
+	__u32 key = 0;
 	unsigned int kill_after_s = 0;
 	int i, prog_fd, map_fd, opt;
 	struct bpf_object *obj;
-	struct bpf_map *map;
 	char filename[256];
 	int err;
 
@@ -110,6 +113,9 @@ int main(int argc, char **argv)
 		case 'T':
 			kill_after_s = atoi(optarg);
 			break;
+		case 'P':
+			max_pckt_size = atoi(optarg);
+			break;
 		case 'S':
 			xdp_flags |= XDP_FLAGS_SKB_MODE;
 			break;
@@ -150,12 +156,22 @@ int main(int argc, char **argv)
 	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
 		return 1;
 
-	map = bpf_map__next(NULL, obj);
-	if (!map) {
-		printf("finding a map in obj file failed\n");
+	/* update pcktsz map */
+	if (max_pckt_size) {
+		map_fd = bpf_object__find_map_fd_by_name(obj, "pcktsz");
+		if (map_fd < 0) {
+			printf("finding a pcktsz map in obj file failed\n");
+			return 1;
+		}
+		bpf_map_update_elem(map_fd, &key, &max_pckt_size, BPF_ANY);
+	}
+
+	/* fetch icmpcnt map */
+	map_fd = bpf_object__find_map_fd_by_name(obj, "icmpcnt");
+	if (map_fd < 0) {
+		printf("finding a icmpcnt map in obj file failed\n");
 		return 1;
 	}
-	map_fd = bpf_map__fd(map);
 
 	if (!prog_fd) {
 		printf("load_bpf_file: %s\n", strerror(errno));
-- 
2.20.1


^ permalink raw reply related

