Netdev List

* [net 2/6] net/mlx5e: Fix port tunnel GRE entropy control
From: Saeed Mahameed @ 2019-07-11 18:54 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev@vger.kernel.org, Eli Britstein, Saeed Mahameed
In-Reply-To: <20190711185353.5715-1-saeedm@mellanox.com>

From: Eli Britstein <elibr@mellanox.com>

GRE entropy calculation is a single bit per card, and not per port.
Force disable GRE entropy calculation upon the first GRE encap rule,
and release the force at the last GRE encap rule removal. This is done
per port.

Fixes: 97417f6182f8 ("net/mlx5e: Fix GRE key by controlling port tunnel entropy calculation")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../mellanox/mlx5/core/lib/port_tun.c         | 23 ++++---------------
 1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c
index be69c1d7941a..48b5c847b642 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.c
@@ -98,27 +98,12 @@ static int mlx5_set_entropy(struct mlx5_tun_entropy *tun_entropy,
 	 */
 	if (entropy_flags.gre_calc_supported &&
 	    reformat_type == MLX5_REFORMAT_TYPE_L2_TO_NVGRE) {
-		/* Other applications may change the global FW entropy
-		 * calculations settings. Check that the current entropy value
-		 * is the negative of the updated value.
-		 */
-		if (entropy_flags.force_enabled &&
-		    enable == entropy_flags.gre_calc_enabled) {
-			mlx5_core_warn(tun_entropy->mdev,
-				       "Unexpected GRE entropy calc setting - expected %d",
-				       !entropy_flags.gre_calc_enabled);
-			return -EOPNOTSUPP;
-		}
-		err = mlx5_set_port_gre_tun_entropy_calc(tun_entropy->mdev, enable,
-							 entropy_flags.force_supported);
+		if (!entropy_flags.force_supported)
+			return 0;
+		err = mlx5_set_port_gre_tun_entropy_calc(tun_entropy->mdev,
+							 enable, !enable);
 		if (err)
 			return err;
-		/* if we turn on the entropy we don't need to force it anymore */
-		if (entropy_flags.force_supported && enable) {
-			err = mlx5_set_port_gre_tun_entropy_calc(tun_entropy->mdev, 1, 0);
-			if (err)
-				return err;
-		}
 	} else if (entropy_flags.calc_supported) {
 		/* Other applications may change the global FW entropy
 		 * calculations settings. Check that the current entropy value
-- 
2.21.0


^ permalink raw reply related

* [net 1/6] net/mlx5: E-Switch, Fix default encap mode
From: Saeed Mahameed @ 2019-07-11 18:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev@vger.kernel.org, Maor Gottlieb, Roi Dayan, Saeed Mahameed
In-Reply-To: <20190711185353.5715-1-saeedm@mellanox.com>

From: Maor Gottlieb <maorg@mellanox.com>

Encap mode is related to switchdev mode only. Move the init of
the encap mode to eswitch_offloads. Before this change, we reported
that the eswitch supports encap even though the device was in
non-SRIOV mode.

Fixes: 7768d1971de67 ('net/mlx5: E-Switch, Add control for encapsulation')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c          | 5 -----
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 7 +++++++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 6a921e24cd5e..e9339e7d6a18 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1882,11 +1882,6 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev)
 	esw->enabled_vports = 0;
 	esw->mode = SRIOV_NONE;
 	esw->offloads.inline_mode = MLX5_INLINE_MODE_NONE;
-	if (MLX5_CAP_ESW_FLOWTABLE_FDB(dev, reformat) &&
-	    MLX5_CAP_ESW_FLOWTABLE_FDB(dev, decap))
-		esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_BASIC;
-	else
-		esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_NONE;
 
 	dev->priv.eswitch = esw;
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 47b446d30f71..c2beadc41c40 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1840,6 +1840,12 @@ int esw_offloads_init(struct mlx5_eswitch *esw, int vf_nvports,
 {
 	int err;
 
+	if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat) &&
+	    MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, decap))
+		esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_BASIC;
+	else
+		esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_NONE;
+
 	err = esw_offloads_steering_init(esw, vf_nvports, total_nvports);
 	if (err)
 		return err;
@@ -1901,6 +1907,7 @@ void esw_offloads_cleanup(struct mlx5_eswitch *esw)
 	esw_offloads_devcom_cleanup(esw);
 	esw_offloads_unload_all_reps(esw, num_vfs);
 	esw_offloads_steering_cleanup(esw);
+	esw->offloads.encap = DEVLINK_ESWITCH_ENCAP_MODE_NONE;
 }
 
 static int esw_mode_from_devlink(u16 mode, u16 *mlx5_mode)
-- 
2.21.0


^ permalink raw reply related

* [pull request][net 0/6] Mellanox, mlx5 fixes 2019-07-11
From: Saeed Mahameed @ 2019-07-11 18:54 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev@vger.kernel.org, Saeed Mahameed

Hi Dave,

This series introduces some fixes to the mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.15
('net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn')

For -stable v5.1
('net/mlx5e: Fix port tunnel GRE entropy control')
('net/mlx5e: Rx, Fix checksum calculation for new hardware')
('net/mlx5e: Fix return value from timeout recover function')
('net/mlx5e: Fix error flow in tx reporter diagnose')

For -stable v5.2
('net/mlx5: E-Switch, Fix default encap mode')

Conflict note: This pull request will produce a small conflict when
merged with net-next.
In drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
Take the hunk from net and replace:
esw_offloads_steering_init(esw, vf_nvports, total_nvports);
with:
esw_offloads_steering_init(esw);

Thanks,
Saeed.

---
The following changes since commit e858faf556d4e14c750ba1e8852783c6f9520a0e:

  tcp: Reset bytes_acked and bytes_received when disconnecting (2019-07-08 19:29:19 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2019-07-11

for you to fetch changes up to ef1ce7d7b67b46661091c7ccc0396186b7a247ef:

  net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn (2019-07-11 11:45:04 -0700)

----------------------------------------------------------------
mlx5-fixes-2019-07-11

----------------------------------------------------------------
Aya Levin (3):
      net/mlx5e: Fix return value from timeout recover function
      net/mlx5e: Fix error flow in tx reporter diagnose
      net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn

Eli Britstein (1):
      net/mlx5e: Fix port tunnel GRE entropy control

Maor Gottlieb (1):
      net/mlx5: E-Switch, Fix default encap mode

Saeed Mahameed (1):
      net/mlx5e: Rx, Fix checksum calculation for new hardware

 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  1 +
 .../ethernet/mellanox/mlx5/core/en/reporter_tx.c   | 10 ++++------
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  3 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  7 ++++++-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  5 -----
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c |  7 +++++++
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |  9 ++++++++-
 .../net/ethernet/mellanox/mlx5/core/lib/port_tun.c | 23 ++++------------------
 include/linux/mlx5/mlx5_ifc.h                      |  3 ++-
 9 files changed, 35 insertions(+), 33 deletions(-)

^ permalink raw reply

* Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
From: Eric Dumazet @ 2019-07-11 18:50 UTC (permalink / raw)
  To: Michal Kubecek, netdev
  Cc: Eric Dumazet, Christoph Paasch, Prout, Andrew - LLSC - MITLL,
	David Miller, Greg Kroah-Hartman, Jonathan Looney, Neal Cardwell,
	Tyler Hicks, Yuchung Cheng, Bruce Curtis, Jonathan Lemon,
	Dustin Marquess
In-Reply-To: <20190711182654.GG5700@unicorn.suse.cz>

On 7/11/19 8:26 PM, Michal Kubecek wrote:

> 
> I'm aware it's not a realistic test. It was written as a quick and simple
> check of the pre-4.19 patch, but it shows that even TLP may not get
> through.

Most TLP probes send new data, not retransmits.

But yes, I get your point.

SO_SNDBUF=15000 in your case is seriously wrong.

Let's code a safety feature over SO_SNDBUF to not allow pathologically small values,
because I do not want to support a constrained TCP stack in 2019.

^ permalink raw reply

* Re: [GIT] Networking
From: pr-tracker-bot @ 2019-07-11 18:35 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, akpm, netdev, linux-kernel
In-Reply-To: <20190709.223834.2182721912834033108.davem@davemloft.net>

The pull request you sent on Tue, 09 Jul 2019 22:38:34 -0700 (PDT):

> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git refs/heads/master

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/237f83dfbe668443b5e31c3c7576125871cca674

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

^ permalink raw reply

* Re: [PATCH] MAINTAINERS: update BPF JIT S390 maintainers
From: David Miller @ 2019-07-11 18:33 UTC (permalink / raw)
  To: gor; +Cc: ast, daniel, heiko.carstens, borntraeger, iii, netdev, bpf,
	linux-s390
In-Reply-To: <your-ad-here.call-01562758494-ext-2794@work.hours>

From: Vasily Gorbik <gor@linux.ibm.com>
Date: Wed, 10 Jul 2019 13:34:54 +0200

> Dave, Alexei, Daniel,
> would you take it via one of your trees? Or should I take it via s390?

I think it can go via the bpf tree.

^ permalink raw reply

* Re: [bpf PATCH v2 2/6] bpf: tls fix transition through disconnect with close
From: Jakub Kicinski @ 2019-07-11 18:32 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <5d276814a76ad_698f2aaeaaf925bc8a@john-XPS-13-9370.notmuch>

On Thu, 11 Jul 2019 09:47:16 -0700, John Fastabend wrote:
> Jakub Kicinski wrote:
> > On Wed, 10 Jul 2019 12:34:17 -0700, Jakub Kicinski wrote:  
> > > > > > +		if (sk->sk_prot->unhash)
> > > > > > +			sk->sk_prot->unhash(sk);
> > > > > > +	}
> > > > > > +
> > > > > > +	ctx = tls_get_ctx(sk);
> > > > > > +	if (ctx->tx_conf == TLS_SW || ctx->rx_conf == TLS_SW)
> > > > > > +		tls_sk_proto_cleanup(sk, ctx, timeo);  
> > 
> > Do we still need to hook into unhash? With patch 6 in place perhaps we
> > can just do disconnect 🥺  
> 
> ?? "can just do a disconnect", not sure I follow. We still need unhash
> in cases where we have a TLS socket transition from ESTABLISHED
> to LISTEN state without calling close(). This is independent of whether
> sockmap is running or not.
> 
> Originally, I thought this would be extremely rare but I did see it
> in real applications on the sockmap side so presumably it is possible
> here as well.

Ugh, sorry, I meant shutdown. Instead of replacing the unhash callback
replace the shutdown callback. We probably shouldn't release the socket
lock either there, but we can sleep, so I'll be able to run the device
connection remove callback (which sleeps).

> > cleanup is going to kick off TX but also:
> > 
> > 	if (unlikely(sk->sk_write_pending) &&
> > 	    !wait_on_pending_writer(sk, &timeo))
> > 		tls_handle_open_record(sk, 0);
> > 
> > Are we guaranteed that sk_write_pending is 0?  Otherwise
> > wait_on_pending_writer is hiding yet another release_sock() :(  
> 
> Not seeing the path to release_sock() at the moment?
> 
>    tls_handle_open_record
>      push_pending_record
>       tls_sw_push_pending_record
>         bpf_exec_tx_verdict

wait_on_pending_writer
  sk_wait_event
    release_sock

> If bpf_exec_tx_verdict does a redirect we could hit a release but that
> is another fix I have to get queued up shortly. I think we can fix
> that in another series.

Ugh.

^ permalink raw reply

* Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
From: Eric Dumazet @ 2019-07-11 18:28 UTC (permalink / raw)
  To: Prout, Andrew - LLSC - MITLL, Eric Dumazet, Christoph Paasch
  Cc: David S . Miller, netdev, Greg Kroah-Hartman, Jonathan Looney,
	Neal Cardwell, Tyler Hicks, Yuchung Cheng, Bruce Curtis,
	Jonathan Lemon, Dustin Marquess
In-Reply-To: <adec774ed16540c6b627c2f607f3e216@ll.mit.edu>

On 7/11/19 7:14 PM, Prout, Andrew - LLSC - MITLL wrote:
> 
> In my opinion, if a small SO_SNDBUF below a certain value is no longer supported, then SOCK_MIN_SNDBUF should be adjusted to reflect this. The RCVBUF/SNDBUF sizes are supposed to be hints, no error is returned if they are not honored. The kernel should continue to function regardless of what userspace requests for their values.
> 

Setting whatever SO_SNDBUF value you like, and getting terrible performance, is supported.

It always has been.

The only difference is that we no longer allow an attacker to fool the TCP stack
and consume up to 2 GB per socket while SO_SNDBUF was set to 128 KB.

The side effect is that in some cases, the workload can appear to have the signature of the attack.

The solution is to increase your SO_SNDBUF, or even better, let the TCP stack autotune it.
Nobody forced you to set very small values for it.

^ permalink raw reply

* Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
From: Michal Kubecek @ 2019-07-11 18:26 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Christoph Paasch, Prout, Andrew - LLSC - MITLL,
	David Miller, Greg Kroah-Hartman, Jonathan Looney, Neal Cardwell,
	Tyler Hicks, Yuchung Cheng, Bruce Curtis, Jonathan Lemon,
	Dustin Marquess
In-Reply-To: <eb6121ea-b02d-672e-25c9-2ad054d49fc7@gmail.com>

On Thu, Jul 11, 2019 at 11:19:45AM +0200, Eric Dumazet wrote:
> 
> 
> On 7/11/19 9:28 AM, Christoph Paasch wrote:
> > 
> > 
> >> On Jul 10, 2019, at 9:26 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >>
> >>
> >>
> >> On 7/10/19 8:53 PM, Prout, Andrew - LLSC - MITLL wrote:
> >>>
> >>> Our initial rollout was v4.14.130, but I reproduced it with v4.14.132 as well, reliably for the samba test and once (not reliably) with the synthetic test I was trying. A patched v4.14.132 with this patch partially reverted (just the four lines from tcp_fragment deleted) passed the samba test.
> >>>
> >>> The synthetic test was a pair of simple send/recv test programs under the following conditions:
> >>> -The send socket was non-blocking
> >>> -SO_SNDBUF set to 128KiB
> >>> -The receiver NIC was being flooded with traffic from multiple hosts (to induce packet loss/retransmits)
> >>> -Load was on both systems: a while(1) program spinning on each CPU core
> >>> -The receiver was on an older unaffected kernel
> >>>
> >>
> >> Setting SO_SNDBUF to 128KB does not permit recovery from heavy losses,
> >> since skbs need to be allocated for retransmits.
> > 
> > Would it make sense to always allow the alloc in tcp_fragment when coming from __tcp_retransmit_skb() through the retransmit-timer ?
> 
> 4.15+ kernels have :
> 
> if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
>     tcp_queue != TCP_FRAG_IN_WRITE_QUEUE)) {
> 
> 
> Meaning that things like TLP will succeed.

I get

          <idle>-0     [010] ..s. 301696.143296: p_tcp_fragment_0: (tcp_fragment+0x0/0x310) sndbuf=30000 wmemq=65600
          <idle>-0     [010] d.s. 301696.143301: r_tcp_fragment_0: (tcp_send_loss_probe+0x13d/0x1f0 <- tcp_fragment) ret=-12
          <idle>-0     [010] ..s. 301696.267644: p_tcp_fragment_0: (tcp_fragment+0x0/0x310) sndbuf=30000 wmemq=65600
          <idle>-0     [010] d.s. 301696.267650: r_tcp_fragment_0: (__tcp_retransmit_skb+0xf9/0x800 <- tcp_fragment) ret=-12
          <idle>-0     [010] ..s. 301696.875289: p_tcp_fragment_0: (tcp_fragment+0x0/0x310) sndbuf=30000 wmemq=65600
          <idle>-0     [010] d.s. 301696.875293: r_tcp_fragment_0: (__tcp_retransmit_skb+0xf9/0x800 <- tcp_fragment) ret=-12
          <idle>-0     [010] ..s. 301698.059267: p_tcp_fragment_0: (tcp_fragment+0x0/0x310) sndbuf=30000 wmemq=65600
          <idle>-0     [010] d.s. 301698.059271: r_tcp_fragment_0: (__tcp_retransmit_skb+0xf9/0x800 <- tcp_fragment) ret=-12
          <idle>-0     [010] ..s. 301700.427225: p_tcp_fragment_0: (tcp_fragment+0x0/0x310) sndbuf=30000 wmemq=65600
          <idle>-0     [010] d.s. 301700.427230: r_tcp_fragment_0: (__tcp_retransmit_skb+0xf9/0x800 <- tcp_fragment) ret=-12
          <idle>-0     [010] ..s. 301705.291144: p_tcp_fragment_0: (tcp_fragment+0x0/0x310) sndbuf=30000 wmemq=65600
          <idle>-0     [010] d.s. 301705.291151: r_tcp_fragment_0: (__tcp_retransmit_skb+0xf9/0x800 <- tcp_fragment) ret=-12
          <idle>-0     [010] ..s. 301714.762961: p_tcp_fragment_0: (tcp_fragment+0x0/0x310) sndbuf=30000 wmemq=65600
          <idle>-0     [010] d.s. 301714.762966: r_tcp_fragment_0: (__tcp_retransmit_skb+0xf9/0x800 <- tcp_fragment) ret=-12

on 5.2 kernel with this packetdrill script:

------------------------------------------------------------------------
--tolerance_usecs=10000

// flush cached TCP metrics
0.000  `ip tcp_metrics flush all`

// establish a connection
+0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.000 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [15000], 4) = 0
+0.000 bind(3, ..., ...) = 0
+0.000 listen(3, 1) = 0

+0.100 < S 0:0(0) win 60000 <mss 1000,nop,nop,sackOK,nop,wscale 7>
+0.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
+0.100 < . 1:1(0) ack 1 win 2000
+0.000 accept(3, ..., ...) = 4
+0.100 write(4, ..., 30000) = 30000

+0.000 > . 1:2001(2000) ack 1
+0.000 > . 2001:4001(2000) ack 1
+0.000 > . 4001:6001(2000) ack 1
+0.000 > . 6001:8001(2000) ack 1
+0.000 > . 8001:10001(2000) ack 1
+0.010 < . 1:1(0) ack 10001 win 2000
+0.000 > . 10001:12001(2000) ack 1
+0.000 > . 12001:14001(2000) ack 1
+0.000 > . 14001:16001(2000) ack 1
+0.000 > . 16001:18001(2000) ack 1
+0.000 > . 18001:20001(2000) ack 1
+0.000 > . 20001:22001(2000) ack 1
+0.000 > . 22001:24001(2000) ack 1
+0.000 > . 24001:26001(2000) ack 1
+0.000 > . 26001:28001(2000) ack 1
+0.000 > P. 28001:30001(2000) ack 1
+0.010 < . 1:1(0) ack 30001 win 2000
+0.000 write(4, ..., 40000) = 40000
+0.000 > . 30001:32001(2000) ack 1
+0.000 > . 32001:34001(2000) ack 1
+0.000 > . 34001:36001(2000) ack 1
+0.000 > . 36001:38001(2000) ack 1
+0.000 > . 38001:40001(2000) ack 1
+0.000 > . 40001:42001(2000) ack 1
+0.000 > . 42001:44001(2000) ack 1
+0.000 > . 44001:46001(2000) ack 1
+0.000 > . 46001:48001(2000) ack 1
+0.000 > . 48001:50001(2000) ack 1
+0.000 > . 50001:52001(2000) ack 1
+0.000 > . 52001:54001(2000) ack 1
+0.000 > . 54001:56001(2000) ack 1
+0.000 > . 56001:58001(2000) ack 1
+0.000 > . 58001:60001(2000) ack 1
+0.000 > . 60001:62001(2000) ack 1
+0.000 > . 62001:64001(2000) ack 1
+0.000 > . 64001:66001(2000) ack 1
+0.000 > . 66001:68001(2000) ack 1
+0.000 > P. 68001:70001(2000) ack 1

+0.000 `ss -nteim state established sport == :8080`

+0.120~+0.200 > P. 69001:70001(1000) ack 1
------------------------------------------------------------------------

I'm aware it's not a realistic test. It was written as a quick and simple
check of the pre-4.19 patch, but it shows that even TLP may not get
through.

Michal

^ permalink raw reply

* Re: [PATCH net-next iproute2 2/3] tc: Introduce tc ct action
From: Marcelo Ricardo Leitner @ 2019-07-11 17:40 UTC (permalink / raw)
  To: Paul Blakey
  Cc: Roi Dayan, John Hurley, Yossi, Oz Shlomo, netdev@vger.kernel.org,
	Aaron Conole, Rony Efraim, Justin Pettit, Jiri Pirko,
	nst-kernel@redhat.com, Simon Horman, Zhike Wang, David Miller,
	Kuperman
In-Reply-To: <5ded2e5b-958e-eca3-76ad-909ebf79234e@mellanox.com>

On Thu, Jul 11, 2019 at 07:21:51AM +0000, Paul Blakey wrote:
> 
> On 7/9/2019 6:36 PM, Marcelo Ricardo Leitner wrote:
> > On Tue, Jul 09, 2019 at 06:58:36AM +0000, Paul Blakey wrote:
> >> On 7/8/2019 8:54 PM, Marcelo Ricardo Leitner wrote:
> >>> On Sun, Jul 07, 2019 at 11:53:47AM +0300, Paul Blakey wrote:
> >>>> New tc action to send packets to conntrack module, commit
> >>>> them, and set a zone, labels, mark, and nat on the connection.
> >>>>
> >>>> It can also clear the packet's conntrack state by using clear.
> >>>>
> >>>> Usage:
> >>>>      ct clear
> >>>>      ct commit [force] [zone] [mark] [label] [nat]
> >>> Isn't the 'commit' also optional? More like
> >>>       ct [commit [force]] [zone] [mark] [label] [nat]
> >>>
> >>>>      ct [nat] [zone]
> >>>>
> >>>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> >>>> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >>>> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
> >>>> Acked-by: Jiri Pirko <jiri@mellanox.com>
> >>>> Acked-by: Roi Dayan <roid@mellanox.com>
> >>>> ---
> >>> ...
> >>>> +static void
> >>>> +usage(void)
> >>>> +{
> >>>> +	fprintf(stderr,
> >>>> +		"Usage: ct clear\n"
> >>>> +		"	ct commit [force] [zone ZONE] [mark MASKED_MARK] [label MASKED_LABEL] [nat NAT_SPEC]\n"
> >>> Ditto here then.
> >>
> >> In commit msg and here, it means there are multiple modes of operation. I
> >> think it's easier to split those.
> > Yep, that is good.
> > More below.
> >
> >> "ct clear" clears it; no other options can be added here.
> >>
> >> "ct commit  [force].... " sends to conntrack and commits a connection,
> >> and only with commit can you specify force, mark, label, and nat with
> >> nat_spec....
> >>
> >> and the last one, "ct [nat] [zone ZONE]" is to just send the packet to
> >> conntrack on some zone [optional], restore nat [optional].
> >>
> >>
> >>>> +		"	ct [nat] [zone ZONE]\n"
> >>>> +		"Where: ZONE is the conntrack zone table number\n"
> >>>> +		"	NAT_SPEC is {src|dst} addr addr1[-addr2] [port port1[-port2]]\n"
> >>>> +		"\n");
> >>>> +	exit(-1);
> >>>> +}
> >>> ...
> >>>
> >>> The validation below doesn't enforce that commit must be there for
> >>> such case.
> >> which case? commit is optional. the above are the three valid patterns.
> > That's the point. But the 2nd example is saying the 'commit' word is
> > mandatory in that mode. It is written as if it were a command that was
> > selected.
> >
> > One may use just:
> >      ct [zone]
> > And not
> >      ct commit [zone]
> > Right?
> 
> It is optional in the overall syntax.
> 
> 
> But I split it into modes:
> 
> clear, commit, and "restore" (I unofficially call it that because it is
> usually used to get the +est state on the packet and can restore nat; it
> doesn't actually restore anything for the first packet on the -trk rule)
> 
> It is mandatory in the second mode (commit). If you don't specify commit
> or clear, you can only use the third form, "restore", which sends the
> packet to ct on some optional zone and optionally restores nat (so we get
> ct [zone] [nat]).

I see. Thanks Paul.

  Marcelo

^ permalink raw reply

* RE: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
From: Prout, Andrew - LLSC - MITLL @ 2019-07-11 17:14 UTC (permalink / raw)
  To: Eric Dumazet, Christoph Paasch
  Cc: David S . Miller, netdev, Greg Kroah-Hartman, Jonathan Looney,
	Neal Cardwell, Tyler Hicks, Yuchung Cheng, Bruce Curtis,
	Jonathan Lemon, Dustin Marquess
In-Reply-To: <b1dfd327-a784-6609-3c83-dab42c3c7eda@gmail.com>

On 7/10/19 3:27 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On 7/10/19 8:53 PM, Prout, Andrew - LLSC - MITLL wrote:
>> 
>> Our initial rollout was v4.14.130, but I reproduced it with v4.14.132 as well, reliably for the samba test and once (not reliably) with the synthetic test I was trying. A patched v4.14.132 with this patch partially reverted (just the four lines from tcp_fragment deleted) passed the samba test.
>> 
>> The synthetic test was a pair of simple send/recv test programs under the following conditions:
>> -The send socket was non-blocking
>> -SO_SNDBUF set to 128KiB
>> -The receiver NIC was being flooded with traffic from multiple hosts (to induce packet loss/retransmits)
>> -Load was on both systems: a while(1) program spinning on each CPU core
>> -The receiver was on an older unaffected kernel
>> 
>
> Setting SO_SNDBUF to 128KB does not permit recovery from heavy losses,
> since skbs need to be allocated for retransmits.
>
> The bug we fixed allowed remote attackers to crash all linux hosts,
>
> I am afraid we have to enforce the real SO_SNDBUF limit, finally.
>
> Even a cushion of 128KB per socket is dangerous, for servers with millions of TCP sockets.
>
> You will either have to set SO_SNDBUF to higher values, or leave autotuning in place.
> Or revert the patches and allow attackers to hit you badly.

I in no way intended to imply that I had confirmed the small SO_SNDBUF was a prerequisite to our problem. With my synthetic test, I was looking for a concise reproducer and trying things to stress the system.

Unfortunately we're often stuck being forced to support very old code, right alongside the latest and greatest. We still run a lot of FORTRAN. Telling users en masse to search and revise their code is not an option for us.

In my opinion, if a small SO_SNDBUF below a certain value is no longer supported, then SOCK_MIN_SNDBUF should be adjusted to reflect this. The RCVBUF/SNDBUF sizes are supposed to be hints, no error is returned if they are not honored. The kernel should continue to function regardless of what userspace requests for their values.

Alternatively, a config option could be added. I am not concerned about DoS attacks, our system is not connected to the internet, and we shouldn't have to maintain an out-of-tree patch for basic functionality.

^ permalink raw reply

* Re: [bpf PATCH v2 2/6] bpf: tls fix transition through disconnect with close
From: John Fastabend @ 2019-07-11 16:47 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190710130411.08c54ddd@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Wed, 10 Jul 2019 12:34:17 -0700, Jakub Kicinski wrote:
> > > > > +		if (sk->sk_prot->unhash)
> > > > > +			sk->sk_prot->unhash(sk);
> > > > > +	}
> > > > > +
> > > > > +	ctx = tls_get_ctx(sk);
> > > > > +	if (ctx->tx_conf == TLS_SW || ctx->rx_conf == TLS_SW)
> > > > > +		tls_sk_proto_cleanup(sk, ctx, timeo);
> 
> Do we still need to hook into unhash? With patch 6 in place perhaps we
> can just do disconnect 🥺

?? "can just do a disconnect", not sure I follow. We still need unhash
in cases where we have a TLS socket transition from ESTABLISHED
to LISTEN state without calling close(). This is independent of whether
sockmap is running or not.

Originally, I thought this would be extremely rare but I did see it
in real applications on the sockmap side so presumably it is possible
here as well.

> 
> cleanup is going to kick off TX but also:
> 
> 	if (unlikely(sk->sk_write_pending) &&
> 	    !wait_on_pending_writer(sk, &timeo))
> 		tls_handle_open_record(sk, 0);
> 
> Are we guaranteed that sk_write_pending is 0?  Otherwise
> wait_on_pending_writer is hiding yet another release_sock() :(

Not seeing the path to release_sock() at the moment?

   tls_handle_open_record
     push_pending_record
      tls_sw_push_pending_record
        bpf_exec_tx_verdict

If bpf_exec_tx_verdict does a redirect we could hit a release but that
is another fix I have to get queued up shortly. I think we can fix
that in another series.

^ permalink raw reply

* [PATCH][bpf-next] bpf: verifier: avoid fall-through warnings
From: Gustavo A. R. Silva @ 2019-07-11 16:22 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko, Lawrence Brakmo
  Cc: netdev, bpf, linux-kernel, Gustavo A. R. Silva, Kees Cook

In preparation for enabling -Wimplicit-fallthrough, this patch silences
the following warning:

kernel/bpf/verifier.c: In function ‘check_return_code’:
kernel/bpf/verifier.c:6106:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (env->prog->expected_attach_type == BPF_CGROUP_UDP4_RECVMSG ||
      ^
kernel/bpf/verifier.c:6109:2: note: here
  case BPF_PROG_TYPE_CGROUP_SKB:
  ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

Notice that it is much clearer to explicitly add breaks in each case
statement (that actually contains some code), rather than letting
the code fall through.

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---

NOTE: -Wimplicit-fallthrough will be enabled globally in v5.3. So, I
      suggest you take this patch for 5.3-rc1.

 kernel/bpf/verifier.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a2e763703c30..44c3b947400e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6106,11 +6106,13 @@ static int check_return_code(struct bpf_verifier_env *env)
 		if (env->prog->expected_attach_type == BPF_CGROUP_UDP4_RECVMSG ||
 		    env->prog->expected_attach_type == BPF_CGROUP_UDP6_RECVMSG)
 			range = tnum_range(1, 1);
+		break;
 	case BPF_PROG_TYPE_CGROUP_SKB:
 		if (env->prog->expected_attach_type == BPF_CGROUP_INET_EGRESS) {
 			range = tnum_range(0, 3);
 			enforce_attach_type_range = tnum_range(2, 3);
 		}
+		break;
 	case BPF_PROG_TYPE_CGROUP_SOCK:
 	case BPF_PROG_TYPE_SOCK_OPS:
 	case BPF_PROG_TYPE_CGROUP_DEVICE:
-- 
2.21.0


^ permalink raw reply related

* Re: [bpf PATCH v2 6/6] bpf: sockmap/tls, close can race with map free
From: John Fastabend @ 2019-07-11 16:39 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190710123543.04846e00@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Tue, 09 Jul 2019 20:33:58 -0700, John Fastabend wrote:
> > Jakub Kicinski wrote:
> > > On Mon, 08 Jul 2019 19:15:18 +0000, John Fastabend wrote:  
> > > > @@ -352,15 +354,18 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
> > > >  	if (ctx->tx_conf == TLS_BASE && ctx->rx_conf == TLS_BASE)
> > > >  		goto skip_tx_cleanup;
> > > >  
> > > > -	sk->sk_prot = ctx->sk_proto;
> > > >  	tls_sk_proto_cleanup(sk, ctx, timeo);
> > > >  
> > > >  skip_tx_cleanup:
> > > > +	write_lock_bh(&sk->sk_callback_lock);
> > > > +	icsk->icsk_ulp_data = NULL;  
> > > 
> > > Is ulp_data pointer now supposed to be updated under the
> > > sk_callback_lock?  
> > 
> > Yes otherwise it can race with tls_update(). I didn't remove the
> > ulp pointer null set from tcp_ulp.c though. Could be done in this
> > patch or as a follow up.
> 
> Do we need to hold the lock in unhash, too, or is unhash called with
> sk_callback_lock held?
> 

We should hold the lock here. Also we should reset sk_prot similar to
other paths in case we get here without a close() call. syzbot hasn't
found that path yet but I'll add some tests for it.

	write_lock_bh(...)
	icsk_ulp_data = NULL
	sk->sk_prot = ctx->sk_proto;
	write_unlock_bh(...)

Thanks

^ permalink raw reply

* Re: [bpf PATCH v2 2/6] bpf: tls fix transition through disconnect with close
From: John Fastabend @ 2019-07-11 16:35 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190710123417.2157a459@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Tue, 09 Jul 2019 20:39:24 -0700, John Fastabend wrote:
> > Jakub Kicinski wrote:
> > > On Mon, 08 Jul 2019 19:14:05 +0000, John Fastabend wrote:  
> > > > @@ -287,6 +313,27 @@ static void tls_sk_proto_cleanup(struct sock *sk,
> > > >  #endif
> > > >  }
> > > >  
> > > > +static void tls_sk_proto_unhash(struct sock *sk)
> > > > +{
> > > > +	struct inet_connection_sock *icsk = inet_csk(sk);
> > > > +	long timeo = sock_sndtimeo(sk, 0);
> > > > +	struct tls_context *ctx;
> > > > +
> > > > +	if (unlikely(!icsk->icsk_ulp_data)) {  
> > > 
> > > Is this for when sockmap is stacked on top of TLS and TLS got removed
> > > without letting sockmap know?  
> > 
> > Right, it's a pattern I used on the sockmap side and put here. But
> > I dropped the patch to let sockmap stack on top of TLS because
> > it was more than a fix IMO. We could probably drop this check; on
> > the other hand, it's harmless.
> 
> I feel like this code is pretty complex; I struggle to follow all the
> paths, so perhaps it'd be better to drop stuff that's not necessary
> to have a clearer picture.
> 

Sure, I can drop it and add it later when it's necessary.

> > > > +		if (sk->sk_prot->unhash)
> > > > +			sk->sk_prot->unhash(sk);
> > > > +	}
> > > > +
> > > > +	ctx = tls_get_ctx(sk);
> > > > +	if (ctx->tx_conf == TLS_SW || ctx->rx_conf == TLS_SW)
> > > > +		tls_sk_proto_cleanup(sk, ctx, timeo);
> > > > +	icsk->icsk_ulp_data = NULL;  
> > > 
> > > I think close only starts checking if ctx is NULL in patch 6.
> > > Looks like some chunks of ctx checking/clearing got spread to
> > > patch 1 and some to patch 6.  
> > 
> > Yeah, I thought the patches were easier to read this way but
> > maybe not. Could add something in the commit log.
> 
> Ack! Let me try to get a full grip of patches 2 and 6 and come back 
> to this.
> 
> > > > +	tls_ctx_free_wq(ctx);
> > > > +
> > > > +	if (ctx->unhash)
> > > > +		ctx->unhash(sk);
> > > > +}
> > > > +
> > > >  static void tls_sk_proto_close(struct sock *sk, long timeout)
> > > >  {
> > > >  	struct tls_context *ctx = tls_get_ctx(sk);  

^ permalink raw reply

* [PATCH net-next 1/1] tc-tests: updated skbedit tests
From: Roman Mashak @ 2019-07-11 16:29 UTC (permalink / raw)
  To: davem; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, Roman Mashak

- Added mask upper bound test case
- Added mask validation test case
- Added mask replacement case
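
The four add cases above (valid mask, 32-bit maximum, overflow, invalid
format) can be modelled with a small userspace parser. This is only a
sketch of the behavior the tests expect; parse_u32() and
parse_mark_mask() are hypothetical names and are not taken from the tc
sources.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse one unsigned 32-bit field, 0 on success. */
static int parse_u32(const char *s, uint32_t *out)
{
	unsigned long long v;
	char *end;

	assert(out);
	/* Require a leading digit so "-1234" and "" are rejected up front. */
	if (!s || !isdigit((unsigned char)*s))
		return -1;

	errno = 0;
	v = strtoull(s, &end, 0);	/* base 0 accepts decimal and 0x... */
	if (errno == ERANGE || end == s || *end != '\0' || v > UINT32_MAX)
		return -1;

	*out = (uint32_t)v;
	return 0;
}

/* Hypothetical helper: parse "MARK[/MASK]"; the mask defaults to all ones. */
static int parse_mark_mask(const char *arg, uint32_t *mark, uint32_t *mask)
{
	char buf[64];
	char *slash;

	assert(mark && mask);
	if (!arg || strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);

	*mask = UINT32_MAX;
	slash = strchr(buf, '/');
	if (slash) {
		*slash = '\0';
		if (parse_u32(slash + 1, mask))
			return -1;
	}
	return parse_u32(buf, mark);
}
```

With this model, "1/0xaabb" and "1/0xffffffff" parse, while
"1/0xaabbccddeeff112233" and "1/-1234" are rejected, which matches the
expExitCode values in the new test cases.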

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
 .../tc-testing/tc-tests/actions/skbedit.json       | 117 +++++++++++++++++++++
 1 file changed, 117 insertions(+)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json b/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
index 45e7e89928a5..bf5ebf59c2d4 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/skbedit.json
@@ -70,6 +70,123 @@
         "teardown": []
     },
     {
+        "id": "d4cd",
+        "name": "Add skbedit action with valid mark and mask",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/0xaabb",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xaabb",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action skbedit"
+        ]
+    },
+    {
+        "id": "baa7",
+        "name": "Add skbedit action with valid mark and 32-bit maximum mask",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/0xffffffff",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xffffffff",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action skbedit"
+        ]
+    },
+    {
+        "id": "62a5",
+        "name": "Add skbedit action with valid mark and mask exceeding 32-bit maximum",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/0xaabbccddeeff112233",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xaabbccddeeff112233",
+        "matchCount": "0",
+        "teardown": []
+    },
+    {
+        "id": "bc15",
+        "name": "Add skbedit action with valid mark and mask with invalid format",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbedit mark 1/-1234",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/-1234",
+        "matchCount": "0",
+        "teardown": []
+    },
+    {
+        "id": "57c2",
+        "name": "Replace skbedit action with new mask",
+        "category": [
+            "actions",
+            "skbedit"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbedit",
+                0,
+                1,
+                255
+            ],
+            "$TC actions add action skbedit mark 1/0x11223344 index 1"
+        ],
+        "cmdUnderTest": "$TC actions replace action skbedit mark 1/0xaabb index 1",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action skbedit",
+        "matchPattern": "action order [0-9]*: skbedit  mark 1/0xaabb",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action skbedit"
+        ]
+    },
+    {
         "id": "081d",
         "name": "Add skbedit action with priority",
         "category": [
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH] libertas: add terminating entry to fw_table
From: Sergei Shtylyov @ 2019-07-11 15:54 UTC (permalink / raw)
  To: Oliver Neukum, davem, netdev
In-Reply-To: <20190711142744.31956-1-oneukum@suse.com>

Hello!

On 07/11/2019 05:27 PM, Oliver Neukum wrote:

> In case no firmware was found, the system would happily read
> and try to load garbage. Terminate the table properly.
> 
> Signed-off-by: Oliver Neukum <oneukum@suse.com>
> Fixes: ce84bb69f50e6 ("libertas USB: convert to asynchronous firmware loading")

   The Fixes: tag should precede the sign-offs, according to DaveM...

> Reported-by: syzbot+8a8f48672560c8ca59dd@syzkaller.appspotmail.com

   That should be the 1st tag, I think...

> ---
>  drivers/net/wireless/marvell/libertas/if_usb.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c
> index f1622f0ff8c9..b79c65547f4c 100644
> --- a/drivers/net/wireless/marvell/libertas/if_usb.c
> +++ b/drivers/net/wireless/marvell/libertas/if_usb.c
> @@ -50,7 +50,10 @@ static const struct lbs_fw_table fw_table[] = {
>  	{ MODEL_8388, "libertas/usb8388_v5.bin", NULL },
>  	{ MODEL_8388, "libertas/usb8388.bin", NULL },
>  	{ MODEL_8388, "usb8388.bin", NULL },
> -	{ MODEL_8682, "libertas/usb8682.bin", NULL }
> +	{ MODEL_8682, "libertas/usb8682.bin", NULL },
> +
> +	/*terminating entry - keep at end */

   Why no space after /* ?

> +	{ MODEL_8388, NULL, NULL }
>  };
>  
>  static const struct usb_device_id if_usb_table[] = {
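
For context, a minimal userspace sketch of why the terminating entry
matters, assuming the loader walks the table until it hits an entry
with a NULL firmware name. struct demo_fw_entry, demo_fw_table and
demo_find_fw() are made-up names, not the libertas ones.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_MODEL_8388 = 1, DEMO_MODEL_8682 = 2 };

struct demo_fw_entry {
	int model;
	const char *fwname;
};

static const struct demo_fw_entry demo_fw_table[] = {
	{ DEMO_MODEL_8388, "libertas/usb8388_v5.bin" },
	{ DEMO_MODEL_8682, "libertas/usb8682.bin" },

	/* terminating entry - keep at end */
	{ 0, NULL }
};

/* Walk the table until the sentinel; without it the loop would run
 * past the end of the array and read whatever follows in memory.
 */
static const char *demo_find_fw(const struct demo_fw_entry *tbl, int model)
{
	assert(tbl);
	for (; tbl->fwname; tbl++)
		if (tbl->model == model)
			return tbl->fwname;
	return NULL;
}
```

A lookup for a model that is not listed returns NULL here instead of
reading beyond the array.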

MBR, Sergei

^ permalink raw reply

* Re: [PATCH v3 bpf] selftests/bpf: do not ignore clang failures
From: Ilya Leoshkevich @ 2019-07-11 15:00 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, Networking, Daniel Borkmann, Song Liu
In-Reply-To: <CAEf4Bzb6mY-F-wUNNimS+hMSRbJetTKXNcGDQbsJXhXDywA+tg@mail.gmail.com>

> On 11.07.2019 at 16:55, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Thu, Jul 11, 2019 at 2:14 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>> 
>> 
>> In addition, pull Kbuild.include in order to get .DELETE_ON_ERROR target,
> 
> In your original patch you explicitly declared .DELETE_ON_ERROR, but
> in this one you just include Kbuild.include.
> Is it enough to just include that file to get the desired behavior, or
> did you forget to add .DELETE_ON_ERROR?

It’s enough to just include Kbuild.include. I grepped the source tree
and found that no one else declares .DELETE_ON_ERROR explicitly, so I've
decided to avoid doing this as well.

^ permalink raw reply

* Re: [PATCH v3 bpf] selftests/bpf: do not ignore clang failures
From: Andrii Nakryiko @ 2019-07-11 14:55 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: bpf, Networking, Daniel Borkmann, Song Liu
In-Reply-To: <20190711091249.59865-1-iii@linux.ibm.com>

On Thu, Jul 11, 2019 at 2:14 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> When compiling an eBPF prog fails, make still returns 0, because
> failing clang command's output is piped to llc and therefore its
> exit status is ignored.
>
> When clang fails, pipe the string "clang failed" to llc. This will make
> llc fail with an informative error message. This solution was chosen
> over using pipefail, having separate targets or getting rid of llc
> invocation due to its simplicity.
>
> In addition, pull Kbuild.include in order to get .DELETE_ON_ERROR target,

In your original patch you explicitly declared .DELETE_ON_ERROR, but
in this one you just include Kbuild.include.
Is it enough to just include that file to get the desired behavior, or
did you forget to add .DELETE_ON_ERROR?

> which would cause partial .o files to be removed.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---

Thanks!

Acked-by: Andrii Nakryiko <andriin@fb.com>

> v1->v2: use intermediate targets instead of pipefail
> v2->v3: pipe "clang failed" instead of using intermediate targets
>
> tools/testing/selftests/bpf/Makefile | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index e36356e2377e..e375f399b7a6 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -1,4 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
> +include ../../../../scripts/Kbuild.include
>
>  LIBDIR := ../../../lib
>  BPFDIR := $(LIBDIR)/bpf
> @@ -185,8 +186,8 @@ $(ALU32_BUILD_DIR)/test_progs_32: prog_tests/*.c
>
>  $(ALU32_BUILD_DIR)/%.o: progs/%.c $(ALU32_BUILD_DIR) \
>                                         $(ALU32_BUILD_DIR)/test_progs_32
> -       $(CLANG) $(CLANG_FLAGS) \
> -                -O2 -target bpf -emit-llvm -c $< -o - |      \
> +       ($(CLANG) $(CLANG_FLAGS) -O2 -target bpf -emit-llvm -c $< -o - || \
> +               echo "clang failed") | \
>         $(LLC) -march=bpf -mattr=+alu32 -mcpu=$(CPU) $(LLC_FLAGS) \
>                 -filetype=obj -o $@
>  ifeq ($(DWARF2BTF),y)
> @@ -197,16 +198,16 @@ endif
>  # Have one program compiled without "-target bpf" to test whether libbpf loads
>  # it successfully
>  $(OUTPUT)/test_xdp.o: progs/test_xdp.c
> -       $(CLANG) $(CLANG_FLAGS) \
> -               -O2 -emit-llvm -c $< -o - | \
> +       ($(CLANG) $(CLANG_FLAGS) -O2 -emit-llvm -c $< -o - || \
> +               echo "clang failed") | \
>         $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
>  ifeq ($(DWARF2BTF),y)
>         $(BTF_PAHOLE) -J $@
>  endif
>
>  $(OUTPUT)/%.o: progs/%.c
> -       $(CLANG) $(CLANG_FLAGS) \
> -                -O2 -target bpf -emit-llvm -c $< -o - |      \
> +       ($(CLANG) $(CLANG_FLAGS) -O2 -target bpf -emit-llvm -c $< -o - || \
> +               echo "clang failed") | \
>         $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
>  ifeq ($(DWARF2BTF),y)
>         $(BTF_PAHOLE) -J $@
> --
> 2.21.0
>
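
The exit-status behavior described in the commit message can be
reproduced outside of make. The sketch below is an illustration only:
it runs shell pipelines through system(), with "false" standing in for
a failing clang and grep standing in for a consumer such as llc that
rejects the marker string. The helper names are made up.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Run a command line through the shell; return its exit code or -1. */
static int demo_sh_exit_code(const char *cmd)
{
	int status;

	assert(cmd);
	status = system(cmd);
	if (status == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* The producer fails, but the pipeline reports the consumer's status. */
static int demo_unguarded(void)
{
	return demo_sh_exit_code("false | cat >/dev/null");
}

/* On failure the producer emits a marker that the consumer rejects. */
static int demo_guarded_failure(void)
{
	return demo_sh_exit_code(
		"(false || echo 'clang failed') | grep -qv 'clang failed'");
}

/* On success the marker is never printed and the consumer is happy. */
static int demo_guarded_success(void)
{
	return demo_sh_exit_code(
		"(echo 'fake IR' || echo 'clang failed') | grep -qv 'clang failed'");
}
```

The unguarded pipeline exits with 0 even though its first command
failed; the guarded one exits non-zero only in the failure case.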

^ permalink raw reply

* Re: [PATCH bpf-next] selftests/bpf: remove logic duplication in test_verifier.c
From: Andrii Nakryiko @ 2019-07-11 14:43 UTC (permalink / raw)
  To: Krzesimir Nowak
  Cc: Andrii Nakryiko, Kernel Team, Alexei Starovoitov, Daniel Borkmann,
	bpf, Networking
In-Reply-To: <CAGGp+cETuvWUwET=6Mq5sWTJhi5+Rs2bw8xNP2NYZXAAuc6-Og@mail.gmail.com>

On Thu, Jul 11, 2019 at 5:13 AM Krzesimir Nowak <krzesimir@kinvolk.io> wrote:
>
> On Thu, Jul 11, 2019 at 3:08 AM Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > test_verifier tests can specify single- and multi-run tests. Internally,
> > the logic for handling them is duplicated. Get rid of the duplication by
> > making the single-run retval specification the first retvals spec.
> >
> > Cc: Krzesimir Nowak <krzesimir@kinvolk.io>
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>
> Looks good, one nit below.
>
> Acked-by: Krzesimir Nowak <krzesimir@kinvolk.io>
>
> > ---
> >  tools/testing/selftests/bpf/test_verifier.c | 37 ++++++++++-----------
> >  1 file changed, 18 insertions(+), 19 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> > index b0773291012a..120ecdf4a7db 100644
> > --- a/tools/testing/selftests/bpf/test_verifier.c
> > +++ b/tools/testing/selftests/bpf/test_verifier.c
> > @@ -86,7 +86,7 @@ struct bpf_test {
> >         int fixup_sk_storage_map[MAX_FIXUPS];
> >         const char *errstr;
> >         const char *errstr_unpriv;
> > -       uint32_t retval, retval_unpriv, insn_processed;
> > +       uint32_t insn_processed;
> >         int prog_len;
> >         enum {
> >                 UNDEF,
> > @@ -95,16 +95,24 @@ struct bpf_test {
> >         } result, result_unpriv;
> >         enum bpf_prog_type prog_type;
> >         uint8_t flags;
> > -       __u8 data[TEST_DATA_LEN];
> >         void (*fill_helper)(struct bpf_test *self);
> >         uint8_t runs;
> > -       struct {
> > -               uint32_t retval, retval_unpriv;
> > -               union {
> > -                       __u8 data[TEST_DATA_LEN];
> > -                       __u64 data64[TEST_DATA_LEN / 8];
> > +       union {
> > +               struct {
>
> Maybe consider moving the struct definition outside to further the
> removal of the duplication?

Can't do that because then retval/retval_unpriv/data won't be
accessible as a normal field of struct bpf_test. It has to be in
anonymous structs/unions, unfortunately.

I tried the following, but that also didn't work:

union {
    struct bpf_test_retval {
        uint32_t retval, retval_unpriv;
        union {
            __u8 data[TEST_DATA_LEN];
            __u64 data64[TEST_DATA_LEN / 8];
        };
    };
    struct bpf_test_retval retvals[MAX_TEST_RUNS];
};

With this variant, retval/retval_unpriv also did not behave as normal
fields of struct bpf_test.
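
To illustrate the layout being discussed, here is a reduced,
self-contained sketch. struct demo_test and demo_expected_retval() are
simplified stand-ins, not the real test_verifier.c definitions. As far
as I understand C11, only a struct member with neither a tag nor a name
counts as an anonymous struct, which would explain why the tagged
variant above does not expose its fields.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DATA_LEN 64
#define DEMO_MAX_RUNS 8

struct demo_test {
	uint8_t runs;
	union {
		/* No tag and no member name: the fields are reachable
		 * directly as t->retval, t->retval_unpriv, t->data.
		 */
		struct {
			uint32_t retval, retval_unpriv;
			uint8_t data[DEMO_DATA_LEN];
		};
		/* Same layout again, this time as a named array member,
		 * so retvals[0] overlays the anonymous struct.
		 */
		struct {
			uint32_t retval, retval_unpriv;
			uint8_t data[DEMO_DATA_LEN];
		} retvals[DEMO_MAX_RUNS];
	};
};

/* A single-run test is treated as a multi-run test with one run. */
static uint32_t demo_expected_retval(struct demo_test *t, int run, int unpriv)
{
	if (!t->runs)
		t->runs = 1;
	assert(run >= 0 && run < t->runs);

	if (unpriv && t->retvals[run].retval_unpriv)
		return t->retvals[run].retval_unpriv;
	return t->retvals[run].retval;
}
```

A test that only sets .retval is then picked up through retvals[0]
without a separate single-run code path.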


>
> > +                       uint32_t retval, retval_unpriv;
> > +                       union {
> > +                               __u8 data[TEST_DATA_LEN];
> > +                               __u64 data64[TEST_DATA_LEN / 8];
> > +                       };
> >                 };
> > -       } retvals[MAX_TEST_RUNS];
> > +               struct {
> > +                       uint32_t retval, retval_unpriv;
> > +                       union {
> > +                               __u8 data[TEST_DATA_LEN];
> > +                               __u64 data64[TEST_DATA_LEN / 8];
> > +                       };
> > +               } retvals[MAX_TEST_RUNS];
> > +       };
> >         enum bpf_attach_type expected_attach_type;
> >  };
> >
> > @@ -949,17 +957,8 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
> >                 uint32_t expected_val;
> >                 int i;
> >
> > -               if (!test->runs) {
> > -                       expected_val = unpriv && test->retval_unpriv ?
> > -                               test->retval_unpriv : test->retval;
> > -
> > -                       err = do_prog_test_run(fd_prog, unpriv, expected_val,
> > -                                              test->data, sizeof(test->data));
> > -                       if (err)
> > -                               run_errs++;
> > -                       else
> > -                               run_successes++;
> > -               }
> > +               if (!test->runs)
> > +                       test->runs = 1;
> >
> >                 for (i = 0; i < test->runs; i++) {
> >                         if (unpriv && test->retvals[i].retval_unpriv)
> > --
> > 2.17.1
> >
>
>
> --
> Kinvolk GmbH | Adalbertstr.6a, 10999 Berlin | tel: +491755589364
> Geschäftsführer/Directors: Alban Crequy, Chris Kühl, Iago López Galeiras
> Registergericht/Court of registration: Amtsgericht Charlottenburg
> Registernummer/Registration number: HRB 171414 B
> Ust-ID-Nummer/VAT ID number: DE302207000

^ permalink raw reply

* Re: Re: Re: linux-next: build failure after merge of the net-next tree
From: Jason Gunthorpe @ 2019-07-11 14:33 UTC (permalink / raw)
  To: Bernard Metzler
  Cc: Leon Romanovsky, Stephen Rothwell, Doug Ledford, David Miller,
	Networking, Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <OF9A485648.9C7A28A3-ON00258434.00449B07-00258434.00449B14@notes.na.collabserv.com>

On Thu, Jul 11, 2019 at 12:29:21PM +0000, Bernard Metzler wrote:
> 
> >To: "Bernard Metzler" <BMT@zurich.ibm.com>
> >From: "Jason Gunthorpe" <jgg@mellanox.com>
> >Date: 07/11/2019 01:53PM
> >Cc: "Leon Romanovsky" <leon@kernel.org>, "Stephen Rothwell"
> ><sfr@canb.auug.org.au>, "Doug Ledford" <dledford@redhat.com>, "David
> >Miller" <davem@davemloft.net>, "Networking" <netdev@vger.kernel.org>,
> >"Linux Next Mailing List" <linux-next@vger.kernel.org>, "Linux Kernel
> >Mailing List" <linux-kernel@vger.kernel.org>
> >Subject: [EXTERNAL] Re: Re: linux-next: build failure after merge of
> >the net-next tree
> >
> >On Thu, Jul 11, 2019 at 08:00:49AM +0000, Bernard Metzler wrote:
> >
> >> That listen will not sleep. The socket is just marked
> >> listening. 
> >
> >Eh? siw_listen_address() calls siw_cep_alloc() which does:
> >
> >	struct siw_cep *cep = kzalloc(sizeof(*cep), GFP_KERNEL);
> >
> >Which is sleeping. Many other cases too.
> >
> >Jason
> >
> >
> Ah, true! I was after really deep sleeps like user level
> socket accept() calls ;) So you are correct of course.

I've added this patch to the rdma tree to fix the missing locking.

The merge resolution will simply be to replace
for_ifa with in_dev_for_each_ifa_rtnl.

Jason

From c421651fa2295d1219c36674c7eb8c574542ceea Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@mellanox.com>
Date: Thu, 11 Jul 2019 11:29:42 -0300
Subject: [PATCH] RDMA/siw: Add missing rtnl_lock around access to ifa

ifa is protected by rcu or rtnl, add the missing locking. In this case we
have to use rtnl since siw_listen_address() is sleeping.

Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/sw/siw/siw_cm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 8e618cb7261f62..c25be723c15b64 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1975,6 +1975,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
 			id, &s_laddr.sin_addr, ntohs(s_laddr.sin_port),
 			&s_raddr->sin_addr, ntohs(s_raddr->sin_port));
 
+		rtnl_lock();
 		for_ifa(in_dev)
 		{
 			if (ipv4_is_zeronet(s_laddr.sin_addr.s_addr) ||
@@ -1989,6 +1990,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog)
 			}
 		}
 		endfor_ifa(in_dev);
+		rtnl_unlock();
 		in_dev_put(in_dev);
 	} else if (id->local_addr.ss_family == AF_INET6) {
 		struct inet6_dev *in6_dev = in6_dev_get(dev);
-- 
2.21.0


^ permalink raw reply related

* [PATCH v4 bpf-next 4/4] selftests/bpf: fix compiling loop{1,2,3}.c on s390
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

Use PT_REGS_RC(ctx) instead of ctx->rax, which is not present on s390.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/progs/loop1.c | 2 +-
 tools/testing/selftests/bpf/progs/loop2.c | 2 +-
 tools/testing/selftests/bpf/progs/loop3.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/loop1.c b/tools/testing/selftests/bpf/progs/loop1.c
index dea395af9ea9..7cdb7f878310 100644
--- a/tools/testing/selftests/bpf/progs/loop1.c
+++ b/tools/testing/selftests/bpf/progs/loop1.c
@@ -18,7 +18,7 @@ int nested_loops(volatile struct pt_regs* ctx)
 	for (j = 0; j < 300; j++)
 		for (i = 0; i < j; i++) {
 			if (j & 1)
-				m = ctx->rax;
+				m = PT_REGS_RC(ctx);
 			else
 				m = j;
 			sum += i * m;
diff --git a/tools/testing/selftests/bpf/progs/loop2.c b/tools/testing/selftests/bpf/progs/loop2.c
index 0637bd8e8bcf..9b2f808a2863 100644
--- a/tools/testing/selftests/bpf/progs/loop2.c
+++ b/tools/testing/selftests/bpf/progs/loop2.c
@@ -16,7 +16,7 @@ int while_true(volatile struct pt_regs* ctx)
 	int i = 0;
 
 	while (true) {
-		if (ctx->rax & 1)
+		if (PT_REGS_RC(ctx) & 1)
 			i += 3;
 		else
 			i += 7;
diff --git a/tools/testing/selftests/bpf/progs/loop3.c b/tools/testing/selftests/bpf/progs/loop3.c
index 30a0f6cba080..d727657d51e2 100644
--- a/tools/testing/selftests/bpf/progs/loop3.c
+++ b/tools/testing/selftests/bpf/progs/loop3.c
@@ -16,7 +16,7 @@ int while_true(volatile struct pt_regs* ctx)
 	__u64 i = 0, sum = 0;
 	do {
 		i++;
-		sum += ctx->rax;
+		sum += PT_REGS_RC(ctx);
 	} while (i < 0x100000000ULL);
 	return sum;
 }
-- 
2.21.0


^ permalink raw reply related

* [PATCH v4 bpf-next 3/4] selftests/bpf: make PT_REGS_* work in userspace
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

Right now, on certain architectures, these macros are usable only with
kernel headers. This patch makes it possible to use them with userspace
headers and, as a consequence, not only in BPF samples, but also in BPF
selftests.

On s390, provide the forward declaration of struct pt_regs and cast it
to user_pt_regs in PT_REGS_* macros. This is necessary, because instead
of the full struct pt_regs, s390 exposes only its first member
user_pt_regs to userspace, and bpf_helpers.h is used with both userspace
(in selftests) and kernel (in samples) headers. It was added in commit
466698e654e8 ("s390/bpf: correct broken uapi for
BPF_PROG_TYPE_PERF_EVENT program type").

Ditto on arm64.

On x86, provide userspace versions of PT_REGS_* macros. Unlike s390 and
arm64, x86 provides struct pt_regs to both userspace and kernel, however,
with different member names.
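
A reduced sketch of the cast-based approach, for illustration only.
demo_user_pt_regs is a mock of the type userspace sees and the DEMO_*
macros are made-up names; the real definitions are in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the register block exposed to userspace (assumption). */
typedef struct {
	uint64_t psw_addr;
	uint64_t gprs[16];
} demo_user_pt_regs;

/* Forward declaration only: the full layout is not visible here. */
struct pt_regs;

/* Cast the opaque pointer to the type whose members are known. */
#define DEMO_REGS const volatile demo_user_pt_regs
#define DEMO_PT_REGS_PARM1(x) (((DEMO_REGS *)(x))->gprs[2])
#define DEMO_PT_REGS_RC(x) (((DEMO_REGS *)(x))->gprs[2])
#define DEMO_PT_REGS_SP(x) (((DEMO_REGS *)(x))->gprs[15])

/* Program code keeps using struct pt_regs * and never names a register. */
static uint64_t demo_read_rc(const volatile struct pt_regs *ctx)
{
	assert(ctx);
	return DEMO_PT_REGS_RC(ctx);
}
```

The cast is only valid because the pointed-to object really starts with
the user-visible register block, which is the property the s390 and
arm64 paragraphs above rely on.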

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h | 75 +++++++++++++++++------
 1 file changed, 55 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 73071a94769a..27090d94afb6 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -358,6 +358,7 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 
 #if defined(bpf_target_x86)
 
+#ifdef __KERNEL__
 #define PT_REGS_PARM1(x) ((x)->di)
 #define PT_REGS_PARM2(x) ((x)->si)
 #define PT_REGS_PARM3(x) ((x)->dx)
@@ -368,19 +369,49 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #define PT_REGS_RC(x) ((x)->ax)
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
+#else
+#ifdef __i386__
+/* i386 kernel is built with -mregparm=3 */
+#define PT_REGS_PARM1(x) ((x)->eax)
+#define PT_REGS_PARM2(x) ((x)->edx)
+#define PT_REGS_PARM3(x) ((x)->ecx)
+#define PT_REGS_PARM4(x) 0
+#define PT_REGS_PARM5(x) 0
+#define PT_REGS_RET(x) ((x)->esp)
+#define PT_REGS_FP(x) ((x)->ebp)
+#define PT_REGS_RC(x) ((x)->eax)
+#define PT_REGS_SP(x) ((x)->esp)
+#define PT_REGS_IP(x) ((x)->eip)
+#else
+#define PT_REGS_PARM1(x) ((x)->rdi)
+#define PT_REGS_PARM2(x) ((x)->rsi)
+#define PT_REGS_PARM3(x) ((x)->rdx)
+#define PT_REGS_PARM4(x) ((x)->rcx)
+#define PT_REGS_PARM5(x) ((x)->r8)
+#define PT_REGS_RET(x) ((x)->rsp)
+#define PT_REGS_FP(x) ((x)->rbp)
+#define PT_REGS_RC(x) ((x)->rax)
+#define PT_REGS_SP(x) ((x)->rsp)
+#define PT_REGS_IP(x) ((x)->rip)
+#endif
+#endif
 
 #elif defined(bpf_target_s390)
 
-#define PT_REGS_PARM1(x) ((x)->gprs[2])
-#define PT_REGS_PARM2(x) ((x)->gprs[3])
-#define PT_REGS_PARM3(x) ((x)->gprs[4])
-#define PT_REGS_PARM4(x) ((x)->gprs[5])
-#define PT_REGS_PARM5(x) ((x)->gprs[6])
-#define PT_REGS_RET(x) ((x)->gprs[14])
-#define PT_REGS_FP(x) ((x)->gprs[11]) /* Works only with CONFIG_FRAME_POINTER */
-#define PT_REGS_RC(x) ((x)->gprs[2])
-#define PT_REGS_SP(x) ((x)->gprs[15])
-#define PT_REGS_IP(x) ((x)->psw.addr)
+/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
+struct pt_regs;
+#define PT_REGS_S390 const volatile user_pt_regs
+#define PT_REGS_PARM1(x) (((PT_REGS_S390 *)(x))->gprs[2])
+#define PT_REGS_PARM2(x) (((PT_REGS_S390 *)(x))->gprs[3])
+#define PT_REGS_PARM3(x) (((PT_REGS_S390 *)(x))->gprs[4])
+#define PT_REGS_PARM4(x) (((PT_REGS_S390 *)(x))->gprs[5])
+#define PT_REGS_PARM5(x) (((PT_REGS_S390 *)(x))->gprs[6])
+#define PT_REGS_RET(x) (((PT_REGS_S390 *)(x))->gprs[14])
+/* Works only with CONFIG_FRAME_POINTER */
+#define PT_REGS_FP(x) (((PT_REGS_S390 *)(x))->gprs[11])
+#define PT_REGS_RC(x) (((PT_REGS_S390 *)(x))->gprs[2])
+#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15])
+#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr)
 
 #elif defined(bpf_target_arm)
 
@@ -397,16 +428,20 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 
 #elif defined(bpf_target_arm64)
 
-#define PT_REGS_PARM1(x) ((x)->regs[0])
-#define PT_REGS_PARM2(x) ((x)->regs[1])
-#define PT_REGS_PARM3(x) ((x)->regs[2])
-#define PT_REGS_PARM4(x) ((x)->regs[3])
-#define PT_REGS_PARM5(x) ((x)->regs[4])
-#define PT_REGS_RET(x) ((x)->regs[30])
-#define PT_REGS_FP(x) ((x)->regs[29]) /* Works only with CONFIG_FRAME_POINTER */
-#define PT_REGS_RC(x) ((x)->regs[0])
-#define PT_REGS_SP(x) ((x)->sp)
-#define PT_REGS_IP(x) ((x)->pc)
+/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
+struct pt_regs;
+#define PT_REGS_ARM64 const volatile struct user_pt_regs
+#define PT_REGS_PARM1(x) (((PT_REGS_ARM64 *)(x))->regs[0])
+#define PT_REGS_PARM2(x) (((PT_REGS_ARM64 *)(x))->regs[1])
+#define PT_REGS_PARM3(x) (((PT_REGS_ARM64 *)(x))->regs[2])
+#define PT_REGS_PARM4(x) (((PT_REGS_ARM64 *)(x))->regs[3])
+#define PT_REGS_PARM5(x) (((PT_REGS_ARM64 *)(x))->regs[4])
+#define PT_REGS_RET(x) (((PT_REGS_ARM64 *)(x))->regs[30])
+/* Works only with CONFIG_FRAME_POINTER */
+#define PT_REGS_FP(x) (((PT_REGS_ARM64 *)(x))->regs[29])
+#define PT_REGS_RC(x) (((PT_REGS_ARM64 *)(x))->regs[0])
+#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp)
+#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc)
 
 #elif defined(bpf_target_mips)
 
-- 
2.21.0


^ permalink raw reply related

* [PATCH v4 bpf-next 2/4] selftests/bpf: fix s930 -> s390 typo
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

Also check for __s390__ instead of __s390x__, just in case bpf_helpers.h
is ever used by 32-bit userspace.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 5a3d92c8bec8..73071a94769a 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -315,8 +315,8 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #if defined(__TARGET_ARCH_x86)
 	#define bpf_target_x86
 	#define bpf_target_defined
-#elif defined(__TARGET_ARCH_s930x)
-	#define bpf_target_s930x
+#elif defined(__TARGET_ARCH_s390)
+	#define bpf_target_s390
 	#define bpf_target_defined
 #elif defined(__TARGET_ARCH_arm)
 	#define bpf_target_arm
@@ -341,8 +341,8 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #ifndef bpf_target_defined
 #if defined(__x86_64__)
 	#define bpf_target_x86
-#elif defined(__s390x__)
-	#define bpf_target_s930x
+#elif defined(__s390__)
+	#define bpf_target_s390
 #elif defined(__arm__)
 	#define bpf_target_arm
 #elif defined(__aarch64__)
@@ -369,7 +369,7 @@ static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 #define PT_REGS_SP(x) ((x)->sp)
 #define PT_REGS_IP(x) ((x)->ip)
 
-#elif defined(bpf_target_s390x)
+#elif defined(bpf_target_s390)
 
 #define PT_REGS_PARM1(x) ((x)->gprs[2])
 #define PT_REGS_PARM2(x) ((x)->gprs[3])
-- 
2.21.0


^ permalink raw reply related

* [PATCH v4 bpf-next 1/4] selftests/bpf: compile progs with -D__TARGET_ARCH_$(SRCARCH)
From: Ilya Leoshkevich @ 2019-07-11 14:29 UTC (permalink / raw)
  To: bpf, netdev; +Cc: ys114321, daniel, sdf, davem, ast, Ilya Leoshkevich
In-Reply-To: <20190711142930.68809-1-iii@linux.ibm.com>

This opens up the possibility of accessing registers in an
arch-independent way.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/Makefile | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 2620406a53ec..ad84450e4ab8 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
+include ../../../scripts/Makefile.arch
 
 LIBDIR := ../../../lib
 BPFDIR := $(LIBDIR)/bpf
@@ -138,7 +139,8 @@ CLANG_SYS_INCLUDES := $(shell $(CLANG) -v -E - </dev/null 2>&1 \
 
 CLANG_FLAGS = -I. -I./include/uapi -I../../../include/uapi \
 	      $(CLANG_SYS_INCLUDES) \
-	      -Wno-compare-distinct-pointer-types
+	      -Wno-compare-distinct-pointer-types \
+	      -D__TARGET_ARCH_$(SRCARCH)
 
 $(OUTPUT)/test_l4lb_noinline.o: CLANG_FLAGS += -fno-inline
 $(OUTPUT)/test_xdp_noinline.o: CLANG_FLAGS += -fno-inline
-- 
2.21.0


^ permalink raw reply related
