Netdev List
 help / color / mirror / Atom feed
* Re: linux-next: build failure after merge of the net-next tree
From: Leon Romanovsky @ 2019-07-10  5:20 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Doug Ledford, Jason Gunthorpe, David Miller, Networking,
	Linux Next Mailing List, Linux Kernel Mailing List,
	Bernard Metzler
In-Reply-To: <20190710143158.6e4bf706@canb.auug.org.au>

On Wed, Jul 10, 2019 at 02:31:58PM +1000, Stephen Rothwell wrote:
> Hi Leon,
>
> On Tue, 9 Jul 2019 09:43:46 +0300 Leon Romanovsky <leon@kernel.org> wrote:
> >
> > From 56c9e15ec670af580daa8c3ffde9503af3042d67 Mon Sep 17 00:00:00 2001
> > From: Leon Romanovsky <leonro@mellanox.com>
> > Date: Sun, 7 Jul 2019 10:43:42 +0300
> > Subject: [PATCH] Fixup to build SIW issue
> >
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>
> I applied this to linux-next today and it fixes my build problems.

Thanks

>
> --
> Cheers,
> Stephen Rothwell



^ permalink raw reply

* RE: [PATCH 09/12] rtw88: Fix misuse of GENMASK macro
From: Tony Chuang @ 2019-07-10  5:07 UTC (permalink / raw)
  To: Joe Perches, Andrew Morton
  Cc: Kalle Valo, David S. Miller, linux-wireless@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andy Huang
In-Reply-To: <0de52d891d7925b02f4f0fe2c750d076e55434d9.1562734889.git.joe@perches.com>

> Subject: [PATCH 09/12] rtw88: Fix misuse of GENMASK macro
> 
> Arguments are supposed to be ordered high then low.
> 
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
>  drivers/net/wireless/realtek/rtw88/rtw8822b.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtw88/rtw8822b.c
> b/drivers/net/wireless/realtek/rtw88/rtw8822b.c
> index 1172f6c0605b..d61d534396c7 100644
> --- a/drivers/net/wireless/realtek/rtw88/rtw8822b.c
> +++ b/drivers/net/wireless/realtek/rtw88/rtw8822b.c
> @@ -997,7 +997,7 @@ static void rtw8822b_do_iqk(struct rtw_dev *rtwdev)
>  	rtw_write_rf(rtwdev, RF_PATH_A, RF_DTXLOK, RFREG_MASK, 0x0);
> 
>  	reload = !!rtw_read32_mask(rtwdev, REG_IQKFAILMSK, BIT(16));
> -	iqk_fail_mask = rtw_read32_mask(rtwdev, REG_IQKFAILMSK,
> GENMASK(0, 7));
> +	iqk_fail_mask = rtw_read32_mask(rtwdev, REG_IQKFAILMSK,
> GENMASK(7, 0));
>  	rtw_dbg(rtwdev, RTW_DBG_PHY,
>  		"iqk counter=%d reload=%d do_iqk_cnt=%d
> n_iqk_fail(mask)=0x%02x\n",
>  		counter, reload, ++do_iqk_cnt, iqk_fail_mask);
> --

That's correct. Thanks.

Acked-by: Yan-Hsuan Chuang <yhchuang@realtek.com>

Yan-Hsuan

^ permalink raw reply

* [PATCH 00/12] treewide: Fix GENMASK misuses
From: Joe Perches @ 2019-07-10  5:04 UTC (permalink / raw)
  To: Andrew Morton, Patrick Venture, Nancy Yuen, Benjamin Fair,
	Andrew Jeffery, openbmc, linux-kernel, linux-aspeed,
	linux-arm-kernel, linux-amlogic, netdev, linux-mediatek,
	linux-stm32, linux-wireless, linux-media
  Cc: dri-devel, linux-iio, linux-mmc, devel, alsa-devel

These GENMASK uses are inverted argument order and the
actual masks produced are incorrect.  Fix them.

Add checkpatch tests to help avoid more misuses too.

Joe Perches (12):
  checkpatch: Add GENMASK tests
  clocksource/drivers/npcm: Fix misuse of GENMASK macro
  drm: aspeed_gfx: Fix misuse of GENMASK macro
  iio: adc: max9611: Fix misuse of GENMASK macro
  irqchip/gic-v3-its: Fix misuse of GENMASK macro
  mmc: meson-mx-sdio: Fix misuse of GENMASK macro
  net: ethernet: mediatek: Fix misuses of GENMASK macro
  net: stmmac: Fix misuses of GENMASK macro
  rtw88: Fix misuse of GENMASK macro
  phy: amlogic: G12A: Fix misuse of GENMASK macro
  staging: media: cedrus: Fix misuse of GENMASK macro
  ASoC: wcd9335: Fix misuse of GENMASK macro

 drivers/clocksource/timer-npcm7xx.c               |  2 +-
 drivers/gpu/drm/aspeed/aspeed_gfx.h               |  2 +-
 drivers/iio/adc/max9611.c                         |  2 +-
 drivers/irqchip/irq-gic-v3-its.c                  |  2 +-
 drivers/mmc/host/meson-mx-sdio.c                  |  2 +-
 drivers/net/ethernet/mediatek/mtk_eth_soc.h       |  2 +-
 drivers/net/ethernet/mediatek/mtk_sgmii.c         |  2 +-
 drivers/net/ethernet/stmicro/stmmac/descs.h       |  2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c |  4 ++--
 drivers/net/wireless/realtek/rtw88/rtw8822b.c     |  2 +-
 drivers/phy/amlogic/phy-meson-g12a-usb2.c         |  2 +-
 drivers/staging/media/sunxi/cedrus/cedrus_regs.h  |  2 +-
 scripts/checkpatch.pl                             | 15 +++++++++++++++
 sound/soc/codecs/wcd-clsh-v2.c                    |  2 +-
 14 files changed, 29 insertions(+), 14 deletions(-)

-- 
2.15.0


^ permalink raw reply

* Re: [PATCH v2 05/10] net: hisilicon: HI13X1_GMAX need dreq reset at first
From: Jiangfeng Xiao @ 2019-07-10  5:05 UTC (permalink / raw)
  To: Sergei Shtylyov, davem, robh+dt, yisen.zhuang, salil.mehta,
	mark.rutland, dingtianhong
  Cc: netdev, devicetree, linux-kernel, leeyou.li, nixiaoming,
	jianping.liu, xiekunxun
In-Reply-To: <101b8c68-75f5-00a7-9845-e59c0467768c@huawei.com>



On 2019/7/9 21:48, Jiangfeng Xiao wrote:
> 
> 
> On 2019/7/9 17:35, Sergei Shtylyov wrote:
>> Hello!
>>
>> On 09.07.2019 6:31, Jiangfeng Xiao wrote:
>>
[...]
>>> @@ -853,6 +867,15 @@ static int hip04_mac_probe(struct platform_device *pdev)
>>>           goto init_fail;
>>>       }
>>>   +#if defined(CONFIG_HI13X1_GMAC)
>>> +    res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>>> +    priv->sysctrl_base = devm_ioremap_resource(d, res);
>>
>>    There's devm_platform_ioremap_resource() now.
> 
> Thank you for your review, Great issue, which makes my code more concise.
> 
> I will fix it in v3. Or submit a patch to modify it separately, if maintainer
> applies this patch series.
> 
I decided to wait for this series of patches to sync to the mainline
and then fix this based on the mainline.

Because the mainline does not currently have this part of the code,
if I submit the changes, and the patch is accidentally merged into
another branch or another maintainer to handle, a conflict will occur.

As we all know, maintianer has to deal with many commits every day,
I don't want to increase the burden of maintainer.

So I decided to wait until the patch is synced to the mainline
and then modify it, which is more safe.


^ permalink raw reply

* [PATCH 07/12] net: ethernet: mediatek: Fix misuses of GENMASK macro
From: Joe Perches @ 2019-07-10  5:04 UTC (permalink / raw)
  To: Andrew Morton, Felix Fietkau, John Crispin, Sean Wang,
	Nelson Chang, Matthias Brugger
  Cc: David S. Miller, netdev, linux-arm-kernel, linux-mediatek,
	linux-kernel
In-Reply-To: <cover.1562734889.git.joe@perches.com>

Arguments are supposed to be ordered high then low.

Signed-off-by: Joe Perches <joe@perches.com>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.h | 2 +-
 drivers/net/ethernet/mediatek/mtk_sgmii.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index 876ce6798709..221012ecb845 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -712,7 +712,7 @@ struct mtk_soc_data {
 #define MTK_MAX_DEVS			2
 
 #define MTK_SGMII_PHYSPEED_AN          BIT(31)
-#define MTK_SGMII_PHYSPEED_MASK        GENMASK(0, 2)
+#define MTK_SGMII_PHYSPEED_MASK        GENMASK(2, 0)
 #define MTK_SGMII_PHYSPEED_1000        BIT(0)
 #define MTK_SGMII_PHYSPEED_2500        BIT(1)
 #define MTK_HAS_FLAGS(flags, _x)       (((flags) & (_x)) == (_x))
diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
index 136f90ce5a65..ff509d42d818 100644
--- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
+++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
@@ -82,7 +82,7 @@ int mtk_sgmii_setup_mode_force(struct mtk_sgmii *ss, int id)
 		return -EINVAL;
 
 	regmap_read(ss->regmap[id], ss->ana_rgc3, &val);
-	val &= ~GENMASK(2, 3);
+	val &= ~GENMASK(3, 2);
 	mode = ss->flags[id] & MTK_SGMII_PHYSPEED_MASK;
 	val |= (mode == MTK_SGMII_PHYSPEED_1000) ? 0 : BIT(2);
 	regmap_write(ss->regmap[id], ss->ana_rgc3, val);
-- 
2.15.0


^ permalink raw reply related

* [PATCH 09/12] rtw88: Fix misuse of GENMASK macro
From: Joe Perches @ 2019-07-10  5:04 UTC (permalink / raw)
  To: Andrew Morton, Yan-Hsuan Chuang
  Cc: Kalle Valo, David S. Miller, linux-wireless, netdev, linux-kernel
In-Reply-To: <cover.1562734889.git.joe@perches.com>

Arguments are supposed to be ordered high then low.

Signed-off-by: Joe Perches <joe@perches.com>
---
 drivers/net/wireless/realtek/rtw88/rtw8822b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtw88/rtw8822b.c b/drivers/net/wireless/realtek/rtw88/rtw8822b.c
index 1172f6c0605b..d61d534396c7 100644
--- a/drivers/net/wireless/realtek/rtw88/rtw8822b.c
+++ b/drivers/net/wireless/realtek/rtw88/rtw8822b.c
@@ -997,7 +997,7 @@ static void rtw8822b_do_iqk(struct rtw_dev *rtwdev)
 	rtw_write_rf(rtwdev, RF_PATH_A, RF_DTXLOK, RFREG_MASK, 0x0);
 
 	reload = !!rtw_read32_mask(rtwdev, REG_IQKFAILMSK, BIT(16));
-	iqk_fail_mask = rtw_read32_mask(rtwdev, REG_IQKFAILMSK, GENMASK(0, 7));
+	iqk_fail_mask = rtw_read32_mask(rtwdev, REG_IQKFAILMSK, GENMASK(7, 0));
 	rtw_dbg(rtwdev, RTW_DBG_PHY,
 		"iqk counter=%d reload=%d do_iqk_cnt=%d n_iqk_fail(mask)=0x%02x\n",
 		counter, reload, ++do_iqk_cnt, iqk_fail_mask);
-- 
2.15.0


^ permalink raw reply related

* [PATCH 08/12] net: stmmac: Fix misuses of GENMASK macro
From: Joe Perches @ 2019-07-10  5:04 UTC (permalink / raw)
  To: Andrew Morton, Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu,
	Maxime Coquelin, Maxime Ripard, Chen-Yu Tsai
  Cc: David S. Miller, netdev, linux-stm32, linux-arm-kernel,
	linux-kernel
In-Reply-To: <cover.1562734889.git.joe@perches.com>

Arguments are supposed to be ordered high then low.

Signed-off-by: Joe Perches <joe@perches.com>
---
 drivers/net/ethernet/stmicro/stmmac/descs.h       | 2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/descs.h b/drivers/net/ethernet/stmicro/stmmac/descs.h
index 10429b05f932..9f0b9a9e63b3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/descs.h
+++ b/drivers/net/ethernet/stmicro/stmmac/descs.h
@@ -123,7 +123,7 @@
 #define	ETDES1_BUFFER2_SIZE_SHIFT	16
 
 /* Extended Receive descriptor definitions */
-#define	ERDES4_IP_PAYLOAD_TYPE_MASK	GENMASK(2, 6)
+#define	ERDES4_IP_PAYLOAD_TYPE_MASK	GENMASK(6, 2)
 #define	ERDES4_IP_HDR_ERR		BIT(3)
 #define	ERDES4_IP_PAYLOAD_ERR		BIT(4)
 #define	ERDES4_IP_CSUM_BYPASSED		BIT(5)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
index 6d5cba4075eb..4c5db00c1d18 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -192,7 +192,7 @@ static const struct emac_variant emac_variant_h6 = {
 
 /* Used in RX_CTL1*/
 #define EMAC_RX_MD              BIT(1)
-#define EMAC_RX_TH_MASK		GENMASK(4, 5)
+#define EMAC_RX_TH_MASK		GENMASK(5, 4)
 #define EMAC_RX_TH_32		0
 #define EMAC_RX_TH_64		(0x1 << 4)
 #define EMAC_RX_TH_96		(0x2 << 4)
@@ -203,7 +203,7 @@ static const struct emac_variant emac_variant_h6 = {
 /* Used in TX_CTL1*/
 #define EMAC_TX_MD              BIT(1)
 #define EMAC_TX_NEXT_FRM        BIT(2)
-#define EMAC_TX_TH_MASK		GENMASK(8, 10)
+#define EMAC_TX_TH_MASK		GENMASK(10, 8)
 #define EMAC_TX_TH_64		0
 #define EMAC_TX_TH_128		(0x1 << 8)
 #define EMAC_TX_TH_192		(0x2 << 8)
-- 
2.15.0


^ permalink raw reply related

* [PATCH v2] net/mlx5e: Refactor switch statements to avoid using uninitialized variables
From: Nathan Chancellor @ 2019-07-10  4:47 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky
  Cc: David S. Miller, Boris Pismenny, netdev, linux-rdma, linux-kernel,
	clang-built-linux, Nathan Chancellor, Nick Desaulniers
In-Reply-To: <20190708231154.89969-1-natechancellor@gmail.com>

clang warns:

drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:251:2:
warning: variable 'rec_seq_sz' is used uninitialized whenever switch
default is taken [-Wsometimes-uninitialized]
        default:
        ^~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:255:46: note:
uninitialized use occurs here
        skip_static_post = !memcmp(rec_seq, &rn_be, rec_seq_sz);
                                                    ^~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:239:16: note:
initialize the variable 'rec_seq_sz' to silence this warning
        u16 rec_seq_sz;
                      ^
                       = 0
1 warning generated.

The default case statement should return in tx_post_resync_params like
in fill_static_params_ctx. However, as Nick and Leon point out, the
switch statements converted into if statements to clean up the code a
bit since there is only one cipher supported. Do that to clear up the
code.

Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Link: https://github.com/ClangBuiltLinux/linux/issues/590
Suggested-by: Leon Romanovsky <leon@kernel.org>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---

v1 -> v2:

* Refactor switch statements into if statements

 .../mellanox/mlx5/core/en_accel/ktls_tx.c     | 33 +++++++------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index 3f5f4317a22b..ea032f54197e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -25,23 +25,17 @@ static void
 fill_static_params_ctx(void *ctx, struct mlx5e_ktls_offload_context_tx *priv_tx)
 {
 	struct tls_crypto_info *crypto_info = priv_tx->crypto_info;
+	struct tls12_crypto_info_aes_gcm_128 *info;
 	char *initial_rn, *gcm_iv;
 	u16 salt_sz, rec_seq_sz;
 	char *salt, *rec_seq;
 	u8 tls_version;
 
-	switch (crypto_info->cipher_type) {
-	case TLS_CIPHER_AES_GCM_128: {
-		struct tls12_crypto_info_aes_gcm_128 *info =
-			(struct tls12_crypto_info_aes_gcm_128 *)crypto_info;
-
-		EXTRACT_INFO_FIELDS;
-		break;
-	}
-	default:
-		WARN_ON(1);
+	if (WARN_ON(crypto_info->cipher_type != TLS_CIPHER_AES_GCM_128))
 		return;
-	}
+
+	info = (struct tls12_crypto_info_aes_gcm_128 *)crypto_info;
+	EXTRACT_INFO_FIELDS;
 
 	gcm_iv      = MLX5_ADDR_OF(tls_static_params, ctx, gcm_iv);
 	initial_rn  = MLX5_ADDR_OF(tls_static_params, ctx, initial_record_number);
@@ -234,23 +228,18 @@ tx_post_resync_params(struct mlx5e_txqsq *sq,
 		      u64 rcd_sn)
 {
 	struct tls_crypto_info *crypto_info = priv_tx->crypto_info;
+	struct tls12_crypto_info_aes_gcm_128 *info;
 	__be64 rn_be = cpu_to_be64(rcd_sn);
 	bool skip_static_post;
 	u16 rec_seq_sz;
 	char *rec_seq;
 
-	switch (crypto_info->cipher_type) {
-	case TLS_CIPHER_AES_GCM_128: {
-		struct tls12_crypto_info_aes_gcm_128 *info =
-			(struct tls12_crypto_info_aes_gcm_128 *)crypto_info;
+	if (WARN_ON(crypto_info->cipher_type != TLS_CIPHER_AES_GCM_128))
+		return;
 
-		rec_seq = info->rec_seq;
-		rec_seq_sz = sizeof(info->rec_seq);
-		break;
-	}
-	default:
-		WARN_ON(1);
-	}
+	info = (struct tls12_crypto_info_aes_gcm_128 *)crypto_info;
+	rec_seq = info->rec_seq;
+	rec_seq_sz = sizeof(info->rec_seq);
 
 	skip_static_post = !memcmp(rec_seq, &rn_be, rec_seq_sz);
 	if (!skip_static_post)
-- 
2.22.0


^ permalink raw reply related

* [PATCH] mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync().
From: David Miller @ 2019-07-10  4:36 UTC (permalink / raw)
  To: netdev


Return value was changes to 'int' from void but this return statement
was not updated, or it slipped in via a merge.

Fixes: b5d9a834f4fd ("net/tls: don't clear TX resync flag on error")
Signed-off-by: David S. Miller <davem@davemloft.net>
---

Applied to net-next.

 drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
index ca07c86427a7..fba561ffe1d4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
@@ -170,7 +170,7 @@ static int mlx5e_tls_resync(struct net_device *netdev, struct sock *sk,
 	u64 rcd_sn = *(u64 *)rcd_sn_data;
 
 	if (WARN_ON_ONCE(direction != TLS_OFFLOAD_CTX_DIR_RX))
-		return;
+		return -EINVAL;
 	rx_ctx = mlx5e_get_tls_rx_context(tls_ctx);
 
 	netdev_info(netdev, "resyncing seq %d rcd %lld\n", seq,
-- 
2.20.1


^ permalink raw reply related

* Re: linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2019-07-10  4:31 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Jason Gunthorpe, David Miller, Networking,
	Linux Next Mailing List, Linux Kernel Mailing List,
	Bernard Metzler
In-Reply-To: <20190709064346.GF7034@mtr-leonro.mtl.com>

[-- Attachment #1: Type: text/plain, Size: 461 bytes --]

Hi Leon,

On Tue, 9 Jul 2019 09:43:46 +0300 Leon Romanovsky <leon@kernel.org> wrote:
>
> From 56c9e15ec670af580daa8c3ffde9503af3042d67 Mon Sep 17 00:00:00 2001
> From: Leon Romanovsky <leonro@mellanox.com>
> Date: Sun, 7 Jul 2019 10:43:42 +0300
> Subject: [PATCH] Fixup to build SIW issue
> 
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

I applied this to linux-next today and it fixes my build problems.

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH] net/mlx5e: Return in default case statement in tx_post_resync_params
From: Leon Romanovsky @ 2019-07-10  4:24 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Nick Desaulniers, Saeed Mahameed, David S. Miller, Boris Pismenny,
	netdev, linux-rdma, LKML, clang-built-linux
In-Reply-To: <20190709231024.GA61953@archlinux-threadripper>

On Tue, Jul 09, 2019 at 04:10:24PM -0700, Nathan Chancellor wrote:
> On Tue, Jul 09, 2019 at 03:44:59PM -0700, Nick Desaulniers wrote:
> > On Mon, Jul 8, 2019 at 4:13 PM Nathan Chancellor
> > <natechancellor@gmail.com> wrote:
> > >
> > > clang warns:
> > >
> > > drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:251:2:
> > > warning: variable 'rec_seq_sz' is used uninitialized whenever switch
> > > default is taken [-Wsometimes-uninitialized]
> > >         default:
> > >         ^~~~~~~
> > > drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:255:46: note:
> > > uninitialized use occurs here
> > >         skip_static_post = !memcmp(rec_seq, &rn_be, rec_seq_sz);
> > >                                                     ^~~~~~~~~~
> > > drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c:239:16: note:
> > > initialize the variable 'rec_seq_sz' to silence this warning
> > >         u16 rec_seq_sz;
> > >                       ^
> > >                        = 0
> > > 1 warning generated.
> > >
> > > This case statement was clearly designed to be one that should not be
> > > hit during runtime because of the WARN_ON statement so just return early
> > > to prevent copying uninitialized memory up into rn_be.
> > >
> > > Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
> > > Link: https://github.com/ClangBuiltLinux/linux/issues/590
> > > Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> > > ---
> > >  drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
> > > index 3f5f4317a22b..5c08891806f0 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
> > > @@ -250,6 +250,7 @@ tx_post_resync_params(struct mlx5e_txqsq *sq,
> > >         }
> > >         default:
> > >                 WARN_ON(1);
> > > +               return;
> > >         }
> >
> > hmm...a switch statement with a single case is a code smell.  How
> > about a single conditional with early return?  Then the "meat" of the
> > happy path doesn't need an additional level of indentation.
> > --
> > Thanks,
> > ~Nick Desaulniers
>
> I assume that the reason for this is there may be other cipher types
> added in the future? I suppose the maintainers can give more clarity to
> that.

Our devices supports extra ciphers, for example TLS_CIPHER_AES_GCM_256.
So I assume this was the reason for switch<->case, but because such
implementation doesn't exist in any driver, I recommend to rewrite the
code to have "if" statement and return early.

>
> Furthermore, if they want the switch statements to remain, it looks like
> fill_static_params_ctx also returns in the default statement so it seems
> like this is the right fix.
>
> Cheers,
> Nathan

^ permalink raw reply

* Re: [bpf PATCH v2 2/6] bpf: tls fix transition through disconnect with close
From: John Fastabend @ 2019-07-10  3:39 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190709194525.0d4c15a6@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Mon, 08 Jul 2019 19:14:05 +0000, John Fastabend wrote:
> > @@ -287,6 +313,27 @@ static void tls_sk_proto_cleanup(struct sock *sk,
> >  #endif
> >  }
> >  
> > +static void tls_sk_proto_unhash(struct sock *sk)
> > +{
> > +	struct inet_connection_sock *icsk = inet_csk(sk);
> > +	long timeo = sock_sndtimeo(sk, 0);
> > +	struct tls_context *ctx;
> > +
> > +	if (unlikely(!icsk->icsk_ulp_data)) {
> 
> Is this for when sockmap is stacked on top of TLS and TLS got removed
> without letting sockmap know?

Right its a pattern I used on the sockmap side and put here. But
I dropped the patch to let sockmap stack on top of TLS because
it was more than a fix IMO. We could probably drop this check on
the other hand its harmless.
> 
> > +		if (sk->sk_prot->unhash)
> > +			sk->sk_prot->unhash(sk);
> > +	}
> > +
> > +	ctx = tls_get_ctx(sk);
> > +	if (ctx->tx_conf == TLS_SW || ctx->rx_conf == TLS_SW)
> > +		tls_sk_proto_cleanup(sk, ctx, timeo);
> > +	icsk->icsk_ulp_data = NULL;
> 
> I think close only starts checking if ctx is NULL in patch 6.
> Looks like some chunks of ctx checking/clearing got spread to
> patch 1 and some to patch 6.

Yeah, I thought the patches were easier to read this way but
maybe not. Could add something in the commit log.

> 
> > +	tls_ctx_free_wq(ctx);
> > +
> > +	if (ctx->unhash)
> > +		ctx->unhash(sk);
> > +}
> > +
> >  static void tls_sk_proto_close(struct sock *sk, long timeout)
> >  {
> >  	struct tls_context *ctx = tls_get_ctx(sk);
> 



^ permalink raw reply

* Re: [bpf PATCH v2 6/6] bpf: sockmap/tls, close can race with map free
From: John Fastabend @ 2019-07-10  3:33 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190709193846.62f0a2c7@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Mon, 08 Jul 2019 19:15:18 +0000, John Fastabend wrote:
> > @@ -352,15 +354,18 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
> >  	if (ctx->tx_conf == TLS_BASE && ctx->rx_conf == TLS_BASE)
> >  		goto skip_tx_cleanup;
> >  
> > -	sk->sk_prot = ctx->sk_proto;
> >  	tls_sk_proto_cleanup(sk, ctx, timeo);
> >  
> >  skip_tx_cleanup:
> > +	write_lock_bh(&sk->sk_callback_lock);
> > +	icsk->icsk_ulp_data = NULL;
> 
> Is ulp_data pointer now supposed to be updated under the
> sk_callback_lock?

Yes otherwise it can race with tls_update(). I didn't remove the
ulp pointer null set from tcp_ulp.c though. Could be done in this
patch or as a follow up.
> 
> > +	if (sk->sk_prot->close == tls_sk_proto_close)
> > +		sk->sk_prot = ctx->sk_proto;
> > +	write_unlock_bh(&sk->sk_callback_lock);
> >  	release_sock(sk);
> >  	if (ctx->rx_conf == TLS_SW)
> >  		tls_sw_release_strp_rx(ctx);
> > -	sk_proto_close(sk, timeout);
> > -
> > +	ctx->sk_proto_close(sk, timeout);
> >  	if (ctx->tx_conf != TLS_HW && ctx->rx_conf != TLS_HW &&
> >  	    ctx->tx_conf != TLS_HW_RECORD && ctx->rx_conf != TLS_HW_RECORD)
> >  		tls_ctx_free(ctx);

^ permalink raw reply

* Re: [bpf PATCH v2 0/6] bpf: sockmap/tls fixes
From: John Fastabend @ 2019-07-10  3:28 UTC (permalink / raw)
  To: Jakub Kicinski, John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190709192145.473d2d80@cakuba.netronome.com>

Jakub Kicinski wrote:
> On Tue, 9 Jul 2019 17:04:59 -0700, Jakub Kicinski wrote:
> > On Tue, 09 Jul 2019 08:40:14 -0700, John Fastabend wrote:
> > > Jakub Kicinski wrote:  
> > > > Looks like strparser is not done'd for offload?    
> > > 
> > > Right so if rx_conf != TLS_SW then the hardware needs to do
> > > the strparser functionality.  
> > 
> > Can I just take a stab at fixing the HW part?
> > 
> > Can I rebase this onto net-next?  There are a few patches from net
> > missing in the bpf tree.
> 
> I think I fixed patch 1 for offload, I need to test it a little more
> and I'll send it back to you. In the meantime, let me ask some
> questions about the other two :)

Great thanks. When your ready push it back and I'll retest in
my setup.

^ permalink raw reply

* [PATCH v3 0/3] kernel/notifier.c: avoid duplicate registration
From: Xiaoming Ni @ 2019-07-10  3:09 UTC (permalink / raw)
  To: adobriyan, akpm, anna.schumaker, arjan, bfields, chuck.lever,
	davem, gregkh, jlayton, luto, mingo, Nadia.Derbey, paulmck,
	semen.protsenko, stable, stern, tglx, torvalds, trond.myklebust,
	viresh.kumar, vvs
  Cc: alex.huangjianhui, dylix.dailei, nixiaoming, linux-kernel,
	linux-nfs, netdev

Registering the same notifier to a hook repeatedly can cause the hook
list to form a ring or lose other members of the list.

case1: An infinite loop in notifier_chain_register() can cause soft lockup
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test2);

case2: An infinite loop in notifier_chain_register() can cause soft lockup
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_call_chain(&test_notifier_list, 0, NULL);

case3: lose other hook test2
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test2);
        atomic_notifier_chain_register(&test_notifier_list, &test1);

case4: Unregister returns 0, but the hook is still in the linked list,
        and it is not really registered. If you call notifier_call_chain
        after ko is unloaded, it will trigger oops. if the system is
       	configured with softlockup_panic and the same hook is repeatedly
       	registered on the panic_notifier_list, it will cause a loop panic.

so. need add a check in in notifier_chain_register() to avoid duplicate
registration

v1:
* use notifier_chain_cond_register replace notifier_chain_register

v2:
* Add a check in notifier_chain_register() to avoid duplicate registration
* remove notifier_chain_cond_register() to avoid duplicate code 
* remove blocking_notifier_chain_cond_register() to avoid duplicate code

v3:
* Add a cover letter.

Xiaoming Ni (3):
  kernel/notifier.c: avoid duplicate registration
  kernel/notifier.c: remove notifier_chain_cond_register()
  kernel/notifier.c: remove blocking_notifier_chain_cond_register()

 include/linux/notifier.h |  4 ----
 kernel/notifier.c        | 41 +++--------------------------------------
 net/sunrpc/rpc_pipe.c    |  2 +-
 3 files changed, 4 insertions(+), 43 deletions(-)

-- 
1.8.5.6


^ permalink raw reply

* [PATCH v3 1/3] kernel/notifier.c: avoid duplicate registration
From: Xiaoming Ni @ 2019-07-10  3:09 UTC (permalink / raw)
  To: adobriyan, akpm, anna.schumaker, arjan, bfields, chuck.lever,
	davem, gregkh, jlayton, luto, mingo, Nadia.Derbey, paulmck,
	semen.protsenko, stable, stern, tglx, torvalds, trond.myklebust,
	viresh.kumar, vvs
  Cc: alex.huangjianhui, dylix.dailei, nixiaoming, linux-kernel,
	linux-nfs, netdev

Registering the same notifier to a hook repeatedly can cause the hook
list to form a ring or lose other members of the list.

case1: An infinite loop in notifier_chain_register() can cause soft lockup
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test2);

case2: An infinite loop in notifier_chain_register() can cause soft lockup
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_call_chain(&test_notifier_list, 0, NULL);

case3: lose other hook test2
        atomic_notifier_chain_register(&test_notifier_list, &test1);
        atomic_notifier_chain_register(&test_notifier_list, &test2);
        atomic_notifier_chain_register(&test_notifier_list, &test1);

case4: Unregister returns 0, but the hook is still in the linked list,
        and it is not really registered. If you call notifier_call_chain
        after ko is unloaded, it will trigger oops.

If the system is configured with softlockup_panic and the same
hook is repeatedly registered on the panic_notifier_list, it
will cause a loop panic.

Add a check in notifier_chain_register() to avoid duplicate registration

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
---
 kernel/notifier.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index d9f5081..30bedb8 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -23,7 +23,10 @@ static int notifier_chain_register(struct notifier_block **nl,
 		struct notifier_block *n)
 {
 	while ((*nl) != NULL) {
-		WARN_ONCE(((*nl) == n), "double register detected");
+		if (unlikely((*nl) == n)) {
+			WARN(1, "double register detected");
+			return 0;
+		}
 		if (n->priority > (*nl)->priority)
 			break;
 		nl = &((*nl)->next);
-- 
1.8.5.6


^ permalink raw reply related

* [PATCH v3 3/3] kernel/notifier.c: remove blocking_notifier_chain_cond_register()
From: Xiaoming Ni @ 2019-07-10  3:09 UTC (permalink / raw)
  To: adobriyan, akpm, anna.schumaker, arjan, bfields, chuck.lever,
	davem, gregkh, jlayton, luto, mingo, Nadia.Derbey, paulmck,
	semen.protsenko, stable, stern, tglx, torvalds, trond.myklebust,
	viresh.kumar, vvs
  Cc: alex.huangjianhui, dylix.dailei, nixiaoming, linux-kernel,
	linux-nfs, netdev

blocking_notifier_chain_cond_register() does not consider
system_booting state, which is the only difference between this
function and blocking_notifier_cain_register(). This can be a bug
and is a piece of duplicate code.

Delete blocking_notifier_chain_cond_register()

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
---
 include/linux/notifier.h |  4 ----
 kernel/notifier.c        | 23 -----------------------
 net/sunrpc/rpc_pipe.c    |  2 +-
 3 files changed, 1 insertion(+), 28 deletions(-)

diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index 0096a05..0189476 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -150,10 +150,6 @@ extern int raw_notifier_chain_register(struct raw_notifier_head *nh,
 extern int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
 		struct notifier_block *nb);
 
-extern int blocking_notifier_chain_cond_register(
-		struct blocking_notifier_head *nh,
-		struct notifier_block *nb);
-
 extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *nh,
 		struct notifier_block *nb);
 extern int blocking_notifier_chain_unregister(struct blocking_notifier_head *nh,
diff --git a/kernel/notifier.c b/kernel/notifier.c
index e3d221f..63d7501 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -221,29 +221,6 @@ int blocking_notifier_chain_register(struct blocking_notifier_head *nh,
 EXPORT_SYMBOL_GPL(blocking_notifier_chain_register);
 
 /**
- *	blocking_notifier_chain_cond_register - Cond add notifier to a blocking notifier chain
- *	@nh: Pointer to head of the blocking notifier chain
- *	@n: New entry in notifier chain
- *
- *	Adds a notifier to a blocking notifier chain, only if not already
- *	present in the chain.
- *	Must be called in process context.
- *
- *	Currently always returns zero.
- */
-int blocking_notifier_chain_cond_register(struct blocking_notifier_head *nh,
-		struct notifier_block *n)
-{
-	int ret;
-
-	down_write(&nh->rwsem);
-	ret = notifier_chain_register(&nh->head, n);
-	up_write(&nh->rwsem);
-	return ret;
-}
-EXPORT_SYMBOL_GPL(blocking_notifier_chain_cond_register);
-
-/**
  *	blocking_notifier_chain_unregister - Remove notifier from a blocking notifier chain
  *	@nh: Pointer to head of the blocking notifier chain
  *	@n: Entry to remove from notifier chain
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 126d314..1287f80 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -50,7 +50,7 @@
 
 int rpc_pipefs_notifier_register(struct notifier_block *nb)
 {
-	return blocking_notifier_chain_cond_register(&rpc_pipefs_notifier_list, nb);
+	return blocking_notifier_chain_register(&rpc_pipefs_notifier_list, nb);
 }
 EXPORT_SYMBOL_GPL(rpc_pipefs_notifier_register);
 
-- 
1.8.5.6


^ permalink raw reply related

* [PATCH v3 2/3] kernel/notifier.c: remove notifier_chain_cond_register()
From: Xiaoming Ni @ 2019-07-10  3:09 UTC (permalink / raw)
  To: adobriyan, akpm, anna.schumaker, arjan, bfields, chuck.lever,
	davem, gregkh, jlayton, luto, mingo, Nadia.Derbey, paulmck,
	semen.protsenko, stable, stern, tglx, torvalds, trond.myklebust,
	viresh.kumar, vvs
  Cc: alex.huangjianhui, dylix.dailei, nixiaoming, linux-kernel,
	linux-nfs, netdev

The only difference between notifier_chain_cond_register() and
notifier_chain_register() is the lack of warning hints for duplicate
registrations.
Consider using notifier_chain_register() instead of
notifier_chain_cond_register() to avoid duplicate code

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
---
 kernel/notifier.c | 17 +----------------
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index 30bedb8..e3d221f 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -36,21 +36,6 @@ static int notifier_chain_register(struct notifier_block **nl,
 	return 0;
 }
 
-static int notifier_chain_cond_register(struct notifier_block **nl,
-		struct notifier_block *n)
-{
-	while ((*nl) != NULL) {
-		if ((*nl) == n)
-			return 0;
-		if (n->priority > (*nl)->priority)
-			break;
-		nl = &((*nl)->next);
-	}
-	n->next = *nl;
-	rcu_assign_pointer(*nl, n);
-	return 0;
-}
-
 static int notifier_chain_unregister(struct notifier_block **nl,
 		struct notifier_block *n)
 {
@@ -252,7 +237,7 @@ int blocking_notifier_chain_cond_register(struct blocking_notifier_head *nh,
 	int ret;
 
 	down_write(&nh->rwsem);
-	ret = notifier_chain_cond_register(&nh->head, n);
+	ret = notifier_chain_register(&nh->head, n);
 	up_write(&nh->rwsem);
 	return ret;
 }
-- 
1.8.5.6


^ permalink raw reply related

* Re: [bpf PATCH v2 2/6] bpf: tls fix transition through disconnect with close
From: Jakub Kicinski @ 2019-07-10  2:45 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <156261324561.31108.14410711674221391677.stgit@ubuntu3-kvm1>

On Mon, 08 Jul 2019 19:14:05 +0000, John Fastabend wrote:
> @@ -287,6 +313,27 @@ static void tls_sk_proto_cleanup(struct sock *sk,
>  #endif
>  }
>  
> +static void tls_sk_proto_unhash(struct sock *sk)
> +{
> +	struct inet_connection_sock *icsk = inet_csk(sk);
> +	long timeo = sock_sndtimeo(sk, 0);
> +	struct tls_context *ctx;
> +
> +	if (unlikely(!icsk->icsk_ulp_data)) {

Is this for when sockmap is stacked on top of TLS and TLS got removed
without letting sockmap know?

> +		if (sk->sk_prot->unhash)
> +			sk->sk_prot->unhash(sk);
> +	}
> +
> +	ctx = tls_get_ctx(sk);
> +	if (ctx->tx_conf == TLS_SW || ctx->rx_conf == TLS_SW)
> +		tls_sk_proto_cleanup(sk, ctx, timeo);
> +	icsk->icsk_ulp_data = NULL;

I think close only starts checking if ctx is NULL in patch 6.
Looks like some chunks of ctx checking/clearing got spread to
patch 1 and some to patch 6.

> +	tls_ctx_free_wq(ctx);
> +
> +	if (ctx->unhash)
> +		ctx->unhash(sk);
> +}
> +
>  static void tls_sk_proto_close(struct sock *sk, long timeout)
>  {
>  	struct tls_context *ctx = tls_get_ctx(sk);


^ permalink raw reply

* Re: [bpf PATCH v2 6/6] bpf: sockmap/tls, close can race with map free
From: Jakub Kicinski @ 2019-07-10  2:38 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <156261331866.31108.6405316261950259075.stgit@ubuntu3-kvm1>

On Mon, 08 Jul 2019 19:15:18 +0000, John Fastabend wrote:
> @@ -352,15 +354,18 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
>  	if (ctx->tx_conf == TLS_BASE && ctx->rx_conf == TLS_BASE)
>  		goto skip_tx_cleanup;
>  
> -	sk->sk_prot = ctx->sk_proto;
>  	tls_sk_proto_cleanup(sk, ctx, timeo);
>  
>  skip_tx_cleanup:
> +	write_lock_bh(&sk->sk_callback_lock);
> +	icsk->icsk_ulp_data = NULL;

Is ulp_data pointer now supposed to be updated under the
sk_callback_lock?

> +	if (sk->sk_prot->close == tls_sk_proto_close)
> +		sk->sk_prot = ctx->sk_proto;
> +	write_unlock_bh(&sk->sk_callback_lock);
>  	release_sock(sk);
>  	if (ctx->rx_conf == TLS_SW)
>  		tls_sw_release_strp_rx(ctx);
> -	sk_proto_close(sk, timeout);
> -
> +	ctx->sk_proto_close(sk, timeout);
>  	if (ctx->tx_conf != TLS_HW && ctx->rx_conf != TLS_HW &&
>  	    ctx->tx_conf != TLS_HW_RECORD && ctx->rx_conf != TLS_HW_RECORD)
>  		tls_ctx_free(ctx);


^ permalink raw reply

* Re: [bpf PATCH v2 6/6] bpf: sockmap/tls, close can race with map free
From: Jakub Kicinski @ 2019-07-10  2:36 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <156261331866.31108.6405316261950259075.stgit@ubuntu3-kvm1>

On Mon, 08 Jul 2019 19:15:18 +0000, John Fastabend wrote:
> @@ -836,22 +841,39 @@ static int tls_init(struct sock *sk)

There is a goto out above this which has to be turned into return 0;
if out now releases the lock.

>  	if (sk->sk_state != TCP_ESTABLISHED)
>  		return -ENOTSUPP;
>  
> +	tls_build_proto(sk);
> +
>  	/* allocate tls context */
> +	write_lock_bh(&sk->sk_callback_lock);
>  	ctx = create_ctx(sk);
>  	if (!ctx) {
>  		rc = -ENOMEM;
>  		goto out;
>  	}
>  
> -	tls_build_proto(sk);
>  	ctx->tx_conf = TLS_BASE;
>  	ctx->rx_conf = TLS_BASE;
>  	ctx->sk_proto = sk->sk_prot;
>  	update_sk_prot(sk, ctx);
>  out:
> +	write_unlock_bh(&sk->sk_callback_lock);
>  	return rc;
>  }

^ permalink raw reply

* Re: [PATCH bpf-next v3] virtio_net: add XDP meta data support
From: Jason Wang @ 2019-07-10  2:30 UTC (permalink / raw)
  To: Daniel Borkmann, Yuya Kusakabe
  Cc: ast, davem, hawk, jakub.kicinski, john.fastabend, kafai, mst,
	netdev, songliubraving, yhs
In-Reply-To: <116cdb35-57b3-e2fe-ef8a-05cc6a1afbbe@iogearbox.net>


On 2019/7/10 上午4:03, Daniel Borkmann wrote:
> On 07/09/2019 05:04 AM, Jason Wang wrote:
>> On 2019/7/9 上午6:38, Daniel Borkmann wrote:
>>> On 07/02/2019 04:11 PM, Yuya Kusakabe wrote:
>>>> On 7/2/19 5:33 PM, Jason Wang wrote:
>>>>> On 2019/7/2 下午4:16, Yuya Kusakabe wrote:
>>>>>> This adds XDP meta data support to both receive_small() and
>>>>>> receive_mergeable().
>>>>>>
>>>>>> Fixes: de8f3a83b0a0 ("bpf: add meta pointer for direct access")
>>>>>> Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
>>>>>> ---
>>>>>> v3:
>>>>>>     - fix preserve the vnet header in receive_small().
>>>>>> v2:
>>>>>>     - keep copy untouched in page_to_skb().
>>>>>>     - preserve the vnet header in receive_small().
>>>>>>     - fix indentation.
>>>>>> ---
>>>>>>     drivers/net/virtio_net.c | 45 +++++++++++++++++++++++++++-------------
>>>>>>     1 file changed, 31 insertions(+), 14 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>> index 4f3de0ac8b0b..03a1ae6fe267 100644
>>>>>> --- a/drivers/net/virtio_net.c
>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>> @@ -371,7 +371,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>>>>>>                        struct receive_queue *rq,
>>>>>>                        struct page *page, unsigned int offset,
>>>>>>                        unsigned int len, unsigned int truesize,
>>>>>> -                   bool hdr_valid)
>>>>>> +                   bool hdr_valid, unsigned int metasize)
>>>>>>     {
>>>>>>         struct sk_buff *skb;
>>>>>>         struct virtio_net_hdr_mrg_rxbuf *hdr;
>>>>>> @@ -393,7 +393,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>>>>>>         else
>>>>>>             hdr_padded_len = sizeof(struct padded_vnet_hdr);
>>>>>>     -    if (hdr_valid)
>>>>>> +    if (hdr_valid && !metasize)
>>>>>>             memcpy(hdr, p, hdr_len);
>>>>>>           len -= hdr_len;
>>>>>> @@ -405,6 +405,11 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>>>>>>             copy = skb_tailroom(skb);
>>>>>>         skb_put_data(skb, p, copy);
>>>>>>     +    if (metasize) {
>>>>>> +        __skb_pull(skb, metasize);
>>>>>> +        skb_metadata_set(skb, metasize);
>>>>>> +    }
>>>>>> +
>>>>>>         len -= copy;
>>>>>>         offset += copy;
>>>>>>     @@ -644,6 +649,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
>>>>>>         unsigned int delta = 0;
>>>>>>         struct page *xdp_page;
>>>>>>         int err;
>>>>>> +    unsigned int metasize = 0;
>>>>>>           len -= vi->hdr_len;
>>>>>>         stats->bytes += len;
>>>>>> @@ -683,10 +689,13 @@ static struct sk_buff *receive_small(struct net_device *dev,
>>>>>>               xdp.data_hard_start = buf + VIRTNET_RX_PAD + vi->hdr_len;
>>>>>>             xdp.data = xdp.data_hard_start + xdp_headroom;
>>>>>> -        xdp_set_data_meta_invalid(&xdp);
>>>>>>             xdp.data_end = xdp.data + len;
>>>>>> +        xdp.data_meta = xdp.data;
>>>>>>             xdp.rxq = &rq->xdp_rxq;
>>>>>>             orig_data = xdp.data;
>>>>>> +        /* Copy the vnet header to the front of data_hard_start to avoid
>>>>>> +         * overwriting by XDP meta data */
>>>>>> +        memcpy(xdp.data_hard_start - vi->hdr_len, xdp.data - vi->hdr_len, vi->hdr_len);
>>> I'm not fully sure if I'm following this one correctly, probably just missing
>>> something. Isn't the vnet header based on how we set up xdp.data_hard_start
>>> earlier already in front of it? Wouldn't we copy invalid data from xdp.data -
>>> vi->hdr_len into the vnet header at that point (given there can be up to 256
>>> bytes of headroom between the two)? If it's relative to xdp.data and headroom
>>> is >0, then BPF prog could otherwise mangle this; something doesn't add up to
>>> me here. Could you clarify? Thx
>> Vnet headr sits just in front of xdp.data not xdp.data_hard_start. So it could be overwrote by metadata, that's why we need a copy here.
> For the current code, you can adjust the xdp.data with a positive/negative offset
> already via bpf_xdp_adjust_head() helper. If vnet headr sits just in front of
> xdp.data, couldn't this be overridden today as well then? Anyway, just wondering
> how this is handled differently?


We will invalidate the vnet header in this case. But for the case of 
metadata adjustment without header adjustment, we want to seek a way to 
preserve that.

Thanks


>
> Thanks,
> Daniel

^ permalink raw reply

* Re: [RFC v2] vhost: introduce mdev based hardware vhost backend
From: Jason Wang @ 2019-07-10  2:26 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: Alex Williamson, mst, maxime.coquelin, linux-kernel, kvm,
	virtualization, netdev, dan.daly, cunming.liang, zhihong.wang,
	idos, Rob Miller, Ariel Adam
In-Reply-To: <20190709063317.GA29300@___>


On 2019/7/9 下午2:33, Tiwei Bie wrote:
> On Tue, Jul 09, 2019 at 10:50:38AM +0800, Jason Wang wrote:
>> On 2019/7/8 下午2:16, Tiwei Bie wrote:
>>> On Fri, Jul 05, 2019 at 08:49:46AM -0600, Alex Williamson wrote:
>>>> On Thu, 4 Jul 2019 14:21:34 +0800
>>>> Tiwei Bie <tiwei.bie@intel.com> wrote:
>>>>> On Thu, Jul 04, 2019 at 12:31:48PM +0800, Jason Wang wrote:
>>>>>> On 2019/7/3 下午9:08, Tiwei Bie wrote:
>>>>>>> On Wed, Jul 03, 2019 at 08:16:23PM +0800, Jason Wang wrote:
>>>>>>>> On 2019/7/3 下午7:52, Tiwei Bie wrote:
>>>>>>>>> On Wed, Jul 03, 2019 at 06:09:51PM +0800, Jason Wang wrote:
>>>>>>>>>> On 2019/7/3 下午5:13, Tiwei Bie wrote:
>>>>>>>>>>> Details about this can be found here:
>>>>>>>>>>>
>>>>>>>>>>> https://lwn.net/Articles/750770/
>>>>>>>>>>>
>>>>>>>>>>> What's new in this version
>>>>>>>>>>> ==========================
>>>>>>>>>>>
>>>>>>>>>>> A new VFIO device type is introduced - vfio-vhost. This addressed
>>>>>>>>>>> some comments from here:https://patchwork.ozlabs.org/cover/984763/
>>>>>>>>>>>
>>>>>>>>>>> Below is the updated device interface:
>>>>>>>>>>>
>>>>>>>>>>> Currently, there are two regions of this device: 1) CONFIG_REGION
>>>>>>>>>>> (VFIO_VHOST_CONFIG_REGION_INDEX), which can be used to setup the
>>>>>>>>>>> device; 2) NOTIFY_REGION (VFIO_VHOST_NOTIFY_REGION_INDEX), which
>>>>>>>>>>> can be used to notify the device.
>>>>>>>>>>>
>>>>>>>>>>> 1. CONFIG_REGION
>>>>>>>>>>>
>>>>>>>>>>> The region described by CONFIG_REGION is the main control interface.
>>>>>>>>>>> Messages will be written to or read from this region.
>>>>>>>>>>>
>>>>>>>>>>> The message type is determined by the `request` field in message
>>>>>>>>>>> header. The message size is encoded in the message header too.
>>>>>>>>>>> The message format looks like this:
>>>>>>>>>>>
>>>>>>>>>>> struct vhost_vfio_op {
>>>>>>>>>>> 	__u64 request;
>>>>>>>>>>> 	__u32 flags;
>>>>>>>>>>> 	/* Flag values: */
>>>>>>>>>>>       #define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */
>>>>>>>>>>> 	__u32 size;
>>>>>>>>>>> 	union {
>>>>>>>>>>> 		__u64 u64;
>>>>>>>>>>> 		struct vhost_vring_state state;
>>>>>>>>>>> 		struct vhost_vring_addr addr;
>>>>>>>>>>> 	} payload;
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> The existing vhost-kernel ioctl cmds are reused as the message
>>>>>>>>>>> requests in above structure.
>>>>>>>>>> Still a comments like V1. What's the advantage of inventing a new protocol?
>>>>>>>>> I'm trying to make it work in VFIO's way..
>>>>>>>>>> I believe either of the following should be better:
>>>>>>>>>>
>>>>>>>>>> - using vhost ioctl,  we can start from SET_VRING_KICK/SET_VRING_CALL and
>>>>>>>>>> extend it with e.g notify region. The advantages is that all exist userspace
>>>>>>>>>> program could be reused without modification (or minimal modification). And
>>>>>>>>>> vhost API hides lots of details that is not necessary to be understood by
>>>>>>>>>> application (e.g in the case of container).
>>>>>>>>> Do you mean reusing vhost's ioctl on VFIO device fd directly,
>>>>>>>>> or introducing another mdev driver (i.e. vhost_mdev instead of
>>>>>>>>> using the existing vfio_mdev) for mdev device?
>>>>>>>> Can we simply add them into ioctl of mdev_parent_ops?
>>>>>>> Right, either way, these ioctls have to be and just need to be
>>>>>>> added in the ioctl of the mdev_parent_ops. But another thing we
>>>>>>> also need to consider is that which file descriptor the userspace
>>>>>>> will do the ioctl() on. So I'm wondering do you mean let the
>>>>>>> userspace do the ioctl() on the VFIO device fd of the mdev
>>>>>>> device?
>>>>>> Yes.
>>>>> Got it! I'm not sure what's Alex opinion on this. If we all
>>>>> agree with this, I can do it in this way.
>>>>>
>>>>>> Is there any other way btw?
>>>>> Just a quick thought.. Maybe totally a bad idea. I was thinking
>>>>> whether it would be odd to do non-VFIO's ioctls on VFIO's device
>>>>> fd. So I was wondering whether it's possible to allow binding
>>>>> another mdev driver (e.g. vhost_mdev) to the supported mdev
>>>>> devices. The new mdev driver, vhost_mdev, can provide similar
>>>>> ways to let userspace open the mdev device and do the vhost ioctls
>>>>> on it. To distinguish with the vfio_mdev compatible mdev devices,
>>>>> the device API of the new vhost_mdev compatible mdev devices
>>>>> might be e.g. "vhost-net" for net?
>>>>>
>>>>> So in VFIO case, the device will be for passthru directly. And
>>>>> in VHOST case, the device can be used to accelerate the existing
>>>>> virtualized devices.
>>>>>
>>>>> How do you think?
>>>> VFIO really can't prevent vendor specific ioctls on the device file
>>>> descriptor for mdevs, but a) we'd want to be sure the ioctl address
>>>> space can't collide with ioctls we'd use for vfio defined purposes and
>>>> b) maybe the VFIO user API isn't what you want in the first place if
>>>> you intend to mostly/entirely ignore the defined ioctl set and replace
>>>> them with your own.  In the case of the latter, you're also not getting
>>>> the advantages of the existing VFIO userspace code, so why expose a
>>>> VFIO device at all.
>>> Yeah, I totally agree.
>>
>> I guess the original idea is to reuse the VFIO DMA/IOMMU API for this. Then
>> we have the chance to reuse vfio codes in qemu for dealing with e.g vIOMMU.
> Yeah, you are right. We have several choices here:
>
> #1. We expose a VFIO device, so we can reuse the VFIO container/group
>      based DMA API and potentially reuse a lot of VFIO code in QEMU.
>
>      But in this case, we have two choices for the VFIO device interface
>      (i.e. the interface on top of VFIO device fd):
>
>      A) we may invent a new vhost protocol (as demonstrated by the code
>         in this RFC) on VFIO device fd to make it work in VFIO's way,
>         i.e. regions and irqs.
>
>      B) Or as you proposed, instead of inventing a new vhost protocol,
>         we can reuse most existing vhost ioctls on the VFIO device fd
>         directly. There should be no conflicts between the VFIO ioctls
>         (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
>
> #2. Instead of exposing a VFIO device, we may expose a VHOST device.
>      And we will introduce a new mdev driver vhost-mdev to do this.
>      It would be natural to reuse the existing kernel vhost interface
>      (ioctls) on it as much as possible. But we will need to invent
>      some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
>      choice, but it's too heavy and doesn't support vIOMMU by itself).
>
> I'm not sure which one is the best choice we all want..
> Which one (#1/A, #1/B, or #2) would you prefer?


#2 looks better. One concern is that we may end up with similar API as 
what VFIO does. And I do see some new RFC for VFIO to add more DMA API.

Consider it was still in the stage of RFC, does it make sense if we try 
this way with some sample parents?


>
>>
>>>> The mdev interface does provide a general interface for creating and
>>>> managing virtual devices, vfio-mdev is just one driver on the mdev
>>>> bus.  Parav (Mellanox) has been doing work on mdev-core to help clean
>>>> out vfio-isms from the interface, aiui, with the intent of implementing
>>>> another mdev bus driver for using the devices within the kernel.
>>> Great to know this! I found below series after some searching:
>>>
>>> https://lkml.org/lkml/2019/3/8/821
>>>
>>> In above series, the new mlx5_core mdev driver will do the probe
>>> by calling mlx5_get_core_dev() first on the parent device of the
>>> mdev device. In vhost_mdev, maybe we can also keep track of all
>>> the compatible mdev devices and use this info to do the probe.
>>
>> I don't get why this is needed. My understanding is if we want to go this
>> way, there're actually two parts. 1) Vhost mdev that implements the device
>> managements and vhost ioctl. 2) Vhost it self, which can accept mdev fd as
>> it backend through VHOST_NET_SET_BACKEND.
> I think with vhost-mdev (or with vfio-mdev if we agree to do vhost
> ioctls on vfio device fd directly), we don't need to open /dev/vhost-net
> (and there is no VHOST_NET_SET_BACKEND needed) at all. Either way,
> after getting the fd of the mdev, we just need to do vhost ioctls
> on it directly.


The reason I ask is that vhost-net is designed to not tied to any kind 
of backend. So it's better to have a single place to deal with ioctl. 
But it's not must.

Thanks


>
>>
>>> But we also need a way to allow vfio_mdev driver to distinguish
>>> and reject the incompatible mdev devices.
>>
>> One issue for this series is that it doesn't consider DMA isolation at all.
>>
>>
>>>> It
>>>> seems like this vhost-mdev driver might be similar, using mdev but not
>>>> necessarily vfio-mdev to expose devices.  Thanks,
>>> Yeah, I also think so!
>>
>> I've cced some driver developers for their inputs. I think we need a sample
>> parent drivers in the next version for us to understand the full picture.
>>
>>
>> Thanks
>>
>>
>>> Thanks!
>>> Tiwei
>>>
>>>> Alex

^ permalink raw reply

* Re: [bpf PATCH v2 0/6] bpf: sockmap/tls fixes
From: Jakub Kicinski @ 2019-07-10  2:21 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev, edumazet, bpf
In-Reply-To: <20190709170459.387bced6@cakuba.netronome.com>

On Tue, 9 Jul 2019 17:04:59 -0700, Jakub Kicinski wrote:
> On Tue, 09 Jul 2019 08:40:14 -0700, John Fastabend wrote:
> > Jakub Kicinski wrote:  
> > > Looks like strparser is not done'd for offload?    
> > 
> > Right so if rx_conf != TLS_SW then the hardware needs to do
> > the strparser functionality.  
> 
> Can I just take a stab at fixing the HW part?
> 
> Can I rebase this onto net-next?  There are a few patches from net
> missing in the bpf tree.

I think I fixed patch 1 for offload, I need to test it a little more
and I'll send it back to you. In the meantime, let me ask some
questions about the other two :)

^ permalink raw reply

* Re: [EXT] [PATCH net-next 16/16] qlge: Refill empty buffer queues from wq
From: Benjamin Poirier @ 2019-07-10  1:18 UTC (permalink / raw)
  To: Manish Chopra; +Cc: GR-Linux-NIC-Dev, netdev@vger.kernel.org
In-Reply-To: <DM6PR18MB2697EC53399F214EC3DFC4ABABFD0@DM6PR18MB2697.namprd18.prod.outlook.com>

On 2019/06/27 14:18, Manish Chopra wrote:
> > -----Original Message-----
> > From: Benjamin Poirier <bpoirier@suse.com>
> > Sent: Monday, June 17, 2019 1:19 PM
> > To: Manish Chopra <manishc@marvell.com>; GR-Linux-NIC-Dev <GR-Linux-
> > NIC-Dev@marvell.com>; netdev@vger.kernel.org
> > Subject: [EXT] [PATCH net-next 16/16] qlge: Refill empty buffer queues from
> > wq
> > 
> > External Email
> > 
> > ----------------------------------------------------------------------
> > When operating at mtu 9000, qlge does order-1 allocations for rx buffers in
> > atomic context. This is especially unreliable when free memory is low or
> > fragmented. Add an approach similar to commit 3161e453e496 ("virtio: net
> > refill on out-of-memory") to qlge so that the device doesn't lock up if there
> > are allocation failures.
> > 
[...]
> > +
> > +static void ql_update_buffer_queues(struct rx_ring *rx_ring, gfp_t gfp,
> > +				    unsigned long delay)
> > +{
> > +	bool sbq_fail, lbq_fail;
> > +
> > +	sbq_fail = !!qlge_refill_bq(&rx_ring->sbq, gfp);
> > +	lbq_fail = !!qlge_refill_bq(&rx_ring->lbq, gfp);
> > +
> > +	/* Minimum number of buffers needed to be able to receive at least
> > one
> > +	 * frame of any format:
> > +	 * sbq: 1 for header + 1 for data
> > +	 * lbq: mtu 9000 / lb size
> > +	 * Below this, the queue might stall.
> > +	 */
> > +	if ((sbq_fail && QLGE_BQ_HW_OWNED(&rx_ring->sbq) < 2) ||
> > +	    (lbq_fail && QLGE_BQ_HW_OWNED(&rx_ring->lbq) <
> > +	     DIV_ROUND_UP(9000, LARGE_BUFFER_MAX_SIZE)))
> > +		/* Allocations can take a long time in certain cases (ex.
> > +		 * reclaim). Therefore, use a workqueue for long-running
> > +		 * work items.
> > +		 */
> > +		queue_delayed_work_on(smp_processor_id(),
> > system_long_wq,
> > +				      &rx_ring->refill_work, delay);
> >  }
> > 
> 
> This is probably going to mess up when at the interface load time (qlge_open()) allocation failure occurs, in such cases we don't really want to re-try allocations
> using refill_work but rather simply fail the interface load.

Why would you want to turn a recoverable failure into a fatal failure?

In case of allocation failure at ndo_open time, allocations are retried
later from a workqueue. Meanwhile, the device can use the available rx
buffers (if any could be allocated at all).

> Just to make sure here in such cases it shouldn't lead to kernel panic etc. while completing qlge_open() and
> leaving refill_work executing in background. Or probably handle such allocation failures from the napi context and schedule refill_work from there.
> 

I've just tested allocation failures at open time and didn't find
problems; with mtu 9000, using bcc, for example:
tools/inject.py -P 0.5 -c 100 alloc_page "should_fail_alloc_page(gfp_t gfp_mask, unsigned int order) (order == 1) => qlge_refill_bq()"

What exact scenario do you have in mind that's going to lead to
problems? Please try it out and describe it precisely.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox