Netdev List
* [Patch V2 net 01/11] net: hns3: add error handler for hns3_nic_init_vector_data()
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

When hns3_nic_init_vector_data() fails to map a ring to a vector,
it should undo the netif_napi_add() calls that have already succeeded
and then exit.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 32f3aca8..d9066c5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2821,7 +2821,7 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
 	struct hnae3_handle *h = priv->ae_handle;
 	struct hns3_enet_tqp_vector *tqp_vector;
 	int ret = 0;
-	u16 i;
+	int i, j;
 
 	hns3_nic_set_cpumask(priv);
 
@@ -2868,13 +2868,19 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
 		hns3_free_vector_ring_chain(tqp_vector, &vector_ring_chain);
 
 		if (ret)
-			return ret;
+			goto map_ring_fail;
 
 		netif_napi_add(priv->netdev, &tqp_vector->napi,
 			       hns3_nic_common_poll, NAPI_POLL_WEIGHT);
 	}
 
 	return 0;
+
+map_ring_fail:
+	for (j = i - 1; j >= 0; j--)
+		netif_napi_del(&priv->tqp_vector[j].napi);
+
+	return ret;
 }
 
 static int hns3_nic_alloc_vector_data(struct hns3_nic_priv *priv)
-- 
2.7.4
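
The unwind pattern used by the patch above can be sketched in plain C.
This is only an illustration under stated assumptions: struct vector,
map_ring() and init_vectors() are hypothetical stand-ins, and setting
a flag replaces the real netif_napi_add()/netif_napi_del() calls.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_VECTORS 4

/* Hypothetical stand-in for a per-vector resource such as a NAPI context. */
struct vector {
	bool registered;
};

/* Hypothetical mapping step; fails at index fail_at (-1 means never fail). */
static int map_ring(int idx, int fail_at)
{
	return idx == fail_at ? -1 : 0;
}

/* Register every vector, undoing earlier registrations if one step fails. */
static int init_vectors(struct vector *v, int n, int fail_at)
{
	int i, j, ret;

	assert(v && n >= 0);

	for (i = 0; i < n; i++) {
		ret = map_ring(i, fail_at);
		if (ret)
			goto map_ring_fail;

		v[i].registered = true;		/* plays the role of netif_napi_add() */
	}

	return 0;

map_ring_fail:
	/* Index i failed before it was registered, so unwind i - 1 down to 0. */
	for (j = i - 1; j >= 0; j--)
		v[j].registered = false;	/* plays the role of netif_napi_del() */

	return ret;
}
```

Note that the countdown loop needs a signed index: with an unsigned
type such as u16 the condition j >= 0 can never become false, which is
consistent with the patch changing the declaration to int.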


* [Patch V2 net 00/11] Bugfix for the HNS3 driver
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng

This patch series includes bugfixes for the HNS3 ethernet
controller driver.

Change log:
V1->V2:
	Fixes the compilation break reported by kbuild test robot
	http://patchwork.ozlabs.org/patch/989818/

Huazhong Tan (11):
  net: hns3: add error handler for hns3_nic_init_vector_data()
  net: hns3: add error handler for
    hns3_get_ring_config/hns3_queue_to_ring
  net: hns3: bugfix for reporting unknown vector0 interrupt repeatly
    problem
  net: hns3: bugfix for the initialization of command queue's spin lock
  net: hns3: remove unnecessary queue reset in the
    hns3_uninit_all_ring()
  net: hns3: bugfix for is_valid_csq_clean_head()
  net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read
  net: hns3: fix incorrect return value/type of some functions
  net: hns3: bugfix for handling mailbox while the command queue
    reinitialized
  net: hns3: bugfix for rtnl_lock's range in the hclge_reset()
  net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset()

 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |   6 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    | 105 +++++++++++++++------
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h    |   2 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c |  26 +++--
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    |  42 ++++-----
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |   2 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c |   6 ++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c    |   4 +-
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  |  19 ++--
 9 files changed, 136 insertions(+), 76 deletions(-)

-- 
2.7.4


* [Patch V2 net 08/11] net: hns3: fix incorrect return value/type of some functions
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

Some functions, when they fail to send a command, need to return
the corresponding error value to their caller.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: 681ec3999b3d ("net: hns3: fix for vlan table lost problem when resetting")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
V2: Fix the compilation error reported by kbuild test robot
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |  6 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    | 80 +++++++++++++++-------
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h    |  2 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 34 ++++-----
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  2 +-
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  | 14 ++--
 6 files changed, 85 insertions(+), 53 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index e82e4ca..055b406 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -316,8 +316,8 @@ struct hnae3_ae_ops {
 	int (*set_loopback)(struct hnae3_handle *handle,
 			    enum hnae3_loop loop_mode, bool en);
 
-	void (*set_promisc_mode)(struct hnae3_handle *handle, bool en_uc_pmc,
-				 bool en_mc_pmc);
+	int (*set_promisc_mode)(struct hnae3_handle *handle, bool en_uc_pmc,
+				bool en_mc_pmc);
 	int (*set_mtu)(struct hnae3_handle *handle, int new_mtu);
 
 	void (*get_pauseparam)(struct hnae3_handle *handle,
@@ -391,7 +391,7 @@ struct hnae3_ae_ops {
 				      int vector_num,
 				      struct hnae3_ring_chain_node *vr_chain);
 
-	void (*reset_queue)(struct hnae3_handle *handle, u16 queue_id);
+	int (*reset_queue)(struct hnae3_handle *handle, u16 queue_id);
 	u32 (*get_fw_version)(struct hnae3_handle *handle);
 	void (*get_mdix_mode)(struct hnae3_handle *handle,
 			      u8 *tp_mdix_ctrl, u8 *tp_mdix);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index a80ecfb..4d919b8 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -509,16 +509,18 @@ static void hns3_nic_set_rx_mode(struct net_device *netdev)
 	h->netdev_flags = new_flags;
 }
 
-void hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags)
+int hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags)
 {
 	struct hns3_nic_priv *priv = netdev_priv(netdev);
 	struct hnae3_handle *h = priv->ae_handle;
 
 	if (h->ae_algo->ops->set_promisc_mode) {
-		h->ae_algo->ops->set_promisc_mode(h,
-						  promisc_flags & HNAE3_UPE,
-						  promisc_flags & HNAE3_MPE);
+		return h->ae_algo->ops->set_promisc_mode(h,
+						promisc_flags & HNAE3_UPE,
+						promisc_flags & HNAE3_MPE);
 	}
+
+	return 0;
 }
 
 void hns3_enable_vlan_filter(struct net_device *netdev, bool enable)
@@ -1494,18 +1496,22 @@ static int hns3_vlan_rx_kill_vid(struct net_device *netdev,
 	return ret;
 }
 
-static void hns3_restore_vlan(struct net_device *netdev)
+static int hns3_restore_vlan(struct net_device *netdev)
 {
 	struct hns3_nic_priv *priv = netdev_priv(netdev);
+	int ret = 0;
 	u16 vid;
-	int ret;
 
 	for_each_set_bit(vid, priv->active_vlans, VLAN_N_VID) {
 		ret = hns3_vlan_rx_add_vid(netdev, htons(ETH_P_8021Q), vid);
-		if (ret)
-			netdev_warn(netdev, "Restore vlan: %d filter, ret:%d\n",
-				    vid, ret);
+		if (ret) {
+			netdev_err(netdev, "Restore vlan: %d filter, ret:%d\n",
+				   vid, ret);
+			return ret;
+		}
 	}
+
+	return ret;
 }
 
 static int hns3_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
@@ -3247,11 +3253,12 @@ int hns3_uninit_all_ring(struct hns3_nic_priv *priv)
 }
 
 /* Set mac addr if it is configured. or leave it to the AE driver */
-static void hns3_init_mac_addr(struct net_device *netdev, bool init)
+static int hns3_init_mac_addr(struct net_device *netdev, bool init)
 {
 	struct hns3_nic_priv *priv = netdev_priv(netdev);
 	struct hnae3_handle *h = priv->ae_handle;
 	u8 mac_addr_temp[ETH_ALEN];
+	int ret = 0;
 
 	if (h->ae_algo->ops->get_mac_addr && init) {
 		h->ae_algo->ops->get_mac_addr(h, mac_addr_temp);
@@ -3266,8 +3273,9 @@ static void hns3_init_mac_addr(struct net_device *netdev, bool init)
 	}
 
 	if (h->ae_algo->ops->set_mac_addr)
-		h->ae_algo->ops->set_mac_addr(h, netdev->dev_addr, true);
+		ret = h->ae_algo->ops->set_mac_addr(h, netdev->dev_addr, true);
 
+	return ret;
 }
 
 static int hns3_restore_fd_rules(struct net_device *netdev)
@@ -3480,20 +3488,29 @@ static int hns3_client_setup_tc(struct hnae3_handle *handle, u8 tc)
 	return ret;
 }
 
-static void hns3_recover_hw_addr(struct net_device *ndev)
+static int hns3_recover_hw_addr(struct net_device *ndev)
 {
 	struct netdev_hw_addr_list *list;
 	struct netdev_hw_addr *ha, *tmp;
+	int ret = 0;
 
 	/* go through and sync uc_addr entries to the device */
 	list = &ndev->uc;
-	list_for_each_entry_safe(ha, tmp, &list->list, list)
-		hns3_nic_uc_sync(ndev, ha->addr);
+	list_for_each_entry_safe(ha, tmp, &list->list, list) {
+		ret = hns3_nic_uc_sync(ndev, ha->addr);
+		if (ret)
+			return ret;
+	}
 
 	/* go through and sync mc_addr entries to the device */
 	list = &ndev->mc;
-	list_for_each_entry_safe(ha, tmp, &list->list, list)
-		hns3_nic_mc_sync(ndev, ha->addr);
+	list_for_each_entry_safe(ha, tmp, &list->list, list) {
+		ret = hns3_nic_mc_sync(ndev, ha->addr);
+		if (ret)
+			return ret;
+	}
+
+	return ret;
 }
 
 static void hns3_remove_hw_addr(struct net_device *netdev)
@@ -3620,7 +3637,10 @@ int hns3_nic_reset_all_ring(struct hnae3_handle *h)
 	int ret;
 
 	for (i = 0; i < h->kinfo.num_tqps; i++) {
-		h->ae_algo->ops->reset_queue(h, i);
+		ret = h->ae_algo->ops->reset_queue(h, i);
+		if (ret)
+			return ret;
+
 		hns3_init_ring_hw(priv->ring_data[i].ring);
 
 		/* We need to clear tx ring here because self test will
@@ -3712,18 +3732,30 @@ static int hns3_reset_notify_init_enet(struct hnae3_handle *handle)
 	bool vlan_filter_enable;
 	int ret;
 
-	hns3_init_mac_addr(netdev, false);
-	hns3_recover_hw_addr(netdev);
-	hns3_update_promisc_mode(netdev, handle->netdev_flags);
+	ret = hns3_init_mac_addr(netdev, false);
+	if (ret)
+		return ret;
+
+	ret = hns3_recover_hw_addr(netdev);
+	if (ret)
+		return ret;
+
+	ret = hns3_update_promisc_mode(netdev, handle->netdev_flags);
+	if (ret)
+		return ret;
+
 	vlan_filter_enable = netdev->flags & IFF_PROMISC ? false : true;
 	hns3_enable_vlan_filter(netdev, vlan_filter_enable);
 
-
 	/* Hardware table is only clear when pf resets */
-	if (!(handle->flags & HNAE3_SUPPORT_VF))
-		hns3_restore_vlan(netdev);
+	if (!(handle->flags & HNAE3_SUPPORT_VF)) {
+		ret = hns3_restore_vlan(netdev);
+		return ret;
+	}
 
-	hns3_restore_fd_rules(netdev);
+	ret = hns3_restore_fd_rules(netdev);
+	if (ret)
+		return ret;
 
 	/* Carrier off reporting is important to ethtool even BEFORE open */
 	netif_carrier_off(netdev);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index 71cfca1..d3636d0 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -640,7 +640,7 @@ void hns3_set_vector_coalesce_rl(struct hns3_enet_tqp_vector *tqp_vector,
 				 u32 rl_value);
 
 void hns3_enable_vlan_filter(struct net_device *netdev, bool enable);
-void hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags);
+int hns3_update_promisc_mode(struct net_device *netdev, u8 promisc_flags);
 
 #ifdef CONFIG_HNS3_DCB
 void hns3_dcbnl_setup(struct hnae3_handle *handle);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 4dd0506..f3212c9 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3314,8 +3314,8 @@ void hclge_promisc_param_init(struct hclge_promisc_param *param, bool en_uc,
 	param->vf_id = vport_id;
 }
 
-static void hclge_set_promisc_mode(struct hnae3_handle *handle, bool en_uc_pmc,
-				   bool en_mc_pmc)
+static int hclge_set_promisc_mode(struct hnae3_handle *handle, bool en_uc_pmc,
+				  bool en_mc_pmc)
 {
 	struct hclge_vport *vport = hclge_get_vport(handle);
 	struct hclge_dev *hdev = vport->back;
@@ -3323,7 +3323,7 @@ static void hclge_set_promisc_mode(struct hnae3_handle *handle, bool en_uc_pmc,
 
 	hclge_promisc_param_init(&param, en_uc_pmc, en_mc_pmc, true,
 				 vport->vport_id);
-	hclge_cmd_set_promisc_mode(hdev, &param);
+	return hclge_cmd_set_promisc_mode(hdev, &param);
 }
 
 static int hclge_get_fd_mode(struct hclge_dev *hdev, u8 *fd_mode)
@@ -6107,28 +6107,28 @@ static u16 hclge_covert_handle_qid_global(struct hnae3_handle *handle,
 	return tqp->index;
 }
 
-void hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
+int hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
 {
 	struct hclge_vport *vport = hclge_get_vport(handle);
 	struct hclge_dev *hdev = vport->back;
 	int reset_try_times = 0;
 	int reset_status;
 	u16 queue_gid;
-	int ret;
+	int ret = 0;
 
 	queue_gid = hclge_covert_handle_qid_global(handle, queue_id);
 
 	ret = hclge_tqp_enable(hdev, queue_id, 0, false);
 	if (ret) {
-		dev_warn(&hdev->pdev->dev, "Disable tqp fail, ret = %d\n", ret);
-		return;
+		dev_err(&hdev->pdev->dev, "Disable tqp fail, ret = %d\n", ret);
+		return ret;
 	}
 
 	ret = hclge_send_reset_tqp_cmd(hdev, queue_gid, true);
 	if (ret) {
-		dev_warn(&hdev->pdev->dev,
-			 "Send reset tqp cmd fail, ret = %d\n", ret);
-		return;
+		dev_err(&hdev->pdev->dev,
+			"Send reset tqp cmd fail, ret = %d\n", ret);
+		return ret;
 	}
 
 	reset_try_times = 0;
@@ -6141,16 +6141,16 @@ void hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
 	}
 
 	if (reset_try_times >= HCLGE_TQP_RESET_TRY_TIMES) {
-		dev_warn(&hdev->pdev->dev, "Reset TQP fail\n");
-		return;
+		dev_err(&hdev->pdev->dev, "Reset TQP fail\n");
+		return ret;
 	}
 
 	ret = hclge_send_reset_tqp_cmd(hdev, queue_gid, false);
-	if (ret) {
-		dev_warn(&hdev->pdev->dev,
-			 "Deassert the soft reset fail, ret = %d\n", ret);
-		return;
-	}
+	if (ret)
+		dev_err(&hdev->pdev->dev,
+			"Deassert the soft reset fail, ret = %d\n", ret);
+
+	return ret;
 }
 
 void hclge_reset_vf_queue(struct hclge_vport *vport, u16 queue_id)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index e3dfd65..0d92154 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -778,7 +778,7 @@ int hclge_rss_init_hw(struct hclge_dev *hdev);
 void hclge_rss_indir_init_cfg(struct hclge_dev *hdev);
 
 void hclge_mbx_handler(struct hclge_dev *hdev);
-void hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id);
+int hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id);
 void hclge_reset_vf_queue(struct hclge_vport *vport, u16 queue_id);
 int hclge_cfg_flowctrl(struct hclge_dev *hdev);
 int hclge_func_reset_cmd(struct hclge_dev *hdev, int func_id);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index e0a86a5..b224f6a 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -925,12 +925,12 @@ static int hclgevf_cmd_set_promisc_mode(struct hclgevf_dev *hdev,
 	return status;
 }
 
-static void hclgevf_set_promisc_mode(struct hnae3_handle *handle,
-				     bool en_uc_pmc, bool en_mc_pmc)
+static int hclgevf_set_promisc_mode(struct hnae3_handle *handle,
+				    bool en_uc_pmc, bool en_mc_pmc)
 {
 	struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
 
-	hclgevf_cmd_set_promisc_mode(hdev, en_uc_pmc, en_mc_pmc);
+	return hclgevf_cmd_set_promisc_mode(hdev, en_uc_pmc, en_mc_pmc);
 }
 
 static int hclgevf_tqp_enable(struct hclgevf_dev *hdev, int tqp_id,
@@ -1080,7 +1080,7 @@ static int hclgevf_en_hw_strip_rxvtag(struct hnae3_handle *handle, bool enable)
 				    1, false, NULL, 0);
 }
 
-static void hclgevf_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
+static int hclgevf_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
 {
 	struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
 	u8 msg_data[2];
@@ -1091,10 +1091,10 @@ static void hclgevf_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
 	/* disable vf queue before send queue reset msg to PF */
 	ret = hclgevf_tqp_enable(hdev, queue_id, 0, false);
 	if (ret)
-		return;
+		return ret;
 
-	hclgevf_send_mbx_msg(hdev, HCLGE_MBX_QUEUE_RESET, 0, msg_data,
-			     2, true, NULL, 0);
+	return hclgevf_send_mbx_msg(hdev, HCLGE_MBX_QUEUE_RESET, 0, msg_data,
+				    2, true, NULL, 0);
 }
 
 static int hclgevf_notify_client(struct hclgevf_dev *hdev,
-- 
2.7.4
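
The change from void to int return types in the patch above follows a
common pattern: stop at the first failing step and hand its error code
to the caller. A minimal sketch, assuming hypothetical command senders
(send_cmd_ok(), send_cmd_fail()) and a hypothetical run_steps() helper
that are not part of the driver:

```c
#include <assert.h>

/* Hypothetical command senders; a nonzero return means the command failed. */
static int send_cmd_ok(void)
{
	return 0;
}

static int send_cmd_fail(void)
{
	return -5;	/* arbitrary negative error code for the demo */
}

typedef int (*cmd_fn)(void);

/* Run each step in order and stop at the first failure, returning its code. */
static int run_steps(const cmd_fn *steps, int n, int *completed)
{
	int i, ret;

	assert(steps && completed);

	*completed = 0;
	for (i = 0; i < n; i++) {
		ret = steps[i]();
		if (ret)
			return ret;	/* propagate instead of silently ignoring */

		(*completed)++;
	}

	return 0;
}
```

The completed counter exists only so a caller can see where the
sequence stopped; the driver code has no such counter.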


* [Patch V2 net 10/11] net: hns3: bugfix for rtnl_lock's range in the hclge_reset()
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

Since hclge_reset_wait() is used to wait for the hardware to complete
the reset, it is not necessary to hold the rtnl_lock during
hclge_reset_wait(). So this patch releases the lock for the duration
of hclge_reset_wait().

Fixes: 6d4fab39533f ("net: hns3: Reset net device with rtnl_lock")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index f3212c9..ffdd960 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2470,14 +2470,17 @@ static void hclge_reset(struct hclge_dev *hdev)
 	handle = &hdev->vport[0].nic;
 	rtnl_lock();
 	hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
+	rtnl_unlock();
 
 	if (!hclge_reset_wait(hdev)) {
+		rtnl_lock();
 		hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
 		hclge_reset_ae_dev(hdev->ae_dev);
 		hclge_notify_client(hdev, HNAE3_INIT_CLIENT);
 
 		hclge_clear_reset_cause(hdev);
 	} else {
+		rtnl_lock();
 		/* schedule again to check pending resets later */
 		set_bit(hdev->reset_type, &hdev->reset_pending);
 		hclge_reset_task_schedule(hdev);
-- 
2.7.4
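
The locking change above can be illustrated with a toy lock that only
tracks whether it is held. This is a sketch under stated assumptions:
struct toy_lock, wait_for_hw() and reset_sequence() are hypothetical,
and the toy lock provides no real mutual exclusion.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only records whether it is held; a stand-in for rtnl_lock. */
struct toy_lock {
	bool held;
};

static void lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical hardware wait; records whether the lock was held meanwhile. */
static int wait_for_hw(const struct toy_lock *l, bool *held_during_wait,
		       int hw_result)
{
	*held_during_wait = l->held;
	return hw_result;
}

/* Hold the lock for state changes, drop it for the wait, retake it after. */
static int reset_sequence(struct toy_lock *l, bool *held_during_wait,
			  int hw_result)
{
	int ret;

	lock_acquire(l);
	/* ... bring the client down under the lock ... */
	lock_release(l);

	ret = wait_for_hw(l, held_during_wait, hw_result);

	lock_acquire(l);	/* both outcomes touch shared state again */
	/* ... re-initialise on success, or reschedule on failure ... */
	lock_release(l);

	return ret;
}
```

The point of the pattern is that other users of the lock are not
blocked for as long as the hardware takes to finish the reset.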


* [Patch V2 net 06/11] net: hns3: bugfix for is_valid_csq_clean_head()
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

The HEAD pointer of the hardware command queue may be equal to the
command queue's next_to_use in the driver. That is not an invalid HEAD
pointer: the hardware may not have processed the command in time, so
the HEAD pointer has not been updated yet. The variable names in this
function are also hard to read, so give them more descriptive ones.

Fixes: 3ff504908f95 ("net: hns3: fix a dead loop in hclge_cmd_csq_clean")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
index 68026a5..690f62e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -24,15 +24,15 @@ static int hclge_ring_space(struct hclge_cmq_ring *ring)
 	return ring->desc_num - used - 1;
 }
 
-static int is_valid_csq_clean_head(struct hclge_cmq_ring *ring, int h)
+static int is_valid_csq_clean_head(struct hclge_cmq_ring *ring, int head)
 {
-	int u = ring->next_to_use;
-	int c = ring->next_to_clean;
+	int ntu = ring->next_to_use;
+	int ntc = ring->next_to_clean;
 
-	if (unlikely(h >= ring->desc_num))
-		return 0;
+	if (ntu > ntc)
+		return head >= ntc && head <= ntu;
 
-	return u > c ? (h > c && h <= u) : (h > c || h <= u);
+	return head >= ntc || head <= ntu;
 }
 
 static int hclge_alloc_cmd_desc(struct hclge_cmq_ring *ring)
-- 
2.7.4
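
The new check in the patch above treats HEAD as valid when it lies in
the circular range from next_to_clean to next_to_use, both ends
included. A standalone sketch of that test follows; head_is_valid() is
a hypothetical name, and the indices are passed directly instead of
through struct hclge_cmq_ring.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when head lies in the circular range [ntc, ntu], where
 * ntc is next_to_clean and ntu is next_to_use.
 */
static bool head_is_valid(int ntc, int ntu, int head)
{
	assert(ntc >= 0 && ntu >= 0 && head >= 0);

	if (ntu > ntc)
		return head >= ntc && head <= ntu;	/* range does not wrap */

	return head >= ntc || head <= ntu;		/* range wraps past the end */
}
```

As far as the diff shows, the comparison against next_to_clean changes
from strictly greater to greater-or-equal, so a HEAD that has not
moved yet is accepted. When ntu equals ntc, the second expression is
true for every head value.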


* [Patch V2 net 09/11] net: hns3: bugfix for handling mailbox while the command queue reinitialized
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

On a multi-core machine, the mailbox service and the reset service
can run at the same time. The reset service re-initializes the
command queue; until that has happened, the mailbox handler can only
get invalid messages.

The HCLGE_STATE_CMD_DISABLE flag means that the command queue is not
available and needs to be reinitialized. Therefore, when the mailbox
handler recognizes this flag, it should not process the command.

Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
index 04462a3..6ac2fab 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
@@ -400,6 +400,12 @@ void hclge_mbx_handler(struct hclge_dev *hdev)
 
 	/* handle all the mailbox requests in the queue */
 	while (!hclge_cmd_crq_empty(&hdev->hw)) {
+		if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state)) {
+			dev_warn(&hdev->pdev->dev,
+				 "command queue need re-initialize\n");
+			return;
+		}
+
 		desc = &crq->desc[crq->next_to_use];
 		req = (struct hclge_mbx_vf_to_pf_cmd *)desc->data;
 
-- 
2.7.4
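
The guard added above can be sketched as a drain loop that rechecks a
"queue disabled" flag before handling each request. Everything here is
a hypothetical model: struct mbox and mbox_handle() are not driver
names, a plain bool replaces the atomic test_bit() on hdev->state, and
disable_after is a test hook that simulates a concurrent reset.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mailbox state for the demo. */
struct mbox {
	int pending;		/* requests still waiting in the queue */
	int handled;		/* requests processed so far */
	bool cmd_disabled;	/* stand-in for HCLGE_STATE_CMD_DISABLE */
	int disable_after;	/* test hook: set the flag after this many */
};

/* Drain the queue, bailing out as soon as the command queue is unusable. */
static void mbox_handle(struct mbox *m)
{
	assert(m);

	while (m->pending > 0) {
		if (m->cmd_disabled)
			return;	/* leave the rest until re-initialisation */

		m->pending--;
		m->handled++;

		/* Simulate a reset on another core flipping the flag. */
		if (m->handled == m->disable_after)
			m->cmd_disabled = true;
	}
}
```

Checking inside the loop, not once before it, matters because the flag
can change while earlier requests are being handled.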


* [Patch V2 net 11/11] net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset()
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

Since hclgevf_reset_wait() is used to wait for the hardware to complete
the reset, it is not necessary to hold the rtnl_lock during
hclgevf_reset_wait(). So this patch releases the lock for the duration
of hclgevf_reset_wait().

Fixes: 6988eb2a9b77 ("net: hns3: Add support to reset the enet/ring mgmt layer")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index b224f6a..085edb9 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1170,6 +1170,8 @@ static int hclgevf_reset(struct hclgevf_dev *hdev)
 	/* bring down the nic to stop any ongoing TX/RX */
 	hclgevf_notify_client(hdev, HNAE3_DOWN_CLIENT);
 
+	rtnl_unlock();
+
 	/* check if VF could successfully fetch the hardware reset completion
 	 * status from the hardware
 	 */
@@ -1181,12 +1183,15 @@ static int hclgevf_reset(struct hclgevf_dev *hdev)
 			ret);
 
 		dev_warn(&hdev->pdev->dev, "VF reset failed, disabling VF!\n");
+		rtnl_lock();
 		hclgevf_notify_client(hdev, HNAE3_UNINIT_CLIENT);
 
 		rtnl_unlock();
 		return ret;
 	}
 
+	rtnl_lock();
+
 	/* now, re-initialize the nic client and ae device*/
 	ret = hclgevf_reset_stack(hdev);
 	if (ret)
-- 
2.7.4


* [Patch V2 net 02/11] net: hns3: add error handler for hns3_get_ring_config/hns3_queue_to_ring
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

When hns3_get_ring_config()/hns3_queue_to_ring() fails during a reset,
the memory already allocated is not freed before hns3_get_ring_config()
and hns3_queue_to_ring() return. This patch frees those buffers on the
error paths.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index d9066c5..6f0fd62 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3037,8 +3037,10 @@ static int hns3_queue_to_ring(struct hnae3_queue *tqp,
 		return ret;
 
 	ret = hns3_ring_get_cfg(tqp, priv, HNAE3_RING_TYPE_RX);
-	if (ret)
+	if (ret) {
+		devm_kfree(priv->dev, priv->ring_data[tqp->tqp_index].ring);
 		return ret;
+	}
 
 	return 0;
 }
@@ -3047,7 +3049,7 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
 {
 	struct hnae3_handle *h = priv->ae_handle;
 	struct pci_dev *pdev = h->pdev;
-	int i, ret;
+	int i, j, ret;
 
 	priv->ring_data =  devm_kzalloc(&pdev->dev,
 					array3_size(h->kinfo.num_tqps,
@@ -3065,6 +3067,12 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
 
 	return 0;
 err:
+	for (j = i - 1; j >= 0; j--) {
+		devm_kfree(priv->dev, priv->ring_data[j].ring);
+		devm_kfree(priv->dev,
+			   priv->ring_data[j + h->kinfo.num_tqps].ring);
+	}
+
 	devm_kfree(&pdev->dev, priv->ring_data);
 	return ret;
 }
-- 
2.7.4
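
The two-level cleanup in the patch above (free the TX half when the RX
half fails, then free every earlier pair) can be sketched with plain
malloc()/free(). The names demo_alloc(), demo_free(), queue_to_ring()
and get_ring_config() are hypothetical simplifications, and the
live_allocs counter exists only so the demo can check for leaks; the
driver itself uses devm_kzalloc()/devm_kfree().

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;	/* outstanding ring allocations, for the demo only */

static void *demo_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Allocate a TX/RX pair for queue idx; idx == fail_at forces an RX failure. */
static int queue_to_ring(void **rings, int n, int idx, int fail_at)
{
	rings[idx] = demo_alloc(16);			/* TX ring */
	if (!rings[idx])
		return -1;

	rings[idx + n] = idx == fail_at ? NULL : demo_alloc(16);	/* RX ring */
	if (!rings[idx + n]) {
		demo_free(rings[idx]);			/* do not leak the TX half */
		rings[idx] = NULL;
		return -1;
	}

	return 0;
}

/* Build all pairs; on failure free the pairs created before the failing one. */
static int get_ring_config(void ***out, int n, int fail_at)
{
	void **rings;
	int i, j, ret;

	assert(out && n > 0);

	rings = calloc(2 * (size_t)n, sizeof(*rings));
	if (!rings)
		return -1;

	for (i = 0; i < n; i++) {
		ret = queue_to_ring(rings, n, i, fail_at);
		if (ret)
			goto err;
	}

	*out = rings;
	return 0;

err:
	for (j = i - 1; j >= 0; j--) {
		demo_free(rings[j]);
		demo_free(rings[j + n]);
	}
	free(rings);
	return ret;
}
```

The failing index cleans up after itself inside queue_to_ring(), so the
error label only has to walk the indices below it.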


* [Patch V2 net 04/11] net: hns3: bugfix for the initialization of command queue's spin lock
From: Huazhong Tan @ 2018-10-27  8:10 UTC
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	linyunsheng
In-Reply-To: <1540627818-17635-1-git-send-email-tanhuazhong@huawei.com>

The spin lock of the command queue only needs to be initialized once
when the driver initializes the command queue. It is not necessary to
initialize the spin lock when resetting. At the same time, the
modification of the queue member should be performed after acquiring
the lock.

Fixes: 3efb960f056d ("net: hns3: Refactor the initialization of command queue")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
index ac13cb2..68026a5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -304,6 +304,10 @@ int hclge_cmd_queue_init(struct hclge_dev *hdev)
 {
 	int ret;
 
+	/* Setup the lock for command queue */
+	spin_lock_init(&hdev->hw.cmq.csq.lock);
+	spin_lock_init(&hdev->hw.cmq.crq.lock);
+
 	/* Setup the queue entries for use cmd queue */
 	hdev->hw.cmq.csq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
 	hdev->hw.cmq.crq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
@@ -337,18 +341,20 @@ int hclge_cmd_init(struct hclge_dev *hdev)
 	u32 version;
 	int ret;
 
+	spin_lock_bh(&hdev->hw.cmq.csq.lock);
+	spin_lock_bh(&hdev->hw.cmq.crq.lock);
+
 	hdev->hw.cmq.csq.next_to_clean = 0;
 	hdev->hw.cmq.csq.next_to_use = 0;
 	hdev->hw.cmq.crq.next_to_clean = 0;
 	hdev->hw.cmq.crq.next_to_use = 0;
 
-	/* Setup the lock for command queue */
-	spin_lock_init(&hdev->hw.cmq.csq.lock);
-	spin_lock_init(&hdev->hw.cmq.crq.lock);
-
 	hclge_cmd_init_regs(&hdev->hw);
 	clear_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state);
 
+	spin_unlock_bh(&hdev->hw.cmq.crq.lock);
+	spin_unlock_bh(&hdev->hw.cmq.csq.lock);
+
 	ret = hclge_cmd_query_firmware_version(&hdev->hw, &version);
 	if (ret) {
 		dev_err(&hdev->pdev->dev,
-- 
2.7.4
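
The split described above, one-time lock initialisation versus
per-reset index updates under the lock, can be sketched with a toy
lock. struct toy_lock, struct cmd_ring, ring_setup() and ring_reset()
are hypothetical names, and the toy lock only tracks state; it is not
a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that records its state; not a real spinlock. */
struct toy_lock {
	bool initialized;
	bool held;
};

struct cmd_ring {
	struct toy_lock lock;
	int next_to_use;
	int next_to_clean;
};

static void lock_init(struct toy_lock *l)
{
	l->initialized = true;
	l->held = false;
}

static void lock_acquire(struct toy_lock *l)
{
	assert(l->initialized && !l->held);
	l->held = true;
}

static void lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* One-time setup: the only place where the lock is initialised. */
static void ring_setup(struct cmd_ring *r)
{
	lock_init(&r->lock);
	r->next_to_use = 0;
	r->next_to_clean = 0;
}

/* Runs on every reset: rewinds the indices under the lock, no re-init. */
static void ring_reset(struct cmd_ring *r)
{
	lock_acquire(&r->lock);
	r->next_to_use = 0;
	r->next_to_clean = 0;
	lock_release(&r->lock);
}
```

Re-initialising a lock that another context might currently hold would
discard that holder's state, which is one reason to keep lock_init()
out of the reset path.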


* [PATCH] can: hi311x: Use level-triggered interrupt
From: Lukas Wunner @ 2018-10-27  8:36 UTC
  To: Marc Kleine-Budde, Wolfgang Grandegger
  Cc: Mathias Duckeck, Akshay Bhat, Casey Fitzpatrick, linux-can,
	netdev

If the hi3110 shares the SPI bus with another traffic-intensive device
and packets are received in high volume (by a separate machine sending
with "cangen -g 0 -i -x"), reception stops after a few minutes and the
counter in /proc/interrupts stops incrementing.  Bus state is "active".
Bringing the interface down and back up resumes reception.  The
issue is not observed when the hi3110 is the sole device on the SPI bus.

Using a level-triggered interrupt makes the issue go away and lets the
hi3110 successfully receive 2 GByte over the course of 5 days while a
ks8851 Ethernet chip on the same SPI bus handles 6 GByte of traffic.

Unfortunately the hi3110 datasheet is mum on the trigger type.  The pin
description on page 3 only specifies the polarity (active high):
http://www.holtic.com/documents/371-hi-3110_v-rev-kpdf.do

Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Akshay Bhat <akshay.bhat@timesys.com>
Cc: Casey Fitzpatrick <casey.fitzpatrick@timesys.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 Documentation/devicetree/bindings/net/can/holt_hi311x.txt | 2 +-
 drivers/net/can/spi/hi311x.c                              | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/can/holt_hi311x.txt b/Documentation/devicetree/bindings/net/can/holt_hi311x.txt
index 903a78da65be..3a9926f99937 100644
--- a/Documentation/devicetree/bindings/net/can/holt_hi311x.txt
+++ b/Documentation/devicetree/bindings/net/can/holt_hi311x.txt
@@ -17,7 +17,7 @@ Example:
 		reg = <1>;
 		clocks = <&clk32m>;
 		interrupt-parent = <&gpio4>;
-		interrupts = <13 IRQ_TYPE_EDGE_RISING>;
+		interrupts = <13 IRQ_TYPE_LEVEL_HIGH>;
 		vdd-supply = <&reg5v0>;
 		xceiver-supply = <&reg5v0>;
 	};
diff --git a/drivers/net/can/spi/hi311x.c b/drivers/net/can/spi/hi311x.c
index 53e320c92a8b..ddaf46239e39 100644
--- a/drivers/net/can/spi/hi311x.c
+++ b/drivers/net/can/spi/hi311x.c
@@ -760,7 +760,7 @@ static int hi3110_open(struct net_device *net)
 {
 	struct hi3110_priv *priv = netdev_priv(net);
 	struct spi_device *spi = priv->spi;
-	unsigned long flags = IRQF_ONESHOT | IRQF_TRIGGER_RISING;
+	unsigned long flags = IRQF_ONESHOT | IRQF_TRIGGER_HIGH;
 	int ret;
 
 	ret = open_candev(net);
-- 
2.19.1


* Re: CAKE and r8169 cause panic on upload in v4.19
From: Oleksandr Natalenko @ 2018-10-27 11:04 UTC
  To: Dave Taht
  Cc: hkallweit1, Toke Høiland-Jørgensen, David S. Miller,
	Jamal Hadi Salim, Cong Wang, Jiří Pírko,
	Linux Kernel Network Developers, linux-kernel
In-Reply-To: <CAA93jw4pvEYiXZ-C=yH1H_twZigSYAuEMrE_CCDjzLnVHwAdVA@mail.gmail.com>

Hi.

On 27.10.2018 01:08, Dave Taht wrote:
> Groovy. :whew:
> 
> I do look forward to more cake test results, particularly on different
> network cards such as these, and at speeds higher than 10Gbit on high
> end hardware, and in the 100-1Gbit range on low to mid-range. After
> the last round of features added to cake before it went into linux, we
> run now out of cpu on inbound shaping at those speeds on low end apu2
> (x86) hardware, (atom and a15 chips are not so hot now either) and I
> wish I knew what we could do to speed it up. The new "list skb" and
> mirred code looked promising but we haven't got around to exploring it
> yet.
> 
> Thank you for trying and I hope this gets sorted out on your chipset.

Yeah, but this is still strange. Both the LAN computer and the router run
4.19, but only the router panics. The LAN computer uses the alx driver, and
the router uses r8169. Both had GRO enabled at the moment of the panic. But
[1] reports that this happens with an Intel NIC too, so it must not be
limited to Realtek.

> We tend to use flent's rrul test to *really* abuse things. :)
> 
> So cake's ok with gro disabled in hw?

Yes, I've gone back to CAKE but with GRO disabled on the NIC, and it is
stable now. I've also asked the bug reporter in [1] to do the same, so we
will see.

Thanks.

-- 
   Oleksandr Natalenko (post-factum)

[1] https://bugzilla.kernel.org/show_bug.cgi?id=201063


* [PATCH] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 19:58 UTC (permalink / raw)
  To: vyasevich, nhorman, marcelo.leitner
  Cc: davem, linux-sctp, netdev, linux-kernel, syzkaller, Tomas Bortoli

It is possible to perform out-of-bound reads on
sctp_getsockopt_pr_streamstatus() and on
sctp_getsockopt_pr_assocstatus() by passing from userspace a
sprstat_policy that overflows the abandoned_sent/abandoned_unsent
fixed length arrays. The over-read data are directly copied/leaked
to userspace.

Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
---
 net/sctp/socket.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fc0386e8ff23..5290b8bd40c8 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7083,7 +7083,8 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
 	}
 
 	policy = params.sprstat_policy;
-	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
+	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
 		goto out;
 
 	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
@@ -7142,7 +7143,8 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
 	}
 
 	policy = params.sprstat_policy;
-	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
 		goto out;
 
 	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
-- 
2.11.0


* [PATCH] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 20:20 UTC (permalink / raw)
  To: vyasevich, nhorman, marcelo.leitner
  Cc: davem, linux-sctp, netdev, linux-kernel, Tomas Bortoli

It is possible to perform out-of-bound reads on
sctp_getsockopt_pr_streamstatus() and on
sctp_getsockopt_pr_assocstatus() by passing from userspace a
sprstat_policy that overflows the abandoned_sent/abandoned_unsent
fixed length arrays. The over-read data are directly copied/leaked
to userspace.

Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
---
v2 - added the forgotten ||

 net/sctp/socket.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fc0386e8ff23..5290b8bd40c8 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7083,7 +7083,8 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
 	}
 
 	policy = params.sprstat_policy;
-	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
 		goto out;
 
 	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
@@ -7142,7 +7143,8 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
 	}
 
 	policy = params.sprstat_policy;
-	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
 		goto out;
 
 	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
-- 
2.11.0


* Re: [PATCH] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 20:43 UTC (permalink / raw)
  To: vyasevich, nhorman, marcelo.leitner
  Cc: davem, linux-sctp, netdev, linux-kernel
In-Reply-To: <20181027202026.32157-1-tomasbortoli@gmail.com>

On 10/27/18 10:20 PM, Tomas Bortoli wrote:
> It is possible to perform out-of-bound reads on
> sctp_getsockopt_pr_streamstatus() and on
> sctp_getsockopt_pr_assocstatus() by passing from userspace a
> sprstat_policy that overflows the abandoned_sent/abandoned_unsent
> fixed length arrays. The over-read data are directly copied/leaked
> to userspace.
> 
> Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
> Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
> ---
> v2 - added the forgotten ||
> 
>  net/sctp/socket.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index fc0386e8ff23..5290b8bd40c8 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -7083,7 +7083,8 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
>  	}
>  
>  	policy = params.sprstat_policy;
> -	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
> +	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
> +	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
>  		goto out;
>  
>  	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
> @@ -7142,7 +7143,8 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
>  	}
>  
>  	policy = params.sprstat_policy;
> -	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
> +	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
> +	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
>  		goto out;
>  
>  	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
> 

I just realized we also have to check for indexes less than 0.
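
For illustration, one way to reject both too-large and negative indexes
with a single comparison is to do the index arithmetic on an unsigned
value, so that a would-be negative index wraps to a large number and fails
the upper-bound check. This is a minimal standalone sketch, not the kernel
code: the EX_* constants and policy_to_slot() are made-up names, and their
values are assumed for the example.

```c
/* Hypothetical constants for this sketch; not taken from the kernel headers. */
#define EX_POLICY_MASK	0x0030u	/* bits that select a policy */
#define EX_NUM_SLOTS	3u	/* length of the per-policy counter arrays */

/*
 * Map a user-supplied policy value to an array slot.
 * Returns the slot (0 .. EX_NUM_SLOTS - 1), or -1 if the value is unusable.
 */
static int policy_to_slot(unsigned int policy)
{
	unsigned int idx;

	if (policy & ~EX_POLICY_MASK)	/* stray bits set */
		return -1;
	if (!(policy & EX_POLICY_MASK))	/* no policy selected */
		return -1;

	/* Unsigned arithmetic: a would-be negative index wraps to a huge value. */
	idx = (policy >> 4) - 1u;

	/* Defensive bound check; it also catches the wrapped case. */
	if (idx >= EX_NUM_SLOTS)
		return -1;

	return (int)idx;
}
```

A caller would index the array only when the return value is not -1.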


* [PATCH] net/packet: support vhost mrg_rxbuf
From: Jianfeng Tan @ 2018-10-27 12:04 UTC (permalink / raw)
  To: netdev; +Cc: davem, jasowang, mst

Previously, the virtio net header size was hardcoded to 10, which made
the mrg_rxbuf feature unavailable.

We redefine the PACKET_VNET_HDR socket option, which used to treat user
input as a boolean, to treat it as an int: 0, 10 and 12 are taken as
given, and every other value is treated as 10.

Only one case behaves differently: if the user input is 12, the header
size used to be 10, but now it is 12.

Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
---
 net/packet/af_packet.c | 97 ++++++++++++++++++++++++++----------------
 net/packet/diag.c      |  2 +-
 net/packet/internal.h  |  2 +-
 3 files changed, 63 insertions(+), 38 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index ec3095f13aae..1bd7f4cdcc80 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1999,18 +1999,24 @@ static unsigned int run_filter(struct sk_buff *skb,
 }
 
 static int packet_rcv_vnet(struct msghdr *msg, const struct sk_buff *skb,
-			   size_t *len)
+			   size_t *len, int vnet_hdr_len)
 {
+	int res;
 	struct virtio_net_hdr vnet_hdr;
 
-	if (*len < sizeof(vnet_hdr))
+	if (*len < vnet_hdr_len)
 		return -EINVAL;
-	*len -= sizeof(vnet_hdr);
+	*len -= vnet_hdr_len;
 
 	if (virtio_net_hdr_from_skb(skb, &vnet_hdr, vio_le(), true, 0))
 		return -EINVAL;
 
-	return memcpy_to_msg(msg, (void *)&vnet_hdr, sizeof(vnet_hdr));
+	res = memcpy_to_msg(msg, (void *)&vnet_hdr, sizeof(vnet_hdr));
+	if (res == 0)
+		iov_iter_advance(&msg->msg_iter,
+				 vnet_hdr_len - sizeof(vnet_hdr));
+
+	return res;
 }
 
 /*
@@ -2206,11 +2212,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 				  po->tp_reserve;
 	} else {
 		unsigned int maclen = skb_network_offset(skb);
+		int vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz);
+
 		netoff = TPACKET_ALIGN(po->tp_hdrlen +
 				       (maclen < 16 ? 16 : maclen)) +
 				       po->tp_reserve;
-		if (po->has_vnet_hdr) {
-			netoff += sizeof(struct virtio_net_hdr);
+		if (vnet_hdr_sz) {
+			netoff += vnet_hdr_sz;
 			do_vnet = true;
 		}
 		macoff = netoff - maclen;
@@ -2429,19 +2437,6 @@ static int __packet_snd_vnet_parse(struct virtio_net_hdr *vnet_hdr, size_t len)
 	return 0;
 }
 
-static int packet_snd_vnet_parse(struct msghdr *msg, size_t *len,
-				 struct virtio_net_hdr *vnet_hdr)
-{
-	if (*len < sizeof(*vnet_hdr))
-		return -EINVAL;
-	*len -= sizeof(*vnet_hdr);
-
-	if (!copy_from_iter_full(vnet_hdr, sizeof(*vnet_hdr), &msg->msg_iter))
-		return -EFAULT;
-
-	return __packet_snd_vnet_parse(vnet_hdr, *len);
-}
-
 static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		void *frame, struct net_device *dev, void *data, int tp_len,
 		__be16 proto, unsigned char *addr, int hlen, int copylen,
@@ -2609,6 +2604,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 	int len_sum = 0;
 	int status = TP_STATUS_AVAILABLE;
 	int hlen, tlen, copylen = 0;
+	int vnet_hdr_sz;
 
 	mutex_lock(&po->pg_vec_lock);
 
@@ -2648,7 +2644,8 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 	size_max = po->tx_ring.frame_size
 		- (po->tp_hdrlen - sizeof(struct sockaddr_ll));
 
-	if ((size_max > dev->mtu + reserve + VLAN_HLEN) && !po->has_vnet_hdr)
+	vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz);
+	if ((size_max > dev->mtu + reserve + VLAN_HLEN) && !vnet_hdr_sz)
 		size_max = dev->mtu + reserve + VLAN_HLEN;
 
 	do {
@@ -2668,10 +2665,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 		status = TP_STATUS_SEND_REQUEST;
 		hlen = LL_RESERVED_SPACE(dev);
 		tlen = dev->needed_tailroom;
-		if (po->has_vnet_hdr) {
+		if (vnet_hdr_sz) {
 			vnet_hdr = data;
-			data += sizeof(*vnet_hdr);
-			tp_len -= sizeof(*vnet_hdr);
+			data += vnet_hdr_sz;
+			tp_len -= vnet_hdr_sz;
 			if (tp_len < 0 ||
 			    __packet_snd_vnet_parse(vnet_hdr, tp_len)) {
 				tp_len = -EINVAL;
@@ -2696,7 +2693,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 					  addr, hlen, copylen, &sockc);
 		if (likely(tp_len >= 0) &&
 		    tp_len > dev->mtu + reserve &&
-		    !po->has_vnet_hdr &&
+		    !vnet_hdr_sz &&
 		    !packet_extra_vlan_len_allowed(dev, skb))
 			tp_len = -EMSGSIZE;
 
@@ -2715,7 +2712,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 			}
 		}
 
-		if (po->has_vnet_hdr) {
+		if (vnet_hdr_sz) {
 			if (virtio_net_hdr_to_skb(skb, vnet_hdr, vio_le())) {
 				tp_len = -EINVAL;
 				goto tpacket_error;
@@ -2802,9 +2799,9 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	int err, reserve = 0;
 	struct sockcm_cookie sockc;
 	struct virtio_net_hdr vnet_hdr = { 0 };
+	int vnet_hdr_sz;
 	int offset = 0;
 	struct packet_sock *po = pkt_sk(sk);
-	bool has_vnet_hdr = false;
 	int hlen, tlen, linear;
 	int extra_len = 0;
 
@@ -2844,11 +2841,29 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 
 	if (sock->type == SOCK_RAW)
 		reserve = dev->hard_header_len;
-	if (po->has_vnet_hdr) {
-		err = packet_snd_vnet_parse(msg, &len, &vnet_hdr);
-		if (err)
+
+	vnet_hdr_sz = READ_ONCE(po->vnet_hdr_sz);
+	if (vnet_hdr_sz) {
+		if (len < vnet_hdr_sz) {
+			err = -EINVAL;
 			goto out_unlock;
-		has_vnet_hdr = true;
+		}
+		len -= vnet_hdr_sz;
+
+		if (!copy_from_iter_full(&vnet_hdr, sizeof(vnet_hdr),
+					 &msg->msg_iter)) {
+			err = -EFAULT;
+			goto out_unlock;
+		}
+
+		if (__packet_snd_vnet_parse(&vnet_hdr, len)) {
+			err = -EINVAL;
+			goto out_unlock;
+		}
+
+		/* TODO: check hdr_len with len? */
+
+		iov_iter_advance(&msg->msg_iter, vnet_hdr_sz - sizeof(vnet_hdr));
 	}
 
 	if (unlikely(sock_flag(sk, SOCK_NOFCS))) {
@@ -2912,7 +2927,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	skb->mark = sockc.mark;
 	skb->tstamp = sockc.transmit_time;
 
-	if (has_vnet_hdr) {
+	if (vnet_hdr_sz) {
 		err = virtio_net_hdr_to_skb(skb, &vnet_hdr, vio_le());
 		if (err)
 			goto out_free;
@@ -3307,11 +3322,11 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	if (pkt_sk(sk)->pressure)
 		packet_rcv_has_room(pkt_sk(sk), NULL);
 
-	if (pkt_sk(sk)->has_vnet_hdr) {
-		err = packet_rcv_vnet(msg, skb, &len);
+	vnet_hdr_len = READ_ONCE(pkt_sk(sk)->vnet_hdr_sz);
+	if (vnet_hdr_len) {
+		err = packet_rcv_vnet(msg, skb, &len, vnet_hdr_len);
 		if (err)
 			goto out_free;
-		vnet_hdr_len = sizeof(struct virtio_net_hdr);
 	}
 
 	/* You lose any data beyond the buffer you gave. If it worries
@@ -3772,7 +3787,17 @@ packet_setsockopt(struct socket *sock, int level, int optname, char __user *optv
 		if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) {
 			ret = -EBUSY;
 		} else {
-			po->has_vnet_hdr = !!val;
+			/* Previously we treated user input as boolean (!!val),
+			 * now we treat it as int. After the correction below,
+			 * the only case that behaves differently is 12, which
+			 * results in a vnet header size of 12 instead of 10.
+			 */
+			if (val &&
+			    val != sizeof(struct virtio_net_hdr) &&
+			    val != sizeof(struct virtio_net_hdr_mrg_rxbuf))
+				val = sizeof(struct virtio_net_hdr);
+
+			po->vnet_hdr_sz = val;
 			ret = 0;
 		}
 		release_sock(sk);
@@ -3903,7 +3928,7 @@ static int packet_getsockopt(struct socket *sock, int level, int optname,
 		val = po->origdev;
 		break;
 	case PACKET_VNET_HDR:
-		val = po->has_vnet_hdr;
+		val = po->vnet_hdr_sz;
 		break;
 	case PACKET_VERSION:
 		val = po->tp_version;
diff --git a/net/packet/diag.c b/net/packet/diag.c
index 7ef1c881ae74..950015b6704f 100644
--- a/net/packet/diag.c
+++ b/net/packet/diag.c
@@ -26,7 +26,7 @@ static int pdiag_put_info(const struct packet_sock *po, struct sk_buff *nlskb)
 		pinfo.pdi_flags |= PDI_AUXDATA;
 	if (po->origdev)
 		pinfo.pdi_flags |= PDI_ORIGDEV;
-	if (po->has_vnet_hdr)
+	if (po->vnet_hdr_sz)
 		pinfo.pdi_flags |= PDI_VNETHDR;
 	if (po->tp_loss)
 		pinfo.pdi_flags |= PDI_LOSS;
diff --git a/net/packet/internal.h b/net/packet/internal.h
index 3bb7c5fb3bff..11bc75950f28 100644
--- a/net/packet/internal.h
+++ b/net/packet/internal.h
@@ -115,9 +115,9 @@ struct packet_sock {
 	unsigned int		running;	/* bind_lock must be held */
 	unsigned int		auxdata:1,	/* writer must hold sock lock */
 				origdev:1,
-				has_vnet_hdr:1,
 				tp_loss:1,
 				tp_tx_has_off:1;
+	int			vnet_hdr_sz;
 	int			pressure;
 	int			ifindex;	/* bound device		*/
 	__be16			num;
-- 
2.17.1
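
The value handling described in the commit message can be sketched in
isolation as follows. This is only an illustration of the mapping, assuming
the two header sizes are 10 and 12 bytes as stated above; the EX_* names and
normalize_vnet_hdr_sz() are hypothetical and do not appear in the patch.

```c
/* Sizes assumed from the description above; names are hypothetical. */
#define EX_VNET_HDR_LEN			10	/* plain virtio net header */
#define EX_VNET_HDR_MRG_RXBUF_LEN	12	/* header with mergeable rx buffers */

/*
 * Turn the raw option value into the header size to use:
 * 0 disables the header, 10 and 12 are kept, anything else becomes 10.
 */
static int normalize_vnet_hdr_sz(int val)
{
	if (val &&
	    val != EX_VNET_HDR_LEN &&
	    val != EX_VNET_HDR_MRG_RXBUF_LEN)
		return EX_VNET_HDR_LEN;

	return val;
}
```

With this mapping, a caller that used to pass 1 to enable the header still
gets a 10-byte header, and only a caller that passed exactly 12 sees new
behaviour.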


* Re: [PATCH] sctp: socket.c validate sprstat_policy
From: kbuild test robot @ 2018-10-27 20:50 UTC (permalink / raw)
  To: Tomas Bortoli
  Cc: kbuild-all, vyasevich, nhorman, marcelo.leitner, davem,
	linux-sctp, netdev, linux-kernel, syzkaller, Tomas Bortoli
In-Reply-To: <20181027195853.30243-1-tomasbortoli@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4456 bytes --]

Hi Tomas,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]
[also build test WARNING on v4.19 next-20181019]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Tomas-Bortoli/sctp-socket-c-validate-sprstat_policy/20181028-040051
config: i386-randconfig-x075-201843 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from arch/x86/include/asm/atomic.h:5:0,
                    from include/linux/atomic.h:7,
                    from include/linux/crypto.h:20,
                    from include/crypto/hash.h:16,
                    from net/sctp/socket.c:55:
   net/sctp/socket.c: In function 'sctp_getsockopt_pr_assocstatus':
   net/sctp/socket.c:7086:25: error: called object is not a function or function pointer
     if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
                    ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> net/sctp/socket.c:7086:2: note: in expansion of macro 'if'
     if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
     ^~
   net/sctp/socket.c:7086:25: error: called object is not a function or function pointer
     if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
                    ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:42: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                             ^~~~
>> net/sctp/socket.c:7086:2: note: in expansion of macro 'if'
     if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
     ^~
   net/sctp/socket.c:7086:25: error: called object is not a function or function pointer
     if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
                    ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:69:16: note: in definition of macro '__trace_if'
      ______r = !!(cond);     \
                   ^~~~
>> net/sctp/socket.c:7086:2: note: in expansion of macro 'if'
     if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
     ^~

vim +/if +7086 net/sctp/socket.c

  7066	
  7067	static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
  7068						  char __user *optval,
  7069						  int __user *optlen)
  7070	{
  7071		struct sctp_prstatus params;
  7072		struct sctp_association *asoc;
  7073		int policy;
  7074		int retval = -EINVAL;
  7075	
  7076		if (len < sizeof(params))
  7077			goto out;
  7078	
  7079		len = sizeof(params);
  7080		if (copy_from_user(&params, optval, len)) {
  7081			retval = -EFAULT;
  7082			goto out;
  7083		}
  7084	
  7085		policy = params.sprstat_policy;
> 7086		if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL))
  7087		    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX))
  7088			goto out;
  7089	
  7090		asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
  7091		if (!asoc)
  7092			goto out;
  7093	
  7094		if (policy & SCTP_PR_SCTP_ALL) {
  7095			params.sprstat_abandoned_unsent = 0;
  7096			params.sprstat_abandoned_sent = 0;
  7097			for (policy = 0; policy <= SCTP_PR_INDEX(MAX); policy++) {
  7098				params.sprstat_abandoned_unsent +=
  7099					asoc->abandoned_unsent[policy];
  7100				params.sprstat_abandoned_sent +=
  7101					asoc->abandoned_sent[policy];
  7102			}
  7103		} else {
  7104			params.sprstat_abandoned_unsent =
  7105				asoc->abandoned_unsent[__SCTP_PR_INDEX(policy)];
  7106			params.sprstat_abandoned_sent =
  7107				asoc->abandoned_sent[__SCTP_PR_INDEX(policy)];
  7108		}
  7109	
  7110		if (put_user(len, optlen)) {
  7111			retval = -EFAULT;
  7112			goto out;
  7113		}
  7114	
  7115		if (copy_to_user(optval, &params, len)) {
  7116			retval = -EFAULT;
  7117			goto out;
  7118		}
  7119	
  7120		retval = 0;
  7121	
  7122	out:
  7123		return retval;
  7124	}
  7125	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33915 bytes --]


* [PATCH v2] sctp: socket.c validate sprstat_policy
From: Tomas Bortoli @ 2018-10-27 20:53 UTC (permalink / raw)
  To: vyasevich, nhorman, marcelo.leitner
  Cc: davem, linux-sctp, netdev, linux-kernel, Tomas Bortoli

It is possible to perform out-of-bound reads on
sctp_getsockopt_pr_streamstatus() and on
sctp_getsockopt_pr_assocstatus() by passing from userspace a
sprstat_policy that overflows the abandoned_sent/abandoned_unsent
fixed length arrays. The over-read data are directly copied/leaked
to userspace.

Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com
---
 net/sctp/socket.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fc0386e8ff23..14dce5d95817 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7083,7 +7083,9 @@ static int sctp_getsockopt_pr_assocstatus(struct sock *sk, int len,
 	}
 
 	policy = params.sprstat_policy;
-	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX) ||
+	    __SCTP_PR_INDEX(policy) < 0)
 		goto out;
 
 	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
@@ -7142,7 +7144,9 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
 	}
 
 	policy = params.sprstat_policy;
-	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)))
+	if (!policy || (policy & ~(SCTP_PR_SCTP_MASK | SCTP_PR_SCTP_ALL)) ||
+	    __SCTP_PR_INDEX(policy) > SCTP_PR_INDEX(MAX) ||
+	    __SCTP_PR_INDEX(policy) < 0)
 		goto out;
 
 	asoc = sctp_id2assoc(sk, params.sprstat_assoc_id);
-- 
2.11.0


* [PATCH] bonding: fix length of actor system
From: Tobias Jungel @ 2018-10-27 13:31 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek; +Cc: Eric Dumazet, netdev

The attribute IFLA_BOND_AD_ACTOR_SYSTEM is sent to user space with a
length of sizeof(bond->params.ad_actor_system), which is 8 bytes. This
patch sets the length to ETH_ALEN so that the same MAC address is
exposed as through sysfs.

fixes f87fda00b6ed2

Signed-off-by: Tobias Jungel <tobias.jungel@gmail.com>
---
 drivers/net/bonding/bond_netlink.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index 9697977b80f0..6b9ad8673218 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -638,8 +638,7 @@ static int bond_fill_info(struct sk_buff *skb,
 				goto nla_put_failure;
 
 			if (nla_put(skb, IFLA_BOND_AD_ACTOR_SYSTEM,
-				    sizeof(bond->params.ad_actor_system),
-				    &bond->params.ad_actor_system))
+				    ETH_ALEN, &bond->params.ad_actor_system))
 				goto nla_put_failure;
 		}
 		if (!bond_3ad_get_active_agg_info(bond, &info)) {
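
The difference between the storage size and the address length can be
shown with a small standalone sketch. It assumes the field is stored in an
8-byte array, as the description says; struct ex_params and
ex_export_actor_system() are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <string.h>

#define EX_ETH_ALEN 6	/* octets in an Ethernet MAC address */

/* Hypothetical stand-in: the address sits in padded 8-byte storage. */
struct ex_params {
	unsigned char actor_system[EX_ETH_ALEN + 2];
};

/*
 * Copy only the MAC address into out.
 * Returns the number of bytes written, or 0 if out is too small.
 */
static size_t ex_export_actor_system(const struct ex_params *p,
				     unsigned char *out, size_t out_len)
{
	if (out_len < EX_ETH_ALEN)
		return 0;

	/* Use the address length, not sizeof(p->actor_system). */
	memcpy(out, p->actor_system, EX_ETH_ALEN);
	return EX_ETH_ALEN;
}
```

Exporting sizeof() bytes instead would hand the two padding bytes to the
reader along with the address.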


* Re: checksumming on non-local forward path
From: Andrew Lunn @ 2018-10-27 14:26 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Netdev
In-Reply-To: <CAHmME9rMhHwyiw0t+0oGS6XwPkmrbG_8TPmtWdS3aW9AFByphg@mail.gmail.com>

> What would you think of a flag on the receiving end like,
> "CHECKSUM_INVALID_BUT_UNNECESSARY"? It would be treated as
> CHECKSUM_UNNECESSARY in the case that the packet is locally
> received. But if the packet is going to be forwarded instead, then
> skb_checksum_help is called on it before forwarding onward.
> 
> AFAICT, wireguard isn't the only thing that could benefit from this:
> virtio is another case where it's not always necessary for the sender
> to call skb_checksum_help, when the receiver could just do it
> conditionally based on whether it's being forwarded.

Hi Jason

It is the sort of thing which breaks in hard to find ways. I've run
network simulations with machine instances running in containers. It
used veth pairs to connect the instances to a central 'switching'
namespace which did the interconnect between the instances, using lots
of bridges. After a while, my simulation got bigger than a single host
could support. So I split it over multiple servers, using GRE tunnels
between the bridges. It took me a while to notice the network was
actually in two segments, because frames going over GRE were getting
tossed with checksum issues. It was not the GRE tunnel at fault. It
took a while to trace it back to where the checksumming was turned
off, a TAP interface I think, but I don't remember.

How do you reliably decide if a frame needs checksums, when you cannot
peer down the pipe of bridges, veth, GRE tunnels and TAP interfaces
the frame is about to take?

	   Andrew
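
To make the proposal quoted above concrete, the decision it describes can
be sketched as a small predicate. Everything here is hypothetical: the
ex_csum_state values and ex_needs_csum_fixup() are invented for this
illustration and are not existing kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical states for this sketch; only the idea comes from the thread. */
enum ex_csum_state {
	EX_CSUM_NONE,			/* checksum must be verified in software */
	EX_CSUM_UNNECESSARY,		/* checksum already known to be good */
	EX_CSUM_INVALID_BUT_UNNECESSARY	/* checksum not filled in; fine locally */
};

/* True if the checksum has to be computed before the packet moves on. */
static bool ex_needs_csum_fixup(enum ex_csum_state state, bool forwarded)
{
	return forwarded && state == EX_CSUM_INVALID_BUT_UNNECESSARY;
}
```

The predicate only knows whether this hop forwards the packet. It cannot
see the bridges, veth pairs, tunnels and TAP interfaces further down the
path, which is the concern raised in the reply.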


* Re: [PATCH] sctp: socket.c validate sprstat_policy
From: David Miller @ 2018-10-28  0:03 UTC (permalink / raw)
  To: tomasbortoli
  Cc: vyasevich, nhorman, marcelo.leitner, linux-sctp, netdev,
	linux-kernel
In-Reply-To: <cb41ca17-4bad-bd21-6938-aee960a8ba9b@gmail.com>

From: Tomas Bortoli <tomasbortoli@gmail.com>
Date: Sat, 27 Oct 2018 22:43:43 +0200

> I just realized we also have to check for less than 0 indexes..

How about the fact that your original submission didn't even compile?

I hope you realized that first.


* [iproute PATCH] utils.h: provide fallback CLOCK_TAI definition
From: Peter Korsgaard @ 2018-10-27 15:31 UTC (permalink / raw)
  To: Stephen Hemminger, Vinicius Costa Gomes; +Cc: netdev, Peter Korsgaard

q_{etf,taprio}.c use CLOCK_TAI, which isn't exposed by glibc < 2.21 or
uClibc, breaking the build. Provide a fallback definition, as is done
for IPPROTO_MPLS and others.

Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
---
 include/utils.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/utils.h b/include/utils.h
index 258d630e..685d2c1d 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -126,6 +126,10 @@ struct ipx_addr {
 #define IPPROTO_MPLS	137
 #endif
 
+#ifndef CLOCK_TAI
+# define CLOCK_TAI 11
+#endif
+
 __u32 get_addr32(const char *name);
 int get_addr_1(inet_prefix *dst, const char *arg, int family);
 int get_prefix_1(inet_prefix *dst, char *arg, int family);
-- 
2.11.0
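
As a usage illustration, the same fallback pattern can be exercised in a
standalone file. This is a sketch under the assumption that the running
kernel understands clock id 11 even when the C library headers do not name
it; ex_read_tai() is a hypothetical helper, not part of iproute2.

```c
#include <time.h>

/* Fallback for C libraries whose headers do not define CLOCK_TAI. */
#ifndef CLOCK_TAI
# define CLOCK_TAI 11
#endif

/* Read the TAI clock; returns 0 on success, -1 with errno set otherwise. */
static int ex_read_tai(struct timespec *ts)
{
	return clock_gettime(CLOCK_TAI, ts);
}
```

If the kernel does not support the clock, the call fails at run time
rather than at build time, so callers still need to check the return value.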


* RE: [LKP] [tcp] a337531b94: netperf.Throughput_Mbps -6.1% regression
From: Wang, Kemi @ 2018-10-28  1:43 UTC (permalink / raw)
  To: Eric Dumazet, Chen, Rong A, Yuchung Cheng
  Cc: Soheil Hassas Yeganeh, netdev@vger.kernel.org, LKML, Eric Dumazet,
	lkp@01.org, Wei Wang, Neal Cardwell, David S. Miller
In-Reply-To: <e22a09c4-8bbb-e482-6e1e-59ea1111eda3@gmail.com>

Hi, Eric
   Thanks for the info.
   We reran the test and verified that this issue has been fixed by commit 041a14d2671573611ffd6412bc16e2f64469f7fb.
   Only about a 0.1% performance difference was observed.
 


-----Original Message-----
From: LKP [mailto:lkp-bounces@lists.01.org] On Behalf Of Eric Dumazet
Sent: Wednesday, October 24, 2018 9:27 PM
To: Chen, Rong A <rong.a.chen@intel.com>; Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>; netdev@vger.kernel.org; LKML <linux-kernel@vger.kernel.org>; Eric Dumazet <edumazet@google.com>; lkp@01.org; Wei Wang <weiwan@google.com>; Neal Cardwell <ncardwell@google.com>; David S. Miller <davem@davemloft.net>
Subject: Re: [LKP] [tcp] a337531b94: netperf.Throughput_Mbps -6.1% regression

Hi Rong

This has been reported already, and we believe this has been fixed with :

commit 041a14d2671573611ffd6412bc16e2f64469f7fb
Author: Yuchung Cheng <ycheng@google.com>
Date:   Mon Oct 1 15:42:32 2018 -0700

    tcp: start receiver buffer autotuning sooner
    
    Previously receiver buffer auto-tuning starts after receiving
    one advertised window amount of data. After the initial receiver
    buffer was raised by patch a337531b942b ("tcp: up initial rmem to
    128KB and SYN rwin to around 64KB"), the receiver buffer may take
    too long to start raising. To address this issue, this patch lowers
    the initial bytes expected to receive roughly the expected sender's
    initial window.
    
    Fixes: a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: Wei Wang <weiwan@google.com>
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


Thanks

On 10/24/2018 05:13 AM, kernel test robot wrote:
> Greeting,
> 
> FYI, we noticed a -6.1% regression of netperf.Throughput_Mbps due to commit:
> 
> 
> commit: a337531b942bd8a03e7052444d7e36972aac2d92 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git master
> 
> in testcase: netperf
> on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
> with following parameters:
> 
> 	ip: ipv4
> 	runtime: 900s
> 	nr_threads: 200%
> 	cluster: cs-localhost
> 	test: TCP_STREAM
> 	ucode: 0x7000013
> 	cpufreq_governor: performance
> 
> test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
> test-url: http://www.netperf.org/netperf/
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+-------------------------------------------------------------------+
> | testcase: change | netperf: netperf.Throughput_Mbps -1.0% regression                 |
> | test machine     | 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory   |
> | test parameters  | cluster=cs-localhost                                              |
> |                  | cpufreq_governor=performance                                      |
> |                  | ip=ipv4                                                           |
> |                  | nr_threads=200%                                                   |
> |                  | runtime=300s                                                      |
> |                  | send_size=5K                                                      |
> |                  | test=TCP_SENDFILE                                                 |
> |                  | ucode=0x7000013                                                   |
> +------------------+-------------------------------------------------------------------+
> | testcase: change | netperf: netperf.Throughput_Mbps -5.9% regression                 |
> | test machine     | 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory   |
> | test parameters  | cluster=cs-localhost                                              |
> |                  | cpufreq_governor=performance                                      |
> |                  | ip=ipv4                                                           |
> |                  | nr_threads=200%                                                   |
> |                  | runtime=900s                                                      |
> |                  | test=TCP_MAERTS                                                   |
> |                  | ucode=0x7000013                                                   |
> +------------------+-------------------------------------------------------------------+
> | testcase: change | netperf: netperf.Throughput_Mbps -3.2% regression                 |
> | test machine     | 4 threads Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz with 4G memory |
> | test parameters  | cluster=cs-localhost                                              |
> |                  | cpufreq_governor=performance                                      |
> |                  | ip=ipv4                                                           |
> |                  | nr_threads=200%                                                   |
> |                  | runtime=900s                                                      |
> |                  | test=TCP_MAERTS                                                   |
> |                  | ucode=0x20                                                        |
> +------------------+-------------------------------------------------------------------+
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
>   cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-bdw-de1/TCP_STREAM/netperf/0x7000013
> 
> commit: 
>   3ff6cde846 ("hns3: Another build fix.")
>   a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
> 
> 3ff6cde846857d45 a337531b942bd8a03e7052444d 
> ---------------- -------------------------- 
>        fail:runs  %reproduction    fail:runs
>            |             |             |    
>            :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
>          %stddev     %change         %stddev
>              \          |                \  
>       2497            -6.1%       2345        netperf.Throughput_Mbps
>      79924            -6.1%      75061        netperf.Throughput_total_Mbps
>     186513           +11.3%     207590        netperf.time.involuntary_context_switches
>  5.488e+08            -6.1%  5.154e+08        netperf.workload
>       1172 ± 34%     -37.6%     731.75 ±  5%  cpuidle.C1E.usage
>       1137 ± 34%     -40.0%     682.25 ±  8%  turbostat.C1E
>       2775 ± 11%     +17.5%       3261 ±  9%  sched_debug.cpu.nr_switches.stddev
>       0.01 ± 17%     +28.2%       0.01 ± 10%  sched_debug.rt_rq:/.rt_time.avg
>       0.14 ± 17%     +28.2%       0.18 ± 10%  sched_debug.rt_rq:/.rt_time.max
>       0.03 ± 17%     +28.2%       0.04 ± 10%  sched_debug.rt_rq:/.rt_time.stddev
>      66336            +0.9%      66948        proc-vmstat.nr_anon_pages
>  2.755e+08            -6.1%  2.588e+08        proc-vmstat.numa_hit
>  2.755e+08            -6.1%  2.588e+08        proc-vmstat.numa_local
>  2.197e+09            -6.1%  2.064e+09        proc-vmstat.pgalloc_normal
>  2.197e+09            -6.1%  2.064e+09        proc-vmstat.pgfree
>  5.903e+11            -7.9%  5.438e+11        perf-stat.branch-instructions
>       2.68            -0.0        2.64        perf-stat.branch-miss-rate%
>  1.582e+10            -9.2%  1.436e+10        perf-stat.branch-misses
>   6.26e+11            -4.7%  5.964e+11        perf-stat.cache-misses
>   6.26e+11            -4.7%  5.964e+11        perf-stat.cache-references
>      11.69            +8.6%      12.69        perf-stat.cpi
>     123723            +2.1%     126291        perf-stat.cpu-migrations
>       0.09 ±  2%      +0.0        0.09        perf-stat.dTLB-load-miss-rate%
>  1.475e+12            -7.1%   1.37e+12        perf-stat.dTLB-loads
>  1.094e+12            -6.9%  1.018e+12        perf-stat.dTLB-stores
>  2.912e+08 ±  5%     -13.0%  2.533e+08        perf-stat.iTLB-loads
>  3.019e+12            -7.9%  2.781e+12        perf-stat.instructions
>       0.09            -7.9%       0.08        perf-stat.ipc
>       5500            -1.9%       5394        perf-stat.path-length
>       0.53 ±  2%      -0.2        0.38 ± 57%  perf-profile.calltrace.cycles-pp.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
>       0.63 ±  2%      -0.1        0.58 ±  4%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
>       0.73 ±  3%      +0.1        0.78 ±  2%  perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
>       0.96            +0.1        1.03        perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
>      98.02            +0.1       98.13        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
>      97.88            +0.1       98.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.70 ±  3%      -0.1        0.64 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>       0.26 ±  5%      -0.0        0.21 ±  6%  perf-profile.children.cycles-pp._raw_spin_lock_bh
>       0.28 ±  5%      -0.0        0.24 ±  6%  perf-profile.children.cycles-pp.lock_sock_nested
>       0.46 ±  4%      -0.0        0.43 ±  2%  perf-profile.children.cycles-pp.nf_hook_slow
>       0.21 ±  8%      -0.0        0.18 ±  5%  perf-profile.children.cycles-pp.tcp_rcv_space_adjust
>       0.08 ±  5%      -0.0        0.06        perf-profile.children.cycles-pp.entry_SYSCALL_64_stage2
>       0.08 ±  6%      -0.0        0.06 ±  6%  perf-profile.children.cycles-pp.ip_finish_output
>       0.17 ±  6%      +0.0        0.20 ±  5%  perf-profile.children.cycles-pp.tcp_event_new_data_sent
>       0.24 ±  4%      +0.0        0.27 ±  2%  perf-profile.children.cycles-pp.mod_timer
>       0.15 ±  2%      +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.__might_sleep
>       0.80 ±  3%      +0.0        0.84 ±  2%  perf-profile.children.cycles-pp.tcp_clean_rtx_queue
>       0.30 ±  3%      +0.1        0.36 ±  4%  perf-profile.children.cycles-pp.__might_fault
>       1.61 ±  4%      +0.1        1.69        perf-profile.children.cycles-pp.__release_sock
>       1.06 ±  2%      +0.1        1.14        perf-profile.children.cycles-pp.tcp_ack
>      98.24            +0.1       98.36        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>      98.09            +0.1       98.23        perf-profile.children.cycles-pp.do_syscall_64
>      70.28            +0.6       70.86        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>       1.56            -0.1        1.48 ±  3%  perf-profile.self.cycles-pp.copy_page_to_iter
>       0.70 ±  3%      -0.1        0.64 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>       1.37 ±  2%      -0.1        1.32 ±  2%  perf-profile.self.cycles-pp.__free_pages_ok
>       0.55 ±  3%      -0.0        0.50 ±  3%  perf-profile.self.cycles-pp.__alloc_skb
>       0.44 ±  3%      -0.0        0.40 ±  5%  perf-profile.self.cycles-pp.tcp_recvmsg
>       0.16 ±  9%      -0.0        0.14 ±  5%  perf-profile.self.cycles-pp.sock_has_perm
>       0.08 ±  6%      -0.0        0.06        perf-profile.self.cycles-pp.entry_SYSCALL_64_stage2
>       0.10 ±  4%      +0.0        0.12 ±  6%  perf-profile.self.cycles-pp.tcp_clean_rtx_queue
>       0.14 ±  6%      +0.0        0.17 ±  4%  perf-profile.self.cycles-pp.__might_sleep
>      69.25            +0.5       69.77        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
> 
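For readers unfamiliar with the comparison table above: the middle column appears to be the relative change of the second commit's mean against the first commit's mean, while rows whose change has no percent sign (the perf-profile rows and the `...rate%` rows) appear to be plain differences in percentage points. The following is a minimal sketch under that assumption; `pct_change` and `point_change` are hypothetical helper names, not part of lkp-tests, and the inputs are the rounded values printed in the table, so a recomputed figure can differ from the report in the last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base_pct: float, new_pct: float) -> float:
    """Absolute difference between two values that are already percentages."""
    return new_pct - base_pct


# Values copied from the TCP_STREAM table above (base commit, tested commit).
rows = {
    "netperf.Throughput_Mbps": (2497, 2345),
    "netperf.time.involuntary_context_switches": (186513, 207590),
    "perf-stat.cpi": (11.69, 12.69),
}

for name, (base, new) in rows.items():
    print(f"{pct_change(base, new):+6.1f}%  {name}")

# Rows reported in percentage points rather than as a relative change.
print(f"{point_change(70.28, 70.86):+6.1f}   "
      "perf-profile.children.cycles-pp.copy_user_enhanced_fast_string")
```

Under this reading, the 6.1% throughput drop and the 8.6% rise in cycles per instruction follow directly from the two mean columns.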
> 
>                                                                                 
>                               netperf.Throughput_Mbps                           
>                                                                                 
>   3000 +-+------------------------------------------------------------------+   
>        |                                                                    |   
>   2500 +-+..+.+..+.+..+.+..+.+..+.+..+.+..+.+.+..+.+..+.+..+.+..+.+..+.+..+.|   
>        O O  O O  O O  O O  O O  O O  O O  O O O  O O  O O  O O  O O         |   
>        | :                                                                  |   
>   2000 +-+                                                                  |   
>        |:                                                                   |   
>   1500 +-+                                                                  |   
>        |:                                                                   |   
>   1000 +-+                                                                  |   
>        |:                                                                   |   
>        |:                                                                   |   
>    500 +-+                                                                  |   
>        |                                                                    |   
>      0 +-+------------------------------------------------------------------+   
>                                                                                 
> 
>                             netperf.Throughput_total_Mbps                       
>                                                                                 
>   90000 +-+-----------------------------------------------------------------+   
>         |                                                                   |   
>   80000 O-O..O.O..O.O..O.O.O..O.O..O.O..O.O.O..O.O..O.O..O.O.O..O.O..+.+..+.|   
>   70000 +-+                                                                 |   
>         | :                                                                 |   
>   60000 +-+                                                                 |   
>   50000 +-+                                                                 |   
>         |:                                                                  |   
>   40000 +-+                                                                 |   
>   30000 +-+                                                                 |   
>         |:                                                                  |   
>   20000 +-+                                                                 |   
>   10000 +-+                                                                 |   
>         |                                                                   |   
>       0 +-+-----------------------------------------------------------------+   
>                                                                                 
> 
>                                   netperf.workload                              
>                                                                                 
>   6e+08 +-+-----------------------------------------------------------------+   
>         | +..+.+..+.+..+.+.+..+.+..+.+..+.+.+..+.+..+.+..+.+.+..+.+..+.+..+.|   
>   5e+08 O-O  O O  O O  O O O  O O  O O  O O O  O O  O O  O O O  O O         |   
>         | :                                                                 |   
>         | :                                                                 |   
>   4e+08 +-+                                                                 |   
>         |:                                                                  |   
>   3e+08 +-+                                                                 |   
>         |:                                                                  |   
>   2e+08 +-+                                                                 |   
>         |:                                                                  |   
>         |                                                                   |   
>   1e+08 +-+                                                                 |   
>         |                                                                   |   
>       0 +-+-----------------------------------------------------------------+   
>                                                                                 
>                                                                                 
> [*] bisect-good sample
> [O] bisect-bad  sample
> 
> ***************************************************************************************************
> lkp-bdw-de1: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/send_size/tbox_group/test/testcase/ucode:
>   cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/300s/5K/lkp-bdw-de1/TCP_SENDFILE/netperf/0x7000013
> 
> commit: 
>   3ff6cde846 ("hns3: Another build fix.")
>   a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
> 
> 3ff6cde846857d45 a337531b942bd8a03e7052444d 
> ---------------- -------------------------- 
>        fail:runs  %reproduction    fail:runs
>            |             |             |    
>           1:4          -25%            :4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
>          %stddev     %change         %stddev
>              \          |                \  
>       5211            -1.0%       5160        netperf.Throughput_Mbps
>     166777            -1.0%     165138        netperf.Throughput_total_Mbps
>       1268            -1.6%       1247        netperf.time.percent_of_cpu_this_job_got
>       3539            -1.6%       3481        netperf.time.system_time
>     282.77            -1.5%     278.54        netperf.time.user_time
>    1435875            -1.0%    1421780        netperf.time.voluntary_context_switches
>  1.222e+09            -1.0%   1.21e+09        netperf.workload
>      22728            -1.3%      22437        vmstat.system.cs
>    1218263 ±  3%      -5.6%    1150027 ±  4%  proc-vmstat.pgalloc_normal
>    1197588 ±  4%      -6.0%    1125684 ±  4%  proc-vmstat.pgfree
>       3424 ± 17%     -28.2%       2456 ± 21%  sched_debug.cpu.nr_load_updates.stddev
>       9.00 ± 11%     -19.9%       7.21 ± 11%  sched_debug.cpu.nr_uninterruptible.max
>   35344728 ± 33%     -94.5%    1954598 ±144%  cpuidle.C3.time
>      79217 ± 32%     -95.5%       3571 ±115%  cpuidle.C3.usage
>   13342584 ± 19%    +253.4%   47153200 ± 34%  cpuidle.C6.time
>      17886 ± 21%    +185.8%      51115 ± 34%  cpuidle.C6.usage
>       4295 ± 24%    +108.0%       8934 ± 53%  cpuidle.POLL.time
>      79180 ± 32%     -95.6%       3487 ±118%  turbostat.C3
>       0.73 ± 32%      -0.7        0.04 ±144%  turbostat.C3%
>      17693 ± 21%    +187.9%      50931 ± 34%  turbostat.C6
>       0.27 ± 19%      +0.7        0.97 ± 34%  turbostat.C6%
>       0.35 ± 30%     -89.9%       0.04 ±173%  turbostat.CPU%c3
>       0.08 ±  6%    +693.3%       0.59 ± 38%  turbostat.CPU%c6
>       2.95            +3.1%       3.04        turbostat.RAMWatt
>  1.711e+12            -1.3%  1.689e+12        perf-stat.branch-instructions
>  5.345e+10            -1.2%  5.283e+10        perf-stat.branch-misses
>  9.417e+10           +16.7%  1.099e+11        perf-stat.cache-misses
>  9.417e+10           +16.7%  1.099e+11        perf-stat.cache-references
>    6927335            -1.1%    6849494        perf-stat.context-switches
>  2.936e+12            -1.3%  2.899e+12        perf-stat.dTLB-loads
>  1.796e+12            -1.3%  1.773e+12        perf-stat.dTLB-stores
>      80.43            +3.5       83.95        perf-stat.iTLB-load-miss-rate%
>  3.809e+09 ±  4%      -4.7%  3.629e+09 ±  2%  perf-stat.iTLB-load-misses
>  9.248e+08 ±  3%     -25.0%  6.934e+08        perf-stat.iTLB-loads
>  8.835e+12            -1.3%  8.719e+12        perf-stat.instructions
>      69.17            -1.1       68.08        perf-profile.calltrace.cycles-pp.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      65.80            -1.0       64.79        perf-profile.calltrace.cycles-pp.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      55.88            -0.8       55.04        perf-profile.calltrace.cycles-pp.do_splice_direct.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      52.32            -0.8       51.56        perf-profile.calltrace.cycles-pp.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64.do_syscall_64
>      35.71            -0.6       35.11        perf-profile.calltrace.cycles-pp.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile.__x64_sys_sendfile64
>      34.84            -0.6       34.26        perf-profile.calltrace.cycles-pp.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct.do_sendfile
>      33.94            -0.5       33.41        perf-profile.calltrace.cycles-pp.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor.do_splice_direct
>      26.16            -0.5       25.70        perf-profile.calltrace.cycles-pp.tcp_sendpage.inet_sendpage.kernel_sendpage.sock_sendpage.pipe_to_sendpage
>      30.02            -0.5       29.55        perf-profile.calltrace.cycles-pp.pipe_to_sendpage.__splice_from_pipe.splice_from_pipe.direct_splice_actor.splice_direct_to_actor
>      28.77            -0.4       28.34        perf-profile.calltrace.cycles-pp.sock_sendpage.pipe_to_sendpage.__splice_from_pipe.splice_from_pipe.direct_splice_actor
>      27.68            -0.4       27.27        perf-profile.calltrace.cycles-pp.inet_sendpage.kernel_sendpage.sock_sendpage.pipe_to_sendpage.__splice_from_pipe
>      27.98            -0.4       27.58        perf-profile.calltrace.cycles-pp.kernel_sendpage.sock_sendpage.pipe_to_sendpage.__splice_from_pipe.splice_from_pipe
>      20.30            -0.3       19.95        perf-profile.calltrace.cycles-pp.tcp_sendpage_locked.tcp_sendpage.inet_sendpage.kernel_sendpage.sock_sendpage
>      19.49            -0.3       19.16        perf-profile.calltrace.cycles-pp.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage.inet_sendpage.kernel_sendpage
>       9.78            -0.2        9.53        perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage
>       9.94            -0.2        9.70        perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked.tcp_sendpage.inet_sendpage
>       6.32            -0.2        6.09        perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages.tcp_sendpage_locked
>       5.59            -0.2        5.42        perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.do_tcp_sendpages
>       5.19            -0.2        5.02        perf-profile.calltrace.cycles-pp.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
>       4.79            -0.2        4.62        perf-profile.calltrace.cycles-pp.ip_rcv.__netif_receive_skb_one_core.process_backlog.net_rx_action.__softirqentry_text_start
>       5.51            -0.2        5.35        perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2
>       5.00            -0.2        4.84        perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.net_rx_action.__softirqentry_text_start.do_softirq_own_stack
>       5.52            -0.2        5.36        perf-profile.calltrace.cycles-pp.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output
>       5.37            -0.2        5.21        perf-profile.calltrace.cycles-pp.net_rx_action.__softirqentry_text_start.do_softirq_own_stack.do_softirq.__local_bh_enable_ip
>       4.68            -0.2        4.53        perf-profile.calltrace.cycles-pp.security_file_permission.do_sendfile.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       5.61            -0.2        5.46        perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output.__ip_queue_xmit
>       5.21            -0.2        5.06        perf-profile.calltrace.cycles-pp.process_backlog.net_rx_action.__softirqentry_text_start.do_softirq_own_stack.do_softirq
>       4.58            -0.2        4.42        perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
>       5.66            -0.2        5.50        perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.ip_output.__ip_queue_xmit.__tcp_transmit_skb
>       4.39            -0.2        4.24        perf-profile.calltrace.cycles-pp.__entry_SYSCALL_64_trampoline
>       2.87 ±  2%      -0.1        2.76        perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.do_sendfile.__x64_sys_sendfile64.do_syscall_64
>       1.25 ±  3%      -0.1        1.15        perf-profile.calltrace.cycles-pp.__inode_security_revalidate.selinux_file_permission.security_file_permission.do_sendfile.__x64_sys_sendfile64
>       4.30            -0.1        4.20        perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.ip_local_deliver.ip_rcv.__netif_receive_skb_one_core.process_backlog
>       1.86            -0.1        1.77 ±  3%  perf-profile.calltrace.cycles-pp.release_sock.tcp_sendpage.inet_sendpage.kernel_sendpage.sock_sendpage
>       1.14            -0.1        1.08 ±  2%  perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.do_splice_direct.do_sendfile.__x64_sys_sendfile64
>       0.69            -0.1        0.63        perf-profile.calltrace.cycles-pp.tcp_release_cb.release_sock.tcp_sendpage.inet_sendpage.kernel_sendpage
>       0.61 ±  2%      -0.1        0.56 ±  2%  perf-profile.calltrace.cycles-pp.__might_fault.__x64_sys_sendfile64.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.61 ±  2%      -0.0        0.57 ±  4%  perf-profile.calltrace.cycles-pp.avc_has_perm.file_has_perm.security_file_permission.do_splice_direct.do_sendfile
>       0.57 ±  2%      +0.0        0.61 ±  2%  perf-profile.calltrace.cycles-pp.___might_sleep.__might_fault.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg
>      90.63            +0.2       90.83        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      91.39            +0.2       91.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
>      20.12            +1.3       21.46        perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      20.10            +1.3       21.44        perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      19.84            +1.4       21.24        perf-profile.calltrace.cycles-pp.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
>      19.89            +1.4       21.30        perf-profile.calltrace.cycles-pp.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      15.07            +1.6       16.65        perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
>      14.25            +1.6       15.82        perf-profile.calltrace.cycles-pp.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
>      11.15            +1.6       12.74        perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg.inet_recvmsg
>      10.84            +1.6       12.45        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.skb_copy_datagram_iter.tcp_recvmsg
>      69.33            -1.1       68.23        perf-profile.children.cycles-pp.__x64_sys_sendfile64
>      65.94            -1.0       64.92        perf-profile.children.cycles-pp.do_sendfile
>      55.98            -0.8       55.14        perf-profile.children.cycles-pp.do_splice_direct
>      52.38            -0.8       51.60        perf-profile.children.cycles-pp.splice_direct_to_actor
>      35.77            -0.6       35.16        perf-profile.children.cycles-pp.direct_splice_actor
>      34.91            -0.6       34.33        perf-profile.children.cycles-pp.splice_from_pipe
>      34.07            -0.5       33.53        perf-profile.children.cycles-pp.__splice_from_pipe
>      30.09            -0.5       29.62        perf-profile.children.cycles-pp.pipe_to_sendpage
>      26.31            -0.5       25.86        perf-profile.children.cycles-pp.tcp_sendpage
>      28.85            -0.4       28.42        perf-profile.children.cycles-pp.sock_sendpage
>      27.75            -0.4       27.33        perf-profile.children.cycles-pp.inet_sendpage
>      28.05            -0.4       27.65        perf-profile.children.cycles-pp.kernel_sendpage
>      20.38            -0.3       20.03        perf-profile.children.cycles-pp.tcp_sendpage_locked
>      19.62            -0.3       19.29        perf-profile.children.cycles-pp.do_tcp_sendpages
>       9.69            -0.3        9.42        perf-profile.children.cycles-pp.security_file_permission
>       8.60            -0.2        8.38        perf-profile.children.cycles-pp.__tcp_transmit_skb
>      10.66            -0.2       10.43        perf-profile.children.cycles-pp.tcp_write_xmit
>      10.79            -0.2       10.56        perf-profile.children.cycles-pp.__tcp_push_pending_frames
>       7.82            -0.2        7.64        perf-profile.children.cycles-pp.__ip_queue_xmit
>       7.38            -0.2        7.20        perf-profile.children.cycles-pp.ip_output
>       6.36            -0.2        6.19        perf-profile.children.cycles-pp.__local_bh_enable_ip
>       5.95            -0.2        5.78        perf-profile.children.cycles-pp.__entry_SYSCALL_64_trampoline
>       4.86            -0.2        4.69        perf-profile.children.cycles-pp.ip_rcv
>       5.07            -0.2        4.91        perf-profile.children.cycles-pp.__netif_receive_skb_one_core
>       5.44            -0.2        5.29        perf-profile.children.cycles-pp.net_rx_action
>       5.58            -0.2        5.42        perf-profile.children.cycles-pp.do_softirq_own_stack
>       5.28            -0.2        5.13        perf-profile.children.cycles-pp.process_backlog
>       6.70            -0.2        6.55        perf-profile.children.cycles-pp.ip_finish_output2
>       5.67            -0.1        5.52        perf-profile.children.cycles-pp.do_softirq
>       2.76 ±  3%      -0.1        2.62        perf-profile.children.cycles-pp.__inode_security_revalidate
>       1.39 ±  4%      -0.1        1.27 ±  2%  perf-profile.children.cycles-pp._cond_resched
>       4.45            -0.1        4.34        perf-profile.children.cycles-pp.ip_local_deliver
>       0.73 ±  5%      -0.1        0.64 ±  3%  perf-profile.children.cycles-pp.rcu_all_qs
>       0.72            -0.1        0.65        perf-profile.children.cycles-pp.tcp_release_cb
>       0.30 ±  5%      -0.1        0.24 ±  3%  perf-profile.children.cycles-pp.tcp_rcv_space_adjust
>       0.43 ±  4%      -0.0        0.39 ±  5%  perf-profile.children.cycles-pp.copy_user_generic_unrolled
>       0.17 ±  7%      -0.0        0.12 ±  6%  perf-profile.children.cycles-pp.ip_rcv_finish_core
>       0.19 ±  7%      -0.0        0.15 ±  6%  perf-profile.children.cycles-pp.ip_rcv_finish
>       0.14 ±  5%      -0.0        0.11 ±  8%  perf-profile.children.cycles-pp.tcp_rearm_rto
>       0.10 ± 11%      -0.0        0.06 ±  6%  perf-profile.children.cycles-pp.sockfd_lookup_light
>       0.07 ±  5%      +0.0        0.09 ±  5%  perf-profile.children.cycles-pp.skb_entail
>       0.11 ±  3%      +0.0        0.13 ±  6%  perf-profile.children.cycles-pp.scheduler_tick
>       0.51 ±  3%      +0.0        0.55 ±  3%  perf-profile.children.cycles-pp.tcp_established_options
>      90.70            +0.2       90.90        perf-profile.children.cycles-pp.do_syscall_64
>      91.47            +0.2       91.70        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>      20.13            +1.3       21.47        perf-profile.children.cycles-pp.__x64_sys_recvfrom
>      20.10            +1.3       21.44        perf-profile.children.cycles-pp.__sys_recvfrom
>      19.89            +1.4       21.30        perf-profile.children.cycles-pp.inet_recvmsg
>      19.84            +1.4       21.26        perf-profile.children.cycles-pp.tcp_recvmsg
>      16.63            +1.6       18.19        perf-profile.children.cycles-pp.copy_page_to_iter
>      15.08            +1.6       16.66        perf-profile.children.cycles-pp.skb_copy_datagram_iter
>      11.24            +1.6       12.82        perf-profile.children.cycles-pp.copyout
>      11.24            +1.6       12.82        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>       5.68            -0.2        5.51        perf-profile.self.cycles-pp.__entry_SYSCALL_64_trampoline
>       0.67            -0.1        0.60 ±  2%  perf-profile.self.cycles-pp.tcp_release_cb
>       0.93 ±  2%      -0.1        0.86 ±  2%  perf-profile.self.cycles-pp.__inode_security_revalidate
>       1.09 ±  2%      -0.0        1.05 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
>       0.16 ±  9%      -0.0        0.12 ±  7%  perf-profile.self.cycles-pp.ip_rcv_finish_core
>       0.09 ± 11%      -0.0        0.05 ± 62%  perf-profile.self.cycles-pp.__tcp_ack_snd_check
>       0.40 ±  3%      -0.0        0.36 ±  7%  perf-profile.self.cycles-pp.copy_user_generic_unrolled
>       0.80            -0.0        0.77 ±  2%  perf-profile.self.cycles-pp.current_time
>       0.28 ±  2%      -0.0        0.25 ±  3%  perf-profile.self.cycles-pp.tcp_recvmsg
>       0.27 ±  6%      -0.0        0.24 ±  5%  perf-profile.self.cycles-pp.__alloc_skb
>       0.18 ±  6%      -0.0        0.15 ±  7%  perf-profile.self.cycles-pp.tcp_mstamp_refresh
>       0.10 ±  5%      -0.0        0.08 ±  5%  perf-profile.self.cycles-pp.__tcp_select_window
>       0.22 ±  3%      +0.0        0.24 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
>       0.46 ±  5%      +0.0        0.51 ±  4%  perf-profile.self.cycles-pp.tcp_established_options
>      11.14            +1.5       12.68        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
> 
> 
> 
> ***************************************************************************************************
> lkp-bdw-de1: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
>   cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-bdw-de1/TCP_MAERTS/netperf/0x7000013
> 
> commit: 
>   3ff6cde846 ("hns3: Another build fix.")
>   a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
> 
> 3ff6cde846857d45 a337531b942bd8a03e7052444d 
> ---------------- -------------------------- 
>        fail:runs  %reproduction    fail:runs
>            |             |             |    
>           1:4            2%           1:4     perf-profile.children.cycles-pp.schedule_timeout
>          %stddev     %change         %stddev
>              \          |                \  
>       2497            -5.9%       2349        netperf.Throughput_Mbps
>      79914            -5.9%      75172        netperf.Throughput_total_Mbps
>       2472            +4.7%       2588        netperf.time.maximum_resident_set_size
>       8998            +8.0%       9715        netperf.time.minor_page_faults
>      88.91           -13.7%      76.77        netperf.time.user_time
>  5.487e+08            -5.9%  5.162e+08        netperf.workload
>   50507215 ± 49%     -63.0%   18671277 ± 27%  cpuidle.C3.time
>     111760 ±  6%     +12.4%     125584 ±  3%  meminfo.DirectMap4k
>       0.35 ± 49%      -0.2        0.13 ± 29%  turbostat.C3%
>      42.19            -1.2%      41.70        turbostat.PkgWatt
>       1988            +9.6%       2180 ±  2%  sched_debug.cfs_rq:/.util_est_enqueued.max
>     401.62 ±  3%     +11.2%     446.64 ±  4%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
>       3.91 ± 12%     -18.4%       3.19 ± 14%  sched_debug.cpu.nr_uninterruptible.stddev
>     697.25 ±  4%     +48.3%       1034 ± 19%  slabinfo.dmaengine-unmap-16.active_objs
>     697.25 ±  4%     +48.3%       1034 ± 19%  slabinfo.dmaengine-unmap-16.num_objs
>       1464 ± 11%     -20.9%       1157 ±  9%  slabinfo.skbuff_head_cache.active_objs
>       1464 ± 11%     -20.9%       1157 ±  9%  slabinfo.skbuff_head_cache.num_objs
>      70462            +1.3%      71390        proc-vmstat.nr_active_anon
>      66190            +1.5%      67154        proc-vmstat.nr_anon_pages
>      70462            +1.3%      71390        proc-vmstat.nr_zone_active_anon
>  2.756e+08            -6.0%  2.592e+08        proc-vmstat.numa_hit
>  2.756e+08            -6.0%  2.592e+08        proc-vmstat.numa_local
>  2.197e+09            -6.0%  2.067e+09        proc-vmstat.pgalloc_normal
>  2.197e+09            -6.0%  2.066e+09        proc-vmstat.pgfree
>  5.831e+11            -7.8%  5.377e+11        perf-stat.branch-instructions
>  1.567e+10            -8.9%  1.428e+10        perf-stat.branch-misses
>  6.246e+11            -4.4%  5.974e+11        perf-stat.cache-misses
>  6.246e+11            -4.4%  5.974e+11        perf-stat.cache-references
>      11.79            +8.4%      12.78        perf-stat.cpi
>     122574            +2.4%     125502        perf-stat.cpu-migrations
>  1.473e+12            -7.0%  1.369e+12        perf-stat.dTLB-loads
>       0.07 ± 13%      +0.0        0.09 ±  6%  perf-stat.dTLB-store-miss-rate%
>   7.83e+08 ± 13%     +15.6%  9.049e+08 ±  6%  perf-stat.dTLB-store-misses
>  1.092e+12            -6.8%  1.017e+12        perf-stat.dTLB-stores
>  1.153e+09           -10.1%  1.037e+09        perf-stat.iTLB-load-misses
>   2.66e+08 ±  4%      -7.0%  2.474e+08        perf-stat.iTLB-loads
>  2.994e+12            -7.8%  2.761e+12        perf-stat.instructions
>       0.08            -7.8%       0.08        perf-stat.ipc
>       5456            -2.0%       5348        perf-stat.path-length
>       2.62            -0.1        2.49        perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
>       2.64            -0.1        2.51        perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
>       2.83            -0.1        2.73        perf-profile.calltrace.cycles-pp.__free_pages_ok.skb_release_data.__kfree_skb.tcp_recvmsg.inet_recvmsg
>       3.64            -0.1        3.54        perf-profile.calltrace.cycles-pp.__kfree_skb.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
>       3.27            -0.1        3.18        perf-profile.calltrace.cycles-pp.skb_release_data.__kfree_skb.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
>      98.03            +0.1       98.11        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
>      97.89            +0.1       97.96        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.44 ± 58%      +0.3        0.71 ±  5%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.copy_user_enhanced_fast_string.copyout.copy_page_to_iter
>       2.92 ±  6%      +0.4        3.29 ±  4%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.skb_copy_datagram_iter
>       0.00            +0.5        0.55 ±  6%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.copy_user_enhanced_fast_string.copyout
>       3.64            -0.1        3.52        perf-profile.children.cycles-pp.tcp_write_xmit
>       3.60            -0.1        3.48        perf-profile.children.cycles-pp.__tcp_push_pending_frames
>       2.84            -0.1        2.74        perf-profile.children.cycles-pp.__free_pages_ok
>       4.08            -0.1        4.00        perf-profile.children.cycles-pp.__kfree_skb
>       0.80 ±  2%      -0.1        0.74 ±  3%  perf-profile.children.cycles-pp.__entry_SYSCALL_64_trampoline
>       0.23 ±  4%      -0.0        0.20 ±  5%  perf-profile.children.cycles-pp.__sk_mem_schedule
>       0.22 ±  4%      -0.0        0.19 ±  5%  perf-profile.children.cycles-pp.__sk_mem_raise_allocated
>       0.06            -0.0        0.04 ± 57%  perf-profile.children.cycles-pp.tcp_release_cb
>       0.08 ±  6%      -0.0        0.06 ± 15%  perf-profile.children.cycles-pp.__tcp_select_window
>       0.23            +0.0        0.24 ±  2%  perf-profile.children.cycles-pp.__tcp_send_ack
>       0.06 ± 11%      +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.___perf_sw_event
>       0.06 ± 14%      +0.0        0.09 ± 13%  perf-profile.children.cycles-pp.tcp_write_timer_handler
>       0.12 ±  7%      +0.0        0.15 ±  5%  perf-profile.children.cycles-pp.update_curr
>       0.06 ± 11%      +0.0        0.09 ± 17%  perf-profile.children.cycles-pp.call_timer_fn
>       0.17 ±  4%      +0.0        0.20 ±  3%  perf-profile.children.cycles-pp.___slab_alloc
>       0.18 ±  4%      +0.0        0.21 ±  3%  perf-profile.children.cycles-pp.__slab_alloc
>       0.05 ± 58%      +0.0        0.08 ± 15%  perf-profile.children.cycles-pp.tcp_write_timer
>       0.04 ± 58%      +0.0        0.08 ± 16%  perf-profile.children.cycles-pp.tcp_send_loss_probe
>       0.32 ±  3%      +0.0        0.35        perf-profile.children.cycles-pp.kmem_cache_alloc_node
>       0.14 ±  7%      +0.0        0.19 ± 16%  perf-profile.children.cycles-pp.preempt_schedule_common
>       0.21 ± 12%      +0.1        0.27 ±  6%  perf-profile.children.cycles-pp.task_tick_fair
>       0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.__tcp_retransmit_skb
>       0.51 ±  3%      +0.1        0.57 ±  6%  perf-profile.children.cycles-pp.__sched_text_start
>       1.61            +0.1        1.68 ±  2%  perf-profile.children.cycles-pp.__release_sock
>       1.06 ±  3%      +0.1        1.14 ±  2%  perf-profile.children.cycles-pp.tcp_ack
>       0.28 ±  9%      +0.1        0.36 ±  4%  perf-profile.children.cycles-pp.scheduler_tick
>      98.09            +0.1       98.18        perf-profile.children.cycles-pp.do_syscall_64
>      98.23            +0.1       98.32        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>       0.49 ±  8%      +0.1        0.58 ±  5%  perf-profile.children.cycles-pp.update_process_times
>       0.50 ±  8%      +0.1        0.61 ±  6%  perf-profile.children.cycles-pp.tick_sched_handle
>       0.54 ±  9%      +0.1        0.67 ±  5%  perf-profile.children.cycles-pp.tick_sched_timer
>       0.79 ±  8%      +0.1        0.93 ±  3%  perf-profile.children.cycles-pp.__hrtimer_run_queues
>       0.93 ±  9%      +0.2        1.09 ±  2%  perf-profile.children.cycles-pp.hrtimer_interrupt
>       1.13 ± 10%      +0.2        1.37 ±  4%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
>       2.51 ±  6%      +0.4        2.87 ±  3%  perf-profile.children.cycles-pp.apic_timer_interrupt
>      70.21            +0.4       70.63        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>       1.61            -0.1        1.49 ±  2%  perf-profile.self.cycles-pp.copy_page_to_iter
>       0.78 ±  2%      -0.1        0.72 ±  3%  perf-profile.self.cycles-pp.__entry_SYSCALL_64_trampoline
>       1.37            -0.1        1.32        perf-profile.self.cycles-pp.__free_pages_ok
>       0.21 ±  5%      -0.0        0.18 ±  4%  perf-profile.self.cycles-pp.__sk_mem_raise_allocated
>       0.65 ±  2%      -0.0        0.62        perf-profile.self.cycles-pp.free_one_page
>       0.41 ±  2%      -0.0        0.39 ±  4%  perf-profile.self.cycles-pp.skb_copy_datagram_iter
>       0.08 ±  6%      -0.0        0.06 ± 15%  perf-profile.self.cycles-pp.__tcp_select_window
>       0.10 ±  5%      -0.0        0.08 ±  8%  perf-profile.self.cycles-pp.import_single_range
>       0.14 ±  5%      +0.0        0.16 ±  5%  perf-profile.self.cycles-pp.___slab_alloc
>       0.19 ±  3%      +0.0        0.21 ±  3%  perf-profile.self.cycles-pp.kmem_cache_alloc_node
>       0.15 ±  4%      +0.0        0.17 ±  4%  perf-profile.self.cycles-pp.__might_sleep
>       0.03 ±100%      +0.0        0.07 ± 13%  perf-profile.self.cycles-pp.___perf_sw_event
> 
> 
> 
> ***************************************************************************************************
> lkp-u410: 4 threads Intel(R) Core(TM) i5-3317U CPU @ 1.70GHz with 4G memory
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
>   cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-u410/TCP_MAERTS/netperf/0x20
> 
> commit: 
>   3ff6cde846 ("hns3: Another build fix.")
>   a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
> 
> 3ff6cde846857d45 a337531b942bd8a03e7052444d 
> ---------------- -------------------------- 
>        fail:runs  %reproduction    fail:runs
>            |             |             |    
>           4:4         -100%            :4     dmesg.RIP:intel_modeset_init[i915]
>           4:4         -100%            :4     dmesg.WARNING:at_drivers/gpu/drm/i915/intel_display.c:#intel_modeset_init[i915]
>           2:4           -3%           2:4     perf-profile.children.cycles-pp.schedule_timeout
>          %stddev     %change         %stddev
>              \          |                \  
>       3879            -3.2%       3753        netperf.Throughput_Mbps
>      31036            -3.2%      30030        netperf.Throughput_total_Mbps
>       2463            +3.6%       2552        netperf.time.maximum_resident_set_size
>       2499            +7.5%       2685        netperf.time.minor_page_faults
>      24.96           -14.8%      21.28 ±  8%  netperf.time.user_time
>     543040 ± 13%     -15.9%     456816 ±  2%  netperf.time.voluntary_context_switches
>  2.131e+08            -3.2%  2.062e+08        netperf.workload
>      21274            +3.3%      21986        interrupts.CAL:Function_call_interrupts
>     826.00 ±  6%     -27.1%     602.00 ± 23%  slabinfo.skbuff_head_cache.active_objs
>       3904 ±  2%      -4.5%       3728        vmstat.system.cs
>      56.50 ±  2%      +8.8%      61.50 ±  5%  turbostat.CoreTmp
>      56.75 ±  2%      +8.4%      61.50 ±  5%  turbostat.PkgTmp
>       4224 ±173%    +294.2%      16653 ± 52%  sched_debug.cfs_rq:/.spread0.avg
>     110.92 ±  8%     -22.2%      86.34 ± 10%  sched_debug.cfs_rq:/.util_avg.stddev
>     896147 ±  3%     -11.3%     795033 ±  4%  sched_debug.cpu.avg_idle.max
>     162406 ±  9%     -26.1%     119960 ± 21%  sched_debug.cpu.avg_idle.stddev
>      59886 ±  3%      -3.8%      57590        proc-vmstat.nr_dirty_background_threshold
>     119920 ±  3%      -3.8%     115322        proc-vmstat.nr_dirty_threshold
>     628429 ±  3%      -3.7%     605425        proc-vmstat.nr_free_pages
>  1.071e+08            -3.2%  1.036e+08        proc-vmstat.numa_hit
>  1.071e+08            -3.2%  1.036e+08        proc-vmstat.numa_local
>  8.503e+08            -3.2%  8.229e+08        proc-vmstat.pgfree
>  2.265e+11            -5.7%  2.135e+11        perf-stat.branch-instructions
>       3.01            -0.1        2.94        perf-stat.branch-miss-rate%
>  6.809e+09            -7.8%  6.279e+09 ±  3%  perf-stat.branch-misses
>      30.13            +2.0       32.13        perf-stat.cache-miss-rate%
>  5.149e+10            +3.2%  5.314e+10        perf-stat.cache-misses
>  1.709e+11            -3.2%  1.654e+11        perf-stat.cache-references
>    3532029 ±  2%      -4.5%    3373137        perf-stat.context-switches
>       7.31            +6.2%       7.76        perf-stat.cpi
>  5.633e+09 ±  2%      -5.8%  5.308e+09        perf-stat.dTLB-load-misses
>  7.264e+11            -4.1%  6.964e+11        perf-stat.dTLB-loads
>   6.35e+11            -4.0%  6.097e+11        perf-stat.dTLB-stores
>  4.029e+08            -7.1%  3.743e+08 ±  2%  perf-stat.iTLB-load-misses
>  1.157e+12            -5.7%  1.091e+12        perf-stat.instructions
>       0.14            -5.8%       0.13        perf-stat.ipc
>       5426            -2.5%       5289        perf-stat.path-length
>       1.16 ±  6%      -0.2        0.99 ±  3%  perf-profile.calltrace.cycles-pp.__entry_SYSCALL_64_trampoline
>       0.99 ±  6%      -0.1        0.88 ± 10%  perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
>      96.58            +0.3       96.87        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
>      26.12 ±  2%      +1.3       27.40        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter_full.tcp_sendmsg_locked.tcp_sendmsg
>      26.39 ±  2%      +1.3       27.69        perf-profile.calltrace.cycles-pp.copyin._copy_from_iter_full.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
>      27.12 ±  3%      +1.4       28.48        perf-profile.calltrace.cycles-pp._copy_from_iter_full.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
>      41.73 ±  2%      +1.7       43.40 ±  2%  perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
>      43.17 ±  2%      +1.7       44.87 ±  2%  perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
>      43.75 ±  2%      +1.8       45.51        perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      44.88 ±  2%      +1.8       46.63        perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      44.73 ±  2%      +1.8       46.53        perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       1.38 ±  6%      -0.2        1.20 ±  3%  perf-profile.children.cycles-pp.__entry_SYSCALL_64_trampoline
>       0.42 ±  9%      -0.1        0.31 ±  9%  perf-profile.children.cycles-pp.tcp_queue_rcv
>       0.79 ±  6%      -0.1        0.68 ±  5%  perf-profile.children.cycles-pp.ktime_get_with_offset
>       0.32 ± 12%      -0.1        0.21 ± 33%  perf-profile.children.cycles-pp.scheduler_tick
>       0.35 ± 12%      -0.1        0.26 ± 11%  perf-profile.children.cycles-pp.tcp_try_coalesce
>       0.29 ± 10%      -0.1        0.20 ± 17%  perf-profile.children.cycles-pp.skb_try_coalesce
>       0.88 ±  2%      -0.1        0.79 ±  4%  perf-profile.children.cycles-pp.tcp_mstamp_refresh
>       0.32 ±  9%      -0.1        0.26 ± 18%  perf-profile.children.cycles-pp.ip_local_out
>       0.41 ±  3%      +0.0        0.45 ±  4%  perf-profile.children.cycles-pp.selinux_ip_postroute
>       0.03 ±102%      +0.1        0.09 ± 24%  perf-profile.children.cycles-pp.lock_timer_base
>       0.00            +0.1        0.08 ± 29%  perf-profile.children.cycles-pp.raw_local_deliver
>       0.57 ±  4%      +0.1        0.66 ±  7%  perf-profile.children.cycles-pp.tcp_event_new_data_sent
>       0.20 ± 28%      +0.1        0.29 ± 21%  perf-profile.children.cycles-pp._cond_resched
>      64.27            +0.5       64.78        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>      26.41 ±  2%      +1.3       27.70        perf-profile.children.cycles-pp.copyin
>      27.16 ±  3%      +1.3       28.50        perf-profile.children.cycles-pp._copy_from_iter_full
>      41.76 ±  2%      +1.7       43.44 ±  2%  perf-profile.children.cycles-pp.tcp_sendmsg_locked
>      43.19 ±  2%      +1.7       44.88 ±  2%  perf-profile.children.cycles-pp.tcp_sendmsg
>      44.88 ±  2%      +1.8       46.65        perf-profile.children.cycles-pp.__x64_sys_sendto
>      43.75 ±  2%      +1.8       45.51        perf-profile.children.cycles-pp.sock_sendmsg
>      44.74 ±  2%      +1.8       46.54        perf-profile.children.cycles-pp.__sys_sendto
>       1.21 ±  8%      -0.2        0.99 ±  5%  perf-profile.self.cycles-pp.copy_page_to_iter
>       1.32 ±  6%      -0.2        1.15 ±  3%  perf-profile.self.cycles-pp.__entry_SYSCALL_64_trampoline
>       0.29 ±  9%      -0.1        0.20 ± 18%  perf-profile.self.cycles-pp.skb_try_coalesce
>       0.50 ±  9%      -0.1        0.42 ± 10%  perf-profile.self.cycles-pp.ktime_get_with_offset
>       0.19 ± 14%      -0.1        0.12 ± 10%  perf-profile.self.cycles-pp.__local_bh_enable_ip
>       0.08 ± 10%      -0.0        0.03 ±102%  perf-profile.self.cycles-pp.selinux_sock_rcv_skb_compat
>       0.13 ±  3%      -0.0        0.08 ± 57%  perf-profile.self.cycles-pp.__x64_sys_sendto
>       0.07 ± 12%      -0.0        0.03 ±100%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
>       0.11 ± 11%      -0.0        0.08 ± 22%  perf-profile.self.cycles-pp.__sys_recvfrom
>       0.05 ± 61%      +0.0        0.09 ± 11%  perf-profile.self.cycles-pp.selinux_ip_postroute
>       0.09 ± 20%      +0.1        0.15 ± 31%  perf-profile.self.cycles-pp.rcu_all_qs
>       0.00            +0.1        0.07 ± 28%  perf-profile.self.cycles-pp.raw_local_deliver
> 
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Rong Chen
> 
_______________________________________________
LKP mailing list
LKP@lists.01.org
https://lists.01.org/mailman/listinfo/lkp

^ permalink raw reply

* [PATCH net v4] net/ipv6: Add anycast addresses to a global hashtable
From: Jeff Barnhill @ 2018-10-27 18:02 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuznet, yoshfuji, Jeff Barnhill
In-Reply-To: <95cb5670-eaf0-c7af-7e35-bc4f6e68c5ba@gmail.com>

The icmp6_send() function is expensive on systems with a large number of
interfaces. Every time it’s called, it has to verify that the source
address does not correspond to an existing anycast address by looping
through every device and every anycast address on each device.  This can
result in significant delays for a CPU when there are many neighbors and
ND timers are frequently timing out and calling neigh_invalidate().

Add anycast addresses to a global hashtable to allow quick searching for
matching anycast addresses.  This is based on inet6_addr_lst in addrconf.c.

Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
---
 include/net/addrconf.h |   2 +
 include/net/if_inet6.h |   8 ++++
 net/ipv6/af_inet6.c    |   5 +++
 net/ipv6/anycast.c     | 120 ++++++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 133 insertions(+), 2 deletions(-)

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 14b789a123e7..799af1a037d1 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -317,6 +317,8 @@ bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
 			 const struct in6_addr *addr);
 bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
 			     const struct in6_addr *addr);
+int anycast_init(void);
+void anycast_cleanup(void);
 
 /* Device notifier */
 int register_inet6addr_notifier(struct notifier_block *nb);
diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index d7578cf49c3a..a445014b981d 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -142,6 +142,14 @@ struct ipv6_ac_socklist {
 	struct ipv6_ac_socklist *acl_next;
 };
 
+struct ipv6_ac_addrlist {
+	struct in6_addr		acal_addr;
+	possible_net_t		acal_pnet;
+	refcount_t		acal_users;
+	struct hlist_node	acal_lst; /* inet6_acaddr_lst */
+	struct rcu_head		rcu;
+};
+
 struct ifacaddr6 {
 	struct in6_addr		aca_addr;
 	struct fib6_info	*aca_rt;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 3f4d61017a69..ddc8a6dbfba2 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -1001,6 +1001,9 @@ static int __init inet6_init(void)
 	err = ip6_flowlabel_init();
 	if (err)
 		goto ip6_flowlabel_fail;
+	err = anycast_init();
+	if (err)
+		goto anycast_fail;
 	err = addrconf_init();
 	if (err)
 		goto addrconf_fail;
@@ -1091,6 +1094,8 @@ static int __init inet6_init(void)
 ipv6_exthdrs_fail:
 	addrconf_cleanup();
 addrconf_fail:
+	anycast_cleanup();
+anycast_fail:
 	ip6_flowlabel_cleanup();
 ip6_flowlabel_fail:
 	ndisc_late_cleanup();
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 4e0ff7031edd..45585010908a 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -44,8 +44,22 @@
 
 #include <net/checksum.h>
 
+#define IN6_ADDR_HSIZE_SHIFT	8
+#define IN6_ADDR_HSIZE		BIT(IN6_ADDR_HSIZE_SHIFT)
+/*	anycast address hash table
+ */
+static struct hlist_head inet6_acaddr_lst[IN6_ADDR_HSIZE];
+static DEFINE_SPINLOCK(acaddr_hash_lock);
+
 static int ipv6_dev_ac_dec(struct net_device *dev, const struct in6_addr *addr);
 
+static u32 inet6_acaddr_hash(struct net *net, const struct in6_addr *addr)
+{
+	u32 val = ipv6_addr_hash(addr) ^ net_hash_mix(net);
+
+	return hash_32(val, IN6_ADDR_HSIZE_SHIFT);
+}
+
 /*
  *	socket join an anycast group
  */
@@ -204,6 +218,73 @@ void ipv6_sock_ac_close(struct sock *sk)
 	rtnl_unlock();
 }
 
+static struct ipv6_ac_addrlist *acal_alloc(struct net *net,
+					   const struct in6_addr *addr)
+{
+	struct ipv6_ac_addrlist *acal;
+
+	acal = kzalloc(sizeof(*acal), GFP_ATOMIC);
+	if (!acal)
+		return NULL;
+
+	acal->acal_addr = *addr;
+	write_pnet(&acal->acal_pnet, net);
+	refcount_set(&acal->acal_users, 1);
+	INIT_HLIST_NODE(&acal->acal_lst);
+
+	return acal;
+}
+
+static int ipv6_add_acaddr_hash(struct net *net, const struct in6_addr *addr)
+{
+	unsigned int hash = inet6_acaddr_hash(net, addr);
+	struct ipv6_ac_addrlist *acal;
+	int err = 0;
+
+	spin_lock(&acaddr_hash_lock);
+	hlist_for_each_entry(acal, &inet6_acaddr_lst[hash], acal_lst) {
+		if (!net_eq(read_pnet(&acal->acal_pnet), net))
+			continue;
+		if (ipv6_addr_equal(&acal->acal_addr, addr)) {
+			refcount_inc(&acal->acal_users);
+			goto out;
+		}
+	}
+
+	acal = acal_alloc(net, addr);
+	if (!acal) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	hlist_add_head_rcu(&acal->acal_lst, &inet6_acaddr_lst[hash]);
+
+out:
+	spin_unlock(&acaddr_hash_lock);
+	return err;
+}
+
+static void ipv6_del_acaddr_hash(struct net *net, const struct in6_addr *addr)
+{
+	unsigned int hash = inet6_acaddr_hash(net, addr);
+	struct ipv6_ac_addrlist *acal;
+
+	spin_lock(&acaddr_hash_lock);
+	hlist_for_each_entry(acal, &inet6_acaddr_lst[hash], acal_lst) {
+		if (!net_eq(read_pnet(&acal->acal_pnet), net))
+			continue;
+		if (ipv6_addr_equal(&acal->acal_addr, addr)) {
+			if (refcount_dec_and_test(&acal->acal_users)) {
+				hlist_del_init_rcu(&acal->acal_lst);
+				kfree_rcu(acal, rcu);
+			}
+			spin_unlock(&acaddr_hash_lock);
+			return;
+		}
+	}
+	spin_unlock(&acaddr_hash_lock);
+}
+
 static void aca_get(struct ifacaddr6 *aca)
 {
 	refcount_inc(&aca->aca_refcnt);
@@ -275,6 +356,11 @@ int __ipv6_dev_ac_inc(struct inet6_dev *idev, const struct in6_addr *addr)
 		err = -ENOMEM;
 		goto out;
 	}
+	err = ipv6_add_acaddr_hash(dev_net(idev->dev), addr);
+	if (err) {
+		aca_put(aca);
+		goto out;
+	}
 
 	aca->aca_next = idev->ac_list;
 	idev->ac_list = aca;
@@ -324,6 +410,7 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr)
 		prev_aca->aca_next = aca->aca_next;
 	else
 		idev->ac_list = aca->aca_next;
+	ipv6_del_acaddr_hash(dev_net(idev->dev), &aca->aca_addr);
 	write_unlock_bh(&idev->lock);
 	addrconf_leave_solict(idev, &aca->aca_addr);
 
@@ -350,6 +437,8 @@ void ipv6_ac_destroy_dev(struct inet6_dev *idev)
 	write_lock_bh(&idev->lock);
 	while ((aca = idev->ac_list) != NULL) {
 		idev->ac_list = aca->aca_next;
+		ipv6_del_acaddr_hash(dev_net(idev->dev), &aca->aca_addr);
+
 		write_unlock_bh(&idev->lock);
 
 		addrconf_leave_solict(idev, &aca->aca_addr);
@@ -390,17 +479,23 @@ static bool ipv6_chk_acast_dev(struct net_device *dev, const struct in6_addr *ad
 bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
 			 const struct in6_addr *addr)
 {
+	unsigned int hash = inet6_acaddr_hash(net, addr);
+	struct ipv6_ac_addrlist *acal;
 	bool found = false;
 
 	rcu_read_lock();
 	if (dev)
 		found = ipv6_chk_acast_dev(dev, addr);
 	else
-		for_each_netdev_rcu(net, dev)
-			if (ipv6_chk_acast_dev(dev, addr)) {
+		hlist_for_each_entry_rcu(acal, &inet6_acaddr_lst[hash],
+					 acal_lst) {
+			if (!net_eq(read_pnet(&acal->acal_pnet), net))
+				continue;
+			if (ipv6_addr_equal(&acal->acal_addr, addr)) {
 				found = true;
 				break;
 			}
+		}
 	rcu_read_unlock();
 	return found;
 }
@@ -539,4 +634,25 @@ void ac6_proc_exit(struct net *net)
 {
 	remove_proc_entry("anycast6", net->proc_net);
 }
+
+/*	Init / cleanup code
+ */
+int __init anycast_init(void)
+{
+	int i;
+
+	for (i = 0; i < IN6_ADDR_HSIZE; i++)
+		INIT_HLIST_HEAD(&inet6_acaddr_lst[i]);
+	return 0;
+}
+
+void anycast_cleanup(void)
+{
+	int i;
+
+	spin_lock(&acaddr_hash_lock);
+	for (i = 0; i < IN6_ADDR_HSIZE; i++)
+		WARN_ON(!hlist_empty(&inet6_acaddr_lst[i]));
+	spin_unlock(&acaddr_hash_lock);
+}
 #endif
-- 
2.14.1
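
For readers skimming the thread, the lookup idea in the patch above can be
sketched in user-space C. This is only an illustration under stated
assumptions, not the kernel code: every name here (ac_table, ac_add, ac_del,
ac_contains, netns_id) is hypothetical, an FNV-1a hash stands in for
ipv6_addr_hash() ^ net_hash_mix(), an integer stands in for the network
namespace, and the spinlock and RCU protection used by the patch are left
out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define AC_HSIZE_SHIFT 8
#define AC_HSIZE (1u << AC_HSIZE_SHIFT)

struct ac_addr { uint8_t b[16]; };      /* stand-in for struct in6_addr */

struct ac_entry {
	struct ac_addr addr;
	int netns_id;                   /* stand-in for the owning namespace */
	unsigned int users;             /* how many devices hold this address */
	struct ac_entry *next;          /* bucket chain */
};

static struct ac_entry *ac_table[AC_HSIZE];

/* Hypothetical hash: FNV-1a over the address, mixed with the namespace id. */
static unsigned int ac_hash(int netns_id, const struct ac_addr *addr)
{
	uint32_t h = 2166136261u ^ (uint32_t)netns_id;

	for (size_t i = 0; i < sizeof(addr->b); i++) {
		h ^= addr->b[i];
		h *= 16777619u;
	}
	return h & (AC_HSIZE - 1);
}

/* Walk one bucket only, instead of every device and every address. */
static struct ac_entry *ac_find(int netns_id, const struct ac_addr *addr)
{
	struct ac_entry *e = ac_table[ac_hash(netns_id, addr)];

	for (; e; e = e->next)
		if (e->netns_id == netns_id &&
		    memcmp(&e->addr, addr, sizeof(*addr)) == 0)
			return e;
	return NULL;
}

/* Add an address, or take another reference if it is already present. */
static int ac_add(int netns_id, const struct ac_addr *addr)
{
	struct ac_entry *e = ac_find(netns_id, addr);
	unsigned int h;

	if (e) {
		e->users++;
		return 0;
	}
	e = calloc(1, sizeof(*e));
	if (!e)
		return -1;
	e->addr = *addr;
	e->netns_id = netns_id;
	e->users = 1;
	h = ac_hash(netns_id, addr);
	e->next = ac_table[h];
	ac_table[h] = e;
	return 0;
}

/* Drop one reference; unlink and free the entry when the last one goes. */
static void ac_del(int netns_id, const struct ac_addr *addr)
{
	struct ac_entry **pp = &ac_table[ac_hash(netns_id, addr)];

	for (; *pp; pp = &(*pp)->next) {
		struct ac_entry *e = *pp;

		if (e->netns_id != netns_id ||
		    memcmp(&e->addr, addr, sizeof(*addr)) != 0)
			continue;
		assert(e->users > 0);
		if (--e->users == 0) {
			*pp = e->next;
			free(e);
		}
		return;
	}
}

static bool ac_contains(int netns_id, const struct ac_addr *addr)
{
	return ac_find(netns_id, addr) != NULL;
}
```

The reference count is what lets the same anycast address be configured on
several devices in one namespace: the entry stays in the table until the
last device drops it, so a lookup costs one bucket walk regardless of how
many interfaces exist.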

^ permalink raw reply related

* WARNING in __debug_object_init (3)
From: syzbot @ 2018-10-28  3:18 UTC (permalink / raw)
  To: ast, daniel, davem, linux-kernel, netdev, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    8c60c36d0b8c Add linux-next specific files for 20181019
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=100feec5400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8b6d7c4c81535e89
dashboard link: https://syzkaller.appspot.com/bug?extid=6e682caa546b7c96c859
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13579abd400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13654f6b400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+6e682caa546b7c96c859@syzkaller.appspotmail.com

ODEBUG: object 0000000015e9012c is on stack 00000000115bcb67, but NOT annotated.
WARNING: CPU: 0 PID: 5594 at lib/debugobjects.c:369 debug_object_is_on_stack lib/debugobjects.c:363 [inline]
WARNING: CPU: 0 PID: 5594 at lib/debugobjects.c:369 __debug_object_init.cold.14+0x51/0xdf lib/debugobjects.c:395
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 5594 Comm: syz-executor740 Not tainted 4.19.0-rc8-next-20181019+ #98
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  panic+0x2ad/0x55c kernel/panic.c:188
  __warn.cold.8+0x20/0x45 kernel/panic.c:540
  report_bug+0x254/0x2d0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:178 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
  do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
RIP: 0010:debug_object_is_on_stack lib/debugobjects.c:363 [inline]
RIP: 0010:__debug_object_init.cold.14+0x51/0xdf lib/debugobjects.c:395
Code: ea 03 80 3c 02 00 75 7c 49 8b 54 24 18 48 89 de 48 c7 c7 c0 f1 40 88 4c 89 85 d0 fd ff ff e8 09 8c d1 fd 4c 8b 85 d0 fd ff ff <0f> 0b e9 09 d6 ff ff 41 83 c4 01 b8 ff ff 37 00 44 89 25 b7 4e 66
RSP: 0018:ffff8801bb387308 EFLAGS: 00010086
RAX: 0000000000000050 RBX: ffff8801bb387af8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff816585a5 RDI: 0000000000000005
RBP: ffff8801bb387560 R08: ffff8801cb208a20 R09: ffffed003b5c5008
R10: ffffed003b5c5008 R11: ffff8801dae28047 R12: ffff8801d82ea300
R13: 0000000000069700 R14: ffff8801d82ea300 R15: ffff8801cb208a10
  debug_object_init+0x16/0x20 lib/debugobjects.c:432
  debug_timer_init kernel/time/timer.c:704 [inline]
  debug_init kernel/time/timer.c:757 [inline]
  init_timer_key+0xa9/0x480 kernel/time/timer.c:806
  sock_init_data+0xe1/0xdc0 net/core/sock.c:2696
  bpf_prog_test_run_skb+0x255/0xc40 net/bpf/test_run.c:144
  bpf_prog_test_run+0x130/0x1a0 kernel/bpf/syscall.c:1790
  __do_sys_bpf kernel/bpf/syscall.c:2427 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:2371 [inline]
  __x64_sys_bpf+0x3d8/0x510 kernel/bpf/syscall.c:2371
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440259
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc212cf818 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440259
RDX: 0000000000000028 RSI: 0000000020000080 RDI: 000000000000000a
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401ae0
R13: 0000000000401b70 R14: 0000000000000000 R15: 0000000000000000

======================================================
WARNING: possible circular locking dependency detected
4.19.0-rc8-next-20181019+ #98 Not tainted
------------------------------------------------------
syz-executor740/5594 is trying to acquire lock:
00000000688fcc6b ((console_sem).lock){-.-.}, at: down_trylock+0x13/0x70 kernel/locking/semaphore.c:136

but task is already holding lock:
00000000505ead1b (&obj_hash[i].lock){-.-.}, at: __debug_object_init+0x127/0x1290 lib/debugobjects.c:384

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&obj_hash[i].lock){-.-.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
        __debug_object_init+0x127/0x1290 lib/debugobjects.c:384
        debug_object_init+0x16/0x20 lib/debugobjects.c:432
        debug_hrtimer_init kernel/time/hrtimer.c:410 [inline]
        debug_init kernel/time/hrtimer.c:458 [inline]
        hrtimer_init+0x97/0x490 kernel/time/hrtimer.c:1308
        init_dl_task_timer+0x1b/0x50 kernel/sched/deadline.c:1057
        __sched_fork+0x2ae/0x590 kernel/sched/core.c:2166
        init_idle+0x75/0x740 kernel/sched/core.c:5382
        sched_init+0xb33/0xc02 kernel/sched/core.c:6065
        start_kernel+0x4be/0xa2b init/main.c:608
        x86_64_start_reservations+0x2e/0x30 arch/x86/kernel/head64.c:472
        x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
        secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

-> #2 (&rq->lock){-.-.}:
        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
        _raw_spin_lock+0x2d/0x40 kernel/locking/spinlock.c:144
        rq_lock kernel/sched/sched.h:1127 [inline]
        task_fork_fair+0xb0/0x6d0 kernel/sched/fair.c:9768
        sched_fork+0x443/0xba0 kernel/sched/core.c:2359
        copy_process+0x2585/0x8770 kernel/fork.c:1887
        _do_fork+0x1cb/0x11c0 kernel/fork.c:2216
        kernel_thread+0x34/0x40 kernel/fork.c:2275
        rest_init+0x28/0x372 init/main.c:409
        arch_call_rest_init+0xe/0x1b
        start_kernel+0x9f0/0xa2b init/main.c:745
        x86_64_start_reservations+0x2e/0x30 arch/x86/kernel/head64.c:472
        x86_64_start_kernel+0x76/0x79 arch/x86/kernel/head64.c:451
        secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243

-> #1 (&p->pi_lock){-.-.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
        try_to_wake_up+0xd2/0x12e0 kernel/sched/core.c:1965
        wake_up_process+0x10/0x20 kernel/sched/core.c:2129
        __up.isra.1+0x1c0/0x2a0 kernel/locking/semaphore.c:262
        up+0x13c/0x1c0 kernel/locking/semaphore.c:187
        __up_console_sem+0xbe/0x1b0 kernel/printk/printk.c:236
        console_unlock+0x80c/0x1190 kernel/printk/printk.c:2432
        vprintk_emit+0x391/0x990 kernel/printk/printk.c:1922
        vprintk_default+0x28/0x30 kernel/printk/printk.c:1964
        vprintk_func+0x7e/0x181 kernel/printk/printk_safe.c:398
        printk+0xa7/0xcf kernel/printk/printk.c:1997
        check_stack_usage kernel/exit.c:755 [inline]
        do_exit.cold.18+0x57/0x16f kernel/exit.c:916
        do_group_exit+0x177/0x440 kernel/exit.c:970
        __do_sys_exit_group kernel/exit.c:981 [inline]
        __se_sys_exit_group kernel/exit.c:979 [inline]
        __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:979
        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
        entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 ((console_sem).lock){-.-.}:
        lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
        down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
        __down_trylock_console_sem+0xae/0x1f0 kernel/printk/printk.c:219
        console_trylock+0x15/0xa0 kernel/printk/printk.c:2247
        console_trylock_spinning kernel/printk/printk.c:1653 [inline]
        vprintk_emit+0x372/0x990 kernel/printk/printk.c:1921
        vprintk_default+0x28/0x30 kernel/printk/printk.c:1964
        vprintk_func+0x7e/0x181 kernel/printk/printk_safe.c:398
        printk+0xa7/0xcf kernel/printk/printk.c:1997
        debug_object_is_on_stack lib/debugobjects.c:363 [inline]
        __debug_object_init.cold.14+0x4a/0xdf lib/debugobjects.c:395
        debug_object_init+0x16/0x20 lib/debugobjects.c:432
        debug_timer_init kernel/time/timer.c:704 [inline]
        debug_init kernel/time/timer.c:757 [inline]
        init_timer_key+0xa9/0x480 kernel/time/timer.c:806
        sock_init_data+0xe1/0xdc0 net/core/sock.c:2696
        bpf_prog_test_run_skb+0x255/0xc40 net/bpf/test_run.c:144
        bpf_prog_test_run+0x130/0x1a0 kernel/bpf/syscall.c:1790
        __do_sys_bpf kernel/bpf/syscall.c:2427 [inline]
        __se_sys_bpf kernel/bpf/syscall.c:2371 [inline]
        __x64_sys_bpf+0x3d8/0x510 kernel/bpf/syscall.c:2371
        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
        entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
   (console_sem).lock --> &rq->lock --> &obj_hash[i].lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&obj_hash[i].lock);
                                lock(&rq->lock);
                                lock(&obj_hash[i].lock);
   lock((console_sem).lock);

  *** DEADLOCK ***

1 lock held by syz-executor740/5594:
  #0: 00000000505ead1b (&obj_hash[i].lock){-.-.}, at: __debug_object_init+0x127/0x1290 lib/debugobjects.c:384

stack backtrace:
CPU: 0 PID: 5594 Comm: syz-executor740 Not tainted 4.19.0-rc8-next-20181019+ #98
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  print_circular_bug.isra.35.cold.54+0x1bd/0x27d kernel/locking/lockdep.c:1221
  check_prev_add kernel/locking/lockdep.c:1863 [inline]
  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
  validate_chain kernel/locking/lockdep.c:2347 [inline]
  __lock_acquire+0x3399/0x4c20 kernel/locking/lockdep.c:3341
  lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
  _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
  down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
  __down_trylock_console_sem+0xae/0x1f0 kernel/printk/printk.c:219
  console_trylock+0x15/0xa0 kernel/printk/printk.c:2247
  console_trylock_spinning kernel/printk/printk.c:1653 [inline]
  vprintk_emit+0x372/0x990 kernel/printk/printk.c:1921
  vprintk_default+0x28/0x30 kernel/printk/printk.c:1964
  vprintk_func+0x7e/0x181 kernel/printk/printk_safe.c:398
  printk+0xa7/0xcf kernel/printk/printk.c:1997
  debug_object_is_on_stack lib/debugobjects.c:363 [inline]
  __debug_object_init.cold.14+0x4a/0xdf lib/debugobjects.c:395
  debug_object_init+0x16/0x20 lib/debugobjects.c:432
  debug_timer_init kernel/time/timer.c:704 [inline]
  debug_init kernel/time/timer.c:757 [inline]
  init_timer_key+0xa9/0x480 kernel/time/timer.c:806
  sock_init_data+0xe1/0xdc0 net/core/sock.c:2696
  bpf_prog_test_run_skb+0x255/0xc40 net/bpf/test_run.c:144
  bpf_prog_test_run+0x130/0x1a0 kernel/bpf/syscall.c:1790
  __do_sys_bpf kernel/bpf/syscall.c:2427 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:2371 [inline]
  __x64_sys_bpf+0x3d8/0x510 kernel/bpf/syscall.c:2371
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440259
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc212cf818 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440259
RDX: 0000000000000028 RSI: 0000000020000080 RDI: 000000000000000a
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401ae0
R13: 0000000000401b70 R14: 0000000000000000 R15: 0000000000000000
Kernel Offset: disabled
Rebooting in 86400 seconds..
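
The "Chain exists of" summary above can be read as a cycle in a lock-ordering
graph: console_sem.lock is already ordered before rq->lock, and rq->lock before
obj_hash[i].lock, so taking console_sem.lock while holding obj_hash[i].lock
would close the loop. The following is a minimal userspace sketch of that idea
only. It is not lockdep's implementation; the enum values, struct lock_graph,
reachable() and add_dependency() are hypothetical names made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the ones in the report. */
enum lock_class {
	CONSOLE_SEM_LOCK,
	RQ_LOCK,
	OBJ_HASH_LOCK,
	NR_LOCK_CLASSES
};

/* dep[a][b] is true when lock b has been acquired while a was held. */
struct lock_graph {
	bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

/* Depth-first search: is 'to' reachable from 'from' over recorded edges? */
bool reachable(const struct lock_graph *g, int from, int to, bool *visited)
{
	if (from == to)
		return true;

	visited[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (g->dep[from][next] && !visited[next] &&
		    reachable(g, next, to, visited))
			return true;
	}
	return false;
}

/*
 * Record the ordering "held -> wanted". Returns false, and records nothing,
 * if the new edge would close a cycle, i.e. if 'held' is already reachable
 * from 'wanted'.
 */
bool add_dependency(struct lock_graph *g, int held, int wanted)
{
	bool visited[NR_LOCK_CLASSES] = { false };

	assert(held >= 0 && held < NR_LOCK_CLASSES);
	assert(wanted >= 0 && wanted < NR_LOCK_CLASSES);

	if (reachable(g, wanted, held, visited))
		return false;

	g->dep[held][wanted] = true;
	return true;
}
```

With this sketch, recording console_sem -> rq and rq -> obj_hash succeeds,
while a later attempt to record obj_hash -> console_sem is rejected, which
mirrors the ordering the report flags as a possible deadlock.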


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* Re: [Patch net 05/11] net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring()
From: Sergei Shtylyov @ 2018-10-27 19:02 UTC (permalink / raw)
  To: Huazhong Tan, davem
  Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321
In-Reply-To: <1540608118-27449-6-git-send-email-tanhuazhong@huawei.com>

Hello!

On 27.10.2018 5:41, Huazhong Tan wrote:

> It is not necessary to reset the queue in the hns3_uninit_all_ring(),
> since the queue is stopped in the down operation, and will be resetted

    s/resetted/reset/.

> in the up operaton. And the judgment of the HCLGE_STATE_RST_HANDLING
> flag in the hclge_reset_tqp() is not correct, because we need to reset
> tqp during pf reset, otherwise it may cause queue not be resetted to

    Same here.

> working state problem.
>
> Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
[...]

MBR, Sergei

^ permalink raw reply

