Netdev List
 help / color / mirror / Atom feed
* [PATCH] net: macb: Only call GPIO functions if there is a valid GPIO
From: Charles Keepax @ 2016-03-28 12:47 UTC (permalink / raw)
  To: nicolas.ferre, davem; +Cc: netdev, linux-kernel, patches

GPIOlib will print warning messages if we call GPIO functions without a
valid GPIO. Change the code to avoid doing so.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
---
 drivers/net/ethernet/cadence/macb.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 6619178..71bb42e 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -2957,9 +2957,10 @@ static int macb_probe(struct platform_device *pdev)
 	phy_node =  of_get_next_available_child(np, NULL);
 	if (phy_node) {
 		int gpio = of_get_named_gpio(phy_node, "reset-gpios", 0);
-		if (gpio_is_valid(gpio))
+		if (gpio_is_valid(gpio)) {
 			bp->reset_gpio = gpio_to_desc(gpio);
-		gpiod_direction_output(bp->reset_gpio, 1);
+			gpiod_direction_output(bp->reset_gpio, 1);
+		}
 	}
 	of_node_put(phy_node);
 
@@ -3029,7 +3030,8 @@ static int macb_remove(struct platform_device *pdev)
 		mdiobus_free(bp->mii_bus);
 
 		/* Shutdown the PHY if there is a GPIO reset */
-		gpiod_set_value(bp->reset_gpio, 0);
+		if (bp->reset_gpio)
+			gpiod_set_value(bp->reset_gpio, 0);
 
 		unregister_netdev(dev);
 		clk_disable_unprepare(bp->tx_clk);
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH] iwlwifi: pcie: remove duplicate assignment of variable isr_stats
From: Grumbach, Emmanuel @ 2016-03-28 11:39 UTC (permalink / raw)
  To: linuxwifi, colin.king@canonical.com, kvalo@codeaurora.org,
	Sharon, Sara, netdev@vger.kernel.org, Berg, Johannes,
	linux-wireless@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org
In-Reply-To: <1459164824-6994-1-git-send-email-colin.king@canonical.com>

On Mon, 2016-03-28 at 12:33 +0100, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> isr_stats is written twice with the same value, remove one of the
> redundant assignments to isr_stats.
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---


Applied - thanks.

>  drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> index 4be3c35..253e4f0 100644
> --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> @@ -1805,7 +1805,7 @@ irqreturn_t iwl_pcie_irq_msix_handler(int irq,
> void *dev_id)
>  	struct msix_entry *entry = dev_id;
>  	struct iwl_trans_pcie *trans_pcie =
> iwl_pcie_get_trans_pcie(entry);
>  	struct iwl_trans *trans = trans_pcie->trans;
> -	struct isr_statistics *isr_stats = isr_stats = &trans_pcie
> ->isr_stats;
> +	struct isr_statistics *isr_stats = &trans_pcie->isr_stats;
>  	u32 inta_fh, inta_hw;
>  
>  	lock_map_acquire(&trans->sync_cmd_lockdep_map);

^ permalink raw reply

* [PATCH] iwlwifi: pcie: remove duplicate assignment of variable isr_stats
From: Colin King @ 2016-03-28 11:33 UTC (permalink / raw)
  To: Johannes Berg, Emmanuel Grumbach, Intel Linux Wireless,
	Kalle Valo, Sara Sharon, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA

From: Colin Ian King <colin.king-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>

isr_stats is written twice with the same value, remove one of the
redundant assignments to isr_stats.

Signed-off-by: Colin Ian King <colin.king-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
---
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
index 4be3c35..253e4f0 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
@@ -1805,7 +1805,7 @@ irqreturn_t iwl_pcie_irq_msix_handler(int irq, void *dev_id)
 	struct msix_entry *entry = dev_id;
 	struct iwl_trans_pcie *trans_pcie = iwl_pcie_get_trans_pcie(entry);
 	struct iwl_trans *trans = trans_pcie->trans;
-	struct isr_statistics *isr_stats = isr_stats = &trans_pcie->isr_stats;
+	struct isr_statistics *isr_stats = &trans_pcie->isr_stats;
 	u32 inta_fh, inta_hw;
 
 	lock_map_acquire(&trans->sync_cmd_lockdep_map);
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH net-next 2/2] net: hns: set-coalesce-usecs returns errno by dsaf.ko
From: Yisen Zhuang @ 2016-03-28 10:40 UTC (permalink / raw)
  To: davem, yisen.zhuang, salil.mehta, liguozhu, huangdaode, arnd,
	andriy.shevchenko, andrew, geliangtang, ivecera, lisheng011,
	fengguang.wu
  Cc: charles.chenxin, haifeng.wei, netdev, linux-kernel,
	linux-arm-kernel, linuxarm
In-Reply-To: <1459161657-151406-1-git-send-email-Yisen.Zhuang@huawei.com>

From: Lisheng <lisheng011@huawei.com>

It may fail to set coalesce usecs to HW, and Ethtool needs to know if it
is successful to cfg the parameter or not. So it needs return the errno by
dsaf.ko.

Signed-off-by: Lisheng <lisheng011@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns/hnae.h         | 2 +-
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c | 6 +++---
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c  | 6 ++++--
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h b/drivers/net/ethernet/hisilicon/hns/hnae.h
index 37d0cce..e8d36aa 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.h
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
@@ -469,7 +469,7 @@ struct hnae_ae_ops {
 				   u32 *tx_usecs, u32 *rx_usecs);
 	void (*get_rx_max_coalesced_frames)(struct hnae_handle *handle,
 					    u32 *tx_frames, u32 *rx_frames);
-	void (*set_coalesce_usecs)(struct hnae_handle *handle, u32 timeout);
+	int (*set_coalesce_usecs)(struct hnae_handle *handle, u32 timeout);
 	int (*set_coalesce_frames)(struct hnae_handle *handle,
 				   u32 coalesce_frames);
 	void (*set_promisc_mode)(struct hnae_handle *handle, u32 en);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
index 1dd1d69..a1cb461 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
@@ -469,13 +469,13 @@ static void hns_ae_get_rx_max_coalesced_frames(struct hnae_handle *handle,
 						  ring_pair->port_id_in_comm);
 }
 
-static void hns_ae_set_coalesce_usecs(struct hnae_handle *handle,
-				      u32 timeout)
+static int hns_ae_set_coalesce_usecs(struct hnae_handle *handle,
+				     u32 timeout)
 {
 	struct ring_pair_cb *ring_pair =
 		container_of(handle->qs[0], struct ring_pair_cb, q);
 
-	(void)hns_rcb_set_coalesce_usecs(
+	return hns_rcb_set_coalesce_usecs(
 		ring_pair->rcb_common, ring_pair->port_id_in_comm, timeout);
 }
 
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
index 9c3ba65..b138181 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
@@ -794,8 +794,10 @@ static int hns_set_coalesce(struct net_device *net_dev,
 	    (!ops->set_coalesce_frames))
 		return -ESRCH;
 
-	ops->set_coalesce_usecs(priv->ae_handle,
-					ec->rx_coalesce_usecs);
+	ret = ops->set_coalesce_usecs(priv->ae_handle,
+				      ec->rx_coalesce_usecs);
+	if (ret)
+		return ret;
 
 	ret = ops->set_coalesce_frames(
 		priv->ae_handle,
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next 1/2] net: hns: fixed the setting and getting overtime bug
From: Yisen Zhuang @ 2016-03-28 10:40 UTC (permalink / raw)
  To: davem, yisen.zhuang, salil.mehta, liguozhu, huangdaode, arnd,
	andriy.shevchenko, andrew, geliangtang, ivecera, lisheng011,
	fengguang.wu
  Cc: charles.chenxin, haifeng.wei, netdev, linux-kernel,
	linux-arm-kernel, linuxarm
In-Reply-To: <1459161657-151406-1-git-send-email-Yisen.Zhuang@huawei.com>

From: Lisheng <lisheng011@huawei.com>

The overtime setting and getting REGs in HNS V2 is defferent from HNS V1.
It needs to be distinguished between them if getting or setting the REGs.

Signed-off-by: Lisheng <lisheng011@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c |  60 +++----
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c | 196 ++++++++++------------
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h |  23 +--
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h |   1 +
 4 files changed, 126 insertions(+), 154 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
index 285c893..1dd1d69 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
@@ -159,11 +159,6 @@ struct hnae_handle *hns_ae_get_handle(struct hnae_ae_dev *dev,
 		ae_handle->qs[i]->tx_ring.q = ae_handle->qs[i];
 
 		ring_pair_cb->used_by_vf = 1;
-		if (port_idx < DSAF_SERVICE_PORT_NUM_PER_DSAF)
-			ring_pair_cb->port_id_in_dsa = port_idx;
-		else
-			ring_pair_cb->port_id_in_dsa = 0;
-
 		ring_pair_cb++;
 	}
 
@@ -453,59 +448,46 @@ static int hns_ae_set_pauseparam(struct hnae_handle *handle,
 static void hns_ae_get_coalesce_usecs(struct hnae_handle *handle,
 				      u32 *tx_usecs, u32 *rx_usecs)
 {
-	int port;
-
-	port = hns_ae_map_eport_to_dport(handle->eport_id);
+	struct ring_pair_cb *ring_pair =
+		container_of(handle->qs[0], struct ring_pair_cb, q);
 
-	*tx_usecs = hns_rcb_get_coalesce_usecs(
-		hns_ae_get_dsaf_dev(handle->dev),
-		hns_dsaf_get_comm_idx_by_port(port));
-	*rx_usecs = hns_rcb_get_coalesce_usecs(
-		hns_ae_get_dsaf_dev(handle->dev),
-		hns_dsaf_get_comm_idx_by_port(port));
+	*tx_usecs = hns_rcb_get_coalesce_usecs(ring_pair->rcb_common,
+					       ring_pair->port_id_in_comm);
+	*rx_usecs = hns_rcb_get_coalesce_usecs(ring_pair->rcb_common,
+					       ring_pair->port_id_in_comm);
 }
 
 static void hns_ae_get_rx_max_coalesced_frames(struct hnae_handle *handle,
 					       u32 *tx_frames, u32 *rx_frames)
 {
-	int port;
+	struct ring_pair_cb *ring_pair =
+		container_of(handle->qs[0], struct ring_pair_cb, q);
 
-	assert(handle);
-
-	port = hns_ae_map_eport_to_dport(handle->eport_id);
-
-	*tx_frames = hns_rcb_get_coalesced_frames(
-		hns_ae_get_dsaf_dev(handle->dev), port);
-	*rx_frames = hns_rcb_get_coalesced_frames(
-		hns_ae_get_dsaf_dev(handle->dev), port);
+	*tx_frames = hns_rcb_get_coalesced_frames(ring_pair->rcb_common,
+						  ring_pair->port_id_in_comm);
+	*rx_frames = hns_rcb_get_coalesced_frames(ring_pair->rcb_common,
+						  ring_pair->port_id_in_comm);
 }
 
 static void hns_ae_set_coalesce_usecs(struct hnae_handle *handle,
 				      u32 timeout)
 {
-	int port;
+	struct ring_pair_cb *ring_pair =
+		container_of(handle->qs[0], struct ring_pair_cb, q);
 
-	assert(handle);
-
-	port = hns_ae_map_eport_to_dport(handle->eport_id);
-
-	hns_rcb_set_coalesce_usecs(hns_ae_get_dsaf_dev(handle->dev),
-				   port, timeout);
+	(void)hns_rcb_set_coalesce_usecs(
+		ring_pair->rcb_common, ring_pair->port_id_in_comm, timeout);
 }
 
 static int  hns_ae_set_coalesce_frames(struct hnae_handle *handle,
 				       u32 coalesce_frames)
 {
-	int port;
-	int ret;
+	struct ring_pair_cb *ring_pair =
+		container_of(handle->qs[0], struct ring_pair_cb, q);
 
-	assert(handle);
-
-	port = hns_ae_map_eport_to_dport(handle->eport_id);
-
-	ret = hns_rcb_set_coalesced_frames(hns_ae_get_dsaf_dev(handle->dev),
-					   port, coalesce_frames);
-	return ret;
+	return hns_rcb_set_coalesced_frames(
+		ring_pair->rcb_common,
+		ring_pair->port_id_in_comm, coalesce_frames);
 }
 
 void hns_ae_update_stats(struct hnae_handle *handle,
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
index 1218880..28ee26e 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
@@ -215,9 +215,9 @@ static void hns_rcb_ring_init(struct ring_pair_cb *ring_pair, int ring_type)
 		dsaf_write_dev(q, RCB_RING_RX_RING_BD_LEN_REG,
 			       bd_size_type);
 		dsaf_write_dev(q, RCB_RING_RX_RING_BD_NUM_REG,
-			       ring_pair->port_id_in_dsa);
+			       ring_pair->port_id_in_comm);
 		dsaf_write_dev(q, RCB_RING_RX_RING_PKTLINE_REG,
-			       ring_pair->port_id_in_dsa);
+			       ring_pair->port_id_in_comm);
 	} else {
 		dsaf_write_dev(q, RCB_RING_TX_RING_BASEADDR_L_REG,
 			       (u32)dma);
@@ -227,9 +227,9 @@ static void hns_rcb_ring_init(struct ring_pair_cb *ring_pair, int ring_type)
 		dsaf_write_dev(q, RCB_RING_TX_RING_BD_LEN_REG,
 			       bd_size_type);
 		dsaf_write_dev(q, RCB_RING_TX_RING_BD_NUM_REG,
-			       ring_pair->port_id_in_dsa);
+			       ring_pair->port_id_in_comm);
 		dsaf_write_dev(q, RCB_RING_TX_RING_PKTLINE_REG,
-			       ring_pair->port_id_in_dsa);
+			       ring_pair->port_id_in_comm);
 	}
 }
 
@@ -256,50 +256,16 @@ static void hns_rcb_set_port_desc_cnt(struct rcb_common_cb *rcb_common,
 		       desc_cnt);
 }
 
-/**
- *hns_rcb_set_port_coalesced_frames - set rcb port coalesced frames
- *@rcb_common: rcb_common device
- *@port_idx:port index
- *@coalesced_frames:BD num for coalesced frames
- */
-static int  hns_rcb_set_port_coalesced_frames(struct rcb_common_cb *rcb_common,
-					      u32 port_idx,
-					      u32 coalesced_frames)
-{
-	if (coalesced_frames >= rcb_common->desc_num ||
-	    coalesced_frames > HNS_RCB_MAX_COALESCED_FRAMES)
-		return -EINVAL;
-
-	dsaf_write_dev(rcb_common, RCB_CFG_PKTLINE_REG + port_idx * 4,
-		       coalesced_frames);
-	return 0;
-}
-
-/**
- *hns_rcb_get_port_coalesced_frames - set rcb port coalesced frames
- *@rcb_common: rcb_common device
- *@port_idx:port index
- * return coaleseced frames value
- */
-static u32 hns_rcb_get_port_coalesced_frames(struct rcb_common_cb *rcb_common,
-					     u32 port_idx)
+static void hns_rcb_set_port_timeout(
+	struct rcb_common_cb *rcb_common, u32 port_idx, u32 timeout)
 {
-	if (port_idx >= HNS_RCB_SERVICE_NW_ENGINE_NUM)
-		port_idx = 0;
-
-	return dsaf_read_dev(rcb_common,
-			     RCB_CFG_PKTLINE_REG + port_idx * 4);
-}
-
-/**
- *hns_rcb_set_timeout - set rcb port coalesced time_out
- *@rcb_common: rcb_common device
- *@time_out:time for coalesced time_out
- */
-static void hns_rcb_set_timeout(struct rcb_common_cb *rcb_common,
-				u32 timeout)
-{
-	dsaf_write_dev(rcb_common, RCB_CFG_OVERTIME_REG, timeout);
+	if (AE_IS_VER1(rcb_common->dsaf_dev->dsaf_ver))
+		dsaf_write_dev(rcb_common, RCB_CFG_OVERTIME_REG,
+			       timeout * HNS_RCB_CLK_FREQ_MHZ);
+	else
+		dsaf_write_dev(rcb_common,
+			       RCB_PORT_CFG_OVERTIME_REG + port_idx * 4,
+			       timeout);
 }
 
 static int hns_rcb_common_get_port_num(struct rcb_common_cb *rcb_common)
@@ -361,10 +327,11 @@ int hns_rcb_common_init_hw(struct rcb_common_cb *rcb_common)
 
 	for (i = 0; i < port_num; i++) {
 		hns_rcb_set_port_desc_cnt(rcb_common, i, rcb_common->desc_num);
-		(void)hns_rcb_set_port_coalesced_frames(
-			rcb_common, i, rcb_common->coalesced_frames);
+		(void)hns_rcb_set_coalesced_frames(
+			rcb_common, i, HNS_RCB_DEF_COALESCED_FRAMES);
+		hns_rcb_set_port_timeout(
+			rcb_common, i, HNS_RCB_DEF_COALESCED_USECS);
 	}
-	hns_rcb_set_timeout(rcb_common, rcb_common->timeout);
 
 	dsaf_write_dev(rcb_common, RCB_COM_CFG_ENDIAN_REG,
 		       HNS_RCB_COMMON_ENDIAN);
@@ -460,7 +427,8 @@ static void hns_rcb_ring_pair_get_cfg(struct ring_pair_cb *ring_pair_cb)
 	hns_rcb_ring_get_cfg(&ring_pair_cb->q, TX_RING);
 }
 
-static int hns_rcb_get_port(struct rcb_common_cb *rcb_common, int ring_idx)
+static int hns_rcb_get_port_in_comm(
+	struct rcb_common_cb *rcb_common, int ring_idx)
 {
 	int comm_index = rcb_common->comm_index;
 	int port;
@@ -470,7 +438,7 @@ static int hns_rcb_get_port(struct rcb_common_cb *rcb_common, int ring_idx)
 		q_num = (int)rcb_common->max_q_per_vf * rcb_common->max_vfn;
 		port = ring_idx / q_num;
 	} else {
-		port = HNS_RCB_SERVICE_NW_ENGINE_NUM + comm_index - 1;
+		port = 0; /* config debug-ports port_id_in_comm to 0*/
 	}
 
 	return port;
@@ -518,7 +486,8 @@ void hns_rcb_get_cfg(struct rcb_common_cb *rcb_common)
 		ring_pair_cb->index = i;
 		ring_pair_cb->q.io_base =
 			RCB_COMM_BASE_TO_RING_BASE(rcb_common->io_base, i);
-		ring_pair_cb->port_id_in_dsa = hns_rcb_get_port(rcb_common, i);
+		ring_pair_cb->port_id_in_comm =
+			hns_rcb_get_port_in_comm(rcb_common, i);
 		ring_pair_cb->virq[HNS_RCB_IRQ_IDX_TX] =
 		is_ver1 ? irq_of_parse_and_map(np, base_irq_idx + i * 2) :
 			  platform_get_irq(pdev, base_irq_idx + i * 3 + 1);
@@ -534,82 +503,95 @@ void hns_rcb_get_cfg(struct rcb_common_cb *rcb_common)
 /**
  *hns_rcb_get_coalesced_frames - get rcb port coalesced frames
  *@rcb_common: rcb_common device
- *@comm_index:port index
- *return coalesced_frames
+ *@port_idx:port id in comm
+ *
+ *Returns: coalesced_frames
  */
-u32 hns_rcb_get_coalesced_frames(struct dsaf_device *dsaf_dev, int port)
+u32 hns_rcb_get_coalesced_frames(
+	struct rcb_common_cb *rcb_common, u32 port_idx)
 {
-	int comm_index =  hns_dsaf_get_comm_idx_by_port(port);
-	struct rcb_common_cb *rcb_comm = dsaf_dev->rcb_common[comm_index];
-
-	return hns_rcb_get_port_coalesced_frames(rcb_comm, port);
+	return dsaf_read_dev(rcb_common, RCB_CFG_PKTLINE_REG + port_idx * 4);
 }
 
 /**
  *hns_rcb_get_coalesce_usecs - get rcb port coalesced time_out
  *@rcb_common: rcb_common device
- *@comm_index:port index
- *return time_out
+ *@port_idx:port id in comm
+ *
+ *Returns: time_out
  */
-u32 hns_rcb_get_coalesce_usecs(struct dsaf_device *dsaf_dev, int comm_index)
+u32 hns_rcb_get_coalesce_usecs(
+	struct rcb_common_cb *rcb_common, u32 port_idx)
 {
-	struct rcb_common_cb *rcb_comm = dsaf_dev->rcb_common[comm_index];
-
-	return rcb_comm->timeout;
+	if (AE_IS_VER1(rcb_common->dsaf_dev->dsaf_ver))
+		return dsaf_read_dev(rcb_common, RCB_CFG_OVERTIME_REG) /
+		       HNS_RCB_CLK_FREQ_MHZ;
+	else
+		return dsaf_read_dev(rcb_common,
+				     RCB_PORT_CFG_OVERTIME_REG + port_idx * 4);
 }
 
 /**
  *hns_rcb_set_coalesce_usecs - set rcb port coalesced time_out
  *@rcb_common: rcb_common device
- *@comm_index: comm :index
- *@etx_usecs:tx time for coalesced time_out
- *@rx_usecs:rx time for coalesced time_out
+ *@port_idx:port id in comm
+ *@timeout:tx/rx time for coalesced time_out
+ *
+ * Returns:
+ * Zero for success, or an error code in case of failure
  */
-void hns_rcb_set_coalesce_usecs(struct dsaf_device *dsaf_dev,
-				int port, u32 timeout)
+int hns_rcb_set_coalesce_usecs(
+	struct rcb_common_cb *rcb_common, u32 port_idx, u32 timeout)
 {
-	int comm_index =  hns_dsaf_get_comm_idx_by_port(port);
-	struct rcb_common_cb *rcb_comm = dsaf_dev->rcb_common[comm_index];
+	u32 old_timeout = hns_rcb_get_coalesce_usecs(rcb_common, port_idx);
 
-	if (rcb_comm->timeout == timeout)
-		return;
+	if (timeout == old_timeout)
+		return 0;
 
-	if (comm_index == HNS_DSAF_COMM_SERVICE_NW_IDX) {
-		dev_err(dsaf_dev->dev,
-			"error: not support coalesce_usecs setting!\n");
-		return;
+	if (AE_IS_VER1(rcb_common->dsaf_dev->dsaf_ver)) {
+		if (rcb_common->comm_index == HNS_DSAF_COMM_SERVICE_NW_IDX) {
+			dev_err(rcb_common->dsaf_dev->dev,
+				"error: not support coalesce_usecs setting!\n");
+			return -EINVAL;
+		}
 	}
-	rcb_comm->timeout = timeout;
-	hns_rcb_set_timeout(rcb_comm, rcb_comm->timeout);
+	if (timeout > HNS_RCB_MAX_COALESCED_USECS) {
+		dev_err(rcb_common->dsaf_dev->dev,
+			"error: not support coalesce %dus!\n", timeout);
+		return -EINVAL;
+	}
+	hns_rcb_set_port_timeout(rcb_common, port_idx, timeout);
+	return 0;
 }
 
 /**
  *hns_rcb_set_coalesced_frames - set rcb coalesced frames
  *@rcb_common: rcb_common device
- *@tx_frames:tx BD num for coalesced frames
- *@rx_frames:rx BD num for coalesced frames
- *Return 0 on success, negative on failure
+ *@port_idx:port id in comm
+ *@coalesced_frames:tx/rx BD num for coalesced frames
+ *
+ * Returns:
+ * Zero for success, or an error code in case of failure
  */
-int hns_rcb_set_coalesced_frames(struct dsaf_device *dsaf_dev,
-				 int port, u32 coalesced_frames)
+int hns_rcb_set_coalesced_frames(
+	struct rcb_common_cb *rcb_common, u32 port_idx, u32 coalesced_frames)
 {
-	int comm_index =  hns_dsaf_get_comm_idx_by_port(port);
-	struct rcb_common_cb *rcb_comm = dsaf_dev->rcb_common[comm_index];
-	u32 coalesced_reg_val;
-	int ret;
+	u32 old_waterline = hns_rcb_get_coalesced_frames(rcb_common, port_idx);
 
-	coalesced_reg_val = hns_rcb_get_port_coalesced_frames(rcb_comm, port);
-
-	if (coalesced_reg_val == coalesced_frames)
+	if (coalesced_frames == old_waterline)
 		return 0;
 
-	if (coalesced_frames >= HNS_RCB_MIN_COALESCED_FRAMES) {
-		ret = hns_rcb_set_port_coalesced_frames(rcb_comm, port,
-							coalesced_frames);
-		return ret;
-	} else {
+	if (coalesced_frames >= rcb_common->desc_num ||
+	    coalesced_frames > HNS_RCB_MAX_COALESCED_FRAMES ||
+	    coalesced_frames < HNS_RCB_MIN_COALESCED_FRAMES) {
+		dev_err(rcb_common->dsaf_dev->dev,
+			"error: not support coalesce_frames setting!\n");
 		return -EINVAL;
 	}
+
+	dsaf_write_dev(rcb_common, RCB_CFG_PKTLINE_REG + port_idx * 4,
+		       coalesced_frames);
+	return 0;
 }
 
 /**
@@ -749,8 +731,6 @@ int hns_rcb_common_get_cfg(struct dsaf_device *dsaf_dev,
 	rcb_common->dsaf_dev = dsaf_dev;
 
 	rcb_common->desc_num = dsaf_dev->desc_num;
-	rcb_common->coalesced_frames = HNS_RCB_DEF_COALESCED_FRAMES;
-	rcb_common->timeout = HNS_RCB_MAX_TIME_OUT;
 
 	hns_rcb_get_queue_mode(dsaf_mode, comm_index, &max_vfn, &max_q_per_vf);
 	rcb_common->max_vfn = max_vfn;
@@ -951,6 +931,10 @@ void hns_rcb_get_strings(int stringset, u8 *data, int index)
 void hns_rcb_get_common_regs(struct rcb_common_cb *rcb_com, void *data)
 {
 	u32 *regs = data;
+	bool is_ver1 = AE_IS_VER1(rcb_com->dsaf_dev->dsaf_ver);
+	bool is_dbg = (rcb_com->comm_index != HNS_DSAF_COMM_SERVICE_NW_IDX);
+	u32 reg_tmp;
+	u32 reg_num_tmp;
 	u32 i = 0;
 
 	/*rcb common registers */
@@ -1004,12 +988,16 @@ void hns_rcb_get_common_regs(struct rcb_common_cb *rcb_com, void *data)
 			= dsaf_read_dev(rcb_com, RCB_CFG_PKTLINE_REG + 4 * i);
 	}
 
-	regs[70] = dsaf_read_dev(rcb_com, RCB_CFG_OVERTIME_REG);
-	regs[71] = dsaf_read_dev(rcb_com, RCB_CFG_PKTLINE_INT_NUM_REG);
-	regs[72] = dsaf_read_dev(rcb_com, RCB_CFG_OVERTIME_INT_NUM_REG);
+	reg_tmp = is_ver1 ? RCB_CFG_OVERTIME_REG : RCB_PORT_CFG_OVERTIME_REG;
+	reg_num_tmp = (is_ver1 || is_dbg) ? 1 : 6;
+	for (i = 0; i < reg_num_tmp; i++)
+		regs[70 + i] = dsaf_read_dev(rcb_com, reg_tmp);
+
+	regs[76] = dsaf_read_dev(rcb_com, RCB_CFG_PKTLINE_INT_NUM_REG);
+	regs[77] = dsaf_read_dev(rcb_com, RCB_CFG_OVERTIME_INT_NUM_REG);
 
 	/* mark end of rcb common regs */
-	for (i = 73; i < 80; i++)
+	for (i = 78; i < 80; i++)
 		regs[i] = 0xcccccccc;
 }
 
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
index 81fe9f8..eb61014 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h
@@ -38,7 +38,9 @@ struct rcb_common_cb;
 #define HNS_RCB_MAX_COALESCED_FRAMES		1023
 #define HNS_RCB_MIN_COALESCED_FRAMES		1
 #define HNS_RCB_DEF_COALESCED_FRAMES		50
-#define HNS_RCB_MAX_TIME_OUT			0x500
+#define HNS_RCB_CLK_FREQ_MHZ			350
+#define HNS_RCB_MAX_COALESCED_USECS		0x3ff
+#define HNS_RCB_DEF_COALESCED_USECS		3
 
 #define HNS_RCB_COMMON_ENDIAN			1
 
@@ -82,7 +84,7 @@ struct ring_pair_cb {
 
 	int virq[HNS_RCB_IRQ_NUM_PER_QUEUE];
 
-	u8 port_id_in_dsa;
+	u8 port_id_in_comm;
 	u8 used_by_vf;
 
 	struct hns_ring_hw_stats hw_stats;
@@ -97,8 +99,6 @@ struct rcb_common_cb {
 
 	u8 comm_index;
 	u32 ring_num;
-	u32 coalesced_frames; /* frames  threshold of  rx interrupt   */
-	u32 timeout; /* time threshold of  rx interrupt  */
 	u32 desc_num; /*  desc num per queue*/
 
 	struct ring_pair_cb ring_pair_cb[0];
@@ -125,13 +125,14 @@ void hns_rcbv2_int_clr_hw(struct hnae_queue *q, u32 flag);
 void hns_rcb_init_hw(struct ring_pair_cb *ring);
 void hns_rcb_reset_ring_hw(struct hnae_queue *q);
 void hns_rcb_wait_fbd_clean(struct hnae_queue **qs, int q_num, u32 flag);
-
-u32 hns_rcb_get_coalesced_frames(struct dsaf_device *dsaf_dev, int comm_index);
-u32 hns_rcb_get_coalesce_usecs(struct dsaf_device *dsaf_dev, int comm_index);
-void hns_rcb_set_coalesce_usecs(struct dsaf_device *dsaf_dev,
-				int comm_index, u32 timeout);
-int hns_rcb_set_coalesced_frames(struct dsaf_device *dsaf_dev,
-				 int comm_index, u32 coalesce_frames);
+u32 hns_rcb_get_coalesced_frames(
+	struct rcb_common_cb *rcb_common, u32 port_idx);
+u32 hns_rcb_get_coalesce_usecs(
+	struct rcb_common_cb *rcb_common, u32 port_idx);
+int hns_rcb_set_coalesce_usecs(
+	struct rcb_common_cb *rcb_common, u32 port_idx, u32 timeout);
+int hns_rcb_set_coalesced_frames(
+	struct rcb_common_cb *rcb_common, u32 port_idx, u32 coalesced_frames);
 void hns_rcb_update_stats(struct hnae_queue *queue);
 
 void hns_rcb_get_stats(struct hnae_queue *queue, u64 *data);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
index bf62687..018fa7d 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
@@ -404,6 +404,7 @@
 #define RCB_CFG_OVERTIME_REG			0x9300
 #define RCB_CFG_PKTLINE_INT_NUM_REG		0x9304
 #define RCB_CFG_OVERTIME_INT_NUM_REG		0x9308
+#define RCB_PORT_CFG_OVERTIME_REG		0x9430
 
 #define RCB_RING_RX_RING_BASEADDR_L_REG		0x00000
 #define RCB_RING_RX_RING_BASEADDR_H_REG		0x00004
-- 
1.9.1

^ permalink raw reply related

* [PATCH net-next 0/2] net: hns: add setting coalescing parameters for HNS v2
From: Yisen Zhuang @ 2016-03-28 10:40 UTC (permalink / raw)
  To: davem, yisen.zhuang, salil.mehta, liguozhu, huangdaode, arnd,
	andriy.shevchenko, andrew, geliangtang, ivecera, lisheng011,
	fengguang.wu
  Cc: charles.chenxin, haifeng.wei, netdev, linux-kernel,
	linux-arm-kernel, linuxarm

There are some different REGs about coalescing setting between HNS V2 and
HNS V1. The current HNS driver is only considering the situation for HNS
V1. It needs to support both of them. And Ethtool needs to know if it is
successful to set the parameters as well.

The patchset as below:

>from Lisheng, one fix about setting overtime regs, and the other is about
the return value from HW when setting the parameters.

Lisheng (2):
  net: hns: fixed the setting and getting overtime bug
  net: hns: set-coalesce-usecs returns errno by dsaf.ko

 drivers/net/ethernet/hisilicon/hns/hnae.h         |   2 +-
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c |  64 +++----
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c | 196 ++++++++++------------
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.h |  23 +--
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h |   1 +
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c  |   6 +-
 6 files changed, 133 insertions(+), 159 deletions(-)

-- 
1.9.1

^ permalink raw reply

* [PATCH] openvswitch: Use proper buffer size in nla_memcpy
From: Haishuang Yan @ 2016-03-28 10:08 UTC (permalink / raw)
  To: David S. Miller, Pravin Shelar; +Cc: netdev, linux-kernel, dev, Haishuang Yan

For the input parameter count, it's better to use the size
of destination buffer size, as nla_memcpy would take into
account the length of the source netlink attribute when
a data is copied from an attribute.

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
---
 net/openvswitch/conntrack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index dc5eb29..f8a8d43 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -968,7 +968,8 @@ static int parse_nat(const struct nlattr *attr,
 			break;
 
 		case OVS_NAT_ATTR_IP_MIN:
-			nla_memcpy(&info->range.min_addr, a, nla_len(a));
+			nla_memcpy(&info->range.min_addr, a,
+				   sizeof(info->range.min_addr));
 			info->range.flags |= NF_NAT_RANGE_MAP_IPS;
 			break;
 
-- 
1.8.3.1

^ permalink raw reply related

* Re: [RFC PATCH] tcp: Add SOF_TIMESTAMPING_TX_EOR and allow MSG_EOR in tcp_sendmsg
From: Martin KaFai Lau @ 2016-03-28  5:42 UTC (permalink / raw)
  To: Yuchung Cheng
  Cc: Willem de Bruijn, Network Development, Kernel Team, Eric Dumazet,
	Neal Cardwell, Willem de Bruijn, Soheil Hassas Yeganeh
In-Reply-To: <CAK6E8=ekVGq9E3sZEcZVdSEfTxxZ74W7FdW1YmKS-GFASQbJnw@mail.gmail.com>

On Fri, Mar 25, 2016 at 04:05:51PM -0700, Yuchung Cheng wrote:
> Looks like an interesting and useful patch. Since HTTP2 allows
> multiplexing data stream frames from multiple logical streams on a
> single socket,
> how would you instrument to measure the latency of each stream? e.g.,
>
> sendmsg of data_frame_1_of_stream_a
> sendmsg of data_frame_1_of_stream_b
> sendmsg of data_frame_2_of_stream_a
> sendmsg of data_frame_1_of_stream_c
> sendmsg of data_frame_2_of_stream_b

A quick recall from the end of the commit message:
"One of our use case is at the webserver.  The webserver tracks
the HTTP2 response latency by measuring when the webserver
sends the first byte to the socket till the TCP ACK of the last byte
is received."

It is the server side perception on how long does it take to deliver
the whole response/stream to the client.  Hence, the number of
interleaved streams does not matter.

Some sample use cases are,
comparing TCP sysctl/code changes,
observing encoding/compression impact (e.g. HPACK in HTTP2).

Assuming frame_2 is the end stream for 'a' and 'b':
sendmsg of data_frame_1_of_stream_a
sendmsg of data_frame_1_of_stream_b
sendmsg of data_frame_2_of_stream_a MSG_EOR
sendmsg of data_frame_1_of_stream_c
sendmsg of data_frame_2_of_stream_b MSG_EOR

Are you suggesting other useful ways/metrics should be measured in
this case?

^ permalink raw reply

* Re: [RFC PATCH 7/9] GSO: Support partial segmentation offload
From: Jesse Gross @ 2016-03-28  5:36 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Edward Cree, Linux Kernel Network Developers, David Miller,
	Alexander Duyck, Tom Herbert
In-Reply-To: <20160318232522.14955.13475.stgit@localhost.localdomain>

On Fri, Mar 18, 2016 at 4:25 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
> diff --git a/net/core/dev.c b/net/core/dev.c
> index edb7179bc051..666cf427898b 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2711,6 +2711,19 @@ struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
[...]
> +       /* Only report GSO partial support if it will enable us to
> +        * support segmentation on this frame without needing additional
> +        * work.
> +        */
> +       if (features & NETIF_F_GSO_PARTIAL) {
> +               netdev_features_t partial_features;
> +               struct net_device *dev = skb->dev;
> +
> +               partial_features = dev->features & dev->gso_partial_features;
> +               if (!skb_gso_ok(skb, features | partial_features))
> +                       features &= ~NETIF_F_GSO_PARTIAL;

I think we need to add NETIF_F_GSO_ROBUST into the skb_gso_ok() check
- otherwise packets coming from VMs fail this test and we lose GSO
partial. It's totally safe to expose this feature, since we'll compute
gso_segs anyways.

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index f044f970f1a6..bdcba77e164c 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3281,6 +3291,15 @@ perform_csum_check:
>          */
>         segs->prev = tail;
>
> +       /* Update GSO info on first skb in partial sequence. */
> +       if (partial_segs) {
> +               skb_shinfo(segs)->gso_size = mss / partial_segs;

One small thing: this gso_size is the same as the original MSS, right?
It seems like we could trivially stick it in a local variable and
avoid the extra division.

> +               skb_shinfo(segs)->gso_segs = partial_segs;
> +               skb_shinfo(segs)->gso_type = skb_shinfo(head_skb)->gso_type |
> +                                            SKB_GSO_PARTIAL;

Since we're computing the gso_segs ourselves, it might be nice to
strip out SKB_GSO_DODGY when we set the type.

I just wanted to say that this is really nice work - I was expecting
it to turn out to be really messy and unmaintainable but this is very
clean. Thanks!

^ permalink raw reply

* Re: [RFC PATCH 7/9] GSO: Support partial segmentation offload
From: Jesse Gross @ 2016-03-28  5:35 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Tom Herbert, Alexander Duyck, Edward Cree,
	Linux Kernel Network Developers, David S. Miller
In-Reply-To: <CAKgT0UcbX+Kfa0+B57ZOMfeNgDvAuFhQO=oVw2m4kGyurBqFSQ@mail.gmail.com>

On Wed, Mar 23, 2016 at 7:53 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Wed, Mar 23, 2016 at 6:37 PM, Jesse Gross <jesse@kernel.org> wrote:
>> That being said, I actually think that it is good to have the DF bit
>> on by default for encapsulation headers being added. Unintentional
>> (and perhaps multiple layers of) fragmentation usually results in
>> unuseably bad performance and so it best to try to correct it,
>> hopefully automatically in most cases. And, of course, this is the
>> direction that IPv6 has already gone. If we can assume that this is
>> the most common case then in practice we can keep the outer headers
>> constant for the high performance path.
>
> I'm still weighting my options on how to address the DF issue.  I'm
> not really sure having DF always on for outer headers is the best way
> to go either.  I kind of like the idea that if DF is set for the inner
> headers then we set it for the outer headers.  I just have to see how
> hard something like that would be to implement.

I don't think it would be too hard to implement. We already have code
that copies things like ECN between the inner and outer headers, so it
shouldn't be an issue to add this if that's what we decide is right.

My first concern was that we have a way to turn off the DF bit in the
event that it breaks things. Otherwise, I think it is a performance
win both from the perspective of avoiding fragmentation and allowing
blind copying of headers through TSO by avoiding the need to increment
the ID. In practice, there's probably not too much difference between
turning it on by default and inheriting from the inner frame other
than it might affect what is considered to be the 'fast path'.

>> To me, incrementing the inner IP really seems the best choice. The
>> inner header is most likely someone else's traffic so it best to not
>> mess with that whereas the outer headers are likely ours and we know
>> the parameters for them (and can set the DF bit as we believe is
>> correct). Also, if you are looking forward to the future as far as
>> stacking multiple layers of tunnels, I think the only consistent thing
>> to do is have the inner ID increment and all of the tunnel headers
>> stay fixed - it is hard to justify why the first tunnel header should
>> increment but not the second one. And finally, as a nice bonus, this
>> is what the GRO code has been expecting already so you won't cause any
>> performance regressions with existing systems.
>
> Agreed.  However in the case of the igb and ixgbe incrementing the
> inner isn't really going to work well as it introduces a number of
> issues.  I've considered not enabling GSO partial by default on those
> parts, but leaving it as an available feature.  In the case of i40e we
> will not have any of those issues as both the inner and outer IP IDs
> will increment.
>
> The regressions issue shouldn't be all that big.  In most cases
> tunnels are point to point links.  Odds are if someone is running a
> network they should have similar devices on both ends.  So in the case
> of the Intel drivers I have patched here updating i40e isn't that big
> of a deal since it retains the original GSO behavior.  In the case of
> ixgbe it didn't support any tunnel offloads so there is no GRO support
> without checksums being enabled in the tunnel, and Tx checksum support
> vi HW_CSUM has yet to be pushed upstream from Jeff's tree so things
> are currently throttled by the Tx side.
>
> GSO partial is going to end up being very driver specific in terms of
> what increments and what doesn't as we are having to really hack our
> way around to make it do things the spec sheets say it cannot do.
> Some implementations are going to end up working out better than
> others in terms of interoperability and what can make use of other
> advanced features such as GRO on legacy systems.

I have to admit that I'm a little concerned about the fact that
different drivers will end up doing different things as a result of
this series. Generally speaking, it doesn't seem all that good for
drivers to be setting policy for what packets look like.

That being said, if we are OK with IP IDs jumping around when DF is
set as a result of the first patch, then this shouldn't really be a
problem as the effect is similar. We just obviously need to make sure
that we don't let a packet without DF and and non-incrementing IP IDs
escape out in the wild.

Anyways, I think you've already covered a lot of this ground in the
other thread, so I don't think there is any real reason to rehash it
here.

^ permalink raw reply

* Re: [RFC PATCH 1/9] ipv4/GRO: Allow multiple frames to use the same IP ID
From: Jesse Gross @ 2016-03-28  4:57 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, Edward Cree, Linux Kernel Network Developers,
	David Miller, Tom Herbert
In-Reply-To: <CAKgT0UeH4qEHeaES2k31CNbb3cXD1zxm_vutqEZ_svO2F73b9A@mail.gmail.com>

On Wed, Mar 23, 2016 at 7:21 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Wed, Mar 23, 2016 at 6:43 PM, Jesse Gross <jesse@kernel.org> wrote:
>> On Fri, Mar 18, 2016 at 4:24 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>>> In RFC 6864 it is stated that we can essentially ignore the IPv4 ID field
>>> if we have not and will not use fragmentation.  Such a frame is defined
>>> as having the DF flag set to 1, and the MF and frag_offset as 0.  Currently
>>> for GRO we were requiring that the inner header always have an increasing
>>> IPv4 ID, but we are ignoring the outer value.
>>>
>>> This patch is a first step in trying to reverse some of that.  Specifically
>>> what this patch does is allow us to coalesce frames that have a static IPv4
>>> ID value.  So for example if we had a series of frames where the DF flag
>>> was set we would allow the same IPv4 ID value to be used for all the frames
>>> belonging to that flow.  This would become the standard behavior for TCP so
>>> it would support either a fixed IPv4 ID value, or one in which the value
>>> increments.
>>>
>>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>>
>> One thing that is a bit odd is that the TSO output procedure stays the
>> same. So that means that if we get a stream of packets with the DF bit
>> set and a constant IP ID, aggregate them with GRO, and the retransmit
>> with GSO/TSO then we'll get packets with IDs that increment for each
>> burst and then start back again to the original value. I guess it
>> doesn't matter in practice since the IDs are supposed to be ignored
>> but it does seem a little strange - especially because the new packets
>> will now be violating the rules of the same GRO implementation that
>> produced them.
>
> Yes and no.  The rule for GRO with this patch is that the IP ID has to
> be either incrementing or if DF is set it has the option to be a fixed
> value for a given grouping of packets.  In that regard either GSO
> partial or standard GSO are still both reversible via GRO so that you
> can aggregate to get back to the original frame (ignoring the IP ID)
> that GSO segmented.  The bit I am still trying to work out is what to
> do about the case where we GRO 2 frames out of one GSO segment.  I
> wonder if I should just totally ignore the IP ID value since it ends
> up creating an artificial boundary between the two frames if they are
> segmented using the incrementing GSO versus the fixed IP ID GSO.

Yeah, I agree that it should work in practice, it just seems a bit odd
to have the IP IDs skip around like that. It does also mean that GRO
will no longer be completely, transparently reversible.

I guess that's fine though as long as we fully embrace the idea that
the DF bit means that IP IDs are not used. In that case, it seems best
to allow any ID when DF is set so we are at least self-consistent.

^ permalink raw reply

* Re: [PATCH net v2 2/3] tunnels: Don't apply GRO to multiple layers of encapsulation.
From: Jesse Gross @ 2016-03-28  4:38 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, Linux Kernel Network Developers
In-Reply-To: <CALx6S36Zw+ky+u0XHROV1AE21H=EU5=SEoAj7+AskxLw6b0Odg@mail.gmail.com>

On Sat, Mar 26, 2016 at 12:41 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Sat, Mar 19, 2016 at 9:32 AM, Jesse Gross <jesse@kernel.org> wrote:
>> When drivers express support for TSO of encapsulated packets, they
>> only mean that they can do it for one layer of encapsulation.
>> Supporting additional levels would mean updating, at a minimum,
>> more IP length fields and they are unaware of this.
>>
> This patch completely breaks GRO for FOU and GUE.
>
>> No encapsulation device expresses support for handling offloaded
>> encapsulated packets, so we won't generate these types of frames
>> in the transmit path. However, GRO doesn't have a check for
>> multiple levels of encapsulation and will attempt to build them.
>>
>> UDP tunnel GRO actually does prevent this situation but it only
>> handles multiple UDP tunnels stacked on top of each other. This
>> generalizes that solution to prevent any kind of tunnel stacking
>> that would cause problems.
>>
> GUE and FOU regularly create packets that will be both GSO UDP tunnels
> and IPIP, SIT, GRE, etc. This is by design. There should be no
> ambiguity in the drivers as to what this means. For instance, if
> SKB_GSO_UDP_TUNNEL and SKB_GSO_GRE are set that is GRE/UDP packet, a
> driver can use ndo_features_check to validate.
>
> So multiple levels of encapsulation with GRO is perfectly valid and I
> would suggest to simply revert this patch. The one potential issue we
> could have is the potential for GRO to construct a packet which is a
> UDP-encapsulation inside another encapsulation, like UDP-encap in GRE.
> In this case the GSO flags don't provide enough information to
> distinguish say between GRE/UDP (common case) and UDP/GRE (uncommon
> case). To make this clear we can set udp_mark in GRE, ipip, and sit
> but *not* check for it.

Generally speaking, multiple levels of encapsulation offload are not
valid. I think this is pretty clear from the fact that none of the
encapsulation drivers expose support for encapsulation offloads on
transmit and the drivers supporting NETIF_F_GSO_GRE and
NETIF_F_GSO_UDP_TUNNEL do not mean they can handle GRE in VXLAN.

Asking drivers to assume that this combination of flags means FOU
doesn't seem right to me. To the best of my knowledge, no driver uses
ndo_feature_check to do validation of multiple tunnel offload flags
since the assumption is that the stack will never try to do this.
Since FOU is being treated as only a single level of encapsulation, I
think it would be better to just advertise it that way on transmit
(i.e. only set SKB_GSO_UDP_TUNNEL).

The change that you are suggesting would result in packets generated
by GRO that cannot be handled properly on transmit in some cases and
would likely end up being dropped or malformed. GRO is just an
optimization and correctness must come first so we cannot use it if it
might corrupt traffic.

^ permalink raw reply

* Re: [PATCH] brcmfmac: sdio: remove unused variable retry_limit
From: Julian Calaby @ 2016-03-28  4:16 UTC (permalink / raw)
  To: Colin King
  Cc: Brett Rudley, Arend van Spriel, Franky Lin, Hante Meuleman,
	Kalle Valo, Pieter-Paul Giesberts, Vineet Gupta, Kosuke Tatsukawa,
	linux-wireless, brcm80211 development, netdev,
	linux-kernel@vger.kernel.org
In-Reply-To: <1458495292-17701-1-git-send-email-colin.king@canonical.com>

Hi All,

On Mon, Mar 21, 2016 at 4:34 AM, Colin King <colin.king@canonical.com> wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> retry_limit has never been used during the life of this driver, so
> we may as well remove it as it is redundant.
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Looks right to me.

Reviewed-by: Julian Calaby <julian.calaby@gmail.com>


> ---
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> index 43fd3f4..cd92ba7 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> @@ -535,9 +535,6 @@ static int qcount[NUMPRIO];
>
>  #define RETRYCHAN(chan) ((chan) == SDPCM_EVENT_CHANNEL)
>
> -/* Retry count for register access failures */
> -static const uint retry_limit = 2;
> -
>  /* Limit on rounding up frames */
>  static const uint max_roundup = 512;
>
> --
> 2.7.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Julian Calaby

Email: julian.calaby@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/

^ permalink raw reply

* Re: [PATCH] Drivers: isdn: hisax: isac.c: Fix assignment and check into one expression.
From: David Miller @ 2016-03-28  2:43 UTC (permalink / raw)
  To: gabrielcsmo; +Cc: isdn, netdev, linux-kernel
In-Reply-To: <1458953390-7306-1-git-send-email-gabrielcsmo@gmail.com>

From: Cosmin-Gabriel Samoila <gabrielcsmo@gmail.com>
Date: Sat, 26 Mar 2016 02:49:50 +0200

> Fix variable assignment inside if statement. It is error-prone and hard to read.
> 
> Signed-off-by: Cosmin-Gabriel Samoila <gabrielcsmo@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] ravb: fix software timestamping
From: David Miller @ 2016-03-28  2:42 UTC (permalink / raw)
  To: LinoSanfilippo; +Cc: sergei.shtylyov, netdev
In-Reply-To: <1459074122-6964-1-git-send-email-LinoSanfilippo@gmx.de>

From: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Date: Sun, 27 Mar 2016 12:22:02 +0200

> In ravb_start_xmit dont call skb_tx_timestamp only when hardware
> timestamping is requested: in the latter case software timestamps are
> suppressed and thus the call of skb_tx_timestamp does not have any effect.
> 
> Instead call skb_tx_timestamp unconditionally in ravb_start_xmit, since
> the function checks itself if software timestamping is required or should
> be skipped due to hardware timestamping.
> 
> Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] net: sxgbe: fix error paths in sxgbe_platform_probe()
From: David Miller @ 2016-03-28  2:40 UTC (permalink / raw)
  To: linux; +Cc: netdev, linux-kernel
In-Reply-To: <1459027449-2667-1-git-send-email-linux@rasmusvillemoes.dk>

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Date: Sat, 26 Mar 2016 22:24:09 +0100

> We need to use post-decrement to ensure that irq_dispose_mapping is
> also called on priv->rxq[0]->irq_no; moreover, if one of the above for
> loops failed already at i==0 (so we reach one of these labels with
> that value of i), we'll enter an essentially infinite loop of
> out-of-bounds accesses.
> 
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Applied, thanks.

> David, can you take this directly? Of the three samsung people listed
> by get_maintainer.pl, one address bounces and another informed me
> privately that he's not actually a maintainer of this anymore.

That's awesome, another pump and dump driver submission.

^ permalink raw reply

* Re: [PATCH] net: sxgbe: fix error paths in sxgbe_platform_probe()
From: David Miller @ 2016-03-28  2:39 UTC (permalink / raw)
  To: romieu; +Cc: linux, netdev, linux-kernel
In-Reply-To: <20160327082254.GA10620@electric-eye.fr.zoreil.com>

From: Francois Romieu <romieu@fr.zoreil.com>
Date: Sun, 27 Mar 2016 10:22:54 +0200

> Two years after the initial submission, there is zero change regarding use
> of sxgbe_core_ops for extension or manageability. The extra indirection is
> ripe for removal during next net-next.

I completely agree, that stuff has to go.

^ permalink raw reply

* Re: [PATCH] Fix returned tc and hoplimit values for route with IPv6 encapsulation
From: David Miller @ 2016-03-28  2:35 UTC (permalink / raw)
  To: quentin; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev
In-Reply-To: <1459094771-24749-1-git-send-email-quentin@armitage.org.uk>

From: Quentin Armitage <quentin@armitage.org.uk>
Date: Sun, 27 Mar 2016 17:06:11 +0100

> For a route with IPv6 encapsulation, the traffic class and hop limit
> values are interchanged when returned to userspace by the kernel.
> For example, see below.
> 
>># ip route add 192.168.0.1 dev eth0.2 encap ip6 dst 0x50 tc 0x50 hoplimit 100 table 1000
>># ip route show table 1000
> 192.168.0.1  encap ip6 id 0 src :: dst fe83::1 hoplimit 80 tc 100 dev eth0.2  scope link
> 
> Signed-off-by: Quentin Armitage <quentin@armitage.org.uk>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH TRIVIAL] drivers/net/usb/plusb.c: Fix typo
From: David Miller @ 2016-03-28  2:34 UTC (permalink / raw)
  To: diego.viola-Re5JQEeQqe8AvxtiuMwx3w
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	trivial-DgEjT+Ai2ygdnm+yROfE0A
In-Reply-To: <1459123095-2673-1-git-send-email-diego.viola-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

From: Diego Viola <diego.viola-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Sun, 27 Mar 2016 20:58:15 -0300

> Signed-off-by: Diego Viola <diego.viola-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net] inet: add proper locking in __inet{6}_lookup()
From: David Miller @ 2016-03-28  2:32 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, lorenzo
In-Reply-To: <1458944115.6473.62.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 25 Mar 2016 15:15:15 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> Blocking BH in __inet{6}_lookup() is not correct, as the lookups
> are done using RCU protection.
> 
> It matters only starting from Lorenzo Colitti patches to destroy
> a TCP socket, since rcu_read_lock() is already held by other users
> of these functions.
> 
> This can be backported to all known stable versions, since TCP got RCU
> lookups back in 2.6.29 : Even if iproute2 contained no code to trigger
> the bug, some user programs could have used the API.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Lorenzo Colitti <lorenzo@google.com>

This is quite the maze of RCU locking here.  With this change,
inet_lookup is now:

	rcu_read_lock();
	sk = x(); {
		rcu_read_lock();
		...
		rcu_read_unlock();
	}
	if (!sk) {
		sk = y(); {
			rcu_read_lock();
			...
			rcu_read_unlock();
		}
	}
	rcu_read_unlock();

It would seem to me that we should bubble up the rcu locking calls.

If, as you say, the other users do RCU locking already, then it should
be safe to do what your patch is doing and also remove the RCU locking
done by __inet_lookup_established() and __inet_lookup_listener().

^ permalink raw reply

* [PATCH TRIVIAL] drivers/net/usb/plusb.c: Fix typo
From: Diego Viola @ 2016-03-27 23:58 UTC (permalink / raw)
  To: linux-usb-u79uwXL29TY76Z2rM5mHXA
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	trivial-DgEjT+Ai2ygdnm+yROfE0A, Diego Viola

Signed-off-by: Diego Viola <diego.viola-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/net/usb/plusb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/plusb.c b/drivers/net/usb/plusb.c
index 1bfe0fc..22e1a9a 100644
--- a/drivers/net/usb/plusb.c
+++ b/drivers/net/usb/plusb.c
@@ -38,7 +38,7 @@
  * HEADS UP:  this handshaking isn't all that robust.  This driver
  * gets confused easily if you unplug one end of the cable then
  * try to connect it again; you'll need to restart both ends. The
- * "naplink" software (used by some PlayStation/2 deveopers) does
+ * "naplink" software (used by some PlayStation/2 developers) does
  * the handshaking much better!   Also, sometimes this hardware
  * seems to get wedged under load.  Prolific docs are weak, and
  * don't identify differences between PL2301 and PL2302, much less
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [RFC] r8169: Module parameter for opt-in of ASPM
From: Kast Bernd @ 2016-03-27 22:54 UTC (permalink / raw)
  To: nic_swsd; +Cc: netdev, linux-kernel

Hello,

this patch adds a module parameter in order to activate ASPM.
Basically it reapplies d64ec841517a25f6d468bde9f67e5b4cffdc67c7, which
was reverted as some people reported delayed link status detection and
increased boot times: https://lkml.org/lkml/2013/2/6/372.

The differences to this patch are:
	1) Turned off by default but with module parameter for activation
	2) Flags for aspm and clock request are set after ephy_init
	3) Minor changes because of merging

Motivation for changes compared to previous patch:
	1) Probably the patch wouldn't be merged when it is active by
	default

	2) There are comments, that state: "disable aspm and clock
	request before access ephy", thus I tried to respect that and
	activate it afterwards.
	Perhaps that was even the cause for the problems that let to
	reverting the old patch, but unfortunately couldn't reproduce
	these bugs and therefore wasn't	able to check if it's solved with
	my version.

Remarks:
	1) This patch drops power usage	from 13W to 8W and therefore
	leads to a vastly increased battery life and reduced heat
	dissipation on my notebook. Some people reported similar issues
	on different hardware (for example):
	https://bugzilla.kernel.org/show_bug.cgi?id=72211
	http://www.spinics.net/lists/netdev/msg298949.html
	Especially on newer systems (Haswell) the power savings seem to
	be huge, as the network chip prevents the CPU from entering
	deeper package sleep states (no PC6/PC7 are reached, only PC3).

	2) To benefit from the power savings a reboot is not sufficient.
	At least on my system it has to be powered off completely (cold
	boot). There seem to be no kernel functions to enable a link
	state, there is just pci_disable_link_state to disable low power
	states, thus this problem can't be solved by now.

	3) This patch was tested only on a single system and supports
	only the same cards than the old patch, as I couldn't find any
	documents to read about the registers and flags of newer cards.
	If anybody could send me a link, or tell me that it's save to
	set the same flags on newer cards this patch could probably also
	cover those cards.

Should I add support for newer cards, too?
Is there any chance to enable ASPM by default again?
Which improvements do I need to do in order to get this patch to the
kernel again?

I would appreciate any kind of feedback.

Signed-off-by: Kast Bernd <kastbernd@gmx.de>
---
 drivers/net/ethernet/realtek/r8169.c | 82 ++++++++++++++++++++++++++++++++----
 1 file changed, 74 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 537974c..6a533a4 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -346,6 +346,7 @@ MODULE_DEVICE_TABLE(pci, rtl8169_pci_tbl);
 
 static int rx_buf_sz = 16383;
 static int use_dac;
+static int use_aspm;
 static struct {
 	u32 msg_enable;
 } debug = { -1 };
@@ -509,6 +510,7 @@ enum rtl8168_registers {
 #define PWM_EN				(1 << 22)
 #define RXDV_GATED_EN			(1 << 19)
 #define EARLY_TALLY_EN			(1 << 16)
+#define FORCE_CLK			(1 << 15) /* force clock request */
 };
 
 enum rtl_register_content {
@@ -860,6 +862,8 @@ MODULE_AUTHOR("Realtek and the Linux r8169 crew <netdev@vger.kernel.org>");
 MODULE_DESCRIPTION("RealTek RTL-8169 Gigabit Ethernet driver");
 module_param(use_dac, int, 0);
 MODULE_PARM_DESC(use_dac, "Enable PCI DAC. Unsafe on 32 bit PCI slot.");
+module_param(use_aspm, int, 0);
+MODULE_PARM_DESC(use_aspm, "Enable ASPM power saving. Unsave on some systems");
 module_param_named(debug, debug.msg_enable, int, 0);
 MODULE_PARM_DESC(debug, "Debug verbosity level (0=none, ..., 16=all)");
 MODULE_LICENSE("GPL");
@@ -5899,7 +5903,8 @@ static void rtl_hw_start_8168e_2(struct rtl8169_private *tp)
 
 	RTL_W8(MaxTxPacketSize, EarlySize);
 
-	rtl_disable_clock_request(pdev);
+	if (!use_aspm)
+		rtl_disable_clock_request(pdev);
 
 	RTL_W32(TxConfig, RTL_R32(TxConfig) | TXCFG_AUTO_FIFO);
 	RTL_W8(MCU, RTL_R8(MCU) & ~NOW_IS_OOB);
@@ -5909,7 +5914,13 @@ static void rtl_hw_start_8168e_2(struct rtl8169_private *tp)
 
 	RTL_W8(DLLPR, RTL_R8(DLLPR) | PFM_EN);
 	RTL_W32(MISC, RTL_R32(MISC) | PWM_EN);
-	RTL_W8(Config5, RTL_R8(Config5) & ~Spi_en);
+
+	if (use_aspm) {
+		RTL_W8(Config5, (RTL_R8(Config5) & ~Spi_en) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+	} else {
+		RTL_W8(Config5, RTL_R8(Config5) & ~Spi_en);
+	}
 }
 
 static void rtl_hw_start_8168f(struct rtl8169_private *tp)
@@ -5934,13 +5945,21 @@ static void rtl_hw_start_8168f(struct rtl8169_private *tp)
 
 	RTL_W8(MaxTxPacketSize, EarlySize);
 
-	rtl_disable_clock_request(pdev);
+	if (!use_aspm)
+		rtl_disable_clock_request(pdev);
 
 	RTL_W32(TxConfig, RTL_R32(TxConfig) | TXCFG_AUTO_FIFO);
 	RTL_W8(MCU, RTL_R8(MCU) & ~NOW_IS_OOB);
 	RTL_W8(DLLPR, RTL_R8(DLLPR) | PFM_EN);
-	RTL_W32(MISC, RTL_R32(MISC) | PWM_EN);
-	RTL_W8(Config5, RTL_R8(Config5) & ~Spi_en);
+
+	if (use_aspm) {
+		RTL_W32(MISC, RTL_R32(MISC) | PWM_EN | FORCE_CLK);
+		RTL_W8(Config5, (RTL_R8(Config5) & ~Spi_en) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+	} else {
+		RTL_W32(MISC, RTL_R32(MISC) | PWM_EN);
+		RTL_W8(Config5, RTL_R8(Config5) & ~Spi_en);
+	}
 }
 
 static void rtl_hw_start_8168f_1(struct rtl8169_private *tp)
@@ -5957,6 +5976,12 @@ static void rtl_hw_start_8168f_1(struct rtl8169_private *tp)
 
 	rtl_ephy_init(tp, e_info_8168f_1, ARRAY_SIZE(e_info_8168f_1));
 
+	if (use_aspm) {
+		RTL_W32(MISC, (RTL_R32(MISC) | FORCE_CLK));
+		RTL_W8(Config5, RTL_R8(Config5) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+	}
+
 	rtl_w0w1_eri(tp, 0x0d4, ERIAR_MASK_0011, 0x0c00, 0xff00, ERIAR_EXGMAC);
 
 	/* Adjust EEE LED frequency */
@@ -6031,6 +6056,12 @@ static void rtl_hw_start_8168g_1(struct rtl8169_private *tp)
 	RTL_W8(Config2, RTL_R8(Config2) & ~ClkReqEn);
 	RTL_W8(Config5, RTL_R8(Config5) & ~ASPM_en);
 	rtl_ephy_init(tp, e_info_8168g_1, ARRAY_SIZE(e_info_8168g_1));
+
+	if (use_aspm) {
+		RTL_W8(Config5, RTL_R8(Config5) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+		RTL_W32(MISC, RTL_R32(MISC) | FORCE_CLK);
+	}
 }
 
 static void rtl_hw_start_8168g_2(struct rtl8169_private *tp)
@@ -6049,6 +6080,12 @@ static void rtl_hw_start_8168g_2(struct rtl8169_private *tp)
 	RTL_W8(Config2, RTL_R8(Config2) & ~ClkReqEn);
 	RTL_W8(Config5, RTL_R8(Config5) & ~ASPM_en);
 	rtl_ephy_init(tp, e_info_8168g_2, ARRAY_SIZE(e_info_8168g_2));
+
+	if (use_aspm) {
+		RTL_W8(Config5, RTL_R8(Config5) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+		RTL_W32(MISC, RTL_R32(MISC) | FORCE_CLK);
+	}
 }
 
 static void rtl_hw_start_8411_2(struct rtl8169_private *tp)
@@ -6068,6 +6105,12 @@ static void rtl_hw_start_8411_2(struct rtl8169_private *tp)
 	RTL_W8(Config2, RTL_R8(Config2) & ~ClkReqEn);
 	RTL_W8(Config5, RTL_R8(Config5) & ~ASPM_en);
 	rtl_ephy_init(tp, e_info_8411_2, ARRAY_SIZE(e_info_8411_2));
+
+	if (use_aspm) {
+		RTL_W8(Config5, RTL_R8(Config5) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+		RTL_W32(MISC, RTL_R32(MISC) | FORCE_CLK);
+	}
 }
 
 static void rtl_hw_start_8168h_1(struct rtl8169_private *tp)
@@ -6514,6 +6557,12 @@ static void rtl_hw_start_8105e_1(struct rtl8169_private *tp)
 
 	rtl_ephy_init(tp, e_info_8105e_1, ARRAY_SIZE(e_info_8105e_1));
 
+	if (use_aspm) {
+		RTL_W8(Config5, RTL_R8(Config5) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+		RTL_W32(MISC, RTL_R32(MISC) | FORCE_CLK);
+	}
+
 	rtl_pcie_state_l2l3_enable(tp, false);
 }
 
@@ -6541,6 +6590,12 @@ static void rtl_hw_start_8402(struct rtl8169_private *tp)
 
 	rtl_ephy_init(tp, e_info_8402, ARRAY_SIZE(e_info_8402));
 
+	if (use_aspm) {
+		RTL_W8(Config5, RTL_R8(Config5) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+		RTL_W32(MISC, RTL_R32(MISC) | FORCE_CLK);
+	}
+
 	rtl_tx_performance_tweak(tp->pci_dev, 0x5 << MAX_READ_REQUEST_SHIFT);
 
 	rtl_eri_write(tp, 0xc8, ERIAR_MASK_1111, 0x00000002, ERIAR_EXGMAC);
@@ -6561,7 +6616,16 @@ static void rtl_hw_start_8106(struct rtl8169_private *tp)
 	/* Force LAN exit from ASPM if Rx/Tx are not idle */
 	RTL_W32(FuncEvent, RTL_R32(FuncEvent) | 0x002800);
 
-	RTL_W32(MISC, (RTL_R32(MISC) | DISABLE_LAN_EN) & ~EARLY_TALLY_EN);
+	if (!use_aspm) {
+		RTL_W32(MISC, (RTL_R32(MISC) | DISABLE_LAN_EN) &
+				~EARLY_TALLY_EN);
+	} else {
+		RTL_W32(MISC, (RTL_R32(MISC) | DISABLE_LAN_EN | FORCE_CLK) &
+				~EARLY_TALLY_EN);
+		RTL_W8(Config5, RTL_R8(Config5) | ASPM_en);
+		RTL_W8(Config2, RTL_R8(Config2) | ClkReqEn);
+	}
+
 	RTL_W8(MCU, RTL_R8(MCU) | EN_NDP | EN_OOB_RESET);
 	RTL_W8(DLLPR, RTL_R8(DLLPR) & ~PFM_EN);
 
@@ -8181,8 +8245,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	/* disable ASPM completely as that cause random device stop working
 	 * problems as well as full system hangs for some PCIe devices users */
-	pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
-				     PCIE_LINK_STATE_CLKPM);
+	if (!use_aspm) {
+		pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S
+				| PCIE_LINK_STATE_L1 | PCIE_LINK_STATE_CLKPM);
+	}
 
 	/* enable device (incl. PCI PM wakeup and hotplug setup) */
 	rc = pci_enable_device(pdev);
-- 
2.7.4

^ permalink raw reply related

* Re: BUG: net/netfilter: KASAN: stack-out-of-bounds in tcp_packet
From: Jozsef Kadlecsik @ 2016-03-27 22:25 UTC (permalink / raw)
  To: Baozeng Ding
  Cc: Pablo Neira Ayuso, Patrick McHardy, David Miller, netfilter-devel,
	netdev
In-Reply-To: <alpine.DEB.2.10.1603280010120.19633@blackhole.kfki.hu>

On Mon, 28 Mar 2016, Jozsef Kadlecsik wrote:

> On Sun, 27 Mar 2016, Baozeng Ding wrote:
> 
> > The following program triggers stack-out-of-bounds in tcp_packet. The
> > kernel version is 4.5 (on Mar 16 commit
> > 09fd671ccb2475436bd5f597f751ca4a7d177aea).
> > Uncovered with syzkaller. Thanks.
> > 
> > ==================================================================
> > BUG: KASAN: stack-out-of-bounds in tcp_packet+0x4b77/0x51c0 at addr
> > ffff8800a45df3c8
> > Read of size 1 by task 0327/11132
> > page:ffffea00029177c0 count:0 mapcount:0 mapping:          (null) index:0x0
> > flags: 0x1fffc0000000000()
> > page dumped because: kasan: bad access detected
> > CPU: 1 PID: 11132 Comm: 0327 Tainted: G    B           4.5.0+ #12
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> >  0000000000000001 ffff8800a45df148 ffffffff82945051 ffff8800a45df1d8
> >  ffff8800a45df3c8 0000000000000027 0000000000000001 ffff8800a45df1c8
> >  ffffffff81709f88 ffff8800b4f7e3d0 0000000000000028 0000000000000286
> > Call Trace:
> > [<     inline     >] __dump_stack /kernel/lib/dump_stack.c:15
> > [<ffffffff82945051>] dump_stack+0xb3/0x112 /kernel/lib/dump_stack.c:51
> > [<     inline     >] print_address_description /kernel/mm/kasan/report.c:150
> > [<ffffffff81709f88>] kasan_report_error+0x4f8/0x530
> > /kernel/mm/kasan/report.c:236
> > [<ffffffff84c54b8d>] ? skb_copy_bits+0x49d/0x6d0
> > /kernel/net/core/skbuff.c:1675
> > [<     inline     >] ? spin_lock_bh /kernel/include/linux/spinlock.h:307
> > [<ffffffff84e0e9b9>] ? tcp_packet+0x1c9/0x51c0
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:833
> > [<     inline     >] kasan_report /kernel/mm/kasan/report.c:259
> > [<ffffffff81709ffe>] __asan_report_load1_noabort+0x3e/0x40
> > /kernel/mm/kasan/report.c:277
> > [<     inline     >] ? tcp_sack
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:473
> > [<     inline     >] ? tcp_in_window
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:527
> > [<ffffffff84e13367>] ? tcp_packet+0x4b77/0x51c0
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1036
> > [<     inline     >] tcp_sack
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:473
> > [<     inline     >] tcp_in_window
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:527
> > [<ffffffff84e13367>] tcp_packet+0x4b77/0x51c0
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1036
> > [<ffffffff817094b8>] ? memset+0x28/0x30 /kernel/mm/kasan/kasan.c:302
> > [<ffffffff84e0dd74>] ? tcp_new+0x1a4/0xc20
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1122
> > [<     inline     >] ? build_report /kernel/include/net/netlink.h:499
> > [<ffffffff8518c4d6>] ? xfrm_send_report+0x426/0x450
> > /kernel/net/xfrm/xfrm_user.c:3039
> > [<ffffffff84e0e7f0>] ? tcp_new+0xc20/0xc20
> > /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1169
> > [<ffffffff84dfb03a>] ? init_conntrack+0xca/0x9e0
> > /kernel/net/netfilter/nf_conntrack_core.c:972
> > [<ffffffff84dfaf70>] ? nf_conntrack_alloc+0x40/0x40
> > /kernel/net/netfilter/nf_conntrack_core.c:903
> > [<ffffffff84e0cdf0>] ? tcp_init_net+0x6e0/0x6e0
> > /kernel/include/net/netfilter/nf_conntrack_l4proto.h:137
> > [<ffffffff85121732>] ? ipv4_get_l4proto+0x262/0x390
> > /kernel/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c:89
> > [<ffffffff84df372f>] ? nf_ct_get_tuple+0xaf/0x190
> > /kernel/net/netfilter/nf_conntrack_core.c:197
> > [<ffffffff84dfc23e>] nf_conntrack_in+0x8ee/0x1170
> > /kernel/net/netfilter/nf_conntrack_core.c:1177
> > [<ffffffff84dfb950>] ? init_conntrack+0x9e0/0x9e0
> > /kernel/net/netfilter/nf_conntrack_core.c:287
> > [<ffffffff8512ab06>] ? ipt_do_table+0xa16/0x1260
> > /kernel/net/ipv4/netfilter/ip_tables.c:423
> > [<ffffffff81405ced>] ? trace_hardirqs_on+0xd/0x10
> > /kernel/kernel/locking/lockdep.c:2635
> > [<ffffffff81311fcb>] ? __local_bh_enable_ip+0x6b/0xc0
> > /kernel/kernel/softirq.c:175
> > [<ffffffff8512a0f0>] ? check_entry.isra.4+0x190/0x190
> > /kernel/net/ipv6/netfilter/ip6_tables.c:594
> > [<ffffffff84f9d4e0>] ? ip_reply_glue_bits+0xc0/0xc0
> > /kernel/net/ipv4/ip_output.c:1530
> > [<ffffffff851219ae>] ipv4_conntrack_local+0x14e/0x1a0
> > /kernel/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c:161
> > [<ffffffff85131b3d>] ? iptable_raw_hook+0x9d/0x1e0
> > /kernel/net/ipv4/netfilter/iptable_raw.c:32
> > [<ffffffff84de5b7d>] nf_iterate+0x15d/0x230 /kernel/net/netfilter/core.c:274
> > [<ffffffff84de5c50>] ? nf_iterate+0x230/0x230 /kernel/net/netfilter/core.c:268
> > [<ffffffff84de5dfd>] nf_hook_slow+0x1ad/0x310 /kernel/net/netfilter/core.c:306
> > [<ffffffff84de5c50>] ? nf_iterate+0x230/0x230 /kernel/net/netfilter/core.c:268
> > [<ffffffff84de5c50>] ? nf_iterate+0x230/0x230 /kernel/net/netfilter/core.c:268
> > [<ffffffff82979274>] ? prandom_u32+0x24/0x30 /kernel/lib/random32.c:83
> > [<ffffffff84f747ff>] ? ip_idents_reserve+0x9f/0xf0
> > /kernel/net/ipv4/route.c:484
> > [<     inline     >] nf_hook_thresh /kernel/include/linux/netfilter.h:187
> > [<     inline     >] nf_hook /kernel/include/linux/netfilter.h:197
> > [<ffffffff84fa4f53>] __ip_local_out+0x263/0x3c0
> > /kernel/net/ipv4/ip_output.c:104
> > [<ffffffff84fa4cf0>] ? ip_finish_output+0xd00/0xd00
> > /kernel/include/net/ip.h:322
> > [<ffffffff84fa0230>] ? __ip_flush_pending_frames.isra.45+0x2e0/0x2e0
> > /kernel/net/ipv4/ip_output.c:1337
> > [<ffffffff84faa336>] ? __ip_make_skb+0xfe6/0x1610
> > /kernel/net/ipv4/ip_output.c:1436
> > [<ffffffff84fa50dd>] ip_local_out+0x2d/0x1c0 /kernel/net/ipv4/ip_output.c:113
> > [<ffffffff84faa99c>] ip_send_skb+0x3c/0xc0 /kernel/net/ipv4/ip_output.c:1443
> > [<ffffffff84faaa84>] ip_push_pending_frames+0x64/0x80
> > /kernel/net/ipv4/ip_output.c:1463
> > [<     inline     >] rcu_read_unlock /kernel/include/linux/rcupdate.h:922
> > [<ffffffff8504e10b>] raw_sendmsg+0x17bb/0x25c0
> > /kernel/net/ieee802154/socket.c:53
> > [<ffffffff8504c950>] ? dst_output+0x190/0x190 /kernel/include/net/dst.h:492
> > [<     inline     >] ? trace_mm_page_alloc
> > /kernel/include/trace/events/kmem.h:217
> > [<ffffffff81621609>] ? __alloc_pages_nodemask+0x559/0x16b0
> > /kernel/mm/page_alloc.c:3368
> > [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> > /kernel/kernel/locking/lockdep.c:4104
> > [<ffffffff814c0e30>] ? is_module_text_address+0x10/0x20
> > /kernel/kernel/module.c:4057
> > [<ffffffff81360533>] ? __kernel_text_address+0x73/0xa0
> > /kernel/kernel/extable.c:103
> > [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> > /kernel/kernel/locking/lockdep.c:4104
> > [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> > /kernel/kernel/locking/lockdep.c:4104
> > [<ffffffff81405ced>] ? trace_hardirqs_on+0xd/0x10
> > /kernel/kernel/locking/lockdep.c:2635
> > [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> > /kernel/kernel/locking/lockdep.c:4104
> > [<     inline     >] ? sock_rps_record_flow /kernel/include/net/sock.h:874
> > [<ffffffff85089113>] ? inet_sendmsg+0x73/0x4c0 /kernel/net/ipv4/af_inet.c:729
> > [<     inline     >] ? rcu_read_unlock /kernel/include/linux/rcupdate.h:922
> > [<     inline     >] ? sock_rps_record_flow_hash
> > /kernel/include/net/sock.h:867
> > [<     inline     >] ? sock_rps_record_flow /kernel/include/net/sock.h:874
> > [<ffffffff8508929a>] ? inet_sendmsg+0x1fa/0x4c0 /kernel/net/ipv4/af_inet.c:729
> > [<ffffffff85089395>] inet_sendmsg+0x2f5/0x4c0 /kernel/net/ipv4/af_inet.c:736
> > [<     inline     >] ? sock_rps_record_flow /kernel/include/net/sock.h:874
> > [<ffffffff85089113>] ? inet_sendmsg+0x73/0x4c0 /kernel/net/ipv4/af_inet.c:729
> > [<ffffffff850890a0>] ? inet_recvmsg+0x4a0/0x4a0
> > /kernel/include/linux/compiler.h:222
> > [<     inline     >] sock_sendmsg_nosec /kernel/net/socket.c:611
> > [<ffffffff84c3434a>] sock_sendmsg+0xca/0x110 /kernel/net/socket.c:621
> > [<ffffffff84c35448>] SYSC_sendto+0x208/0x350 /kernel/net/socket.c:1651
> > [<ffffffff84c35240>] ? SYSC_connect+0x2e0/0x2e0 /kernel/net/socket.c:1543
> > [<ffffffff81698650>] ? __pmd_alloc+0x350/0x350 /kernel/mm/memory.c:3928
> > [<ffffffff81230b3b>] ? __do_page_fault+0x2ab/0x8e0
> > /kernel/arch/x86/mm/fault.c:1184
> > [<ffffffff81230c30>] ? __do_page_fault+0x3a0/0x8e0
> > /kernel/arch/x86/mm/fault.c:1271
> > [<ffffffff813fb5da>] ? up_read+0x1a/0x40 /kernel/kernel/locking/rwsem.c:79
> > [<ffffffff81230a29>] ? __do_page_fault+0x199/0x8e0
> > /kernel/arch/x86/mm/fault.c:1187
> > [<ffffffff84c379b0>] SyS_sendto+0x40/0x50 /kernel/net/socket.c:1619
> > [<ffffffff85dab940>] entry_SYSCALL_64_fastpath+0x23/0xc1
> > /kernel/arch/x86/entry/entry_64.S:207
> > Memory state around the buggy address:
> >  ffff8800a45df280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >  ffff8800a45df300: f1 f1 f1 f1 00 00 04 f4 f2 f2 f2 f2 00 00 04 f4
> > > ffff8800a45df380: f2 f2 f2 f2 00 00 00 00 00 f4 f4 f4 f3 f3 f3 f3
> >                                               ^
> >  ffff8800a45df400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >  ffff8800a45df480: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 01 f4 f4 f4
> > ==================================================================
> > 
> > #include <unistd.h>
> > #include <sys/syscall.h>
> > #include <string.h>
> > #include <stdint.h>
> > #include <pthread.h>
> > #include <sys/socket.h>
> > #include <sys/mman.h>
> > #include <netinet/in.h>
> > int main()
> > {
> >         mmap((void *)0x20000000ul, 0x19000ul, PROT_READ|PROT_WRITE,
> > MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0);
> >         int sock = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
> >         int sock_dup = dup(sock);
> >         memcpy((void*)0x2000b000,
> > "\x11\xaf\x7d\x99\x91\x3c\x87\x34\x85\x18\xc4\xd6\xf2\x30\x0a", 15);
> >         *(uint16_t*)0x20002fec = (uint16_t)0x2;
> >         *(uint16_t*)0x20002fee = (uint16_t)0x11ab;
> >         *(uint32_t*)0x20002ff0 = (uint32_t)0x100007f;
> >         sendto(sock_dup, (void *)0x2000b000ul, 0xful, 0x8800ul, (struct
> > sockaddr *)0x20002fe4ul, 0x1cul);
> >         memcpy((void*)0x2001504f,
> > "\x7e\xb1\x52\x5b\x78\x85\x27\xe7\xcc\x3d\xf5\x18\x1b\xba\xda\x97\x6c\x18\x72\x0c\xd2\x0a\xa6\x77\xb7\x8b\xa2\xd2\x1d\xf0\x6b\xf6\x1a\x27\x6b\x98\x3e\x0b\x49\x8d\x54\x6e\x9e\xbb\x21\x4a\x72\x79\x1f\x82\xaf\x89\x2c\xf6\xd3\xc9\xd7\xed\x18\x29\x4d\x2e\x03\x15\xe2\x03\x14\xd0\xac\xa5\x81\x37\x73\x88\xa9\xf5\x08\xe5\xef\x5b\x56\xb7\x18\x8f\xe6\x19\xea\x91\x82\x23\xdd\x2c\x5c\xa5\xf0\xfc\xd8\xe2\x8b\x91\x48\x70\x24\xed\xae\xf9\x06\xac\xc4\x53\x01\xc3\xf5\xa3\x10\xef\xf1\xa6\x2b\xae\x72\xc7\x1a\x02\xee\x78\xcd\xd1\x7e\x8c\x9c\x1a\x36\xc7\xd4\x7c\x82\x64\xf7\x8b\x5a\xb0\x72\xa8\x87\x3c\xdc\xd0\xba\xfe\x70\x7d\x8c\x23\x78\xad\x7c\x31\x04\xec\xab\x1e\x4c\xee\xae\x84\xd8\x1a\x1d\x85\xa5\x57\xa8\x24\x53\x08\x1c\x4f\xda\x49\xe5\x3a\x99\x8c\x29\xa1\xed\x4b\x42\x7a\x15\x48\x2a\x22\x3b\x81\xfe\x47\x7
 4\xc1\x2f\x64\xcf\x10\xd4\x71\x72\x50\x71\xd7\xf6\xb0\xca\x41\x9a\x5e\x3e\xe4\x31\x19\xd1\x19\x46\x20\x66\x4c\x2f\xea\x76\x17\x2d\x94",
> > 232);
> >         *(uint16_t*)0x2001501c = (uint16_t)0xa;
> >         *(uint16_t*)0x2001501e = (uint16_t)0x11ab;
> >         *(uint32_t*)0x20015020 = (uint32_t)0xbdc;
> >         *(uint32_t*)0x20015024 = (uint32_t)0x0;
> >         *(uint32_t*)0x20015028 = (uint32_t)0x0;
> >         *(uint32_t*)0x2001502c = (uint32_t)0x0;
> >         *(uint32_t*)0x20015030 = (uint32_t)0x1000000;
> >         *(uint32_t*)0x20015034 = (uint32_t)0x3;
> >         sendto(sock_dup, (void *)0x2001504ful, 0xe8ul, 0x880ul, (struct
> > sockaddr *)0x20015000ul, 0x1cul);
> >         return 0;
> > }

Actually, in order to fix the non-conntrack case too, I believe the next 
patch is required:

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d4c5115..365f4fb 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3716,6 +3716,8 @@ void tcp_parse_options(const struct sk_buff *skb,
 			length--;
 			continue;
 		default:
+			if (length < 2)
+				return;
 			opsize = *ptr++;
 			if (opsize < 2) /* "silly options" */
 				return;
@@ -3873,6 +3875,8 @@ const u8 *tcp_parse_md5sig_option(const struct tcphdr *th)
 			length--;
 			continue;
 		default:
+			if (length < 2)
+				return;
 			opsize = *ptr++;
 			if (opsize < 2 || opsize > length)
 				return NULL;
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 278f3b9..7cc1d9c 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -410,6 +410,8 @@ static void tcp_options(const struct sk_buff *skb,
 			length--;
 			continue;
 		default:
+			if (length < 2)
+				return;
 			opsize=*ptr++;
 			if (opsize < 2) /* "silly options" */
 				return;
@@ -470,6 +472,8 @@ static void tcp_sack(const struct sk_buff *skb, unsigned int dataoff,
 			length--;
 			continue;
 		default:
+			if (length < 2)
+				return;
 			opsize = *ptr++;
 			if (opsize < 2) /* "silly options" */
 				return;

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply related

* Re: BUG: net/netfilter: KASAN: stack-out-of-bounds in tcp_packet
From: Jozsef Kadlecsik @ 2016-03-27 22:11 UTC (permalink / raw)
  To: Baozeng Ding; +Cc: pablo, kaber, davem, netfilter-devel, netdev
In-Reply-To: <56F7D3AD.10003@gmail.com>

On Sun, 27 Mar 2016, Baozeng Ding wrote:

> The following program triggers stack-out-of-bounds in tcp_packet. The
> kernel version is 4.5 (on Mar 16 commit
> 09fd671ccb2475436bd5f597f751ca4a7d177aea).
> Uncovered with syzkaller. Thanks.
> 
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in tcp_packet+0x4b77/0x51c0 at addr
> ffff8800a45df3c8
> Read of size 1 by task 0327/11132
> page:ffffea00029177c0 count:0 mapcount:0 mapping:          (null) index:0x0
> flags: 0x1fffc0000000000()
> page dumped because: kasan: bad access detected
> CPU: 1 PID: 11132 Comm: 0327 Tainted: G    B           4.5.0+ #12
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>  0000000000000001 ffff8800a45df148 ffffffff82945051 ffff8800a45df1d8
>  ffff8800a45df3c8 0000000000000027 0000000000000001 ffff8800a45df1c8
>  ffffffff81709f88 ffff8800b4f7e3d0 0000000000000028 0000000000000286
> Call Trace:
> [<     inline     >] __dump_stack /kernel/lib/dump_stack.c:15
> [<ffffffff82945051>] dump_stack+0xb3/0x112 /kernel/lib/dump_stack.c:51
> [<     inline     >] print_address_description /kernel/mm/kasan/report.c:150
> [<ffffffff81709f88>] kasan_report_error+0x4f8/0x530
> /kernel/mm/kasan/report.c:236
> [<ffffffff84c54b8d>] ? skb_copy_bits+0x49d/0x6d0
> /kernel/net/core/skbuff.c:1675
> [<     inline     >] ? spin_lock_bh /kernel/include/linux/spinlock.h:307
> [<ffffffff84e0e9b9>] ? tcp_packet+0x1c9/0x51c0
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:833
> [<     inline     >] kasan_report /kernel/mm/kasan/report.c:259
> [<ffffffff81709ffe>] __asan_report_load1_noabort+0x3e/0x40
> /kernel/mm/kasan/report.c:277
> [<     inline     >] ? tcp_sack
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:473
> [<     inline     >] ? tcp_in_window
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:527
> [<ffffffff84e13367>] ? tcp_packet+0x4b77/0x51c0
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1036
> [<     inline     >] tcp_sack
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:473
> [<     inline     >] tcp_in_window
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:527
> [<ffffffff84e13367>] tcp_packet+0x4b77/0x51c0
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1036
> [<ffffffff817094b8>] ? memset+0x28/0x30 /kernel/mm/kasan/kasan.c:302
> [<ffffffff84e0dd74>] ? tcp_new+0x1a4/0xc20
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1122
> [<     inline     >] ? build_report /kernel/include/net/netlink.h:499
> [<ffffffff8518c4d6>] ? xfrm_send_report+0x426/0x450
> /kernel/net/xfrm/xfrm_user.c:3039
> [<ffffffff84e0e7f0>] ? tcp_new+0xc20/0xc20
> /kernel/net/netfilter/nf_conntrack_proto_tcp.c:1169
> [<ffffffff84dfb03a>] ? init_conntrack+0xca/0x9e0
> /kernel/net/netfilter/nf_conntrack_core.c:972
> [<ffffffff84dfaf70>] ? nf_conntrack_alloc+0x40/0x40
> /kernel/net/netfilter/nf_conntrack_core.c:903
> [<ffffffff84e0cdf0>] ? tcp_init_net+0x6e0/0x6e0
> /kernel/include/net/netfilter/nf_conntrack_l4proto.h:137
> [<ffffffff85121732>] ? ipv4_get_l4proto+0x262/0x390
> /kernel/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c:89
> [<ffffffff84df372f>] ? nf_ct_get_tuple+0xaf/0x190
> /kernel/net/netfilter/nf_conntrack_core.c:197
> [<ffffffff84dfc23e>] nf_conntrack_in+0x8ee/0x1170
> /kernel/net/netfilter/nf_conntrack_core.c:1177
> [<ffffffff84dfb950>] ? init_conntrack+0x9e0/0x9e0
> /kernel/net/netfilter/nf_conntrack_core.c:287
> [<ffffffff8512ab06>] ? ipt_do_table+0xa16/0x1260
> /kernel/net/ipv4/netfilter/ip_tables.c:423
> [<ffffffff81405ced>] ? trace_hardirqs_on+0xd/0x10
> /kernel/kernel/locking/lockdep.c:2635
> [<ffffffff81311fcb>] ? __local_bh_enable_ip+0x6b/0xc0
> /kernel/kernel/softirq.c:175
> [<ffffffff8512a0f0>] ? check_entry.isra.4+0x190/0x190
> /kernel/net/ipv6/netfilter/ip6_tables.c:594
> [<ffffffff84f9d4e0>] ? ip_reply_glue_bits+0xc0/0xc0
> /kernel/net/ipv4/ip_output.c:1530
> [<ffffffff851219ae>] ipv4_conntrack_local+0x14e/0x1a0
> /kernel/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c:161
> [<ffffffff85131b3d>] ? iptable_raw_hook+0x9d/0x1e0
> /kernel/net/ipv4/netfilter/iptable_raw.c:32
> [<ffffffff84de5b7d>] nf_iterate+0x15d/0x230 /kernel/net/netfilter/core.c:274
> [<ffffffff84de5c50>] ? nf_iterate+0x230/0x230 /kernel/net/netfilter/core.c:268
> [<ffffffff84de5dfd>] nf_hook_slow+0x1ad/0x310 /kernel/net/netfilter/core.c:306
> [<ffffffff84de5c50>] ? nf_iterate+0x230/0x230 /kernel/net/netfilter/core.c:268
> [<ffffffff84de5c50>] ? nf_iterate+0x230/0x230 /kernel/net/netfilter/core.c:268
> [<ffffffff82979274>] ? prandom_u32+0x24/0x30 /kernel/lib/random32.c:83
> [<ffffffff84f747ff>] ? ip_idents_reserve+0x9f/0xf0
> /kernel/net/ipv4/route.c:484
> [<     inline     >] nf_hook_thresh /kernel/include/linux/netfilter.h:187
> [<     inline     >] nf_hook /kernel/include/linux/netfilter.h:197
> [<ffffffff84fa4f53>] __ip_local_out+0x263/0x3c0
> /kernel/net/ipv4/ip_output.c:104
> [<ffffffff84fa4cf0>] ? ip_finish_output+0xd00/0xd00
> /kernel/include/net/ip.h:322
> [<ffffffff84fa0230>] ? __ip_flush_pending_frames.isra.45+0x2e0/0x2e0
> /kernel/net/ipv4/ip_output.c:1337
> [<ffffffff84faa336>] ? __ip_make_skb+0xfe6/0x1610
> /kernel/net/ipv4/ip_output.c:1436
> [<ffffffff84fa50dd>] ip_local_out+0x2d/0x1c0 /kernel/net/ipv4/ip_output.c:113
> [<ffffffff84faa99c>] ip_send_skb+0x3c/0xc0 /kernel/net/ipv4/ip_output.c:1443
> [<ffffffff84faaa84>] ip_push_pending_frames+0x64/0x80
> /kernel/net/ipv4/ip_output.c:1463
> [<     inline     >] rcu_read_unlock /kernel/include/linux/rcupdate.h:922
> [<ffffffff8504e10b>] raw_sendmsg+0x17bb/0x25c0
> /kernel/net/ieee802154/socket.c:53
> [<ffffffff8504c950>] ? dst_output+0x190/0x190 /kernel/include/net/dst.h:492
> [<     inline     >] ? trace_mm_page_alloc
> /kernel/include/trace/events/kmem.h:217
> [<ffffffff81621609>] ? __alloc_pages_nodemask+0x559/0x16b0
> /kernel/mm/page_alloc.c:3368
> [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> /kernel/kernel/locking/lockdep.c:4104
> [<ffffffff814c0e30>] ? is_module_text_address+0x10/0x20
> /kernel/kernel/module.c:4057
> [<ffffffff81360533>] ? __kernel_text_address+0x73/0xa0
> /kernel/kernel/extable.c:103
> [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> /kernel/kernel/locking/lockdep.c:4104
> [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> /kernel/kernel/locking/lockdep.c:4104
> [<ffffffff81405ced>] ? trace_hardirqs_on+0xd/0x10
> /kernel/kernel/locking/lockdep.c:2635
> [<ffffffff81406260>] ? debug_check_no_locks_freed+0x290/0x290
> /kernel/kernel/locking/lockdep.c:4104
> [<     inline     >] ? sock_rps_record_flow /kernel/include/net/sock.h:874
> [<ffffffff85089113>] ? inet_sendmsg+0x73/0x4c0 /kernel/net/ipv4/af_inet.c:729
> [<     inline     >] ? rcu_read_unlock /kernel/include/linux/rcupdate.h:922
> [<     inline     >] ? sock_rps_record_flow_hash
> /kernel/include/net/sock.h:867
> [<     inline     >] ? sock_rps_record_flow /kernel/include/net/sock.h:874
> [<ffffffff8508929a>] ? inet_sendmsg+0x1fa/0x4c0 /kernel/net/ipv4/af_inet.c:729
> [<ffffffff85089395>] inet_sendmsg+0x2f5/0x4c0 /kernel/net/ipv4/af_inet.c:736
> [<     inline     >] ? sock_rps_record_flow /kernel/include/net/sock.h:874
> [<ffffffff85089113>] ? inet_sendmsg+0x73/0x4c0 /kernel/net/ipv4/af_inet.c:729
> [<ffffffff850890a0>] ? inet_recvmsg+0x4a0/0x4a0
> /kernel/include/linux/compiler.h:222
> [<     inline     >] sock_sendmsg_nosec /kernel/net/socket.c:611
> [<ffffffff84c3434a>] sock_sendmsg+0xca/0x110 /kernel/net/socket.c:621
> [<ffffffff84c35448>] SYSC_sendto+0x208/0x350 /kernel/net/socket.c:1651
> [<ffffffff84c35240>] ? SYSC_connect+0x2e0/0x2e0 /kernel/net/socket.c:1543
> [<ffffffff81698650>] ? __pmd_alloc+0x350/0x350 /kernel/mm/memory.c:3928
> [<ffffffff81230b3b>] ? __do_page_fault+0x2ab/0x8e0
> /kernel/arch/x86/mm/fault.c:1184
> [<ffffffff81230c30>] ? __do_page_fault+0x3a0/0x8e0
> /kernel/arch/x86/mm/fault.c:1271
> [<ffffffff813fb5da>] ? up_read+0x1a/0x40 /kernel/kernel/locking/rwsem.c:79
> [<ffffffff81230a29>] ? __do_page_fault+0x199/0x8e0
> /kernel/arch/x86/mm/fault.c:1187
> [<ffffffff84c379b0>] SyS_sendto+0x40/0x50 /kernel/net/socket.c:1619
> [<ffffffff85dab940>] entry_SYSCALL_64_fastpath+0x23/0xc1
> /kernel/arch/x86/entry/entry_64.S:207
> Memory state around the buggy address:
>  ffff8800a45df280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffff8800a45df300: f1 f1 f1 f1 00 00 04 f4 f2 f2 f2 f2 00 00 04 f4
> > ffff8800a45df380: f2 f2 f2 f2 00 00 00 00 00 f4 f4 f4 f3 f3 f3 f3
>                                               ^
>  ffff8800a45df400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffff8800a45df480: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 01 f4 f4 f4
> ==================================================================
> 
> #include <unistd.h>
> #include <sys/syscall.h>
> #include <string.h>
> #include <stdint.h>
> #include <pthread.h>
> #include <sys/socket.h>
> #include <sys/mman.h>
> #include <netinet/in.h>
> int main()
> {
>         mmap((void *)0x20000000ul, 0x19000ul, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0);
>         int sock = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
>         int sock_dup = dup(sock);
>         memcpy((void*)0x2000b000,
> "\x11\xaf\x7d\x99\x91\x3c\x87\x34\x85\x18\xc4\xd6\xf2\x30\x0a", 15);
>         *(uint16_t*)0x20002fec = (uint16_t)0x2;
>         *(uint16_t*)0x20002fee = (uint16_t)0x11ab;
>         *(uint32_t*)0x20002ff0 = (uint32_t)0x100007f;
>         sendto(sock_dup, (void *)0x2000b000ul, 0xful, 0x8800ul, (struct
> sockaddr *)0x20002fe4ul, 0x1cul);
>         memcpy((void*)0x2001504f,
> "\x7e\xb1\x52\x5b\x78\x85\x27\xe7\xcc\x3d\xf5\x18\x1b\xba\xda\x97\x6c\x18\x72\x0c\xd2\x0a\xa6\x77\xb7\x8b\xa2\xd2\x1d\xf0\x6b\xf6\x1a\x27\x6b\x98\x3e\x0b\x49\x8d\x54\x6e\x9e\xbb\x21\x4a\x72\x79\x1f\x82\xaf\x89\x2c\xf6\xd3\xc9\xd7\xed\x18\x29\x4d\x2e\x03\x15\xe2\x03\x14\xd0\xac\xa5\x81\x37\x73\x88\xa9\xf5\x08\xe5\xef\x5b\x56\xb7\x18\x8f\xe6\x19\xea\x91\x82\x23\xdd\x2c\x5c\xa5\xf0\xfc\xd8\xe2\x8b\x91\x48\x70\x24\xed\xae\xf9\x06\xac\xc4\x53\x01\xc3\xf5\xa3\x10\xef\xf1\xa6\x2b\xae\x72\xc7\x1a\x02\xee\x78\xcd\xd1\x7e\x8c\x9c\x1a\x36\xc7\xd4\x7c\x82\x64\xf7\x8b\x5a\xb0\x72\xa8\x87\x3c\xdc\xd0\xba\xfe\x70\x7d\x8c\x23\x78\xad\x7c\x31\x04\xec\xab\x1e\x4c\xee\xae\x84\xd8\x1a\x1d\x85\xa5\x57\xa8\x24\x53\x08\x1c\x4f\xda\x49\xe5\x3a\x99\x8c\x29\xa1\xed\x4b\x42\x7a\x15\x48\x2a\x22\x3b\x81\xfe\x47\x74\
 xc1\x2f\x64\xcf\x10\xd4\x71\x72\x50\x71\xd7\xf6\xb0\xca\x41\x9a\x5e\x3e\xe4\x31\x19\xd1\x19\x46\x20\x66\x4c\x2f\xea\x76\x17\x2d\x94",
> 232);
>         *(uint16_t*)0x2001501c = (uint16_t)0xa;
>         *(uint16_t*)0x2001501e = (uint16_t)0x11ab;
>         *(uint32_t*)0x20015020 = (uint32_t)0xbdc;
>         *(uint32_t*)0x20015024 = (uint32_t)0x0;
>         *(uint32_t*)0x20015028 = (uint32_t)0x0;
>         *(uint32_t*)0x2001502c = (uint32_t)0x0;
>         *(uint32_t*)0x20015030 = (uint32_t)0x1000000;
>         *(uint32_t*)0x20015034 = (uint32_t)0x3;
>         sendto(sock_dup, (void *)0x2001504ful, 0xe8ul, 0x880ul, (struct
> sockaddr *)0x20015000ul, 0x1cul);
>         return 0;
> }

Please verify that the patch below fixes the issue:

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 278f3b9..7cc1d9c 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -410,6 +410,8 @@ static void tcp_options(const struct sk_buff *skb,
 			length--;
 			continue;
 		default:
+			if (length < 2)
+				return;
 			opsize=*ptr++;
 			if (opsize < 2) /* "silly options" */
 				return;
@@ -470,6 +472,8 @@ static void tcp_sack(const struct sk_buff *skb, unsigned int dataoff,
 			length--;
 			continue;
 		default:
+			if (length < 2)
+				return;
 			opsize = *ptr++;
 			if (opsize < 2) /* "silly options" */
 				return;

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply related

* Re: [PATCH] net: sxgbe: fix error paths in sxgbe_platform_probe()
From: Rasmus Villemoes @ 2016-03-27 21:40 UTC (permalink / raw)
  To: Francois Romieu; +Cc: David Miller, netdev, linux-kernel
In-Reply-To: <20160327082254.GA10620@electric-eye.fr.zoreil.com>

On Sun, Mar 27 2016, Francois Romieu <romieu@fr.zoreil.com> wrote:

> Rasmus Villemoes <linux@rasmusvillemoes.dk> :
>> We need to use post-decrement to ensure that irq_dispose_mapping is
>> also called on priv->rxq[0]->irq_no; moreover, if one of the above for
>> loops failed already at i==0 (so we reach one of these labels with
>> that value of i), we'll enter an essentially infinite loop of
>> out-of-bounds accesses.
>> 
>> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>
> (ok, i is signed)
>
> Reviewed-by: Francois Romieu <romieu@fr.zoreil.com>
>

Thanks for reviewing, but just FTR I want to point out that it doesn't
matter whether i is signed or not in

  while (i--)

However, when i is signed, there's another slightly less popular variant
which is equivalent:

  while (--i >= 0)

(a precondition for their equivalence is that i has a non-negative value
before reaching the while statement).

Neither are equivalent to the almost-always broken

  while (--i)

Rasmus

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox