Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call
From: Wei-Ning Huang @ 2016-03-29  4:47 UTC (permalink / raw)
  To: linux-wireless-u79uwXL29TY76Z2rM5mHXA
  Cc: LKML, Amitkumar Karwar, Nishant Sarmukadam,
	snanda-hpIqsD4AKlfQT0dZR+AlfA, Wei-Ning Huang,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ, netdev-u79uwXL29TY76Z2rM5mHXA

"single skb allocation failure" happens when system is under heavy
memory pressure.  Add __GFP_REPEAT to skb allocation call so kernel
attempts to reclaim pages and retry the allocation.

Signed-off-by: Wei-Ning Huang <wnhuang-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
---
 drivers/net/wireless/marvell/mwifiex/sdio.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index b2c839a..c64989c 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -1124,7 +1124,8 @@ static void mwifiex_deaggr_sdio_pkt(struct mwifiex_adapter *adapter,
 			break;
 		}
 		skb_deaggr = mwifiex_alloc_dma_align_buf(pkt_len,
-							 GFP_KERNEL | GFP_DMA);
+							 GFP_KERNEL | GFP_DMA |
+							 __GFP_REPEAT);
 		if (!skb_deaggr)
 			break;
 		skb_put(skb_deaggr, pkt_len);
@@ -1374,7 +1375,8 @@ static int mwifiex_sdio_card_to_host_mp_aggr(struct mwifiex_adapter *adapter,
 			/* copy pkt to deaggr buf */
 			skb_deaggr = mwifiex_alloc_dma_align_buf(len_arr[pind],
 								 GFP_KERNEL |
-								 GFP_DMA);
+								 GFP_DMA |
+								 __GFP_REPEAT);
 			if (!skb_deaggr) {
 				mwifiex_dbg(adapter, ERROR, "skb allocation failure\t"
 					    "drop pkt len=%d type=%d\n",
@@ -1416,7 +1418,8 @@ rx_curr_single:
 		mwifiex_dbg(adapter, INFO, "info: RX: port: %d, rx_len: %d\n",
 			    port, rx_len);
 
-		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA);
+		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA |
+						  __GFP_REPEAT);
 		if (!skb) {
 			mwifiex_dbg(adapter, ERROR,
 				    "single skb allocated fail,\t"
@@ -1521,7 +1524,8 @@ static int mwifiex_process_int_status(struct mwifiex_adapter *adapter)
 		rx_len = (u16) (rx_blocks * MWIFIEX_SDIO_BLOCK_SIZE);
 		mwifiex_dbg(adapter, INFO, "info: rx_len = %d\n", rx_len);
 
-		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA);
+		skb = mwifiex_alloc_dma_align_buf(rx_len, GFP_KERNEL | GFP_DMA |
+						  __GFP_REPEAT);
 		if (!skb)
 			return -1;
 
-- 
2.8.0.rc3.226.g39d4020

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH 3/4] wcn36xx: Transition driver to SMD client
From: Bjorn Andersson @ 2016-03-29  4:39 UTC (permalink / raw)
  To: Eugene Krasnikov
  Cc: Kalle Valo, Ian Campbell, Mark Rutland, Pawel Moll, Rob Herring,
	yfw, linux-arm-msm,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-wireless, netdev, wcn36xx
In-Reply-To: <CAFSJ42a2mD1gSaarKEny4xp9g_PbcLuv3PLsRnHp+qviv6f6mg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Mon, Jan 11, 2016 at 1:02 AM, Eugene Krasnikov <k.eugene.e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Better late than never! Looks good to me.
>

Unfortunately I ran into an issue with ordering of operations between
the WiFi driver and the wcnss_ctrl driver. So an updated series is on
the way, but this depends on changes to the wcnss_ctrl driver, which
are being reviewed right now.

Regards,
Bjorn

> 2015-12-28 1:34 GMT+00:00 Bjorn Andersson <bjorn-UYDU3/A3LUY@public.gmane.org>:
>> The wcn36xx wifi driver follows the life cycle of the WLAN_CTRL SMD
>> channel, as such it should be a SMD client. This patch makes this
>> transition, now that we have the necessary frameworks available.
>>
>> Signed-off-by: Bjorn Andersson <bjorn.andersson-/MT0OVThwyLZJqsBc5GL+g@public.gmane.org>
>> ---
>>  drivers/net/wireless/ath/wcn36xx/Kconfig   |   2 +-
>>  drivers/net/wireless/ath/wcn36xx/dxe.c     |  16 +++--
>>  drivers/net/wireless/ath/wcn36xx/main.c    | 111 +++++++++++++++++------------
>>  drivers/net/wireless/ath/wcn36xx/smd.c     |  26 ++++---
>>  drivers/net/wireless/ath/wcn36xx/smd.h     |   4 ++
>>  drivers/net/wireless/ath/wcn36xx/wcn36xx.h |  21 ++----
>>  6 files changed, 99 insertions(+), 81 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/wcn36xx/Kconfig b/drivers/net/wireless/ath/wcn36xx/Kconfig
>> index 591ebaea8265..394fe5b77c90 100644
>> --- a/drivers/net/wireless/ath/wcn36xx/Kconfig
>> +++ b/drivers/net/wireless/ath/wcn36xx/Kconfig
>> @@ -1,6 +1,6 @@
>>  config WCN36XX
>>         tristate "Qualcomm Atheros WCN3660/3680 support"
>> -       depends on MAC80211 && HAS_DMA
>> +       depends on MAC80211 && HAS_DMA && QCOM_SMD
>>         ---help---
>>           This module adds support for wireless adapters based on
>>           Qualcomm Atheros WCN3660 and WCN3680 mobile chipsets.
>> diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c
>> index f8dfa05b290a..47f3937a7ab9 100644
>> --- a/drivers/net/wireless/ath/wcn36xx/dxe.c
>> +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c
>> @@ -23,6 +23,7 @@
>>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>
>>  #include <linux/interrupt.h>
>> +#include <linux/soc/qcom/smem_state.h>
>>  #include "wcn36xx.h"
>>  #include "txrx.h"
>>
>> @@ -150,9 +151,12 @@ int wcn36xx_dxe_alloc_ctl_blks(struct wcn36xx *wcn)
>>                 goto out_err;
>>
>>         /* Initialize SMSM state  Clear TX Enable RING EMPTY STATE */
>> -       ret = wcn->ctrl_ops->smsm_change_state(
>> -               WCN36XX_SMSM_WLAN_TX_ENABLE,
>> -               WCN36XX_SMSM_WLAN_TX_RINGS_EMPTY);
>> +       ret = qcom_smem_state_update_bits(wcn->tx_enable_state,
>> +                                         WCN36XX_SMSM_WLAN_TX_ENABLE |
>> +                                         WCN36XX_SMSM_WLAN_TX_RINGS_EMPTY,
>> +                                         WCN36XX_SMSM_WLAN_TX_RINGS_EMPTY);
>> +       if (ret)
>> +               goto out_err;
>>
>>         return 0;
>>
>> @@ -676,9 +680,9 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn,
>>          * notify chip about new frame through SMSM bus.
>>          */
>>         if (is_low &&  vif_priv->pw_state == WCN36XX_BMPS) {
>> -               wcn->ctrl_ops->smsm_change_state(
>> -                                 0,
>> -                                 WCN36XX_SMSM_WLAN_TX_ENABLE);
>> +               qcom_smem_state_update_bits(wcn->tx_rings_empty_state,
>> +                                           WCN36XX_SMSM_WLAN_TX_ENABLE,
>> +                                           WCN36XX_SMSM_WLAN_TX_ENABLE);
>>         } else {
>>                 /* indicate End Of Packet and generate interrupt on descriptor
>>                  * done.
>> diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c
>> index 7c169abdbafe..8659e3f997d2 100644
>> --- a/drivers/net/wireless/ath/wcn36xx/main.c
>> +++ b/drivers/net/wireless/ath/wcn36xx/main.c
>> @@ -19,6 +19,9 @@
>>  #include <linux/module.h>
>>  #include <linux/firmware.h>
>>  #include <linux/platform_device.h>
>> +#include <linux/of_irq.h>
>> +#include <linux/soc/qcom/smd.h>
>> +#include <linux/soc/qcom/smem_state.h>
>>  #include "wcn36xx.h"
>>
>>  unsigned int wcn36xx_dbg_mask;
>> @@ -981,48 +984,63 @@ static int wcn36xx_init_ieee80211(struct wcn36xx *wcn)
>>  }
>>
>>  static int wcn36xx_platform_get_resources(struct wcn36xx *wcn,
>> -                                         struct platform_device *pdev)
>> +                                         struct device *dev)
>>  {
>> -       struct resource *res;
>> +       u32 mmio[2];
>> +       int ret;
>> +
>>         /* Set TX IRQ */
>> -       res = platform_get_resource_byname(pdev, IORESOURCE_IRQ,
>> -                                          "wcnss_wlantx_irq");
>> -       if (!res) {
>> +       wcn->tx_irq = irq_of_parse_and_map(dev->of_node, 0);
>> +       if (!wcn->tx_irq) {
>>                 wcn36xx_err("failed to get tx_irq\n");
>>                 return -ENOENT;
>>         }
>> -       wcn->tx_irq = res->start;
>>
>>         /* Set RX IRQ */
>> -       res = platform_get_resource_byname(pdev, IORESOURCE_IRQ,
>> -                                          "wcnss_wlanrx_irq");
>> -       if (!res) {
>> +       wcn->rx_irq = irq_of_parse_and_map(dev->of_node, 1);
>> +       if (!wcn->rx_irq) {
>>                 wcn36xx_err("failed to get rx_irq\n");
>>                 return -ENOENT;
>>         }
>> -       wcn->rx_irq = res->start;
>> +
>> +       /* Acquire SMSM tx enable handle */
>> +       wcn->tx_enable_state = qcom_smem_state_get(dev, "tx-enable",
>> +                       &wcn->tx_enable_state_bit);
>> +       if (IS_ERR(wcn->tx_enable_state)) {
>> +               wcn36xx_err("failed to get tx-enable state\n");
>> +               return -ENOENT;
>> +       }
>> +
>> +       /* Acquire SMSM tx rings empty handle */
>> +       wcn->tx_rings_empty_state = qcom_smem_state_get(dev, "tx-rings-empty",
>> +                       &wcn->tx_rings_empty_state_bit);
>> +       if (IS_ERR(wcn->tx_rings_empty_state)) {
>> +               wcn36xx_err("failed to get tx-rings-empty state\n");
>> +               return -ENOENT;
>> +       }
>>
>>         /* Map the memory */
>> -       res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
>> -                                                "wcnss_mmio");
>> -       if (!res) {
>> -               wcn36xx_err("failed to get mmio\n");
>> +       ret = of_property_read_u32_array(dev->of_node, "qcom,wcnss-mmio", mmio, 2);
>> +       if (ret) {
>> +               wcn36xx_err("failed to get qcom,wcnss-mmio\n");
>>                 return -ENOENT;
>>         }
>> -       wcn->mmio = ioremap(res->start, resource_size(res));
>> +
>> +       wcn->mmio = ioremap(mmio[0], mmio[1]);
>>         if (!wcn->mmio) {
>>                 wcn36xx_err("failed to map io memory\n");
>>                 return -ENOMEM;
>>         }
>> +
>>         return 0;
>>  }
>>
>> -static int wcn36xx_probe(struct platform_device *pdev)
>> +static int wcn36xx_probe(struct qcom_smd_device *sdev)
>>  {
>>         struct ieee80211_hw *hw;
>>         struct wcn36xx *wcn;
>>         int ret;
>> -       u8 addr[ETH_ALEN];
>> +       const u8 *addr;
>>
>>         wcn36xx_dbg(WCN36XX_DBG_MAC, "platform probe\n");
>>
>> @@ -1032,20 +1050,27 @@ static int wcn36xx_probe(struct platform_device *pdev)
>>                 ret = -ENOMEM;
>>                 goto out_err;
>>         }
>> -       platform_set_drvdata(pdev, hw);
>> +       dev_set_drvdata(&sdev->dev, hw);
>>         wcn = hw->priv;
>>         wcn->hw = hw;
>> -       wcn->dev = &pdev->dev;
>> -       wcn->ctrl_ops = pdev->dev.platform_data;
>> +       wcn->dev = &sdev->dev;
>> +       wcn->smd_channel = sdev->channel;
>>
>>         mutex_init(&wcn->hal_mutex);
>>
>> -       if (!wcn->ctrl_ops->get_hw_mac(addr)) {
>> +       sdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
>> +
>> +       addr = of_get_property(sdev->dev.of_node, "local-mac-address", &ret);
>> +       if (addr && ret != ETH_ALEN) {
>> +               wcn36xx_err("invalid local-mac-address\n");
>> +               ret = -EINVAL;
>> +               goto out_err;
>> +       } else if (addr) {
>>                 wcn36xx_info("mac address: %pM\n", addr);
>>                 SET_IEEE80211_PERM_ADDR(wcn->hw, addr);
>>         }
>>
>> -       ret = wcn36xx_platform_get_resources(wcn, pdev);
>> +       ret = wcn36xx_platform_get_resources(wcn, &sdev->dev);
>>         if (ret)
>>                 goto out_wq;
>>
>> @@ -1063,9 +1088,10 @@ out_wq:
>>  out_err:
>>         return ret;
>>  }
>> -static int wcn36xx_remove(struct platform_device *pdev)
>> +
>> +static void wcn36xx_remove(struct qcom_smd_device *sdev)
>>  {
>> -       struct ieee80211_hw *hw = platform_get_drvdata(pdev);
>> +       struct ieee80211_hw *hw = dev_get_drvdata(&sdev->dev);
>>         struct wcn36xx *wcn = hw->priv;
>>         wcn36xx_dbg(WCN36XX_DBG_MAC, "platform remove\n");
>>
>> @@ -1073,41 +1099,34 @@ static int wcn36xx_remove(struct platform_device *pdev)
>>         mutex_destroy(&wcn->hal_mutex);
>>
>>         ieee80211_unregister_hw(hw);
>> +
>> +       qcom_smem_state_put(wcn->tx_enable_state);
>> +       qcom_smem_state_put(wcn->tx_rings_empty_state);
>> +
>>         iounmap(wcn->mmio);
>>         ieee80211_free_hw(hw);
>> -
>> -       return 0;
>>  }
>> -static const struct platform_device_id wcn36xx_platform_id_table[] = {
>> -       {
>> -               .name = "wcn36xx",
>> -               .driver_data = 0
>> -       },
>> +
>> +static const struct of_device_id wcn36xx_of_match[] = {
>> +       { .compatible = "qcom,wcn3620-wlan" },
>> +       { .compatible = "qcom,wcn3660-wlan" },
>> +       { .compatible = "qcom,wcn3680-wlan" },
>>         {}
>>  };
>> -MODULE_DEVICE_TABLE(platform, wcn36xx_platform_id_table);
>> +MODULE_DEVICE_TABLE(of, wcn36xx_of_match);
>>
>> -static struct platform_driver wcn36xx_driver = {
>> +static struct qcom_smd_driver wcn36xx_driver = {
>>         .probe      = wcn36xx_probe,
>>         .remove     = wcn36xx_remove,
>> +       .callback = wcn36xx_smd_rsp_process,
>>         .driver         = {
>>                 .name   = "wcn36xx",
>> +               .of_match_table = wcn36xx_of_match,
>> +               .owner  = THIS_MODULE,
>>         },
>> -       .id_table    = wcn36xx_platform_id_table,
>>  };
>>
>> -static int __init wcn36xx_init(void)
>> -{
>> -       platform_driver_register(&wcn36xx_driver);
>> -       return 0;
>> -}
>> -module_init(wcn36xx_init);
>> -
>> -static void __exit wcn36xx_exit(void)
>> -{
>> -       platform_driver_unregister(&wcn36xx_driver);
>> -}
>> -module_exit(wcn36xx_exit);
>> +module_qcom_smd_driver(wcn36xx_driver);
>>
>>  MODULE_LICENSE("Dual BSD/GPL");
>>  MODULE_AUTHOR("Eugene Krasnikov k.eugene.e-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org");
>> diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c
>> index 4307429740a9..2bf42787a4af 100644
>> --- a/drivers/net/wireless/ath/wcn36xx/smd.c
>> +++ b/drivers/net/wireless/ath/wcn36xx/smd.c
>> @@ -19,6 +19,7 @@
>>  #include <linux/etherdevice.h>
>>  #include <linux/firmware.h>
>>  #include <linux/bitops.h>
>> +#include <linux/soc/qcom/smd.h>
>>  #include "smd.h"
>>
>>  struct wcn36xx_cfg_val {
>> @@ -253,7 +254,7 @@ static int wcn36xx_smd_send_and_wait(struct wcn36xx *wcn, size_t len)
>>
>>         init_completion(&wcn->hal_rsp_compl);
>>         start = jiffies;
>> -       ret = wcn->ctrl_ops->tx(wcn->hal_buf, len);
>> +       ret = qcom_smd_send(wcn->smd_channel, wcn->hal_buf, len);
>>         if (ret) {
>>                 wcn36xx_err("HAL TX failed\n");
>>                 goto out;
>> @@ -2100,9 +2101,13 @@ out:
>>         mutex_unlock(&wcn->hal_mutex);
>>         return ret;
>>  }
>> -static void wcn36xx_smd_rsp_process(struct wcn36xx *wcn, void *buf, size_t len)
>> +
>> +int wcn36xx_smd_rsp_process(struct qcom_smd_device *sdev,
>> +                           const void *buf, size_t len)
>>  {
>> -       struct wcn36xx_hal_msg_header *msg_header = buf;
>> +       const struct wcn36xx_hal_msg_header *msg_header = buf;
>> +       struct ieee80211_hw *hw = dev_get_drvdata(&sdev->dev);
>> +       struct wcn36xx *wcn = hw->priv;
>>         struct wcn36xx_hal_ind_msg *msg_ind;
>>         wcn36xx_dbg_dump(WCN36XX_DBG_SMD_DUMP, "SMD <<< ", buf, len);
>>
>> @@ -2151,7 +2156,7 @@ static void wcn36xx_smd_rsp_process(struct wcn36xx *wcn, void *buf, size_t len)
>>         case WCN36XX_HAL_OTA_TX_COMPL_IND:
>>         case WCN36XX_HAL_MISSED_BEACON_IND:
>>         case WCN36XX_HAL_DELETE_STA_CONTEXT_IND:
>> -               msg_ind = kmalloc(sizeof(*msg_ind) + len, GFP_KERNEL);
>> +               msg_ind = kmalloc(sizeof(*msg_ind) + len, GFP_ATOMIC);
>>                 if (!msg_ind) {
>>                         /*
>>                          * FIXME: Do something smarter then just
>> @@ -2175,6 +2180,8 @@ static void wcn36xx_smd_rsp_process(struct wcn36xx *wcn, void *buf, size_t len)
>>                 wcn36xx_err("SMD_EVENT (%d) not supported\n",
>>                               msg_header->msg_type);
>>         }
>> +
>> +       return 0;
>>  }
>>  static void wcn36xx_ind_smd_work(struct work_struct *work)
>>  {
>> @@ -2232,22 +2239,13 @@ int wcn36xx_smd_open(struct wcn36xx *wcn)
>>         INIT_LIST_HEAD(&wcn->hal_ind_queue);
>>         spin_lock_init(&wcn->hal_ind_lock);
>>
>> -       ret = wcn->ctrl_ops->open(wcn, wcn36xx_smd_rsp_process);
>> -       if (ret) {
>> -               wcn36xx_err("failed to open control channel\n");
>> -               goto free_wq;
>> -       }
>> -
>> -       return ret;
>> +       return 0;
>>
>> -free_wq:
>> -       destroy_workqueue(wcn->hal_ind_wq);
>>  out:
>>         return ret;
>>  }
>>
>>  void wcn36xx_smd_close(struct wcn36xx *wcn)
>>  {
>> -       wcn->ctrl_ops->close();
>>         destroy_workqueue(wcn->hal_ind_wq);
>>  }
>> diff --git a/drivers/net/wireless/ath/wcn36xx/smd.h b/drivers/net/wireless/ath/wcn36xx/smd.h
>> index 21cc4ac7b5ca..4d96d56d266f 100644
>> --- a/drivers/net/wireless/ath/wcn36xx/smd.h
>> +++ b/drivers/net/wireless/ath/wcn36xx/smd.h
>> @@ -60,6 +60,7 @@ struct wcn36xx_hal_ind_msg {
>>  };
>>
>>  struct wcn36xx;
>> +struct qcom_smd_device;
>>
>>  int wcn36xx_smd_open(struct wcn36xx *wcn);
>>  void wcn36xx_smd_close(struct wcn36xx *wcn);
>> @@ -136,4 +137,7 @@ int wcn36xx_smd_del_ba(struct wcn36xx *wcn, u16 tid, u8 sta_index);
>>  int wcn36xx_smd_trigger_ba(struct wcn36xx *wcn, u8 sta_index);
>>
>>  int wcn36xx_smd_update_cfg(struct wcn36xx *wcn, u32 cfg_id, u32 value);
>> +
>> +int wcn36xx_smd_rsp_process(struct qcom_smd_device *sdev,
>> +                           const void *buf, size_t len);
>>  #endif /* _SMD_H_ */
>> diff --git a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h
>> index 7997cc1312ee..4be81f66e167 100644
>> --- a/drivers/net/wireless/ath/wcn36xx/wcn36xx.h
>> +++ b/drivers/net/wireless/ath/wcn36xx/wcn36xx.h
>> @@ -103,19 +103,6 @@ struct nv_data {
>>         u8      table;
>>  };
>>
>> -/* Interface for platform control path
>> - *
>> - * @open: hook must be called when wcn36xx wants to open control channel.
>> - * @tx: sends a buffer.
>> - */
>> -struct wcn36xx_platform_ctrl_ops {
>> -       int (*open)(void *drv_priv, void *rsp_cb);
>> -       void (*close)(void);
>> -       int (*tx)(char *buf, size_t len);
>> -       int (*get_hw_mac)(u8 *addr);
>> -       int (*smsm_change_state)(u32 clear_mask, u32 set_mask);
>> -};
>> -
>>  /**
>>   * struct wcn36xx_vif - holds VIF related fields
>>   *
>> @@ -204,7 +191,13 @@ struct wcn36xx {
>>         int                     rx_irq;
>>         void __iomem            *mmio;
>>
>> -       struct wcn36xx_platform_ctrl_ops *ctrl_ops;
>> +       struct qcom_smd_channel *smd_channel;
>> +
>> +       struct qcom_smem_state  *tx_enable_state;
>> +       unsigned                tx_enable_state_bit;
>> +       struct qcom_smem_state  *tx_rings_empty_state;
>> +       unsigned                tx_rings_empty_state_bit;
>> +
>>         /*
>>          * smd_buf must be protected with smd_mutex to garantee
>>          * that all messages are sent one after another
>> --
>> 2.5.0
>>
>
>
>
> --
> Best regards,
> Eugene
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC PATCH] tcp: Add SOF_TIMESTAMPING_TX_EOR and allow MSG_EOR in tcp_sendmsg
From: Yuchung Cheng @ 2016-03-29  4:31 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Willem de Bruijn, Network Development, Kernel Team, Eric Dumazet,
	Neal Cardwell, Willem de Bruijn, Soheil Hassas Yeganeh
In-Reply-To: <20160328054210.GA30968@DHCP.thefacebook.com>

On Sun, Mar 27, 2016 at 10:42 PM, Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Fri, Mar 25, 2016 at 04:05:51PM -0700, Yuchung Cheng wrote:
> > Looks like an interesting and useful patch. Since HTTP2 allows
> > multiplexing data stream frames from multiple logical streams on a
> > single socket,
> > how would you instrument to measure the latency of each stream? e.g.,
> >
> > sendmsg of data_frame_1_of_stream_a
> > sendmsg of data_frame_1_of_stream_b
> > sendmsg of data_frame_2_of_stream_a
> > sendmsg of data_frame_1_of_stream_c
> > sendmsg of data_frame_2_of_stream_b
>
> A quick recall from the end of the commit message:
> "One of our use case is at the webserver.  The webserver tracks
> the HTTP2 response latency by measuring when the webserver
> sends the first byte to the socket till the TCP ACK of the last byte
> is received."
>
> It is the server side perception on how long does it take to deliver
> the whole response/stream to the client.  Hence, the number of
> interleaved streams does not matter.
>
> Some sample use cases are,
> comparing TCP sysctl/code changes,
> observing encoding/compression impact (e.g. HPACK in HTTP2).
>
> Assuming frame_2 is the end stream for 'a' and 'b':
> sendmsg of data_frame_1_of_stream_a
> sendmsg of data_frame_1_of_stream_b
> sendmsg of data_frame_2_of_stream_a MSG_EOR
> sendmsg of data_frame_1_of_stream_c
> sendmsg of data_frame_2_of_stream_b MSG_EOR
>
> Are you suggesting other useful ways/metrics should be measured in
> this case?
No this is what I have in mind but wanted to confirm. Thanks.

^ permalink raw reply

* Re: [PATCH net] team: team should sync the port's uc/mc addrs when add a port
From: David Miller @ 2016-03-29  4:22 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev, jiri, marcelo.leitner
In-Reply-To: <320fc71dbbf5ed164ca506cf5750b7bf4e048d44.1459183351.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Tue, 29 Mar 2016 00:42:31 +0800

> There is an issue when we use mavtap over team:
> When we replug nic links from team0, the real nics's mc list will not
> include the maddr for macvtap any more. then we can't receive pkts to
> macvtap device, as they are filterred by mc list of nic.
> 
> In Bonding Driver, it syncs the uc/mc addrs in bond_enslave().
> 
> We will fix this issue on team by adding the port's uc/mc addrs sync in
> team_port_add.
> 
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied, thank you.

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Alex Duyck @ 2016-03-29  4:15 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Alexander Duyck, Jesse Gross, Linux Kernel Network Developers,
	David S. Miller
In-Reply-To: <CALx6S37xUjQpxYdskm0Lt33MQr6Jwwg1eGjZYnadX8762EuxBg@mail.gmail.com>

On Mon, Mar 28, 2016 at 9:01 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Mon, Mar 28, 2016 at 8:27 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Mon, Mar 28, 2016 at 8:17 PM, Tom Herbert <tom@herbertland.com> wrote:
>>> On Mon, Mar 28, 2016 at 6:54 PM, Jesse Gross <jesse@kernel.org> wrote:
>>>> On Mon, Mar 28, 2016 at 6:24 PM, Tom Herbert <tom@herbertland.com> wrote:
>>>>> On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>>>>>> This patch should fix the issues seen with a recent fix to prevent
>>>>>> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
>>>>>> correct for now as long as we do not add any devices that support
>>>>>> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
>>>>>> potential to mess things up due to the fact that the outer transport header
>>>>>> points to the outer UDP header and not the GRE header as would be expected.
>>>>>>
>>>>>> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
>>>>>
>>>>> This could only fix FOU/GUE. It is very possible someone else could
>>>>> happily be doing some other layered encapsulation and never had a
>>>>> problem before, so the decision to start enforcing only a single layer
>>>>> of encapsulation for GRO would still break them. I still think we
>>>>> should revert the patch, and for next version fixes things to that any
>>>>> combination/nesting of encapsulation is supported, and if there are
>>>>> exceptions to that support they need be clearly documented.
>>>>
>>>> It was pointed out to me that prior to my patch, it was also possible
>>>> to remotely cause a stack overflow by filling up a packet with tunnel
>>>> headers and letting GRO descend through them over and over again.
>>>>
>>> Then the fix would be set set a reasonable limit on the number of
>>> encapsulation levels.
>>>
>>>> Tom, I'm sorry that you don't like how I fixed this issue but there
>>>> really, truly is a bug here. I gave you a specific example to be clear
>>>> but that doesn't mean that is the only case. I am aware that the bug
>>>> is not encountered in all situations and that the fix removes an
>>>> optimization in some of those but I think that ensuring correct
>>>> behavior must come first.
>>>
>>> The example you gave results in packet loss, this is not
>>> incorrectness. Actually reproduce a real issue that leads to
>>> incorrectness and then we can talk about a solution.
>>
>> Tom,
>>
>> Just take a look in the __skb_udp_tunnel_segment or gre_gso_segment
>> code.  Then tell me how we are supposed to deal with the fact that the
>> GSO code expects skb_inner_network_offset() to be valid.  If you have
>> more than an inner and an outer network header we cannot.  So we
>> cannot put GRE in UDP, or UDP in GRE if there is a network header
>> between them.  The FOU/GUE code gets around this because in the IPIP
>> and SIT cases you are adding an L4 header between two L3 headers.  The
>> GRE case works because you essentially convert the GRE header into a
>> tunnel header like VXLAN or GENEVE and we just overwrite the outer
>> transport header offset.
>>
>> What it comes down to is that we can only support 2 network headers
>> per frame.  One for the inner and one for the outer.  That is why we
>> can have an exception for GUE as it only has 2 network headers.  If we
>> had multiple levels of UDP, or GRE, or 2 levels of network headers
>> either before or after either UDP or GRE we cannot support
>> segmentation because the code will blow up and generate a malformed
>> frame.
>>
> If you apply Edward's jumbo L2 header concept then
> Eth|IP|UDP|VXLAN|Eth|IP|UDP|GUE|GRE|IPIP|IPv6|TCP|payload becomes
> Eth|IP|UDP|encapsulation-hdrs|IPv6|TCP|Payload. One set of outer
> headers, one set of inner headers. The rules that encapsulation_hdrs
> don't contain fields that need to be modified for every segment need
> to be supported in GRO and the stack when it generates such a
> configuration.

Thats all well and good but nothing like that exists now.  So you
cannot expect us to fix the kernel to support code that isn't there.
In addition there were a number of issues with the jumbo L2 header
approach.  That was one of the reasons why I went with the jumbo L3
header approach as it is much easier to be compliant with all the
RFCs.

We might be able to get some of that supported for net-next but things
are going to be limited.  We need to have the UDP tunnels actually
setting the DF bit which as far as I know none of them do now.  In
addition we will have to add rules for all the encapsulated types so
that we can enforce the outer IP header incrementing in the event that
DF is not set.  Then we will also have to go through and make certain
that we have the DF bit set in all headers between the transport and
the outer network header in order to allow support for GSO partial.

What you are describing is no small task.  There are bugs that need to
be fixed now in net.  We can try to get the features you want pushed
for net-next but they don't exist now so locking down GRO so that it
matches the feature set provided by GSO is not a regression.

- Alex

^ permalink raw reply

* Re: [PATCH net 0/4] misc. small fixes.
From: David Miller @ 2016-03-29  4:10 UTC (permalink / raw)
  To: michael.chan; +Cc: netdev
In-Reply-To: <1459208767-4225-1-git-send-email-michael.chan@broadcom.com>

From: Michael Chan <michael.chan@broadcom.com>
Date: Mon, 28 Mar 2016 19:46:03 -0400

> Misc. small fixes for net.

Series applied, thanks Michael.

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Tom Herbert @ 2016-03-29  4:01 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jesse Gross, Alexander Duyck, Linux Kernel Network Developers,
	David S. Miller
In-Reply-To: <CAKgT0UeszLNiv=hjeD50pHRB7rgT7qWzbvVAj690jkepVQC7Xg@mail.gmail.com>

On Mon, Mar 28, 2016 at 8:27 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Mon, Mar 28, 2016 at 8:17 PM, Tom Herbert <tom@herbertland.com> wrote:
>> On Mon, Mar 28, 2016 at 6:54 PM, Jesse Gross <jesse@kernel.org> wrote:
>>> On Mon, Mar 28, 2016 at 6:24 PM, Tom Herbert <tom@herbertland.com> wrote:
>>>> On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>>>>> This patch should fix the issues seen with a recent fix to prevent
>>>>> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
>>>>> correct for now as long as we do not add any devices that support
>>>>> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
>>>>> potential to mess things up due to the fact that the outer transport header
>>>>> points to the outer UDP header and not the GRE header as would be expected.
>>>>>
>>>>> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
>>>>
>>>> This could only fix FOU/GUE. It is very possible someone else could
>>>> happily be doing some other layered encapsulation and never had a
>>>> problem before, so the decision to start enforcing only a single layer
>>>> of encapsulation for GRO would still break them. I still think we
>>>> should revert the patch, and for next version fixes things to that any
>>>> combination/nesting of encapsulation is supported, and if there are
>>>> exceptions to that support they need be clearly documented.
>>>
>>> It was pointed out to me that prior to my patch, it was also possible
>>> to remotely cause a stack overflow by filling up a packet with tunnel
>>> headers and letting GRO descend through them over and over again.
>>>
>> Then the fix would be set set a reasonable limit on the number of
>> encapsulation levels.
>>
>>> Tom, I'm sorry that you don't like how I fixed this issue but there
>>> really, truly is a bug here. I gave you a specific example to be clear
>>> but that doesn't mean that is the only case. I am aware that the bug
>>> is not encountered in all situations and that the fix removes an
>>> optimization in some of those but I think that ensuring correct
>>> behavior must come first.
>>
>> The example you gave results in packet loss, this is not
>> incorrectness. Actually reproduce a real issue that leads to
>> incorrectness and then we can talk about a solution.
>
> Tom,
>
> Just take a look in the __skb_udp_tunnel_segment or gre_gso_segment
> code.  Then tell me how we are supposed to deal with the fact that the
> GSO code expects skb_inner_network_offset() to be valid.  If you have
> more than an inner and an outer network header we cannot.  So we
> cannot put GRE in UDP, or UDP in GRE if there is a network header
> between them.  The FOU/GUE code gets around this because in the IPIP
> and SIT cases you are adding an L4 header between two L3 headers.  The
> GRE case works because you essentially convert the GRE header into a
> tunnel header like VXLAN or GENEVE and we just overwrite the outer
> transport header offset.
>
> What it comes down to is that we can only support 2 network headers
> per frame.  One for the inner and one for the outer.  That is why we
> can have an exception for GUE as it only has 2 network headers.  If we
> had multiple levels of UDP, or GRE, or 2 levels of network headers
> either before or after either UDP or GRE we cannot support
> segmentation because the code will blow up and generate a malformed
> frame.
>
If you apply Edward's jumbo L2 header concept then
Eth|IP|UDP|VXLAN|Eth|IP|UDP|GUE|GRE|IPIP|IPv6|TCP|payload becomes
Eth|IP|UDP|encapsulation-hdrs|IPv6|TCP|Payload. One set of outer
headers, one set of inner headers. The rules that encapsulation_hdrs
don't contain fields that need to be modified for every segment need
to be supported in GRO and the stack when it generates such a
configuration.

> - Alex

^ permalink raw reply

* Re: am335x: no multicast reception over VLAN
From: Mugunthan V N @ 2016-03-29  4:00 UTC (permalink / raw)
  To: Yegor Yefremov, netdev
  Cc: linux-omap@vger.kernel.org, drivshin, grygorii.strashko
In-Reply-To: <CAGm1_ktOqB4QBMy2G4XeHTkaP+WvG8GOc8maVv3dDoxM+-F+ZA@mail.gmail.com>

Hi Yegor

On Wednesday 16 March 2016 08:35 PM, Yegor Yefremov wrote:
> I have an am335x based board using CPSW in Dual EMAC mode. Without
> VLAN IDs I can receive and send multicast packets [1]. When I create
> VLAN ID:
> 
> ip link add link eth1 name eth1.100 type vlan id 100
> ip addr add 192.168.100.2/24 brd 192.168.100.255 dev eth1.100
> route add -net 224.0.0.0 netmask 224.0.0.0 eth1.100
> 
> I can successfully send multicast packets, but not receive them. On
> the other side of the Ethernet cable I've used Pandaboard. Pandaboard
> could both receive and send multicast packets via VLAN.

Are you trying multicast tx/rx on eth1 or eth1.100?

> 
> This setup was tested with both 3.18.21 and 4.5 kernels.
> 
> Any idea?
> 
> [1] https://pymotw.com/2/socket/multicast.html
> 

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Alexander Duyck @ 2016-03-29  3:27 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Jesse Gross, Alexander Duyck, Linux Kernel Network Developers,
	David S. Miller
In-Reply-To: <CALx6S34PLWEMVM3t8jEnj5S4GH+XbCuOdm0aJD_ijPVqoy9ueQ@mail.gmail.com>

On Mon, Mar 28, 2016 at 8:17 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Mon, Mar 28, 2016 at 6:54 PM, Jesse Gross <jesse@kernel.org> wrote:
>> On Mon, Mar 28, 2016 at 6:24 PM, Tom Herbert <tom@herbertland.com> wrote:
>>> On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>>>> This patch should fix the issues seen with a recent fix to prevent
>>>> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
>>>> correct for now as long as we do not add any devices that support
>>>> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
>>>> potential to mess things up due to the fact that the outer transport header
>>>> points to the outer UDP header and not the GRE header as would be expected.
>>>>
>>>> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
>>>
>>> This could only fix FOU/GUE. It is very possible someone else could
>>> happily be doing some other layered encapsulation and never had a
>>> problem before, so the decision to start enforcing only a single layer
>>> of encapsulation for GRO would still break them. I still think we
>>> should revert the patch, and for next version fixes things to that any
>>> combination/nesting of encapsulation is supported, and if there are
>>> exceptions to that support they need be clearly documented.
>>
>> It was pointed out to me that prior to my patch, it was also possible
>> to remotely cause a stack overflow by filling up a packet with tunnel
>> headers and letting GRO descend through them over and over again.
>>
> Then the fix would be set set a reasonable limit on the number of
> encapsulation levels.
>
>> Tom, I'm sorry that you don't like how I fixed this issue but there
>> really, truly is a bug here. I gave you a specific example to be clear
>> but that doesn't mean that is the only case. I am aware that the bug
>> is not encountered in all situations and that the fix removes an
>> optimization in some of those but I think that ensuring correct
>> behavior must come first.
>
> The example you gave results in packet loss, this is not
> incorrectness. Actually reproduce a real issue that leads to
> incorrectness and then we can talk about a solution.

Tom,

Just take a look in the __skb_udp_tunnel_segment or gre_gso_segment
code.  Then tell me how we are supposed to deal with the fact that the
GSO code expects skb_inner_network_offset() to be valid.  If you have
more than an inner and an outer network header we cannot.  So we
cannot put GRE in UDP, or UDP in GRE if there is a network header
between them.  The FOU/GUE code gets around this because in the IPIP
and SIT cases you are adding an L4 header between two L3 headers.  The
GRE case works because you essentially convert the GRE header into a
tunnel header like VXLAN or GENEVE and we just overwrite the outer
transport header offset.

What it comes down to is that we can only support 2 network headers
per frame.  One for the inner and one for the outer.  That is why we
can have an exception for GUE as it only has 2 network headers.  If we
had multiple levels of UDP, or GRE, or 2 levels of network headers
either before or after either UDP or GRE we cannot support
segmentation because the code will blow up and generate a malformed
frame.

- Alex

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Tom Herbert @ 2016-03-29  3:17 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Alexander Duyck, Linux Kernel Network Developers, David S. Miller,
	Alexander Duyck
In-Reply-To: <CAEh+42g2AudOVigy9udEsDz-K1RrG_0GJyYYrMoN5xNUHdJyfg@mail.gmail.com>

On Mon, Mar 28, 2016 at 6:54 PM, Jesse Gross <jesse@kernel.org> wrote:
> On Mon, Mar 28, 2016 at 6:24 PM, Tom Herbert <tom@herbertland.com> wrote:
>> On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>>> This patch should fix the issues seen with a recent fix to prevent
>>> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
>>> correct for now as long as we do not add any devices that support
>>> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
>>> potential to mess things up due to the fact that the outer transport header
>>> points to the outer UDP header and not the GRE header as would be expected.
>>>
>>> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
>>
>> This could only fix FOU/GUE. It is very possible someone else could
>> happily be doing some other layered encapsulation and never had a
>> problem before, so the decision to start enforcing only a single layer
>> of encapsulation for GRO would still break them. I still think we
>> should revert the patch, and for next version fixes things to that any
>> combination/nesting of encapsulation is supported, and if there are
>> exceptions to that support they need be clearly documented.
>
> It was pointed out to me that prior to my patch, it was also possible
> to remotely cause a stack overflow by filling up a packet with tunnel
> headers and letting GRO descend through them over and over again.
>
Then the fix would be set set a reasonable limit on the number of
encapsulation levels.

> Tom, I'm sorry that you don't like how I fixed this issue but there
> really, truly is a bug here. I gave you a specific example to be clear
> but that doesn't mean that is the only case. I am aware that the bug
> is not encountered in all situations and that the fix removes an
> optimization in some of those but I think that ensuring correct
> behavior must come first.

The example you gave results in packet loss, this is not
incorrectness. Actually reproduce a real issue that leads to
incorrectness and then we can talk about a solution.

Tom

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Jesse Gross @ 2016-03-29  2:54 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Linux Kernel Network Developers, David Miller, Alexander Duyck,
	Tom Herbert
In-Reply-To: <20160328235613.26269.26291.stgit@localhost.localdomain>

On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
> This patch should fix the issues seen with a recent fix to prevent
> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
> correct for now as long as we do not add any devices that support
> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
> potential to mess things up due to the fact that the outer transport header
> points to the outer UDP header and not the GRE header as would be expected.
>
> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
> ---
>
> This should allow us to keep the fix that Jesse added without breaking the
> 3 cases that Tom called out in terms of FOU/GUE.
>
> Additional work will be needed in net-next as we probably need to make it
> so that offloads work correctly when we get around to supporting
> NETIF_F_GSO_GRE_CSUM.

Thanks, this looks like a reasonable fix to me. I agree that there is
more that can be done in the future to improve things but this should
restore GRO while still avoiding possible issues.

^ permalink raw reply

* Re: [PATCH net v2 2/3] tunnels: Don't apply GRO to multiple layers of encapsulation.
From: Alexander Duyck @ 2016-03-29  2:41 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Jesse Gross, David Miller, Linux Kernel Network Developers
In-Reply-To: <CALx6S34VydaFDEVZD5Qv4ts7iGywx6PCGUy1oB4AkTF3Y+MB6w@mail.gmail.com>

On Mon, Mar 28, 2016 at 5:50 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Mon, Mar 28, 2016 at 4:34 PM, Jesse Gross <jesse@kernel.org> wrote:
>> * A packet is received that is encapsulated with two layers of GRE. It
>> looks like this: Eth|IP|GRE|IP|GRE|IP|TCP
>> * The packet is processed through GRO successfully. skb->encapsulation
>> bit is set and skb_shinfo(skb)->gso_type is set to SKB_GSO_GRE |
>> SKB_GSO_TCPV4.
>> * The packet is bridged to an output device and is prepared for
>> transmission. skb_gso_segment() is called to segment the packet.
>> * As we descent down the GSO code, we get to gre_gso_segment() for the
>> first GRE header. skb->encapsulation is cleared and we keep going to
>> the next header.
>> * We return to gre_gso_segment() again for the second GRE header.
>> There is a check for skb->encapsulation being set but it is now clear.
>> GSO processing is aborted.
>> * The packet is dropped.
>
> If multiple level of GRE GSO is the problem then you are welcome to
> fix it after FOU is fixed and when net-next opens (GSO partial should
> fix this this anyway). But this is not at all the same problem as
> saying all multiple levels of encapsulation are invalid so we need to
> disallow that in GRO for everyone regardless of whether the packet is
> is being forwarded or some driver can't understand a certain valid
> combination.

GSO partial is not magic.  It is not going to fix a great many of
these kind of issues.  All GSO partial gives us is a way for hardware
to segment frames if the headers can be replicated.  If GSO cannot
handle the frame in software then GSO partial still cannot handle it.

It looks like what we probably need to do is rewrite GSO in order to
support this as multiple tunnel levels are currently not supported.  I
figure it is something we can add once we have GSO partial in.  Maybe
call it GSO_STACKED.  I still have to figure out where we are going to
want the header pointers since there are obvious bits that wouldn't
work such as the part in UDP or GRE segmentation code where we assume
we can just reset the network header offset based off of the inner
network header offset.  If we have multiple levels of tunnels then
this is blatantly broken.  Maybe to get around it we might have to add
the tunnel header length as a value passed with UDP offloads.

> Testing GUE is really not hard. There is no reason why GUE, Geneve,
> and VXLAN should not be tested each time a change is made to any of
> the common UDP tunneling code.
>
> Configuring ipip-GUE can be done by:
>
> ./ip fou add port 6080 gue
> ./ip link add name tun1 type ipip remote 10.1.1.2 local 10.1.1.1 ttl
> 225 encap gue encap-sport auto encap-dport 6080 encap-csum
> encap-remcsum
> ifconfig tun1 192.168.1.1
> ip route add 192.168.1.0/24 dev tun1
>
> Configuring gre-GUE is just s/ipip/gre/ of above.
>
> ./netperf -H 192.168.1.2 -t TCP_STREAM -l 100

This is essentially the test case I ran to verify that the patch I
submitted fixed the issue.  I'm not sure we need to do too much
testing for the patch I submitted since the whole thing is pretty
straightforward.

> Is good for showing showing regression in a single flow. You should
> see GSO/GRO being effective on both sides and perf should show only a
> minute amount of time in csum_partial assuming that your device has
> checksum offload for outer UDP (which should be about all).
>
> ./super_netperf 200 -H 192.168.1.2 -j -l 1 -t TCP_RR -- -r 1,1 -o
> TRANSACTION_RATE,P50_LATENCY,P90_LATENCY,P99_LATENCY
>
> Is run to test pps. We're looking for no regression in tput, latency,
> or CPU utilization.
>
> Next time you make changes to anything that affects common paths in
> tunnels please run these tests and report the results in the commit
> log so we can avoid regressions like this. You should also be running
> an equivalent battery of VXLAN tests. The configuration I use is:
>
> ./ip link add vxlan0 type vxlan id 10 group 224.10.10.10 ttl 10 dev
> eth0  udpcsum remcsumtx remcsumrx
> ifconfig vxlan0 192.168.111.1
> ip route add 192.168.111.0/24 dev vxlan0
>
> Again we expect the same sort of results, GRO/GSO should be effective,
> csum_partial should not be getting significant tine, etc...

With my patch I can verify that GRO is working and coalescing frames
at least for the IPIP case as that was the only one I tested.

If there is something I missed I am willing to respin it and resubmit
it.  From what I can tell though it should restore the original
functionality for the FOU/GUE cases though since all it really does is
push the setting of the encap_mark back from the UDP header over to
the next level header which is the correct behavior (at least in my
opinion).

- Alex

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Jesse Gross @ 2016-03-29  1:54 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Alexander Duyck, Linux Kernel Network Developers, David S. Miller,
	Alexander Duyck
In-Reply-To: <CALx6S3624MH3cKF1zPUvvr1saGAwhTUb6DC6tZcrFYKqRMCozA@mail.gmail.com>

On Mon, Mar 28, 2016 at 6:24 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>> This patch should fix the issues seen with a recent fix to prevent
>> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
>> correct for now as long as we do not add any devices that support
>> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
>> potential to mess things up due to the fact that the outer transport header
>> points to the outer UDP header and not the GRE header as would be expected.
>>
>> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
>
> This could only fix FOU/GUE. It is very possible someone else could
> happily be doing some other layered encapsulation and never had a
> problem before, so the decision to start enforcing only a single layer
> of encapsulation for GRO would still break them. I still think we
> should revert the patch, and for next version fixes things to that any
> combination/nesting of encapsulation is supported, and if there are
> exceptions to that support they need be clearly documented.

It was pointed out to me that prior to my patch, it was also possible
to remotely cause a stack overflow by filling up a packet with tunnel
headers and letting GRO descend through them over and over again.

Tom, I'm sorry that you don't like how I fixed this issue but there
really, truly is a bug here. I gave you a specific example to be clear
but that doesn't mean that is the only case. I am aware that the bug
is not encountered in all situations and that the fix removes an
optimization in some of those but I think that ensuring correct
behavior must come first.

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Tom Herbert @ 2016-03-29  1:24 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jesse Gross, Linux Kernel Network Developers, David S. Miller,
	Alexander Duyck
In-Reply-To: <20160328235613.26269.26291.stgit@localhost.localdomain>

On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
> This patch should fix the issues seen with a recent fix to prevent
> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
> correct for now as long as we do not add any devices that support
> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
> potential to mess things up due to the fact that the outer transport header
> points to the outer UDP header and not the GRE header as would be expected.
>
> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")

This could only fix FOU/GUE. It is very possible someone else could
happily be doing some other layered encapsulation and never had a
problem before, so the decision to start enforcing only a single layer
of encapsulation for GRO would still break them. I still think we
should revert the patch, and for next version fixes things to that any
combination/nesting of encapsulation is supported, and if there are
exceptions to that support they need be clearly documented.

> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
> ---
>
> This should allow us to keep the fix that Jesse added without breaking the
> 3 cases that Tom called out in terms of FOU/GUE.
>
> Additional work will be needed in net-next as we probably need to make it
> so that offloads work correctly when we get around to supporting
> NETIF_F_GSO_GRE_CSUM.
>
>  net/ipv4/fou.c |   32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
> index 4136da9275b2..2c30256ee959 100644
> --- a/net/ipv4/fou.c
> +++ b/net/ipv4/fou.c
> @@ -195,6 +195,22 @@ static struct sk_buff **fou_gro_receive(struct sk_buff **head,
>         u8 proto = NAPI_GRO_CB(skb)->proto;
>         const struct net_offload **offloads;
>
> +       switch (proto) {
> +       case IPPROTO_IPIP:
> +       case IPPROTO_IPV6:
> +       case IPPROTO_GRE:
> +               /* We can clear the encap_mark for these 3 protocols as
> +                * we are either adding an L4 tunnel header to the outer
> +                * L3 tunnel header, or we are are simply treating the
> +                * GRE tunnel header as though it is a UDP protocol
> +                * specific header such as VXLAN or GENEVE.
> +                */
> +               NAPI_GRO_CB(skb)->encap_mark = 0;
> +               /* fall-through */
> +       default:
> +               break;
> +       }
> +
>         rcu_read_lock();
>         offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
>         ops = rcu_dereference(offloads[proto]);
> @@ -359,6 +375,22 @@ static struct sk_buff **gue_gro_receive(struct sk_buff **head,
>                 NAPI_GRO_CB(p)->flush |= NAPI_GRO_CB(p)->flush_id;
>         }
>
> +       switch (guehdr->proto_ctype) {
> +       case IPPROTO_IPIP:
> +       case IPPROTO_IPV6:
> +       case IPPROTO_GRE:
> +               /* We can clear the encap_mark for these 3 protocols as
> +                * we are either adding an L4 tunnel header to the outer
> +                * L3 tunnel header, or we are are simply treating the
> +                * GRE tunnel header as though it is a UDP protocol
> +                * specific header such as VXLAN or GENEVE.
> +                */
> +               NAPI_GRO_CB(skb)->encap_mark = 0;
> +               /* fall-through */
> +       default:
> +               break;
> +       }
> +
>         rcu_read_lock();
>         offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
>         ops = rcu_dereference(offloads[guehdr->proto_ctype]);
>

^ permalink raw reply

* Re: BUG: net/netfilter: KASAN: stack-out-of-bounds in tcp_packet
From: Eric Dumazet @ 2016-03-29  1:17 UTC (permalink / raw)
  To: David Miller
  Cc: jengelh, kadlec, sploving1, pablo, kaber, netfilter-devel, netdev
In-Reply-To: <20160328.195457.1407210954449940457.davem@davemloft.net>

On Mon, 2016-03-28 at 19:54 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 28 Mar 2016 13:51:46 -0700
> 
> > On Mon, 2016-03-28 at 13:46 -0700, Eric Dumazet wrote:
> > 
> >> We have at least 384 bytes of padding in skb->head (this is struct
> >> skb_shared_info).
> >> 
> >> Whatever garbage we might read, current code is fine.
> >> 
> >> We have to deal with a false positive here.
> > 
> > Very similar to the one fixed in 
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=10ec9472f05b45c94db3c854d22581a20b97db41
> 
> I don't see them as similar.
> 
> The current options code we are talking about here never references
> past legitimate parts of the packet data.  We always check 'length',
> and we never access past the boundary it describes.
> 
> This was the entire point of my posting.
> 
> Talking about padding, rather than the logical correctness of the
> code, is therefore a distraction I think :-)

Not really, we do read one out of bound byte David.

length = 1;
...
while (length > 0) {
	int opcode = *ptr++; // Note that length is still 1
        switch (opcode) {
        ...
        default:
            opsize = *ptr++; // Note that length is still 1


        ...
        length -= opsize;
}

So we do read 2 bytes, while length was 1.

opsize definitely can read garbage.
Call it padding or redzone or whatever ;)

^ permalink raw reply

* Re: [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Tom Herbert @ 2016-03-29  1:00 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jesse Gross, Linux Kernel Network Developers, David S. Miller,
	Alexander Duyck
In-Reply-To: <20160328235613.26269.26291.stgit@localhost.localdomain>

On Mon, Mar 28, 2016 at 4:58 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
> This patch should fix the issues seen with a recent fix to prevent
> tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
> correct for now as long as we do not add any devices that support
> NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
> potential to mess things up due to the fact that the outer transport header
> points to the outer UDP header and not the GRE header as would be expected.
>
Did you test this?

> Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
> ---
>
> This should allow us to keep the fix that Jesse added without breaking the
> 3 cases that Tom called out in terms of FOU/GUE.
>
> Additional work will be needed in net-next as we probably need to make it
> so that offloads work correctly when we get around to supporting
> NETIF_F_GSO_GRE_CSUM.
>
>  net/ipv4/fou.c |   32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
> index 4136da9275b2..2c30256ee959 100644
> --- a/net/ipv4/fou.c
> +++ b/net/ipv4/fou.c
> @@ -195,6 +195,22 @@ static struct sk_buff **fou_gro_receive(struct sk_buff **head,
>         u8 proto = NAPI_GRO_CB(skb)->proto;
>         const struct net_offload **offloads;
>
> +       switch (proto) {
> +       case IPPROTO_IPIP:
> +       case IPPROTO_IPV6:
> +       case IPPROTO_GRE:
> +               /* We can clear the encap_mark for these 3 protocols as
> +                * we are either adding an L4 tunnel header to the outer
> +                * L3 tunnel header, or we are are simply treating the
> +                * GRE tunnel header as though it is a UDP protocol
> +                * specific header such as VXLAN or GENEVE.
> +                */
> +               NAPI_GRO_CB(skb)->encap_mark = 0;
> +               /* fall-through */
> +       default:
> +               break;
> +       }
> +
>         rcu_read_lock();
>         offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
>         ops = rcu_dereference(offloads[proto]);
> @@ -359,6 +375,22 @@ static struct sk_buff **gue_gro_receive(struct sk_buff **head,
>                 NAPI_GRO_CB(p)->flush |= NAPI_GRO_CB(p)->flush_id;
>         }
>
> +       switch (guehdr->proto_ctype) {
> +       case IPPROTO_IPIP:
> +       case IPPROTO_IPV6:
> +       case IPPROTO_GRE:
> +               /* We can clear the encap_mark for these 3 protocols as
> +                * we are either adding an L4 tunnel header to the outer
> +                * L3 tunnel header, or we are are simply treating the
> +                * GRE tunnel header as though it is a UDP protocol
> +                * specific header such as VXLAN or GENEVE.
> +                */
> +               NAPI_GRO_CB(skb)->encap_mark = 0;
> +               /* fall-through */
> +       default:
> +               break;
> +       }
> +
>         rcu_read_lock();
>         offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
>         ops = rcu_dereference(offloads[guehdr->proto_ctype]);
>

^ permalink raw reply

* Re: [PATCH net v2 2/3] tunnels: Don't apply GRO to multiple layers of encapsulation.
From: Tom Herbert @ 2016-03-29  0:50 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Alexander Duyck, David Miller, Linux Kernel Network Developers
In-Reply-To: <CAEh+42grm7dH8VQ6MJGPjp2Cpti8NcptORk+-9hyGp0MgSVZbQ@mail.gmail.com>

On Mon, Mar 28, 2016 at 4:34 PM, Jesse Gross <jesse@kernel.org> wrote:
> On Mon, Mar 28, 2016 at 3:10 PM, Tom Herbert <tom@herbertland.com> wrote:
>> On Mon, Mar 28, 2016 at 1:34 PM, Alexander Duyck
>> <alexander.duyck@gmail.com> wrote:
>>> On Mon, Mar 28, 2016 at 1:03 PM, Tom Herbert <tom@herbertland.com> wrote:
>>>> On Mon, Mar 28, 2016 at 12:31 PM, Alexander Duyck
>>>> <alexander.duyck@gmail.com> wrote:
>>>>> On Mon, Mar 28, 2016 at 11:47 AM, Tom Herbert <tom@herbertland.com> wrote:
>>>>>> On Mon, Mar 28, 2016 at 10:37 AM, Alexander Duyck
>>>>>> <alexander.duyck@gmail.com> wrote:
>>>>>>> On Mon, Mar 28, 2016 at 9:31 AM, Tom Herbert <tom@herbertland.com> wrote:
>>>>>>>> On Sun, Mar 27, 2016 at 9:38 PM, Jesse Gross <jesse@kernel.org> wrote:
>>>>>>>>> On Sat, Mar 26, 2016 at 12:41 PM, Tom Herbert <tom@herbertland.com> wrote:
>>>>>>>>>> On Sat, Mar 19, 2016 at 9:32 AM, Jesse Gross <jesse@kernel.org> wrote:
>>>>>>>>>>> When drivers express support for TSO of encapsulated packets, they
>>>>>>>>>>> only mean that they can do it for one layer of encapsulation.
>>>>>>>>>>> Supporting additional levels would mean updating, at a minimum,
>>>>>>>>>>> more IP length fields and they are unaware of this.
>>>>>>>>>>>
>>>>>>>>>> This patch completely breaks GRO for FOU and GUE.
>>>>>>>>>>
>>>>>>>>>>> No encapsulation device expresses support for handling offloaded
>>>>>>>>>>> encapsulated packets, so we won't generate these types of frames
>>>>>>>>>>> in the transmit path. However, GRO doesn't have a check for
>>>>>>>>>>> multiple levels of encapsulation and will attempt to build them.
>>>>>>>>>>>
>>>>>>>>>>> UDP tunnel GRO actually does prevent this situation but it only
>>>>>>>>>>> handles multiple UDP tunnels stacked on top of each other. This
>>>>>>>>>>> generalizes that solution to prevent any kind of tunnel stacking
>>>>>>>>>>> that would cause problems.
>>>>>>>>>>>
>>>>>>>>>> GUE and FOU regularly create packets that will be both GSO UDP tunnels
>>>>>>>>>> and IPIP, SIT, GRE, etc. This is by design. There should be no
>>>>>>>>>> ambiguity in the drivers as to what this means. For instance, if
>>>>>>>>>> SKB_GSO_UDP_TUNNEL and SKB_GSO_GRE are set that is GRE/UDP packet, a
>>>>>>>>>> driver can use ndo_features_check to validate.
>>>>>>>>>>
>>>>>>>>>> So multiple levels of encapsulation with GRO is perfectly valid and I
>>>>>>>>>> would suggest to simply revert this patch. The one potential issue we
>>>>>>>>>> could have is the potential for GRO to construct a packet which is a
>>>>>>>>>> UDP-encapsulation inside another encapsulation, like UDP-encap in GRE.
>>>>>>>>>> In this case the GSO flags don't provide enough information to
>>>>>>>>>> distinguish say between GRE/UDP (common case) and UDP/GRE (uncommon
>>>>>>>>>> case). To make this clear we can set udp_mark in GRE, ipip, and sit
>>>>>>>>>> but *not* check for it.
>>>>>>>>>
>>>>>>>>> Generally speaking, multiple levels of encapsulation offload are not
>>>>>>>>> valid. I think this is pretty clear from the fact that none of the
>>>>>>>>> encapsulation drivers expose support for encapsulation offloads on
>>>>>>>>> transmit and the drivers supporting NETIF_F_GSO_GRE and
>>>>>>>>> NETIF_F_GSO_UDP_TUNNEL do not mean they can handle GRE in VXLAN.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Kernel software offload does support this combination just fine.
>>>>>>>> Seriously, I've tested that more than a thousand times. This is a few
>>>>>>>> HW implementations you're referring to. The limitations of these
>>>>>>>> drivers should not dictate how we build the software-- it needs to
>>>>>>>> work the other way around.
>>>>>>>
>>>>>>> Jesse does have a point.  Drivers aren't checking for this kind of
>>>>>>> thing currently as the transmit side doesn't generate these kind of
>>>>>>> frames.  The fact that GRO is makes things a bit more messy as we will
>>>>>>> really need to restructure the GSO code in order to handle it.  As far
>>>>>>> as drivers testing for it I am pretty certain the i40e isn't.  I would
>>>>>>> wonder if we need to add yet another GSO bit to indicate that we
>>>>>>> support multiple layers of encapsulation.  I'm pretty sure the only
>>>>>>> way we could possibly handle it would be in software since what you
>>>>>>> are indicating is a indeterminate number of headers that all require
>>>>>>> updates.
>>>>>>>
>>>>>>>>> Asking drivers to assume that this combination of flags means FOU
>>>>>>>>> doesn't seem right to me. To the best of my knowledge, no driver uses
>>>>>>>>> ndo_feature_check to do validation of multiple tunnel offload flags
>>>>>>>>> since the assumption is that the stack will never try to do this.
>>>>>>>>> Since FOU is being treated as only a single level of encapsulation, I
>>>>>>>>> think it would be better to just advertise it that way on transmit
>>>>>>>>> (i.e. only set SKB_GSO_UDP_TUNNEL).
>>>>>>>>>
>>>>>>>> If it's not FOU it will be something else. Arbitrarily declaring
>>>>>>>> multi-levels of encapsulation invalid is simply the wrong direction,
>>>>>>>> we should be increasing generality and capabilities of the kernel not
>>>>>>>> holding it down with artificial limits. This is why the generic TSO
>>>>>>>> work that Alexander and Edward are looking at is so important-- if
>>>>>>>> this flies then we can offload any combination of encapsulations with
>>>>>>>> out protocol specific information.
>>>>>>>
>>>>>>> The segmentation code isn't designed to handle more than 2 layers of
>>>>>>> headers.  Currently we have the pointers for the inner headers and the
>>>>>>> outer headers.  If you are talking about adding yet another level we
>>>>>>> would need additional pointers in the skbuff to handle them and we
>>>>>>> currently don't have them at present.
>>>>>>>
>>>>>>>>> The change that you are suggesting would result in packets generated
>>>>>>>>> by GRO that cannot be handled properly on transmit in some cases and
>>>>>>>>> would likely end up being dropped or malformed. GRO is just an
>>>>>>>>> optimization and correctness must come first so we cannot use it if it
>>>>>>>>> might corrupt traffic.
>>>>>>>>
>>>>>>>> That's (a few) drivers problem. It's no different than if they had
>>>>>>>> decided that SKB_GSO_UDP_TUNNEL means vxlan, or they can support GRE
>>>>>>>> in IPv4 offload but not GRE in IPv6, or they can only handle headers
>>>>>>>> of a certain size, can't handle IPv6 ext. hdrs., etc. As I mentioned,
>>>>>>>> the long term solution is to eliminate the GSO_ flags and use a
>>>>>>>> generic segmentation offload interface. But in the interim, it is
>>>>>>>> *incumbent* on drivers to determine if they can handle a GSO packet
>>>>>>>> and the interfaces to do that exist. Restricting the capabilities of
>>>>>>>> GRO just to appease those drivers is not right. Breaking existing GRO
>>>>>>>> for their benefit is definitely not right.
>>>>>>>
>>>>>>> This isn't about if drivers can handle it.  It is about if the skbuff
>>>>>>> can handle it.  The problem as it stands right now is that we start
>>>>>>> losing data once we go past 1 level of encapsulation.  We only have
>>>>>>> the standard mac_header, network_header, transport_header, and then
>>>>>>> the inner set of the same pointers.  If we try to go multiple levels
>>>>>>> deep we start losing data.
>>>>>>>
>>>>>> Huh? GUE/FOU has been running perfectly well with two levels of
>>>>>> encapsulation for over a year now. We never had to add anything to
>>>>>> skbuff to make that work. If "losing data" is a problem please provide
>>>>>> the *reproducible* test case for that and we'll debug that.
>>>>>
>>>>> I'm guessing most of your examples involve either a remote checksum
>>>>> being enabled or using NICs that don't support any tunnel offloads?
>>>>> Hardware needs to be able to identify where headers are in order to
>>>>> perform most of their offloads for TSO.  We either have to parse to
>>>>> find them or we are provided with them by the stack.  GSO can work
>>>>> around it as long as we don't stack checksum based offload with
>>>>> non-checksum of the same type.
>>>>>
>>>>> mostIf for example you were to try and send a frame that had either an
>>>>> inner or outer GRE tunnel in addition to a UDP tunnel most NICs would
>>>>> probably screw it up.
>>>>
>>>> Please be more precise. Obviously we're only talking about the few
>>>> NICs that even support UDP tunnel offload so it's not most NICs. Also,
>>>> there is a big difference between "probably" and "definitely"; no one
>>>> has provided a single data point that that there is even an issue. For
>>>> instance, looking a i40e I suspect this will work with GRE/UDP since
>>>> the driver already deals with offsets and shouldn't care about
>>>> intermediate levels of encapsulation.
>>>
>>> I'm sorry I cannot be super precise as I hadn't looked that closely at
>>> the FOU or GUE code before.  Honestly I suspect most people probably
>>> haven't.
>>>
>> Sorry, but one simple test would have identified this patch was a
>> regression. I don't mind that bugs are introduced with new patches,
>> but it is frustrating when testing of the patches is obviously
>> inadequate and then we get the "it's your stuff that's broken" knee
>> jerk reaction when a patch causes a regression for someone else.
>>
>>> So from what I can tell FOU and GUE isn't really even necessarily
>>> doing stacked tunnels.  It looks like you support IPIP, SIT, or GRE.
>>> So in the case of IPIP and SIT for instance the only real difference
>>> between VXLAN and those tunnels with fou is that you don't have the
>>> VXLAN or inner Ethernet headers.  So really you still only have two
>>> levels of headers.  Even in the case of GRE you have that set after
>>> the outer UDP header so in that case GRE ends up being treated as a
>>> tunnel header since the outer transport offset was overwritten.  If we
>>> had hardware that supported both UDP and GRE outer checksums it would
>>> cause a mess as the hardware would place the GRE checksum in the wrong
>>> spot.
>>>
>>> So if anything all we probably would need to do is treat the FOU or
>>> GUE stuff as a special case.  Basically if we end up having to do a
>>> GRO over a FOU or GUE instance maybe we need to knock the encap_mark
>>> back to 0 of it is at 1 and the next level is IPIP, SIT, or GRE
>>> because really it can just be treated as a UDP tunnel.  Jesse's code
>>> is still needed to catch the case where someone is trying to do
>>> something like IPIP over VXLAN or whatever but it does seem like there
>>> should be some wiggle room for FOU or GUE.
>>>
>> Nothing is needed other than to revert this patch. There is no problem
>> with GRO. You nor Jesse have not identified any tangible problem. If a
>> driver really does had an issue with doing GRE/UDP with GSO then it
>> can filter that in check_features (this is like 2 lines of code). But
>> that has not been proven to be a problem, and I would expect that
>> anyone who wants to fix that better run tests showing there is a
>> problem.
>
> Let me try to give a very clear and specific example. Note that this
> is pure software and does not involve any hardware offloading.
>
> * A packet is received that is encapsulated with two layers of GRE. It
> looks like this: Eth|IP|GRE|IP|GRE|IP|TCP
> * The packet is processed through GRO successfully. skb->encapsulation
> bit is set and skb_shinfo(skb)->gso_type is set to SKB_GSO_GRE |
> SKB_GSO_TCPV4.
> * The packet is bridged to an output device and is prepared for
> transmission. skb_gso_segment() is called to segment the packet.
> * As we descent down the GSO code, we get to gre_gso_segment() for the
> first GRE header. skb->encapsulation is cleared and we keep going to
> the next header.
> * We return to gre_gso_segment() again for the second GRE header.
> There is a check for skb->encapsulation being set but it is now clear.
> GSO processing is aborted.
> * The packet is dropped.

If multiple level of GRE GSO is the problem then you are welcome to
fix it after FOU is fixed and when net-next opens (GSO partial should
fix this this anyway). But this is not at all the same problem as
saying all multiple levels of encapsulation are invalid so we need to
disallow that in GRO for everyone regardless of whether the packet is
is being forwarded or some driver can't understand a certain valid
combination.

Testing GUE is really not hard. There is no reason why GUE, Geneve,
and VXLAN should not be tested each time a change is made to any of
the common UDP tunneling code.

Configuring ipip-GUE can be done by:

./ip fou add port 6080 gue
./ip link add name tun1 type ipip remote 10.1.1.2 local 10.1.1.1 ttl
225 encap gue encap-sport auto encap-dport 6080 encap-csum
encap-remcsum
ifconfig tun1 192.168.1.1
ip route add 192.168.1.0/24 dev tun1

Configuring gre-GUE is just s/ipip/gre/ of above.

./netperf -H 192.168.1.2 -t TCP_STREAM -l 100

Is good for showing showing regression in a single flow. You should
see GSO/GRO being effective on both sides and perf should show only a
minute amount of time in csum_partial assuming that your device has
checksum offload for outer UDP (which should be about all).

./super_netperf 200 -H 192.168.1.2 -j -l 1 -t TCP_RR -- -r 1,1 -o
TRANSACTION_RATE,P50_LATENCY,P90_LATENCY,P99_LATENCY

Is run to test pps. We're looking for no regression in tput, latency,
or CPU utilization.

Next time you make changes to anything that affects common paths in
tunnels please run these tests and report the results in the commit
log so we can avoid regressions like this. You should also be running
an equivalent battery of VXLAN tests. The configuration I use is:

./ip link add vxlan0 type vxlan id 10 group 224.10.10.10 ttl 10 dev
eth0  udpcsum remcsumtx remcsumrx
ifconfig vxlan0 192.168.111.1
ip route add 192.168.111.0/24 dev vxlan0

Again we expect the same sort of results, GRO/GSO should be effective,
csum_partial should not be getting significant tine, etc...

Thanks,
Tom

^ permalink raw reply

* Re: [PATCH net,stable] qmi_wwan: add "D-Link DWM-221 B1" device id
From: David Miller @ 2016-03-29  0:01 UTC (permalink / raw)
  To: bjorn-yOkvZcmFvRU
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, tschaefer-zqRNUXuvxA0b1SvskN2V4Q,
	linux-usb-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1459197496-11236-1-git-send-email-bjorn-yOkvZcmFvRU@public.gmane.org>

From: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
Date: Mon, 28 Mar 2016 22:38:16 +0200

> Thomas reports:
> "Windows:
 ...
> Linux:
 ...
> Reported-by: Thomas Schäfer <tschaefer-zqRNUXuvxA0b1SvskN2V4Q@public.gmane.org>
> Signed-off-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>

Applied and queued up for -stable, thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [net PATCH] gro: Allow tunnel stacking in the case of FOU/GUE
From: Alexander Duyck @ 2016-03-28 23:58 UTC (permalink / raw)
  To: jesse, netdev, davem, alexander.duyck, tom

This patch should fix the issues seen with a recent fix to prevent
tunnel-in-tunnel frames from being generated with GRO.  The fix itself is
correct for now as long as we do not add any devices that support
NETIF_F_GSO_GRE_CSUM.  When such a device is added it could have the
potential to mess things up due to the fact that the outer transport header
points to the outer UDP header and not the GRE header as would be expected.

Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---

This should allow us to keep the fix that Jesse added without breaking the
3 cases that Tom called out in terms of FOU/GUE.

Additional work will be needed in net-next as we probably need to make it
so that offloads work correctly when we get around to supporting
NETIF_F_GSO_GRE_CSUM.

 net/ipv4/fou.c |   32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index 4136da9275b2..2c30256ee959 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -195,6 +195,22 @@ static struct sk_buff **fou_gro_receive(struct sk_buff **head,
 	u8 proto = NAPI_GRO_CB(skb)->proto;
 	const struct net_offload **offloads;
 
+	switch (proto) {
+	case IPPROTO_IPIP:
+	case IPPROTO_IPV6:
+	case IPPROTO_GRE:
+		/* We can clear the encap_mark for these 3 protocols as
+		 * we are either adding an L4 tunnel header to the outer
+		 * L3 tunnel header, or we are are simply treating the
+		 * GRE tunnel header as though it is a UDP protocol
+		 * specific header such as VXLAN or GENEVE.
+		 */
+		NAPI_GRO_CB(skb)->encap_mark = 0;
+		/* fall-through */
+	default:
+		break;
+	}
+
 	rcu_read_lock();
 	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
 	ops = rcu_dereference(offloads[proto]);
@@ -359,6 +375,22 @@ static struct sk_buff **gue_gro_receive(struct sk_buff **head,
 		NAPI_GRO_CB(p)->flush |= NAPI_GRO_CB(p)->flush_id;
 	}
 
+	switch (guehdr->proto_ctype) {
+	case IPPROTO_IPIP:
+	case IPPROTO_IPV6:
+	case IPPROTO_GRE:
+		/* We can clear the encap_mark for these 3 protocols as
+		 * we are either adding an L4 tunnel header to the outer
+		 * L3 tunnel header, or we are are simply treating the
+		 * GRE tunnel header as though it is a UDP protocol
+		 * specific header such as VXLAN or GENEVE.
+		 */
+		NAPI_GRO_CB(skb)->encap_mark = 0;
+		/* fall-through */
+	default:
+		break;
+	}
+
 	rcu_read_lock();
 	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
 	ops = rcu_dereference(offloads[guehdr->proto_ctype]);

^ permalink raw reply related

* [PATCH net 3/4] bnxt_en: Fix typo in bnxt_hwrm_set_pause_common().
From: Michael Chan @ 2016-03-28 23:46 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1459208767-4225-1-git-send-email-michael.chan@broadcom.com>

The typo caused the wrong flow control bit to be set.

Reported by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4600a05..12a009d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4559,7 +4559,7 @@ bnxt_hwrm_set_pause_common(struct bnxt *bp, struct hwrm_port_phy_cfg_input *req)
 		if (bp->link_info.req_flow_ctrl & BNXT_LINK_PAUSE_RX)
 			req->auto_pause |= PORT_PHY_CFG_REQ_AUTO_PAUSE_RX;
 		if (bp->link_info.req_flow_ctrl & BNXT_LINK_PAUSE_TX)
-			req->auto_pause |= PORT_PHY_CFG_REQ_AUTO_PAUSE_RX;
+			req->auto_pause |= PORT_PHY_CFG_REQ_AUTO_PAUSE_TX;
 		req->enables |=
 			cpu_to_le32(PORT_PHY_CFG_REQ_ENABLES_AUTO_PAUSE);
 	} else {
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net 0/4] misc. small fixes.
From: Michael Chan @ 2016-03-28 23:46 UTC (permalink / raw)
  To: davem; +Cc: netdev

Misc. small fixes for net.

Michael Chan (4):
  bnxt_en: Initialize CP doorbell value before ring allocation
  bnxt_en: Implement proper firmware message padding.
  bnxt_en: Fix typo in bnxt_hwrm_set_pause_common().
  bnxt_en: Fix ethtool -a reporting.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 10 +++++++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  2 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  6 ++----
 3 files changed, 11 insertions(+), 7 deletions(-)

-- 
1.8.3.1

^ permalink raw reply

* [PATCH net 2/4] bnxt_en: Implement proper firmware message padding.
From: Michael Chan @ 2016-03-28 23:46 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1459208767-4225-1-git-send-email-michael.chan@broadcom.com>

The size of every padded firmware message is specified in the first
HWRM_VER_GET response message.  Use this value to pad every message
after that.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 +++++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index c92053c..4600a05 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2653,7 +2653,7 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 	/* Write request msg to hwrm channel */
 	__iowrite32_copy(bp->bar0, data, msg_len / 4);
 
-	for (i = msg_len; i < HWRM_MAX_REQ_LEN; i += 4)
+	for (i = msg_len; i < BNXT_HWRM_MAX_REQ_LEN; i += 4)
 		writel(0, bp->bar0 + i);
 
 	/* currently supports only one outstanding message */
@@ -3830,6 +3830,7 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp)
 	struct hwrm_ver_get_input req = {0};
 	struct hwrm_ver_get_output *resp = bp->hwrm_cmd_resp_addr;
 
+	bp->hwrm_max_req_len = HWRM_MAX_REQ_LEN;
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VER_GET, -1, -1);
 	req.hwrm_intf_maj = HWRM_VERSION_MAJOR;
 	req.hwrm_intf_min = HWRM_VERSION_MINOR;
@@ -3855,6 +3856,9 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp)
 	if (!bp->hwrm_cmd_timeout)
 		bp->hwrm_cmd_timeout = DFLT_HWRM_CMD_TIMEOUT;
 
+	if (resp->hwrm_intf_maj >= 1)
+		bp->hwrm_max_req_len = le16_to_cpu(resp->max_req_win_len);
+
 hwrm_ver_get_exit:
 	mutex_unlock(&bp->hwrm_cmd_lock);
 	return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index ec04c47..709b95b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -477,6 +477,7 @@ struct rx_tpa_end_cmp_ext {
 #define RING_CMP(idx)		((idx) & bp->cp_ring_mask)
 #define NEXT_CMP(idx)		RING_CMP(ADV_RAW_CMP(idx, 1))
 
+#define BNXT_HWRM_MAX_REQ_LEN		(bp->hwrm_max_req_len)
 #define DFLT_HWRM_CMD_TIMEOUT		500
 #define HWRM_CMD_TIMEOUT		(bp->hwrm_cmd_timeout)
 #define HWRM_RESET_TIMEOUT		((HWRM_CMD_TIMEOUT) * 4)
@@ -953,6 +954,7 @@ struct bnxt {
 	dma_addr_t		hw_tx_port_stats_map;
 	int			hw_port_stats_size;
 
+	u16			hwrm_max_req_len;
 	int			hwrm_cmd_timeout;
 	struct mutex		hwrm_cmd_lock;	/* serialize hwrm messages */
 	struct hwrm_ver_get_output	ver_resp;
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net 4/4] bnxt_en: Fix ethtool -a reporting.
From: Michael Chan @ 2016-03-28 23:46 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1459208767-4225-1-git-send-email-michael.chan@broadcom.com>

To report flow control tx/rx settings accurately regardless of autoneg
setting, we should use link_info->req_flow_ctrl.  Before this patch,
the reported settings were only correct when autoneg was on.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 9ada166..2e472f6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -855,10 +855,8 @@ static void bnxt_get_pauseparam(struct net_device *dev,
 	if (BNXT_VF(bp))
 		return;
 	epause->autoneg = !!(link_info->autoneg & BNXT_AUTONEG_FLOW_CTRL);
-	epause->rx_pause =
-		((link_info->auto_pause_setting & BNXT_LINK_PAUSE_RX) != 0);
-	epause->tx_pause =
-		((link_info->auto_pause_setting & BNXT_LINK_PAUSE_TX) != 0);
+	epause->rx_pause = !!(link_info->req_flow_ctrl & BNXT_LINK_PAUSE_RX);
+	epause->tx_pause = !!(link_info->req_flow_ctrl & BNXT_LINK_PAUSE_TX);
 }
 
 static int bnxt_set_pauseparam(struct net_device *dev,
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net 1/4] bnxt_en: Initialize CP doorbell value before ring allocation
From: Michael Chan @ 2016-03-28 23:46 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1459208767-4225-1-git-send-email-michael.chan@broadcom.com>

From: Prashant Sreedharan <prashant@broadcom.com>

The existing code does the following:
    allocate completion ring
    initialize completion ring doorbell
    disable interrupts on this completion ring by writing to the doorbell

We can have a race where firmware sends an asynchronous event to the host
after completion ring allocation and before doorbell is initialized.
When this happens driver can crash while ringing the doorbell using
uninitialized value as part of handling the IRQ/napi request.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index aabbd51..c92053c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3391,11 +3391,11 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 		struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
 		struct bnxt_ring_struct *ring = &cpr->cp_ring_struct;
 
+		cpr->cp_doorbell = bp->bar1 + i * 0x80;
 		rc = hwrm_ring_alloc_send_msg(bp, ring, HWRM_RING_ALLOC_CMPL, i,
 					      INVALID_STATS_CTX_ID);
 		if (rc)
 			goto err_out;
-		cpr->cp_doorbell = bp->bar1 + i * 0x80;
 		BNXT_CP_DB(cpr->cp_doorbell, cpr->cp_raw_cons);
 		bp->grp_info[i].cp_fw_ring_id = ring->fw_ring_id;
 	}
-- 
1.8.3.1

^ permalink raw reply related

* Re: BUG: net/netfilter: KASAN: stack-out-of-bounds in tcp_packet
From: David Miller @ 2016-03-28 23:54 UTC (permalink / raw)
  To: eric.dumazet
  Cc: jengelh, kadlec, sploving1, pablo, kaber, netfilter-devel, netdev
In-Reply-To: <1459198306.6473.126.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 28 Mar 2016 13:51:46 -0700

> On Mon, 2016-03-28 at 13:46 -0700, Eric Dumazet wrote:
> 
>> We have at least 384 bytes of padding in skb->head (this is struct
>> skb_shared_info).
>> 
>> Whatever garbage we might read, current code is fine.
>> 
>> We have to deal with a false positive here.
> 
> Very similar to the one fixed in 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=10ec9472f05b45c94db3c854d22581a20b97db41

I don't see them as similar.

The current options code we are talking about here never references
past legitimate parts of the packet data.  We always check 'length',
and we never access past the boundary it describes.

This was the entire point of my posting.

Talking about padding, rather than the logical correctness of the
code, is therefore a distraction I think :-)

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox