Netdev List
 help / color / mirror / Atom feed
* [PATCH V2 net-next 2/7] net: hns3: revert to old channel when setting new channel num fail
From: Huazhong Tan @ 2019-09-11  2:40 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	jakub.kicinski
In-Reply-To: <1568169639-43658-1-git-send-email-tanhuazhong@huawei.com>

From: Peng Li <lipeng321@huawei.com>

After setting new channel num, it needs free old ring memory and
allocate new ring memory. If there is no enough memory and allocate
new ring memory fail, the ring may initialize fail. To make sure
the network interface can work normally, driver should revert the
channel to the old configuration.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 51 ++++++++++++++++++-------
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 9f3f8e3..8dbaf36 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -4410,6 +4410,30 @@ static int hns3_reset_notify(struct hnae3_handle *handle,
 	return ret;
 }
 
+static int hns3_change_channels(struct hnae3_handle *handle, u32 new_tqp_num,
+				bool rxfh_configured)
+{
+	int ret;
+
+	ret = handle->ae_algo->ops->set_channels(handle, new_tqp_num,
+						 rxfh_configured);
+	if (ret) {
+		dev_err(&handle->pdev->dev,
+			"Change tqp num(%u) fail.\n", new_tqp_num);
+		return ret;
+	}
+
+	ret = hns3_reset_notify(handle, HNAE3_INIT_CLIENT);
+	if (ret)
+		return ret;
+
+	ret =  hns3_reset_notify(handle, HNAE3_UP_CLIENT);
+	if (ret)
+		hns3_reset_notify(handle, HNAE3_UNINIT_CLIENT);
+
+	return ret;
+}
+
 int hns3_set_channels(struct net_device *netdev,
 		      struct ethtool_channels *ch)
 {
@@ -4450,24 +4474,23 @@ int hns3_set_channels(struct net_device *netdev,
 		return ret;
 
 	org_tqp_num = h->kinfo.num_tqps;
-	ret = h->ae_algo->ops->set_channels(h, new_tqp_num, rxfh_configured);
+	ret = hns3_change_channels(h, new_tqp_num, rxfh_configured);
 	if (ret) {
-		ret = h->ae_algo->ops->set_channels(h, org_tqp_num,
-						    rxfh_configured);
-		if (ret) {
-			/* If revert to old tqp failed, fatal error occurred */
-			dev_err(&netdev->dev,
-				"Revert to old tqp num fail, ret=%d", ret);
-			return ret;
+		int ret1;
+
+		netdev_warn(netdev,
+			    "Change channels fail, revert to old value\n");
+		ret1 = hns3_change_channels(h, org_tqp_num, rxfh_configured);
+		if (ret1) {
+			netdev_err(netdev,
+				   "revert to old channel fail\n");
+			return ret1;
 		}
-		dev_info(&netdev->dev,
-			 "Change tqp num fail, Revert to old tqp num");
-	}
-	ret = hns3_reset_notify(h, HNAE3_INIT_CLIENT);
-	if (ret)
+
 		return ret;
+	}
 
-	return hns3_reset_notify(h, HNAE3_UP_CLIENT);
+	return 0;
 }
 
 static const struct hns3_hw_error_info hns3_hw_err[] = {
-- 
2.7.4


^ permalink raw reply related

* [PATCH V2 net-next 3/7] net: hns3: fix shaper parameter algorithm
From: Huazhong Tan @ 2019-09-11  2:40 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	jakub.kicinski
In-Reply-To: <1568169639-43658-1-git-send-email-tanhuazhong@huawei.com>

From: Yonglong Liu <liuyonglong@huawei.com>

Currently when hns3 driver configures the tm shaper to limit
bandwidth below 20Mbit using the parameters calculated by
hclge_shaper_para_calc(), the actual bandwidth limited by tm
hardware module is not accurate enough, for example, 1.28 Mbit
when the user is configuring 1 Mbit.

This patch adjusts the ir_calc to be closer to ir, and
always calculate the ir_b parameter when user is configuring
a small bandwidth. Also, removes an unnecessary parenthesis
when calculating denominator.

Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
index e829101..9f0e35f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
@@ -81,16 +81,13 @@ static int hclge_shaper_para_calc(u32 ir, u8 shaper_level,
 		return 0;
 	} else if (ir_calc > ir) {
 		/* Increasing the denominator to select ir_s value */
-		while (ir_calc > ir) {
+		while (ir_calc >= ir && ir) {
 			ir_s_calc++;
 			ir_calc = DIVISOR_IR_B_126 / (tick * (1 << ir_s_calc));
 		}
 
-		if (ir_calc == ir)
-			*ir_b = 126;
-		else
-			*ir_b = (ir * tick * (1 << ir_s_calc) +
-				 (DIVISOR_CLK >> 1)) / DIVISOR_CLK;
+		*ir_b = (ir * tick * (1 << ir_s_calc) + (DIVISOR_CLK >> 1)) /
+			DIVISOR_CLK;
 	} else {
 		/* Increasing the numerator to select ir_u value */
 		u32 numerator;
@@ -104,7 +101,7 @@ static int hclge_shaper_para_calc(u32 ir, u8 shaper_level,
 		if (ir_calc == ir) {
 			*ir_b = 126;
 		} else {
-			u32 denominator = (DIVISOR_CLK * (1 << --ir_u_calc));
+			u32 denominator = DIVISOR_CLK * (1 << --ir_u_calc);
 			*ir_b = (ir * tick + (denominator >> 1)) / denominator;
 		}
 	}
-- 
2.7.4


^ permalink raw reply related

* [PATCH v2] rtl8xxxu: add bluetooth co-existence support for single antenna
From: Chris Chiu @ 2019-09-11  2:50 UTC (permalink / raw)
  To: Jes.Sorensen, kvalo, davem; +Cc: linux-wireless, netdev, linux-kernel, linux

The RTL8723BU suffers the wifi disconnection problem while bluetooth
device connected. While wifi is doing tx/rx, the bluetooth will scan
without results. This is due to the wifi and bluetooth share the same
single antenna for RF communication and they need to have a mechanism
to collaborate.

BT information is provided via the packet sent from co-processor to
host (C2H). It contains the status of BT but the rtl8723bu_handle_c2h
dose not really handle it. And there's no bluetooth coexistence
mechanism to deal with it.

This commit adds a workqueue to set the tdma configurations and
coefficient table per the parsed bluetooth link status and given
wifi connection state. The tdma/coef table comes from the vendor
driver code of the RTL8192EU and RTL8723BU. However, this commit is
only for single antenna scenario which RTL8192EU is default dual
antenna. The rtl8xxxu_parse_rxdesc24 which invokes the handle_c2h
is only for 8723b and 8192e so the mechanism is expected to work
on both chips with single antenna. Note RTL8192EU dual antenna is
not supported.

Signed-off-by: Chris Chiu <chiu@endlessm.com>
---

Notes:
  v2:
   - Add helper functions to replace bunch of tdma settings
   - Reformat some lines to meet kernel coding style


 .../net/wireless/realtek/rtl8xxxu/rtl8xxxu.h  |  37 +++
 .../realtek/rtl8xxxu/rtl8xxxu_8723b.c         |   2 -
 .../wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 262 +++++++++++++++++-
 3 files changed, 294 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
index 582c2a346cec..22e95b11bfbb 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
@@ -1220,6 +1220,37 @@ enum ratr_table_mode_new {
 	RATEID_IDX_BGN_3SS = 14,
 };
 
+#define BT_INFO_8723B_1ANT_B_FTP		BIT(7)
+#define BT_INFO_8723B_1ANT_B_A2DP		BIT(6)
+#define BT_INFO_8723B_1ANT_B_HID		BIT(5)
+#define BT_INFO_8723B_1ANT_B_SCO_BUSY		BIT(4)
+#define BT_INFO_8723B_1ANT_B_ACL_BUSY		BIT(3)
+#define BT_INFO_8723B_1ANT_B_INQ_PAGE		BIT(2)
+#define BT_INFO_8723B_1ANT_B_SCO_ESCO		BIT(1)
+#define BT_INFO_8723B_1ANT_B_CONNECTION	BIT(0)
+
+enum _BT_8723B_1ANT_STATUS {
+	BT_8723B_1ANT_STATUS_NON_CONNECTED_IDLE      = 0x0,
+	BT_8723B_1ANT_STATUS_CONNECTED_IDLE          = 0x1,
+	BT_8723B_1ANT_STATUS_INQ_PAGE                = 0x2,
+	BT_8723B_1ANT_STATUS_ACL_BUSY                = 0x3,
+	BT_8723B_1ANT_STATUS_SCO_BUSY                = 0x4,
+	BT_8723B_1ANT_STATUS_ACL_SCO_BUSY            = 0x5,
+	BT_8723B_1ANT_STATUS_MAX
+};
+
+struct rtl8xxxu_btcoex {
+	u8      bt_status;
+	bool	bt_busy;
+	bool	has_sco;
+	bool	has_a2dp;
+	bool    has_hid;
+	bool    has_pan;
+	bool	hid_only;
+	bool	a2dp_only;
+	bool    c2h_bt_inquiry;
+};
+
 #define RTL8XXXU_RATR_STA_INIT 0
 #define RTL8XXXU_RATR_STA_HIGH 1
 #define RTL8XXXU_RATR_STA_MID  2
@@ -1340,6 +1371,10 @@ struct rtl8xxxu_priv {
 	 */
 	struct ieee80211_vif *vif;
 	struct delayed_work ra_watchdog;
+	struct work_struct c2hcmd_work;
+	struct sk_buff_head c2hcmd_queue;
+	spinlock_t c2hcmd_lock;
+	struct rtl8xxxu_btcoex bt_coex;
 };
 
 struct rtl8xxxu_rx_urb {
@@ -1486,6 +1521,8 @@ void rtl8xxxu_fill_txdesc_v2(struct ieee80211_hw *hw, struct ieee80211_hdr *hdr,
 			     struct rtl8xxxu_txdesc32 *tx_desc32, bool sgi,
 			     bool short_preamble, bool ampdu_enable,
 			     u32 rts_rate);
+void rtl8723bu_set_ps_tdma(struct rtl8xxxu_priv *priv,
+			   u8 arg1, u8 arg2, u8 arg3, u8 arg4, u8 arg5);
 
 extern struct rtl8xxxu_fileops rtl8192cu_fops;
 extern struct rtl8xxxu_fileops rtl8192eu_fops;
diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
index ceffe05bd65b..9ba661b3d767 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8723b.c
@@ -1580,9 +1580,7 @@ static void rtl8723b_enable_rf(struct rtl8xxxu_priv *priv)
 	/*
 	 * Software control, antenna at WiFi side
 	 */
-#ifdef NEED_PS_TDMA
 	rtl8723bu_set_ps_tdma(priv, 0x08, 0x00, 0x00, 0x00, 0x00);
-#endif
 
 	rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x55555555);
 	rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0x55555555);
diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
index a6f358b9e447..e4c1b08c8070 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
@@ -3820,9 +3820,8 @@ void rtl8xxxu_power_off(struct rtl8xxxu_priv *priv)
 	rtl8xxxu_write8(priv, REG_RSV_CTRL, 0x0e);
 }
 
-#ifdef NEED_PS_TDMA
-static void rtl8723bu_set_ps_tdma(struct rtl8xxxu_priv *priv,
-				  u8 arg1, u8 arg2, u8 arg3, u8 arg4, u8 arg5)
+void rtl8723bu_set_ps_tdma(struct rtl8xxxu_priv *priv,
+			   u8 arg1, u8 arg2, u8 arg3, u8 arg4, u8 arg5)
 {
 	struct h2c_cmd h2c;
 
@@ -3835,7 +3834,6 @@ static void rtl8723bu_set_ps_tdma(struct rtl8xxxu_priv *priv,
 	h2c.b_type_dma.data5 = arg5;
 	rtl8xxxu_gen2_h2c_cmd(priv, &h2c, sizeof(h2c.b_type_dma));
 }
-#endif
 
 void rtl8xxxu_gen2_disable_rf(struct rtl8xxxu_priv *priv)
 {
@@ -5186,12 +5184,258 @@ static void rtl8xxxu_rx_urb_work(struct work_struct *work)
 	}
 }
 
+void rtl8723bu_set_coex_with_type(struct rtl8xxxu_priv *priv, u8 type)
+{
+	switch (type) {
+	case 0:
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x55555555);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0x55555555);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE3, 0x00ffffff);
+		rtl8xxxu_write8(priv, REG_BT_COEX_TABLE4, 0x03);
+		break;
+	case 1:
+	case 3:
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x55555555);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0x5a5a5a5a);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE3, 0x00ffffff);
+		rtl8xxxu_write8(priv, REG_BT_COEX_TABLE4, 0x03);
+		break;
+	case 2:
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x5a5a5a5a);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0x5a5a5a5a);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE3, 0x00ffffff);
+		rtl8xxxu_write8(priv, REG_BT_COEX_TABLE4, 0x03);
+		break;
+	case 4:
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x5a5a5a5a);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0xaaaa5a5a);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE3, 0x00ffffff);
+		rtl8xxxu_write8(priv, REG_BT_COEX_TABLE4, 0x03);
+		break;
+	case 5:
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x5a5a5a5a);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0xaa5a5a5a);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE3, 0x00ffffff);
+		rtl8xxxu_write8(priv, REG_BT_COEX_TABLE4, 0x03);
+		break;
+	case 6:
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0x55555555);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0xaaaaaaaa);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE3, 0x00ffffff);
+		rtl8xxxu_write8(priv, REG_BT_COEX_TABLE4, 0x03);
+		break;
+	case 7:
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE1, 0xaaaaaaaa);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE2, 0xaaaaaaaa);
+		rtl8xxxu_write32(priv, REG_BT_COEX_TABLE3, 0x00ffffff);
+		rtl8xxxu_write8(priv, REG_BT_COEX_TABLE4, 0x03);
+		break;
+	default:
+		break;
+	}
+}
+
+void rtl8723bu_update_bt_link_info(struct rtl8xxxu_priv *priv, u8 bt_info)
+{
+	struct rtl8xxxu_btcoex *btcoex = &priv->bt_coex;
+
+	if (bt_info & BT_INFO_8723B_1ANT_B_INQ_PAGE)
+		btcoex->c2h_bt_inquiry = true;
+	else
+		btcoex->c2h_bt_inquiry = false;
+
+	if (!(bt_info & BT_INFO_8723B_1ANT_B_CONNECTION)) {
+		btcoex->bt_status = BT_8723B_1ANT_STATUS_NON_CONNECTED_IDLE;
+		btcoex->has_sco = false;
+		btcoex->has_hid = false;
+		btcoex->has_pan = false;
+		btcoex->has_a2dp = false;
+	} else {
+		if ((bt_info & 0x1f) == BT_INFO_8723B_1ANT_B_CONNECTION)
+			btcoex->bt_status = BT_8723B_1ANT_STATUS_CONNECTED_IDLE;
+		else if ((bt_info & BT_INFO_8723B_1ANT_B_SCO_ESCO) ||
+			 (bt_info & BT_INFO_8723B_1ANT_B_SCO_BUSY))
+			btcoex->bt_status = BT_8723B_1ANT_STATUS_SCO_BUSY;
+		else if (bt_info & BT_INFO_8723B_1ANT_B_ACL_BUSY)
+			btcoex->bt_status = BT_8723B_1ANT_STATUS_ACL_BUSY;
+		else
+			btcoex->bt_status = BT_8723B_1ANT_STATUS_MAX;
+
+		if (bt_info & BT_INFO_8723B_1ANT_B_FTP)
+			btcoex->has_pan = true;
+		else
+			btcoex->has_pan = false;
+
+		if (bt_info & BT_INFO_8723B_1ANT_B_A2DP)
+			btcoex->has_a2dp = true;
+		else
+			btcoex->has_a2dp = false;
+
+		if (bt_info & BT_INFO_8723B_1ANT_B_HID)
+			btcoex->has_hid = true;
+		else
+			btcoex->has_hid = false;
+
+		if (bt_info & BT_INFO_8723B_1ANT_B_SCO_ESCO)
+			btcoex->has_sco = true;
+		else
+			btcoex->has_sco = false;
+	}
+
+	if (!btcoex->has_a2dp && !btcoex->has_sco &&
+	    !btcoex->has_pan && btcoex->has_hid)
+		btcoex->hid_only = true;
+	else
+		btcoex->hid_only = false;
+
+	if (!btcoex->has_sco && !btcoex->has_pan &&
+	    !btcoex->has_hid && btcoex->has_a2dp)
+		btcoex->has_a2dp = true;
+	else
+		btcoex->has_a2dp = false;
+
+	if (btcoex->bt_status == BT_8723B_1ANT_STATUS_SCO_BUSY ||
+	    btcoex->bt_status == BT_8723B_1ANT_STATUS_ACL_BUSY)
+		btcoex->bt_busy = true;
+	else
+		btcoex->bt_busy = false;
+}
+
+void rtl8723bu_handle_bt_inquiry(struct rtl8xxxu_priv *priv)
+{
+	struct ieee80211_vif *vif;
+	struct rtl8xxxu_btcoex *btcoex;
+	bool wifi_connected;
+
+	vif = priv->vif;
+	btcoex = &priv->bt_coex;
+	wifi_connected = (vif && vif->bss_conf.assoc);
+
+	if (!wifi_connected) {
+		rtl8723bu_set_ps_tdma(priv, 0x8, 0x0, 0x0, 0x0, 0x0);
+		rtl8723bu_set_coex_with_type(priv, 0);
+	} else if (btcoex->has_sco || btcoex->has_hid || btcoex->has_a2dp) {
+		rtl8723bu_set_ps_tdma(priv, 0x61, 0x35, 0x3, 0x11, 0x11);
+		rtl8723bu_set_coex_with_type(priv, 4);
+	} else if (btcoex->has_pan) {
+		rtl8723bu_set_ps_tdma(priv, 0x61, 0x3f, 0x3, 0x11, 0x11);
+		rtl8723bu_set_coex_with_type(priv, 4);
+	} else {
+		rtl8723bu_set_ps_tdma(priv, 0x8, 0x0, 0x0, 0x0, 0x0);
+		rtl8723bu_set_coex_with_type(priv, 7);
+	}
+}
+
+void rtl8723bu_handle_bt_info(struct rtl8xxxu_priv *priv)
+{
+	struct ieee80211_vif *vif;
+	struct rtl8xxxu_btcoex *btcoex;
+	bool wifi_connected;
+
+	vif = priv->vif;
+	btcoex = &priv->bt_coex;
+	wifi_connected = (vif && vif->bss_conf.assoc);
+
+	if (wifi_connected) {
+		u32 val32 = 0;
+		u32 high_prio_tx = 0, high_prio_rx = 0;
+
+		val32 = rtl8xxxu_read32(priv, 0x770);
+		high_prio_tx = val32 & 0x0000ffff;
+		high_prio_rx = (val32  & 0xffff0000) >> 16;
+
+		if (btcoex->bt_busy) {
+			if (btcoex->hid_only) {
+				rtl8723bu_set_ps_tdma(priv, 0x61, 0x20,
+						      0x3, 0x11, 0x11);
+				rtl8723bu_set_coex_with_type(priv, 5);
+			} else if (btcoex->a2dp_only) {
+				rtl8723bu_set_ps_tdma(priv, 0x61, 0x35,
+						      0x3, 0x11, 0x11);
+				rtl8723bu_set_coex_with_type(priv, 4);
+			} else if ((btcoex->has_a2dp && btcoex->has_pan) ||
+				   (btcoex->has_hid && btcoex->has_a2dp &&
+				    btcoex->has_pan)) {
+				rtl8723bu_set_ps_tdma(priv, 0x51, 0x21,
+						      0x3, 0x10, 0x10);
+				rtl8723bu_set_coex_with_type(priv, 4);
+			} else if (btcoex->has_hid && btcoex->has_a2dp) {
+				rtl8723bu_set_ps_tdma(priv, 0x51, 0x21,
+						      0x3, 0x10, 0x10);
+				rtl8723bu_set_coex_with_type(priv, 3);
+			} else {
+				rtl8723bu_set_ps_tdma(priv, 0x61, 0x35,
+						      0x3, 0x11, 0x11);
+				rtl8723bu_set_coex_with_type(priv, 4);
+			}
+		} else {
+			rtl8723bu_set_ps_tdma(priv, 0x8, 0x0, 0x0, 0x0, 0x0);
+			if (high_prio_tx + high_prio_rx <= 60)
+				rtl8723bu_set_coex_with_type(priv, 2);
+			else
+				rtl8723bu_set_coex_with_type(priv, 7);
+		}
+	} else {
+		rtl8723bu_set_ps_tdma(priv, 0x8, 0x0, 0x0, 0x0, 0x0);
+		rtl8723bu_set_coex_with_type(priv, 0);
+	}
+}
+
+static void rtl8xxxu_c2hcmd_callback(struct work_struct *work)
+{
+	struct rtl8xxxu_priv *priv;
+	struct rtl8723bu_c2h *c2h;
+	struct ieee80211_vif *vif;
+	struct device *dev;
+	struct sk_buff *skb = NULL;
+	unsigned long flags;
+	int len;
+	u8 bt_info = 0;
+	struct rtl8xxxu_btcoex *btcoex;
+
+	priv = container_of(work, struct rtl8xxxu_priv, c2hcmd_work);
+	vif = priv->vif;
+	btcoex = &priv->bt_coex;
+	dev = &priv->udev->dev;
+
+	if (priv->rf_paths > 1)
+		goto out;
+
+	while (!skb_queue_empty(&priv->c2hcmd_queue)) {
+		spin_lock_irqsave(&priv->c2hcmd_lock, flags);
+		skb = __skb_dequeue(&priv->c2hcmd_queue);
+		spin_unlock_irqrestore(&priv->c2hcmd_lock, flags);
+
+		c2h = (struct rtl8723bu_c2h *)skb->data;
+		len = skb->len - 2;
+
+		switch (c2h->id) {
+		case C2H_8723B_BT_INFO:
+			bt_info = c2h->bt_info.bt_info;
+
+			rtl8723bu_update_bt_link_info(priv, bt_info);
+			if (btcoex->c2h_bt_inquiry) {
+				rtl8723bu_handle_bt_inquiry(priv);
+				break;
+			}
+			rtl8723bu_handle_bt_info(priv);
+			break;
+		default:
+			break;
+		}
+	}
+
+out:
+	dev_kfree_skb(skb);
+}
+
 static void rtl8723bu_handle_c2h(struct rtl8xxxu_priv *priv,
 				 struct sk_buff *skb)
 {
 	struct rtl8723bu_c2h *c2h = (struct rtl8723bu_c2h *)skb->data;
 	struct device *dev = &priv->udev->dev;
 	int len;
+	unsigned long flags;
 
 	len = skb->len - 2;
 
@@ -5229,6 +5473,12 @@ static void rtl8723bu_handle_c2h(struct rtl8xxxu_priv *priv,
 			       16, 1, c2h->raw.payload, len, false);
 		break;
 	}
+
+	spin_lock_irqsave(&priv->c2hcmd_lock, flags);
+	__skb_queue_tail(&priv->c2hcmd_queue, skb);
+	spin_unlock_irqrestore(&priv->c2hcmd_lock, flags);
+
+	schedule_work(&priv->c2hcmd_work);
 }
 
 int rtl8xxxu_parse_rxdesc16(struct rtl8xxxu_priv *priv, struct sk_buff *skb)
@@ -5353,7 +5603,6 @@ int rtl8xxxu_parse_rxdesc24(struct rtl8xxxu_priv *priv, struct sk_buff *skb)
 		struct device *dev = &priv->udev->dev;
 		dev_dbg(dev, "%s: C2H packet\n", __func__);
 		rtl8723bu_handle_c2h(priv, skb);
-		dev_kfree_skb(skb);
 		return RX_TYPE_C2H;
 	}
 
@@ -6272,6 +6521,9 @@ static int rtl8xxxu_probe(struct usb_interface *interface,
 	spin_lock_init(&priv->rx_urb_lock);
 	INIT_WORK(&priv->rx_urb_wq, rtl8xxxu_rx_urb_work);
 	INIT_DELAYED_WORK(&priv->ra_watchdog, rtl8xxxu_watchdog_callback);
+	spin_lock_init(&priv->c2hcmd_lock);
+	INIT_WORK(&priv->c2hcmd_work, rtl8xxxu_c2hcmd_callback);
+	skb_queue_head_init(&priv->c2hcmd_queue);
 
 	usb_set_intfdata(interface, hw);
 
-- 
2.20.1


^ permalink raw reply related

* Re: [RFC PATCH 3/4] virtio: introudce a mdev based transport
From: Jason Wang @ 2019-09-11  2:52 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: mst, kvm, virtualization, netdev, linux-kernel, kwankhede,
	alex.williamson, cohuck, maxime.coquelin, cunming.liang,
	zhihong.wang, rob.miller, idos, xiao.w.wang, haotian.wang
In-Reply-To: <20190911014726.GA14798@___>


On 2019/9/11 上午9:47, Tiwei Bie wrote:
> On Tue, Sep 10, 2019 at 04:19:34PM +0800, Jason Wang wrote:
>> This path introduces a new mdev transport for virtio. This is used to
>> use kernel virtio driver to drive the mediated device that is capable
>> of populating virtqueue directly.
>>
>> A new virtio-mdev driver will be registered to the mdev bus, when a
>> new virtio-mdev device is probed, it will register the device with
>> mdev based config ops. This means, unlike the exist hardware
>> transport, this is a software transport between mdev driver and mdev
>> device. The transport was implemented through:
>>
>> - configuration access was implemented through parent_ops->read()/write()
>> - vq/config callback was implemented through parent_ops->ioctl()
>>
>> This transport is derived from virtio MMIO protocol and was wrote for
>> kernel driver. But for the transport itself, but the design goal is to
>> be generic enough to support userspace driver (this part will be added
>> in the future).
>>
>> Note:
>> - current mdev assume all the parameter of parent_ops was from
>>    userspace. This prevents us from implementing the kernel mdev
>>    driver. For a quick POC, this patch just abuse those parameter and
>>    assume the mdev device implementation will treat them as kernel
>>    pointer. This should be addressed in the formal series by extending
>>    mdev_parent_ops.
>> - for a quick POC, I just drive the transport from MMIO, I'm pretty
>>    there's lot of optimization space for this.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vfio/mdev/Kconfig        |   7 +
>>   drivers/vfio/mdev/Makefile       |   1 +
>>   drivers/vfio/mdev/virtio_mdev.c  | 500 +++++++++++++++++++++++++++++++
>>   include/uapi/linux/virtio_mdev.h | 131 ++++++++
>>   4 files changed, 639 insertions(+)
>>   create mode 100644 drivers/vfio/mdev/virtio_mdev.c
>>   create mode 100644 include/uapi/linux/virtio_mdev.h
>>
> [...]
>> diff --git a/include/uapi/linux/virtio_mdev.h b/include/uapi/linux/virtio_mdev.h
>> new file mode 100644
>> index 000000000000..8040de6b960a
>> --- /dev/null
>> +++ b/include/uapi/linux/virtio_mdev.h
>> @@ -0,0 +1,131 @@
>> +/*
>> + * Virtio mediated device driver
>> + *
>> + * Copyright 2019, Red Hat Corp.
>> + *
>> + * Based on Virtio MMIO driver by ARM Ltd, copyright ARM Ltd. 2011
>> + *
>> + * This header is BSD licensed so anyone can use the definitions to implement
>> + * compatible drivers/servers.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + * 1. Redistributions of source code must retain the above copyright
>> + *    notice, this list of conditions and the following disclaimer.
>> + * 2. Redistributions in binary form must reproduce the above copyright
>> + *    notice, this list of conditions and the following disclaimer in the
>> + *    documentation and/or other materials provided with the distribution.
>> + * 3. Neither the name of IBM nor the names of its contributors
>> + *    may be used to endorse or promote products derived from this software
>> + *    without specific prior written permission.
>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
>> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
>> + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
>> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
>> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
>> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
>> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
>> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
>> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
>> + * SUCH DAMAGE.
>> + */
>> +#ifndef _LINUX_VIRTIO_MDEV_H
>> +#define _LINUX_VIRTIO_MDEV_H
>> +
>> +#include <linux/interrupt.h>
>> +#include <linux/vringh.h>
>> +#include <uapi/linux/virtio_net.h>
>> +
>> +/*
>> + * Ioctls
>> + */
>> +
>> +struct virtio_mdev_callback {
>> +	irqreturn_t (*callback)(void *);
>> +	void *private;
>> +};
>> +
>> +#define VIRTIO_MDEV 0xAF
>> +#define VIRTIO_MDEV_SET_VQ_CALLBACK _IOW(VIRTIO_MDEV, 0x00, \
>> +					 struct virtio_mdev_callback)
>> +#define VIRTIO_MDEV_SET_CONFIG_CALLBACK _IOW(VIRTIO_MDEV, 0x01, \
>> +					struct virtio_mdev_callback)
>> +
>> +#define VIRTIO_MDEV_DEVICE_API_STRING		"virtio-mdev"
>> +
>> +/*
>> + * Control registers
>> + */
>> +
>> +/* Magic value ("virt" string) - Read Only */
>> +#define VIRTIO_MDEV_MAGIC_VALUE		0x000
>> +
>> +/* Virtio device version - Read Only */
>> +#define VIRTIO_MDEV_VERSION		0x004
>> +
>> +/* Virtio device ID - Read Only */
>> +#define VIRTIO_MDEV_DEVICE_ID		0x008
>> +
>> +/* Virtio vendor ID - Read Only */
>> +#define VIRTIO_MDEV_VENDOR_ID		0x00c
>> +
>> +/* Bitmask of the features supported by the device (host)
>> + * (32 bits per set) - Read Only */
>> +#define VIRTIO_MDEV_DEVICE_FEATURES	0x010
>> +
>> +/* Device (host) features set selector - Write Only */
>> +#define VIRTIO_MDEV_DEVICE_FEATURES_SEL	0x014
>> +
>> +/* Bitmask of features activated by the driver (guest)
>> + * (32 bits per set) - Write Only */
>> +#define VIRTIO_MDEV_DRIVER_FEATURES	0x020
>> +
>> +/* Activated features set selector - Write Only */
>> +#define VIRTIO_MDEV_DRIVER_FEATURES_SEL	0x024
>> +
>> +/* Queue selector - Write Only */
>> +#define VIRTIO_MDEV_QUEUE_SEL		0x030
>> +
>> +/* Maximum size of the currently selected queue - Read Only */
>> +#define VIRTIO_MDEV_QUEUE_NUM_MAX	0x034
>> +
>> +/* Queue size for the currently selected queue - Write Only */
>> +#define VIRTIO_MDEV_QUEUE_NUM		0x038
>> +
>> +/* Ready bit for the currently selected queue - Read Write */
>> +#define VIRTIO_MDEV_QUEUE_READY		0x044
>> +
>> +/* Alignment of virtqueue - Read Only */
>> +#define VIRTIO_MDEV_QUEUE_ALIGN		0x048
>> +
>> +/* Queue notifier - Write Only */
>> +#define VIRTIO_MDEV_QUEUE_NOTIFY	0x050
>> +
>> +/* Device status register - Read Write */
>> +#define VIRTIO_MDEV_STATUS		0x060
>> +
>> +/* Selected queue's Descriptor Table address, 64 bits in two halves */
>> +#define VIRTIO_MDEV_QUEUE_DESC_LOW	0x080
>> +#define VIRTIO_MDEV_QUEUE_DESC_HIGH	0x084
>> +
>> +/* Selected queue's Available Ring address, 64 bits in two halves */
>> +#define VIRTIO_MDEV_QUEUE_AVAIL_LOW	0x090
>> +#define VIRTIO_MDEV_QUEUE_AVAIL_HIGH	0x094
>> +
>> +/* Selected queue's Used Ring address, 64 bits in two halves */
>> +#define VIRTIO_MDEV_QUEUE_USED_LOW	0x0a0
>> +#define VIRTIO_MDEV_QUEUE_USED_HIGH	0x0a4
>> +
>> +/* Configuration atomicity value */
>> +#define VIRTIO_MDEV_CONFIG_GENERATION	0x0fc
>> +
>> +/* The config space is defined by each driver as
>> + * the per-driver configuration space - Read Write */
>> +#define VIRTIO_MDEV_CONFIG		0x100
> IIUC, we can use above registers with virtio-mdev parent's
> read()/write() to access the mdev device from kernel driver.
> As you suggested, it's a choice to build vhost-mdev on top
> of this abstraction as well. But virtio is the frontend
> device which lacks some vhost backend features, e.g. get
> vring base, set vring base, negotiate vhost features, etc.
> So I'm wondering, does it make sense to reserve some space
> for vhost-mdev in kernel to do vhost backend specific setups?
> Or do you have any other thoughts?


Good point. It's just a quick POC, I plan to add vhost features in the 
next release:

1) set/get virtqueue state (e.g vring base)
2) dirty page tracking
3) for vhost features, if you mean _F_LOG_ALL, it could be done by 2), 
for IOTLB, it requires more thought on the API since current IOTLB API 
is no implemented through ioctl ...


>
> Besides, I'm also wondering, what's the purpose of making
> above registers part of UAPI?


Sorry if it brings confusion. If we do vhost-mdev on top of this, 
there's no need for this to go for UAPI.


> And if we make them part
> of UAPI, do we also need to make them part of virtio spec?


Yes, but I do not think we should put it into UAPI. Technically, 
userspace can use this transport as well with some modification on the 
interrupt part, e.g using eventfd. But is there any value for doing this 
instead of vhost?

Thanks


>
> Thanks!
> Tiwei
>
>> +
>> +#endif
>> +
>> +
>> +/* Ready bit for the currently selected queue - Read Write */
>> -- 
>> 2.19.1
>>

^ permalink raw reply

* Re: [PATCH v4] tun: fix use-after-free when register netdev failed
From: Jason Wang @ 2019-09-11  2:57 UTC (permalink / raw)
  To: Yang Yingliang, netdev; +Cc: eric.dumazet, xiyou.wangcong, davem, weiyongjun1
In-Reply-To: <1568113017-79840-1-git-send-email-yangyingliang@huawei.com>


On 2019/9/10 下午6:56, Yang Yingliang wrote:
> I got a UAF repport in tun driver when doing fuzzy test:
>
> [  466.269490] ==================================================================
> [  466.271792] BUG: KASAN: use-after-free in tun_chr_read_iter+0x2ca/0x2d0
> [  466.271806] Read of size 8 at addr ffff888372139250 by task tun-test/2699
> [  466.271810]
> [  466.271824] CPU: 1 PID: 2699 Comm: tun-test Not tainted 5.3.0-rc1-00001-g5a9433db2614-dirty #427
> [  466.271833] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [  466.271838] Call Trace:
> [  466.271858]  dump_stack+0xca/0x13e
> [  466.271871]  ? tun_chr_read_iter+0x2ca/0x2d0
> [  466.271890]  print_address_description+0x79/0x440
> [  466.271906]  ? vprintk_func+0x5e/0xf0
> [  466.271920]  ? tun_chr_read_iter+0x2ca/0x2d0
> [  466.271935]  __kasan_report+0x15c/0x1df
> [  466.271958]  ? tun_chr_read_iter+0x2ca/0x2d0
> [  466.271976]  kasan_report+0xe/0x20
> [  466.271987]  tun_chr_read_iter+0x2ca/0x2d0
> [  466.272013]  do_iter_readv_writev+0x4b7/0x740
> [  466.272032]  ? default_llseek+0x2d0/0x2d0
> [  466.272072]  do_iter_read+0x1c5/0x5e0
> [  466.272110]  vfs_readv+0x108/0x180
> [  466.299007]  ? compat_rw_copy_check_uvector+0x440/0x440
> [  466.299020]  ? fsnotify+0x888/0xd50
> [  466.299040]  ? __fsnotify_parent+0xd0/0x350
> [  466.299064]  ? fsnotify_first_mark+0x1e0/0x1e0
> [  466.304548]  ? vfs_write+0x264/0x510
> [  466.304569]  ? ksys_write+0x101/0x210
> [  466.304591]  ? do_preadv+0x116/0x1a0
> [  466.304609]  do_preadv+0x116/0x1a0
> [  466.309829]  do_syscall_64+0xc8/0x600
> [  466.309849]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  466.309861] RIP: 0033:0x4560f9
> [  466.309875] Code: 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> [  466.309889] RSP: 002b:00007ffffa5166e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000127
> [  466.322992] RAX: ffffffffffffffda RBX: 0000000000400460 RCX: 00000000004560f9
> [  466.322999] RDX: 0000000000000003 RSI: 00000000200008c0 RDI: 0000000000000003
> [  466.323007] RBP: 00007ffffa516700 R08: 0000000000000004 R09: 0000000000000000
> [  466.323014] R10: 0000000000000000 R11: 0000000000000206 R12: 000000000040cb10
> [  466.323021] R13: 0000000000000000 R14: 00000000006d7018 R15: 0000000000000000
> [  466.323057]
> [  466.323064] Allocated by task 2605:
> [  466.335165]  save_stack+0x19/0x80
> [  466.336240]  __kasan_kmalloc.constprop.8+0xa0/0xd0
> [  466.337755]  kmem_cache_alloc+0xe8/0x320
> [  466.339050]  getname_flags+0xca/0x560
> [  466.340229]  user_path_at_empty+0x2c/0x50
> [  466.341508]  vfs_statx+0xe6/0x190
> [  466.342619]  __do_sys_newstat+0x81/0x100
> [  466.343908]  do_syscall_64+0xc8/0x600
> [  466.345303]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  466.347034]
> [  466.347517] Freed by task 2605:
> [  466.348471]  save_stack+0x19/0x80
> [  466.349476]  __kasan_slab_free+0x12e/0x180
> [  466.350726]  kmem_cache_free+0xc8/0x430
> [  466.351874]  putname+0xe2/0x120
> [  466.352921]  filename_lookup+0x257/0x3e0
> [  466.354319]  vfs_statx+0xe6/0x190
> [  466.355498]  __do_sys_newstat+0x81/0x100
> [  466.356889]  do_syscall_64+0xc8/0x600
> [  466.358037]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  466.359567]
> [  466.360050] The buggy address belongs to the object at ffff888372139100
> [  466.360050]  which belongs to the cache names_cache of size 4096
> [  466.363735] The buggy address is located 336 bytes inside of
> [  466.363735]  4096-byte region [ffff888372139100, ffff88837213a100)
> [  466.367179] The buggy address belongs to the page:
> [  466.368604] page:ffffea000dc84e00 refcount:1 mapcount:0 mapping:ffff8883df1b4f00 index:0x0 compound_mapcount: 0
> [  466.371582] flags: 0x2fffff80010200(slab|head)
> [  466.372910] raw: 002fffff80010200 dead000000000100 dead000000000122 ffff8883df1b4f00
> [  466.375209] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
> [  466.377778] page dumped because: kasan: bad access detected
> [  466.379730]
> [  466.380288] Memory state around the buggy address:
> [  466.381844]  ffff888372139100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  466.384009]  ffff888372139180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  466.386131] >ffff888372139200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  466.388257]                                                  ^
> [  466.390234]  ffff888372139280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  466.392512]  ffff888372139300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  466.394667] ==================================================================
>
> tun_chr_read_iter() accessed the memory which freed by free_netdev()
> called by tun_set_iff():
>
>          CPUA                                           CPUB
>    tun_set_iff()
>      alloc_netdev_mqs()
>      tun_attach()
>                                                    tun_chr_read_iter()
>                                                      tun_get()
>                                                      tun_do_read()
>                                                        tun_ring_recv()
>      register_netdevice() <-- inject error
>      goto err_detach
>      tun_detach_all() <-- set RCV_SHUTDOWN
>      free_netdev() <-- called from
>                       err_free_dev path
>        netdev_freemem() <-- free the memory
>                          without check refcount
>        (In this path, the refcount cannot prevent
>         freeing the memory of dev, and the memory
>         will be used by dev_put() called by
>         tun_chr_read_iter() on CPUB.)
>                                                       (Break from tun_ring_recv(),
>                                                       because RCV_SHUTDOWN is set)
>                                                     tun_put()
>                                                       dev_put() <-- use the memory
>                                                                     freed by netdev_freemem()
>
> Put the publishing of tfile->tun after register_netdevice(),
> so tun_get() won't get the tun pointer that freed by
> err_detach path if register_netdevice() failed.
>
> Fixes: eb0fb363f920 ("tuntap: attach queue 0 before registering netdevice")
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Suggested-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> ---
>   drivers/net/tun.c | 16 +++++++++++-----
>   1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index db16d7a13e00..aab0be40d443 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -787,7 +787,8 @@ static void tun_detach_all(struct net_device *dev)
>   }
>   
>   static int tun_attach(struct tun_struct *tun, struct file *file,
> -		      bool skip_filter, bool napi, bool napi_frags)
> +		      bool skip_filter, bool napi, bool napi_frags,
> +		      bool publish_tun)
>   {
>   	struct tun_file *tfile = file->private_data;
>   	struct net_device *dev = tun->dev;
> @@ -870,7 +871,8 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
>   	 * initialized tfile; otherwise we risk using half-initialized
>   	 * object.
>   	 */
> -	rcu_assign_pointer(tfile->tun, tun);
> +	if (publish_tun)
> +		rcu_assign_pointer(tfile->tun, tun);
>   	rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile);
>   	tun->numqueues++;
>   	tun_set_real_num_queues(tun);
> @@ -2730,7 +2732,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>   
>   		err = tun_attach(tun, file, ifr->ifr_flags & IFF_NOFILTER,
>   				 ifr->ifr_flags & IFF_NAPI,
> -				 ifr->ifr_flags & IFF_NAPI_FRAGS);
> +				 ifr->ifr_flags & IFF_NAPI_FRAGS, true);
>   		if (err < 0)
>   			return err;
>   
> @@ -2829,13 +2831,17 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>   
>   		INIT_LIST_HEAD(&tun->disabled);
>   		err = tun_attach(tun, file, false, ifr->ifr_flags & IFF_NAPI,
> -				 ifr->ifr_flags & IFF_NAPI_FRAGS);
> +				 ifr->ifr_flags & IFF_NAPI_FRAGS, false);
>   		if (err < 0)
>   			goto err_free_flow;
>   
>   		err = register_netdevice(tun->dev);
>   		if (err < 0)
>   			goto err_detach;
> +		/* free_netdev() won't check refcnt, to aovid race


A typo that comes from my original patch "avoid".

Other  looks good.

Thanks


> +		 * with dev_put() we need publish tun after registration.
> +		 */
> +		rcu_assign_pointer(tfile->tun, tun);
>   	}
>   
>   	netif_carrier_on(tun->dev);
> @@ -2978,7 +2984,7 @@ static int tun_set_queue(struct file *file, struct ifreq *ifr)
>   		if (ret < 0)
>   			goto unlock;
>   		ret = tun_attach(tun, file, false, tun->flags & IFF_NAPI,
> -				 tun->flags & IFF_NAPI_FRAGS);
> +				 tun->flags & IFF_NAPI_FRAGS, true);
>   	} else if (ifr->ifr_flags & IFF_DETACH_QUEUE) {
>   		tun = rtnl_dereference(tfile->tun);
>   		if (!tun || !(tun->flags & IFF_MULTI_QUEUE) || tfile->detached)

^ permalink raw reply

* Re: [RFC PATCH 3/4] virtio: introudce a mdev based transport
From: Tiwei Bie @ 2019-09-11  3:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: mst, kvm, virtualization, netdev, linux-kernel, kwankhede,
	alex.williamson, cohuck, maxime.coquelin, cunming.liang,
	zhihong.wang, rob.miller, idos, xiao.w.wang, haotian.wang
In-Reply-To: <8428eec1-b53c-9efc-1260-4e5ce2651461@redhat.com>

On Wed, Sep 11, 2019 at 10:52:03AM +0800, Jason Wang wrote:
> On 2019/9/11 上午9:47, Tiwei Bie wrote:
> > On Tue, Sep 10, 2019 at 04:19:34PM +0800, Jason Wang wrote:
> > > This path introduces a new mdev transport for virtio. This is used to
> > > use kernel virtio driver to drive the mediated device that is capable
> > > of populating virtqueue directly.
> > > 
> > > A new virtio-mdev driver will be registered to the mdev bus, when a
> > > new virtio-mdev device is probed, it will register the device with
> > > mdev based config ops. This means, unlike the exist hardware
> > > transport, this is a software transport between mdev driver and mdev
> > > device. The transport was implemented through:
> > > 
> > > - configuration access was implemented through parent_ops->read()/write()
> > > - vq/config callback was implemented through parent_ops->ioctl()
> > > 
> > > This transport is derived from virtio MMIO protocol and was wrote for
> > > kernel driver. But for the transport itself, but the design goal is to
> > > be generic enough to support userspace driver (this part will be added
> > > in the future).
> > > 
> > > Note:
> > > - current mdev assume all the parameter of parent_ops was from
> > >    userspace. This prevents us from implementing the kernel mdev
> > >    driver. For a quick POC, this patch just abuse those parameter and
> > >    assume the mdev device implementation will treat them as kernel
> > >    pointer. This should be addressed in the formal series by extending
> > >    mdev_parent_ops.
> > > - for a quick POC, I just drive the transport from MMIO, I'm pretty
> > >    there's lot of optimization space for this.
> > > 
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >   drivers/vfio/mdev/Kconfig        |   7 +
> > >   drivers/vfio/mdev/Makefile       |   1 +
> > >   drivers/vfio/mdev/virtio_mdev.c  | 500 +++++++++++++++++++++++++++++++
> > >   include/uapi/linux/virtio_mdev.h | 131 ++++++++
> > >   4 files changed, 639 insertions(+)
> > >   create mode 100644 drivers/vfio/mdev/virtio_mdev.c
> > >   create mode 100644 include/uapi/linux/virtio_mdev.h
> > > 
> > [...]
> > > diff --git a/include/uapi/linux/virtio_mdev.h b/include/uapi/linux/virtio_mdev.h
> > > new file mode 100644
> > > index 000000000000..8040de6b960a
> > > --- /dev/null
> > > +++ b/include/uapi/linux/virtio_mdev.h
> > > @@ -0,0 +1,131 @@
> > > +/*
> > > + * Virtio mediated device driver
> > > + *
> > > + * Copyright 2019, Red Hat Corp.
> > > + *
> > > + * Based on Virtio MMIO driver by ARM Ltd, copyright ARM Ltd. 2011
> > > + *
> > > + * This header is BSD licensed so anyone can use the definitions to implement
> > > + * compatible drivers/servers.
> > > + *
> > > + * Redistribution and use in source and binary forms, with or without
> > > + * modification, are permitted provided that the following conditions
> > > + * are met:
> > > + * 1. Redistributions of source code must retain the above copyright
> > > + *    notice, this list of conditions and the following disclaimer.
> > > + * 2. Redistributions in binary form must reproduce the above copyright
> > > + *    notice, this list of conditions and the following disclaimer in the
> > > + *    documentation and/or other materials provided with the distribution.
> > > + * 3. Neither the name of IBM nor the names of its contributors
> > > + *    may be used to endorse or promote products derived from this software
> > > + *    without specific prior written permission.
> > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
> > > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > > + * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
> > > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > > + * SUCH DAMAGE.
> > > + */
> > > +#ifndef _LINUX_VIRTIO_MDEV_H
> > > +#define _LINUX_VIRTIO_MDEV_H
> > > +
> > > +#include <linux/interrupt.h>
> > > +#include <linux/vringh.h>
> > > +#include <uapi/linux/virtio_net.h>
> > > +
> > > +/*
> > > + * Ioctls
> > > + */
> > > +
> > > +struct virtio_mdev_callback {
> > > +	irqreturn_t (*callback)(void *);
> > > +	void *private;
> > > +};
> > > +
> > > +#define VIRTIO_MDEV 0xAF
> > > +#define VIRTIO_MDEV_SET_VQ_CALLBACK _IOW(VIRTIO_MDEV, 0x00, \
> > > +					 struct virtio_mdev_callback)
> > > +#define VIRTIO_MDEV_SET_CONFIG_CALLBACK _IOW(VIRTIO_MDEV, 0x01, \
> > > +					struct virtio_mdev_callback)
> > > +
> > > +#define VIRTIO_MDEV_DEVICE_API_STRING		"virtio-mdev"
> > > +
> > > +/*
> > > + * Control registers
> > > + */
> > > +
> > > +/* Magic value ("virt" string) - Read Only */
> > > +#define VIRTIO_MDEV_MAGIC_VALUE		0x000
> > > +
> > > +/* Virtio device version - Read Only */
> > > +#define VIRTIO_MDEV_VERSION		0x004
> > > +
> > > +/* Virtio device ID - Read Only */
> > > +#define VIRTIO_MDEV_DEVICE_ID		0x008
> > > +
> > > +/* Virtio vendor ID - Read Only */
> > > +#define VIRTIO_MDEV_VENDOR_ID		0x00c
> > > +
> > > +/* Bitmask of the features supported by the device (host)
> > > + * (32 bits per set) - Read Only */
> > > +#define VIRTIO_MDEV_DEVICE_FEATURES	0x010
> > > +
> > > +/* Device (host) features set selector - Write Only */
> > > +#define VIRTIO_MDEV_DEVICE_FEATURES_SEL	0x014
> > > +
> > > +/* Bitmask of features activated by the driver (guest)
> > > + * (32 bits per set) - Write Only */
> > > +#define VIRTIO_MDEV_DRIVER_FEATURES	0x020
> > > +
> > > +/* Activated features set selector - Write Only */
> > > +#define VIRTIO_MDEV_DRIVER_FEATURES_SEL	0x024
> > > +
> > > +/* Queue selector - Write Only */
> > > +#define VIRTIO_MDEV_QUEUE_SEL		0x030
> > > +
> > > +/* Maximum size of the currently selected queue - Read Only */
> > > +#define VIRTIO_MDEV_QUEUE_NUM_MAX	0x034
> > > +
> > > +/* Queue size for the currently selected queue - Write Only */
> > > +#define VIRTIO_MDEV_QUEUE_NUM		0x038
> > > +
> > > +/* Ready bit for the currently selected queue - Read Write */
> > > +#define VIRTIO_MDEV_QUEUE_READY		0x044
> > > +
> > > +/* Alignment of virtqueue - Read Only */
> > > +#define VIRTIO_MDEV_QUEUE_ALIGN		0x048
> > > +
> > > +/* Queue notifier - Write Only */
> > > +#define VIRTIO_MDEV_QUEUE_NOTIFY	0x050
> > > +
> > > +/* Device status register - Read Write */
> > > +#define VIRTIO_MDEV_STATUS		0x060
> > > +
> > > +/* Selected queue's Descriptor Table address, 64 bits in two halves */
> > > +#define VIRTIO_MDEV_QUEUE_DESC_LOW	0x080
> > > +#define VIRTIO_MDEV_QUEUE_DESC_HIGH	0x084
> > > +
> > > +/* Selected queue's Available Ring address, 64 bits in two halves */
> > > +#define VIRTIO_MDEV_QUEUE_AVAIL_LOW	0x090
> > > +#define VIRTIO_MDEV_QUEUE_AVAIL_HIGH	0x094
> > > +
> > > +/* Selected queue's Used Ring address, 64 bits in two halves */
> > > +#define VIRTIO_MDEV_QUEUE_USED_LOW	0x0a0
> > > +#define VIRTIO_MDEV_QUEUE_USED_HIGH	0x0a4
> > > +
> > > +/* Configuration atomicity value */
> > > +#define VIRTIO_MDEV_CONFIG_GENERATION	0x0fc
> > > +
> > > +/* The config space is defined by each driver as
> > > + * the per-driver configuration space - Read Write */
> > > +#define VIRTIO_MDEV_CONFIG		0x100
> > IIUC, we can use above registers with virtio-mdev parent's
> > read()/write() to access the mdev device from kernel driver.
> > As you suggested, it's a choice to build vhost-mdev on top
> > of this abstraction as well. But virtio is the frontend
> > device which lacks some vhost backend features, e.g. get
> > vring base, set vring base, negotiate vhost features, etc.
> > So I'm wondering, does it make sense to reserve some space
> > for vhost-mdev in kernel to do vhost backend specific setups?
> > Or do you have any other thoughts?
> 
> 
> Good point. It's just a quick POC, I plan to add vhost features in the next
> release:
> 
> 1) set/get virtqueue state (e.g vring base)
> 2) dirty page tracking
> 3) for vhost features, if you mean _F_LOG_ALL, it could be done by 2), for
> IOTLB, it requires more thought on the API since current IOTLB API is no
> implemented through ioctl ...
> 
> 
> > 
> > Besides, I'm also wondering, what's the purpose of making
> > above registers part of UAPI?
> 
> 
> Sorry if it brings confusion. If we do vhost-mdev on top of this, there's no
> need for this to go for UAPI.
> 
> 
> > And if we make them part
> > of UAPI, do we also need to make them part of virtio spec?
> 
> 
> Yes, but I do not think we should put it into UAPI. Technically, userspace
> can use this transport as well with some modification on the interrupt part,
> e.g using eventfd. But is there any value for doing this instead of vhost?

Yeah, I agree. As we already have vhost, I don't see the value
to make virtio become "virtio + vhost".

Thanks,
Tiwei

^ permalink raw reply

* Re: [PATCH bpf-next v10 2/4] bpf: new helper to obtain namespace data from current task New bpf helper bpf_get_current_pidns_info.
From: Carlos Antonio Neira Bustos @ 2019-09-11  4:32 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Al Viro, netdev@vger.kernel.org, ebiederm@xmission.com,
	brouer@redhat.com, bpf@vger.kernel.org
In-Reply-To: <dadf3657-2648-14ef-35ee-e09efb2cdb3e@fb.com>

On Tue, Sep 10, 2019 at 10:35:09PM +0000, Yonghong Song wrote:
Thanks a lot Yonghong.
I'll include this patch when submitting changes for version 11 of
this patch. 
> 
> Carlos,
> 
> Discussed with Eric today for what is the best way to get
> the device number for a namespace. The following patch seems
> a reasonable start although Eric would like to see
> how the helper is used in order to decide whether the
> interface looks right.
> 
> commit bb00fc36d5d263047a8bceb3e51e969d7fbce7db (HEAD -> fs2)
> Author: Yonghong Song <yhs@fb.com>
> Date:   Mon Sep 9 21:50:51 2019 -0700
> 
>      nsfs: add an interface function ns_get_inum_dev()
> 
>      This patch added an interface function
>      ns_get_inum_dev(). Given a ns_common structure,
>      the function returns the inode and device
>      numbers. The function will be used later
>      by a newly added bpf helper.
> 
>      Signed-off-by: Yonghong Song <yhs@fb.com>
> 
> diff --git a/fs/nsfs.c b/fs/nsfs.c
> index a0431642c6b5..a603c6fc3f54 100644
> --- a/fs/nsfs.c
> +++ b/fs/nsfs.c
> @@ -245,6 +245,14 @@ struct file *proc_ns_fget(int fd)
>          return ERR_PTR(-EINVAL);
>   }
> 
> +/* Get the device number for the current task pidns.
> + */
> +void ns_get_inum_dev(struct ns_common *ns, u32 *inum, dev_t *dev)
> +{
> +       *inum = ns->inum;
> +       *dev = nsfs_mnt->mnt_sb->s_dev;
> +}
> +
>   static int nsfs_show_path(struct seq_file *seq, struct dentry *dentry)
>   {
>          struct inode *inode = d_inode(dentry);
> diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
> index d31cb6215905..b8fc680cdf1a 100644
> --- a/include/linux/proc_ns.h
> +++ b/include/linux/proc_ns.h
> @@ -81,6 +81,7 @@ extern void *ns_get_path(struct path *path, struct 
> task_struct *task,
>   typedef struct ns_common *ns_get_path_helper_t(void *);
>   extern void *ns_get_path_cb(struct path *path, ns_get_path_helper_t 
> ns_get_cb,
>                              void *private_data);
> +extern void ns_get_inum_dev(struct ns_common *ns, u32 *inum, dev_t *dev);
> 
>   extern int ns_get_name(char *buf, size_t size, struct task_struct *task,
>                          const struct proc_ns_operations *ns_ops);
> 
> Could you put the above change and patch #1 and then have
> all your other patches? In your kernel change, please use
> interface function ns_get_inum_dev() to get pidns inode number
> and dev number.
> 
> On 9/9/19 6:45 PM, Carlos Antonio Neira Bustos wrote:
> > Thanks a lot, Al Viro and Yonghong for taking the time to review this patch and
> > provide technical insights needed on this one.
> > But how do we move this forward?
> > Al Viro's review is clear that this will not work and we should strip the name
> > resolution code (thanks for your detailed analysis).
> > As there is currently only one instance of the nsfs device on the system,
> > I think we could leave out the retrieval of the pidns device number and address it
> > when the situation changes.
> > What do you think?
> > 
> > 
> > On Sat, Sep 07, 2019 at 06:34:39AM +0000, Yonghong Song wrote:
> >>
> >>
> >> On 9/6/19 5:10 PM, Al Viro wrote:
> >>> On Fri, Sep 06, 2019 at 11:21:14PM +0000, Yonghong Song wrote:
> >>>
> >>>> -bash-4.4$ readlink /proc/self/ns/pid
> >>>> pid:[4026531836]
> >>>> -bash-4.4$ stat /proc/self/ns/pid
> >>>>      File: ‘/proc/self/ns/pid’ -> ‘pid:[4026531836]’
> >>>>      Size: 0               Blocks: 0          IO Block: 1024   symbolic link
> >>>> Device: 4h/4d   Inode: 344795989   Links: 1
> >>>> Access: (0777/lrwxrwxrwx)  Uid: (128203/     yhs)   Gid: (  100/   users)
> >>>> Context: user_u:base_r:base_t
> >>>> Access: 2019-09-06 16:06:09.431616380 -0700
> >>>> Modify: 2019-09-06 16:06:09.431616380 -0700
> >>>> Change: 2019-09-06 16:06:09.431616380 -0700
> >>>>     Birth: -
> >>>> -bash-4.4$
> >>>>
> >>>> Based on a discussion with Eric Biederman back in 2019 Linux
> >>>> Plumbers, Eric suggested that to uniquely identify a
> >>>> namespace, device id (major/minor) number should also
> >>>> be included. Although today's kernel implementation
> >>>> has the same device for all namespace pseudo files,
> >>>> but from uapi perspective, device id should be included.
> >>>>
> >>>> That is the reason why we try to get device id which holds
> >>>> pid namespace pseudo file.
> >>>>
> >>>> Do you have a better suggestion on how to get
> >>>> the device id for 'current' pid namespace? Or from design, we
> >>>> really should not care about device id at all?
> >>>
> >>> What the hell is "device id for pid namespace"?  This is the
> >>> first time I've heard about that mystery object, so it's
> >>> hard to tell where it could be found.
> >>>
> >>> I can tell you what device numbers are involved in the areas
> >>> you seem to be looking in.
> >>>
> >>> 1) there's whatever device number that gets assigned to
> >>> (this) procfs instance.  That, ironically, _is_ per-pidns, but
> >>> that of the procfs instance, not that of your process (and
> >>> those can be different).  That's what you get in ->st_dev
> >>> when doing lstat() of anything in /proc (assuming that
> >>> procfs is mounted there, in the first place).  NOTE:
> >>> that's lstat(2), not stat(2).  stat(1) uses lstat(2),
> >>> unless given -L (in which case it's stat(2) time).  The
> >>> difference:
> >>>
> >>> root@kvm1:~# stat /proc/self/ns/pid
> >>>     File: /proc/self/ns/pid -> pid:[4026531836]
> >>>     Size: 0               Blocks: 0          IO Block: 1024   symbolic link
> >>> Device: 4h/4d   Inode: 17396       Links: 1
> >>> Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
> >>> Access: 2019-09-06 19:43:11.871312319 -0400
> >>> Modify: 2019-09-06 19:43:11.871312319 -0400
> >>> Change: 2019-09-06 19:43:11.871312319 -0400
> >>>    Birth: -
> >>> root@kvm1:~# stat -L /proc/self/ns/pid
> >>>     File: /proc/self/ns/pid
> >>>     Size: 0               Blocks: 0          IO Block: 4096   regular empty file
> >>> Device: 3h/3d   Inode: 4026531836  Links: 1
> >>> Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
> >>> Access: 2019-09-06 19:43:15.955313293 -0400
> >>> Modify: 2019-09-06 19:43:15.955313293 -0400
> >>> Change: 2019-09-06 19:43:15.955313293 -0400
> >>>    Birth: -
> >>>
> >>> The former is lstat, the latter - stat.
> >>>
> >>> 2) device number of the filesystem where the symlink target lives.
> >>> In this case, it's nsfs and there's only one instance on the entire
> >>> system.  _That_ would be obtained by looking at st_dev in stat(2) on
> >>> /proc/self/ns/pid (0:3 above).
> >>>
> >>> 3) device number *OF* the symlink.  That would be st_rdev in lstat(2).
> >>> There's none - it's a symlink, not a character or block device.  It's
> >>> always zero and always will be zero.
> >>>
> >>> 4) the same for the target; st_rdev in stat(2) results and again,
> >>> there's no such beast - it's neither character nor block device.
> >>>
> >>> Your code is looking at (3).  Please, reread any textbook on Unix
> >>> in the section that would cover stat(2) and discussion of the
> >>> difference between st_dev and st_rdev.
> >>>
> >>> I have no idea what Eric had been talking about - it's hard to
> >>> reconstruct by what you said so far.  Making nsfs per-userns,
> >>> perhaps?  But that makes no sense whatsoever, not that userns
> >>> ever had...  Cheap shots aside, I really can't guess what that's
> >>> about.  Sorry.
> >>
> >> Thanks for the detailed information. The device number we want
> >> is nsfs. Indeed, currently, there is only one instance
> >> on the entire system. But not exactly sure what is the possibility
> >> to have more than one nsfs device in the future. Maybe per-userns
> >> or any other criteria?
> >>
> >>>
> >>> In any case, pathname resolution is *NOT* for the situations where
> >>> you can't block.  Even if it's procfs (and from the same pidns as
> >>> the process) mounted there, there is no promise that the target
> >>> of /proc/self has already been looked up and not evicted from
> >>> memory since then.  And in case of cache miss pathwalk will
> >>> have to call ->lookup(), which requires locking the directory
> >>> (rw_sem, shared).  You can't do that in such context.
> >>>
> >>> And that doesn't even go into the possibility that process has
> >>> something very different mounted on /proc.
> >>>
> >>> Again, I don't know what it is that you want to get to, but
> >>> I would strongly recommend finding a way to get to that data
> >>> that would not involve going anywhere near pathname resolution.
> >>>
> >>> How would you expect the userland to work with that value,
> >>> whatever it might be?  If it's just a 32bit field that will
> >>> never be read, you might as well store there the same value
> >>> you store now (0, that is) in much cheaper and safer way ;-)
> >>
> >> Suppose inside pid namespace, user can pass the device number,
> >> say n1, (`stat -L /proc/self/ns/pid`) to bpf program (through map
> >> or JIT). At runtime, bpf program will try to get device number,
> >> say n2, for the 'current' process. If n1 is not the same as
> >> n2, that means they are not in the same namespace. 'current'
> >> is in the same pid namespace as the user iff
> >> n1 == n2 and also pidns id is the same for 'current' and
> >> the one with `lsns -t pid`.
> >>
> >> Are you aware of any way to get the pidns device number
> >> for 'current' without going through the pathname
> >> lookup?
> >>

^ permalink raw reply

* Re: [PATCH bpf-next v10 2/4] bpf: new helper to obtain namespace data from current task New bpf helper bpf_get_current_pidns_info.
From: Carlos Antonio Neira Bustos @ 2019-09-11  4:33 UTC (permalink / raw)
  To: Yonghong Song
  Cc: netdev@vger.kernel.org, ebiederm@xmission.com, brouer@redhat.com,
	bpf@vger.kernel.org
In-Reply-To: <c5ecd602-8b45-94bd-96e8-2264a88a3c09@fb.com>

Thanks for reviewing the rest of the code, 
I'll include these changes in ver 11.

Bests 
On Tue, Sep 10, 2019 at 10:46:45PM +0000, Yonghong Song wrote:
> 
> 
> On 9/6/19 4:09 PM, Carlos Neira wrote:
> > This helper(bpf_get_current_pidns_info) obtains the active namespace from
> > current and returns pid, tgid, device and namespace id as seen from that
> > namespace, allowing to instrument a process inside a container.
> > 
> > Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> > ---
> >   include/linux/bpf.h      |  1 +
> >   include/uapi/linux/bpf.h | 35 +++++++++++++++++++-
> >   kernel/bpf/core.c        |  1 +
> >   kernel/bpf/helpers.c     | 86 ++++++++++++++++++++++++++++++++++++++++++++++++
> >   kernel/trace/bpf_trace.c |  2 ++
> >   5 files changed, 124 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 5b9d22338606..819cb1c84be0 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1055,6 +1055,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
> >   extern const struct bpf_func_proto bpf_strtol_proto;
> >   extern const struct bpf_func_proto bpf_strtoul_proto;
> >   extern const struct bpf_func_proto bpf_tcp_sock_proto;
> > +extern const struct bpf_func_proto bpf_get_current_pidns_info_proto;
> >   
> >   /* Shared helpers among cBPF and eBPF. */
> >   void bpf_user_rnd_init_once(void);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index b5889257cc33..3ec9aa1438b7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2747,6 +2747,32 @@ union bpf_attr {
> >    *		**-EOPNOTSUPP** kernel configuration does not enable SYN cookies
> >    *
> >    *		**-EPROTONOSUPPORT** IP packet version is not 4 or 6
> > + *
> > + * int bpf_get_current_pidns_info(struct bpf_pidns_info *pidns, u32 size_of_pidns)
> > + *	Description
> > + *		Get tgid, pid and namespace id as seen by the current namespace,
> > + *		and device major/minor numbers from /proc/self/ns/pid. Such
> > + *		information is stored in *pidns* of size *size*.
> > + *
> > + *		This helper is used when pid filtering is needed inside a
> > + *		container as bpf_get_current_tgid() helper always returns the
> > + *		pid id as seen by the root namespace.
> > + *	Return
> > + *		0 on success
> > + *
> > + *		On failure, the returned value is one of the following:
> > + *
> > + *		**-EINVAL** if *size_of_pidns* is not valid or unable to get ns, pid
> > + *		or tgid of the current task.
> > + *
> > + *		**-ENOENT** if /proc/self/ns/pid does not exists.
> > + *
> > + *		**-ENOENT** if /proc/self/ns does not exists.
> > + *
> > + *		**-ENOMEM** if helper internal allocation fails.
> 
> -ENOMEM can be removed.
> 
> > + *
> > + *		**-EPERM** if not able to call helper.
> > + *
> >    */
> >   #define __BPF_FUNC_MAPPER(FN)		\
> >   	FN(unspec),			\
> > @@ -2859,7 +2885,8 @@ union bpf_attr {
> >   	FN(sk_storage_get),		\
> >   	FN(sk_storage_delete),		\
> >   	FN(send_signal),		\
> > -	FN(tcp_gen_syncookie),
> > +	FN(tcp_gen_syncookie),		\
> > +	FN(get_current_pidns_info),
> >   
> >   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >    * function eBPF program intends to call
> > @@ -3610,4 +3637,10 @@ struct bpf_sockopt {
> >   	__s32	retval;
> >   };
> >   
> > +struct bpf_pidns_info {
> > +	__u32 dev;	/* dev_t from /proc/self/ns/pid inode */
> 
>     /* dev_t of pid namespace pseudo file (typically /proc/seelf/ns/pid) 
> after following symbolic link */
> 
> > +	__u32 nsid;
> > +	__u32 tgid;
> > +	__u32 pid;
> > +};
> >   #endif /* _UAPI__LINUX_BPF_H__ */
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index 8191a7db2777..3159f2a0188c 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -2038,6 +2038,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
> >   const struct bpf_func_proto bpf_get_current_comm_proto __weak;
> >   const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
> >   const struct bpf_func_proto bpf_get_local_storage_proto __weak;
> > +const struct bpf_func_proto bpf_get_current_pidns_info __weak;
> >   
> >   const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
> >   {
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 5e28718928ca..8dbe6347893c 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -11,6 +11,11 @@
> >   #include <linux/uidgid.h>
> >   #include <linux/filter.h>
> >   #include <linux/ctype.h>
> > +#include <linux/pid_namespace.h>
> > +#include <linux/kdev_t.h>
> > +#include <linux/stat.h>
> > +#include <linux/namei.h>
> > +#include <linux/version.h>
> >   
> >   #include "../../lib/kstrtox.h"
> >   
> > @@ -312,6 +317,87 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
> >   	preempt_enable();
> >   }
> >   
> > +BPF_CALL_2(bpf_get_current_pidns_info, struct bpf_pidns_info *, pidns_info, u32,
> > +	 size)
> > +{
> > +	const char *pidns_path = "/proc/self/ns/pid";
> > +	struct pid_namespace *pidns = NULL;
> > +	struct filename *fname = NULL;
> > +	struct inode *inode;
> > +	struct path kp;
> > +	pid_t tgid = 0;
> > +	pid_t pid = 0;
> > +	int ret = -EINVAL;
> > +	int len;
> > +
> > +	if (unlikely(in_interrupt() ||
> > +			current->flags & (PF_KTHREAD | PF_EXITING)))
> > +		return -EPERM;
> > +
> > +	if (unlikely(size != sizeof(struct bpf_pidns_info)))
> > +		return -EINVAL;
> > +
> > +	pidns = task_active_pid_ns(current);
> > +	if (unlikely(!pidns))
> > +		return -ENOENT;
> > +
> > +	pidns_info->nsid =  pidns->ns.inum;
> > +	pid = task_pid_nr_ns(current, pidns);
> > +	if (unlikely(!pid))
> > +		goto clear;
> > +
> > +	tgid = task_tgid_nr_ns(current, pidns);
> > +	if (unlikely(!tgid))
> > +		goto clear;
> > +
> > +	pidns_info->tgid = (u32) tgid;
> > +	pidns_info->pid = (u32) pid;
> > +
> [...]
> > +	fname = kmem_cache_alloc(names_cachep, GFP_ATOMIC);
> > +	if (unlikely(!fname)) {
> > +		ret = -ENOMEM;
> > +		goto clear;
> > +	}
> > +	const size_t fnamesize = offsetof(struct filename, iname[1]);
> > +	struct filename *tmp;
> > +
> > +	tmp = kmalloc(fnamesize, GFP_ATOMIC);
> > +	if (unlikely(!tmp)) {
> > +		__putname(fname);
> > +		ret = -ENOMEM;
> > +		goto clear;
> > +	}
> > +
> > +	tmp->name = (char *)fname;
> > +	fname = tmp;
> > +	len = strlen(pidns_path) + 1;
> > +	memcpy((char *)fname->name, pidns_path, len);
> > +	fname->uptr = NULL;
> > +	fname->aname = NULL;
> > +	fname->refcnt = 1;
> > +
> > +	ret = filename_lookup(AT_FDCWD, fname, 0, &kp, NULL);
> > +	if (ret)
> > +		goto clear;
> > +
> > +	inode = d_backing_inode(kp.dentry);
> > +	pidns_info->dev = (u32)inode->i_rdev;
> The above can bee replaced with new nsfs interface function
> ns_get_inum_dev().
> > +
> > +	return 0;
> > +
> > +clear:
> > +	memset((void *)pidns_info, 0, (size_t) size);
> > +	return ret;
> > +}
> > +
> > +const struct bpf_func_proto bpf_get_current_pidns_info_proto = {
> > +	.func		= bpf_get_current_pidns_info,
> > +	.gpl_only	= false,
> > +	.ret_type	= RET_INTEGER,
> > +	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
> > +	.arg2_type	= ARG_CONST_SIZE,
> > +};
> > +
> >   #ifdef CONFIG_CGROUPS
> >   BPF_CALL_0(bpf_get_current_cgroup_id)
> >   {
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index ca1255d14576..5e1dc22765a5 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -709,6 +709,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >   #endif
> >   	case BPF_FUNC_send_signal:
> >   		return &bpf_send_signal_proto;
> > +	case BPF_FUNC_get_current_pidns_info:
> > +		return &bpf_get_current_pidns_info_proto;
> >   	default:
> >   		return NULL;
> >   	}
> > 

^ permalink raw reply

* Re: [PATCH] net: stmmac: socfpga: re-use the `interface` parameter from platform data
From: Ardelean, Alexandru @ 2019-09-11  5:03 UTC (permalink / raw)
  To: davem@davemloft.net
  Cc: linux-stm32@st-md-mailman.stormreply.com, joabreu@synopsys.com,
	linux-kernel@vger.kernel.org, mcoquelin.stm32@gmail.com,
	netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	peppe.cavallaro@st.com, alexandre.torgue@st.com
In-Reply-To: <20190910.174653.800353422834551780.davem@davemloft.net>

On Tue, 2019-09-10 at 17:46 +0200, David Miller wrote:
> [External]
> 
> From: David Miller <davem@davemloft.net>
> Date: Tue, 10 Sep 2019 17:45:44 +0200 (CEST)
> 
> > From: Alexandru Ardelean <alexandru.ardelean@analog.com>
> > Date: Fri, 6 Sep 2019 15:30:54 +0300
> > 
> > > The socfpga sub-driver defines an `interface` field in the `socfpga_dwmac`
> > > struct and parses it on init.
> > > 
> > > The shared `stmmac_probe_config_dt()` function also parses this from the
> > > device-tree and makes it available on the returned `plat_data` (which is
> > > the same data available via `netdev_priv()`).
> > > 
> > > All that's needed now is to dig that information out, via some
> > > `dev_get_drvdata()` && `netdev_priv()` calls and re-use it.
> > > 
> > > Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
> > 
> > This doesn't build even on net-next.
> 

Right.
My bad.

I think I got confused with multiple/cross-testing and probably this change didn't even get compiled.

Apologies for this.
Will send a good version.

Alex

> Specifically:
> 
> drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c: In function ‘socfpga_gen5_set_phy_mode’:
> drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c:264:44: error: ‘phymode’ undeclared (first use in this function);
> did you mean ‘phy_modes’?
>   264 |   dev_err(dwmac->dev, "bad phy mode %d\n", phymode);
>       |                                            ^~~~~~~
> ./include/linux/device.h:1499:32: note: in definition of macro ‘dev_err’
>  1499 |  _dev_err(dev, dev_fmt(fmt), ##__VA_ARGS__)
>       |                                ^~~~~~~~~~~
> drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c:264:44: note: each undeclared identifier is reported only once for
> each function it appears in
>   264 |   dev_err(dwmac->dev, "bad phy mode %d\n", phymode);
>       |                                            ^~~~~~~
> ./include/linux/device.h:1499:32: note: in definition of macro ‘dev_err’
>  1499 |  _dev_err(dev, dev_fmt(fmt), ##__VA_ARGS__)
>       |                                ^~~~~~~~~~~
> drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c: In function ‘socfpga_gen10_set_phy_mode’:
> drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c:340:6: error: ‘phymode’ undeclared (first use in this function);
> did you mean ‘phy_modes’?
>   340 |      phymode == PHY_INTERFACE_MODE_MII ||
>       |      ^~~~~~~
>       |      phy_modes

^ permalink raw reply

* RE: VRF Issue Since kernel 5
From: Gowen @ 2019-09-11  5:09 UTC (permalink / raw)
  To: David Ahern, Alexis Bauvin; +Cc: netdev@vger.kernel.org
In-Reply-To: <db3f6cd0-aa28-0883-715c-3e1eaeb7fd1e@gmail.com>

Thanks for the link - that's really useful. I did re-order ip rules Friday (I think) - no change

-----Original Message-----
From: David Ahern <dsahern@gmail.com> 
Sent: 10 September 2019 17:36
To: Alexis Bauvin <abauvin@online.net>; Gowen <gowen@potatocomputing.co.uk>
Cc: netdev@vger.kernel.org
Subject: Re: VRF Issue Since kernel 5

On 9/9/19 1:01 PM, Alexis Bauvin wrote:
> Could you try swapping the local and l3mdev rules?
> 
> `ip rule del pref 0; ip rule add from all lookup local pref 1001`

yes, the rules should be re-ordered so that local rule is after l3mdev rule (VRF is implemented as policy routing). In general, I would reverse the order of those commands to ensure no breakage.

Also, 5.0 I think it was (too many kernel versions) added a new l3mdev sysctl (raw_l3mdev_accept). Check all 3 of them and nmake sure they are set properly for your use case.

These slides do not cover 5.0 changes but are still the best collection of notes on VRF:
http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf

^ permalink raw reply

* Re: linux-next: Signed-off-by missing for commit in the net-next tree
From: Luciano Coelho @ 2019-09-11  5:24 UTC (permalink / raw)
  To: Stephen Rothwell, David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Alex Malamud
In-Reply-To: <20190911004229.74d2763a@canb.auug.org.au>

On Wed, 2019-09-11 at 00:42 +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Commit
> 
>   aa43ae121675 ("iwlwifi: LTR updates")
> 
> is missing a Signed-off-by from its committer.

Oops, that was my fault.  What can we do about it? Is it enough if I
give my s-o-b publicly here?

I hereby sign off this change:

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>


--
Cheers,
Luca.


^ permalink raw reply

* Re: [PATCH v3 1/2] PTP: introduce new versions of IOCTLs
From: Felipe Balbi @ 2019-09-11  6:08 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Christopher S Hall, netdev, linux-kernel
In-Reply-To: <20190910154452.GB4016@localhost>

[-- Attachment #1: Type: text/plain, Size: 705 bytes --]


Hi,

Richard Cochran <richardcochran@gmail.com> writes:
> On Mon, Sep 09, 2019 at 10:59:39AM +0300, Felipe Balbi wrote:
>
>>  	case PTP_PEROUT_REQUEST:
>> +	case PTP_PEROUT_REQUEST2:
>
> ...
>
>> +		if (((req.perout.flags & ~PTP_PEROUT_VALID_FLAGS) ||
>> +			req.perout.rsv[0] || req.perout.rsv[1] ||
>> +			req.perout.rsv[2] || req.perout.rsv[3]) &&
>> +			cmd == PTP_PEROUT_REQUEST2) {
>> +			err = -EINVAL;
>> +			break;
>
> ...
>
>> +/*
>> + * Bits of the ptp_perout_request.flags field:
>> + */
>> +#define PTP_PEROUT_VALID_FLAGS (~0)
>
> I think you meant (0) here, or I am confused...

Argh. What a brain fart!

Sorry about that. I'll go fix that ASAP.

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply

* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
From: Steffen Klassert @ 2019-09-11  6:15 UTC (permalink / raw)
  To: Michael Marley; +Cc: Shannon Nelson, netdev, Jeff Kirsher
In-Reply-To: <6ab15854-154a-2c7c-b429-7ba6dfe785ae@michaelmarley.com>

On Tue, Sep 10, 2019 at 06:53:30PM -0400, Michael Marley wrote:
> 
> StrongSwan has hardware offload disabled by default, and I didn't enable
> it explicitly.  I also already tried turning off all those switches with
> ethtool and it has no effect.  This doesn't surprise me though, because
> as I said, I don't actually have the IPSec connection running over the
> ixgbe device.  The IPSec connection runs over another network adapter
> that doesn't support IPSec offload at all.  The problem comes when
> traffic received over the IPSec interface is then routed back out
> (unencrypted) through the ixgbe device into the local network.


Seems like the ixgbe driver tries to use the sec_path
from RX to setup an offload at the TX side.

Can you please try this (completely untested) patch?

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 9bcae44e9883..ae31bd57127c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -36,6 +36,7 @@
 #include <net/vxlan.h>
 #include <net/mpls.h>
 #include <net/xdp_sock.h>
+#include <net/xfrm.h>
 
 #include "ixgbe.h"
 #include "ixgbe_common.h"
@@ -8696,7 +8697,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 #endif /* IXGBE_FCOE */
 
 #ifdef CONFIG_IXGBE_IPSEC
-	if (secpath_exists(skb) &&
+	if (xfrm_offload(skb) &&
 	    !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
 		goto out_drop;
 #endif

^ permalink raw reply related

* [PATCH v4 2/2] PTP: add support for one-shot output
From: Felipe Balbi @ 2019-09-11  6:16 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Christopher S Hall, netdev, linux-kernel, Felipe Balbi
In-Reply-To: <20190911061622.774006-1-felipe.balbi@linux.intel.com>

Some controllers allow for a one-shot output pulse, in contrast to
periodic output. Now that we have extensible versions of our IOCTLs, we
can finally make use of the 'flags' field to pass a bit telling driver
that if we want one-shot pulse output.

Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
---

Changes since v3:
	- Remove bogus bitwise negation

Changes since v2:
	- Add _PEROUT_ to bit macro

Changes since v1:
	- remove comment from .flags field

 include/uapi/linux/ptp_clock.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/ptp_clock.h b/include/uapi/linux/ptp_clock.h
index 9a0af3511b68..f16301015949 100644
--- a/include/uapi/linux/ptp_clock.h
+++ b/include/uapi/linux/ptp_clock.h
@@ -38,8 +38,8 @@
 /*
  * Bits of the ptp_perout_request.flags field:
  */
-#define PTP_PEROUT_VALID_FLAGS (0)
-
+#define PTP_PEROUT_ONE_SHOT (1<<0)
+#define PTP_PEROUT_VALID_FLAGS	(PTP_PEROUT_ONE_SHOT)
 /*
  * struct ptp_clock_time - represents a time value
  *
@@ -77,7 +77,7 @@ struct ptp_perout_request {
 	struct ptp_clock_time start;  /* Absolute start time. */
 	struct ptp_clock_time period; /* Desired period, zero means disable. */
 	unsigned int index;           /* Which channel to configure. */
-	unsigned int flags;           /* Reserved for future use. */
+	unsigned int flags;
 	unsigned int rsv[4];          /* Reserved for future use. */
 };
 
-- 
2.23.0


^ permalink raw reply related

* [PATCH v4 1/2] PTP: introduce new versions of IOCTLs
From: Felipe Balbi @ 2019-09-11  6:16 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Christopher S Hall, netdev, linux-kernel, Felipe Balbi

The current version of the IOCTL have a small problem which prevents us
from extending the API by making use of reserved fields. In these new
IOCTLs, we are now making sure that flags and rsv fields are zero which
will allow us to extend the API in the future.

Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
---

Changes since v3:
	- Remove bogus bitwise negation

Changes since v2:
	- Define PTP_{PEROUT,EXTTS}_VALID_FLAGS
	- Fix comment above PTP_*_FLAGS

Changes since v1:
	- Add a blank line after memset()
	- Move memset(req) to the three places where it's needed
	- Fix the accidental removal of GETFUNC and SETFUNC

 drivers/ptp/ptp_chardev.c      | 63 ++++++++++++++++++++++++++++++++++
 include/uapi/linux/ptp_clock.h | 24 ++++++++++++-
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index 18ffe449efdf..9c18476d8d10 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -126,7 +126,9 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 	switch (cmd) {
 
 	case PTP_CLOCK_GETCAPS:
+	case PTP_CLOCK_GETCAPS2:
 		memset(&caps, 0, sizeof(caps));
+
 		caps.max_adj = ptp->info->max_adj;
 		caps.n_alarm = ptp->info->n_alarm;
 		caps.n_ext_ts = ptp->info->n_ext_ts;
@@ -139,11 +141,24 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_EXTTS_REQUEST:
+	case PTP_EXTTS_REQUEST2:
+		memset(&req, 0, sizeof(req));
+
 		if (copy_from_user(&req.extts, (void __user *)arg,
 				   sizeof(req.extts))) {
 			err = -EFAULT;
 			break;
 		}
+		if (((req.extts.flags & ~PTP_EXTTS_VALID_FLAGS) ||
+			req.extts.rsv[0] || req.extts.rsv[1]) &&
+			cmd == PTP_EXTTS_REQUEST2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_EXTTS_REQUEST) {
+			req.extts.flags &= ~PTP_EXTTS_VALID_FLAGS;
+			req.extts.rsv[0] = 0;
+			req.extts.rsv[1] = 0;
+		}
 		if (req.extts.index >= ops->n_ext_ts) {
 			err = -EINVAL;
 			break;
@@ -154,11 +169,27 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_PEROUT_REQUEST:
+	case PTP_PEROUT_REQUEST2:
+		memset(&req, 0, sizeof(req));
+
 		if (copy_from_user(&req.perout, (void __user *)arg,
 				   sizeof(req.perout))) {
 			err = -EFAULT;
 			break;
 		}
+		if (((req.perout.flags & ~PTP_PEROUT_VALID_FLAGS) ||
+			req.perout.rsv[0] || req.perout.rsv[1] ||
+			req.perout.rsv[2] || req.perout.rsv[3]) &&
+			cmd == PTP_PEROUT_REQUEST2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_PEROUT_REQUEST) {
+			req.perout.flags &= ~PTP_PEROUT_VALID_FLAGS;
+			req.perout.rsv[0] = 0;
+			req.perout.rsv[1] = 0;
+			req.perout.rsv[2] = 0;
+			req.perout.rsv[3] = 0;
+		}
 		if (req.perout.index >= ops->n_per_out) {
 			err = -EINVAL;
 			break;
@@ -169,6 +200,9 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_ENABLE_PPS:
+	case PTP_ENABLE_PPS2:
+		memset(&req, 0, sizeof(req));
+
 		if (!capable(CAP_SYS_TIME))
 			return -EPERM;
 		req.type = PTP_CLK_REQ_PPS;
@@ -177,6 +211,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_SYS_OFFSET_PRECISE:
+	case PTP_SYS_OFFSET_PRECISE2:
 		if (!ptp->info->getcrosststamp) {
 			err = -EOPNOTSUPP;
 			break;
@@ -201,6 +236,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_SYS_OFFSET_EXTENDED:
+	case PTP_SYS_OFFSET_EXTENDED2:
 		if (!ptp->info->gettimex64) {
 			err = -EOPNOTSUPP;
 			break;
@@ -232,6 +268,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_SYS_OFFSET:
+	case PTP_SYS_OFFSET2:
 		sysoff = memdup_user((void __user *)arg, sizeof(*sysoff));
 		if (IS_ERR(sysoff)) {
 			err = PTR_ERR(sysoff);
@@ -266,10 +303,23 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_PIN_GETFUNC:
+	case PTP_PIN_GETFUNC2:
 		if (copy_from_user(&pd, (void __user *)arg, sizeof(pd))) {
 			err = -EFAULT;
 			break;
 		}
+		if ((pd.rsv[0] || pd.rsv[1] || pd.rsv[2]
+				|| pd.rsv[3] || pd.rsv[4])
+			&& cmd == PTP_PIN_GETFUNC2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_PIN_GETFUNC) {
+			pd.rsv[0] = 0;
+			pd.rsv[1] = 0;
+			pd.rsv[2] = 0;
+			pd.rsv[3] = 0;
+			pd.rsv[4] = 0;
+		}
 		pin_index = pd.index;
 		if (pin_index >= ops->n_pins) {
 			err = -EINVAL;
@@ -285,10 +335,23 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_PIN_SETFUNC:
+	case PTP_PIN_SETFUNC2:
 		if (copy_from_user(&pd, (void __user *)arg, sizeof(pd))) {
 			err = -EFAULT;
 			break;
 		}
+		if ((pd.rsv[0] || pd.rsv[1] || pd.rsv[2]
+				|| pd.rsv[3] || pd.rsv[4])
+			&& cmd == PTP_PIN_SETFUNC2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_PIN_SETFUNC) {
+			pd.rsv[0] = 0;
+			pd.rsv[1] = 0;
+			pd.rsv[2] = 0;
+			pd.rsv[3] = 0;
+			pd.rsv[4] = 0;
+		}
 		pin_index = pd.index;
 		if (pin_index >= ops->n_pins) {
 			err = -EINVAL;
diff --git a/include/uapi/linux/ptp_clock.h b/include/uapi/linux/ptp_clock.h
index 1bc794ad957a..9a0af3511b68 100644
--- a/include/uapi/linux/ptp_clock.h
+++ b/include/uapi/linux/ptp_clock.h
@@ -25,10 +25,20 @@
 #include <linux/ioctl.h>
 #include <linux/types.h>
 
-/* PTP_xxx bits, for the flags field within the request structures. */
+/*
+ * Bits of the ptp_extts_request.flags field:
+ */
 #define PTP_ENABLE_FEATURE (1<<0)
 #define PTP_RISING_EDGE    (1<<1)
 #define PTP_FALLING_EDGE   (1<<2)
+#define PTP_EXTTS_VALID_FLAGS	(PTP_ENABLE_FEATURE |	\
+				 PTP_RISING_EDGE |	\
+				 PTP_FALLING_EDGE)
+
+/*
+ * Bits of the ptp_perout_request.flags field:
+ */
+#define PTP_PEROUT_VALID_FLAGS (0)
 
 /*
  * struct ptp_clock_time - represents a time value
@@ -149,6 +159,18 @@ struct ptp_pin_desc {
 #define PTP_SYS_OFFSET_EXTENDED \
 	_IOWR(PTP_CLK_MAGIC, 9, struct ptp_sys_offset_extended)
 
+#define PTP_CLOCK_GETCAPS2  _IOR(PTP_CLK_MAGIC, 10, struct ptp_clock_caps)
+#define PTP_EXTTS_REQUEST2  _IOW(PTP_CLK_MAGIC, 11, struct ptp_extts_request)
+#define PTP_PEROUT_REQUEST2 _IOW(PTP_CLK_MAGIC, 12, struct ptp_perout_request)
+#define PTP_ENABLE_PPS2     _IOW(PTP_CLK_MAGIC, 13, int)
+#define PTP_SYS_OFFSET2     _IOW(PTP_CLK_MAGIC, 14, struct ptp_sys_offset)
+#define PTP_PIN_GETFUNC2    _IOWR(PTP_CLK_MAGIC, 15, struct ptp_pin_desc)
+#define PTP_PIN_SETFUNC2    _IOW(PTP_CLK_MAGIC, 16, struct ptp_pin_desc)
+#define PTP_SYS_OFFSET_PRECISE2 \
+	_IOWR(PTP_CLK_MAGIC, 17, struct ptp_sys_offset_precise)
+#define PTP_SYS_OFFSET_EXTENDED2 \
+	_IOWR(PTP_CLK_MAGIC, 18, struct ptp_sys_offset_extended)
+
 struct ptp_extts_event {
 	struct ptp_clock_time t; /* Time event occured. */
 	unsigned int index;      /* Which channel produced the event. */
-- 
2.23.0


^ permalink raw reply related

* Re: [patch net-next v2 3/3] net: devlink: move reload fail indication to devlink core and expose to user
From: Jiri Pirko @ 2019-09-11  6:17 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev@vger.kernel.org, davem@davemloft.net, dsahern@gmail.com,
	jakub.kicinski@netronome.com, Tariq Toukan, mlxsw
In-Reply-To: <20190908103928.GA21777@splinter>

Sun, Sep 08, 2019 at 12:39:32PM CEST, idosch@mellanox.com wrote:
>On Sat, Sep 07, 2019 at 10:54:00PM +0200, Jiri Pirko wrote:
>> diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
>> index 546e75dd74ac..7cb5e8c5ae0d 100644
>> --- a/include/uapi/linux/devlink.h
>> +++ b/include/uapi/linux/devlink.h
>> @@ -410,6 +410,8 @@ enum devlink_attr {
>>  	DEVLINK_ATTR_TRAP_METADATA,			/* nested */
>>  	DEVLINK_ATTR_TRAP_GROUP_NAME,			/* string */
>>  
>> +	DEVLINK_ATTR_RELOAD_FAILED,			/* u8 0 or 1 */
>> +
>>  	/* add new attributes above here, update the policy in devlink.c */
>>  
>>  	__DEVLINK_ATTR_MAX,
>> diff --git a/net/core/devlink.c b/net/core/devlink.c
>> index 1e3a2288b0b2..e00a4a643d17 100644
>> --- a/net/core/devlink.c
>> +++ b/net/core/devlink.c
>> @@ -471,6 +471,8 @@ static int devlink_nl_fill(struct sk_buff *msg, struct devlink *devlink,
>>  
>>  	if (devlink_nl_put_handle(msg, devlink))
>>  		goto nla_put_failure;
>> +	if (nla_put_u8(msg, DEVLINK_ATTR_RELOAD_FAILED, devlink->reload_failed))
>
>Why not use NLA_FLAG for this?

Bacause older kernel does not return this. So the user would not know if
everything is fine or if the kernel does not support the attribute.
Using u8 instead allows it.


>
>> +		goto nla_put_failure;
>>  
>>  	genlmsg_end(msg, hdr);
>>  	return 0;
>> @@ -2677,6 +2679,21 @@ static bool devlink_reload_supported(struct devlink *devlink)
>>  	return devlink->ops->reload_down && devlink->ops->reload_up;
>>  }

^ permalink raw reply

* Re: [patch net-next v2 3/3] net: devlink: move reload fail indication to devlink core and expose to user
From: Jiri Pirko @ 2019-09-11  6:17 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev@vger.kernel.org, davem@davemloft.net, dsahern@gmail.com,
	jakub.kicinski@netronome.com, Tariq Toukan, mlxsw
In-Reply-To: <20190908112534.GA27998@splinter>

Sun, Sep 08, 2019 at 01:25:36PM CEST, idosch@mellanox.com wrote:
>On Sat, Sep 07, 2019 at 10:54:00PM +0200, Jiri Pirko wrote:
>> +bool devlink_is_reload_failed(struct devlink *devlink)
>
>Forgot to mention that this can be 'const'

ok.

>
>> +{
>> +	return devlink->reload_failed;
>> +}
>> +EXPORT_SYMBOL_GPL(devlink_is_reload_failed);

^ permalink raw reply

* Re: [PATCH v2 5/5] ARM: dts: ls1021a-tsn: Use the DSPI controller in poll mode
From: Shawn Guo @ 2019-09-11  6:33 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, linux-spi, lkml, devicetree, netdev, Rob Herring
In-Reply-To: <CA+h21hqMVdsUjBdtiHKtKGpyvuaOf25tc4h-GdDEBQqa3EB7tw@mail.gmail.com>

On Tue, Aug 27, 2019 at 09:16:39PM +0300, Vladimir Oltean wrote:
> On Tue, 27 Aug 2019 at 21:13, Mark Brown <broonie@kernel.org> wrote:
> >
> > On Tue, Aug 27, 2019 at 09:06:14PM +0300, Vladimir Oltean wrote:
> > > On Tue, 27 Aug 2019 at 21:05, Mark Brown <broonie@kernel.org> wrote:
> > > > On Mon, Aug 26, 2019 at 04:10:51PM +0300, Vladimir Oltean wrote:
> >
> > > > > I noticed you skipped applying this patch, and I'm not sure that Shawn
> > > > > will review it/take it.
> > > > > Do you have a better suggestion how I can achieve putting the DSPI
> > > > > driver in poll mode for this board? A Kconfig option maybe?
> >
> > > > DT changes go through the relevant platform trees, not the
> > > > subsystem trees, so it's not something I'd expect to apply.
> >
> > > But at least is it something that you expect to see done through a
> > > device tree change?
> >
> > Well, it's not ideal - if it performs better all the time the
> > driver should probably just do it unconditionally.  If there's
> > some threashold where it tends to perform better then the driver
> > should check for that but IIRC it sounds like the interrupt just
> > isn't at all helpful here.
> 
> I can't seem to find any situation where it performs worse. Hence my
> question on whether it's a better idea to condition this behavior on a
> Kconfig option rather than a DT blob which may or may not be in sync.

DT is a description of hardware not condition for software behavior,
where module parameter is usually used for.

Shawn

^ permalink raw reply

* Re: [PATCH v2 5/5] ARM: dts: ls1021a-tsn: Use the DSPI controller in poll mode
From: Geert Uytterhoeven @ 2019-09-11  7:00 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Vladimir Oltean, Mark Brown, linux-spi, lkml,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	netdev, Rob Herring
In-Reply-To: <20190911063350.GB17142@dragon>

On Wed, Sep 11, 2019 at 8:34 AM Shawn Guo <shawnguo@kernel.org> wrote:
> On Tue, Aug 27, 2019 at 09:16:39PM +0300, Vladimir Oltean wrote:
> > On Tue, 27 Aug 2019 at 21:13, Mark Brown <broonie@kernel.org> wrote:
> > > On Tue, Aug 27, 2019 at 09:06:14PM +0300, Vladimir Oltean wrote:
> > > > On Tue, 27 Aug 2019 at 21:05, Mark Brown <broonie@kernel.org> wrote:
> > > > > On Mon, Aug 26, 2019 at 04:10:51PM +0300, Vladimir Oltean wrote:
> > >
> > > > > > I noticed you skipped applying this patch, and I'm not sure that Shawn
> > > > > > will review it/take it.
> > > > > > Do you have a better suggestion how I can achieve putting the DSPI
> > > > > > driver in poll mode for this board? A Kconfig option maybe?
> > >
> > > > > DT changes go through the relevant platform trees, not the
> > > > > subsystem trees, so it's not something I'd expect to apply.
> > >
> > > > But at least is it something that you expect to see done through a
> > > > device tree change?
> > >
> > > Well, it's not ideal - if it performs better all the time the
> > > driver should probably just do it unconditionally.  If there's
> > > some threashold where it tends to perform better then the driver
> > > should check for that but IIRC it sounds like the interrupt just
> > > isn't at all helpful here.
> >
> > I can't seem to find any situation where it performs worse. Hence my
> > question on whether it's a better idea to condition this behavior on a
> > Kconfig option rather than a DT blob which may or may not be in sync.
>
> DT is a description of hardware not condition for software behavior,
> where module parameter is usually used for.

+1

DT says the interrupt line is wired.
The driver should know if it should make use of the interrupt, or not.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
From: Shannon Nelson @ 2019-09-11  7:17 UTC (permalink / raw)
  To: Steffen Klassert, Michael Marley; +Cc: netdev, Jeff Kirsher
In-Reply-To: <20190911061547.GR2879@gauss3.secunet.de>

On 9/10/19 11:15 PM, Steffen Klassert wrote:
> On Tue, Sep 10, 2019 at 06:53:30PM -0400, Michael Marley wrote:
>> StrongSwan has hardware offload disabled by default, and I didn't enable
>> it explicitly.  I also already tried turning off all those switches with
>> ethtool and it has no effect.  This doesn't surprise me though, because
>> as I said, I don't actually have the IPSec connection running over the
>> ixgbe device.  The IPSec connection runs over another network adapter
>> that doesn't support IPSec offload at all.  The problem comes when
>> traffic received over the IPSec interface is then routed back out
>> (unencrypted) through the ixgbe device into the local network.
>
> Seems like the ixgbe driver tries to use the sec_path
> from RX to setup an offload at the TX side.
>
> Can you please try this (completely untested) patch?
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 9bcae44e9883..ae31bd57127c 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -36,6 +36,7 @@
>   #include <net/vxlan.h>
>   #include <net/mpls.h>
>   #include <net/xdp_sock.h>
> +#include <net/xfrm.h>
>   
>   #include "ixgbe.h"
>   #include "ixgbe_common.h"
> @@ -8696,7 +8697,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>   #endif /* IXGBE_FCOE */
>   
>   #ifdef CONFIG_IXGBE_IPSEC
> -	if (secpath_exists(skb) &&
> +	if (xfrm_offload(skb) &&
>   	    !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
>   		goto out_drop;
>   #endif

Thanks, Steffen, that looks right to me.

sln



^ permalink raw reply

* Re: Q: fixed link
From: Ranran @ 2019-09-11  7:25 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20190908090551.GC28580@lunn.ch>

On Sun, Sep 8, 2019 at 12:05 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sun, Sep 08, 2019 at 10:30:51AM +0300, Ranran wrote:
> > Hello,
> >
> > In documentation of fixed-link it is said:"
> > Some Ethernet MACs have a "fixed link", and are not connected to a
> > normal MDIO-managed PHY device. For those situations, a Device Tree
> > binding allows to describe a "fixed link".
> > "
> > Does it mean, that on using unmanaged switch ("no cpu" mode), it is
> > better be used with fixed-link ?
>
> Hi Ranran
>
> Is there a MAC to MAC connection, or PHY to PHY connection?
>
> If the interface MAC is directly connected to the switch MAC, fixed
> link is what you should use. The fixed link will then tell the
> interface MAC what speed it should use.
>
> If you have back to back PHYs, you need a PHY driver for the PHY
> connected to the interface MAC, to configure its speed, duplex
> etc. The dumb switch should be controlling its PHY, and auto-neg will
> probably work.
>
>          Andrew


Hi,

We are using RGMII mode with the switch, which probably means
mac-to-mac (as far as I understand).

Does it mean we can choose between these 2 options:
1. configure dsa switch in device tree
2. configure fixed link in device tree


Thanks,
ranran

^ permalink raw reply

* Re: [PATCH] bpf: validate bpf_func when BPF_JIT is enabled
From: Yonghong Song @ 2019-09-11  7:42 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Alexei Starovoitov, Daniel Borkmann, Kees Cook, Martin Lau,
	Song Liu, netdev@vger.kernel.org, bpf@vger.kernel.org,
	linux-kernel@vger.kernel.org, Toke Høiland-Jørgensen,
	Björn Töpel
In-Reply-To: <20190910172253.GA164966@google.com>



On 9/10/19 6:22 PM, Sami Tolvanen wrote:
> On Tue, Sep 10, 2019 at 08:37:19AM +0000, Yonghong Song wrote:
>> You did not mention BPF_BINARY_HEADER_MAGIC and added member
>> of `magic` in bpf_binary_header. Could you add some details
>> on what is the purpose for this `magic` member?
> 
> Sure, I'll add a description to the next version.
> 
> The magic is a random number used to identify bpf_binary_header in
> memory. The purpose of this patch is to limit the possible call
> targets for the function pointer and checking for the magic helps
> ensure we are jumping to a page that contains a jited function,
> instead of allowing calls to arbitrary targets.
> 
> This is particularly useful when combined with the compiler-based
> Control-Flow Integrity (CFI) mitigation, which Google started shipping
> in Pixel kernels last year. The compiler injects checks to all
> indirect calls, but cannot obviously validate jumps to dynamically
> generated code.
> 
>>> +unsigned int bpf_call_func(const struct bpf_prog *prog, const void *ctx)
>>> +{
>>> +	const struct bpf_binary_header *hdr = bpf_jit_binary_hdr(prog);
>>> +
>>> +	if (!IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) && !prog->jited)
>>> +		return prog->bpf_func(ctx, prog->insnsi);
>>> +
>>> +	if (unlikely(hdr->magic != BPF_BINARY_HEADER_MAGIC ||
>>> +		     !arch_bpf_jit_check_func(prog))) {
>>> +		WARN(1, "attempt to jump to an invalid address");
>>> +		return 0;
>>> +	}
>>> +
>>> +	return prog->bpf_func(ctx, prog->insnsi);
>>> +}
> 
>> The above can be rewritten as
>> 	if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) || prog->jited ||
>> 	    hdr->magic != BPF_BINARY_HEADER_MAGIC ||
>> 	    !arch_bpf_jit_check_func(prog))) {
>> 		WARN(1, "attempt to jump to an invalid address");
>> 		return 0;
>> 	}
> 
> That doesn't look quite equivalent, but yes, this can be rewritten as a

Indeed, I made a mistake. Your below change is correct.

> single if statement like this:
> 
> 	if ((IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
> 	     prog->jited) &&
> 	    (hdr->magic != BPF_BINARY_HEADER_MAGIC ||
> 	     !arch_bpf_jit_check_func(prog)))
> 
> I think splitting the interpreter and JIT paths would be more readable,
> but I can certainly change this if you prefer.

How about this:

	if (!IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) && !prog->jited)
		goto out;

	if (unlikely(hdr->magic != BPF_BINARY_HEADER_MAGIC ||
	    !arch_bpf_jit_check_func(prog))) {
		WARN(1, "attempt to jump to an invalid address");
		return 0;
	}
out:
	return prog->bpf_func(ctx, prog->insnsi);

> 
>> BPF_PROG_RUN() will be called during xdp fast path.
>> Have you measured how much slowdown the above change could
>> cost for the performance?
> 
> I have not measured the overhead, but it shouldn't be significant. Is
> there a particular benchmark you'd like me to run?

I am not an expert in XDP testing. Toke, Björn, could you give some
suggestions what to test for XDP performance here?

> 
> Sami
> 

^ permalink raw reply

* [PATCH 04/11] net: phylink: switch to using fwnode_gpiod_get_index()
From: Dmitry Torokhov @ 2019-09-11  7:52 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Andy Shevchenko, Mika Westerberg, linux-kernel, linux-gpio,
	Andrew Lunn, David S. Miller, Florian Fainelli, Heiner Kallweit,
	Russell King, netdev
In-Reply-To: <20190911075215.78047-1-dmitry.torokhov@gmail.com>

Instead of fwnode_get_named_gpiod() that I plan to hide away, let's use
the new fwnode_gpiod_get_index() that mimics gpiod_get_index(), bit
works with arbitrary firmware node.

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---

 drivers/net/phy/phylink.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index a45c5de96ab1..14b608991445 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -168,8 +168,8 @@ static int phylink_parse_fixedlink(struct phylink *pl,
 			pl->link_config.pause |= MLO_PAUSE_ASYM;
 
 		if (ret == 0) {
-			desc = fwnode_get_named_gpiod(fixed_node, "link-gpios",
-						      0, GPIOD_IN, "?");
+			desc = fwnode_gpiod_get_index(fixed_node, "link", 0,
+						      GPIOD_IN, "?");
 
 			if (!IS_ERR(desc))
 				pl->link_gpio = desc;
-- 
2.23.0.162.g0b9fbb3734-goog


^ permalink raw reply related

* [PATCH 05/11] net: mdio: switch to using fwnode_gpiod_get_index()
From: Dmitry Torokhov @ 2019-09-11  7:52 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Andy Shevchenko, Mika Westerberg, linux-kernel, linux-gpio,
	Andrew Lunn, David S. Miller, Florian Fainelli, Heiner Kallweit,
	netdev
In-Reply-To: <20190911075215.78047-1-dmitry.torokhov@gmail.com>

Instead of fwnode_get_named_gpiod() that I plan to hide away, let's use
the new fwnode_gpiod_get_index() that mimics gpiod_get_index(), bit
works with arbitrary firmware node.

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
---

 drivers/net/phy/mdio_bus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index ce940871331e..9ca51d678123 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -46,8 +46,8 @@ static int mdiobus_register_gpiod(struct mdio_device *mdiodev)
 
 	/* Deassert the optional reset signal */
 	if (mdiodev->dev.of_node)
-		gpiod = fwnode_get_named_gpiod(&mdiodev->dev.of_node->fwnode,
-					       "reset-gpios", 0, GPIOD_OUT_LOW,
+		gpiod = fwnode_gpiod_get_index(&mdiodev->dev.of_node->fwnode,
+					       "reset", 0, GPIOD_OUT_LOW,
 					       "PHY reset");
 	if (IS_ERR(gpiod)) {
 		if (PTR_ERR(gpiod) == -ENOENT || PTR_ERR(gpiod) == -ENOSYS)
-- 
2.23.0.162.g0b9fbb3734-goog


^ permalink raw reply related

* [PATCH 00/11] Add support for software nodes to gpiolib
From: Dmitry Torokhov @ 2019-09-11  7:52 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Andy Shevchenko, Mika Westerberg, linux-kernel, linux-gpio,
	Andrew Lunn, Andrzej Hajda, Bartosz Golaszewski, Daniel Vetter,
	David Airlie, David S. Miller, Florian Fainelli, Heiner Kallweit,
	Jernej Skrabec, Jonas Karlman, Laurent Pinchart, Neil Armstrong,
	Russell King, dri-devel, linux-acpi, netdev

This series attempts to add support for software nodes to gpiolib, using
software node references that were introduced recently. This allows us
to convert more drivers to the generic device properties and drop
support for custom platform data:

static const struct software_node gpio_bank_b_node = {
|-------.name = "B",
};

static const struct property_entry simone_key_enter_props[] = {
|-------PROPERTY_ENTRY_U32("linux,code", KEY_ENTER),
|-------PROPERTY_ENTRY_STRING("label", "enter"),
|-------PROPERTY_ENTRY_REF("gpios", &gpio_bank_b_node, 123, GPIO_ACTIVE_LOW),
|-------{ }
};

If we agree in principle, I would like to have the very first 3 patches
in an immutable branch off maybe -rc8 so that it can be pulled into
individual subsystems so that patches switching various drivers to
fwnode_gpiod_get_index() could be applied.

Thanks,
Dmitry

Dmitry Torokhov (11):
  gpiolib: of: add a fallback for wlf,reset GPIO name
  gpiolib: introduce devm_fwnode_gpiod_get_index()
  gpiolib: introduce fwnode_gpiod_get_index()
  net: phylink: switch to using fwnode_gpiod_get_index()
  net: mdio: switch to using fwnode_gpiod_get_index()
  drm/bridge: ti-tfp410: switch to using fwnode_gpiod_get_index()
  gpliolib: make fwnode_get_named_gpiod() static
  gpiolib: of: tease apart of_find_gpio()
  gpiolib: of: tease apart acpi_find_gpio()
  gpiolib: consolidate fwnode GPIO lookups
  gpiolib: add support for software nodes

 drivers/gpio/Makefile              |   1 +
 drivers/gpio/gpiolib-acpi.c        | 153 ++++++++++++++----------
 drivers/gpio/gpiolib-acpi.h        |  21 ++--
 drivers/gpio/gpiolib-devres.c      |  33 ++----
 drivers/gpio/gpiolib-of.c          | 159 ++++++++++++++-----------
 drivers/gpio/gpiolib-of.h          |  26 ++--
 drivers/gpio/gpiolib-swnode.c      |  92 +++++++++++++++
 drivers/gpio/gpiolib-swnode.h      |  13 ++
 drivers/gpio/gpiolib.c             | 184 ++++++++++++++++-------------
 drivers/gpu/drm/bridge/ti-tfp410.c |   4 +-
 drivers/net/phy/mdio_bus.c         |   4 +-
 drivers/net/phy/phylink.c          |   4 +-
 include/linux/gpio/consumer.h      |  53 ++++++---
 13 files changed, 471 insertions(+), 276 deletions(-)
 create mode 100644 drivers/gpio/gpiolib-swnode.c
 create mode 100644 drivers/gpio/gpiolib-swnode.h

-- 
2.23.0.162.g0b9fbb3734-goog


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox