* [PATCH v2 1/3] dt-bindings: net: qcom,ipa: fix example for upcoming smp2p conversion
From: David Heidelberg @ 2022-04-24 13:15 UTC (permalink / raw)
To: Andy Gross, Bjorn Andersson, David S. Miller, Jakub Kicinski,
Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Alex Elder
Cc: David Heidelberg, linux-arm-msm, netdev, devicetree, linux-kernel
The smp2p-mpss node in the example was missing required properties.
Signed-off-by: David Heidelberg <david@ixit.cz>
---
Documentation/devicetree/bindings/net/qcom,ipa.yaml | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
index 58ecc62adfaa..852658b4d05c 100644
--- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
@@ -182,6 +182,11 @@ examples:
smp2p-mpss {
compatible = "qcom,smp2p";
+ mboxes = <&apss_shared 14>;
+ qcom,smem = <435>, <428>;
+ qcom,local-pid = <0>;
+ qcom,remote-pid = <1>;
+
ipa_smp2p_out: ipa-ap-to-modem {
qcom,entry-name = "ipa";
#qcom,smem-state-cells = <1>;
--
2.35.1
^ permalink raw reply related
* [PATCH net 6/6] net: hns3: add return value for mailbox handling in PF
From: Guangbin Huang @ 2022-04-24 12:57 UTC (permalink / raw)
To: davem, kuba; +Cc: netdev, linux-kernel, lipeng321, huangguangbin2, chenhao288
In-Reply-To: <20220424125725.43232-1-huangguangbin2@huawei.com>
From: Jian Shen <shenjian15@huawei.com>
Currently, the VF sends some query mailbox messages to the PF and
waits for the PF's handling result. Handling of
HCLGE_MBX_GET_QID_IN_PF and HCLGE_MBX_GET_RSS_KEY may fail
when the input parameter is invalid, but their handler
functions return void. In this case, the PF always reports success
to the VF, which may cause the VF to get an incorrect result.
Fix it by adding a return value to these functions.
Fixes: 63b1279d9905 ("net: hns3: check queue id range before using")
Fixes: 532cfc0df1e4 ("net: hns3: add a check for index in hclge_get_rss_key()")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
.../hisilicon/hns3/hns3pf/hclge_mbx.c | 22 ++++++++++---------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
index 53f939923c28..7998ca617a92 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
@@ -594,9 +594,9 @@ static int hclge_set_vf_mtu(struct hclge_vport *vport,
return hclge_set_vport_mtu(vport, mtu);
}
-static void hclge_get_queue_id_in_pf(struct hclge_vport *vport,
- struct hclge_mbx_vf_to_pf_cmd *mbx_req,
- struct hclge_respond_to_vf_msg *resp_msg)
+static int hclge_get_queue_id_in_pf(struct hclge_vport *vport,
+ struct hclge_mbx_vf_to_pf_cmd *mbx_req,
+ struct hclge_respond_to_vf_msg *resp_msg)
{
struct hnae3_handle *handle = &vport->nic;
struct hclge_dev *hdev = vport->back;
@@ -606,17 +606,18 @@ static void hclge_get_queue_id_in_pf(struct hclge_vport *vport,
if (queue_id >= handle->kinfo.num_tqps) {
dev_err(&hdev->pdev->dev, "Invalid queue id(%u) from VF %u\n",
queue_id, mbx_req->mbx_src_vfid);
- return;
+ return -EINVAL;
}
qid_in_pf = hclge_covert_handle_qid_global(&vport->nic, queue_id);
memcpy(resp_msg->data, &qid_in_pf, sizeof(qid_in_pf));
resp_msg->len = sizeof(qid_in_pf);
+ return 0;
}
-static void hclge_get_rss_key(struct hclge_vport *vport,
- struct hclge_mbx_vf_to_pf_cmd *mbx_req,
- struct hclge_respond_to_vf_msg *resp_msg)
+static int hclge_get_rss_key(struct hclge_vport *vport,
+ struct hclge_mbx_vf_to_pf_cmd *mbx_req,
+ struct hclge_respond_to_vf_msg *resp_msg)
{
#define HCLGE_RSS_MBX_RESP_LEN 8
struct hclge_dev *hdev = vport->back;
@@ -634,13 +635,14 @@ static void hclge_get_rss_key(struct hclge_vport *vport,
dev_warn(&hdev->pdev->dev,
"failed to get the rss hash key, the index(%u) invalid !\n",
index);
- return;
+ return -EINVAL;
}
memcpy(resp_msg->data,
&rss_cfg->rss_hash_key[index * HCLGE_RSS_MBX_RESP_LEN],
HCLGE_RSS_MBX_RESP_LEN);
resp_msg->len = HCLGE_RSS_MBX_RESP_LEN;
+ return 0;
}
static void hclge_link_fail_parse(struct hclge_dev *hdev, u8 link_fail_code)
@@ -816,10 +818,10 @@ void hclge_mbx_handler(struct hclge_dev *hdev)
"VF fail(%d) to set mtu\n", ret);
break;
case HCLGE_MBX_GET_QID_IN_PF:
- hclge_get_queue_id_in_pf(vport, req, &resp_msg);
+ ret = hclge_get_queue_id_in_pf(vport, req, &resp_msg);
break;
case HCLGE_MBX_GET_RSS_KEY:
- hclge_get_rss_key(vport, req, &resp_msg);
+ ret = hclge_get_rss_key(vport, req, &resp_msg);
break;
case HCLGE_MBX_GET_LINK_MODE:
hclge_get_link_mode(vport, req);
--
2.33.0
^ permalink raw reply related
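The pattern behind the mailbox change above can be sketched in isolation. This is a minimal illustration, not the driver's code: `struct demo_resp`, `DEMO_NUM_QUEUES`, `demo_get_queue_id()` and `demo_dispatch()` are hypothetical names, and the queue-id mapping is made up. It only shows why a handler that can reject its input needs a return value the dispatcher can forward to the requester.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a mailbox response. */
struct demo_resp {
	int status;      /* 0 on success, negative errno on failure */
	uint8_t data[8];
	uint16_t len;
};

#define DEMO_NUM_QUEUES 4u /* illustrative queue count */

/* The handler reports invalid input through its return value. */
static int demo_get_queue_id(uint16_t queue_id, struct demo_resp *resp)
{
	uint16_t global_id;

	if (queue_id >= DEMO_NUM_QUEUES)
		return -EINVAL;

	global_id = (uint16_t)(queue_id + 100); /* made-up mapping */
	memcpy(resp->data, &global_id, sizeof(global_id));
	resp->len = sizeof(global_id);
	return 0;
}

/* The dispatcher stores the handler result in the reply, so the
 * requester can tell a failed query from a successful one.
 */
static void demo_dispatch(uint16_t queue_id, struct demo_resp *resp)
{
	memset(resp, 0, sizeof(*resp));
	resp->status = demo_get_queue_id(queue_id, resp);
}
```

With a void handler, `demo_dispatch()` would have nothing to store in `status`, and an out-of-range request would look like a success with an empty payload.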
* [PATCH net 5/6] net: hns3: add validity check for message data length
From: Guangbin Huang @ 2022-04-24 12:57 UTC (permalink / raw)
To: davem, kuba; +Cc: netdev, linux-kernel, lipeng321, huangguangbin2, chenhao288
In-Reply-To: <20220424125725.43232-1-huangguangbin2@huawei.com>
From: Jian Shen <shenjian15@huawei.com>
Add a validity check for the message data length in
hclge_send_mbx_msg() to avoid an unexpected overflow.
Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
index 36cbafc5f944..53f939923c28 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
@@ -94,6 +94,13 @@ static int hclge_send_mbx_msg(struct hclge_vport *vport, u8 *msg, u16 msg_len,
enum hclge_comm_cmd_status status;
struct hclge_desc desc;
+ if (msg_len > HCLGE_MBX_MAX_MSG_SIZE) {
+ dev_err(&hdev->pdev->dev,
+ "msg data length(=%u) exceeds maximum(=%u)\n",
+ msg_len, HCLGE_MBX_MAX_MSG_SIZE);
+ return -EMSGSIZE;
+ }
+
resp_pf_to_vf = (struct hclge_mbx_pf_to_vf_cmd *)desc.data;
hclge_cmd_setup_basic_desc(&desc, HCLGEVF_OPC_MBX_PF_TO_VF, false);
--
2.33.0
^ permalink raw reply related
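The length check in the patch above follows a common shape: reject an oversized payload before it is copied into a fixed-size buffer. A minimal sketch, assuming a made-up descriptor layout; `struct demo_desc`, `DEMO_MAX_MSG_SIZE` and `demo_fill_desc()` are hypothetical and the limit is not the driver's value.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_MSG_SIZE 8u /* illustrative limit only */

/* Hypothetical fixed-size descriptor that carries the message. */
struct demo_desc {
	uint8_t msg[DEMO_MAX_MSG_SIZE];
	uint16_t msg_len;
};

/* Validate the length first, so memcpy() can never write past msg[]. */
static int demo_fill_desc(struct demo_desc *desc, const uint8_t *msg,
			  uint16_t msg_len)
{
	if (msg_len > DEMO_MAX_MSG_SIZE)
		return -EMSGSIZE;

	memcpy(desc->msg, msg, msg_len);
	desc->msg_len = msg_len;
	return 0;
}
```

A rejected call leaves the descriptor untouched, so the caller can log the error and return without sending anything.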
* [PATCH net 2/6] net: hns3: align the debugfs output to the left
From: Guangbin Huang @ 2022-04-24 12:57 UTC (permalink / raw)
To: davem, kuba; +Cc: netdev, linux-kernel, lipeng321, huangguangbin2, chenhao288
In-Reply-To: <20220424125725.43232-1-huangguangbin2@huawei.com>
From: Hao Chen <chenhao288@hisilicon.com>
The output of the debugfs nodes rx/tx_queue_info and rx/tx_bd_info is
aligned to the right, unlike the output of the other debugfs nodes,
so make their output consistent by aligning it to the left.
Fixes: 907676b13071 ("net: hns3: use tx bounce buffer for small packets")
Fixes: e44c495d95e0 ("net: hns3: refactor queue info of debugfs")
Fixes: 77e9184869c9 ("net: hns3: refactor dump bd info of debugfs")
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
.../ethernet/hisilicon/hns3/hns3_debugfs.c | 84 +++++++++----------
1 file changed, 42 insertions(+), 42 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
index 44d9b560b337..93aeb615191d 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
@@ -562,12 +562,12 @@ static void hns3_dbg_tx_spare_info(struct hns3_enet_ring *ring, char *buf,
for (i = 0; i < ring_num; i++) {
j = 0;
- sprintf(result[j++], "%8u", i);
- sprintf(result[j++], "%9u", ring->tx_copybreak);
- sprintf(result[j++], "%3u", tx_spare->len);
- sprintf(result[j++], "%3u", tx_spare->next_to_use);
- sprintf(result[j++], "%3u", tx_spare->next_to_clean);
- sprintf(result[j++], "%3u", tx_spare->last_to_clean);
+ sprintf(result[j++], "%u", i);
+ sprintf(result[j++], "%u", ring->tx_copybreak);
+ sprintf(result[j++], "%u", tx_spare->len);
+ sprintf(result[j++], "%u", tx_spare->next_to_use);
+ sprintf(result[j++], "%u", tx_spare->next_to_clean);
+ sprintf(result[j++], "%u", tx_spare->last_to_clean);
sprintf(result[j++], "%pad", &tx_spare->dma);
hns3_dbg_fill_content(content, sizeof(content),
tx_spare_info_items,
@@ -598,35 +598,35 @@ static void hns3_dump_rx_queue_info(struct hns3_enet_ring *ring,
u32 base_add_l, base_add_h;
u32 j = 0;
- sprintf(result[j++], "%8u", index);
+ sprintf(result[j++], "%u", index);
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_RING_BD_NUM_REG));
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_RING_BD_LEN_REG));
- sprintf(result[j++], "%4u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_RING_TAIL_REG));
- sprintf(result[j++], "%4u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_RING_HEAD_REG));
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_RING_FBDNUM_REG));
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_RING_PKTNUM_RECORD_REG));
- sprintf(result[j++], "%9u", ring->rx_copybreak);
+ sprintf(result[j++], "%u", ring->rx_copybreak);
- sprintf(result[j++], "%7s", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%s", readl_relaxed(ring->tqp->io_base +
HNS3_RING_EN_REG) ? "on" : "off");
if (hnae3_ae_dev_tqp_txrx_indep_supported(ae_dev))
- sprintf(result[j++], "%10s", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%s", readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_EN_REG) ? "on" : "off");
else
- sprintf(result[j++], "%10s", "NA");
+ sprintf(result[j++], "%s", "NA");
base_add_h = readl_relaxed(ring->tqp->io_base +
HNS3_RING_RX_RING_BASEADDR_H_REG);
@@ -700,36 +700,36 @@ static void hns3_dump_tx_queue_info(struct hns3_enet_ring *ring,
u32 base_add_l, base_add_h;
u32 j = 0;
- sprintf(result[j++], "%8u", index);
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", index);
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_BD_NUM_REG));
- sprintf(result[j++], "%2u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_TC_REG));
- sprintf(result[j++], "%4u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_TAIL_REG));
- sprintf(result[j++], "%4u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_HEAD_REG));
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_FBDNUM_REG));
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_OFFSET_REG));
- sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%u", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_PKTNUM_RECORD_REG));
- sprintf(result[j++], "%7s", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%s", readl_relaxed(ring->tqp->io_base +
HNS3_RING_EN_REG) ? "on" : "off");
if (hnae3_ae_dev_tqp_txrx_indep_supported(ae_dev))
- sprintf(result[j++], "%10s", readl_relaxed(ring->tqp->io_base +
+ sprintf(result[j++], "%s", readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_EN_REG) ? "on" : "off");
else
- sprintf(result[j++], "%10s", "NA");
+ sprintf(result[j++], "%s", "NA");
base_add_h = readl_relaxed(ring->tqp->io_base +
HNS3_RING_TX_RING_BASEADDR_H_REG);
@@ -848,15 +848,15 @@ static void hns3_dump_rx_bd_info(struct hns3_nic_priv *priv,
{
unsigned int j = 0;
- sprintf(result[j++], "%5d", idx);
+ sprintf(result[j++], "%d", idx);
sprintf(result[j++], "%#x", le32_to_cpu(desc->rx.l234_info));
- sprintf(result[j++], "%7u", le16_to_cpu(desc->rx.pkt_len));
- sprintf(result[j++], "%4u", le16_to_cpu(desc->rx.size));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->rx.pkt_len));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->rx.size));
sprintf(result[j++], "%#x", le32_to_cpu(desc->rx.rss_hash));
- sprintf(result[j++], "%5u", le16_to_cpu(desc->rx.fd_id));
- sprintf(result[j++], "%8u", le16_to_cpu(desc->rx.vlan_tag));
- sprintf(result[j++], "%15u", le16_to_cpu(desc->rx.o_dm_vlan_id_fb));
- sprintf(result[j++], "%11u", le16_to_cpu(desc->rx.ot_vlan_tag));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->rx.fd_id));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->rx.vlan_tag));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->rx.o_dm_vlan_id_fb));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->rx.ot_vlan_tag));
sprintf(result[j++], "%#x", le32_to_cpu(desc->rx.bd_base_info));
if (test_bit(HNS3_NIC_STATE_RXD_ADV_LAYOUT_ENABLE, &priv->state)) {
u32 ol_info = le32_to_cpu(desc->rx.ol_info);
@@ -930,19 +930,19 @@ static void hns3_dump_tx_bd_info(struct hns3_nic_priv *priv,
{
unsigned int j = 0;
- sprintf(result[j++], "%6d", idx);
+ sprintf(result[j++], "%d", idx);
sprintf(result[j++], "%#llx", le64_to_cpu(desc->addr));
- sprintf(result[j++], "%5u", le16_to_cpu(desc->tx.vlan_tag));
- sprintf(result[j++], "%5u", le16_to_cpu(desc->tx.send_size));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->tx.vlan_tag));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->tx.send_size));
sprintf(result[j++], "%#x",
le32_to_cpu(desc->tx.type_cs_vlan_tso_len));
- sprintf(result[j++], "%5u", le16_to_cpu(desc->tx.outer_vlan_tag));
- sprintf(result[j++], "%5u", le16_to_cpu(desc->tx.tv));
- sprintf(result[j++], "%10u",
+ sprintf(result[j++], "%u", le16_to_cpu(desc->tx.outer_vlan_tag));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->tx.tv));
+ sprintf(result[j++], "%u",
le32_to_cpu(desc->tx.ol_type_vlan_len_msec));
sprintf(result[j++], "%#x", le32_to_cpu(desc->tx.paylen_ol4cs));
sprintf(result[j++], "%#x", le16_to_cpu(desc->tx.bdtp_fe_sc_vld_ra_ri));
- sprintf(result[j++], "%5u", le16_to_cpu(desc->tx.mss_hw_csum));
+ sprintf(result[j++], "%u", le16_to_cpu(desc->tx.mss_hw_csum));
}
static int hns3_dbg_tx_bd_info(struct hns3_dbg_data *d, char *buf, int len)
--
2.33.0
^ permalink raw reply related
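The effect of dropping the field widths in the patch above can be seen with plain C formatting. This sketch assumes that column padding is applied by a separate step after each value is formatted; that is a guess about the intent, and the `demo_*` helpers are hypothetical, not part of the driver.

```c
#include <stddef.h>
#include <stdio.h>

/* Old style: a width in the format right-aligns the value. */
static void demo_format_right(char *buf, size_t size, unsigned int v)
{
	snprintf(buf, size, "%8u", v);
}

/* New style: no width, the value starts at the first character. */
static void demo_format_plain(char *buf, size_t size, unsigned int v)
{
	snprintf(buf, size, "%u", v);
}

/* Assumed later step: pad a formatted field on the right to a fixed
 * column width, which yields left-aligned columns.
 */
static void demo_pad_column(char *out, size_t size, const char *field,
			    int width)
{
	snprintf(out, size, "%-*s", width, field);
}
```

Formatting 42 with `demo_format_right()` puts six spaces before the digits, while `demo_format_plain()` followed by `demo_pad_column()` with a width of 8 puts them after the digits.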
* [PATCH net 3/6] net: hns3: fix error log of tx/rx tqps stats
From: Guangbin Huang @ 2022-04-24 12:57 UTC (permalink / raw)
To: davem, kuba; +Cc: netdev, linux-kernel, lipeng321, huangguangbin2, chenhao288
In-Reply-To: <20220424125725.43232-1-huangguangbin2@huawei.com>
From: Peng Li <lipeng321@huawei.com>
The error log messages in hclge_comm_tqps_update_stats() have tx and
rx swapped, so fix them.
Fixes: 287db5c40d15 ("net: hns3: create new set of common tqp stats APIs for PF and VF reuse")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
.../hisilicon/hns3/hns3_common/hclge_comm_tqp_stats.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_tqp_stats.c b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_tqp_stats.c
index 0c60f41fca8a..f3c9395d8351 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_tqp_stats.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_tqp_stats.c
@@ -75,7 +75,7 @@ int hclge_comm_tqps_update_stats(struct hnae3_handle *handle,
ret = hclge_comm_cmd_send(hw, &desc, 1);
if (ret) {
dev_err(&hw->cmq.csq.pdev->dev,
- "failed to get tqp stat, ret = %d, tx = %u.\n",
+ "failed to get tqp stat, ret = %d, rx = %u.\n",
ret, i);
return ret;
}
@@ -89,7 +89,7 @@ int hclge_comm_tqps_update_stats(struct hnae3_handle *handle,
ret = hclge_comm_cmd_send(hw, &desc, 1);
if (ret) {
dev_err(&hw->cmq.csq.pdev->dev,
- "failed to get tqp stat, ret = %d, rx = %u.\n",
+ "failed to get tqp stat, ret = %d, tx = %u.\n",
ret, i);
return ret;
}
--
2.33.0
^ permalink raw reply related
* [PATCH net 4/6] net: hns3: modify the return code of hclge_get_ring_chain_from_mbx
From: Guangbin Huang @ 2022-04-24 12:57 UTC (permalink / raw)
To: davem, kuba; +Cc: netdev, linux-kernel, lipeng321, huangguangbin2, chenhao288
In-Reply-To: <20220424125725.43232-1-huangguangbin2@huawei.com>
From: Jie Wang <wangjie125@huawei.com>
Currently, function hclge_get_ring_chain_from_mbx will return -ENOMEM if
ring_num is bigger than HCLGE_MBX_MAX_RING_CHAIN_PARAM_NUM. It is better to
return -EINVAL for the invalid parameter case.
So this patch fixes it by returning -EINVAL in this abnormal branch.
Fixes: 5d02a58dae60 ("net: hns3: fix for buffer overflow smatch warning")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
index 6799d16de34b..36cbafc5f944 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
@@ -176,7 +176,7 @@ static int hclge_get_ring_chain_from_mbx(
ring_num = req->msg.ring_num;
if (ring_num > HCLGE_MBX_MAX_RING_CHAIN_PARAM_NUM)
- return -ENOMEM;
+ return -EINVAL;
for (i = 0; i < ring_num; i++) {
if (req->msg.param[i].tqp_index >= vport->nic.kinfo.rss_size) {
--
2.33.0
^ permalink raw reply related
* [PATCH net 0/6] net: hns3: add some fixes for -net
From: Guangbin Huang @ 2022-04-24 12:57 UTC (permalink / raw)
To: davem, kuba; +Cc: netdev, linux-kernel, lipeng321, huangguangbin2, chenhao288
This series adds some fixes for the HNS3 ethernet driver.
Hao Chen (1):
net: hns3: align the debugfs output to the left
Jian Shen (3):
net: hns3: clear inited state and stop client after failed to register
netdev
net: hns3: add validity check for message data length
net: hns3: add return value for mailbox handling in PF
Jie Wang (1):
net: hns3: modify the return code of hclge_get_ring_chain_from_mbx
Peng Li (1):
net: hns3: fix error log of tx/rx tqps stats
.../hns3/hns3_common/hclge_comm_tqp_stats.c | 4 +-
.../ethernet/hisilicon/hns3/hns3_debugfs.c | 84 +++++++++----------
.../net/ethernet/hisilicon/hns3/hns3_enet.c | 9 ++
.../hisilicon/hns3/hns3pf/hclge_mbx.c | 31 ++++---
4 files changed, 73 insertions(+), 55 deletions(-)
--
2.33.0
^ permalink raw reply
* [PATCH net 1/6] net: hns3: clear inited state and stop client after failed to register netdev
From: Guangbin Huang @ 2022-04-24 12:57 UTC (permalink / raw)
To: davem, kuba; +Cc: netdev, linux-kernel, lipeng321, huangguangbin2, chenhao288
In-Reply-To: <20220424125725.43232-1-huangguangbin2@huawei.com>
From: Jian Shen <shenjian15@huawei.com>
If registering the netdev fails, clear the INITED state and stop the
client, so that no problem arises when this runs concurrently with the
driver's uninitialization process.
Fixes: a289a7e5c1d4 ("net: hns3: put off calling register_netdev() until client initialize complete")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 14dc12c2155d..a3ee7875d6a7 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -5203,6 +5203,13 @@ static void hns3_state_init(struct hnae3_handle *handle)
set_bit(HNS3_NIC_STATE_RXD_ADV_LAYOUT_ENABLE, &priv->state);
}
+static void hns3_state_uninit(struct hnae3_handle *handle)
+{
+ struct hns3_nic_priv *priv = handle->priv;
+
+ clear_bit(HNS3_NIC_STATE_INITED, &priv->state);
+}
+
static int hns3_client_init(struct hnae3_handle *handle)
{
struct pci_dev *pdev = handle->pdev;
@@ -5320,7 +5327,9 @@ static int hns3_client_init(struct hnae3_handle *handle)
return ret;
out_reg_netdev_fail:
+ hns3_state_uninit(handle);
hns3_dbg_uninit(handle);
+ hns3_client_stop(handle);
out_client_start:
hns3_free_rx_cpu_rmap(netdev);
hns3_nic_uninit_irq(priv);
--
2.33.0
^ permalink raw reply related
* Re: [PATCH 2/3] perf: arm-spe: Fix SPE events with phys addresses
From: Leo Yan @ 2022-04-24 12:59 UTC (permalink / raw)
To: Timothy Hayes
Cc: linux-kernel, linux-perf-users, acme, John Garry, Will Deacon,
Mathieu Poirier, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, linux-arm-kernel, netdev, bpf
In-Reply-To: <20220421165205.117662-3-timothy.hayes@arm.com>
Hi Timothy,
On Thu, Apr 21, 2022 at 05:52:04PM +0100, Timothy Hayes wrote:
> This patch corrects a bug whereby SPE collection is invoked with
> pa_enable=1 but synthesized events fail to show physical addresses.
>
> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
> ---
> tools/perf/arch/arm64/util/arm-spe.c | 10 ++++++++++
> tools/perf/util/arm-spe.c | 3 ++-
> 2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index af4d63af8072..e8b577d33e53 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -148,6 +148,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
> bool privileged = perf_event_paranoid_check(-1);
> struct evsel *tracking_evsel;
> int err;
> + u64 bit;
>
> sper->evlist = evlist;
>
> @@ -245,6 +246,15 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
> */
> evsel__set_sample_bit(arm_spe_evsel, DATA_SRC);
>
> + /*
> + * The PHYS_ADDR flag does not affect the driver behaviour, it is used to
> + * inform that the resulting output's SPE samples contain physical addresses
> + * where applicable.
> + */
> + bit = perf_pmu__format_bits(&arm_spe_pmu->format, "pa_enable");
> + if (arm_spe_evsel->core.attr.config & bit)
> + evsel__set_sample_bit(arm_spe_evsel, PHYS_ADDR);
> +
> /* Add dummy event to keep tracking */
> err = parse_events(evlist, "dummy:u", NULL);
> if (err)
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 151cc38a171c..1a80151baed9 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -1033,7 +1033,8 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
> memset(&attr, 0, sizeof(struct perf_event_attr));
> attr.size = sizeof(struct perf_event_attr);
> attr.type = PERF_TYPE_HARDWARE;
> - attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
> + attr.sample_type = evsel->core.attr.sample_type &
> + (PERF_SAMPLE_MASK | PERF_SAMPLE_PHYS_ADDR);
I verified this patch and I can confirm the physical address can be
dumped successfully.
I have a more general question: it seems to me that we need to change
the macro PERF_SAMPLE_MASK in util/event.h as below, so that the flag
PERF_SAMPLE_PHYS_ADDR no longer needs to be ORed in here.
@Arnaldo, @Jiri, could you confirm if this is the right way to move
forward? I am not sure why PERF_SAMPLE_MASK doesn't contain the bit
PERF_SAMPLE_PHYS_ADDR in current code.
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index cdd72e05fd28..c905ac32ebad 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -39,7 +39,7 @@ struct perf_event_attr;
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD | \
- PERF_SAMPLE_IDENTIFIER)
+ PERF_SAMPLE_IDENTIFIER | PERF_SAMPLE_PHYS_ADDR)
Thanks,
Leo
> attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
> PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC |
> PERF_SAMPLE_WEIGHT | PERF_SAMPLE_ADDR;
> --
> 2.25.1
>
^ permalink raw reply related
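The masking question raised in this reply comes down to simple bit arithmetic: a flag that is not part of the mask is dropped when the parent's sample type is copied. A minimal sketch; the `DEMO_*` names and bit positions are illustrative stand-ins, not the definitions from the perf headers.

```c
#include <stdint.h>

/* Illustrative flag bits; the real values live in the perf headers. */
#define DEMO_SAMPLE_IP        (1ULL << 0)
#define DEMO_SAMPLE_ADDR      (1ULL << 3)
#define DEMO_SAMPLE_PHYS_ADDR (1ULL << 19)

/* A mask that does not include the physical-address flag. */
#define DEMO_SAMPLE_MASK (DEMO_SAMPLE_IP | DEMO_SAMPLE_ADDR)

/* Copy only those flags from the parent that the mask lets through. */
static uint64_t demo_inherit_sample_type(uint64_t parent, uint64_t mask)
{
	return parent & mask;
}
```

With a parent of `DEMO_SAMPLE_IP | DEMO_SAMPLE_PHYS_ADDR`, masking with `DEMO_SAMPLE_MASK` keeps only `DEMO_SAMPLE_IP`. Either ORing `DEMO_SAMPLE_PHYS_ADDR` into the mask at the call site or adding it to the mask definition preserves the flag, which are the two options weighed above.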
* Re: [PATCH 2/2] net: dsa: mv88e6xxx: Handle single-chip-address OF property
From: Nathan Rossi @ 2022-04-24 12:57 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, linux-kernel, Vivien Didelot, Florian Fainelli,
Vladimir Oltean, David S. Miller, Jakub Kicinski, Paolo Abeni
In-Reply-To: <YmQeIL4XYdTFTNm7@lunn.ch>
On Sun, 24 Apr 2022 at 01:41, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sun, Apr 24, 2022 at 12:41:22AM +1000, Nathan Rossi wrote:
> > On Sun, 24 Apr 2022 at 00:07, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > On Sat, Apr 23, 2022 at 01:14:27PM +0000, Nathan Rossi wrote:
> > > > Handle the parsing and use of single chip addressing when the switch has
> > > > the single-chip-address property defined. This allows for specifying the
> > > > switch as using single chip addressing even when mdio address 0 is used
> > > > by another device on the bus. This is a feature of some switches (e.g.
> > > > the MV88E6341/MV88E6141) where the switch shares the bus only responding
> > > > to the higher 16 addresses.
> > >
> > > Hi Nathan
> > >
> > > I think i'm missing something in this explanation:
> > >
> > > smi.c says:
> > >
> > > /* The switch ADDR[4:1] configuration pins define the chip SMI device address
> > > * (ADDR[0] is always zero, thus only even SMI addresses can be strapped).
> > > *
> > > * When ADDR is all zero, the chip uses Single-chip Addressing Mode, assuming it
> > > * is the only device connected to the SMI master. In this mode it responds to
> > > * all 32 possible SMI addresses, and thus maps directly the internal devices.
> > > *
> > > * When ADDR is non-zero, the chip uses Multi-chip Addressing Mode, allowing
> > > * multiple devices to share the SMI interface. In this mode it responds to only
> > > * 2 registers, used to indirectly access the internal SMI devices.
> > > *
> > > * Some chips use a different scheme: Only the ADDR4 pin is used for
> > > * configuration, and the device responds to 16 of the 32 SMI
> > > * addresses, allowing two to coexist on the same SMI interface.
> > > */
> > >
> > > So if ADDR = 0, it takes up the whole bus. And in this case reg = 0.
> > > If ADDR != 0, it is in multi chip mode, and DT reg = ADDR.
> > >
> > > int mv88e6xxx_smi_init(struct mv88e6xxx_chip *chip,
> > > struct mii_bus *bus, int sw_addr)
> > > {
> > > if (chip->info->dual_chip)
> > > chip->smi_ops = &mv88e6xxx_smi_dual_direct_ops;
> > > else if (sw_addr == 0)
> > > chip->smi_ops = &mv88e6xxx_smi_direct_ops;
> > > else if (chip->info->multi_chip)
> > > chip->smi_ops = &mv88e6xxx_smi_indirect_ops;
> > > else
> > > return -EINVAL;
> > >
> > > This seems to implement what is above. smi_direct_ops == whole bus,
> > > smi_indirect_ops == multi-chip mode.
> > >
> > > In what situation do you see this not working? What device are you
> > > using, what does you DT look like, and what at the ADDR value?
> >
> > The device I am using is the MV88E6141, it follows the second scheme
> > such that it only responds to the upper 16 of the 32 SMI addresses in
> > single chip addressing mode. I am able to define the switch at address
> > 0, and everything works. However in the device I am using (Netgate
> > SG-3100) the ethernet phys for the non switch ethernet interfaces are
> > also on the same mdio bus as the switch. One of those phys is
> > configured with address 0. Defining both the ethernet-phy and switch
> > as address 0 does not work.
> >
> > The device tree I have looks like:
> >
> > &mdio {
> > status = "okay";
> > pinctrl-0 = <&mdio_pins>;
> > pinctrl-names = "default";
> >
> > phy0: ethernet-phy@0 {
> > status = "okay";
> > reg = <0>;
> > };
> >
> > phy1: ethernet-phy@1 {
> > status = "okay";
> > reg = <1>;
> > };
>
> So normally, we would have
>
>
> switch0: switch0@16 {
> compatible = "marvell,mv88e6141", "marvell,mv88e6085";
> single-chip-address;
> reg = <0>;
> dsa,member = <0 0>;
> status = "okay";
>
> and then i guess you are seeing mdiobus_register_device() returning
> -EBUSY because the PHY is also at address 0?
Correct, that is the issue I am trying to solve here.
>
> This is what is missing from your explanation. It is always better to
> have more than less in the commit message.
>
> So the chip is using addresses 0x10-0x1f, but in order to probe, you
> need to put reg = 0, taking up slot 0, clashing with the PHY. Ideally
> we want to take up one of the slots in the range 0x10-0x1f. reg=16 on
> its own indicates multi-chip mode and the device is using address 16.
>
> O.K, a bit more digging into the datasheet:
>
> For multi-chip mode, for the 6341 family,
>
> The SMI address that is used is determined by the ADDR[3:0]
> configuration pins. ADDR[4] must be zero to select the device.
>
> So it can only take the address range 0-f, since ADDR[4] == 0. So 16
> is not even a valid multi-chip address. But it is valid for some other
> chips.
>
> So your DT property is says, ignore reg, i really am in single chip
> mode.
>
> This appears to be a general problem for any device with
> .port_base_addr = 0x10.
I had initially thought of using the port_base_addr along with setting
up an of_match for the 6141 to provide compat_info which smi init
could use.
>
> I'm wondering if a better solution to this is special case
> reg=16. First try mv88e6xxx_detect() in single chip mode. That will
> read register 3. A read should be safe. If we get back a valid ID for
> a switch, keep with single chip mode. Otherwise swap to multi-chip
> mode. A multi-chip mv88e6xxx_detect() is more dangerous, because that
> involves writes.
I tested this idea and have sent out a patch for it
(https://lore.kernel.org/netdev/20220424125451.295435-1-nathan@nathanrossi.com/).
It works correctly for the single chip detection case and safely falls
through on other multi-chip addresses. It would be great if you could
test this on the armada 370 rd board with reg=16.
One side note: I had to move the reset gpio setup so that it occurs
before the smi init. I am not sure whether there was a reason for the
reset to be left unconfigured before setting up smi; it seems that
might cause issues with the multi-chip smi init ops?
Thanks,
Nathan
>
> Looking at the existing DTs, there are only two using multi-chip mode
> with reg=16:
>
> arm/boot/dts/armada-370-rd.dts- reg = <0x10>;
> arm/boot/dts/kirkwood-linksys-viper.dts- reg = <16>;
>
> And i happen to have an armada-370-rd :-)
>
> Andrew
^ permalink raw reply
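The probing order discussed in this reply can be sketched independently of the driver. This is a rough illustration under stated assumptions: the `demo_*` names are hypothetical, the ID check is a placeholder for the driver's product table lookup, and the dual-chip case is left out. It only shows the decision: address 0 means single chip, address 16 is probed with a read-only access first, and everything else falls back to multi-chip mode.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_smi_mode { DEMO_SMI_DIRECT, DEMO_SMI_INDIRECT };

/* Hypothetical read callback: direct read of 'reg' at bus address 'addr'. */
typedef uint16_t (*demo_read_fn)(void *ctx, int addr, int reg);

#define DEMO_PROBE_ADDR    16 /* first port address, per the discussion */
#define DEMO_REG_SWITCH_ID 3  /* register read during detection */

/* Placeholder check; a real driver would match a product table. */
static bool demo_id_is_known(uint16_t id)
{
	return id != 0x0000 && id != 0xffff;
}

static enum demo_smi_mode demo_select_mode(int sw_addr, demo_read_fn read,
					   void *ctx)
{
	if (sw_addr == 0)
		return DEMO_SMI_DIRECT;

	/* A read has no side effects, so it is safe to try it first. */
	if (sw_addr == DEMO_PROBE_ADDR &&
	    demo_id_is_known(read(ctx, DEMO_PROBE_ADDR, DEMO_REG_SWITCH_ID)))
		return DEMO_SMI_DIRECT;

	return DEMO_SMI_INDIRECT;
}

/* Fake bus for trying the logic: ctx points at the ID to report. */
static uint16_t demo_fake_read(void *ctx, int addr, int reg)
{
	if (addr == DEMO_PROBE_ADDR && reg == DEMO_REG_SWITCH_ID)
		return *(uint16_t *)ctx;
	return 0xffff;
}
```

Passing a fake ID of 0xffff through `demo_fake_read()` models a chip that does not answer the direct read, in which case address 16 is treated as a multi-chip address.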
* [PATCH] net: dsa: mv88e6xxx: Single chip mode detection for MV88E6*41
From: Nathan Rossi @ 2022-04-24 12:54 UTC (permalink / raw)
To: netdev, linux-kernel
Cc: Nathan Rossi, Andrew Lunn, Vivien Didelot, Florian Fainelli,
Vladimir Oltean, David S. Miller, Jakub Kicinski, Paolo Abeni
The mv88e6xxx driver expects switches that are configured in single chip
addressing mode to have the MDIO address configured as 0. This is due to
the switch ADDR pins representing the single chip addressing mode as 0.
However depending on the device (e.g. MV88E6*41) the switch does not
respond on address 0 or any other address below 16 (the first port
address) in single chip addressing mode. This allows for other devices
to be on the same shared MDIO bus despite the switch being in single
chip addressing mode.
When using a switch that works this way, it is not possible to configure
the switch driver for single chip addressing via device tree while
another MDIO device on the same bus uses address 0: both devices
would have the same address of 0, resulting in mdiobus_register_device
-EBUSY errors for one of the devices with address 0.
In order to support this configuration the switch node can have its MDIO
address configured as 16 (the first address that the device responds
to). During initialization the driver treats this address similarly to
address 0. However, because 16 is also a valid multi-chip address (in
certain switch models, but not all), the driver first configures the SMI
in single chip addressing mode and attempts to detect the switch model.
If the device is configured in single chip addressing mode this succeeds
and initialization continues. If no valid model is detected, the cause is
that the switch model register is not a valid register in multi-chip
mode; the driver then falls back to the existing SMI initialization
process, using the MDIO address as the multi-chip mode address.
This detection method is safe whichever mode the device is in, because
the single chip addressing mode read is a direct SMI/MDIO read operation
and, unlike the SMI writes required for multi-chip addressing mode, has
no side effects.
In order to implement this change, the reset gpio configuration is moved
to occur before any SMI initialization. This ensures that the device has
the same/correct reset gpio state for both mv88e6xxx_smi_init calls.
Signed-off-by: Nathan Rossi <nathan@nathanrossi.com>
---
drivers/net/dsa/mv88e6xxx/chip.c | 45 +++++++++++++++++++++++++++++++++-------
1 file changed, 38 insertions(+), 7 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 64f4fdd029..8cdfafb5d2 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -6276,6 +6276,32 @@ static int mv88e6xxx_detect(struct mv88e6xxx_chip *chip)
return 0;
}
+static int mv88e6xxx_single_chip_detect(struct mv88e6xxx_chip *chip,
+ struct mdio_device *mdiodev)
+{
+ int err;
+
+ /* dual_chip takes precedence over single/multi-chip modes */
+ if (chip->info->dual_chip)
+ return -EINVAL;
+
+ /* If the mdio addr is 16 indicating the first port address of a switch
+ * (e.g. mv88e6*41) in single chip addressing mode the device may be
+ * configured in single chip addressing mode. Setup the smi access as
+ * single chip addressing mode and attempt to detect the model of the
+ * switch, if this fails the device is not configured in single chip
+ * addressing mode.
+ */
+ if (mdiodev->addr != 16)
+ return -EINVAL;
+
+ err = mv88e6xxx_smi_init(chip, mdiodev->bus, 0);
+ if (err)
+ return err;
+
+ return mv88e6xxx_detect(chip);
+}
+
static struct mv88e6xxx_chip *mv88e6xxx_alloc_chip(struct device *dev)
{
struct mv88e6xxx_chip *chip;
@@ -6959,10 +6985,6 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
chip->info = compat_info;
- err = mv88e6xxx_smi_init(chip, mdiodev->bus, mdiodev->addr);
- if (err)
- goto out;
-
chip->reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
if (IS_ERR(chip->reset)) {
err = PTR_ERR(chip->reset);
@@ -6971,9 +6993,18 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
if (chip->reset)
usleep_range(1000, 2000);
- err = mv88e6xxx_detect(chip);
- if (err)
- goto out;
+ /* Detect if the device is configured in single chip addressing mode,
+ * otherwise continue with address specific smi init/detection.
+ */
+ if (mv88e6xxx_single_chip_detect(chip, mdiodev)) {
+ err = mv88e6xxx_smi_init(chip, mdiodev->bus, mdiodev->addr);
+ if (err)
+ goto out;
+
+ err = mv88e6xxx_detect(chip);
+ if (err)
+ goto out;
+ }
if (chip->info->edsa_support == MV88E6XXX_EDSA_SUPPORTED)
chip->tag_protocol = DSA_TAG_PROTO_EDSA;
---
2.35.2
^ permalink raw reply related
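The probe order described in the patch above (try single chip access when the MDIO address is 16, otherwise fall back to address specific init) can be sketched in userspace as follows. This is only an illustration under stated assumptions: every `sketch_*` name is hypothetical, the product ID is an arbitrary mock value, and an invalid model register is modelled as reading back 0xffff, which the source does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; this models the probe order only. */
enum sketch_hw_mode { SKETCH_HW_SINGLE_CHIP, SKETCH_HW_MULTI_CHIP };
enum sketch_smi_mode { SKETCH_SMI_UNSET, SKETCH_SMI_SINGLE, SKETCH_SMI_MULTI };

#define SKETCH_FIRST_PORT_ADDR 16
#define SKETCH_PRODUCT_ID 0x3410 /* arbitrary mock value */

struct sketch_chip {
	enum sketch_hw_mode hw_mode;   /* how the mock hardware is strapped */
	bool dual_chip;                /* dual-chip parts skip the shortcut */
	enum sketch_smi_mode smi_mode; /* access mode the probe settled on */
	int sw_addr;                   /* address used for SMI access */
};

/* Mock direct read of the model register. Assumption: in multi-chip mode
 * the register is not valid, modelled here as reading back 0xffff.
 */
static uint16_t sketch_direct_read_model(const struct sketch_chip *chip)
{
	if (chip->hw_mode == SKETCH_HW_SINGLE_CHIP)
		return SKETCH_PRODUCT_ID;
	return 0xffff;
}

/* Returns 0 if the switch answered as a single chip device, -1 otherwise. */
static int sketch_single_chip_detect(struct sketch_chip *chip, int mdio_addr)
{
	if (chip->dual_chip || mdio_addr != SKETCH_FIRST_PORT_ADDR)
		return -1;

	/* Try single chip access first: a plain read with no side effects. */
	chip->smi_mode = SKETCH_SMI_SINGLE;
	chip->sw_addr = 0;
	if (sketch_direct_read_model(chip) != SKETCH_PRODUCT_ID)
		return -1;

	return 0;
}

static int sketch_probe(struct sketch_chip *chip, int mdio_addr)
{
	if (sketch_single_chip_detect(chip, mdio_addr) == 0)
		return 0;

	/* Fall back to address specific init, as before the patch. */
	chip->sw_addr = mdio_addr;
	chip->smi_mode = mdio_addr == 0 ? SKETCH_SMI_SINGLE : SKETCH_SMI_MULTI;
	return 0;
}
```

With a mock chip strapped for single chip mode, `sketch_probe(&chip, 16)` settles on single chip access at address 0; with one strapped for multi-chip mode, the same call falls back to multi-chip access at address 16.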
* Re: [PATCH 1/3] perf: arm-spe: Fix addresses of synthesized SPE events
From: Leo Yan @ 2022-04-24 12:28 UTC (permalink / raw)
To: Timothy Hayes
Cc: linux-kernel, linux-perf-users, acme, John Garry, Will Deacon,
Mathieu Poirier, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, linux-arm-kernel, netdev, bpf
In-Reply-To: <20220421165205.117662-2-timothy.hayes@arm.com>
On Thu, Apr 21, 2022 at 05:52:03PM +0100, Timothy Hayes wrote:
> This patch corrects a bug whereby synthesized events from SPE
> samples are missing virtual addresses.
>
> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
> ---
> tools/perf/util/arm-spe.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index d2b64e3f588b..151cc38a171c 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -1036,7 +1036,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
> attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
> attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
> PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC |
> - PERF_SAMPLE_WEIGHT;
> + PERF_SAMPLE_WEIGHT | PERF_SAMPLE_ADDR;
> if (spe->timeless_decoding)
> attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
> else
> --
> 2.25.1
>
^ permalink raw reply
* [PATCH net-next v3 1/2] rtnetlink: add extack support in fdb del handlers
From: Alaa Mohamed @ 2022-04-24 12:09 UTC (permalink / raw)
To: netdev
Cc: outreachy, roopa, jdenham, sbrivio, jesse.brandeburg,
anthony.l.nguyen, davem, kuba, pabeni, vladimir.oltean,
claudiu.manoil, alexandre.belloni, shshaikh, manishc, razor,
intel-wired-lan, linux-kernel, UNGLinuxDriver, GR-Linux-NIC-Dev,
bridge, eng.alaamohamedsoliman.am
In-Reply-To: <cover.1650800975.git.eng.alaamohamedsoliman.am@gmail.com>
Add extack support to .ndo_fdb_del in netdevice.h and
all related methods.
Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
---
changes in V3:
fix errors reported by checkpatch.pl
---
drivers/net/ethernet/intel/ice/ice_main.c | 4 ++--
drivers/net/ethernet/mscc/ocelot_net.c | 4 ++--
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 2 +-
drivers/net/macvlan.c | 2 +-
drivers/net/vxlan/vxlan_core.c | 2 +-
include/linux/netdevice.h | 2 +-
net/bridge/br_fdb.c | 2 +-
net/bridge/br_private.h | 2 +-
net/core/rtnetlink.c | 4 ++--
9 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index d768925785ca..7b55d8d94803 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5678,10 +5678,10 @@ ice_fdb_add(struct ndmsg *ndm, struct nlattr __always_unused *tb[],
static int
ice_fdb_del(struct ndmsg *ndm, __always_unused struct nlattr *tb[],
struct net_device *dev, const unsigned char *addr,
- __always_unused u16 vid)
+ __always_unused u16 vid, struct netlink_ext_ack *extack)
{
int err;
-
+
if (ndm->ndm_state & NUD_PERMANENT) {
netdev_err(dev, "FDB only supports static addresses\n");
return -EINVAL;
diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c
index 247bc105bdd2..e07c64e3159c 100644
--- a/drivers/net/ethernet/mscc/ocelot_net.c
+++ b/drivers/net/ethernet/mscc/ocelot_net.c
@@ -774,14 +774,14 @@ static int ocelot_port_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
static int ocelot_port_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev,
- const unsigned char *addr, u16 vid)
+ const unsigned char *addr, u16 vid, struct netlink_ext_ack *extack)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
int port = priv->chip_port;
- return ocelot_fdb_del(ocelot, port, addr, vid, ocelot_port->bridge);
+ return ocelot_fdb_del(ocelot, port, addr, vid, ocelot_port->bridge, extack);
}
static int ocelot_port_fdb_dump(struct sk_buff *skb,
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index d320567b2cca..51fa23418f6a 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -368,7 +368,7 @@ static int qlcnic_set_mac(struct net_device *netdev, void *p)
static int qlcnic_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *netdev,
- const unsigned char *addr, u16 vid)
+ const unsigned char *addr, u16 vid, struct netlink_ext_ack *extack)
{
struct qlcnic_adapter *adapter = netdev_priv(netdev);
int err = -EOPNOTSUPP;
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 069e8824c264..ffd34d9f7049 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -1017,7 +1017,7 @@ static int macvlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
static int macvlan_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev,
- const unsigned char *addr, u16 vid)
+ const unsigned char *addr, u16 vid, struct netlink_ext_ack *extack)
{
struct macvlan_dev *vlan = netdev_priv(dev);
int err = -EINVAL;
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index de97ff98d36e..cf2f60037340 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1280,7 +1280,7 @@ int __vxlan_fdb_delete(struct vxlan_dev *vxlan,
/* Delete entry (via netlink) */
static int vxlan_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev,
- const unsigned char *addr, u16 vid)
+ const unsigned char *addr, u16 vid, struct netlink_ext_ack *extack)
{
struct vxlan_dev *vxlan = netdev_priv(dev);
union vxlan_addr ip;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 28ea4f8269d4..d0d2a8f33c73 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1509,7 +1509,7 @@ struct net_device_ops {
struct nlattr *tb[],
struct net_device *dev,
const unsigned char *addr,
- u16 vid);
+ u16 vid, struct netlink_ext_ack *extack);
int (*ndo_fdb_dump)(struct sk_buff *skb,
struct netlink_callback *cb,
struct net_device *dev,
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6ccda68bd473..5bfce2e9a553 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1110,7 +1110,7 @@ static int __br_fdb_delete(struct net_bridge *br,
/* Remove neighbor entry with RTM_DELNEIGH */
int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev,
- const unsigned char *addr, u16 vid)
+ const unsigned char *addr, u16 vid, struct netlink_ext_ack *extack)
{
struct net_bridge_vlan_group *vg;
struct net_bridge_port *p = NULL;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 18ccc3d5d296..95348c1c9ce5 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -780,7 +780,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
const unsigned char *addr, u16 vid, unsigned long flags);
int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
- struct net_device *dev, const unsigned char *addr, u16 vid);
+ struct net_device *dev, const unsigned char *addr, u16 vid, struct netlink_ext_ack *extack);
int br_fdb_add(struct ndmsg *nlh, struct nlattr *tb[], struct net_device *dev,
const unsigned char *addr, u16 vid, u16 nlh_flags,
struct netlink_ext_ack *extack);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 4041b3e2e8ec..99b30ae58a47 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4223,7 +4223,7 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
const struct net_device_ops *ops = br_dev->netdev_ops;
if (ops->ndo_fdb_del)
- err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid);
+ err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid, extack);
if (err)
goto out;
@@ -4235,7 +4235,7 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
if (ndm->ndm_flags & NTF_SELF) {
if (dev->netdev_ops->ndo_fdb_del)
err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr,
- vid);
+ vid, extack);
else
err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid);
--
2.36.0
^ permalink raw reply related
* [PATCH net-next v3 2/2] net: vxlan: vxlan_core.c: Add extack support to vxlan_fdb_delete
From: Alaa Mohamed @ 2022-04-24 12:09 UTC (permalink / raw)
To: netdev
Cc: outreachy, roopa, jdenham, sbrivio, jesse.brandeburg,
anthony.l.nguyen, davem, kuba, pabeni, vladimir.oltean,
claudiu.manoil, alexandre.belloni, shshaikh, manishc, razor,
intel-wired-lan, linux-kernel, UNGLinuxDriver, GR-Linux-NIC-Dev,
bridge, eng.alaamohamedsoliman.am
In-Reply-To: <cover.1650800975.git.eng.alaamohamedsoliman.am@gmail.com>
Add extack to vxlan_fdb_delete and vxlan_fdb_parse
Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
---
changes in V2:
- fix spelling vxlan_fdb_delete
- add missing braces
- edit error message
---
changes in V3:
fix errors reported by checkpatch.pl
---
drivers/net/vxlan/vxlan_core.c | 36 +++++++++++++++++++++++-----------
1 file changed, 25 insertions(+), 11 deletions(-)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index cf2f60037340..4e1886655101 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1129,19 +1129,23 @@ static void vxlan_fdb_dst_destroy(struct vxlan_dev *vxlan, struct vxlan_fdb *f,
static int vxlan_fdb_parse(struct nlattr *tb[], struct vxlan_dev *vxlan,
union vxlan_addr *ip, __be16 *port, __be32 *src_vni,
- __be32 *vni, u32 *ifindex, u32 *nhid)
+ __be32 *vni, u32 *ifindex, u32 *nhid, struct netlink_ext_ack *extack)
{
struct net *net = dev_net(vxlan->dev);
int err;
if (tb[NDA_NH_ID] && (tb[NDA_DST] || tb[NDA_VNI] || tb[NDA_IFINDEX] ||
- tb[NDA_PORT]))
- return -EINVAL;
+ tb[NDA_PORT])){
+ NL_SET_ERR_MSG(extack, "DST, VNI, ifindex and port are mutually exclusive with NH_ID");
+ return -EINVAL;
+ }
if (tb[NDA_DST]) {
err = vxlan_nla_get_addr(ip, tb[NDA_DST]);
- if (err)
+ if (err){
+ NL_SET_ERR_MSG(extack, "Unsupported address family");
return err;
+ }
} else {
union vxlan_addr *remote = &vxlan->default_dst.remote_ip;
@@ -1157,24 +1161,30 @@ static int vxlan_fdb_parse(struct nlattr *tb[], struct vxlan_dev *vxlan,
}
if (tb[NDA_PORT]) {
- if (nla_len(tb[NDA_PORT]) != sizeof(__be16))
+ if (nla_len(tb[NDA_PORT]) != sizeof(__be16)){
+ NL_SET_ERR_MSG(extack, "Invalid vxlan port");
return -EINVAL;
+ }
*port = nla_get_be16(tb[NDA_PORT]);
} else {
*port = vxlan->cfg.dst_port;
}
if (tb[NDA_VNI]) {
- if (nla_len(tb[NDA_VNI]) != sizeof(u32))
+ if (nla_len(tb[NDA_VNI]) != sizeof(u32)){
+ NL_SET_ERR_MSG(extack, "Invalid vni");
return -EINVAL;
+ }
*vni = cpu_to_be32(nla_get_u32(tb[NDA_VNI]));
} else {
*vni = vxlan->default_dst.remote_vni;
}
if (tb[NDA_SRC_VNI]) {
- if (nla_len(tb[NDA_SRC_VNI]) != sizeof(u32))
+ if (nla_len(tb[NDA_SRC_VNI]) != sizeof(u32)){
+ NL_SET_ERR_MSG(extack, "Invalid src vni");
return -EINVAL;
+ }
*src_vni = cpu_to_be32(nla_get_u32(tb[NDA_SRC_VNI]));
} else {
*src_vni = vxlan->default_dst.remote_vni;
@@ -1183,12 +1193,16 @@ static int vxlan_fdb_parse(struct nlattr *tb[], struct vxlan_dev *vxlan,
if (tb[NDA_IFINDEX]) {
struct net_device *tdev;
- if (nla_len(tb[NDA_IFINDEX]) != sizeof(u32))
+ if (nla_len(tb[NDA_IFINDEX]) != sizeof(u32)){
+ NL_SET_ERR_MSG(extack, "Invalid ifindex");
return -EINVAL;
+ }
*ifindex = nla_get_u32(tb[NDA_IFINDEX]);
tdev = __dev_get_by_index(net, *ifindex);
- if (!tdev)
+ if (!tdev){
+ NL_SET_ERR_MSG(extack,"Device not found");
return -EADDRNOTAVAIL;
+ }
} else {
*ifindex = 0;
}
@@ -1226,7 +1240,7 @@ static int vxlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
return -EINVAL;
err = vxlan_fdb_parse(tb, vxlan, &ip, &port, &src_vni, &vni, &ifindex,
- &nhid);
+ &nhid, extack);
if (err)
return err;
@@ -1291,7 +1305,7 @@ static int vxlan_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
int err;
err = vxlan_fdb_parse(tb, vxlan, &ip, &port, &src_vni, &vni, &ifindex,
- &nhid);
+ &nhid, extack);
if (err)
return err;
--
2.36.0
^ permalink raw reply related
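The pattern used in the patch above (a handler forwards extack to a parser, which records a message before returning an error) can be sketched in userspace as follows. This is a rough model, not kernel code: `struct sketch_ext_ack`, `SKETCH_SET_ERR_MSG`, `sketch_parse_port` and `sketch_fdb_delete` are hypothetical stand-ins, and the macro only mirrors the idea that a NULL extack is tolerated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for an extended ack; only the message is modelled. */
struct sketch_ext_ack {
	const char *msg;
};

/* Record a static string if the caller supplied an extack; a NULL
 * extack is silently accepted so callers need no extra checks.
 */
#define SKETCH_SET_ERR_MSG(extack, text) do {		\
	struct sketch_ext_ack *e_ = (extack);		\
	if (e_)						\
		e_->msg = (text);			\
} while (0)

/* Hypothetical parser: a port attribute must be exactly two bytes long. */
static int sketch_parse_port(size_t attr_len, struct sketch_ext_ack *extack)
{
	if (attr_len != 2) {
		SKETCH_SET_ERR_MSG(extack, "Invalid vxlan port");
		return -EINVAL;
	}

	return 0;
}

/* Hypothetical delete handler: it only forwards extack to the parser. */
static int sketch_fdb_delete(size_t port_attr_len,
			     struct sketch_ext_ack *extack)
{
	return sketch_parse_port(port_attr_len, extack);
}
```

A caller that passes an extack gets both the error code and a human-readable reason; a caller that passes NULL still gets the error code.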
* [PATCH net-next v3 0/2] propagate extack to vxlan_fdb_delete
From: Alaa Mohamed @ 2022-04-24 12:09 UTC (permalink / raw)
To: netdev
Cc: outreachy, roopa, jdenham, sbrivio, jesse.brandeburg,
anthony.l.nguyen, davem, kuba, pabeni, vladimir.oltean,
claudiu.manoil, alexandre.belloni, shshaikh, manishc, razor,
intel-wired-lan, linux-kernel, UNGLinuxDriver, GR-Linux-NIC-Dev,
bridge, eng.alaamohamedsoliman.am
In order to propagate extack to vxlan_fdb_delete and vxlan_fdb_parse,
add extack to .ndo_fdb_del and edit all fdb del handlers.
Alaa Mohamed (2):
rtnetlink: add extack support in fdb del handlers
net: vxlan: vxlan_core.c: Add extack support to vxlan_fdb_delete
drivers/net/ethernet/intel/ice/ice_main.c | 4 +-
drivers/net/ethernet/mscc/ocelot_net.c | 4 +-
.../net/ethernet/qlogic/qlcnic/qlcnic_main.c | 2 +-
drivers/net/macvlan.c | 2 +-
drivers/net/vxlan/vxlan_core.c | 38 +++++++++++++------
include/linux/netdevice.h | 2 +-
net/bridge/br_fdb.c | 2 +-
net/bridge/br_private.h | 2 +-
net/core/rtnetlink.c | 4 +-
9 files changed, 37 insertions(+), 23 deletions(-)
--
2.36.0
^ permalink raw reply
* Re: [PATCH] bpf: init map_btf_id during compiling
From: kernel test robot @ 2022-04-24 11:47 UTC (permalink / raw)
To: menglong8.dong, ast
Cc: llvm, kbuild-all, rostedt, mingo, davem, yoshfuji, dsahern, kuba,
pabeni, benbjiang, flyingpeng, imagedong, edumazet, kafai,
talalahmad, keescook, mengensun, dongli.zhang, linux-kernel,
netdev
In-Reply-To: <20220424092613.863290-1-imagedong@tencent.com>
Hi,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on bpf-next/master]
[also build test WARNING on bpf/master net-next/master net/master linus/master v5.18-rc3 next-20220422]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/intel-lab-lkp/linux/commits/menglong8-dong-gmail-com/bpf-init-map_btf_id-during-compiling/20220424-172902
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: i386-randconfig-a015 (https://download.01.org/0day-ci/archive/20220424/202204241926.3xdM8EYM-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 1cddcfdc3c683b393df1a5c9063252eb60e52818)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/a1d0d3f8a71cc20be0b95fe9506a3b3bd1b572b5
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review menglong8-dong-gmail-com/bpf-init-map_btf_id-during-compiling/20220424-172902
git checkout a1d0d3f8a71cc20be0b95fe9506a3b3bd1b572b5
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash kernel/bpf/
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> kernel/bpf/btf.c:4823:41: warning: unused variable 'btf_vmlinux_map_ops' [-Wunused-const-variable]
static const struct bpf_map_ops * const btf_vmlinux_map_ops[] = {
^
1 warning generated.
vim +/btf_vmlinux_map_ops +4823 kernel/bpf/btf.c
8580ac9404f624 Alexei Starovoitov 2019-10-15 4822
41c48f3a982317 Andrey Ignatov 2020-06-19 @4823 static const struct bpf_map_ops * const btf_vmlinux_map_ops[] = {
41c48f3a982317 Andrey Ignatov 2020-06-19 4824 #define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type)
41c48f3a982317 Andrey Ignatov 2020-06-19 4825 #define BPF_LINK_TYPE(_id, _name)
41c48f3a982317 Andrey Ignatov 2020-06-19 4826 #define BPF_MAP_TYPE(_id, _ops) \
41c48f3a982317 Andrey Ignatov 2020-06-19 4827 [_id] = &_ops,
41c48f3a982317 Andrey Ignatov 2020-06-19 4828 #include <linux/bpf_types.h>
41c48f3a982317 Andrey Ignatov 2020-06-19 4829 #undef BPF_PROG_TYPE
41c48f3a982317 Andrey Ignatov 2020-06-19 4830 #undef BPF_LINK_TYPE
41c48f3a982317 Andrey Ignatov 2020-06-19 4831 #undef BPF_MAP_TYPE
41c48f3a982317 Andrey Ignatov 2020-06-19 4832 };
41c48f3a982317 Andrey Ignatov 2020-06-19 4833
--
0-DAY CI Kernel Test Service
https://01.org/lkp
^ permalink raw reply
* [PATCH V2] vDPA/ifcvf: allow userspace to suspend a queue
From: Zhu Lingshan @ 2022-04-24 11:33 UTC (permalink / raw)
To: jasowang, mst; +Cc: virtualization, netdev, Zhu Lingshan
Previously, the ifcvf driver implemented a lazy-initialization mechanism
for the virtqueues: it stored all virtqueue config fields passed down
from userspace, then loaded them into the virtqueues and enabled the
queues upon DRIVER_OK.
To allow userspace to suspend a virtqueue, this commit passes
queue_enable to the virtqueue directly through set_vq_ready().
This feature requires all virtqueue ops (set_vq_addr, set_vq_num and
set_vq_ready) to take immediate action rather than rely on lazy
initialization, and this commit implements them that way, so
ifcvf_hw_enable() is retired.
set_features() should take immediate action as well.
ifcvf_add_status() is retired because the driver should not add a
status such as FEATURES_OK on its own decision; it should only set the
device status upon vdpa_ops.set_status().
To avoid losing virtqueue configurations caused by multiple
rounds of reset(), this commit also refactors the device reset
routine: it now simply resets the config handler and the virtqueues,
and performs the device reset only once.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
drivers/vdpa/ifcvf/ifcvf_base.c | 150 +++++++++++++++++++-------------
drivers/vdpa/ifcvf/ifcvf_base.h | 16 ++--
drivers/vdpa/ifcvf/ifcvf_main.c | 81 +++--------------
3 files changed, 111 insertions(+), 136 deletions(-)
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 48c4dadb0c7c..bbc9007a6f34 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -179,20 +179,7 @@ void ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
void ifcvf_reset(struct ifcvf_hw *hw)
{
- hw->config_cb.callback = NULL;
- hw->config_cb.private = NULL;
-
ifcvf_set_status(hw, 0);
- /* flush set_status, make sure VF is stopped, reset */
- ifcvf_get_status(hw);
-}
-
-static void ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
-{
- if (status != 0)
- status |= ifcvf_get_status(hw);
-
- ifcvf_set_status(hw, status);
ifcvf_get_status(hw);
}
@@ -213,7 +200,7 @@ u64 ifcvf_get_hw_features(struct ifcvf_hw *hw)
return features;
}
-u64 ifcvf_get_features(struct ifcvf_hw *hw)
+u64 ifcvf_get_device_features(struct ifcvf_hw *hw)
{
return hw->hw_features;
}
@@ -280,7 +267,7 @@ void ifcvf_write_dev_config(struct ifcvf_hw *hw, u64 offset,
vp_iowrite8(*p++, hw->dev_cfg + offset + i);
}
-static void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
+void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
{
struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
@@ -289,22 +276,22 @@ static void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
vp_iowrite32(1, &cfg->guest_feature_select);
vp_iowrite32(features >> 32, &cfg->guest_feature);
+
+ vp_ioread32(&cfg->guest_feature);
}
-static int ifcvf_config_features(struct ifcvf_hw *hw)
+u64 ifcvf_get_features(struct ifcvf_hw *hw)
{
- struct ifcvf_adapter *ifcvf;
+ struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+ u64 features;
- ifcvf = vf_to_adapter(hw);
- ifcvf_set_features(hw, hw->req_features);
- ifcvf_add_status(hw, VIRTIO_CONFIG_S_FEATURES_OK);
+ vp_iowrite32(0, &cfg->device_feature_select);
+ features = vp_ioread32(&cfg->device_feature);
- if (!(ifcvf_get_status(hw) & VIRTIO_CONFIG_S_FEATURES_OK)) {
- IFCVF_ERR(ifcvf->pdev, "Failed to set FEATURES_OK status\n");
- return -EIO;
- }
+ vp_iowrite32(1, &cfg->device_feature_select);
+ features |= ((u64)vp_ioread32(&cfg->guest_feature) << 32);
- return 0;
+ return features;
}
u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid)
@@ -331,68 +318,111 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num)
ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg;
q_pair_id = qid / hw->nr_vring;
avail_idx_addr = &ifcvf_lm->vring_lm_cfg[q_pair_id].idx_addr[qid % 2];
- hw->vring[qid].last_avail_idx = num;
vp_iowrite16(num, avail_idx_addr);
return 0;
}
-static int ifcvf_hw_enable(struct ifcvf_hw *hw)
+void ifcvf_set_vq_num(struct ifcvf_hw *hw, u16 qid, u32 num)
{
- struct virtio_pci_common_cfg __iomem *cfg;
- u32 i;
+ struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
- cfg = hw->common_cfg;
- for (i = 0; i < hw->nr_vring; i++) {
- if (!hw->vring[i].ready)
- break;
+ vp_iowrite16(qid, &cfg->queue_select);
+ vp_iowrite16(num, &cfg->queue_size);
+}
- vp_iowrite16(i, &cfg->queue_select);
- vp_iowrite64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
- &cfg->queue_desc_hi);
- vp_iowrite64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo,
- &cfg->queue_avail_hi);
- vp_iowrite64_twopart(hw->vring[i].used, &cfg->queue_used_lo,
- &cfg->queue_used_hi);
- vp_iowrite16(hw->vring[i].size, &cfg->queue_size);
- ifcvf_set_vq_state(hw, i, hw->vring[i].last_avail_idx);
- vp_iowrite16(1, &cfg->queue_enable);
- }
+int ifcvf_set_vq_address(struct ifcvf_hw *hw, u16 qid, u64 desc_area,
+ u64 driver_area, u64 device_area)
+{
+ struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+
+ vp_iowrite16(qid, &cfg->queue_select);
+ vp_iowrite64_twopart(desc_area, &cfg->queue_desc_lo,
+ &cfg->queue_desc_hi);
+ vp_iowrite64_twopart(driver_area, &cfg->queue_avail_lo,
+ &cfg->queue_avail_hi);
+ vp_iowrite64_twopart(device_area, &cfg->queue_used_lo,
+ &cfg->queue_used_hi);
return 0;
}
-static void ifcvf_hw_disable(struct ifcvf_hw *hw)
+void ifcvf_set_vq_ready(struct ifcvf_hw *hw, u16 qid, bool ready)
{
- u32 i;
+ struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+
+ vp_iowrite16(qid, &cfg->queue_select);
+ /* write 0 to queue_enable will suspend a queue*/
+ vp_iowrite16(ready, &cfg->queue_enable);
+}
+
+bool ifcvf_get_vq_ready(struct ifcvf_hw *hw, u16 qid)
+{
+ struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+ bool queue_enable;
+
+ vp_iowrite16(qid, &cfg->queue_select);
+ queue_enable = vp_ioread16(&cfg->queue_enable);
+
+ return (bool)queue_enable;
+}
+
+static void synchronize_per_vq_irq(struct ifcvf_hw *hw)
+{
+ int i;
- ifcvf_set_config_vector(hw, VIRTIO_MSI_NO_VECTOR);
for (i = 0; i < hw->nr_vring; i++) {
- ifcvf_set_vq_vector(hw, i, VIRTIO_MSI_NO_VECTOR);
+ if (hw->vring[i].irq != -EINVAL)
+ synchronize_irq(hw->vring[i].irq);
}
}
-int ifcvf_start_hw(struct ifcvf_hw *hw)
+static void synchronize_vqs_reused_irq(struct ifcvf_hw *hw)
{
- ifcvf_reset(hw);
- ifcvf_add_status(hw, VIRTIO_CONFIG_S_ACKNOWLEDGE);
- ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER);
+ if (hw->vqs_reused_irq != -EINVAL)
+ synchronize_irq(hw->vqs_reused_irq);
+}
- if (ifcvf_config_features(hw) < 0)
- return -EINVAL;
+static void synchronize_vq_irq(struct ifcvf_hw *hw)
+{
+ u8 status = hw->msix_vector_status;
- if (ifcvf_hw_enable(hw) < 0)
- return -EINVAL;
+ if (status == MSIX_VECTOR_PER_VQ_AND_CONFIG)
+ synchronize_per_vq_irq(hw);
+ else
+ synchronize_vqs_reused_irq(hw);
+}
- ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER_OK);
+static void synchronize_config_irq(struct ifcvf_hw *hw)
+{
+ if (hw->config_irq != -EINVAL)
+ synchronize_irq(hw->config_irq);
+}
- return 0;
+static void ifcvf_reset_vring(struct ifcvf_hw *hw)
+{
+ int i;
+
+ for (i = 0; i < hw->nr_vring; i++) {
+ synchronize_vq_irq(hw);
+ hw->vring[i].cb.callback = NULL;
+ hw->vring[i].cb.private = NULL;
+ ifcvf_set_vq_vector(hw, i, VIRTIO_MSI_NO_VECTOR);
+ }
+}
+
+static void ifcvf_reset_config_handler(struct ifcvf_hw *hw)
+{
+ synchronize_config_irq(hw);
+ hw->config_cb.callback = NULL;
+ hw->config_cb.private = NULL;
+ ifcvf_set_config_vector(hw, VIRTIO_MSI_NO_VECTOR);
}
void ifcvf_stop_hw(struct ifcvf_hw *hw)
{
- ifcvf_hw_disable(hw);
- ifcvf_reset(hw);
+ ifcvf_reset_vring(hw);
+ ifcvf_reset_config_handler(hw);
}
void ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index 115b61f4924b..f3dce0d795cb 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -49,12 +49,6 @@
#define MSIX_VECTOR_DEV_SHARED 3
struct vring_info {
- u64 desc;
- u64 avail;
- u64 used;
- u16 size;
- u16 last_avail_idx;
- bool ready;
void __iomem *notify_addr;
phys_addr_t notify_pa;
u32 irq;
@@ -76,7 +70,6 @@ struct ifcvf_hw {
phys_addr_t notify_base_pa;
u32 notify_off_multiplier;
u32 dev_type;
- u64 req_features;
u64 hw_features;
struct virtio_pci_common_cfg __iomem *common_cfg;
void __iomem *dev_cfg;
@@ -123,7 +116,7 @@ u8 ifcvf_get_status(struct ifcvf_hw *hw);
void ifcvf_set_status(struct ifcvf_hw *hw, u8 status);
void io_write64_twopart(u64 val, u32 *lo, u32 *hi);
void ifcvf_reset(struct ifcvf_hw *hw);
-u64 ifcvf_get_features(struct ifcvf_hw *hw);
+u64 ifcvf_get_device_features(struct ifcvf_hw *hw);
u64 ifcvf_get_hw_features(struct ifcvf_hw *hw);
int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features);
u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid);
@@ -131,6 +124,13 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num);
struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw);
int ifcvf_probed_virtio_net(struct ifcvf_hw *hw);
u32 ifcvf_get_config_size(struct ifcvf_hw *hw);
+int ifcvf_set_vq_address(struct ifcvf_hw *hw, u16 qid, u64 desc_area,
+ u64 driver_area, u64 device_area);
u16 ifcvf_set_vq_vector(struct ifcvf_hw *hw, u16 qid, int vector);
u16 ifcvf_set_config_vector(struct ifcvf_hw *hw, int vector);
+void ifcvf_set_vq_num(struct ifcvf_hw *hw, u16 qid, u32 num);
+void ifcvf_set_vq_ready(struct ifcvf_hw *hw, u16 qid, bool ready);
+bool ifcvf_get_vq_ready(struct ifcvf_hw *hw, u16 qid);
+void ifcvf_set_features(struct ifcvf_hw *hw, u64 features);
+u64 ifcvf_get_features(struct ifcvf_hw *hw);
#endif /* _IFCVF_H_ */
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 4366320fb68d..0257ba98cffe 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -358,53 +358,6 @@ static int ifcvf_request_irq(struct ifcvf_adapter *adapter)
return 0;
}
-static int ifcvf_start_datapath(void *private)
-{
- struct ifcvf_hw *vf = ifcvf_private_to_vf(private);
- u8 status;
- int ret;
-
- ret = ifcvf_start_hw(vf);
- if (ret < 0) {
- status = ifcvf_get_status(vf);
- status |= VIRTIO_CONFIG_S_FAILED;
- ifcvf_set_status(vf, status);
- }
-
- return ret;
-}
-
-static int ifcvf_stop_datapath(void *private)
-{
- struct ifcvf_hw *vf = ifcvf_private_to_vf(private);
- int i;
-
- for (i = 0; i < vf->nr_vring; i++)
- vf->vring[i].cb.callback = NULL;
-
- ifcvf_stop_hw(vf);
-
- return 0;
-}
-
-static void ifcvf_reset_vring(struct ifcvf_adapter *adapter)
-{
- struct ifcvf_hw *vf = ifcvf_private_to_vf(adapter);
- int i;
-
- for (i = 0; i < vf->nr_vring; i++) {
- vf->vring[i].last_avail_idx = 0;
- vf->vring[i].desc = 0;
- vf->vring[i].avail = 0;
- vf->vring[i].used = 0;
- vf->vring[i].ready = 0;
- vf->vring[i].cb.callback = NULL;
- vf->vring[i].cb.private = NULL;
- }
-
- ifcvf_reset(vf);
-}
-
static struct ifcvf_adapter *vdpa_to_adapter(struct vdpa_device *vdpa_dev)
{
return container_of(vdpa_dev, struct ifcvf_adapter, vdpa);
@@ -426,7 +379,7 @@ static u64 ifcvf_vdpa_get_device_features(struct vdpa_device *vdpa_dev)
u64 features;
if (type == VIRTIO_ID_NET || type == VIRTIO_ID_BLOCK)
- features = ifcvf_get_features(vf);
+ features = ifcvf_get_device_features(vf);
else {
features = 0;
IFCVF_ERR(pdev, "VIRTIO ID %u not supported\n", vf->dev_type);
@@ -444,7 +397,7 @@ static int ifcvf_vdpa_set_driver_features(struct vdpa_device *vdpa_dev, u64 feat
if (ret)
return ret;
- vf->req_features = features;
+ ifcvf_set_features(vf, features);
return 0;
}
@@ -453,7 +406,7 @@ static u64 ifcvf_vdpa_get_driver_features(struct vdpa_device *vdpa_dev)
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
- return vf->req_features;
+ return ifcvf_get_features(vf);
}
static u8 ifcvf_vdpa_get_status(struct vdpa_device *vdpa_dev)
@@ -486,11 +439,6 @@ static void ifcvf_vdpa_set_status(struct vdpa_device *vdpa_dev, u8 status)
ifcvf_set_status(vf, status);
return;
}
-
- if (ifcvf_start_datapath(adapter) < 0)
- IFCVF_ERR(adapter->pdev,
- "Failed to set ifcvf vdpa status %u\n",
- status);
}
ifcvf_set_status(vf, status);
@@ -509,12 +457,10 @@ static int ifcvf_vdpa_reset(struct vdpa_device *vdpa_dev)
if (status_old == 0)
return 0;
- if (status_old & VIRTIO_CONFIG_S_DRIVER_OK) {
- ifcvf_stop_datapath(adapter);
- ifcvf_free_irq(adapter);
- }
+ ifcvf_stop_hw(vf);
+ ifcvf_free_irq(adapter);
- ifcvf_reset_vring(adapter);
+ ifcvf_reset(vf);
return 0;
}
@@ -554,14 +500,17 @@ static void ifcvf_vdpa_set_vq_ready(struct vdpa_device *vdpa_dev,
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
- vf->vring[qid].ready = ready;
+ ifcvf_set_vq_ready(vf, qid, ready);
}
static bool ifcvf_vdpa_get_vq_ready(struct vdpa_device *vdpa_dev, u16 qid)
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
+ bool ready;
+
+ ready = ifcvf_get_vq_ready(vf, qid);
- return vf->vring[qid].ready;
+ return ready;
}
static void ifcvf_vdpa_set_vq_num(struct vdpa_device *vdpa_dev, u16 qid,
@@ -569,7 +518,7 @@ static void ifcvf_vdpa_set_vq_num(struct vdpa_device *vdpa_dev, u16 qid,
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
- vf->vring[qid].size = num;
+ ifcvf_set_vq_num(vf, qid, num);
}
static int ifcvf_vdpa_set_vq_address(struct vdpa_device *vdpa_dev, u16 qid,
@@ -578,11 +527,7 @@ static int ifcvf_vdpa_set_vq_address(struct vdpa_device *vdpa_dev, u16 qid,
{
struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
- vf->vring[qid].desc = desc_area;
- vf->vring[qid].avail = driver_area;
- vf->vring[qid].used = device_area;
-
- return 0;
+ return ifcvf_set_vq_address(vf, qid, desc_area, driver_area, device_area);
}
static void ifcvf_vdpa_kick_vq(struct vdpa_device *vdpa_dev, u16 qid)
--
2.31.1
^ permalink raw reply related
* [Patch net-next] net: dsa: ksz: added the generic port_stp_state_set function
From: Arun Ramadoss @ 2022-04-24 11:28 UTC (permalink / raw)
To: linux-kernel, netdev
Cc: Paolo Abeni, Jakub Kicinski, David S. Miller, Vladimir Oltean,
Florian Fainelli, Vivien Didelot, Andrew Lunn, UNGLinuxDriver,
Woojung Huh
The ksz8795 and ksz9477 use the same algorithm for the
port_stp_state_set function; only the register address differs. So
move the algorithm to ksz_common.c and use the dev_ops for
register read and write. This function can also be used for the lan937x
part, which makes it generic for all the parts.
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
---
drivers/net/dsa/microchip/ksz8795.c | 35 +---------------------
drivers/net/dsa/microchip/ksz8795_reg.h | 3 --
drivers/net/dsa/microchip/ksz9477.c | 33 +-------------------
drivers/net/dsa/microchip/ksz9477_reg.h | 4 ---
drivers/net/dsa/microchip/ksz_common.c | 40 +++++++++++++++++++++++++
drivers/net/dsa/microchip/ksz_common.h | 7 +++++
6 files changed, 49 insertions(+), 73 deletions(-)
diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
index b2752978cb09..f91deea9368e 100644
--- a/drivers/net/dsa/microchip/ksz8795.c
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -1027,40 +1027,7 @@ static void ksz8_cfg_port_member(struct ksz_device *dev, int port, u8 member)
static void ksz8_port_stp_state_set(struct dsa_switch *ds, int port, u8 state)
{
- struct ksz_device *dev = ds->priv;
- struct ksz_port *p;
- u8 data;
-
- ksz_pread8(dev, port, P_STP_CTRL, &data);
- data &= ~(PORT_TX_ENABLE | PORT_RX_ENABLE | PORT_LEARN_DISABLE);
-
- switch (state) {
- case BR_STATE_DISABLED:
- data |= PORT_LEARN_DISABLE;
- break;
- case BR_STATE_LISTENING:
- data |= (PORT_RX_ENABLE | PORT_LEARN_DISABLE);
- break;
- case BR_STATE_LEARNING:
- data |= PORT_RX_ENABLE;
- break;
- case BR_STATE_FORWARDING:
- data |= (PORT_TX_ENABLE | PORT_RX_ENABLE);
- break;
- case BR_STATE_BLOCKING:
- data |= PORT_LEARN_DISABLE;
- break;
- default:
- dev_err(ds->dev, "invalid STP state: %d\n", state);
- return;
- }
-
- ksz_pwrite8(dev, port, P_STP_CTRL, data);
-
- p = &dev->ports[port];
- p->stp_state = state;
-
- ksz_update_port_member(dev, port);
+ ksz_port_stp_state_set(ds, port, state, P_STP_CTRL);
}
static void ksz8_flush_dyn_mac_table(struct ksz_device *dev, int port)
diff --git a/drivers/net/dsa/microchip/ksz8795_reg.h b/drivers/net/dsa/microchip/ksz8795_reg.h
index d74defcd86b4..4109433b6b6c 100644
--- a/drivers/net/dsa/microchip/ksz8795_reg.h
+++ b/drivers/net/dsa/microchip/ksz8795_reg.h
@@ -160,9 +160,6 @@
#define PORT_DISCARD_NON_VID BIT(5)
#define PORT_FORCE_FLOW_CTRL BIT(4)
#define PORT_BACK_PRESSURE BIT(3)
-#define PORT_TX_ENABLE BIT(2)
-#define PORT_RX_ENABLE BIT(1)
-#define PORT_LEARN_DISABLE BIT(0)
#define REG_PORT_1_CTRL_3 0x13
#define REG_PORT_2_CTRL_3 0x23
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index 8222c8a6c5ec..4f617fee9a4e 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -517,38 +517,7 @@ static void ksz9477_cfg_port_member(struct ksz_device *dev, int port,
static void ksz9477_port_stp_state_set(struct dsa_switch *ds, int port,
u8 state)
{
- struct ksz_device *dev = ds->priv;
- struct ksz_port *p = &dev->ports[port];
- u8 data;
-
- ksz_pread8(dev, port, P_STP_CTRL, &data);
- data &= ~(PORT_TX_ENABLE | PORT_RX_ENABLE | PORT_LEARN_DISABLE);
-
- switch (state) {
- case BR_STATE_DISABLED:
- data |= PORT_LEARN_DISABLE;
- break;
- case BR_STATE_LISTENING:
- data |= (PORT_RX_ENABLE | PORT_LEARN_DISABLE);
- break;
- case BR_STATE_LEARNING:
- data |= PORT_RX_ENABLE;
- break;
- case BR_STATE_FORWARDING:
- data |= (PORT_TX_ENABLE | PORT_RX_ENABLE);
- break;
- case BR_STATE_BLOCKING:
- data |= PORT_LEARN_DISABLE;
- break;
- default:
- dev_err(ds->dev, "invalid STP state: %d\n", state);
- return;
- }
-
- ksz_pwrite8(dev, port, P_STP_CTRL, data);
- p->stp_state = state;
-
- ksz_update_port_member(dev, port);
+ ksz_port_stp_state_set(ds, port, state, P_STP_CTRL);
}
static void ksz9477_flush_dyn_mac_table(struct ksz_device *dev, int port)
diff --git a/drivers/net/dsa/microchip/ksz9477_reg.h b/drivers/net/dsa/microchip/ksz9477_reg.h
index 0bd58467181f..7a2c8d4767af 100644
--- a/drivers/net/dsa/microchip/ksz9477_reg.h
+++ b/drivers/net/dsa/microchip/ksz9477_reg.h
@@ -1586,10 +1586,6 @@
#define REG_PORT_LUE_MSTP_STATE 0x0B04
-#define PORT_TX_ENABLE BIT(2)
-#define PORT_RX_ENABLE BIT(1)
-#define PORT_LEARN_DISABLE BIT(0)
-
/* C - PTP */
#define REG_PTP_PORT_RX_DELAY__2 0x0C00
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 8014b18d9391..9b9f570ebb0b 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -372,6 +372,46 @@ int ksz_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy)
}
EXPORT_SYMBOL_GPL(ksz_enable_port);
+void ksz_port_stp_state_set(struct dsa_switch *ds, int port,
+ u8 state, int reg)
+{
+ struct ksz_device *dev = ds->priv;
+ struct ksz_port *p;
+ u8 data;
+
+ ksz_pread8(dev, port, reg, &data);
+ data &= ~(PORT_TX_ENABLE | PORT_RX_ENABLE | PORT_LEARN_DISABLE);
+
+ switch (state) {
+ case BR_STATE_DISABLED:
+ data |= PORT_LEARN_DISABLE;
+ break;
+ case BR_STATE_LISTENING:
+ data |= (PORT_RX_ENABLE | PORT_LEARN_DISABLE);
+ break;
+ case BR_STATE_LEARNING:
+ data |= PORT_RX_ENABLE;
+ break;
+ case BR_STATE_FORWARDING:
+ data |= (PORT_TX_ENABLE | PORT_RX_ENABLE);
+ break;
+ case BR_STATE_BLOCKING:
+ data |= PORT_LEARN_DISABLE;
+ break;
+ default:
+ dev_err(ds->dev, "invalid STP state: %d\n", state);
+ return;
+ }
+
+ ksz_pwrite8(dev, port, reg, data);
+
+ p = &dev->ports[port];
+ p->stp_state = state;
+
+ ksz_update_port_member(dev, port);
+}
+EXPORT_SYMBOL_GPL(ksz_port_stp_state_set);
+
struct ksz_device *ksz_switch_alloc(struct device *base, void *priv)
{
struct dsa_switch *ds;
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index 485d4a948c38..4d978832c448 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -165,6 +165,8 @@ int ksz_port_bridge_join(struct dsa_switch *ds, int port,
struct netlink_ext_ack *extack);
void ksz_port_bridge_leave(struct dsa_switch *ds, int port,
struct dsa_bridge bridge);
+void ksz_port_stp_state_set(struct dsa_switch *ds, int port,
+ u8 state, int reg);
void ksz_port_fast_age(struct dsa_switch *ds, int port);
int ksz_port_fdb_dump(struct dsa_switch *ds, int port, dsa_fdb_dump_cb_t *cb,
void *data);
@@ -292,6 +294,11 @@ static inline void ksz_regmap_unlock(void *__mtx)
mutex_unlock(mtx);
}
+/* STP State Defines */
+#define PORT_TX_ENABLE BIT(2)
+#define PORT_RX_ENABLE BIT(1)
+#define PORT_LEARN_DISABLE BIT(0)
+
/* Regmap tables generation */
#define KSZ_SPI_OP_RD 3
#define KSZ_SPI_OP_WR 2
base-commit: cfc1d91a7d78cf9de25b043d81efcc16966d55b3
--
2.33.0
^ permalink raw reply related
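The state-to-bits mapping that the patch above moves into ksz_common.c can be
sketched outside the kernel as follows. This is only an illustration under
stated assumptions: stp_ctrl_bits() is a hypothetical helper name that does not
exist in the driver, the BR_STATE_* values are copied from the bridge UAPI
header, and the register read/write and the ksz_update_port_member() call are
left out.

```c
#include <stdint.h>

/* Bit values as defined in the ksz_common.h hunk of the patch. */
#define BIT(n)              (1u << (n))
#define PORT_TX_ENABLE      BIT(2)
#define PORT_RX_ENABLE      BIT(1)
#define PORT_LEARN_DISABLE  BIT(0)

/* Bridge STP states, with the values used by include/uapi/linux/if_bridge.h. */
enum {
	BR_STATE_DISABLED   = 0,
	BR_STATE_LISTENING  = 1,
	BR_STATE_LEARNING   = 2,
	BR_STATE_FORWARDING = 3,
	BR_STATE_BLOCKING   = 4,
};

/*
 * Hypothetical helper (not part of the driver): compute the new value of the
 * STP control register from its old value and the requested STP state.
 * Returns 0 on success, -1 for an unknown state (in which case *out is left
 * untouched, mirroring the early return in the patch).
 */
static int stp_ctrl_bits(uint8_t old, uint8_t state, uint8_t *out)
{
	/* Clear the three STP-related bits, keep everything else. */
	uint8_t data = old & (uint8_t)~(PORT_TX_ENABLE | PORT_RX_ENABLE |
					PORT_LEARN_DISABLE);

	switch (state) {
	case BR_STATE_DISABLED:
	case BR_STATE_BLOCKING:
		data |= PORT_LEARN_DISABLE;
		break;
	case BR_STATE_LISTENING:
		data |= (PORT_RX_ENABLE | PORT_LEARN_DISABLE);
		break;
	case BR_STATE_LEARNING:
		data |= PORT_RX_ENABLE;
		break;
	case BR_STATE_FORWARDING:
		data |= (PORT_TX_ENABLE | PORT_RX_ENABLE);
		break;
	default:
		return -1;
	}

	*out = data;
	return 0;
}
```

Since the only chip-specific input in the patch is the register offset, a pure
function like this would behave identically for both callers; the per-chip
wrappers only differ in the value passed as reg.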
* Re: [PATCH net v2] virtio_net: fix wrong buf address calculation when using xdp
From: Xuan Zhuo @ 2022-04-24 11:18 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: kuba, davem, stable, Jason Wang, Daniel Borkmann,
Michael S. Tsirkin, virtualization, netdev
In-Reply-To: <94172c53-2919-9eab-7933-91a78bdb87f0@blackwall.org>
On Sun, 24 Apr 2022 13:56:17 +0300, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> On 24/04/2022 13:42, Xuan Zhuo wrote:
> > On Sun, 24 Apr 2022 13:21:21 +0300, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> >> We received a report[1] of kernel crashes when Cilium is used in XDP
> >> mode with virtio_net after updating to newer kernels. After
> >> investigating the reason it turned out that when using mergeable bufs
> >> with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf()
> >> calculates the build_skb address wrong because the offset can become less
> >> than the headroom so it gets the address of the previous page (-X bytes
> >> depending on how lower offset is):
> >> page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
> >>
> >> This is a pr_err() I added in the beginning of page_to_skb which clearly
> >> shows offset that is less than headroom by adding 4 bytes of metadata
> >> via an xdp prog. The calculations done are:
> >> receive_mergeable():
> >> headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
> >> offset = xdp.data - page_address(xdp_page) -
> >> vi->hdr_len - metasize;
> >>
> >> page_to_skb():
> >> p = page_address(page) + offset;
> >> ...
> >> buf = p - headroom;
> >>
> >> Now buf goes -4 bytes from the page's starting address as can be seen
> >> above which is set as skb->head and skb->data by build_skb later. Depending
> >> on what's done with the skb (when it's freed most often) we get all kinds
> >> of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
> >> the new headroom after the xdp program has run, similar to how offset
> >> and len are recalculated. Headroom is directly related to
> >> data_hard_start, data and data_meta, so we use them to get the new size.
> >> The result is correct (similar pr_err() in page_to_skb, one case of
> >> xdp_page and one case of virtnet buf):
> >> a) Case with 4 bytes of metadata
> >> [ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
> >> [ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
> >> b) Case of pushing data +32 bytes
> >> [ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
> >> [ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
> >> c) Case of pushing data -33 bytes
> >> [ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
> >> [ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
> >>
> >> An example reproducer xdp prog[3] is below.
> >>
> >> [1] https://github.com/cilium/cilium/issues/19453
> >>
> >> [2] Two of the many traces:
> >> [ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940
> >> [ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7
> >> [ 41.300891] kernel BUG at include/linux/mm.h:720!
> >> [ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> >> [ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37
> >> [ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
> >> [ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0
> >> [ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
> >> [ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
> >> [ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
> >> [ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
> >> [ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
> >> [ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
> >> [ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
> >> [ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
> >> [ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
> >> [ 41.321387] Call Trace:
> >> [ 41.321819] <TASK>
> >> [ 41.322193] skb_release_data+0x13f/0x1c0
> >> [ 41.322902] __kfree_skb+0x20/0x30
> >> [ 41.343870] tcp_recvmsg_locked+0x671/0x880
> >> [ 41.363764] tcp_recvmsg+0x5e/0x1c0
> >> [ 41.384102] inet_recvmsg+0x42/0x100
> >> [ 41.406783] ? sock_recvmsg+0x1d/0x70
> >> [ 41.428201] sock_read_iter+0x84/0xd0
> >> [ 41.445592] ? 0xffffffffa3000000
> >> [ 41.462442] new_sync_read+0x148/0x160
> >> [ 41.479314] ? 0xffffffffa3000000
> >> [ 41.496937] vfs_read+0x138/0x190
> >> [ 41.517198] ksys_read+0x87/0xc0
> >> [ 41.535336] do_syscall_64+0x3b/0x90
> >> [ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae
> >> [ 41.568050] RIP: 0033:0x48765b
> >> [ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
> >> [ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
> >> [ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
> >> [ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
> >> [ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
> >> [ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
> >> [ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
> >> [ 41.744254] </TASK>
> >> [ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
> >>
> >> and
> >>
> >> [ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60
> >> [ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
> >> [ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
> >> [ 33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
> >> [ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
> >> [ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
> >> [ 33.532471] page dumped because: nonzero mapcount
> >> [ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
> >> [ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
> >> [ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
> >> [ 33.532484] Call Trace:
> >> [ 33.532496] <TASK>
> >> [ 33.532500] dump_stack_lvl+0x45/0x5a
> >> [ 33.532506] bad_page.cold+0x63/0x94
> >> [ 33.532510] free_pcp_prepare+0x290/0x420
> >> [ 33.532515] free_unref_page+0x1b/0x100
> >> [ 33.532518] skb_release_data+0x13f/0x1c0
> >> [ 33.532524] kfree_skb_reason+0x3e/0xc0
> >> [ 33.532527] ip6_mc_input+0x23c/0x2b0
> >> [ 33.532531] ip6_sublist_rcv_finish+0x83/0x90
> >> [ 33.532534] ip6_sublist_rcv+0x22b/0x2b0
> >>
> >> [3] XDP program to reproduce(xdp_pass.c):
> >> #include <linux/bpf.h>
> >> #include <bpf/bpf_helpers.h>
> >>
> >> SEC("xdp_pass")
> >> int xdp_pkt_pass(struct xdp_md *ctx)
> >> {
> >> bpf_xdp_adjust_head(ctx, -(int)32);
> >> return XDP_PASS;
> >> }
> >>
> >> char _license[] SEC("license") = "GPL";
> >>
> >> compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
> >> load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
> >>
> >> CC: stable@vger.kernel.org
> >> CC: Jason Wang <jasowang@redhat.com>
> >> CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> >> CC: Daniel Borkmann <daniel@iogearbox.net>
> >> CC: "Michael S. Tsirkin" <mst@redhat.com>
> >> CC: virtualization@lists.linux-foundation.org
> >> Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
> >> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
> >> ---
> >> v2: Recalculate headroom based on data, data_hard_start and data_meta
> >>
> >> drivers/net/virtio_net.c | 8 +++++++-
> >> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >> index 87838cbe38cf..a12338de7ef1 100644
> >> --- a/drivers/net/virtio_net.c
> >> +++ b/drivers/net/virtio_net.c
> >> @@ -1005,6 +1005,12 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >> * xdp.data_meta were adjusted
> >> */
> >> len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
> >> +
> >> + /* recalculate headroom if xdp.data or xdp.data_meta
> >> + * were adjusted
> >> + */
> >> + headroom = xdp.data - xdp.data_hard_start - metasize;
> >
> >
> > This is incorrect.
> >
> >
> > data = page_address(xdp_page) + offset;
> > xdp_init_buff(&xdp, frame_sz - vi->hdr_len, &rq->xdp_rxq);
> > xdp_prepare_buff(&xdp, data - VIRTIO_XDP_HEADROOM + vi->hdr_len,
> > VIRTIO_XDP_HEADROOM, len - vi->hdr_len, true);
> >
> > so: xdp.data_hard_start = page_address(xdp_page) + offset - VIRTIO_XDP_HEADROOM + vi->hdr_len
> >
> > (page_address(xdp_page) + offset) points to virtio-net hdr.
> > (page_address(xdp_page) + offset - VIRTIO_XDP_HEADROOM) points to the allocated buf.
> >
> > xdp.data_hard_start points to buf + vi->hdr_len
> >
> > Thanks.
> >
>
> xdp.data points to buf + vi->hdr_len + VIRTIO_XDP_HEADROOM, so we calculate
> xdp.data - xdp.data_hard_start, i.e. buf + vi->hdr_len + VIRTIO_XDP_HEADROOM - (buf + vi->hdr_len)
>
> You can see the headrooms from my tests above, they are correct and they match exactly
> the values from the headroom calculations that you suggested earlier.
OK, you are right: xdp.data and xdp.data_hard_start both have an offset of
hdr_len. I hope this can be explained in the comments, because the headroom we
want to get is virtio_hdr - buf, although the values here are equal.
In addition, if you are going to post v2, I think you should post a new thread
separately instead of replying in the previous thread.
Thanks.
>
> >
> >> +
> >> /* We can only create skb based on xdp_page. */
> >> if (unlikely(xdp_page != page)) {
> >> rcu_read_unlock();
> >> @@ -1012,7 +1018,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >> head_skb = page_to_skb(vi, rq, xdp_page, offset,
> >> len, PAGE_SIZE, false,
> >> metasize,
> >> - VIRTIO_XDP_HEADROOM);
> >> + headroom);
> >> return head_skb;
> >> }
> >> break;
> >> --
> >> 2.35.1
> >>
>
^ permalink raw reply
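The pointer arithmetic discussed in this thread can be checked with a small
user-space model. This is a sketch under stated assumptions, not driver code:
struct xdp_model and the model_*() helpers are hypothetical names, all
positions are plain byte offsets from the start of the page, the buffer is
assumed to start at the page start, and hdr_len is taken as 12 as in the
"hdr_len 12" debug line quoted above.

```c
#define VIRTIO_XDP_HEADROOM 256

/* Hypothetical model of the xdp_buff fields used in the discussion. */
struct xdp_model {
	long data_hard_start;	/* buf + hdr_len */
	long data;		/* buf + hdr_len + VIRTIO_XDP_HEADROOM initially */
	long metasize;		/* bytes of metadata placed in front of data */
};

/* State before the XDP program runs, for a buffer starting at offset buf. */
static struct xdp_model model_init(long buf, long hdr_len)
{
	struct xdp_model x;

	x.data_hard_start = buf + hdr_len;
	x.data = buf + hdr_len + VIRTIO_XDP_HEADROOM;
	x.metasize = 0;
	return x;
}

/* offset = xdp.data - page_address(xdp_page) - vi->hdr_len - metasize */
static long model_offset(const struct xdp_model *x, long hdr_len)
{
	return x->data - hdr_len - x->metasize;
}

/* headroom = xdp.data - xdp.data_hard_start - metasize (the v2 fix) */
static long model_headroom(const struct xdp_model *x)
{
	return x->data - x->data_hard_start - x->metasize;
}

/* page_to_skb(): p = page + offset; buf = p - headroom */
static long model_buf(long offset, long headroom)
{
	return offset - headroom;
}
```

With 4 bytes of metadata the model gives offset 252, so the fixed headroom of
256 puts buf at -4 (before the page start), while the recalculated headroom of
252 puts it back at 0. Moving data by +32 or -33 bytes gives 288 and 223 for
both offset and headroom, which agrees with cases a) to c) in the quoted logs.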
* Re: [RFC Patch net-next] net: dsa: ksz: added the generic port_stp_state_set function
From: Arun.Ramadoss @ 2022-04-24 11:21 UTC (permalink / raw)
To: olteanv
Cc: andrew, linux-kernel, UNGLinuxDriver, vivien.didelot, f.fainelli,
kuba, pabeni, netdev, Woojung.Huh, davem
In-Reply-To: <20220422170135.ctkibqs3lunbeo44@skbuf>
On Fri, 2022-04-22 at 20:01 +0300, Vladimir Oltean wrote:
Hi Vladimir,
> On Wed, Apr 20, 2022 at 12:56:47PM +0530, Arun Ramadoss wrote:
> > The ksz8795 and ksz9477 uses the same algorithm for the
> > port_stp_state_set function except the register address is
> > different. So
> > moved the algorithm to the ksz_common.c and used the dev_ops for
> > register read and write. This function can also used for the
> > lan937x
> > part. Hence making it generic for all the parts.
> >
> > Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
> > ---
>
> If the entire port STP state change procedure is the same, just a
> register offset is different, can you not create a common STP state
> procedure that takes the register offset as argument, and gets called
> with different offset arguments from ksz8795.c and from ksz9477.c?
Thanks for the comment.
I will update the patch by adding the register address as a function
argument and repost.
^ permalink raw reply
* Re: [PATCH net v2] virtio_net: fix wrong buf address calculation when using xdp
From: Nikolay Aleksandrov @ 2022-04-24 10:56 UTC (permalink / raw)
To: Xuan Zhuo
Cc: kuba, davem, stable, Jason Wang, Daniel Borkmann,
Michael S. Tsirkin, virtualization, netdev
In-Reply-To: <1650796959.4611728-1-xuanzhuo@linux.alibaba.com>
On 24/04/2022 13:42, Xuan Zhuo wrote:
> On Sun, 24 Apr 2022 13:21:21 +0300, Nikolay Aleksandrov <razor@blackwall.org> wrote:
>> We received a report[1] of kernel crashes when Cilium is used in XDP
>> mode with virtio_net after updating to newer kernels. After
>> investigating the reason it turned out that when using mergeable bufs
>> with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf()
>> calculates the build_skb address wrong because the offset can become less
>> than the headroom so it gets the address of the previous page (-X bytes
>> depending on how lower offset is):
>> page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
>>
>> This is a pr_err() I added in the beginning of page_to_skb which clearly
>> shows offset that is less than headroom by adding 4 bytes of metadata
>> via an xdp prog. The calculations done are:
>> receive_mergeable():
>> headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
>> offset = xdp.data - page_address(xdp_page) -
>> vi->hdr_len - metasize;
>>
>> page_to_skb():
>> p = page_address(page) + offset;
>> ...
>> buf = p - headroom;
>>
>> Now buf goes -4 bytes from the page's starting address as can be seen
>> above which is set as skb->head and skb->data by build_skb later. Depending
>> on what's done with the skb (when it's freed most often) we get all kinds
>> of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
>> the new headroom after the xdp program has run, similar to how offset
>> and len are recalculated. Headroom is directly related to
>> data_hard_start, data and data_meta, so we use them to get the new size.
>> The result is correct (similar pr_err() in page_to_skb, one case of
>> xdp_page and one case of virtnet buf):
>> a) Case with 4 bytes of metadata
>> [ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
>> [ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
>> b) Case of pushing data +32 bytes
>> [ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
>> [ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
>> c) Case of pushing data -33 bytes
>> [ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
>> [ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
>>
>> An example reproducer xdp prog[3] is below.
>>
>> [1] https://github.com/cilium/cilium/issues/19453
>>
>> [2] Two of the many traces:
>> [ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940
>> [ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7
>> [ 41.300891] kernel BUG at include/linux/mm.h:720!
>> [ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>> [ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37
>> [ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
>> [ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0
>> [ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
>> [ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
>> [ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
>> [ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
>> [ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
>> [ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
>> [ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
>> [ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
>> [ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
>> [ 41.321387] Call Trace:
>> [ 41.321819] <TASK>
>> [ 41.322193] skb_release_data+0x13f/0x1c0
>> [ 41.322902] __kfree_skb+0x20/0x30
>> [ 41.343870] tcp_recvmsg_locked+0x671/0x880
>> [ 41.363764] tcp_recvmsg+0x5e/0x1c0
>> [ 41.384102] inet_recvmsg+0x42/0x100
>> [ 41.406783] ? sock_recvmsg+0x1d/0x70
>> [ 41.428201] sock_read_iter+0x84/0xd0
>> [ 41.445592] ? 0xffffffffa3000000
>> [ 41.462442] new_sync_read+0x148/0x160
>> [ 41.479314] ? 0xffffffffa3000000
>> [ 41.496937] vfs_read+0x138/0x190
>> [ 41.517198] ksys_read+0x87/0xc0
>> [ 41.535336] do_syscall_64+0x3b/0x90
>> [ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [ 41.568050] RIP: 0033:0x48765b
>> [ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
>> [ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
>> [ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
>> [ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
>> [ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
>> [ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
>> [ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
>> [ 41.744254] </TASK>
>> [ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
>>
>> and
>>
>> [ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60
>> [ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
>> [ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
>> [ 33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
>> [ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
>> [ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
>> [ 33.532471] page dumped because: nonzero mapcount
>> [ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
>> [ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
>> [ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
>> [ 33.532484] Call Trace:
>> [ 33.532496] <TASK>
>> [ 33.532500] dump_stack_lvl+0x45/0x5a
>> [ 33.532506] bad_page.cold+0x63/0x94
>> [ 33.532510] free_pcp_prepare+0x290/0x420
>> [ 33.532515] free_unref_page+0x1b/0x100
>> [ 33.532518] skb_release_data+0x13f/0x1c0
>> [ 33.532524] kfree_skb_reason+0x3e/0xc0
>> [ 33.532527] ip6_mc_input+0x23c/0x2b0
>> [ 33.532531] ip6_sublist_rcv_finish+0x83/0x90
>> [ 33.532534] ip6_sublist_rcv+0x22b/0x2b0
>>
>> [3] XDP program to reproduce(xdp_pass.c):
>> #include <linux/bpf.h>
>> #include <bpf/bpf_helpers.h>
>>
>> SEC("xdp_pass")
>> int xdp_pkt_pass(struct xdp_md *ctx)
>> {
>> bpf_xdp_adjust_head(ctx, -(int)32);
>> return XDP_PASS;
>> }
>>
>> char _license[] SEC("license") = "GPL";
>>
>> compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
>> load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
>>
>> CC: stable@vger.kernel.org
>> CC: Jason Wang <jasowang@redhat.com>
>> CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>> CC: Daniel Borkmann <daniel@iogearbox.net>
>> CC: "Michael S. Tsirkin" <mst@redhat.com>
>> CC: virtualization@lists.linux-foundation.org
>> Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
>> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
>> ---
>> v2: Recalculate headroom based on data, data_hard_start and data_meta
>>
>> drivers/net/virtio_net.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 87838cbe38cf..a12338de7ef1 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1005,6 +1005,12 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>> * xdp.data_meta were adjusted
>> */
>> len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
>> +
>> + /* recalculate headroom if xdp.data or xdp.data_meta
>> + * were adjusted
>> + */
>> + headroom = xdp.data - xdp.data_hard_start - metasize;
>
>
> This is incorrect.
>
>
> data = page_address(xdp_page) + offset;
> xdp_init_buff(&xdp, frame_sz - vi->hdr_len, &rq->xdp_rxq);
> xdp_prepare_buff(&xdp, data - VIRTIO_XDP_HEADROOM + vi->hdr_len,
> VIRTIO_XDP_HEADROOM, len - vi->hdr_len, true);
>
> so: xdp.data_hard_start = page_address(xdp_page) + offset - VIRTIO_XDP_HEADROOM + vi->hdr_len
>
> (page_address(xdp_page) + offset) points to virtio-net hdr.
> (page_address(xdp_page) + offset - VIRTIO_XDP_HEADROOM) points to the allocated buf.
>
> xdp.data_hard_start points to buf + vi->hdr_len
>
> Thanks.
>
xdp.data points to buf + vi->hdr_len + VIRTIO_XDP_HEADROOM, so we calculate
xdp.data - xdp.data_hard_start, i.e. buf + vi->hdr_len + VIRTIO_XDP_HEADROOM - (buf + vi->hdr_len)
You can see the headrooms from my tests above, they are correct and they match exactly
the values from the headroom calculations that you suggested earlier.
>
>> +
>> /* We can only create skb based on xdp_page. */
>> if (unlikely(xdp_page != page)) {
>> rcu_read_unlock();
>> @@ -1012,7 +1018,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>> head_skb = page_to_skb(vi, rq, xdp_page, offset,
>> len, PAGE_SIZE, false,
>> metasize,
>> - VIRTIO_XDP_HEADROOM);
>> + headroom);
>> return head_skb;
>> }
>> break;
>> --
>> 2.35.1
>>
^ permalink raw reply
* Re: [PATCH net v2] virtio_net: fix wrong buf address calculation when using xdp
From: Xuan Zhuo @ 2022-04-24 10:42 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: kuba, davem, Nikolay Aleksandrov, stable, Jason Wang,
Daniel Borkmann, Michael S. Tsirkin, virtualization, netdev
In-Reply-To: <20220424102121.2686893-1-razor@blackwall.org>
On Sun, 24 Apr 2022 13:21:21 +0300, Nikolay Aleksandrov <razor@blackwall.org> wrote:
> We received a report[1] of kernel crashes when Cilium is used in XDP
> mode with virtio_net after updating to newer kernels. After
> investigating the reason it turned out that when using mergeable bufs
> with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf()
> calculates the build_skb address wrong because the offset can become less
> than the headroom so it gets the address of the previous page (-X bytes
> depending on how lower offset is):
> page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
>
> This is a pr_err() I added in the beginning of page_to_skb which clearly
> shows offset that is less than headroom by adding 4 bytes of metadata
> via an xdp prog. The calculations done are:
> receive_mergeable():
> headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
> offset = xdp.data - page_address(xdp_page) -
> vi->hdr_len - metasize;
>
> page_to_skb():
> p = page_address(page) + offset;
> ...
> buf = p - headroom;
>
> Now buf goes -4 bytes from the page's starting address as can be seen
> above which is set as skb->head and skb->data by build_skb later. Depending
> on what's done with the skb (when it's freed most often) we get all kinds
> of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
> the new headroom after the xdp program has run, similar to how offset
> and len are recalculated. Headroom is directly related to
> data_hard_start, data and data_meta, so we use them to get the new size.
> The result is correct (similar pr_err() in page_to_skb, one case of
> xdp_page and one case of virtnet buf):
> a) Case with 4 bytes of metadata
> [ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
> [ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
> b) Case of pushing data +32 bytes
> [ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
> [ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
> c) Case of pushing data -33 bytes
> [ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
> [ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
>
> An example reproducer xdp prog[3] is below.
>
> [1] https://github.com/cilium/cilium/issues/19453
>
> [2] Two of the many traces:
> [ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940
> [ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7
> [ 41.300891] kernel BUG at include/linux/mm.h:720!
> [ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37
> [ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
> [ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0
> [ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
> [ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
> [ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
> [ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
> [ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
> [ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
> [ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
> [ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
> [ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
> [ 41.321387] Call Trace:
> [ 41.321819] <TASK>
> [ 41.322193] skb_release_data+0x13f/0x1c0
> [ 41.322902] __kfree_skb+0x20/0x30
> [ 41.343870] tcp_recvmsg_locked+0x671/0x880
> [ 41.363764] tcp_recvmsg+0x5e/0x1c0
> [ 41.384102] inet_recvmsg+0x42/0x100
> [ 41.406783] ? sock_recvmsg+0x1d/0x70
> [ 41.428201] sock_read_iter+0x84/0xd0
> [ 41.445592] ? 0xffffffffa3000000
> [ 41.462442] new_sync_read+0x148/0x160
> [ 41.479314] ? 0xffffffffa3000000
> [ 41.496937] vfs_read+0x138/0x190
> [ 41.517198] ksys_read+0x87/0xc0
> [ 41.535336] do_syscall_64+0x3b/0x90
> [ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 41.568050] RIP: 0033:0x48765b
> [ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
> [ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
> [ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
> [ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
> [ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
> [ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
> [ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
> [ 41.744254] </TASK>
> [ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
>
> and
>
> [ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60
> [ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
> [ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
> [ 33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
> [ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
> [ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
> [ 33.532471] page dumped because: nonzero mapcount
> [ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
> [ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
> [ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
> [ 33.532484] Call Trace:
> [ 33.532496] <TASK>
> [ 33.532500] dump_stack_lvl+0x45/0x5a
> [ 33.532506] bad_page.cold+0x63/0x94
> [ 33.532510] free_pcp_prepare+0x290/0x420
> [ 33.532515] free_unref_page+0x1b/0x100
> [ 33.532518] skb_release_data+0x13f/0x1c0
> [ 33.532524] kfree_skb_reason+0x3e/0xc0
> [ 33.532527] ip6_mc_input+0x23c/0x2b0
> [ 33.532531] ip6_sublist_rcv_finish+0x83/0x90
> [ 33.532534] ip6_sublist_rcv+0x22b/0x2b0
>
> [3] XDP program to reproduce(xdp_pass.c):
> #include <linux/bpf.h>
> #include <bpf/bpf_helpers.h>
>
> SEC("xdp_pass")
> int xdp_pkt_pass(struct xdp_md *ctx)
> {
> bpf_xdp_adjust_head(ctx, -(int)32);
> return XDP_PASS;
> }
>
> char _license[] SEC("license") = "GPL";
>
> compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
> load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
>
> CC: stable@vger.kernel.org
> CC: Jason Wang <jasowang@redhat.com>
> CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> CC: Daniel Borkmann <daniel@iogearbox.net>
> CC: "Michael S. Tsirkin" <mst@redhat.com>
> CC: virtualization@lists.linux-foundation.org
> Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
> ---
> v2: Recalculate headroom based on data, data_hard_start and data_meta
>
> drivers/net/virtio_net.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 87838cbe38cf..a12338de7ef1 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1005,6 +1005,12 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> * xdp.data_meta were adjusted
> */
> len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
> +
> + /* recalculate headroom if xdp.data or xdp.data_meta
> + * were adjusted
> + */
> + headroom = xdp.data - xdp.data_hard_start - metasize;
This is incorrect.
data = page_address(xdp_page) + offset;
xdp_init_buff(&xdp, frame_sz - vi->hdr_len, &rq->xdp_rxq);
xdp_prepare_buff(&xdp, data - VIRTIO_XDP_HEADROOM + vi->hdr_len,
VIRTIO_XDP_HEADROOM, len - vi->hdr_len, true);
so: xdp.data_hard_start = page_address(xdp_page) + offset - VIRTIO_XDP_HEADROOM + vi->hdr_len
(page_address(xdp_page) + offset) points to virtio-net hdr.
(page_address(xdp_page) + offset - VIRTIO_XDP_HEADROOM) points to the allocated buf.
xdp.data_hard_start points to buf + vi->hdr_len
Thanks.
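The pointer relationships listed above can be written out as plain offset arithmetic. This is a hedged user-space sketch, not driver code: the struct and function names are invented for illustration, and every value is a byte offset from page_address(xdp_page). The only outside fact it relies on is that xdp_prepare_buff() sets data to hard_start + headroom.

```c
#define VIRTIO_XDP_HEADROOM 256	/* value stated in the commit message */

/* Hypothetical layout summary; all fields are offsets from page_address(xdp_page). */
struct buf_layout {
	long buf;		/* start of the allocated buf */
	long vnet_hdr;		/* start of the virtio-net header */
	long data_hard_start;	/* hard_start passed to xdp_prepare_buff() */
	long data;		/* hard_start + headroom, i.e. the packet data */
};

/* offset is the page-relative position of the virtio-net header. */
static struct buf_layout layout_from_offset(long offset, long hdr_len)
{
	struct buf_layout l;

	l.vnet_hdr = offset;
	l.buf = offset - VIRTIO_XDP_HEADROOM;
	l.data_hard_start = offset - VIRTIO_XDP_HEADROOM + hdr_len;
	l.data = l.data_hard_start + VIRTIO_XDP_HEADROOM;
	return l;
}
```

With offset = 256 and hdr_len = 12, buf is 0 and data_hard_start is 12, so any quantity derived from data_hard_start is measured from buf + hdr_len rather than from buf.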
> +
> /* We can only create skb based on xdp_page. */
> if (unlikely(xdp_page != page)) {
> rcu_read_unlock();
> @@ -1012,7 +1018,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> head_skb = page_to_skb(vi, rq, xdp_page, offset,
> len, PAGE_SIZE, false,
> metasize,
> - VIRTIO_XDP_HEADROOM);
> + headroom);
> return head_skb;
> }
> break;
> --
> 2.35.1
>
^ permalink raw reply
* Re: [PATCH v2] NFC: nfcmrvl: fix error check return value of irq_of_parse_and_map()
From: Krzysztof Kozlowski @ 2022-04-24 10:46 UTC (permalink / raw)
To: cgel.zte, kuba
Cc: cuissard, davem, linux-kernel, lv.ruyi, netdev, sameo, yashsri421,
Zeal Robot
In-Reply-To: <20220424025710.3166034-1-lv.ruyi@zte.com.cn>
On 24/04/2022 04:57, cgel.zte@gmail.com wrote:
> From: Lv Ruyi <lv.ruyi@zte.com.cn>
>
> The irq_of_parse_and_map() function returns 0 on failure, and does not
> return a negative value.
>
> Fixes: b5b3e23e4cac ("NFC: nfcmrvl: add i2c driver")
> Reported-by: Zeal Robot <zealci@zte.com.cn>
> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
> ---
> v2: don't print ret, and return -EINVAL rather than 0
> ---
> drivers/nfc/nfcmrvl/i2c.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
How about Jakub's idea of squashing the fix for SPI (correcting my
patch) in here, with an additional Fixes tag?
Fixes: caf6e49bf6d0 ("NFC: nfcmrvl: add spi driver")
Best regards,
Krzysztof
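To illustrate why the error check matters, here is a minimal user-space sketch. It assumes the old code tested the result for a negative value, as the commit message implies; fake_irq_map() is a made-up stand-in for irq_of_parse_and_map() and the probe functions are hypothetical, not the driver's code.

```c
#include <errno.h>

/* Hypothetical stand-in for irq_of_parse_and_map(): returns 0 on failure. */
static unsigned int fake_irq_map(int ok)
{
	return ok ? 42u : 0u;
}

/* Assumed old shape: a negative check that a zero return can never trip. */
static int probe_old(int ok)
{
	int irq = (int)fake_irq_map(ok);

	if (irq < 0)
		return irq;
	return 0;	/* failure goes unnoticed */
}

/* Shape described in the v2 note: treat 0 as failure and return -EINVAL. */
static int probe_new(int ok)
{
	unsigned int irq = fake_irq_map(ok);

	if (!irq)
		return -EINVAL;
	return 0;
}
```

When the mapping fails, probe_old() still reports success, while probe_new() returns -EINVAL.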
^ permalink raw reply
* Re: [PATCH] FDDI: defxx: simplify if-if to if-else
From: Maciej W. Rozycki @ 2022-04-24 10:39 UTC (permalink / raw)
To: Wan Jiabing
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, netdev,
linux-kernel, kael_w
In-Reply-To: <20220424092842.101307-1-wanjiabing@vivo.com>
On Sun, 24 Apr 2022, Wan Jiabing wrote:
> diff --git a/drivers/net/fddi/defxx.c b/drivers/net/fddi/defxx.c
> index b584ffe38ad6..3edb2e96f763 100644
> --- a/drivers/net/fddi/defxx.c
> +++ b/drivers/net/fddi/defxx.c
> @@ -585,10 +585,10 @@ static int dfx_register(struct device *bdev)
> bp->mmio = false;
> dfx_get_bars(bp, bar_start, bar_len);
> }
> - }
> - if (!dfx_use_mmio)
> + } else {
> region = request_region(bar_start[0], bar_len[0],
> bdev->driver->name);
> + }
NAK. The first conditional optionally sets `bp->mmio = false', which
changes the value of `dfx_use_mmio' in some configurations:
#if defined(CONFIG_EISA) || defined(CONFIG_PCI)
#define dfx_use_mmio bp->mmio
#else
#define dfx_use_mmio true
#endif
Maciej
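The difference between the two forms can be shown in a small user-space sketch. The struct, the function names and the mmio_unusable flag are hypothetical; only the macro definition is taken from the quoted code, for the CONFIG_EISA || CONFIG_PCI case.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private data. */
struct fake_priv {
	bool mmio;
};

/* As quoted above: the macro re-reads bp->mmio each time it is used. */
#define dfx_use_mmio bp->mmio

/* Original if-if shape; returns 1 if the port I/O path runs. */
static int register_if_if(struct fake_priv *bp, bool mmio_unusable)
{
	int used_pio = 0;

	if (dfx_use_mmio) {
		if (mmio_unusable)
			bp->mmio = false;	/* fall back to port I/O */
	}
	if (!dfx_use_mmio)	/* sees the updated bp->mmio */
		used_pio = 1;
	return used_pio;
}

/* Proposed if-else shape; the fallback case skips the port I/O path. */
static int register_if_else(struct fake_priv *bp, bool mmio_unusable)
{
	int used_pio = 0;

	if (dfx_use_mmio) {
		if (mmio_unusable)
			bp->mmio = false;
	} else {
		used_pio = 1;
	}
	return used_pio;
}
```

Starting with bp->mmio set and the fallback taken, the if-if form reaches the port I/O path and the if-else form does not, so the two are not equivalent.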
^ permalink raw reply
* [PATCH net v2] virtio_net: fix wrong buf address calculation when using xdp
From: Nikolay Aleksandrov @ 2022-04-24 10:21 UTC (permalink / raw)
To: netdev
Cc: kuba, davem, Nikolay Aleksandrov, stable, Jason Wang, Xuan Zhuo,
Daniel Borkmann, Michael S. Tsirkin, virtualization
In-Reply-To: <c7e49737-c5f8-5164-88ad-599c828c5d23@blackwall.org>
We received a report[1] of kernel crashes when Cilium is used in XDP
mode with virtio_net after updating to newer kernels. After
investigating the reason it turned out that when using mergeable bufs
with an XDP program which adjusts xdp.data or xdp.data_meta, page_to_skb()
calculates the build_skb address incorrectly because the offset can become
less than the headroom, so it gets an address in the previous page (-X bytes,
depending on how much lower the offset is):
page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
This is a pr_err() I added in the beginning of page_to_skb which clearly
shows offset that is less than headroom by adding 4 bytes of metadata
via an xdp prog. The calculations done are:
receive_mergeable():
headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
offset = xdp.data - page_address(xdp_page) -
vi->hdr_len - metasize;
page_to_skb():
p = page_address(page) + offset;
...
buf = p - headroom;
Now buf goes -4 bytes from the page's starting address as can be seen
above which is set as skb->head and skb->data by build_skb later. Depending
on what's done with the skb (when it's freed most often) we get all kinds
of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
the new headroom after the xdp program has run, similar to how offset
and len are recalculated. Headroom is directly related to
data_hard_start, data and data_meta, so we use them to get the new size.
The result is correct (similar pr_err() in page_to_skb, one case of
xdp_page and one case of virtnet buf):
a) Case with 4 bytes of metadata
[ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
[ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
b) Case of pushing data +32 bytes
[ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
[ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
c) Case of pushing data -33 bytes
[ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
[ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
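The numbers in the log lines above can be replayed in user space. The following is a minimal sketch, not driver code: it models addresses as byte offsets from the start of the page, takes hdr_len = 12 as an assumption (the value printed in one of the traces), and uses made-up struct and helper names.

```c
#define VIRTIO_XDP_HEADROOM 256	/* value stated in the commit message */

/* Hypothetical model: every pointer is a byte offset from page_address(page). */
struct xdp_model {
	long data_hard_start;	/* buf start + vnet hdr size */
	long data;		/* start of packet data */
	long metasize;		/* bytes of metadata in front of data */
	long hdr_len;		/* size of the virtio-net header */
};

/* State before the XDP program runs, for a buf starting at buf_off. */
static struct xdp_model xdp_model_init(long buf_off, long hdr_len)
{
	struct xdp_model x;

	x.data_hard_start = buf_off + hdr_len;
	x.data = x.data_hard_start + VIRTIO_XDP_HEADROOM;
	x.metasize = 0;
	x.hdr_len = hdr_len;
	return x;
}

/* offset as recalculated in receive_mergeable() after the program ran */
static long model_offset(const struct xdp_model *x)
{
	return x->data - x->hdr_len - x->metasize;
}

/* headroom as recalculated by this patch */
static long model_headroom(const struct xdp_model *x)
{
	return x->data - x->data_hard_start - x->metasize;
}

/* buf as computed in page_to_skb(): p - headroom, with p = page + offset */
static long model_buf(long offset, long headroom)
{
	return offset - headroom;
}
```

With 4 bytes of metadata and a buf at the page start, model_offset() gives 252; subtracting the fixed VIRTIO_XDP_HEADROOM puts buf at -4, while subtracting the recalculated headroom of 252 puts it back at 0.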
An example reproducer xdp prog[3] is below.
[1] https://github.com/cilium/cilium/issues/19453
[2] Two of the many traces:
[ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940
[ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7
[ 41.300891] kernel BUG at include/linux/mm.h:720!
[ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37
[ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0
[ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
[ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
[ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
[ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
[ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
[ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
[ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
[ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
[ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
[ 41.321387] Call Trace:
[ 41.321819] <TASK>
[ 41.322193] skb_release_data+0x13f/0x1c0
[ 41.322902] __kfree_skb+0x20/0x30
[ 41.343870] tcp_recvmsg_locked+0x671/0x880
[ 41.363764] tcp_recvmsg+0x5e/0x1c0
[ 41.384102] inet_recvmsg+0x42/0x100
[ 41.406783] ? sock_recvmsg+0x1d/0x70
[ 41.428201] sock_read_iter+0x84/0xd0
[ 41.445592] ? 0xffffffffa3000000
[ 41.462442] new_sync_read+0x148/0x160
[ 41.479314] ? 0xffffffffa3000000
[ 41.496937] vfs_read+0x138/0x190
[ 41.517198] ksys_read+0x87/0xc0
[ 41.535336] do_syscall_64+0x3b/0x90
[ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 41.568050] RIP: 0033:0x48765b
[ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
[ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
[ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
[ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
[ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
[ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
[ 41.744254] </TASK>
[ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
and
[ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60
[ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
[ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
[ 33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
[ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
[ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
[ 33.532471] page dumped because: nonzero mapcount
[ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
[ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
[ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 33.532484] Call Trace:
[ 33.532496] <TASK>
[ 33.532500] dump_stack_lvl+0x45/0x5a
[ 33.532506] bad_page.cold+0x63/0x94
[ 33.532510] free_pcp_prepare+0x290/0x420
[ 33.532515] free_unref_page+0x1b/0x100
[ 33.532518] skb_release_data+0x13f/0x1c0
[ 33.532524] kfree_skb_reason+0x3e/0xc0
[ 33.532527] ip6_mc_input+0x23c/0x2b0
[ 33.532531] ip6_sublist_rcv_finish+0x83/0x90
[ 33.532534] ip6_sublist_rcv+0x22b/0x2b0
[3] XDP program to reproduce(xdp_pass.c):
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
SEC("xdp_pass")
int xdp_pkt_pass(struct xdp_md *ctx)
{
bpf_xdp_adjust_head(ctx, -(int)32);
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";
compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
CC: stable@vger.kernel.org
CC: Jason Wang <jasowang@redhat.com>
CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: virtualization@lists.linux-foundation.org
Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
---
v2: Recalculate headroom based on data, data_hard_start and data_meta
drivers/net/virtio_net.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 87838cbe38cf..a12338de7ef1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1005,6 +1005,12 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
* xdp.data_meta were adjusted
*/
len = xdp.data_end - xdp.data + vi->hdr_len + metasize;
+
+ /* recalculate headroom if xdp.data or xdp.data_meta
+ * were adjusted
+ */
+ headroom = xdp.data - xdp.data_hard_start - metasize;
+
/* We can only create skb based on xdp_page. */
if (unlikely(xdp_page != page)) {
rcu_read_unlock();
@@ -1012,7 +1018,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
head_skb = page_to_skb(vi, rq, xdp_page, offset,
len, PAGE_SIZE, false,
metasize,
- VIRTIO_XDP_HEADROOM);
+ headroom);
return head_skb;
}
break;
--
2.35.1
^ permalink raw reply related