* Re: [PATCH net-next 06/11] net: hns3: modify firmware version display format
From: tanhuazhong @ 2019-07-26 1:50 UTC (permalink / raw)
To: Saeed Mahameed, davem@davemloft.net
Cc: lipeng321@huawei.com, yisen.zhuang@huawei.com,
salil.mehta@huawei.com, linuxarm@huawei.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
moyufeng@huawei.com
In-Reply-To: <d6a32434af7e9c883f104ae66e62b7b376abb39c.camel@mellanox.com>
On 2019/7/26 5:32, Saeed Mahameed wrote:
> On Thu, 2019-07-25 at 10:34 +0800, tanhuazhong wrote:
>>
>> On 2019/7/25 2:34, Saeed Mahameed wrote:
>>> On Wed, 2019-07-24 at 11:18 +0800, Huazhong Tan wrote:
>>>> From: Yufeng Mo <moyufeng@huawei.com>
>>>>
>>>> This patch modifies firmware version display format in
>>>> hclge(vf)_cmd_init() and hns3_get_drvinfo(). Also, adds
>>>> some optimizations for firmware version display format.
>>>>
>>>> Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
>>>> Signed-off-by: Peng Li <lipeng321@huawei.com>
>>>> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
>>>> ---
>>>> drivers/net/ethernet/hisilicon/hns3/hnae3.h | 9
>>>> +++++++++
>>>> drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 15
>>>> +++++++++++++--
>>>> drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 10
>>>> +++++++++-
>>>> drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c | 11
>>>> +++++++++--
>>>> 4 files changed, 40 insertions(+), 5 deletions(-)
>>>>
>>>>
>
> [...]
>
>>>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
>>>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
>>>> @@ -419,7 +419,15 @@ int hclge_cmd_init(struct hclge_dev *hdev)
>>>> }
>>>> hdev->fw_version = version;
>>>>
>>>> - dev_info(&hdev->pdev->dev, "The firmware version is %08x\n",
>>>> version);
>>>> + pr_info_once("The firmware version is %lu.%lu.%lu.%lu\n",
>>>> + hnae3_get_field(version,
>>>> HNAE3_FW_VERSION_BYTE3_MASK,
>>>> + HNAE3_FW_VERSION_BYTE3_SHIFT),
>>>> + hnae3_get_field(version,
>>>> HNAE3_FW_VERSION_BYTE2_MASK,
>>>> + HNAE3_FW_VERSION_BYTE2_SHIFT),
>>>> + hnae3_get_field(version,
>>>> HNAE3_FW_VERSION_BYTE1_MASK,
>>>> + HNAE3_FW_VERSION_BYTE1_SHIFT),
>>>> + hnae3_get_field(version,
>>>> HNAE3_FW_VERSION_BYTE0_MASK,
>>>> + HNAE3_FW_VERSION_BYTE0_SHIFT));
>>>>
>>>
>>> Device name/string will not be printed now, what happens if i have
>>> multiple devices ? at least print the device name as it was before
>>>
>> Since on each board we only have one firmware, the firmware
>> version is same per device, and will not change when running.
>> So pr_info_once() looks good for this case.
>>
>
> boards change too often to have such static assumption.
Ok, I will use dev_info instead of pr_info here.
>
>> BTW, maybe we should change below print in the end of
>> hclge_init_ae_dev(), use dev_info() instead of pr_info(),
>> then we can know that which device has already initialized.
>> I will send other patch to do that, is it acceptable for you?
>>
>> "pr_info("%s driver initialization finished.\n", HCLGE_DRIVER_NAME);"
>>
>
> I would avoid using pr_info when i can ! if you have the option to
> print with dev information as it was before that is preferable.
>
> Thanks,
> Saeed.
>
Thanks,
Huazhong.
^ permalink raw reply
* Re: [PATCH net-next 07/11] net: hns3: adds debug messages to identify eth down cause
From: liuyonglong @ 2019-07-26 2:21 UTC (permalink / raw)
To: Saeed Mahameed, tanhuazhong@huawei.com, davem@davemloft.net
Cc: lipeng321@huawei.com, yisen.zhuang@huawei.com,
salil.mehta@huawei.com, linuxarm@huawei.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <75a02bbe5b3b0f2755cd901a8830d4a3026f9383.camel@mellanox.com>
We will change all of them to netif_msg_drv() which is default off
Thanks for your reply!
On 2019/7/26 5:59, Saeed Mahameed wrote:
> On Thu, 2019-07-25 at 20:28 +0800, liuyonglong wrote:
>>
>> On 2019/7/25 3:12, Saeed Mahameed wrote:
>>> On Wed, 2019-07-24 at 11:18 +0800, Huazhong Tan wrote:
>>>> From: Yonglong Liu <liuyonglong@huawei.com>
>>>>
>>>> Some times just see the eth interface have been down/up via
>>>> dmesg, but can not know why the eth down. So adds some debug
>>>> messages to identify the cause for this.
>>>>
>>>
>>> I really don't like this. your default msg lvl has NETIF_MSG_IFDOWN
>>> turned on .. dumping every single operation that happens on your
>>> device
>>> by default to kernel log is too much !
>>>
>>> We should really consider using trace buffers with well defined
>>> structures for vendor specific events. so we can use bpf filters
>>> and
>>> state of the art tools for netdev debugging.
>>>
>>
>> We do this because we can just see a link down message in dmesg, and
>> had
>> take a long time to found the cause of link down, just because
>> another
>> user changed the settings.
>>
>> We can change the net_open/net_stop/dcbnl_ops to msg_drv (not default
>> turned on), and want to keep the others default print to kernel log,
>> is it acceptable?
>>
>
> acceptable as long as debug information are kept off by default and
> your driver doens't spam the kernel log.
>
> you should use dynamic debug [1] and/or "off by default" msg lvls for
> debugging information..
>
> I couldn't find any rules regarding what to put in kernel log, Maybe
> someone can share ?. but i vaguely remember that the recommendation
> for device drivers is to put nothing, only error/warning messages.
>
> [1]
> https://www.kernel.org/doc/html/v4.15/admin-guide/dynamic-debug-howto.html
>
>>>> @@ -1593,6 +1603,11 @@ static int hns3_ndo_set_vf_vlan(struct
>>>> net_device *netdev, int vf, u16 vlan,
>>>> struct hnae3_handle *h = hns3_get_handle(netdev);
>>>> int ret = -EIO;
>>>>
>>>> + if (netif_msg_ifdown(h))
>>>
>>> why msg_ifdown ? looks like netif_msg_drv is more appropriate, for
>>> many
>>> of the cases in this patch.
>>>
>>
>> This operation may cause link down, so we use msg_ifdown.
>>
>
> ifdown isn't link down..
>
> to be honest, I couldn't find any documentation explaining how/when to
> use msg lvls, (i didn't look too deep though), by looking at other
> drivers, my interpretations is:
>
> ifdup (open/boot up flow)
> ifdwon (close/teardown flow)
> drv (driver based or dynamic flows)
> etc ..
>
> -Saeed.
>
^ permalink raw reply
* [PATCH 2/2] net: ipv4: Fix a possible null-pointer dereference in fib4_rule_suppress()
From: Jia-Ju Bai @ 2019-07-26 2:25 UTC (permalink / raw)
To: davem, kuznet, yoshfuji; +Cc: netdev, linux-kernel, Jia-Ju Bai
In fib4_rule_suppress(), there is an if statement on line 145 to check
whether result->fi is NULL:
if (result->fi)
When result->fi is NULL, it is used on line 167:
fib_info_put(result->fi);
In fib_info_put(), the argument fi is used:
if (refcount_dec_and_test(&fi->fib_clntref))
Thus, a possible null-pointer dereference may occur.
To fix this bug, result->fi is checked before calling fib_info_put().
This bug is found by a static analysis tool STCheck written by us.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
---
net/ipv4/fib_rules.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index b43a7ba5c6a4..daedce293aab 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -163,7 +163,7 @@ static bool fib4_rule_suppress(struct fib_rule *rule, struct fib_lookup_arg *arg
return false;
suppress_route:
- if (!(arg->flags & FIB_LOOKUP_NOREF))
+ if (!(arg->flags & FIB_LOOKUP_NOREF) && result->fi)
fib_info_put(result->fi);
return true;
}
--
2.17.0
^ permalink raw reply related
* [PATCH 1/2] net: ipv4: Fix a possible null-pointer dereference in inet_csk_rebuild_route()
From: Jia-Ju Bai @ 2019-07-26 2:25 UTC (permalink / raw)
To: davem, kuznet, yoshfuji; +Cc: netdev, linux-kernel, Jia-Ju Bai
In inet_csk_rebuild_route(), rt is assigned to NULL on line 1071.
On line 1076, rt is used:
return &rt->dst;
Thus, a possible null-pointer dereference may occur.
To fix this bug, rt is checked before being used.
This bug is found by a static analysis tool STCheck written by us.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
---
net/ipv4/inet_connection_sock.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index f5c163d4771b..27d9d80f3401 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1073,7 +1073,10 @@ static struct dst_entry *inet_csk_rebuild_route(struct sock *sk, struct flowi *f
sk_setup_caps(sk, &rt->dst);
rcu_read_unlock();
- return &rt->dst;
+ if (rt)
+ return &rt->dst;
+ else
+ return NULL;
}
struct dst_entry *inet_csk_update_pmtu(struct sock *sk, u32 mtu)
--
2.17.0
^ permalink raw reply related
* Re: [PATCH net-next 07/11] net: hns3: adds debug messages to identify eth down cause
From: liuyonglong @ 2019-07-26 2:33 UTC (permalink / raw)
To: Jakub Kicinski, Saeed Mahameed
Cc: tanhuazhong@huawei.com, davem@davemloft.net, lipeng321@huawei.com,
yisen.zhuang@huawei.com, salil.mehta@huawei.com,
linuxarm@huawei.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20190725182846.253ae93f@cakuba.netronome.com>
As Saeed said, we will use netif_msg_drv() which is default off, this
can be easily open with ethtool.
Thanks for your reply!
On 2019/7/26 9:28, Jakub Kicinski wrote:
> On Thu, 25 Jul 2019 21:59:08 +0000, Saeed Mahameed wrote:
>> I couldn't find any rules regarding what to put in kernel log, Maybe
>> someone can share ?. but i vaguely remember that the recommendation
>> for device drivers is to put nothing, only error/warning messages.
>
> FWIW my understanding is also that only error/warning messages should
> be printed. IOW things which should "never happen".
>
> There are some historical exceptions. Probe logs for instance may be
> useful, because its not trivial to get to the device if probe fails.
>
> Another one is ethtool flashing, if it takes time we used to print into
> logs some message like "please wait patiently". But since Jiri added
> the progress messages in devlink that's no longer necessary.
>
> For the messages which are basically printing the name of the function
> or name of the function and their args - we have ftrace.
>
> That's my $0.02 :)
>
> .
>
^ permalink raw reply
* Re: memory leak in dma_buf_ioctl
From: syzbot @ 2019-07-26 2:34 UTC (permalink / raw)
To: bsingharora, coreteam, davem, dri-devel, duwe, dvyukov, kaber,
kadlec, linaro-mm-sig, linux-kernel, linux-media, mingo, mpe,
netdev, netfilter-devel, pablo, rostedt, sumit.semwal,
syzkaller-bugs
In-Reply-To: <000000000000b68e04058e6a3421@google.com>
syzbot has bisected this bug to:
commit 04cf31a759ef575f750a63777cee95500e410994
Author: Michael Ellerman <mpe@ellerman.id.au>
Date: Thu Mar 24 11:04:01 2016 +0000
ftrace: Make ftrace_location_range() global
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=154293f4600000
start commit: abdfd52a Merge tag 'armsoc-defconfig' of git://git.kernel...
git tree: upstream
final crash: https://syzkaller.appspot.com/x/report.txt?x=174293f4600000
console output: https://syzkaller.appspot.com/x/log.txt?x=134293f4600000
kernel config: https://syzkaller.appspot.com/x/.config?x=d31de3d88059b7fa
dashboard link: https://syzkaller.appspot.com/bug?extid=b2098bc44728a4efb3e9
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12526e58600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=161784f0600000
Reported-by: syzbot+b2098bc44728a4efb3e9@syzkaller.appspotmail.com
Fixes: 04cf31a759ef ("ftrace: Make ftrace_location_range() global")
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
^ permalink raw reply
* Re: memory leak in dma_buf_ioctl
From: Steven Rostedt @ 2019-07-26 2:46 UTC (permalink / raw)
To: syzbot
Cc: bsingharora, coreteam, davem, dri-devel, duwe, dvyukov, kaber,
kadlec, linaro-mm-sig, linux-kernel, linux-media, mingo, mpe,
netdev, netfilter-devel, pablo, sumit.semwal, syzkaller-bugs
In-Reply-To: <00000000000005dbbc058e8c608d@google.com>
On Thu, 25 Jul 2019 19:34:01 -0700
syzbot <syzbot+b2098bc44728a4efb3e9@syzkaller.appspotmail.com> wrote:
> syzbot has bisected this bug to:
>
> commit 04cf31a759ef575f750a63777cee95500e410994
> Author: Michael Ellerman <mpe@ellerman.id.au>
> Date: Thu Mar 24 11:04:01 2016 +0000
>
> ftrace: Make ftrace_location_range() global
It's sad that I have yet to find a single syzbot bisect useful. Really?
setting a function from static to global will cause a memory leak in a
completely unrelated area of the kernel?
I'm about to set these to my /dev/null folder.
-- Steve
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=154293f4600000
> start commit: abdfd52a Merge tag 'armsoc-defconfig' of git://git.kernel...
> git tree: upstream
> final crash: https://syzkaller.appspot.com/x/report.txt?x=174293f4600000
> console output: https://syzkaller.appspot.com/x/log.txt?x=134293f4600000
> kernel config: https://syzkaller.appspot.com/x/.config?x=d31de3d88059b7fa
> dashboard link: https://syzkaller.appspot.com/bug?extid=b2098bc44728a4efb3e9
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12526e58600000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=161784f0600000
>
> Reported-by: syzbot+b2098bc44728a4efb3e9@syzkaller.appspotmail.com
> Fixes: 04cf31a759ef ("ftrace: Make ftrace_location_range() global")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
^ permalink raw reply
* [PATCH V2 net-next 04/11] net: hns3: fix mis-counting IRQ vector numbers issue
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Yonglong Liu, Peng Li, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Yonglong Liu <liuyonglong@huawei.com>
The num_msi_left means the vector numbers of NIC, but if the
PF supported RoCE, it contains the vector numbers of NIC and
RoCE(Not expected).
This may cause interrupts lost in some case, because of the
NIC module used the vector resources which belongs to RoCE.
This patch corrects the value of num_msi_left to be equals to
the vector numbers of NIC, and adjust the default tqp numbers
according to the value of num_msi_left. Also, adds a little
cleanup about checking whether the device is supporting RoCE
before allocating IRQ vector.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 2 ++
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 12 ++++++++++--
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 3c64d70..9329aab 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -1476,6 +1476,8 @@ static int hclge_vport_setup(struct hclge_vport *vport, u16 num_tqps)
nic->ae_algo = &ae_algo;
nic->numa_node_mask = hdev->numa_node_mask;
+ num_tqps = min_t(u16, hdev->roce_base_msix_offset - 1, num_tqps);
+
ret = hclge_knic_setup(vport, num_tqps,
hdev->num_tx_desc, hdev->num_rx_desc);
if (ret)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index a13a0e1..db84782 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -287,6 +287,14 @@ static int hclgevf_get_queue_info(struct hclgevf_dev *hdev)
memcpy(&hdev->rss_size_max, &resp_msg[2], sizeof(u16));
memcpy(&hdev->rx_buf_len, &resp_msg[4], sizeof(u16));
+ /* if irq is not enough, let tqps have the same value of irqs,
+ * to make sure one irq just bind to one tqp, this can improve
+ * the performance
+ */
+ hdev->num_tqps = min_t(u16, hdev->roce_base_msix_offset - 1,
+ hdev->num_tqps);
+ hdev->rss_size_max = min_t(u16, hdev->rss_size_max, hdev->num_tqps);
+
return 0;
}
@@ -2208,7 +2216,7 @@ static int hclgevf_init_msi(struct hclgevf_dev *hdev)
int vectors;
int i;
- if (hnae3_get_bit(hdev->ae_dev->flag, HNAE3_DEV_SUPPORT_ROCE_B))
+ if (hnae3_dev_roce_supported(hdev))
vectors = pci_alloc_irq_vectors(pdev,
hdev->roce_base_msix_offset + 1,
hdev->num_msi,
@@ -2495,7 +2503,7 @@ static int hclgevf_query_vf_resource(struct hclgevf_dev *hdev)
req = (struct hclgevf_query_res_cmd *)desc.data;
- if (hnae3_get_bit(hdev->ae_dev->flag, HNAE3_DEV_SUPPORT_ROCE_B)) {
+ if (hnae3_dev_roce_supported(hdev)) {
hdev->roce_base_msix_offset =
hnae3_get_field(__le16_to_cpu(req->msixcap_localid_ba_rocee),
HCLGEVF_MSIX_OFT_ROCEE_M,
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 07/11] net: hns3: adds debug messages to identify eth down cause
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Yonglong Liu, Peng Li, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Yonglong Liu <liuyonglong@huawei.com>
Some times just see the eth interface have been down/up via
dmesg, but can not know why the eth down. So adds some debug
messages to identify the cause for this.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 24 ++++++++++++++++++++
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 26 ++++++++++++++++++++++
.../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 14 ++++++++++++
3 files changed, 64 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 4d58c53..2e30cfa 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -459,6 +459,10 @@ static int hns3_nic_net_open(struct net_device *netdev)
h->ae_algo->ops->set_timer_task(priv->ae_handle, true);
hns3_config_xps(priv);
+
+ if (netif_msg_drv(h))
+ netdev_info(netdev, "net open\n");
+
return 0;
}
@@ -519,6 +523,9 @@ static int hns3_nic_net_stop(struct net_device *netdev)
if (test_and_set_bit(HNS3_NIC_STATE_DOWN, &priv->state))
return 0;
+ if (netif_msg_drv(h))
+ netdev_info(netdev, "net stop\n");
+
if (h->ae_algo->ops->set_timer_task)
h->ae_algo->ops->set_timer_task(priv->ae_handle, false);
@@ -1550,6 +1557,9 @@ static int hns3_setup_tc(struct net_device *netdev, void *type_data)
h = hns3_get_handle(netdev);
kinfo = &h->kinfo;
+ if (netif_msg_drv(h))
+ netdev_info(netdev, "setup tc: num_tc=%d\n", tc);
+
return (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ?
kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP;
}
@@ -1593,6 +1603,11 @@ static int hns3_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
struct hnae3_handle *h = hns3_get_handle(netdev);
int ret = -EIO;
+ if (netif_msg_drv(h))
+ netdev_info(netdev,
+ "set vf vlan: vf=%d, vlan=%d, qos=%d, vlan_proto=%d\n",
+ vf, vlan, qos, vlan_proto);
+
if (h->ae_algo->ops->set_vf_vlan_filter)
ret = h->ae_algo->ops->set_vf_vlan_filter(h, vf, vlan,
qos, vlan_proto);
@@ -1611,6 +1626,10 @@ static int hns3_nic_change_mtu(struct net_device *netdev, int new_mtu)
if (!h->ae_algo->ops->set_mtu)
return -EOPNOTSUPP;
+ if (netif_msg_drv(h))
+ netdev_info(netdev, "change mtu from %d to %d\n",
+ netdev->mtu, new_mtu);
+
ret = h->ae_algo->ops->set_mtu(h, new_mtu);
if (ret)
netdev_err(netdev, "failed to change MTU in hardware %d\n",
@@ -4395,6 +4414,11 @@ int hns3_set_channels(struct net_device *netdev,
if (kinfo->rss_size == new_tqp_num)
return 0;
+ if (netif_msg_drv(h))
+ netdev_info(netdev,
+ "set channels: tqp_num=%d, rxfh=%d\n",
+ new_tqp_num, rxfh_configured);
+
ret = hns3_reset_notify(h, HNAE3_DOWN_CLIENT);
if (ret)
return ret;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index e71c92b..08334d7 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -311,6 +311,9 @@ static void hns3_self_test(struct net_device *ndev,
if (eth_test->flags != ETH_TEST_FL_OFFLINE)
return;
+ if (netif_msg_drv(h))
+ netdev_info(ndev, "self test start\n");
+
st_param[HNAE3_LOOP_APP][0] = HNAE3_LOOP_APP;
st_param[HNAE3_LOOP_APP][1] =
h->flags & HNAE3_SUPPORT_APP_LOOPBACK;
@@ -374,6 +377,9 @@ static void hns3_self_test(struct net_device *ndev,
if (if_running)
ndev->netdev_ops->ndo_open(ndev);
+
+ if (netif_msg_drv(h))
+ netdev_info(ndev, "self test end\n");
}
static int hns3_get_sset_count(struct net_device *netdev, int stringset)
@@ -604,6 +610,11 @@ static int hns3_set_pauseparam(struct net_device *netdev,
{
struct hnae3_handle *h = hns3_get_handle(netdev);
+ if (netif_msg_drv(h))
+ netdev_info(netdev,
+ "set pauseparam: autoneg=%d, rx:%d, tx:%d\n",
+ param->autoneg, param->rx_pause, param->tx_pause);
+
if (h->ae_algo->ops->set_pauseparam)
return h->ae_algo->ops->set_pauseparam(h, param->autoneg,
param->rx_pause,
@@ -743,6 +754,13 @@ static int hns3_set_link_ksettings(struct net_device *netdev,
if (cmd->base.speed == SPEED_1000 && cmd->base.duplex == DUPLEX_HALF)
return -EINVAL;
+ if (netif_msg_drv(handle))
+ netdev_info(netdev,
+ "set link(%s): autoneg=%d, speed=%d, duplex=%d\n",
+ netdev->phydev ? "phy" : "mac",
+ cmd->base.autoneg, cmd->base.speed,
+ cmd->base.duplex);
+
/* Only support ksettings_set for netdev with phy attached for now */
if (netdev->phydev)
return phy_ethtool_ksettings_set(netdev->phydev, cmd);
@@ -984,6 +1002,10 @@ static int hns3_nway_reset(struct net_device *netdev)
return -EINVAL;
}
+ if (netif_msg_drv(handle))
+ netdev_info(netdev, "nway reset (using %s)\n",
+ phy ? "phy" : "mac");
+
if (phy)
return genphy_restart_aneg(phy);
@@ -1308,6 +1330,10 @@ static int hns3_set_fecparam(struct net_device *netdev,
if (!ops->set_fec)
return -EOPNOTSUPP;
fec_mode = eth_to_loc_fec(fec->fec);
+
+ if (netif_msg_drv(handle))
+ netdev_info(netdev, "set fecparam: mode=%d\n", fec_mode);
+
return ops->set_fec(handle, fec_mode);
}
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
index bac4ce1..07471ba 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
@@ -201,6 +201,7 @@ static int hclge_client_setup_tc(struct hclge_dev *hdev)
static int hclge_ieee_setets(struct hnae3_handle *h, struct ieee_ets *ets)
{
struct hclge_vport *vport = hclge_get_vport(h);
+ struct net_device *netdev = h->kinfo.netdev;
struct hclge_dev *hdev = vport->back;
bool map_changed = false;
u8 num_tc = 0;
@@ -215,6 +216,9 @@ static int hclge_ieee_setets(struct hnae3_handle *h, struct ieee_ets *ets)
return ret;
if (map_changed) {
+ if (netif_msg_drv(h))
+ netdev_info(netdev, "set ets\n");
+
ret = hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
if (ret)
return ret;
@@ -300,6 +304,7 @@ static int hclge_ieee_getpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
static int hclge_ieee_setpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
{
struct hclge_vport *vport = hclge_get_vport(h);
+ struct net_device *netdev = h->kinfo.netdev;
struct hclge_dev *hdev = vport->back;
u8 i, j, pfc_map, *prio_tc;
@@ -325,6 +330,11 @@ static int hclge_ieee_setpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
hdev->tm_info.hw_pfc_map = pfc_map;
hdev->tm_info.pfc_en = pfc->pfc_en;
+ if (netif_msg_drv(h))
+ netdev_info(netdev,
+ "set pfc: pfc_en=%d, pfc_map=%d, num_tc=%d\n",
+ pfc->pfc_en, pfc_map, hdev->tm_info.num_tc);
+
hclge_tm_pfc_info_update(hdev);
return hclge_pause_setup_hw(hdev, false);
@@ -345,8 +355,12 @@ static u8 hclge_getdcbx(struct hnae3_handle *h)
static u8 hclge_setdcbx(struct hnae3_handle *h, u8 mode)
{
struct hclge_vport *vport = hclge_get_vport(h);
+ struct net_device *netdev = h->kinfo.netdev;
struct hclge_dev *hdev = vport->back;
+ if (netif_msg_drv(h))
+ netdev_info(netdev, "set dcbx: mode=%d\n", mode);
+
/* No support for LLD_MANAGED modes or CEE */
if ((mode & DCB_CAP_DCBX_LLD_MANAGED) ||
(mode & DCB_CAP_DCBX_VER_CEE) ||
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 06/11] net: hns3: modify firmware version display format
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Yufeng Mo, Peng Li, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Yufeng Mo <moyufeng@huawei.com>
This patch modifies firmware version display format in
hclge(vf)_cmd_init() and hns3_get_drvinfo(). Also, adds
some optimizations for firmware version display format.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hnae3.h | 9 +++++++++
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 15 +++++++++++++--
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 10 +++++++++-
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c | 10 +++++++++-
4 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 48c7b70..a4624db 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -179,6 +179,15 @@ struct hnae3_vector_info {
#define HNAE3_RING_GL_RX 0
#define HNAE3_RING_GL_TX 1
+#define HNAE3_FW_VERSION_BYTE3_SHIFT 24
+#define HNAE3_FW_VERSION_BYTE3_MASK GENMASK(31, 24)
+#define HNAE3_FW_VERSION_BYTE2_SHIFT 16
+#define HNAE3_FW_VERSION_BYTE2_MASK GENMASK(23, 16)
+#define HNAE3_FW_VERSION_BYTE1_SHIFT 8
+#define HNAE3_FW_VERSION_BYTE1_MASK GENMASK(15, 8)
+#define HNAE3_FW_VERSION_BYTE0_SHIFT 0
+#define HNAE3_FW_VERSION_BYTE0_MASK GENMASK(7, 0)
+
struct hnae3_ring_chain_node {
struct hnae3_ring_chain_node *next;
u32 tqp_index;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 5bff98a..e71c92b 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -527,6 +527,7 @@ static void hns3_get_drvinfo(struct net_device *netdev,
{
struct hns3_nic_priv *priv = netdev_priv(netdev);
struct hnae3_handle *h = priv->ae_handle;
+ u32 fw_version;
if (!h->ae_algo->ops->get_fw_version) {
netdev_err(netdev, "could not get fw version!\n");
@@ -545,8 +546,18 @@ static void hns3_get_drvinfo(struct net_device *netdev,
sizeof(drvinfo->bus_info));
drvinfo->bus_info[ETHTOOL_BUSINFO_LEN - 1] = '\0';
- snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), "0x%08x",
- priv->ae_handle->ae_algo->ops->get_fw_version(h));
+ fw_version = priv->ae_handle->ae_algo->ops->get_fw_version(h);
+
+ snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%lu.%lu.%lu.%lu",
+ hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE3_MASK,
+ HNAE3_FW_VERSION_BYTE3_SHIFT),
+ hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE2_MASK,
+ HNAE3_FW_VERSION_BYTE2_SHIFT),
+ hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE1_MASK,
+ HNAE3_FW_VERSION_BYTE1_SHIFT),
+ hnae3_get_field(fw_version, HNAE3_FW_VERSION_BYTE0_MASK,
+ HNAE3_FW_VERSION_BYTE0_SHIFT));
}
static u32 hns3_get_link(struct net_device *netdev)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
index 22f6acd..d9858f2 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -419,7 +419,15 @@ int hclge_cmd_init(struct hclge_dev *hdev)
}
hdev->fw_version = version;
- dev_info(&hdev->pdev->dev, "The firmware version is %08x\n", version);
+ dev_info(&hdev->pdev->dev, "The firmware version is %lu.%lu.%lu.%lu\n",
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE3_MASK,
+ HNAE3_FW_VERSION_BYTE3_SHIFT),
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE2_MASK,
+ HNAE3_FW_VERSION_BYTE2_SHIFT),
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE1_MASK,
+ HNAE3_FW_VERSION_BYTE1_SHIFT),
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE0_MASK,
+ HNAE3_FW_VERSION_BYTE0_SHIFT));
return 0;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
index 652b796..8f21eb3 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c
@@ -405,7 +405,15 @@ int hclgevf_cmd_init(struct hclgevf_dev *hdev)
}
hdev->fw_version = version;
- dev_info(&hdev->pdev->dev, "The firmware version is %08x\n", version);
+ dev_info(&hdev->pdev->dev, "The firmware version is %lu.%lu.%lu.%lu\n",
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE3_MASK,
+ HNAE3_FW_VERSION_BYTE3_SHIFT),
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE2_MASK,
+ HNAE3_FW_VERSION_BYTE2_SHIFT),
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE1_MASK,
+ HNAE3_FW_VERSION_BYTE1_SHIFT),
+ hnae3_get_field(version, HNAE3_FW_VERSION_BYTE0_MASK,
+ HNAE3_FW_VERSION_BYTE0_SHIFT));
return 0;
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 10/11] net: hns3: Add support for using order 1 pages with a 4K buffer
From: Huazhong Tan @ 2019-07-26 3:25 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Yunsheng Lin, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Yunsheng Lin <linyunsheng@huawei.com>
Hardware supports 0.5K, 1K, 2K, 4K RX buffer size, the
RX buffer can not be reused because the hns3_page_order
return 0 when page size and RX buffer size are both 4096.
So this patch changes the hns3_page_order to return 1 when
RX buffer is greater than half of the page size and page size
is less the 8192, and dev_alloc_pages has already been used
to allocate the compound page for RX buffer.
This patch also changes hnae3_* to hns3_* for page order
and RX buffer size calculation because they are used in
hns3 module.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 10 +++++-----
drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 15 ++++++++++++---
2 files changed, 17 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 2e30cfa..3a93bef 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2086,7 +2086,7 @@ static void hns3_set_default_feature(struct net_device *netdev)
static int hns3_alloc_buffer(struct hns3_enet_ring *ring,
struct hns3_desc_cb *cb)
{
- unsigned int order = hnae3_page_order(ring);
+ unsigned int order = hns3_page_order(ring);
struct page *p;
p = dev_alloc_pages(order);
@@ -2097,7 +2097,7 @@ static int hns3_alloc_buffer(struct hns3_enet_ring *ring,
cb->page_offset = 0;
cb->reuse_flag = 0;
cb->buf = page_address(p);
- cb->length = hnae3_page_size(ring);
+ cb->length = hns3_page_size(ring);
cb->type = DESC_TYPE_PAGE;
return 0;
@@ -2400,7 +2400,7 @@ static void hns3_nic_reuse_page(struct sk_buff *skb, int i,
{
struct hns3_desc *desc = &ring->desc[ring->next_to_clean];
int size = le16_to_cpu(desc->rx.size);
- u32 truesize = hnae3_buf_size(ring);
+ u32 truesize = hns3_buf_size(ring);
skb_add_rx_frag(skb, i, desc_cb->priv, desc_cb->page_offset + pull_len,
size - pull_len, truesize);
@@ -2415,7 +2415,7 @@ static void hns3_nic_reuse_page(struct sk_buff *skb, int i,
/* Move offset up to the next cache line */
desc_cb->page_offset += truesize;
- if (desc_cb->page_offset + truesize <= hnae3_page_size(ring)) {
+ if (desc_cb->page_offset + truesize <= hns3_page_size(ring)) {
desc_cb->reuse_flag = 1;
/* Bump ref count on page before it is given */
get_page(desc_cb->priv);
@@ -2697,7 +2697,7 @@ static int hns3_add_frag(struct hns3_enet_ring *ring, struct hns3_desc *desc,
}
if (ring->tail_skb) {
- head_skb->truesize += hnae3_buf_size(ring);
+ head_skb->truesize += hns3_buf_size(ring);
head_skb->data_len += le16_to_cpu(desc->rx.size);
head_skb->len += le16_to_cpu(desc->rx.size);
skb = ring->tail_skb;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index 848b866..1a17856 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -608,9 +608,18 @@ static inline bool hns3_nic_resetting(struct net_device *netdev)
#define tx_ring_data(priv, idx) ((priv)->ring_data[idx])
-#define hnae3_buf_size(_ring) ((_ring)->buf_size)
-#define hnae3_page_order(_ring) (get_order(hnae3_buf_size(_ring)))
-#define hnae3_page_size(_ring) (PAGE_SIZE << (u32)hnae3_page_order(_ring))
+#define hns3_buf_size(_ring) ((_ring)->buf_size)
+
+static inline unsigned int hns3_page_order(struct hns3_enet_ring *ring)
+{
+#if (PAGE_SIZE < 8192)
+ if (ring->buf_size > (PAGE_SIZE / 2))
+ return 1;
+#endif
+ return 0;
+}
+
+#define hns3_page_size(_ring) (PAGE_SIZE << hns3_page_order(_ring))
/* iterator for handling rings in ring group */
#define hns3_for_each_ring(pos, head) \
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 09/11] net: hns3: add interrupt affinity support for misc interrupt
From: Huazhong Tan @ 2019-07-26 3:25 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Yunsheng Lin, Peng Li, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Yunsheng Lin <linyunsheng@huawei.com>
The misc interrupt is used to schedule the reset and mailbox
subtask, and a 1 sec timer is used to schedule the service
subtask, which does periodic work.
This patch sets the above three subtask's affinity using the
misc interrupt' affinity.
Also this patch setups a affinity notify for misc interrupt to
allow user to change the above three subtask's affinity.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 53 ++++++++++++++++++++--
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 4 ++
2 files changed, 53 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index e804a19..3e43dff 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -1270,6 +1270,12 @@ static int hclge_configure(struct hclge_dev *hdev)
hclge_init_kdump_kernel_config(hdev);
+ /* Set the init affinity based on pci func number */
+ i = cpumask_weight(cpumask_of_node(dev_to_node(&hdev->pdev->dev)));
+ i = i ? PCI_FUNC(hdev->pdev->devfn) % i : 0;
+ cpumask_set_cpu(cpumask_local_spread(i, dev_to_node(&hdev->pdev->dev)),
+ &hdev->affinity_mask);
+
return ret;
}
@@ -2501,14 +2507,16 @@ static void hclge_mbx_task_schedule(struct hclge_dev *hdev)
{
if (!test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state) &&
!test_and_set_bit(HCLGE_STATE_MBX_SERVICE_SCHED, &hdev->state))
- schedule_work(&hdev->mbx_service_task);
+ queue_work_on(cpumask_first(&hdev->affinity_mask), system_wq,
+ &hdev->mbx_service_task);
}
static void hclge_reset_task_schedule(struct hclge_dev *hdev)
{
if (!test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
!test_and_set_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state))
- schedule_work(&hdev->rst_service_task);
+ queue_work_on(cpumask_first(&hdev->affinity_mask), system_wq,
+ &hdev->rst_service_task);
}
static void hclge_task_schedule(struct hclge_dev *hdev)
@@ -2518,8 +2526,9 @@ static void hclge_task_schedule(struct hclge_dev *hdev)
!test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state)) {
hdev->hw_stats.stats_timer++;
hdev->fd_arfs_expire_timer++;
- mod_delayed_work(system_wq, &hdev->service_task,
- round_jiffies_relative(HZ));
+ mod_delayed_work_on(cpumask_first(&hdev->affinity_mask),
+ system_wq, &hdev->service_task,
+ round_jiffies_relative(HZ));
}
}
@@ -2905,6 +2914,36 @@ static void hclge_get_misc_vector(struct hclge_dev *hdev)
hdev->num_msi_used += 1;
}
+static void hclge_irq_affinity_notify(struct irq_affinity_notify *notify,
+ const cpumask_t *mask)
+{
+ struct hclge_dev *hdev = container_of(notify, struct hclge_dev,
+ affinity_notify);
+
+ cpumask_copy(&hdev->affinity_mask, mask);
+}
+
+static void hclge_irq_affinity_release(struct kref *ref)
+{
+}
+
+static void hclge_misc_affinity_setup(struct hclge_dev *hdev)
+{
+ irq_set_affinity_hint(hdev->misc_vector.vector_irq,
+ &hdev->affinity_mask);
+
+ hdev->affinity_notify.notify = hclge_irq_affinity_notify;
+ hdev->affinity_notify.release = hclge_irq_affinity_release;
+ irq_set_affinity_notifier(hdev->misc_vector.vector_irq,
+ &hdev->affinity_notify);
+}
+
+static void hclge_misc_affinity_teardown(struct hclge_dev *hdev)
+{
+ irq_set_affinity_notifier(hdev->misc_vector.vector_irq, NULL);
+ irq_set_affinity_hint(hdev->misc_vector.vector_irq, NULL);
+}
+
static int hclge_misc_irq_init(struct hclge_dev *hdev)
{
int ret;
@@ -8796,6 +8835,11 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
INIT_WORK(&hdev->rst_service_task, hclge_reset_service_task);
INIT_WORK(&hdev->mbx_service_task, hclge_mailbox_service_task);
+ /* Setup affinity after service timer setup because add_timer_on
+ * is called in affinity notify.
+ */
+ hclge_misc_affinity_setup(hdev);
+
hclge_clear_all_event_cause(hdev);
hclge_clear_resetting_state(hdev);
@@ -8957,6 +9001,7 @@ static void hclge_uninit_ae_dev(struct hnae3_ae_dev *ae_dev)
struct hclge_dev *hdev = ae_dev->priv;
struct hclge_mac *mac = &hdev->hw.mac;
+ hclge_misc_affinity_teardown(hdev);
hclge_state_uninit(hdev);
if (mac->phydev)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index dde8f22..688e425 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -863,6 +863,10 @@ struct hclge_dev {
DECLARE_KFIFO(mac_tnl_log, struct hclge_mac_tnl_stats,
HCLGE_MAC_TNL_LOG_SIZE);
+
+ /* affinity mask and notify for misc interrupt */
+ cpumask_t affinity_mask;
+ struct irq_affinity_notify affinity_notify;
};
/* VPort level vlan tag configuration for TX direction */
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 08/11] net: hns3: make hclge_service use delayed workqueue
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Yunsheng Lin, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Yunsheng Lin <linyunsheng@huawei.com>
Use delayed work instead of using timers to trigger the
hclge_serive.
Simplify the code with one less middle function and in order
to support misc irq affinity.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 52 +++++++++-------------
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 3 +-
2 files changed, 21 insertions(+), 34 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index faf60b4..e804a19 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2515,8 +2515,12 @@ static void hclge_task_schedule(struct hclge_dev *hdev)
{
if (!test_bit(HCLGE_STATE_DOWN, &hdev->state) &&
!test_bit(HCLGE_STATE_REMOVING, &hdev->state) &&
- !test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state))
- (void)schedule_work(&hdev->service_task);
+ !test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state)) {
+ hdev->hw_stats.stats_timer++;
+ hdev->fd_arfs_expire_timer++;
+ mod_delayed_work(system_wq, &hdev->service_task,
+ round_jiffies_relative(HZ));
+ }
}
static int hclge_get_mac_link_status(struct hclge_dev *hdev)
@@ -2731,25 +2735,6 @@ static int hclge_get_status(struct hnae3_handle *handle)
return hdev->hw.mac.link;
}
-static void hclge_service_timer(struct timer_list *t)
-{
- struct hclge_dev *hdev = from_timer(hdev, t, service_timer);
-
- mod_timer(&hdev->service_timer, jiffies + HZ);
- hdev->hw_stats.stats_timer++;
- hdev->fd_arfs_expire_timer++;
- hclge_task_schedule(hdev);
-}
-
-static void hclge_service_complete(struct hclge_dev *hdev)
-{
- WARN_ON(!test_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state));
-
- /* Flush memory before next watchdog */
- smp_mb__before_atomic();
- clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
-}
-
static u32 hclge_check_event_cause(struct hclge_dev *hdev, u32 *clearval)
{
u32 rst_src_reg, cmdq_src_reg, msix_src_reg;
@@ -3596,7 +3581,9 @@ static void hclge_update_vport_alive(struct hclge_dev *hdev)
static void hclge_service_task(struct work_struct *work)
{
struct hclge_dev *hdev =
- container_of(work, struct hclge_dev, service_task);
+ container_of(work, struct hclge_dev, service_task.work);
+
+ clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
if (hdev->hw_stats.stats_timer >= HCLGE_STATS_TIMER_INTERVAL) {
hclge_update_stats_for_all(hdev);
@@ -3611,7 +3598,8 @@ static void hclge_service_task(struct work_struct *work)
hclge_rfs_filter_expire(hdev);
hdev->fd_arfs_expire_timer = 0;
}
- hclge_service_complete(hdev);
+
+ hclge_task_schedule(hdev);
}
struct hclge_vport *hclge_get_vport(struct hnae3_handle *handle)
@@ -6150,10 +6138,13 @@ static void hclge_set_timer_task(struct hnae3_handle *handle, bool enable)
struct hclge_dev *hdev = vport->back;
if (enable) {
- mod_timer(&hdev->service_timer, jiffies + HZ);
+ hclge_task_schedule(hdev);
} else {
- del_timer_sync(&hdev->service_timer);
- cancel_work_sync(&hdev->service_task);
+ /* Set the DOWN flag here to disable the service to be
+ * scheduled again
+ */
+ set_bit(HCLGE_STATE_DOWN, &hdev->state);
+ cancel_delayed_work_sync(&hdev->service_task);
clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
}
}
@@ -8592,12 +8583,10 @@ static void hclge_state_uninit(struct hclge_dev *hdev)
set_bit(HCLGE_STATE_DOWN, &hdev->state);
set_bit(HCLGE_STATE_REMOVING, &hdev->state);
- if (hdev->service_timer.function)
- del_timer_sync(&hdev->service_timer);
if (hdev->reset_timer.function)
del_timer_sync(&hdev->reset_timer);
- if (hdev->service_task.func)
- cancel_work_sync(&hdev->service_task);
+ if (hdev->service_task.work.func)
+ cancel_delayed_work_sync(&hdev->service_task);
if (hdev->rst_service_task.func)
cancel_work_sync(&hdev->rst_service_task);
if (hdev->mbx_service_task.func)
@@ -8802,9 +8791,8 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
hclge_dcb_ops_set(hdev);
- timer_setup(&hdev->service_timer, hclge_service_timer, 0);
timer_setup(&hdev->reset_timer, hclge_reset_timer, 0);
- INIT_WORK(&hdev->service_task, hclge_service_task);
+ INIT_DELAYED_WORK(&hdev->service_task, hclge_service_task);
INIT_WORK(&hdev->rst_service_task, hclge_reset_service_task);
INIT_WORK(&hdev->mbx_service_task, hclge_mailbox_service_task);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 6a12285..dde8f22 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -806,9 +806,8 @@ struct hclge_dev {
u16 adminq_work_limit; /* Num of admin receive queue desc to process */
unsigned long service_timer_period;
unsigned long service_timer_previous;
- struct timer_list service_timer;
struct timer_list reset_timer;
- struct work_struct service_task;
+ struct delayed_work service_task;
struct work_struct rst_service_task;
struct work_struct mbx_service_task;
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 11/11] net: hns3: use dev_info() instead of pr_info()
From: Huazhong Tan @ 2019-07-26 3:25 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
dev_info() is more appropriate for printing messages when driver
initialization done, so switch to dev_info().
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 +++-
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 3 ++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 3e43dff..588fb42 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -8864,7 +8864,9 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
hclge_state_init(hdev);
hdev->last_reset_time = jiffies;
- pr_info("%s driver initialization finished.\n", HCLGE_DRIVER_NAME);
+ dev_info(&hdev->pdev->dev, "%s driver initialization finished.\n",
+ HCLGE_DRIVER_NAME);
+
return 0;
err_mdiobus_unreg:
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index db84782..218fd5d 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2703,7 +2703,8 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
}
hdev->last_reset_time = jiffies;
- pr_info("finished initializing %s driver\n", HCLGEVF_DRIVER_NAME);
+ dev_info(&hdev->pdev->dev, "finished initializing %s driver\n",
+ HCLGEVF_DRIVER_NAME);
return 0;
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 05/11] net: hns3: change GFP flag during lock period
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Yufeng Mo, lipeng 00277521, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Yufeng Mo <moyufeng@huawei.com>
When allocating memory, the GFP_KERNEL cannot be used during the
spin_lock period. This is because it may cause scheduling when holding
spin_lock. This patch changes GFP flag to GFP_ATOMIC in this case.
Fixes: dd74f815dd41 ("net: hns3: Add support for rule add/delete for flow director")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: lipeng 00277521 <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 9329aab..faf60b4 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -5798,7 +5798,7 @@ static int hclge_add_fd_entry_by_arfs(struct hnae3_handle *handle, u16 queue_id,
return -ENOSPC;
}
- rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+ rule = kzalloc(sizeof(*rule), GFP_ATOMIC);
if (!rule) {
spin_unlock_bh(&hdev->fd_rule_lock);
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 01/11] net: hns3: add reset checking before set channels
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Jian Shen, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Jian Shen <shenjian15@huawei.com>
hns3_set_channels() should check the resetting status firstly,
since the device will reinitialize when resetting. If the
reset has not completed, the hns3_set_channels() may access
invalid memory.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 69f7ef8..08af782 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -4378,6 +4378,9 @@ int hns3_set_channels(struct net_device *netdev,
u16 org_tqp_num;
int ret;
+ if (hns3_nic_resetting(netdev))
+ return -EBUSY;
+
if (ch->rx_count || ch->tx_count)
return -EINVAL;
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 02/11] net: hns3: add a check for get_reset_level
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Guangbin Huang, Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
From: Guangbin Huang <huangguangbin@huawei.com>
For some cases, ops->get_reset_level may not be implemented, so we
should check whether it is NULL before calling get_reset_level.
Signed-off-by: Guangbin Huang <huangguangbin@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 08af782..4d58c53 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -1963,7 +1963,7 @@ static pci_ers_result_t hns3_slot_reset(struct pci_dev *pdev)
ops = ae_dev->ops;
/* request the reset */
- if (ops->reset_event) {
+ if (ops->reset_event && ops->get_reset_level) {
if (ae_dev->hw_err_reset_req) {
reset_type = ops->get_reset_level(ae_dev,
&ae_dev->hw_err_reset_req);
--
2.7.4
^ permalink raw reply related
* [PATCH V2 net-next 00/11] net: hns3: some code optimizations & bugfixes & features
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Huazhong Tan
This patch-set includes code optimizations, bugfixes and features for
the HNS3 ethernet controller driver.
[patch 1/11] checks reset status before setting channel.
[patch 2/11] adds a NULL pointer checking.
[patch 3/11] removes reset level upgrading when current reset fails.
[patch 4/11] fixes a bug related to IRQ vector number initialization.
[patch 5/11] fixes a GFP flags errors when holding spin_lock.
[patch 6/11] modifies firmware version format.
[patch 7/11] adds some print information which is off by default.
[patch 8/11 - 9/11] adds two code optimizations about interrupt handler
and work task.
[patch 10/11] adds support for using order 1 pages with a 4K buffer.
[patch 11/11] modifies messages prints with dev_info() instead of
pr_info().
Change log:
V1->V2: fixes comments from Saeed Mahameed and
removes previous [patch 11/11] which needs further discussion,
adds a new patch [11/11] suggested by Saeed Mahameed.
Guangbin Huang (1):
net: hns3: add a check for get_reset_level
Huazhong Tan (2):
net: hns3: remove upgrade reset level when reset fail
net: hns3: use dev_info() instead of pr_info()
Jian Shen (1):
net: hns3: add reset checking before set channels
Yonglong Liu (2):
net: hns3: fix mis-counting IRQ vector numbers issue
net: hns3: adds debug messages to identify eth down cause
Yufeng Mo (2):
net: hns3: change GFP flag during lock period
net: hns3: modify firmware version display format
Yunsheng Lin (3):
net: hns3: make hclge_service use delayed workqueue
net: hns3: add interrupt affinity support for misc interrupt
net: hns3: Add support for using order 1 pages with a 4K buffer
drivers/net/ethernet/hisilicon/hns3/hnae3.h | 9 ++
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 39 +++++-
drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 15 ++-
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 41 +++++-
.../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 10 +-
.../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 14 +++
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 137 ++++++++++++---------
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 7 +-
.../ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c | 10 +-
.../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 15 ++-
10 files changed, 223 insertions(+), 74 deletions(-)
--
2.7.4
^ permalink raw reply
* [PATCH V2 net-next 03/11] net: hns3: remove upgrade reset level when reset fail
From: Huazhong Tan @ 2019-07-26 3:24 UTC (permalink / raw)
To: davem
Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
Huazhong Tan
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
Currently, hclge_reset_err_handle() will assert a global reset
when the failing count is smaller than MAX_RESET_FAIL_CNT, which
will affect other running functions.
So this patch removes this upgrading, and uses re-scheduling reset
task to do it.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
---
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 28 +++++++---------------
1 file changed, 8 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 3fde5471..3c64d70 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3305,7 +3305,7 @@ static int hclge_reset_prepare_wait(struct hclge_dev *hdev)
return ret;
}
-static bool hclge_reset_err_handle(struct hclge_dev *hdev, bool is_timeout)
+static bool hclge_reset_err_handle(struct hclge_dev *hdev)
{
#define MAX_RESET_FAIL_CNT 5
@@ -3322,20 +3322,11 @@ static bool hclge_reset_err_handle(struct hclge_dev *hdev, bool is_timeout)
return false;
} else if (hdev->reset_fail_cnt < MAX_RESET_FAIL_CNT) {
hdev->reset_fail_cnt++;
- if (is_timeout) {
- set_bit(hdev->reset_type, &hdev->reset_pending);
- dev_info(&hdev->pdev->dev,
- "re-schedule to wait for hw reset done\n");
- return true;
- }
-
- dev_info(&hdev->pdev->dev, "Upgrade reset level\n");
- hclge_clear_reset_cause(hdev);
- set_bit(HNAE3_GLOBAL_RESET, &hdev->default_reset_request);
- mod_timer(&hdev->reset_timer,
- jiffies + HCLGE_RESET_INTERVAL);
-
- return false;
+ set_bit(hdev->reset_type, &hdev->reset_pending);
+ dev_info(&hdev->pdev->dev,
+ "re-schedule reset task(%d)\n",
+ hdev->reset_fail_cnt);
+ return true;
}
hclge_clear_reset_cause(hdev);
@@ -3382,7 +3373,6 @@ static int hclge_reset_stack(struct hclge_dev *hdev)
static void hclge_reset(struct hclge_dev *hdev)
{
struct hnae3_ae_dev *ae_dev = pci_get_drvdata(hdev->pdev);
- bool is_timeout = false;
int ret;
/* Initialize ae_dev reset status as well, in case enet layer wants to
@@ -3410,10 +3400,8 @@ static void hclge_reset(struct hclge_dev *hdev)
if (ret)
goto err_reset;
- if (hclge_reset_wait(hdev)) {
- is_timeout = true;
+ if (hclge_reset_wait(hdev))
goto err_reset;
- }
hdev->rst_stats.hw_reset_done_cnt++;
@@ -3465,7 +3453,7 @@ static void hclge_reset(struct hclge_dev *hdev)
err_reset_lock:
rtnl_unlock();
err_reset:
- if (hclge_reset_err_handle(hdev, is_timeout))
+ if (hclge_reset_err_handle(hdev))
hclge_reset_task_schedule(hdev);
}
--
2.7.4
^ permalink raw reply related
* Re: [PATCH V2 net-next 00/11] net: hns3: some code optimizations & bugfixes & features
From: tanhuazhong @ 2019-07-26 4:00 UTC (permalink / raw)
To: davem; +Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm
In-Reply-To: <1564111502-15504-1-git-send-email-tanhuazhong@huawei.com>
Sorry, please ignore this patchset. Will resend it later:)
On 2019/7/26 11:24, Huazhong Tan wrote:
> This patch-set includes code optimizations, bugfixes and features for
> the HNS3 ethernet controller driver.
>
> [patch 1/11] checks reset status before setting channel.
>
> [patch 2/11] adds a NULL pointer checking.
>
> [patch 3/11] removes reset level upgrading when current reset fails.
>
> [patch 4/11] fixes a bug related to IRQ vector number initialization.
>
> [patch 5/11] fixes a GFP flags errors when holding spin_lock.
>
> [patch 6/11] modifies firmware version format.
>
> [patch 7/11] adds some print information which is off by default.
>
> [patch 8/11 - 9/11] adds two code optimizations about interrupt handler
> and work task.
>
> [patch 10/11] adds support for using order 1 pages with a 4K buffer.
>
> [patch 11/11] modifies messages prints with dev_info() instead of
> pr_info().
>
> Change log:
> V1->V2: fixes comments from Saeed Mahameed and
> removes previous [patch 11/11] which needs further discussion,
> adds a new patch [11/11] suggested by Saeed Mahameed.
>
> Guangbin Huang (1):
> net: hns3: add a check for get_reset_level
>
> Huazhong Tan (2):
> net: hns3: remove upgrade reset level when reset fail
> net: hns3: use dev_info() instead of pr_info()
>
> Jian Shen (1):
> net: hns3: add reset checking before set channels
>
> Yonglong Liu (2):
> net: hns3: fix mis-counting IRQ vector numbers issue
> net: hns3: adds debug messages to identify eth down cause
>
> Yufeng Mo (2):
> net: hns3: change GFP flag during lock period
> net: hns3: modify firmware version display format
>
> Yunsheng Lin (3):
> net: hns3: make hclge_service use delayed workqueue
> net: hns3: add interrupt affinity support for misc interrupt
> net: hns3: Add support for using order 1 pages with a 4K buffer
>
> drivers/net/ethernet/hisilicon/hns3/hnae3.h | 9 ++
> drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 39 +++++-
> drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 15 ++-
> drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 41 +++++-
> .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 10 +-
> .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 14 +++
> .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 137 ++++++++++++---------
> .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 7 +-
> .../ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c | 10 +-
> .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 15 ++-
> 10 files changed, 223 insertions(+), 74 deletions(-)
>
^ permalink raw reply
* Re: [PATCH bpf-next 2/6] bpf: add BPF_MAP_DUMP command to dump more than one entry per call
From: Yonghong Song @ 2019-07-26 6:10 UTC (permalink / raw)
To: Alexei Starovoitov, Brian Vazquez
Cc: Song Liu, Brian Vazquez, Daniel Borkmann, David S . Miller,
Stanislav Fomichev, Willem de Bruijn, Petar Penkov, Networking,
bpf
In-Reply-To: <CAADnVQJpp37fXLsu8ZnMFPoC0Uof3roz4gofX0QCewNkwtf-Xg@mail.gmail.com>
On 7/25/19 6:47 PM, Alexei Starovoitov wrote:
> On Thu, Jul 25, 2019 at 6:24 PM Brian Vazquez <brianvv.kernel@gmail.com> wrote:
>>
>> On Thu, Jul 25, 2019 at 4:54 PM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>>
>>> On Thu, Jul 25, 2019 at 04:25:53PM -0700, Brian Vazquez wrote:
>>>>>>> If prev_key is deleted before map_get_next_key(), we get the first key
>>>>>>> again. This is pretty weird.
>>>>>>
>>>>>> Yes, I know. But note that the current scenario happens even for the
>>>>>> old interface (imagine you are walking a map from userspace and you
>>>>>> tried get_next_key the prev_key was removed, you will start again from
>>>>>> the beginning without noticing it).
>>>>>> I tried to sent a patch in the past but I was missing some context:
>>>>>> before NULL was used to get the very first_key the interface relied in
>>>>>> a random (non existent) key to retrieve the first_key in the map, and
>>>>>> I was told what we still have to support that scenario.
>>>>>
>>>>> BPF_MAP_DUMP is slightly different, as you may return the first key
>>>>> multiple times in the same call. Also, BPF_MAP_DUMP is new, so we
>>>>> don't have to support legacy scenarios.
>>>>>
>>>>> Since BPF_MAP_DUMP keeps a list of elements. It is possible to try
>>>>> to look up previous keys. Would something down this direction work?
>>>>
>>>> I've been thinking about it and I think first we need a way to detect
>>>> that since key was not present we got the first_key instead:
>>>>
>>>> - One solution I had in mind was to explicitly asked for the first key
>>>> with map_get_next_key(map, NULL, first_key) and while walking the map
>>>> check that map_get_next_key(map, prev_key, key) doesn't return the
>>>> same key. This could be done using memcmp.
>>>> - Discussing with Stan, he mentioned that another option is to support
>>>> a flag in map_get_next_key to let it know that we want an error
>>>> instead of the first_key.
>>>>
>>>> After detecting the problem we also need to define what we want to do,
>>>> here some options:
>>>>
>>>> a) Return the error to the caller
>>>> b) Try with previous keys if any (which be limited to the keys that we
>>>> have traversed so far in this dump call)
>>>> c) continue with next entries in the map. array is easy just get the
>>>> next valid key (starting on i+1), but hmap might be difficult since
>>>> starting on the next bucket could potentially skip some keys that were
>>>> concurrently added to the same bucket where key used to be, and
>>>> starting on the same bucket could lead us to return repeated elements.
>>>>
>>>> Or maybe we could support those 3 cases via flags and let the caller
>>>> decide which one to use?
>>>
>>> this type of indecision is the reason why I wasn't excited about
>>> batch dumping in the first place and gave 'soft yes' when Stan
>>> mentioned it during lsf/mm/bpf uconf.
>>> We probably shouldn't do it.
>>> It feels this map_dump makes api more complex and doesn't really
>>> give much benefit to the user other than large map dump becomes faster.
>>> I think we gotta solve this problem differently.
>>
>> Some users are working around the dumping problems with the existing
>> api by creating a bpf_map_get_next_key_and_delete userspace function
>> (see https://urldefense.proofpoint.com/v2/url?u=https-3A__www.bouncybouncy.net_blog_bpf-5Fmap-5Fget-5Fnext-5Fkey-2Dpitfalls_&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=XvNxqsDhRi62gzZ04HbLRTOFJX8X6mTuK7PZGn80akY&s=7q7beZxOJJ3Q0el8L0r-xDctedSpnEejJ6PVX1XYotQ&e= )
>> which in my opinion is actually a good idea. The only problem with
>> that is that calling bpf_map_get_next_key(fd, key, next_key) and then
>> bpf_map_delete_elem(fd, key) from userspace is racing with kernel code
>> and it might lose some information when deleting.
>> We could then do map_dump_and_delete using that idea but in the kernel
>> where we could better handle the racing condition. In that scenario
>> even if we retrieve the same key it will contain different info ( the
>> delta between old and new value). Would that work?
>
> you mean get_next+lookup+delete at once?
> Sounds useful.
> Yonghong has been thinking about batching api as well.
In bcc, we have many instances like this:
getting all (key value) pairs, do some analysis and output,
delete all keys
The implementation typically like
/* to get all (key, value) pairs */
while(bpf_get_next_key() == 0)
bpf_map_lookup()
/* do analysis and output */
for (all keys)
bpf_map_delete()
get_next+lookup+delete will be definitely useful.
batching will be even better to save the number of syscalls.
An alternative is to do batch get_next+lookup and batch delete
to achieve similar goal as the above code.
There is a minor difference between this approach
and the above get_next+lookup+delete.
During scanning the hash map, get_next+lookup may get less number
of elements compared to get_next+lookup+delete as the latter
may have more later-inserted hash elements after the operation
start. But both are inaccurate, so probably the difference
is minor.
>
> I think if we cannot figure out how to make a batch of two commands
> get_next + lookup to work correctly then we need to identify/invent one
> command and make batching more generic.
not 100% sure. It will be hard to define what is "correctly".
For not changing map, looping of (get_next, lookup) and batch
get_next+lookup should have the same results.
For constant changing loops, not sure how to define which one
is correct. If users have concerns, they may need to just pick one
which gives them more comfort.
> Like make one jumbo/compound/atomic command to be get_next+lookup+delete.
> Define the semantics of this single compound command.
> And then let batching to be a multiplier of such command.
> In a sense that multiplier 1 or N should be have the same way.
> No extra flags to alter the batching.
> The high level description of the batch would be:
> pls execute get_next,lookup,delete and repeat it N times.
> or
> pls execute get_next,lookup and repeat N times.
> where each command action is defined to be composable.
>
> Just a rough idea.
>
^ permalink raw reply
* [PATCH net-next v2 3/3] netfilter: nf_tables_offload: support indr block call
From: wenxu @ 2019-07-26 6:17 UTC (permalink / raw)
To: pablo, fw; +Cc: netfilter-devel, netdev
In-Reply-To: <1564121869-3398-1-git-send-email-wenxu@ucloud.cn>
From: wenxu <wenxu@ucloud.cn>
nftable support indr-block call. It makes nftable an offload vlan
and tunnel device.
nft add table netdev firewall
nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; }
nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0
nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; }
nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v2: make use of flow_block
net/netfilter/nf_tables_api.c | 6 ++
net/netfilter/nf_tables_offload.c | 128 +++++++++++++++++++++++++++++++-------
2 files changed, 110 insertions(+), 24 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 605a7cf..a8ab3e9 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7623,8 +7623,14 @@ static int __init nf_tables_module_init(void)
if (err < 0)
goto err5;
+ err = flow_indr_rhashtable_init();
+ if (err)
+ goto err6;
+
nft_chain_route_init();
return err;
+err6:
+ nfnetlink_subsys_unregister(&nf_tables_subsys);
err5:
rhltable_destroy(&nft_objname_ht);
err4:
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64f5fd5..09a5efe 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -171,24 +171,120 @@ static int nft_flow_offload_unbind(struct flow_block_offload *bo,
return 0;
}
+static int nft_block_setup(struct nft_base_chain *basechain,
+ struct flow_block_offload *bo,
+ enum flow_block_command cmd)
+{
+ int err;
+
+ switch (cmd) {
+ case FLOW_BLOCK_BIND:
+ err = nft_flow_offload_bind(bo, basechain);
+ break;
+ case FLOW_BLOCK_UNBIND:
+ err = nft_flow_offload_unbind(bo, basechain);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ err = -EOPNOTSUPP;
+ }
+
+ return err;
+}
+
+static int nft_block_offload_cmd(struct nft_base_chain *chain,
+ struct net_device *dev,
+ enum flow_block_command cmd)
+{
+ struct netlink_ext_ack extack = {};
+ struct flow_block_offload bo = {};
+ int err;
+
+ bo.net = dev_net(dev);
+ bo.block = &chain->flow_block;
+ bo.command = cmd;
+ bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+ bo.extack = &extack;
+ INIT_LIST_HEAD(&bo.cb_list);
+
+ err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+ if (err < 0)
+ return err;
+
+ return nft_block_setup(chain, &bo, cmd);
+}
+
+static void nft_indr_block_ing_cmd(struct net_device *dev,
+ struct flow_block *flow_block,
+ struct flow_indr_block_cb *indr_block_cb,
+ enum flow_block_command cmd)
+{
+ struct netlink_ext_ack extack = {};
+ struct flow_block_offload bo = {};
+ struct nft_base_chain *chain;
+
+ if (flow_block)
+ return;
+
+ chain = container_of(flow_block, struct nft_base_chain, flow_block);
+
+ bo.net = dev_net(dev);
+ bo.block = flow_block;
+ bo.command = cmd;
+ bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+ bo.extack = &extack;
+ INIT_LIST_HEAD(&bo.cb_list);
+
+ indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK, &bo);
+
+ nft_block_setup(chain, &bo, cmd);
+}
+
+static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
+ struct net_device *dev,
+ enum flow_block_command cmd)
+{
+ struct flow_indr_block_cb *indr_block_cb;
+ struct flow_indr_block_dev *indr_dev;
+ struct flow_block_offload bo = {};
+ struct netlink_ext_ack extack = {};
+
+ bo.net = dev_net(dev);
+ bo.block = &chain->flow_block;
+ bo.command = cmd;
+ bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+ bo.extack = &extack;
+ INIT_LIST_HEAD(&bo.cb_list);
+
+ indr_dev = flow_indr_block_dev_lookup(dev);
+ if (!indr_dev)
+ return -EOPNOTSUPP;
+
+ indr_dev->flow_block = cmd == FLOW_BLOCK_BIND ? &chain->flow_block : NULL;
+ indr_dev->ing_cmd_cb = cmd == FLOW_BLOCK_BIND ? nft_indr_block_ing_cmd : NULL;
+
+ list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+ indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
+ &bo);
+
+ return nft_block_setup(chain, &bo, cmd);
+}
+
#define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
static int nft_flow_offload_chain(struct nft_trans *trans,
enum flow_block_command cmd)
{
struct nft_chain *chain = trans->ctx.chain;
- struct netlink_ext_ack extack = {};
- struct flow_block_offload bo = {};
struct nft_base_chain *basechain;
struct net_device *dev;
- int err;
if (!nft_is_base_chain(chain))
return -EOPNOTSUPP;
basechain = nft_base_chain(chain);
dev = basechain->ops.dev;
- if (!dev || !dev->netdev_ops->ndo_setup_tc)
+ if (!dev)
return -EOPNOTSUPP;
/* Only default policy to accept is supported for now. */
@@ -197,26 +293,10 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
nft_trans_chain_policy(trans) != NF_ACCEPT)
return -EOPNOTSUPP;
- bo.command = cmd;
- bo.block = &basechain->flow_block;
- bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
- bo.extack = &extack;
- INIT_LIST_HEAD(&bo.cb_list);
-
- err = dev->netdev_ops->ndo_setup_tc(dev, FLOW_SETUP_BLOCK, &bo);
- if (err < 0)
- return err;
-
- switch (cmd) {
- case FLOW_BLOCK_BIND:
- err = nft_flow_offload_bind(&bo, basechain);
- break;
- case FLOW_BLOCK_UNBIND:
- err = nft_flow_offload_unbind(&bo, basechain);
- break;
- }
-
- return err;
+ if (dev->netdev_ops->ndo_setup_tc)
+ return nft_block_offload_cmd(basechain, dev, cmd);
+ else
+ return nft_indr_block_offload_cmd(basechain, dev, cmd);
}
int nft_flow_rule_offload_commit(struct net *net)
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next v2 0/3] flow_offload: add indr-block in nf_table_offload
From: wenxu @ 2019-07-26 6:17 UTC (permalink / raw)
To: pablo, fw; +Cc: netfilter-devel, netdev
From: wenxu <wenxu@ucloud.cn>
This series patch make nftables offload support the vlan and
tunnel device offload through indr-block architecture.
The first patch mv tc indr block to flow offload and rename
to flow-indr-block.
Because the new flow-indr-block can't get the tcf_block
directly. The second patch provide a callback to get tcf_block
immediately when the device register and contain a ingress block.
The third patch make nf_tables_offload support flow-indr-block.
wenxu (3):
flow_offload: move tc indirect block to flow offload
flow_offload: Support get tcf block immediately
netfilter: nf_tables_offload: support indr block call
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 10 +-
.../net/ethernet/netronome/nfp/flower/offload.c | 10 +-
include/net/flow_offload.h | 45 ++++
include/net/pkt_cls.h | 35 ---
include/net/sch_generic.h | 3 -
net/core/flow_offload.c | 202 +++++++++++++++++
net/netfilter/nf_tables_api.c | 6 +
net/netfilter/nf_tables_offload.c | 128 +++++++++--
net/sched/cls_api.c | 243 ++++-----------------
9 files changed, 410 insertions(+), 272 deletions(-)
--
1.8.3.1
^ permalink raw reply
* [PATCH net-next v2 2/3] flow_offload: Support get tcf block immediately
From: wenxu @ 2019-07-26 6:17 UTC (permalink / raw)
To: pablo, fw; +Cc: netfilter-devel, netdev
In-Reply-To: <1564121869-3398-1-git-send-email-wenxu@ucloud.cn>
From: wenxu <wenxu@ucloud.cn>
Because the new flow-indr-block can't get the tcf_block
directly.
It provide a callback to find the tcf block immediately
when the device register and contain a ingress block.
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v2: make use of flow_block
include/net/flow_offload.h | 4 ++++
net/core/flow_offload.c | 12 ++++++++++++
net/sched/cls_api.c | 34 ++++++++++++++++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index b5ef5be..bfe3a34 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -391,6 +391,10 @@ struct flow_indr_block_dev {
struct flow_block *flow_block;
};
+typedef void flow_indr_get_default_block_t(struct flow_indr_block_dev *indr_dev);
+
+void flow_indr_set_default_block_cb(flow_indr_get_default_block_t *cb);
+
struct flow_indr_block_dev *flow_indr_block_dev_lookup(struct net_device *dev);
int flow_indr_rhashtable_init(void);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index a6785df..eeff99f 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -298,6 +298,14 @@ struct flow_indr_block_dev *
}
EXPORT_SYMBOL(flow_indr_block_dev_lookup);
+static flow_indr_get_default_block_t *flow_indr_get_default_block;
+
+void flow_indr_set_default_block_cb(flow_indr_get_default_block_t *cb)
+{
+ flow_indr_get_default_block = cb;
+}
+EXPORT_SYMBOL(flow_indr_set_default_block_cb);
+
static struct flow_indr_block_dev *flow_indr_block_dev_get(struct net_device *dev)
{
struct flow_indr_block_dev *indr_dev;
@@ -312,6 +320,10 @@ static struct flow_indr_block_dev *flow_indr_block_dev_get(struct net_device *de
INIT_LIST_HEAD(&indr_dev->cb_list);
indr_dev->dev = dev;
+
+ if (flow_indr_get_default_block)
+ flow_indr_get_default_block(indr_dev);
+
if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
flow_indr_setup_block_ht_params)) {
kfree(indr_dev);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index d370c52..8d4d7f0 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -576,6 +576,38 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
tcf_block_setup(block, &bo);
}
+static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
+{
+ const struct Qdisc_class_ops *cops;
+ struct Qdisc *qdisc;
+
+ if (!dev_ingress_queue(dev))
+ return NULL;
+
+ qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
+ if (!qdisc)
+ return NULL;
+
+ cops = qdisc->ops->cl_ops;
+ if (!cops)
+ return NULL;
+
+ if (!cops->tcf_block)
+ return NULL;
+
+ return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
+}
+
+static void tc_indr_get_default_block(struct flow_indr_block_dev *indr_dev)
+{
+ struct tcf_block *block = tc_dev_ingress_block(indr_dev->dev);
+
+ if (block) {
+ indr_dev->flow_block = &block->flow_block;
+ indr_dev->ing_cmd_cb = tc_indr_block_ing_cmd;
+ }
+}
+
static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
struct tcf_block_ext_info *ei,
enum flow_block_command command,
@@ -3172,6 +3204,8 @@ static int __init tc_filter_init(void)
if (err)
goto err_rhash_setup_block_ht;
+ flow_indr_set_default_block_cb(tc_indr_get_default_block);
+
rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
RTNL_FLAG_DOIT_UNLOCKED);
rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next v2 1/3] flow_offload: move tc indirect block to flow offload
From: wenxu @ 2019-07-26 6:17 UTC (permalink / raw)
To: pablo, fw; +Cc: netfilter-devel, netdev
In-Reply-To: <1564121869-3398-1-git-send-email-wenxu@ucloud.cn>
From: wenxu <wenxu@ucloud.cn>
move tc indirect block to flow_offload and rename
it to flow indirect block.The nf_tables can use the
indr block architecture.
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v2: make use of flow_block from Pablo
flow_indr_rhashtable_init advice by jakub.kicinski
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 10 +-
.../net/ethernet/netronome/nfp/flower/offload.c | 10 +-
include/net/flow_offload.h | 41 ++++
include/net/pkt_cls.h | 35 ----
include/net/sch_generic.h | 3 -
net/core/flow_offload.c | 190 +++++++++++++++++
net/sched/cls_api.c | 231 ++-------------------
7 files changed, 261 insertions(+), 259 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 7f747cb..074573b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -785,9 +785,9 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
{
int err;
- err = __tc_indr_block_cb_register(netdev, rpriv,
- mlx5e_rep_indr_setup_tc_cb,
- rpriv);
+ err = __flow_indr_block_cb_register(netdev, rpriv,
+ mlx5e_rep_indr_setup_tc_cb,
+ rpriv);
if (err) {
struct mlx5e_priv *priv = netdev_priv(rpriv->netdev);
@@ -800,8 +800,8 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
static void mlx5e_rep_indr_unregister_block(struct mlx5e_rep_priv *rpriv,
struct net_device *netdev)
{
- __tc_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
- rpriv);
+ __flow_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
+ rpriv);
}
static int mlx5e_nic_rep_netdevice_event(struct notifier_block *nb,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index e209f15..6a0f034 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1479,16 +1479,16 @@ int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
return NOTIFY_OK;
if (event == NETDEV_REGISTER) {
- err = __tc_indr_block_cb_register(netdev, app,
- nfp_flower_indr_setup_tc_cb,
- app);
+ err = __flow_indr_block_cb_register(netdev, app,
+ nfp_flower_indr_setup_tc_cb,
+ app);
if (err)
nfp_flower_cmsg_warn(app,
"Indirect block reg failed - %s\n",
netdev->name);
} else if (event == NETDEV_UNREGISTER) {
- __tc_indr_block_cb_unregister(netdev,
- nfp_flower_indr_setup_tc_cb, app);
+ __flow_indr_block_cb_unregister(netdev,
+ nfp_flower_indr_setup_tc_cb, app);
}
return NOTIFY_OK;
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 00b9aab..b5ef5be 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -4,6 +4,7 @@
#include <linux/kernel.h>
#include <linux/list.h>
#include <net/flow_dissector.h>
+#include <linux/rhashtable.h>
struct flow_match {
struct flow_dissector *dissector;
@@ -366,4 +367,44 @@ static inline void flow_block_init(struct flow_block *flow_block)
INIT_LIST_HEAD(&flow_block->cb_list);
}
+typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
+ enum tc_setup_type type, void *type_data);
+
+struct flow_indr_block_cb {
+ struct list_head list;
+ void *cb_priv;
+ flow_indr_block_bind_cb_t *cb;
+ void *cb_ident;
+};
+
+typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
+ struct flow_block *flow_block,
+ struct flow_indr_block_cb *indr_block_cb,
+ enum flow_block_command command);
+
+struct flow_indr_block_dev {
+ struct rhash_head ht_node;
+ struct net_device *dev;
+ unsigned int refcnt;
+ struct list_head cb_list;
+ flow_indr_block_ing_cmd_t *ing_cmd_cb;
+ struct flow_block *flow_block;
+};
+
+struct flow_indr_block_dev *flow_indr_block_dev_lookup(struct net_device *dev);
+
+int flow_indr_rhashtable_init(void);
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+ flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+ flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
#endif /* _NET_FLOW_OFFLOAD_H */
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809..0790a4e 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -70,15 +70,6 @@ static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
return block->q;
}
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
- tc_indr_block_bind_cb_t *cb, void *cb_ident);
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
- tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void __tc_indr_block_cb_unregister(struct net_device *dev,
- tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void tc_indr_block_cb_unregister(struct net_device *dev,
- tc_indr_block_bind_cb_t *cb, void *cb_ident);
-
int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
struct tcf_result *res, bool compat_mode);
@@ -137,32 +128,6 @@ void tc_setup_cb_block_unregister(struct tcf_block *block, flow_setup_cb_t *cb,
{
}
-static inline
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- return 0;
-}
-
-static inline
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- return 0;
-}
-
-static inline
-void __tc_indr_block_cb_unregister(struct net_device *dev,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
-static inline
-void tc_indr_block_cb_unregister(struct net_device *dev,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
static inline int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
struct tcf_result *res, bool compat_mode)
{
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 6b6b012..d9f359a 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -23,9 +23,6 @@
struct module;
struct bpf_flow_keys;
-typedef int tc_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
- enum tc_setup_type type, void *type_data);
-
struct qdisc_rate_table {
struct tc_ratespec rate;
u32 data[256];
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index d63b970..a6785df 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -2,6 +2,7 @@
#include <linux/kernel.h>
#include <linux/slab.h>
#include <net/flow_offload.h>
+#include <linux/rtnetlink.h>
struct flow_rule *flow_rule_alloc(unsigned int num_actions)
{
@@ -280,3 +281,192 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
}
}
EXPORT_SYMBOL(flow_block_cb_setup_simple);
+
+static struct rhashtable indr_setup_block_ht;
+
+static const struct rhashtable_params flow_indr_setup_block_ht_params = {
+ .key_offset = offsetof(struct flow_indr_block_dev, dev),
+ .head_offset = offsetof(struct flow_indr_block_dev, ht_node),
+ .key_len = sizeof(struct net_device *),
+};
+
+struct flow_indr_block_dev *
+flow_indr_block_dev_lookup(struct net_device *dev)
+{
+ return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
+ flow_indr_setup_block_ht_params);
+}
+EXPORT_SYMBOL(flow_indr_block_dev_lookup);
+
+static struct flow_indr_block_dev *flow_indr_block_dev_get(struct net_device *dev)
+{
+ struct flow_indr_block_dev *indr_dev;
+
+ indr_dev = flow_indr_block_dev_lookup(dev);
+ if (indr_dev)
+ goto inc_ref;
+
+ indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
+ if (!indr_dev)
+ return NULL;
+
+ INIT_LIST_HEAD(&indr_dev->cb_list);
+ indr_dev->dev = dev;
+ if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+ flow_indr_setup_block_ht_params)) {
+ kfree(indr_dev);
+ return NULL;
+ }
+
+inc_ref:
+ indr_dev->refcnt++;
+ return indr_dev;
+}
+
+static void flow_indr_block_dev_put(struct flow_indr_block_dev *indr_dev)
+{
+ if (--indr_dev->refcnt)
+ return;
+
+ rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+ flow_indr_setup_block_ht_params);
+ kfree(indr_dev);
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_lookup(struct flow_indr_block_dev *indr_dev,
+ flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+ struct flow_indr_block_cb *indr_block_cb;
+
+ list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+ if (indr_block_cb->cb == cb &&
+ indr_block_cb->cb_ident == cb_ident)
+ return indr_block_cb;
+ return NULL;
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_add(struct flow_indr_block_dev *indr_dev, void *cb_priv,
+ flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+ struct flow_indr_block_cb *indr_block_cb;
+
+ indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+ if (indr_block_cb)
+ return ERR_PTR(-EEXIST);
+
+ indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
+ if (!indr_block_cb)
+ return ERR_PTR(-ENOMEM);
+
+ indr_block_cb->cb_priv = cb_priv;
+ indr_block_cb->cb = cb;
+ indr_block_cb->cb_ident = cb_ident;
+ list_add(&indr_block_cb->list, &indr_dev->cb_list);
+
+ return indr_block_cb;
+}
+
+static void flow_indr_block_cb_del(struct flow_indr_block_cb *indr_block_cb)
+{
+ list_del(&indr_block_cb->list);
+ kfree(indr_block_cb);
+}
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ flow_indr_block_bind_cb_t *cb,
+ void *cb_ident)
+{
+ struct flow_indr_block_cb *indr_block_cb;
+ struct flow_indr_block_dev *indr_dev;
+ int err;
+
+ indr_dev = flow_indr_block_dev_get(dev);
+ if (!indr_dev)
+ return -ENOMEM;
+
+ indr_block_cb = flow_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
+ err = PTR_ERR_OR_ZERO(indr_block_cb);
+ if (err)
+ goto err_dev_put;
+
+ if (indr_dev->ing_cmd_cb)
+ indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block, indr_block_cb,
+ FLOW_BLOCK_BIND);
+
+ return 0;
+
+err_dev_put:
+ flow_indr_block_dev_put(indr_dev);
+ return err;
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_register);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ flow_indr_block_bind_cb_t *cb,
+ void *cb_ident)
+{
+ int err;
+
+ rtnl_lock();
+ err = __flow_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
+ rtnl_unlock();
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_register);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+ flow_indr_block_bind_cb_t *cb,
+ void *cb_ident)
+{
+ struct flow_indr_block_cb *indr_block_cb;
+ struct flow_indr_block_dev *indr_dev;
+
+ indr_dev = flow_indr_block_dev_lookup(dev);
+ if (!indr_dev)
+ return;
+
+ indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+ if (!indr_block_cb)
+ return;
+
+ /* Send unbind message if required to free any block cbs. */
+ if (indr_dev->ing_cmd_cb)
+ indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
+ indr_block_cb,
+ FLOW_BLOCK_UNBIND);
+
+ flow_indr_block_cb_del(indr_block_cb);
+ flow_indr_block_dev_put(indr_dev);
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_unregister);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+ flow_indr_block_bind_cb_t *cb,
+ void *cb_ident)
+{
+ rtnl_lock();
+ __flow_indr_block_cb_unregister(dev, cb, cb_ident);
+ rtnl_unlock();
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_unregister);
+
+int flow_indr_rhashtable_init(void)
+{
+ static bool rhash_table_init;
+ int err = 0;
+
+ if (rhash_table_init)
+ return 0;
+
+ err = rhashtable_init(&indr_setup_block_ht,
+ &flow_indr_setup_block_ht_params);
+ if (err)
+ return err;
+
+ rhash_table_init = true;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(flow_indr_rhashtable_init);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3565d9a..d370c52 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -37,6 +37,7 @@
#include <net/tc_act/tc_skbedit.h>
#include <net/tc_act/tc_ct.h>
#include <net/tc_act/tc_mpls.h>
+#include <net/flow_offload.h>
extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
@@ -545,235 +546,43 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)
}
}
-static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
-{
- const struct Qdisc_class_ops *cops;
- struct Qdisc *qdisc;
-
- if (!dev_ingress_queue(dev))
- return NULL;
-
- qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
- if (!qdisc)
- return NULL;
-
- cops = qdisc->ops->cl_ops;
- if (!cops)
- return NULL;
-
- if (!cops->tcf_block)
- return NULL;
-
- return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
-}
-
-static struct rhashtable indr_setup_block_ht;
-
-struct tc_indr_block_dev {
- struct rhash_head ht_node;
- struct net_device *dev;
- unsigned int refcnt;
- struct list_head cb_list;
- struct tcf_block *block;
-};
-
-struct tc_indr_block_cb {
- struct list_head list;
- void *cb_priv;
- tc_indr_block_bind_cb_t *cb;
- void *cb_ident;
-};
-
-static const struct rhashtable_params tc_indr_setup_block_ht_params = {
- .key_offset = offsetof(struct tc_indr_block_dev, dev),
- .head_offset = offsetof(struct tc_indr_block_dev, ht_node),
- .key_len = sizeof(struct net_device *),
-};
-
-static struct tc_indr_block_dev *
-tc_indr_block_dev_lookup(struct net_device *dev)
-{
- return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
- tc_indr_setup_block_ht_params);
-}
-
-static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
-{
- struct tc_indr_block_dev *indr_dev;
-
- indr_dev = tc_indr_block_dev_lookup(dev);
- if (indr_dev)
- goto inc_ref;
-
- indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
- if (!indr_dev)
- return NULL;
-
- INIT_LIST_HEAD(&indr_dev->cb_list);
- indr_dev->dev = dev;
- indr_dev->block = tc_dev_ingress_block(dev);
- if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
- tc_indr_setup_block_ht_params)) {
- kfree(indr_dev);
- return NULL;
- }
-
-inc_ref:
- indr_dev->refcnt++;
- return indr_dev;
-}
-
-static void tc_indr_block_dev_put(struct tc_indr_block_dev *indr_dev)
-{
- if (--indr_dev->refcnt)
- return;
-
- rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
- tc_indr_setup_block_ht_params);
- kfree(indr_dev);
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_lookup(struct tc_indr_block_dev *indr_dev,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- struct tc_indr_block_cb *indr_block_cb;
-
- list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
- if (indr_block_cb->cb == cb &&
- indr_block_cb->cb_ident == cb_ident)
- return indr_block_cb;
- return NULL;
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_add(struct tc_indr_block_dev *indr_dev, void *cb_priv,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- struct tc_indr_block_cb *indr_block_cb;
-
- indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
- if (indr_block_cb)
- return ERR_PTR(-EEXIST);
-
- indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
- if (!indr_block_cb)
- return ERR_PTR(-ENOMEM);
-
- indr_block_cb->cb_priv = cb_priv;
- indr_block_cb->cb = cb;
- indr_block_cb->cb_ident = cb_ident;
- list_add(&indr_block_cb->list, &indr_dev->cb_list);
-
- return indr_block_cb;
-}
-
-static void tc_indr_block_cb_del(struct tc_indr_block_cb *indr_block_cb)
-{
- list_del(&indr_block_cb->list);
- kfree(indr_block_cb);
-}
-
static int tcf_block_setup(struct tcf_block *block,
struct flow_block_offload *bo);
-static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
- struct tc_indr_block_cb *indr_block_cb,
+static void tc_indr_block_ing_cmd(struct net_device *dev,
+ struct flow_block *flow_block,
+ struct flow_indr_block_cb *indr_block_cb,
enum flow_block_command command)
{
+ struct tcf_block *block = flow_block ?
+ container_of(flow_block,
+ struct tcf_block,
+ flow_block) : NULL;
struct flow_block_offload bo = {
.command = command,
.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
- .net = dev_net(indr_dev->dev),
- .block_shared = tcf_block_non_null_shared(indr_dev->block),
+ .net = dev_net(dev),
+ .block_shared = tcf_block_non_null_shared(block),
};
INIT_LIST_HEAD(&bo.cb_list);
- if (!indr_dev->block)
- return;
-
- bo.block = &indr_dev->block->flow_block;
-
- indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
- &bo);
- tcf_block_setup(indr_dev->block, &bo);
-}
-
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- struct tc_indr_block_cb *indr_block_cb;
- struct tc_indr_block_dev *indr_dev;
- int err;
-
- indr_dev = tc_indr_block_dev_get(dev);
- if (!indr_dev)
- return -ENOMEM;
-
- indr_block_cb = tc_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
- err = PTR_ERR_OR_ZERO(indr_block_cb);
- if (err)
- goto err_dev_put;
-
- tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_BIND);
- return 0;
-
-err_dev_put:
- tc_indr_block_dev_put(indr_dev);
- return err;
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_register);
-
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- int err;
-
- rtnl_lock();
- err = __tc_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
- rtnl_unlock();
-
- return err;
-}
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_register);
-
-void __tc_indr_block_cb_unregister(struct net_device *dev,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- struct tc_indr_block_cb *indr_block_cb;
- struct tc_indr_block_dev *indr_dev;
-
- indr_dev = tc_indr_block_dev_lookup(dev);
- if (!indr_dev)
+ if (!block)
return;
- indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
- if (!indr_block_cb)
- return;
+ bo.block = flow_block;
- /* Send unbind message if required to free any block cbs. */
- tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_UNBIND);
- tc_indr_block_cb_del(indr_block_cb);
- tc_indr_block_dev_put(indr_dev);
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_unregister);
+ indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK, &bo);
-void tc_indr_block_cb_unregister(struct net_device *dev,
- tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
- rtnl_lock();
- __tc_indr_block_cb_unregister(dev, cb, cb_ident);
- rtnl_unlock();
+ tcf_block_setup(block, &bo);
}
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_unregister);
static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
struct tcf_block_ext_info *ei,
enum flow_block_command command,
struct netlink_ext_ack *extack)
{
- struct tc_indr_block_cb *indr_block_cb;
- struct tc_indr_block_dev *indr_dev;
+ struct flow_indr_block_cb *indr_block_cb;
+ struct flow_indr_block_dev *indr_dev;
struct flow_block_offload bo = {
.command = command,
.binder_type = ei->binder_type,
@@ -784,11 +593,12 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
};
INIT_LIST_HEAD(&bo.cb_list);
- indr_dev = tc_indr_block_dev_lookup(dev);
+ indr_dev = flow_indr_block_dev_lookup(dev);
if (!indr_dev)
return;
- indr_dev->block = command == FLOW_BLOCK_BIND ? block : NULL;
+ indr_dev->flow_block = command == FLOW_BLOCK_BIND ? &block->flow_block : NULL;
+ indr_dev->ing_cmd_cb = command == FLOW_BLOCK_BIND ? tc_indr_block_ing_cmd : NULL;
list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
@@ -3358,8 +3168,7 @@ static int __init tc_filter_init(void)
if (err)
goto err_register_pernet_subsys;
- err = rhashtable_init(&indr_setup_block_ht,
- &tc_indr_setup_block_ht_params);
+ err = flow_indr_rhashtable_init();
if (err)
goto err_rhash_setup_block_ht;
--
1.8.3.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox