* Re: [patch net-next v2 0/9] net: sched: introduce chain templates support with offloading to mlxsw
From: Cong Wang @ 2018-06-28 22:25 UTC (permalink / raw)
To: Jiri Pirko
Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
Jakub Kicinski, Simon Horman, john.hurley, David Ahern, mlxsw,
sridhar.samudrala
In-Reply-To: <20180628130907.951-1-jiri@resnulli.us>
On Thu, Jun 28, 2018 at 6:10 AM Jiri Pirko <jiri@resnulli.us> wrote:
> Add a template of type flower that allows inserting rules matching on the
> last 2 bytes of the destination MAC address:
> # tc chaintemplate add dev dummy0 ingress proto ip flower dst_mac 00:00:00:00:00:00/00:00:00:00:FF:FF
>
> The template is now shown in the list:
> # tc chaintemplate show dev dummy0 ingress
> chaintemplate flower chain 0
> dst_mac 00:00:00:00:00:00/00:00:00:00:ff:ff
> eth_type ipv4
>
> Add another template, this time for chain number 22:
> # tc chaintemplate add dev dummy0 ingress proto ip chain 22 flower dst_ip 0.0.0.0/16
> # tc chaintemplate show dev dummy0 ingress
> chaintemplate flower chain 0
> dst_mac 00:00:00:00:00:00/00:00:00:00:ff:ff
> eth_type ipv4
> chaintemplate flower chain 22
> eth_type ipv4
> dst_ip 0.0.0.0/16
So, if I want to check the template of a chain, I have to use
'tc chaintemplate... chain X'.
If I want to check the filters in a chain, I have to use
'tc filter show .... chain X'.
If you introduce 'tc chain', it would need just one command,
`tc chain show ... X`, which could list the chain's template first,
followed by the filters in this chain, something like:
# tc chain show dev eth0 chain X
template: # could be none
....
filter1
...
filter2
...
Isn't it more elegant?
^ permalink raw reply
* Re: [PATCH bpf-net 05/14] bpf: extend bpf_prog_array to store pointers to the cgroup storage
From: kbuild test robot @ 2018-06-28 22:21 UTC (permalink / raw)
To: Roman Gushchin
Cc: kbuild-all, netdev, kernel-team, tj, Roman Gushchin,
Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <20180628163458.27193-6-guro@fb.com>
Hi Roman,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on bpf-next/master]
[also build test WARNING on v4.18-rc2]
[cannot apply to next-20180628]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Roman-Gushchin/bpf-cgroup-local-storage/20180629-035104
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__
sparse warnings: (new ones prefixed by >>)
include/linux/filter.h:632:16: sparse: expression using sizeof(void)
kernel/bpf/core.c:603:16: sparse: expression using sizeof(void)
kernel/bpf/core.c:1572:31: sparse: incorrect type in return expression (different address spaces) @@ expected struct bpf_prog_array [noderef] <asn:4>* @@ got sn:4>* @@
kernel/bpf/core.c:1572:31: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1572:31: got void *
kernel/bpf/core.c:1577:17: sparse: incorrect type in return expression (different address spaces) @@ expected struct bpf_prog_array [noderef] <asn:4>* @@ got rray [noderef] <asn:4>* @@
kernel/bpf/core.c:1577:17: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1577:17: got struct bpf_prog_array *<noident>
kernel/bpf/core.c:1585:9: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct callback_head *head @@ got struct callback_hstruct callback_head *head @@
kernel/bpf/core.c:1585:9: expected struct callback_head *head
kernel/bpf/core.c:1585:9: got struct callback_head [noderef] <asn:4>*<noident>
>> kernel/bpf/core.c:1610:16: sparse: incompatible types in comparison expression (different address spaces)
include/linux/slab.h:631:13: sparse: undefined identifier '__builtin_mul_overflow'
>> kernel/bpf/core.c:1659:44: sparse: incorrect type in initializer (different address spaces) @@ expected struct bpf_prog_array_item *item @@ got struct bpf_prog_astruct bpf_prog_array_item *item @@
>> kernel/bpf/core.c:1683:26: sparse: incorrect type in assignment (different address spaces) @@ expected struct bpf_prog_array_item *existing @@ got struct bpf_prog_astruct bpf_prog_array_item *existing @@
kernel/bpf/core.c:1711:15: sparse: incorrect type in assignment (different address spaces) @@ expected struct bpf_prog_array *array @@ got struct bpf_prog_astruct bpf_prog_array *array @@
>> kernel/bpf/core.c:1717:26: sparse: incorrect type in assignment (different address spaces) @@ expected struct bpf_prog_array_item *[assigned] existing @@ got structstruct bpf_prog_array_item *[assigned] existing @@
>> kernel/bpf/core.c:1748:41: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct bpf_prog_array *[addressable] array @@ got strstruct bpf_prog_array *[addressable] array @@
include/trace/events/xdp.h:28:1: sparse: Using plain integer as NULL pointer
include/trace/events/xdp.h:53:1: sparse: Using plain integer as NULL pointer
include/trace/events/xdp.h:111:1: sparse: Using plain integer as NULL pointer
include/trace/events/xdp.h:126:1: sparse: Using plain integer as NULL pointer
include/trace/events/xdp.h:162:1: sparse: Using plain integer as NULL pointer
include/trace/events/xdp.h:197:1: sparse: Using plain integer as NULL pointer
include/trace/events/xdp.h:232:1: sparse: Using plain integer as NULL pointer
kernel/bpf/core.c:950:18: sparse: Initializer entry defined twice
include/linux/slab.h:631:13: sparse: call with no type!
vim +1610 kernel/bpf/core.c
1568
1569 struct bpf_prog_array __rcu *bpf_prog_array_alloc(u32 prog_cnt, gfp_t flags)
1570 {
1571 if (prog_cnt)
> 1572 return kzalloc(sizeof(struct bpf_prog_array) +
1573 sizeof(struct bpf_prog_array_item) *
1574 (prog_cnt + 1),
1575 flags);
1576
> 1577 return &empty_prog_array.hdr;
1578 }
1579
1580 void bpf_prog_array_free(struct bpf_prog_array __rcu *progs)
1581 {
1582 if (!progs ||
1583 progs == (struct bpf_prog_array __rcu *)&empty_prog_array.hdr)
1584 return;
> 1585 kfree_rcu(progs, rcu);
1586 }
1587
1588 int bpf_prog_array_length(struct bpf_prog_array __rcu *array)
1589 {
1590 struct bpf_prog_array_item *item;
1591 u32 cnt = 0;
1592
1593 rcu_read_lock();
1594 item = rcu_dereference(array)->items;
1595 for (; item->prog; item++)
1596 if (item->prog != &dummy_bpf_prog.prog)
1597 cnt++;
1598 rcu_read_unlock();
1599 return cnt;
1600 }
1601
1602
1603 static bool bpf_prog_array_copy_core(struct bpf_prog_array *array,
1604 u32 *prog_ids,
1605 u32 request_cnt)
1606 {
1607 struct bpf_prog_array_item *item;
1608 int i = 0;
1609
> 1610 item = rcu_dereference(array)->items;
1611 for (; item->prog; item++) {
1612 if (item->prog == &dummy_bpf_prog.prog)
1613 continue;
1614 prog_ids[i] = item->prog->aux->id;
1615 if (++i == request_cnt) {
1616 item++;
1617 break;
1618 }
1619 }
1620
1621 return !!(item->prog);
1622 }
1623
1624 int bpf_prog_array_copy_to_user(struct bpf_prog_array __rcu *array,
1625 __u32 __user *prog_ids, u32 cnt)
1626 {
1627 unsigned long err = 0;
1628 bool nospc;
1629 u32 *ids;
1630
1631 /* users of this function are doing:
1632 * cnt = bpf_prog_array_length();
1633 * if (cnt > 0)
1634 * bpf_prog_array_copy_to_user(..., cnt);
1635 * so below kcalloc doesn't need extra cnt > 0 check, but
1636 * bpf_prog_array_length() releases rcu lock and
1637 * prog array could have been swapped with empty or larger array,
1638 * so always copy 'cnt' prog_ids to the user.
1639 * In a rare race the user will see zero prog_ids
1640 */
1641 ids = kcalloc(cnt, sizeof(u32), GFP_USER | __GFP_NOWARN);
1642 if (!ids)
1643 return -ENOMEM;
1644 rcu_read_lock();
1645 nospc = bpf_prog_array_copy_core(array, ids, cnt);
1646 rcu_read_unlock();
1647 err = copy_to_user(prog_ids, ids, cnt * sizeof(u32));
1648 kfree(ids);
1649 if (err)
1650 return -EFAULT;
1651 if (nospc)
1652 return -ENOSPC;
1653 return 0;
1654 }
1655
1656 void bpf_prog_array_delete_safe(struct bpf_prog_array __rcu *array,
1657 struct bpf_prog *old_prog)
1658 {
> 1659 struct bpf_prog_array_item *item = array->items;
1660
1661 for (; item->prog; item++)
1662 if (item->prog == old_prog) {
1663 WRITE_ONCE(item->prog, &dummy_bpf_prog.prog);
1664 break;
1665 }
1666 }
1667
1668 int bpf_prog_array_copy(struct bpf_prog_array __rcu *old_array,
1669 struct bpf_prog *exclude_prog,
1670 struct bpf_prog *include_prog,
1671 struct bpf_prog_array **new_array)
1672 {
1673 int new_prog_cnt, carry_prog_cnt = 0;
1674 struct bpf_prog_array_item *existing;
1675 struct bpf_prog_array *array;
1676 bool found_exclude = false;
1677 int new_prog_idx = 0;
1678
1679 /* Figure out how many existing progs we need to carry over to
1680 * the new array.
1681 */
1682 if (old_array) {
> 1683 existing = old_array->items;
1684 for (; existing->prog; existing++) {
1685 if (existing->prog == exclude_prog) {
1686 found_exclude = true;
1687 continue;
1688 }
1689 if (existing->prog != &dummy_bpf_prog.prog)
1690 carry_prog_cnt++;
1691 if (existing->prog == include_prog)
1692 return -EEXIST;
1693 }
1694 }
1695
1696 if (exclude_prog && !found_exclude)
1697 return -ENOENT;
1698
1699 /* How many progs (not NULL) will be in the new array? */
1700 new_prog_cnt = carry_prog_cnt;
1701 if (include_prog)
1702 new_prog_cnt += 1;
1703
1704 /* Do we have any prog (not NULL) in the new array? */
1705 if (!new_prog_cnt) {
1706 *new_array = NULL;
1707 return 0;
1708 }
1709
1710 /* +1 as the end of prog_array is marked with NULL */
1711 array = bpf_prog_array_alloc(new_prog_cnt + 1, GFP_KERNEL);
1712 if (!array)
1713 return -ENOMEM;
1714
1715 /* Fill in the new prog array */
1716 if (carry_prog_cnt) {
> 1717 existing = old_array->items;
1718 for (; existing->prog; existing++)
1719 if (existing->prog != exclude_prog &&
1720 existing->prog != &dummy_bpf_prog.prog) {
1721 array->items[new_prog_idx++].prog =
1722 existing->prog;
1723 }
1724 }
1725 if (include_prog)
1726 array->items[new_prog_idx++].prog = include_prog;
1727 array->items[new_prog_idx].prog = NULL;
1728 *new_array = array;
1729 return 0;
1730 }
1731
1732 int bpf_prog_array_copy_info(struct bpf_prog_array __rcu *array,
1733 u32 *prog_ids, u32 request_cnt,
1734 u32 *prog_cnt)
1735 {
1736 u32 cnt = 0;
1737
1738 if (array)
1739 cnt = bpf_prog_array_length(array);
1740
1741 *prog_cnt = cnt;
1742
1743 /* return early if user requested only program count or nothing to copy */
1744 if (!request_cnt || !cnt)
1745 return 0;
1746
1747 /* this function is called under trace/bpf_trace.c: bpf_event_mutex */
> 1748 return bpf_prog_array_copy_core(array, prog_ids, request_cnt) ? -ENOSPC
1749 : 0;
1750 }
1751
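The address-space warnings above come from mixing __rcu-annotated and plain
pointers. Below is a minimal userspace sketch of one consistent pattern:
convert the annotated pointer once with rcu_dereference(), then walk the
items through the plain pointer. It is not the fix the patch author applied.
The __rcu and rcu_dereference() definitions are no-op stand-ins, and the
struct and function names (prog, prog_array, prog_array_length, dummy_prog)
are simplified, hypothetical versions of the kernel ones.

```c
#include <stddef.h>

/* Userspace stand-ins (assumptions): in the kernel, __rcu is a sparse
 * address-space annotation and rcu_dereference() also orders the load. */
#define __rcu
#define rcu_dereference(p) (p)

struct prog { int id; };
struct prog_array_item { struct prog *prog; };
struct prog_array { struct prog_array_item items[8]; };

/* Placeholder entry that a "delete_safe" style helper leaves behind. */
static struct prog dummy_prog;

/* Takes an __rcu pointer, converts it once, and only dereferences the
 * resulting plain pointer. Counts entries up to the NULL terminator,
 * skipping dummy placeholders. */
static unsigned int prog_array_length(struct prog_array __rcu *array)
{
	struct prog_array *p = rcu_dereference(array);
	struct prog_array_item *item;
	unsigned int cnt = 0;

	for (item = p->items; item->prog; item++)
		if (item->prog != &dummy_prog)
			cnt++;
	return cnt;
}
```

In kernel code the same shape would additionally need rcu_read_lock() or a
lock that protects the update side; that part is left out of this sketch.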
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
^ permalink raw reply
* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Al Viro @ 2018-06-28 22:20 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <20180628213027.GK30522@ZenIV.linux.org.uk>
On Thu, Jun 28, 2018 at 10:30:27PM +0100, Al Viro wrote:
> I'm not saying that blocking on other things is a bug; some of such *are* bogus,
> but a lot aren't really broken. What I said is that in a lot of cases we really
> have hard "no blocking other than in callback" (and on subsequent passes there's
> no callback at all). Which is just about perfect for AIO purposes, so *IF* we
> go for "new method just for AIO, those who don't have it can take a hike", we might
> as well indicate that "can take a hike" in some way (be it opt-in or opt-out) and
> use straight unchanged ->poll(), with alternative callback.
PS: one way of doing that would be to steal a flag from pt->_key and have ->poll()
instances do an equivalent of
	if (flags & LOOKUP_RCU)
		return -ECHILD;
we have in a lot of ->d_revalidate() instances for "need to block" case. Only
here they would've returned EPOLLNVAL.
Most of the ->poll() instances wouldn't care at all - they do not block unless
the callback does (and in this case it wouldn't have). Normal poll(2)/select(2)
are completely unaffected. And AIO would just have that bit set in its
poll_table_struct.
The rules for drivers change only in one respect - if your ->poll() is going to
need to block, check poll_requested_events(pt) & EPOLL_ATOMIC and return EPOLLNVAL
in such case.
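
A rough userspace model of that rule, to make it concrete. Only the names
EPOLL_ATOMIC, EPOLLNVAL, poll_requested_events() and _key come from the text
above. The value chosen for EPOLL_ATOMIC, the mock_poll_table and mock_dev
types, and mock_poll() itself are assumptions for illustration and not
existing kernel API.

```c
#include <stdbool.h>

#define EPOLLIN      0x001u
#define EPOLLNVAL    0x020u
#define EPOLL_ATOMIC 0x80000000u	/* hypothetical bit stolen from _key */

/* Mock of the poll table: only the requested-events key is modelled. */
struct mock_poll_table { unsigned int _key; };

static unsigned int poll_requested_events(const struct mock_poll_table *pt)
{
	return pt->_key;
}

/* Hypothetical driver state: the first poll needs a step that may sleep. */
struct mock_dev {
	bool needs_blocking_setup;
	bool readable;
};

static unsigned int mock_poll(struct mock_dev *dev, struct mock_poll_table *pt)
{
	if (dev->needs_blocking_setup) {
		/* Caller (e.g. AIO) cannot tolerate blocking: bail out. */
		if (poll_requested_events(pt) & EPOLL_ATOMIC)
			return EPOLLNVAL;
		/* Stands in for the operation that may sleep. */
		dev->needs_blocking_setup = false;
	}
	return dev->readable ? EPOLLIN : 0;
}
```

A ->poll() instance that never blocks would not need the check at all, which
matches the point that most instances would not care.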
^ permalink raw reply
* Re: [PATCH 0/6] offload Linux LAG devices to the TC datapath
From: Or Gerlitz @ 2018-06-28 22:19 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Or Gerlitz, John Hurley, Jiri Pirko, Linux Netdev List,
ASAP_Direct_Dev, Simon Horman, Andy Gospodarek
In-Reply-To: <20180627210243.154f05e0@cakuba.netronome.com>
On Thu, Jun 28, 2018 at 7:02 AM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
[...]
> } else if (netif_is_lag_master(out_dev) &&
> priv->flower_ext_feats & NFP_FL_FEATS_LAG) {
> int gid;
>
> output->flags = cpu_to_be16(tmp_flags);
> gid = nfp_flower_lag_get_output_id(app, out_dev);
> if (gid < 0)
> return gid;
> output->port = cpu_to_be32(NFP_FL_LAG_OUT | gid);
got it how you do that, cool for you
^ permalink raw reply
* linux-next: manual merge of the net tree with Linus' tree
From: Stephen Rothwell @ 2018-06-28 22:14 UTC (permalink / raw)
To: David Miller, Networking, Linus Torvalds
Cc: Linux-Next Mailing List, Linux Kernel Mailing List, Ursula Braun
[-- Attachment #1: Type: text/plain, Size: 911 bytes --]
Hi all,
Today's linux-next merge of the net tree got a conflict in:
net/smc/af_smc.c
between commit:
a11e1d432b51 ("Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL")
from Linus' tree and commit:
24ac3a08e658 ("net/smc: rebuild nonblocking connect")
from the net tree.
I did the obvious syntactic fix up but, given that the commit that the
net tree commit is fixing is being reverted, I wonder if the net tree
commit is needed (or correct?) any more.
I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.
--
Cheers,
Stephen Rothwell
^ permalink raw reply
* Re: [PATCH v3 bpf-net] bpf: Change bpf_fib_lookup to return lookup status
From: Daniel Borkmann @ 2018-06-28 22:08 UTC (permalink / raw)
To: dsahern, netdev, borkmann, ast; +Cc: kafai, brouer, David Ahern
In-Reply-To: <20180626232118.14421-1-dsahern@kernel.org>
On 06/27/2018 01:21 AM, dsahern@kernel.org wrote:
> From: David Ahern <dsahern@gmail.com>
>
> For ACLs implemented using either FIB rules or FIB entries, the BPF
> program needs the FIB lookup status to be able to drop the packet.
> Since the bpf_fib_lookup API has not reached a released kernel yet,
> change the return code to contain an encoding of the FIB lookup
> result and return the nexthop device index in the params struct.
>
> In addition, inform the BPF program of any post FIB lookup reason as
> to why the packet needs to go up the stack.
>
> The fib result for unicast routes must have an egress device, so remove
> the check that it is non-NULL.
>
> Signed-off-by: David Ahern <dsahern@gmail.com>
Applied to bpf, thanks David!
^ permalink raw reply
* Re: [PATCH] test_bpf: flag tests that cannot be jited on s390
From: Daniel Borkmann @ 2018-06-28 22:01 UTC (permalink / raw)
To: Kleber Sacilotto de Souza, linux-s390, netdev; +Cc: Alexei Starovoitov
In-Reply-To: <20180627151921.15018-1-kleber.souza@canonical.com>
On 06/27/2018 05:19 PM, Kleber Sacilotto de Souza wrote:
> Flag with FLAG_EXPECTED_FAIL the BPF_MAXINSNS tests that cannot be jited
> on s390 because they exceed BPF_SIZE_MAX and fail when
> CONFIG_BPF_JIT_ALWAYS_ON is set. Also set .expected_errcode to -ENOTSUPP
> so the tests pass in that case.
>
> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Applied to bpf, thanks Kleber!
^ permalink raw reply
* Re: [bpf-next PATCH 0/2] xdp/bpf: extend XDP samples/bpf xdp_rxq_info
From: Daniel Borkmann @ 2018-06-28 21:54 UTC (permalink / raw)
To: Jesper Dangaard Brouer, netdev
Cc: Daniel Borkmann, Toke Høiland-Jørgensen,
Alexei Starovoitov
In-Reply-To: <152993682254.8835.8864318933370018087.stgit@firesoul>
On 06/25/2018 04:27 PM, Jesper Dangaard Brouer wrote:
> While writing an article about XDP, the samples/bpf xdp_rxq_info
> program was extended to cover some more use-cases.
Applied to bpf-next, thanks guys!
^ permalink raw reply
* Re: [PATCH v2 net-next 0/2] net: preserve sock reference when scrubbing the skb.
From: Cong Wang @ 2018-06-28 21:53 UTC (permalink / raw)
To: Flavio Leitner
Cc: Linux Kernel Network Developers, Eric Dumazet, Paolo Abeni,
David Miller, Florian Westphal, NetFilter
In-Reply-To: <CAM_iQpXnfc8uA10EK2X7B=vB_uWayeGw=M5F-_uygu7aPvCjRw@mail.gmail.com>
On Wed, Jun 27, 2018 at 12:39 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> Let me rephrase why I don't like this patchset:
>
> 1. Let's forget about TSQ for a moment, skb_orphan() before leaving
> the stack is not just reasonable but also aligning to network isolation
> design. You can't claim skb_orphan() is broken from beginning, it is
> designed in this way and it is intentional.
>
> 2. Now, let's consider the current TSQ behavior (without any patch):
>
> 2a. For packets leaving the host or just leaving the stack to another
> netns, there is no difference, and this should be expected from user's
> point of view, because I don't need to think about its destination to
> decide how I should configure tcp_limit_output_bytes.
>
> 2b. The hidden pipeline behind TSQ is well defined, that is, any
> queues in between L4 and L2, most importantly qdisc. I can easily
> predict the number of queues my packets will go through with a
> given configuration. This also aligns with 2a.
>
> 2c. Isolation is respected as it should. TCP sockets in this netns
> won't be influenced by any factor in another netns.
>
> Now with your patchset:
>
> 2a. There is an apparent difference for packets leaving the host
> and for packets just leaving this stack.
>
> 2b. You extend the pipeline to another netns's L3, which means
> the number of queues is now unpredictable.
>
> 2c. Isolation is now slightly broken, the other netns could influence
> the source netns.
>
> I don't see you have any good argument on any of these 3 points.
No one finishes reading this.
I will send a revert with a quote of the above.
^ permalink raw reply
* [net-next 12/12] net/mlx5e: Update NIC HW stats on demand only
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
Disable periodic stats update background thread and update stats in
background on demand when ndo_get_stats is called.
Having a background thread running in the driver all the time is bad for
power consumption and normally a user space daemon will query the stats
once every specific interval, so ideally the background thread and its
interval can be done in user space.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
---
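Note: a small userspace model of the on-demand behaviour described above,
for readers who want to see the effect on what a query returns. It is a
sketch under assumptions, not driver code. The names (hw, nic,
nic_get_stats, nic_run_pending_work) are hypothetical, and a boolean flag
stands in for queue_delayed_work(..., 0).

```c
#include <stdbool.h>
#include <stdint.h>

struct hw { uint64_t rx_packets; };	/* stands in for device counters */

struct nic {
	struct hw *hw;
	uint64_t cached_rx_packets;	/* last snapshot read from hw */
	bool refresh_queued;		/* models the queued work item */
};

/* Returns the cached snapshot and queues a refresh for the next query. */
static void nic_get_stats(struct nic *n, uint64_t *rx_packets)
{
	n->refresh_queued = true;
	*rx_packets = n->cached_rx_packets;
}

/* Models the workqueue running the queued update, if any. */
static void nic_run_pending_work(struct nic *n)
{
	if (!n->refresh_queued)
		return;
	n->refresh_queued = false;
	n->cached_rx_packets = n->hw->rx_packets;
}
```

In this model a reader always sees the snapshot taken after the previous
query, so a daemon polling at a fixed interval gets values that are one
interval old.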
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 -
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++-----
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 3 +++
3 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index eb9eb7aa953a..e2b7586ed7a0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -137,7 +137,6 @@ struct page_pool;
#define MLX5E_MAX_NUM_CHANNELS (MLX5E_INDIR_RQT_SIZE >> 1)
#define MLX5E_MAX_NUM_SQS (MLX5E_MAX_NUM_CHANNELS * MLX5E_MAX_NUM_TC)
#define MLX5E_TX_CQ_POLL_BUDGET 128
-#define MLX5E_UPDATE_STATS_INTERVAL 200 /* msecs */
#define MLX5E_SQ_RECOVER_MIN_INTERVAL 500 /* msecs */
#define MLX5E_UMR_WQE_INLINE_SZ \
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 42ef8c818544..ba9ae05efe09 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -270,12 +270,9 @@ void mlx5e_update_stats_work(struct work_struct *work)
struct delayed_work *dwork = to_delayed_work(work);
struct mlx5e_priv *priv = container_of(dwork, struct mlx5e_priv,
update_stats_work);
+
mutex_lock(&priv->state_lock);
- if (test_bit(MLX5E_STATE_OPENED, &priv->state)) {
- priv->profile->update_stats(priv);
- queue_delayed_work(priv->wq, dwork,
- msecs_to_jiffies(MLX5E_UPDATE_STATS_INTERVAL));
- }
+ priv->profile->update_stats(priv);
mutex_unlock(&priv->state_lock);
}
@@ -3405,6 +3402,9 @@ mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
struct mlx5e_vport_stats *vstats = &priv->stats.vport;
struct mlx5e_pport_stats *pstats = &priv->stats.pport;
+ /* update HW stats in background for next time */
+ queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
+
if (mlx5e_is_uplink_rep(priv)) {
stats->rx_packets = PPORT_802_3_GET(pstats, a_frames_received_ok);
stats->rx_bytes = PPORT_802_3_GET(pstats, a_octets_received_ok);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 3f2fe95e01d9..7db7552b7e3f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -893,6 +893,9 @@ mlx5e_rep_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
{
struct mlx5e_priv *priv = netdev_priv(dev);
+ /* update HW stats in background for next time */
+ queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
+
memcpy(stats, &priv->stats.vf_vport, sizeof(*stats));
}
--
2.17.0
^ permalink raw reply related
* [net-next 10/12] net/mlx5e: Add counter for MPWQE filler strides
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Add ethtool counter to indicate the number of strides consumed
by filler CQEs.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 5 ++++-
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 9 ++++++---
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 6 ++++--
3 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 6f20ce76c11c..f763a6aebc2d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1263,7 +1263,10 @@ void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
}
if (unlikely(mpwrq_is_filler_cqe(cqe))) {
- rq->stats->mpwqe_filler++;
+ struct mlx5e_rq_stats *stats = rq->stats;
+
+ stats->mpwqe_filler_cqes++;
+ stats->mpwqe_filler_strides += cstrides;
goto mpwrq_cqe_out;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 0c18b20c2c18..76107ed3a651 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -73,7 +73,8 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_udp_seg_rem) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_wqe_err) },
- { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler_cqes) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler_strides) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_buff_alloc_err) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_blks) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_pkts) },
@@ -144,7 +145,8 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->rx_xdp_tx_cqe += rq_stats->xdp_tx_cqe;
s->rx_xdp_tx_full += rq_stats->xdp_tx_full;
s->rx_wqe_err += rq_stats->wqe_err;
- s->rx_mpwqe_filler += rq_stats->mpwqe_filler;
+ s->rx_mpwqe_filler_cqes += rq_stats->mpwqe_filler_cqes;
+ s->rx_mpwqe_filler_strides += rq_stats->mpwqe_filler_strides;
s->rx_buff_alloc_err += rq_stats->buff_alloc_err;
s->rx_cqe_compress_blks += rq_stats->cqe_compress_blks;
s->rx_cqe_compress_pkts += rq_stats->cqe_compress_pkts;
@@ -1129,7 +1131,8 @@ static const struct counter_desc rq_stats_desc[] = {
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_bytes) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, removed_vlan_packets) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, wqe_err) },
- { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler_cqes) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler_strides) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, buff_alloc_err) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_blks) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_pkts) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 70a05298851e..1d641b012afa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -84,7 +84,8 @@ struct mlx5e_sw_stats {
u64 tx_udp_seg_rem;
u64 tx_cqe_err;
u64 rx_wqe_err;
- u64 rx_mpwqe_filler;
+ u64 rx_mpwqe_filler_cqes;
+ u64 rx_mpwqe_filler_strides;
u64 rx_buff_alloc_err;
u64 rx_cqe_compress_blks;
u64 rx_cqe_compress_pkts;
@@ -180,7 +181,8 @@ struct mlx5e_rq_stats {
u64 xdp_tx_cqe;
u64 xdp_tx_full;
u64 wqe_err;
- u64 mpwqe_filler;
+ u64 mpwqe_filler_cqes;
+ u64 mpwqe_filler_strides;
u64 buff_alloc_err;
u64 cqe_compress_blks;
u64 cqe_compress_pkts;
--
2.17.0
^ permalink raw reply related
* [net-next 11/12] net/mlx5e: Add counter for total num of NOP operations
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
A per-ring counter for NOP operations already exists.
Here I add a counter that sums them up.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 2 ++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 76107ed3a651..c0507fada0be 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -44,6 +44,7 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tso_inner_packets) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tso_inner_bytes) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_added_vlan_packets) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_nop) },
#ifdef CONFIG_MLX5_EN_TLS
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_tls_ooo) },
@@ -173,6 +174,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->tx_tso_inner_packets += sq_stats->tso_inner_packets;
s->tx_tso_inner_bytes += sq_stats->tso_inner_bytes;
s->tx_added_vlan_packets += sq_stats->added_vlan_packets;
+ s->tx_nop += sq_stats->nop;
s->tx_queue_stopped += sq_stats->stopped;
s->tx_queue_wake += sq_stats->wake;
s->tx_udp_seg_rem += sq_stats->udp_seg_rem;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 1d641b012afa..fc3f66003edd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -61,6 +61,7 @@ struct mlx5e_sw_stats {
u64 tx_tso_inner_packets;
u64 tx_tso_inner_bytes;
u64 tx_added_vlan_packets;
+ u64 tx_nop;
u64 rx_lro_packets;
u64 rx_lro_bytes;
u64 rx_removed_vlan_packets;
--
2.17.0
^ permalink raw reply related
* [net-next 09/12] net/mlx5e: Add channel events counter
From: Saeed Mahameed @ 2018-06-28 21:51 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Add per-channel and global ethtool counters for channel events.
Each event indicates an interrupt on one of the channel's
completion queues.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 ++
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 3 ++-
3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index dd3b5a028a97..0c18b20c2c18 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -84,6 +84,7 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_busy) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_waive) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_congst_umr) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_events) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_poll) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_arm) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_aff_change) },
@@ -154,6 +155,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->rx_cache_busy += rq_stats->cache_busy;
s->rx_cache_waive += rq_stats->cache_waive;
s->rx_congst_umr += rq_stats->congst_umr;
+ s->ch_events += ch_stats->events;
s->ch_poll += ch_stats->poll;
s->ch_arm += ch_stats->arm;
s->ch_aff_change += ch_stats->aff_change;
@@ -1162,6 +1164,7 @@ static const struct counter_desc sq_stats_desc[] = {
};
static const struct counter_desc ch_stats_desc[] = {
+ { MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, events) },
{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, poll) },
{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, arm) },
{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, aff_change) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 4e54cb86fece..70a05298851e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -95,6 +95,7 @@ struct mlx5e_sw_stats {
u64 rx_cache_busy;
u64 rx_cache_waive;
u64 rx_congst_umr;
+ u64 ch_events;
u64 ch_poll;
u64 ch_arm;
u64 ch_aff_change;
@@ -222,6 +223,7 @@ struct mlx5e_sq_stats {
};
struct mlx5e_ch_stats {
+ u64 events;
u64 poll;
u64 arm;
u64 aff_change;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 9f6e97883cbc..4e1f99a98d5d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -124,8 +124,9 @@ void mlx5e_completion_event(struct mlx5_core_cq *mcq)
{
struct mlx5e_cq *cq = container_of(mcq, struct mlx5e_cq, mcq);
- cq->event_ctr++;
napi_schedule(cq->napi);
+ cq->event_ctr++;
+ cq->channel->stats->events++;
}
void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event)
--
2.17.0
^ permalink raw reply related
* [net-next 07/12] net/mlx5e: Add NAPI statistics
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Add per-channel and global ethtool counters for NAPI.
This helps us monitor and analyze performance in general.
- ch[i]_poll:
the number of times the channel's NAPI poll was invoked.
- ch[i]_arm:
the number of times the channel's NAPI poll completed
and armed the completion queues.
- ch[i]_aff_change:
the number of times the channel's NAPI poll explicitly
stopped execution on a CPU due to a change in affinity.
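To make the three counters concrete, here is a minimal user-space sketch of where each one is bumped within a single poll invocation. It follows the control flow of mlx5e_napi_poll() in the diff below, but struct ch_stats_model, poll_model() and its boolean parameters are hypothetical stand-ins for illustration, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mlx5e_ch_stats; not the driver's type. */
struct ch_stats_model {
	uint64_t poll;
	uint64_t arm;
	uint64_t aff_change;
};

/*
 * Model of one poll invocation.
 *   busy        - some queue still has work pending
 *   affinity_ok - the channel still runs on its intended CPU
 *   completed   - stands in for napi_complete_done() succeeding
 * Returns the amount of work reported back to the NAPI core.
 */
static int poll_model(struct ch_stats_model *st, int budget, int work_done,
		      bool busy, bool affinity_ok, bool completed)
{
	assert(st);
	st->poll++;			/* every invocation is counted */

	if (busy) {
		if (affinity_ok)
			return budget;	/* keep polling, nothing is armed */
		st->aff_change++;	/* stop explicitly so NAPI can move */
		if (budget && work_done == budget)
			work_done--;
	}

	if (!completed)
		return work_done;

	st->arm++;			/* poll completed, CQs get re-armed */
	return work_done;
}
```

Under this model, ch[i]_poll minus ch[i]_arm gives the number of invocations that ended while still busy (or that could not complete), which is the kind of ratio these counters are meant to expose.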
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 9 +++++++++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 6 ++++++
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 6 ++++++
3 files changed, 21 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 98aefa6eb266..ec7784189dc2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -83,6 +83,9 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_empty) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_busy) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_waive) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_poll) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_arm) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_aff_change) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_eq_rearm) },
};
@@ -149,6 +152,9 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->rx_cache_empty += rq_stats->cache_empty;
s->rx_cache_busy += rq_stats->cache_busy;
s->rx_cache_waive += rq_stats->cache_waive;
+ s->ch_poll += ch_stats->poll;
+ s->ch_arm += ch_stats->arm;
+ s->ch_aff_change += ch_stats->aff_change;
s->ch_eq_rearm += ch_stats->eq_rearm;
for (j = 0; j < priv->max_opened_tc; j++) {
@@ -1153,6 +1159,9 @@ static const struct counter_desc sq_stats_desc[] = {
};
static const struct counter_desc ch_stats_desc[] = {
+ { MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, poll) },
+ { MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, arm) },
+ { MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, aff_change) },
{ MLX5E_DECLARE_CH_STAT(struct mlx5e_ch_stats, eq_rearm) },
};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index b598a21bb4d6..0cd08b9f46ff 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -94,6 +94,9 @@ struct mlx5e_sw_stats {
u64 rx_cache_empty;
u64 rx_cache_busy;
u64 rx_cache_waive;
+ u64 ch_poll;
+ u64 ch_arm;
+ u64 ch_aff_change;
u64 ch_eq_rearm;
#ifdef CONFIG_MLX5_EN_TLS
@@ -217,6 +220,9 @@ struct mlx5e_sq_stats {
};
struct mlx5e_ch_stats {
+ u64 poll;
+ u64 arm;
+ u64 aff_change;
u64 eq_rearm;
};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 1b17f682693b..9f6e97883cbc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -74,10 +74,13 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
{
struct mlx5e_channel *c = container_of(napi, struct mlx5e_channel,
napi);
+ struct mlx5e_ch_stats *ch_stats = c->stats;
bool busy = false;
int work_done = 0;
int i;
+ ch_stats->poll++;
+
for (i = 0; i < c->num_tc; i++)
busy |= mlx5e_poll_tx_cq(&c->sq[i].cq, budget);
@@ -94,6 +97,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
if (busy) {
if (likely(mlx5e_channel_no_affinity_change(c)))
return budget;
+ ch_stats->aff_change++;
if (budget && work_done == budget)
work_done--;
}
@@ -101,6 +105,8 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
if (unlikely(!napi_complete_done(napi, work_done)))
return work_done;
+ ch_stats->arm++;
+
for (i = 0; i < c->num_tc; i++) {
mlx5e_handle_tx_dim(&c->sq[i]);
mlx5e_cq_arm(&c->sq[i].cq);
--
2.17.0
* [net-next 08/12] net/mlx5e: Add a counter for congested UMRs
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Add per-ring and global ethtool counters for congested UMR requests.
These events indicate congestion in the UMR handlers in HW.
Such an event is counted when a UMR post is still outstanding,
yet SW has consumed at least two additional MPWQEs in the meantime.
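The condition can be sketched in user space as follows. The arithmetic mirrors mlx5_wq_ll_missing() and the "> 2" check from the diff below; struct wq_model and count_congested_umr() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a linked-list work queue. */
struct wq_model {
	uint16_t sz_m1;		/* queue size minus one */
	uint16_t cur_sz;	/* WQEs currently posted to HW */
};

/* Same arithmetic as mlx5_wq_ll_missing() in the patch. */
static int wq_missing(const struct wq_model *wq)
{
	return wq->sz_m1 - wq->cur_sz;
}

/*
 * Returns the updated counter value. The counter only moves while a
 * UMR is still in flight and more than two WQEs are missing.
 */
static uint64_t count_congested_umr(uint64_t congst_umr,
				    const struct wq_model *wq,
				    bool umr_in_progress)
{
	assert(wq);
	if (umr_in_progress)
		congst_umr += wq_missing(wq) > 2;
	return congst_umr;
}
```

With sz_m1 = 15 and cur_sz = 12, three WQEs are missing, so an in-flight UMR is counted as congested; with cur_sz = 13 it is not.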
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 ++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 ++
drivers/net/ethernet/mellanox/mlx5/core/wq.h | 5 +++++
4 files changed, 12 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 733c5d2c99f2..6f20ce76c11c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -601,6 +601,8 @@ bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)
if (!rq->mpwqe.umr_in_progress)
mlx5e_alloc_rx_mpwqe(rq, wq->head);
+ else
+ rq->stats->congst_umr += mlx5_wq_ll_missing(wq) > 2;
return false;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index ec7784189dc2..dd3b5a028a97 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -83,6 +83,7 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_empty) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_busy) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_waive) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_congst_umr) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_poll) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_arm) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_aff_change) },
@@ -152,6 +153,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->rx_cache_empty += rq_stats->cache_empty;
s->rx_cache_busy += rq_stats->cache_busy;
s->rx_cache_waive += rq_stats->cache_waive;
+ s->rx_congst_umr += rq_stats->congst_umr;
s->ch_poll += ch_stats->poll;
s->ch_arm += ch_stats->arm;
s->ch_aff_change += ch_stats->aff_change;
@@ -1135,6 +1137,7 @@ static const struct counter_desc rq_stats_desc[] = {
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_empty) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_busy) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_waive) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, congst_umr) },
};
static const struct counter_desc sq_stats_desc[] = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 0cd08b9f46ff..4e54cb86fece 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -94,6 +94,7 @@ struct mlx5e_sw_stats {
u64 rx_cache_empty;
u64 rx_cache_busy;
u64 rx_cache_waive;
+ u64 rx_congst_umr;
u64 ch_poll;
u64 ch_arm;
u64 ch_aff_change;
@@ -188,6 +189,7 @@ struct mlx5e_rq_stats {
u64 cache_empty;
u64 cache_busy;
u64 cache_waive;
+ u64 congst_umr;
};
struct mlx5e_sq_stats {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index 0b47126815b6..2bd4c3184eba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -229,6 +229,11 @@ static inline int mlx5_wq_ll_is_empty(struct mlx5_wq_ll *wq)
return !wq->cur_sz;
}
+static inline int mlx5_wq_ll_missing(struct mlx5_wq_ll *wq)
+{
+ return wq->fbc.sz_m1 - wq->cur_sz;
+}
+
static inline void *mlx5_wq_ll_get_wqe(struct mlx5_wq_ll *wq, u16 ix)
{
return mlx5_frag_buf_get_wqe(&wq->fbc, ix);
--
2.17.0
* [net-next 05/12] net/mlx5e: Add TX completions statistics
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Add per-ring and global ethtool counters for TX completions.
This helps us monitor and analyze TX flow performance.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 4 +++-
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 9 +++++++--
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 7e7155b4e0f0..d35361b1b3fe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -67,6 +67,7 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_dropped) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xmit_more) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_recover) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqes) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_wake) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_udp_seg_rem) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
@@ -172,6 +173,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->tx_tls_ooo += sq_stats->tls_ooo;
s->tx_tls_resync_bytes += sq_stats->tls_resync_bytes;
#endif
+ s->tx_cqes += sq_stats->cqes;
}
}
@@ -1142,6 +1144,7 @@ static const struct counter_desc sq_stats_desc[] = {
{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, dropped) },
{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, xmit_more) },
{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, recover) },
+ { MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, cqes) },
{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, wake) },
{ MLX5E_DECLARE_TX_STAT(struct mlx5e_sq_stats, cqe_err) },
};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index d416bb86e747..8f2dfe56fdef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -78,6 +78,7 @@ struct mlx5e_sw_stats {
u64 tx_queue_dropped;
u64 tx_xmit_more;
u64 tx_recover;
+ u64 tx_cqes;
u64 tx_queue_wake;
u64 tx_udp_seg_rem;
u64 tx_cqe_err;
@@ -208,7 +209,8 @@ struct mlx5e_sq_stats {
u64 dropped;
u64 recover;
/* dirtied @completion */
- u64 wake ____cacheline_aligned_in_smp;
+ u64 cqes ____cacheline_aligned_in_smp;
+ u64 wake;
u64 cqe_err;
};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index f450d9ca31fb..f0739dae7b56 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -468,6 +468,7 @@ static void mlx5e_dump_error_cqe(struct mlx5e_txqsq *sq,
bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
{
+ struct mlx5e_sq_stats *stats;
struct mlx5e_txqsq *sq;
struct mlx5_cqe64 *cqe;
u32 dma_fifo_cc;
@@ -485,6 +486,8 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
if (!cqe)
return false;
+ stats = sq->stats;
+
npkts = 0;
nbytes = 0;
@@ -513,7 +516,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
queue_work(cq->channel->priv->wq,
&sq->recover.recover_work);
}
- sq->stats->cqe_err++;
+ stats->cqe_err++;
}
do {
@@ -558,6 +561,8 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
} while ((++i < MLX5E_TX_CQ_POLL_BUDGET) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
+ stats->cqes += i;
+
mlx5_cqwq_update_db_record(&cq->wq);
/* ensure cq space is freed before enabling more cqes */
@@ -573,7 +578,7 @@ bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
MLX5E_SQ_STOP_ROOM) &&
!test_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state)) {
netif_tx_wake_queue(sq->txq);
- sq->stats->wake++;
+ stats->wake++;
}
return (i == MLX5E_TX_CQ_POLL_BUDGET);
--
2.17.0
* [net-next 06/12] net/mlx5e: Add XDP_TX completions statistics
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Add per-ring and global ethtool counters for XDP_TX completions.
This helps us monitor and analyze XDP_TX flow performance.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 ++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 ++
3 files changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index a2d91eaa99c4..733c5d2c99f2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1383,6 +1383,8 @@ bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq)
} while (!last_wqe);
} while ((++i < MLX5E_TX_CQ_POLL_BUDGET) && (cqe = mlx5_cqwq_get_cqe(&cq->wq)));
+ rq->stats->xdp_tx_cqe += i;
+
mlx5_cqwq_update_db_record(&cq->wq);
/* ensure cq space is freed before enabling more cqes */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index d35361b1b3fe..98aefa6eb266 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -59,6 +59,7 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_unnecessary_inner) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_drop) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_cqe) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_full) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_csum_none) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_csum_partial) },
@@ -135,6 +136,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->rx_csum_unnecessary_inner += rq_stats->csum_unnecessary_inner;
s->rx_xdp_drop += rq_stats->xdp_drop;
s->rx_xdp_tx += rq_stats->xdp_tx;
+ s->rx_xdp_tx_cqe += rq_stats->xdp_tx_cqe;
s->rx_xdp_tx_full += rq_stats->xdp_tx_full;
s->rx_wqe_err += rq_stats->wqe_err;
s->rx_mpwqe_filler += rq_stats->mpwqe_filler;
@@ -1111,6 +1113,7 @@ static const struct counter_desc rq_stats_desc[] = {
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, csum_none) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_drop) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx_cqe) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx_full) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_packets) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_bytes) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 8f2dfe56fdef..b598a21bb4d6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -70,6 +70,7 @@ struct mlx5e_sw_stats {
u64 rx_csum_unnecessary_inner;
u64 rx_xdp_drop;
u64 rx_xdp_tx;
+ u64 rx_xdp_tx_cqe;
u64 rx_xdp_tx_full;
u64 tx_csum_none;
u64 tx_csum_partial;
@@ -171,6 +172,7 @@ struct mlx5e_rq_stats {
u64 removed_vlan_packets;
u64 xdp_drop;
u64 xdp_tx;
+ u64 xdp_tx_cqe;
u64 xdp_tx_full;
u64 wqe_err;
u64 mpwqe_filler;
--
2.17.0
* [net-next 03/12] net/mlx5e: Convert large order kzalloc allocations to kvzalloc
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Replace calls to kzalloc_node with kvzalloc_node, as it falls back
to lower-order pages if the higher-order attempts fail.
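As a rough user-space analogy of the fallback idea (this is not the kernel implementation): try the allocator that needs one contiguous block first, and only if that fails use one that can be satisfied page by page. All names below, and the 4-page limit, are assumptions made up for the sketch; calloc() stands in for both kernel paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator signature used only in this sketch. */
typedef void *(*zalloc_fn)(size_t size);

/* Stands in for the contiguous path; pretend large requests fail. */
static void *contiguous_zalloc(size_t size)
{
	if (size > 4 * 4096)	/* arbitrary limit for the model */
		return NULL;
	return calloc(1, size);
}

/* Stands in for the page-by-page path. */
static void *paged_zalloc(size_t size)
{
	return calloc(1, size);
}

/* Zeroed allocation that falls back when the first attempt fails. */
static void *zalloc_with_fallback(size_t size, zalloc_fn first,
				  zalloc_fn second)
{
	void *p;

	assert(size > 0);
	p = first(size);
	return p ? p : second(size);
}
```

Because the caller cannot tell which path served the request, the release side has to accept both kinds of memory. In the model that is plain free(); in the patch it is why every matching kfree() becomes kvfree().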
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
.../net/ethernet/mellanox/mlx5/core/en_main.c | 44 +++++++++----------
1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e2ef68b1daa2..42ef8c818544 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -352,8 +352,8 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq,
{
int wq_sz = mlx5_wq_ll_get_size(&rq->mpwqe.wq);
- rq->mpwqe.info = kcalloc_node(wq_sz, sizeof(*rq->mpwqe.info),
- GFP_KERNEL, cpu_to_node(c->cpu));
+ rq->mpwqe.info = kvzalloc_node(wq_sz * sizeof(*rq->mpwqe.info),
+ GFP_KERNEL, cpu_to_node(c->cpu));
if (!rq->mpwqe.info)
return -ENOMEM;
@@ -670,7 +670,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
err_free:
switch (rq->wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
- kfree(rq->mpwqe.info);
+ kvfree(rq->mpwqe.info);
mlx5_core_destroy_mkey(mdev, &rq->umr_mkey);
break;
default: /* MLX5_WQ_TYPE_CYCLIC */
@@ -702,7 +702,7 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
switch (rq->wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
- kfree(rq->mpwqe.info);
+ kvfree(rq->mpwqe.info);
mlx5_core_destroy_mkey(rq->mdev, &rq->umr_mkey);
break;
default: /* MLX5_WQ_TYPE_CYCLIC */
@@ -965,15 +965,15 @@ static void mlx5e_close_rq(struct mlx5e_rq *rq)
static void mlx5e_free_xdpsq_db(struct mlx5e_xdpsq *sq)
{
- kfree(sq->db.di);
+ kvfree(sq->db.di);
}
static int mlx5e_alloc_xdpsq_db(struct mlx5e_xdpsq *sq, int numa)
{
int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
- sq->db.di = kcalloc_node(wq_sz, sizeof(*sq->db.di),
- GFP_KERNEL, numa);
+ sq->db.di = kvzalloc_node(sizeof(*sq->db.di) * wq_sz,
+ GFP_KERNEL, numa);
if (!sq->db.di) {
mlx5e_free_xdpsq_db(sq);
return -ENOMEM;
@@ -1024,15 +1024,15 @@ static void mlx5e_free_xdpsq(struct mlx5e_xdpsq *sq)
static void mlx5e_free_icosq_db(struct mlx5e_icosq *sq)
{
- kfree(sq->db.ico_wqe);
+ kvfree(sq->db.ico_wqe);
}
static int mlx5e_alloc_icosq_db(struct mlx5e_icosq *sq, int numa)
{
u8 wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
- sq->db.ico_wqe = kcalloc_node(wq_sz, sizeof(*sq->db.ico_wqe),
- GFP_KERNEL, numa);
+ sq->db.ico_wqe = kvzalloc_node(sizeof(*sq->db.ico_wqe) * wq_sz,
+ GFP_KERNEL, numa);
if (!sq->db.ico_wqe)
return -ENOMEM;
@@ -1077,8 +1077,8 @@ static void mlx5e_free_icosq(struct mlx5e_icosq *sq)
static void mlx5e_free_txqsq_db(struct mlx5e_txqsq *sq)
{
- kfree(sq->db.wqe_info);
- kfree(sq->db.dma_fifo);
+ kvfree(sq->db.wqe_info);
+ kvfree(sq->db.dma_fifo);
}
static int mlx5e_alloc_txqsq_db(struct mlx5e_txqsq *sq, int numa)
@@ -1086,10 +1086,10 @@ static int mlx5e_alloc_txqsq_db(struct mlx5e_txqsq *sq, int numa)
int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
int df_sz = wq_sz * MLX5_SEND_WQEBB_NUM_DS;
- sq->db.dma_fifo = kcalloc_node(df_sz, sizeof(*sq->db.dma_fifo),
- GFP_KERNEL, numa);
- sq->db.wqe_info = kcalloc_node(wq_sz, sizeof(*sq->db.wqe_info),
- GFP_KERNEL, numa);
+ sq->db.dma_fifo = kvzalloc_node(df_sz * sizeof(*sq->db.dma_fifo),
+ GFP_KERNEL, numa);
+ sq->db.wqe_info = kvzalloc_node(wq_sz * sizeof(*sq->db.wqe_info),
+ GFP_KERNEL, numa);
if (!sq->db.dma_fifo || !sq->db.wqe_info) {
mlx5e_free_txqsq_db(sq);
return -ENOMEM;
@@ -1893,7 +1893,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
int err;
int eqn;
- c = kzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
+ c = kvzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
if (!c)
return -ENOMEM;
@@ -1979,7 +1979,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
err_napi_del:
netif_napi_del(&c->napi);
- kfree(c);
+ kvfree(c);
return err;
}
@@ -2018,7 +2018,7 @@ static void mlx5e_close_channel(struct mlx5e_channel *c)
mlx5e_close_cq(&c->icosq.cq);
netif_napi_del(&c->napi);
- kfree(c);
+ kvfree(c);
}
#define DEFAULT_FRAG_SIZE (2048)
@@ -2276,7 +2276,7 @@ int mlx5e_open_channels(struct mlx5e_priv *priv,
chs->num = chs->params.num_channels;
chs->c = kcalloc(chs->num, sizeof(struct mlx5e_channel *), GFP_KERNEL);
- cparam = kzalloc(sizeof(struct mlx5e_channel_param), GFP_KERNEL);
+ cparam = kvzalloc(sizeof(struct mlx5e_channel_param), GFP_KERNEL);
if (!chs->c || !cparam)
goto err_free;
@@ -2287,7 +2287,7 @@ int mlx5e_open_channels(struct mlx5e_priv *priv,
goto err_close_channels;
}
- kfree(cparam);
+ kvfree(cparam);
return 0;
err_close_channels:
@@ -2296,7 +2296,7 @@ int mlx5e_open_channels(struct mlx5e_priv *priv,
err_free:
kfree(chs->c);
- kfree(cparam);
+ kvfree(cparam);
chs->num = 0;
return err;
}
--
2.17.0
* [net-next 04/12] net/mlx5e: RX, Use existing WQ local variable
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Local variable 'wq' already points to &sq->wq; use it.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index d3a1dd20e41d..a2d91eaa99c4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -487,7 +487,7 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_UMR;
sq->pc += MLX5E_UMR_WQEBBS;
- mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, &umr_wqe->ctrl);
+ mlx5e_notify_hw(wq, sq->pc, sq->uar_map, &umr_wqe->ctrl);
return 0;
--
2.17.0
* [net-next 01/12] net/mlx5e: Add UDP GSO support
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Or Gerlitz, Boris Pismenny, Yossi Kuperman,
Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Boris Pismenny <borisp@mellanox.com>
This patch enables UDP GSO support. We do this by using two WQEs:
the first is a UDP LSO WQE for all segments of equal length, and the
second is for the last segment in case it has a different length.
Due to a HW limitation, we must adjust the packet length fields before sending.
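The split between the two WQEs comes down to simple arithmetic, sketched below in user space. The modulo and the "gso_size + UDP header" length follow mlx5e_udp_gso_handle_tx_skb() and mlx5e_udp_gso_prepare_last_skb() in the diff below; struct gso_split and udp_gso_split() are hypothetical names for illustration only.

```c
#include <assert.h>

#define UDP_HDR_LEN 8	/* size of a UDP header in bytes */

/* Hypothetical result type describing how one GSO skb is split. */
struct gso_split {
	int full_segs;		/* equal-length segments in the first WQE */
	int first_len;		/* payload bytes covered by the first WQE */
	int remaining;		/* payload bytes for the second WQE, 0 if none */
	int first_udp_len;	/* UDP length field used for the first WQE */
	int last_udp_len;	/* UDP length field for the last segment */
};

static struct gso_split udp_gso_split(int payload_len, int gso_size)
{
	struct gso_split s;

	assert(gso_size > 0 && payload_len >= 0);
	s.remaining = payload_len % gso_size;
	s.full_segs = payload_len / gso_size;
	s.first_len = payload_len - s.remaining;
	/* Length fields describe one segment, not the whole GSO skb. */
	s.first_udp_len = gso_size + UDP_HDR_LEN;
	s.last_udp_len = s.remaining ? s.remaining + UDP_HDR_LEN : 0;
	return s;
}
```

For a 4000-byte payload with an MSS of 1472, the first WQE carries two full segments (2944 bytes) and the second WQE carries the remaining 1056 bytes; when the payload is an exact multiple of the MSS, no second WQE is needed.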
We measure performance between two Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
machines connected back-to-back with ConnectX-4 Lx (40Gbps) NICs.
We compare single-stream UDP, UDP GSO, and UDP GSO with offload.
Performance:
                | MSS (bytes) | Throughput (Gbps) | CPU utilization (%)
UDP GSO offload | 1472        | 35.6              | 8%
UDP GSO         | 1472        | 25.5              | 17%
UDP             | 1472        | 10.2              | 17%
UDP GSO offload | 1024        | 35.6              | 8%
UDP GSO         | 1024        | 19.2              | 17%
UDP             | 1024        | 5.7               | 17%
UDP GSO offload | 512         | 33.8              | 16%
UDP GSO         | 512         | 10.4              | 17%
UDP             | 512         | 3.5               | 17%
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
.../net/ethernet/mellanox/mlx5/core/Makefile | 4 +-
.../mellanox/mlx5/core/en_accel/en_accel.h | 11 +-
.../mellanox/mlx5/core/en_accel/rxtx.c | 108 ++++++++++++++++++
.../mellanox/mlx5/core/en_accel/rxtx.h | 14 +++
.../net/ethernet/mellanox/mlx5/core/en_main.c | 3 +
.../net/ethernet/mellanox/mlx5/core/en_tx.c | 8 +-
6 files changed, 139 insertions(+), 9 deletions(-)
create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 9efbf193ad5a..d923f2f58608 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -14,8 +14,8 @@ mlx5_core-$(CONFIG_MLX5_FPGA) += fpga/cmd.o fpga/core.o fpga/conn.o fpga/sdk.o \
fpga/ipsec.o fpga/tls.o
mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
- en_tx.o en_rx.o en_dim.o en_txrx.o en_stats.o vxlan.o \
- en_arfs.o en_fs_ethtool.o en_selftest.o en/port.o
+ en_tx.o en_rx.o en_dim.o en_txrx.o en_accel/rxtx.o en_stats.o \
+ vxlan.o en_arfs.o en_fs_ethtool.o en_selftest.o en/port.o
mlx5_core-$(CONFIG_MLX5_MPFS) += lib/mpfs.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
index f20074dbef32..39a5d13ba459 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
@@ -34,12 +34,11 @@
#ifndef __MLX5E_EN_ACCEL_H__
#define __MLX5E_EN_ACCEL_H__
-#ifdef CONFIG_MLX5_ACCEL
-
#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include "en_accel/ipsec_rxtx.h"
#include "en_accel/tls_rxtx.h"
+#include "en_accel/rxtx.h"
#include "en.h"
static inline struct sk_buff *mlx5e_accel_handle_tx(struct sk_buff *skb,
@@ -64,9 +63,13 @@ static inline struct sk_buff *mlx5e_accel_handle_tx(struct sk_buff *skb,
}
#endif
+ if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) {
+ skb = mlx5e_udp_gso_handle_tx_skb(dev, sq, skb, wqe, pi);
+ if (unlikely(!skb))
+ return NULL;
+ }
+
return skb;
}
-#endif /* CONFIG_MLX5_ACCEL */
-
#endif /* __MLX5E_EN_ACCEL_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
new file mode 100644
index 000000000000..4bb1f3b12b96
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
@@ -0,0 +1,108 @@
+#include "en_accel/rxtx.h"
+
+static void mlx5e_udp_gso_prepare_last_skb(struct sk_buff *skb,
+ struct sk_buff *nskb,
+ int remaining)
+{
+ int bytes_needed = remaining, remaining_headlen, remaining_page_offset;
+ int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
+ int payload_len = remaining + sizeof(struct udphdr);
+ int k = 0, i, j;
+
+ skb_copy_bits(skb, 0, nskb->data, headlen);
+ nskb->dev = skb->dev;
+ skb_reset_mac_header(nskb);
+ skb_set_network_header(nskb, skb_network_offset(skb));
+ skb_set_transport_header(nskb, skb_transport_offset(skb));
+ skb_set_tail_pointer(nskb, headlen);
+
+ /* How many frags do we need? */
+ for (i = skb_shinfo(skb)->nr_frags - 1; i >= 0; i--) {
+ bytes_needed -= skb_frag_size(&skb_shinfo(skb)->frags[i]);
+ k++;
+ if (bytes_needed <= 0)
+ break;
+ }
+
+ /* Fill the first frag and split it if necessary */
+ j = skb_shinfo(skb)->nr_frags - k;
+ remaining_page_offset = -bytes_needed;
+ skb_fill_page_desc(nskb, 0,
+ skb_shinfo(skb)->frags[j].page.p,
+ skb_shinfo(skb)->frags[j].page_offset + remaining_page_offset,
+ skb_shinfo(skb)->frags[j].size - remaining_page_offset);
+
+ skb_frag_ref(skb, j);
+
+ /* Fill the rest of the frags */
+ for (i = 1; i < k; i++) {
+ j = skb_shinfo(skb)->nr_frags - k + i;
+
+ skb_fill_page_desc(nskb, i,
+ skb_shinfo(skb)->frags[j].page.p,
+ skb_shinfo(skb)->frags[j].page_offset,
+ skb_shinfo(skb)->frags[j].size);
+ skb_frag_ref(skb, j);
+ }
+ skb_shinfo(nskb)->nr_frags = k;
+
+ remaining_headlen = remaining - skb->data_len;
+
+ /* headlen contains remaining data? */
+ if (remaining_headlen > 0)
+ skb_copy_bits(skb, skb->len - remaining, nskb->data + headlen,
+ remaining_headlen);
+ nskb->len = remaining + headlen;
+ nskb->data_len = payload_len - sizeof(struct udphdr) +
+ max_t(int, 0, remaining_headlen);
+ nskb->protocol = skb->protocol;
+ if (nskb->protocol == htons(ETH_P_IP)) {
+ ip_hdr(nskb)->id = htons(ntohs(ip_hdr(nskb)->id) +
+ skb_shinfo(skb)->gso_segs);
+ ip_hdr(nskb)->tot_len =
+ htons(payload_len + sizeof(struct iphdr));
+ } else {
+ ipv6_hdr(nskb)->payload_len = htons(payload_len);
+ }
+ udp_hdr(nskb)->len = htons(payload_len);
+ skb_shinfo(nskb)->gso_size = 0;
+ nskb->ip_summed = skb->ip_summed;
+ nskb->csum_start = skb->csum_start;
+ nskb->csum_offset = skb->csum_offset;
+ nskb->queue_mapping = skb->queue_mapping;
+}
+
+/* might send skbs and update wqe and pi */
+struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
+ struct mlx5e_txqsq *sq,
+ struct sk_buff *skb,
+ struct mlx5e_tx_wqe **wqe,
+ u16 *pi)
+{
+ int payload_len = skb_shinfo(skb)->gso_size + sizeof(struct udphdr);
+ int headlen = skb_transport_offset(skb) + sizeof(struct udphdr);
+ int remaining = (skb->len - headlen) % skb_shinfo(skb)->gso_size;
+ struct sk_buff *nskb;
+
+ if (skb->protocol == htons(ETH_P_IP))
+ ip_hdr(skb)->tot_len = htons(payload_len + sizeof(struct iphdr));
+ else
+ ipv6_hdr(skb)->payload_len = htons(payload_len);
+ udp_hdr(skb)->len = htons(payload_len);
+ if (!remaining)
+ return skb;
+
+ nskb = alloc_skb(max_t(int, headlen, headlen + remaining - skb->data_len), GFP_ATOMIC);
+ if (unlikely(!nskb)) {
+ sq->stats->dropped++;
+ return NULL;
+ }
+
+ mlx5e_udp_gso_prepare_last_skb(skb, nskb, remaining);
+
+ skb_shinfo(skb)->gso_segs--;
+ pskb_trim(skb, skb->len - remaining);
+ mlx5e_sq_xmit(sq, skb, *wqe, *pi);
+ mlx5e_sq_fetch_wqe(sq, wqe, pi);
+ return nskb;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h
new file mode 100644
index 000000000000..ed42699a78b3
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h
@@ -0,0 +1,14 @@
+
+#ifndef __MLX5E_EN_ACCEL_RX_TX_H__
+#define __MLX5E_EN_ACCEL_RX_TX_H__
+
+#include <linux/skbuff.h>
+#include "en.h"
+
+struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
+ struct mlx5e_txqsq *sq,
+ struct sk_buff *skb,
+ struct mlx5e_tx_wqe **wqe,
+ u16 *pi);
+
+#endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 134f20a182b5..e2ef68b1daa2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4592,6 +4592,9 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
netdev->features |= NETIF_F_HIGHDMA;
netdev->features |= NETIF_F_HW_VLAN_STAG_FILTER;
+ netdev->features |= NETIF_F_GSO_UDP_L4;
+ netdev->hw_features |= NETIF_F_GSO_UDP_L4;
+
netdev->priv_flags |= IFF_UNICAST_FLT;
mlx5e_set_netdev_dev_addr(netdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index f29deb44bf3b..f450d9ca31fb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -228,7 +228,10 @@ mlx5e_tx_get_gso_ihs(struct mlx5e_txqsq *sq, struct sk_buff *skb)
stats->tso_inner_packets++;
stats->tso_inner_bytes += skb->len - ihs;
} else {
- ihs = skb_transport_offset(skb) + tcp_hdrlen(skb);
+ if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4)
+ ihs = skb_transport_offset(skb) + sizeof(struct udphdr);
+ else
+ ihs = skb_transport_offset(skb) + tcp_hdrlen(skb);
stats->tso_packets++;
stats->tso_bytes += skb->len - ihs;
}
@@ -443,12 +446,11 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
sq = priv->txq2sq[skb_get_queue_mapping(skb)];
mlx5e_sq_fetch_wqe(sq, &wqe, &pi);
-#ifdef CONFIG_MLX5_ACCEL
/* might send skbs and update wqe and pi */
skb = mlx5e_accel_handle_tx(skb, sq, dev, &wqe, &pi);
if (unlikely(!skb))
return NETDEV_TX_OK;
-#endif
+
return mlx5e_sq_xmit(sq, skb, wqe, pi);
}
--
2.17.0
* [net-next 02/12] net/mlx5e: Add UDP GSO remaining counter
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Boris Pismenny, Saeed Mahameed
In-Reply-To: <20180628215103.9141-1-saeedm@mellanox.com>
From: Boris Pismenny <borisp@mellanox.com>
This patch adds a counter for TX UDP GSO packets that contain a segment
that is not aligned to the MSS, i.e. a remaining segment.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
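Note (illustrative only, not part of the patch): the counter is bumped only
after the `!remaining` early return, so it counts packets whose payload does
not divide evenly into MSS-sized segments. Below is a minimal sketch of that
arithmetic, assuming the remainder is simply the payload length modulo the
MSS. `udp_gso_split` and `count_remaining` are hypothetical names, not driver
functions.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the driver: describe how a UDP GSO
 * payload splits into full MSS-sized segments plus an optional short tail. */
struct gso_split {
	uint32_t full_segs;	/* segments of exactly mss bytes */
	uint32_t remaining;	/* bytes in the short trailing segment, 0 if aligned */
};

struct gso_split udp_gso_split(uint32_t payload_len, uint32_t mss)
{
	struct gso_split s = { 0, 0 };

	if (mss == 0)		/* guard against division by zero */
		return s;
	s.full_segs = payload_len / mss;
	s.remaining = payload_len % mss;
	return s;
}

/* Increment the counter only when a short trailing segment exists,
 * mirroring the placement of udp_seg_rem++ after the !remaining check. */
void count_remaining(uint64_t *udp_seg_rem, uint32_t payload_len, uint32_t mss)
{
	if (udp_gso_split(payload_len, mss).remaining)
		(*udp_seg_rem)++;
}
```

For a 4000-byte payload with an MSS of 1472, this gives two full segments and
a 1056-byte remainder, so the counter would be incremented once. A 2944-byte
payload is aligned and leaves the counter unchanged.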
drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 2 ++
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 2 ++
3 files changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
index 4bb1f3b12b96..7b7ec3998e84 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
@@ -92,6 +92,7 @@ struct sk_buff *mlx5e_udp_gso_handle_tx_skb(struct net_device *netdev,
if (!remaining)
return skb;
+ sq->stats->udp_seg_rem++;
nskb = alloc_skb(max_t(int, headlen, headlen + remaining - skb->data_len), GFP_ATOMIC);
if (unlikely(!nskb)) {
sq->stats->dropped++;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 1646859974ce..7e7155b4e0f0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -68,6 +68,7 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_xmit_more) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_recover) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_wake) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_udp_seg_rem) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_cqe_err) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_wqe_err) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mpwqe_filler) },
@@ -159,6 +160,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->tx_added_vlan_packets += sq_stats->added_vlan_packets;
s->tx_queue_stopped += sq_stats->stopped;
s->tx_queue_wake += sq_stats->wake;
+ s->tx_udp_seg_rem += sq_stats->udp_seg_rem;
s->tx_queue_dropped += sq_stats->dropped;
s->tx_cqe_err += sq_stats->cqe_err;
s->tx_recover += sq_stats->recover;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 643153bb3607..d416bb86e747 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -79,6 +79,7 @@ struct mlx5e_sw_stats {
u64 tx_xmit_more;
u64 tx_recover;
u64 tx_queue_wake;
+ u64 tx_udp_seg_rem;
u64 tx_cqe_err;
u64 rx_wqe_err;
u64 rx_mpwqe_filler;
@@ -196,6 +197,7 @@ struct mlx5e_sq_stats {
u64 csum_partial_inner;
u64 added_vlan_packets;
u64 nop;
+ u64 udp_seg_rem;
#ifdef CONFIG_MLX5_EN_TLS
u64 tls_ooo;
u64 tls_resync_bytes;
--
2.17.0
^ permalink raw reply related
* [pull request][net-next 00/12] Mellanox, mlx5e updates 2018-06-28
From: Saeed Mahameed @ 2018-06-28 21:50 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Or Gerlitz, Saeed Mahameed
Hi Dave,
The following pull request includes updates for the mlx5e netdevice driver.
For more information, please see the tag log below.
Please pull and let me know if there's any problem.
Thanks,
Saeed.
---
The following changes since commit 7861552cedd81a164c0d5d1c89fe2cb45a3ed41b:
netlink: Return extack message if attribute validation fails (2018-06-28 16:18:04 +0900)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5e-updates-2018-06-28
for you to fetch changes up to ed56c5193ad89d1097cdbdc87abeb062e03a06eb:
net/mlx5e: Update NIC HW stats on demand only (2018-06-28 14:44:38 -0700)
----------------------------------------------------------------
mlx5e-updates-2018-06-28
mlx5e netdevice driver updates:
- Boris Pismenny added support for UDP GSO in the first two patches.
The commit message includes impressive performance numbers: line rate
with ~half of the CPU utilization compared to non-offloaded GSO
or no GSO at all.
- From Tariq Toukan:
- Convert large order kzalloc allocations to kvzalloc.
- Added performance diagnostic statistics to several places in the data path.
- From Saeed and Eran:
- Update NIC HW stats on demand only. This eliminates the background
thread that was needed to update some HW statistics in the driver cache
in order to report error and drop counters from HW in ndo_get_stats.
----------------------------------------------------------------
Boris Pismenny (2):
net/mlx5e: Add UDP GSO support
net/mlx5e: Add UDP GSO remaining counter
Saeed Mahameed (1):
net/mlx5e: Update NIC HW stats on demand only
Tariq Toukan (9):
net/mlx5e: Convert large order kzalloc allocations to kvzalloc
net/mlx5e: RX, Use existing WQ local variable
net/mlx5e: Add TX completions statistics
net/mlx5e: Add XDP_TX completions statistics
net/mlx5e: Add NAPI statistics
net/mlx5e: Add a counter for congested UMRs
net/mlx5e: Add channel events counter
net/mlx5e: Add counter for MPWQE filler strides
net/mlx5e: Add counter for total num of NOP operations
drivers/net/ethernet/mellanox/mlx5/core/Makefile | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 -
.../mellanox/mlx5/core/en_accel/en_accel.h | 11 ++-
.../ethernet/mellanox/mlx5/core/en_accel/rxtx.c | 109 +++++++++++++++++++++
.../ethernet/mellanox/mlx5/core/en_accel/rxtx.h | 14 +++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 57 ++++++-----
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 3 +
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 11 ++-
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 34 ++++++-
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 25 ++++-
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 17 +++-
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 9 +-
drivers/net/ethernet/mellanox/mlx5/core/wq.h | 5 +
13 files changed, 252 insertions(+), 48 deletions(-)
create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.c
create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/rxtx.h
^ permalink raw reply
* Re: [PATCH net-next] net: preserve sock reference when scrubbing the skb.
From: Cong Wang @ 2018-06-28 21:51 UTC (permalink / raw)
To: Flavio Leitner
Cc: Eric Dumazet, Linux Kernel Network Developers, Paolo Abeni,
David Miller, Florian Westphal, NetFilter
In-Reply-To: <20180627201937.GZ19565@plex.lan>
On Wed, Jun 27, 2018 at 1:19 PM Flavio Leitner <fbl@redhat.com> wrote:
>
> On Wed, Jun 27, 2018 at 12:06:16PM -0700, Cong Wang wrote:
> > On Wed, Jun 27, 2018 at 5:32 AM Flavio Leitner <fbl@redhat.com> wrote:
> > >
> > > On Tue, Jun 26, 2018 at 06:28:27PM -0700, Cong Wang wrote:
> > > > On Tue, Jun 26, 2018 at 5:39 PM Flavio Leitner <fbl@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jun 26, 2018 at 05:29:51PM -0700, Cong Wang wrote:
> > > > > > On Tue, Jun 26, 2018 at 4:33 PM Flavio Leitner <fbl@redhat.com> wrote:
> > > > > > >
> > > > > > > It is still isolated, the sk carries the netns info and it is
> > > > > > > orphaned when it re-enters the stack.
> > > > > >
> > > > > > Then what difference does your patch make?
> > > > >
> > > > > Don't forget it is fixing two issues.
> > > >
> > > > Sure. I am only talking about TSQ from the very beginning.
> > > > Let me rephrase my above question:
> > > > What difference does your patch make to TSQ?
> > >
> > > It avoids burstiness.
> >
> > Never even mentioned in changelog or in your patch. :-/
>
> It's part of queueing and helping the bufferbloat problem in the
> commit message.
Please don't pull all queues into this scope. Are you really
going to put every queue in networking under your "bufferbloat" claim?
Seriously? Please define it. You really need to read my other reply;
neither you nor David seems to have finished reading it.
>
> > > > > > Before your patch:
> > > > > > veth orphans skb in its xmit
> > > > > >
> > > > > > After your patch:
> > > > > > RX orphans it when re-entering stack (as you claimed, I don't know)
> > > > >
> > > > > ip_rcv, and equivalents.
> > > >
> > > > ip_rcv() is L3, we enter a stack from L1. So your above claim is incorrect. :)
> > >
> > > Maybe you found a problem, could you please point me to where in
> > > between L1 to L3 the socket is relevant?
> >
> > Of course, ingress qdisc is in L2. Do I need to say more? This
> > is where we can re-route the packets, for example, redirecting it to
> > yet another netns. This is in fact what we use in production, not anything
> > that only in my imagination.
> >
> > You really have to think about why you allow a different netns influence
> > another netns by holding the skb to throttle the source TCP socket.
>
> Maybe I wasn't clear and you didn't understand the question. Please find
> a spot where the preserved socket is used incorrectly.
It's sad you still don't get what I mean. I never complained that you leak
skb->sk; I complained that you break TSQ. Dragging the discussion toward
skb->sk doesn't help.
>
> > > > which means you have to update Documentation/networking/ip-sysctl.txt
> > > > too.
> > >
> > > How it is never targeted? Whole point is to avoid queueing traffic.
> >
> > What queues? You really need to define it, seriously.
> >
> >
> > > Would you be okay if I include this chunk?
> >
> > No, still lack of an explanation why it comes across netns for
> > a good reason.
>
> Because it doesn't. Since you talk more about veth, let's pick it
> as an example. The TX is nothing more than add to the CPU backlog,
That's RX, assuming "CPU backlog" here still means softnet_data.
> right? That is netns agnostic. The same for processing that queue
> which will push the skb anyways and will call skb_orphan().
Once it leaves TX, it leaves the stack. skb_orphan() called
in L3 (as you claimed) is already in yet another stack.
>
> How can one netns avoid/delay the skb_orphan()? And even if does
> that, what gain will you have to allow queuing of more and more
> packets in the CPU backlog? It is stalled.
Please read my other reply; you don't even understand
what the boundary of a stack is.
^ permalink raw reply
* [PATCH bpf-next 8/8] tools: bpftool: deal with options upfront
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>
Remove options (in getopt() sense, i.e. starting with a dash like
-n or --NAME) while parsing arguments for bash completions. This
allows us to refer to position-dependent parameters better, and
complete options at any point.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
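Note (illustrative only, not part of the patch): the key bookkeeping is that
every removed option shifts the later words down by one, so the index of the
word under the cursor has to be decremented whenever the removed word sits at
or before it. Below is a minimal sketch of that loop, modeled in C rather than
bash purely to make the index handling explicit. `strip_options` is a
hypothetical name; `words` and `cword` stand in for the completion variables
of the same names.

```c
#include <stddef.h>

/* Hypothetical model of the option-stripping loop: drop every word that
 * starts with '-' (index 0, the program name, is never examined), shift
 * the remaining words down, and keep *cword pointing at the same word.
 * Returns the new number of words. */
size_t strip_options(const char **words, size_t n, size_t *cword)
{
	size_t i = 1;

	while (i < n) {
		if (words[i][0] == '-') {
			/* Close the gap left by the removed option. */
			for (size_t j = i; j + 1 < n; j++)
				words[j] = words[j + 1];
			n--;
			/* i >= 1 here, so *cword >= 1 and cannot underflow. */
			if (i <= *cword)
				(*cword)--;
			/* Do not advance i: a new word now occupies slot i. */
		} else {
			i++;
		}
	}
	return n;
}
```

With the words `bpftool --json prog show -p ""` and the cursor on the final
empty word (index 5), the loop leaves `bpftool prog show ""` with the cursor
at index 3, so the object and command can then be read from fixed positions
1 and 2.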
tools/bpf/bpftool/bash-completion/bpftool | 32 +++++++++++++++--------
1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index b0b8022d3570..fffd76f4998b 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -153,6 +153,13 @@ _bpftool()
local cur prev words objword
_init_completion || return
+ # Deal with options
+ if [[ ${words[cword]} == -* ]]; then
+ local c='--version --json --pretty --bpffs'
+ COMPREPLY=( $( compgen -W "$c" -- "$cur" ) )
+ return 0
+ fi
+
# Deal with simplest keywords
case $prev in
help|hex|opcodes|visual)
@@ -172,20 +179,23 @@ _bpftool()
;;
esac
- # Search for object and command
- local object command cmdword
- for (( cmdword=1; cmdword < ${#words[@]}-1; cmdword++ )); do
- [[ -n $object ]] && command=${words[cmdword]} && break
- [[ ${words[cmdword]} != -* ]] && object=${words[cmdword]}
+ # Remove all options so completions don't have to deal with them.
+ local i
+ for (( i=1; i < ${#words[@]}; )); do
+ if [[ ${words[i]::1} == - ]]; then
+ words=( "${words[@]:0:i}" "${words[@]:i+1}" )
+ [[ $i -le $cword ]] && cword=$(( cword - 1 ))
+ else
+ i=$(( ++i ))
+ fi
done
+ cur=${words[cword]}
+ prev=${words[cword - 1]}
- if [[ -z $object ]]; then
+ local object=${words[1]} command=${words[2]}
+
+ if [[ -z $object || $cword -eq 1 ]]; then
case $cur in
- -*)
- local c='--version --json --pretty --bpffs'
- COMPREPLY=( $( compgen -W "$c" -- "$cur" ) )
- return 0
- ;;
*)
COMPREPLY=( $( compgen -W "$( bpftool help 2>&1 | \
command sed \
--
2.17.1
^ permalink raw reply related
* [PATCH bpf-next 7/8] tools: bpftool: add missing --bpffs to completions
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>
--bpffs is not suggested by bash completions.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
tools/bpf/bpftool/bash-completion/bpftool | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 1e1083321643..b0b8022d3570 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -182,7 +182,7 @@ _bpftool()
if [[ -z $object ]]; then
case $cur in
-*)
- local c='--version --json --pretty'
+ local c='--version --json --pretty --bpffs'
COMPREPLY=( $( compgen -W "$c" -- "$cur" ) )
return 0
;;
--
2.17.1
^ permalink raw reply related