* [patch net-next 1/5] mlxsw: reg: add rdpm register
2018-01-11 10:20 [patch net-next 0/5] mlxsw: Offload PRIO qdisc Jiri Pirko
@ 2018-01-11 10:20 ` Jiri Pirko
2018-01-11 18:47 ` David Miller
2018-01-11 10:20 ` [patch net-next 2/5] mlxsw: spectrum_router: Configure default routing priority Jiri Pirko
` (4 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Jiri Pirko @ 2018-01-11 10:20 UTC (permalink / raw)
To: netdev; +Cc: nogahf, yuvalm, davem, idosch, mlxsw, jhs, xiyou.wangcong
From: Yuval Mintz <yuvalm@mellanox.com>
Add rdpm definition - router DSCP to priority mapping register.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
drivers/net/ethernet/mellanox/mlxsw/item.h | 2 +-
drivers/net/ethernet/mellanox/mlxsw/reg.h | 37 ++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/item.h b/drivers/net/ethernet/mellanox/mlxsw/item.h
index 28427f0758c7..31c886edc791 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/item.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/item.h
@@ -42,7 +42,7 @@
struct mlxsw_item {
unsigned short offset; /* bytes in container */
- unsigned short step; /* step in bytes for indexed items */
+ short step; /* step in bytes for indexed items */
unsigned short in_step_offset; /* offset within one step */
unsigned char shift; /* shift in bits */
unsigned char element_size; /* size of element in bit array */
diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 6c4e08b8058a..0e08be41c8e0 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -4827,6 +4827,42 @@ static inline void mlxsw_reg_ratr_counter_pack(char *payload, u64 counter_index,
mlxsw_reg_ratr_counter_set_type_set(payload, set_type);
}
+/* RDPM - Router DSCP to Priority Mapping
+ * --------------------------------------
+ * Controls the mapping from DSCP field to switch priority on routed packets
+ */
+#define MLXSW_REG_RDPM_ID 0x8009
+#define MLXSW_REG_RDPM_BASE_LEN 0x00
+#define MLXSW_REG_RDPM_DSCP_ENTRY_REC_LEN 0x01
+#define MLXSW_REG_RDPM_DSCP_ENTRY_REC_MAX_COUNT 64
+#define MLXSW_REG_RDPM_LEN 0x40
+#define MLXSW_REG_RDPM_LAST_ENTRY (MLXSW_REG_RDPM_BASE_LEN + \
+ MLXSW_REG_RDPM_LEN - \
+ MLXSW_REG_RDPM_DSCP_ENTRY_REC_LEN)
+
+MLXSW_REG_DEFINE(rdpm, MLXSW_REG_RDPM_ID, MLXSW_REG_RDPM_LEN);
+
+/* reg_dscp_entry_e
+ * Enable update of the specific entry
+ * Access: Index
+ */
+MLXSW_ITEM8_INDEXED(reg, rdpm, dscp_entry_e, MLXSW_REG_RDPM_LAST_ENTRY, 7, 1,
+ -MLXSW_REG_RDPM_DSCP_ENTRY_REC_LEN, 0x00, false);
+
+/* reg_dscp_entry_prio
+ * Switch Priority
+ * Access: RW
+ */
+MLXSW_ITEM8_INDEXED(reg, rdpm, dscp_entry_prio, MLXSW_REG_RDPM_LAST_ENTRY, 0, 4,
+ -MLXSW_REG_RDPM_DSCP_ENTRY_REC_LEN, 0x00, false);
+
+static inline void mlxsw_reg_rdpm_pack(char *payload, unsigned short index,
+ u8 prio)
+{
+ mlxsw_reg_rdpm_dscp_entry_e_set(payload, index, 1);
+ mlxsw_reg_rdpm_dscp_entry_prio_set(payload, index, prio);
+}
+
/* RICNT - Router Interface Counter Register
* -----------------------------------------
* The RICNT register retrieves per port performance counters
@@ -7640,6 +7676,7 @@ static const struct mlxsw_reg_info *mlxsw_reg_infos[] = {
MLXSW_REG(rtar),
MLXSW_REG(ratr),
MLXSW_REG(rtdp),
+ MLXSW_REG(rdpm),
MLXSW_REG(ricnt),
MLXSW_REG(rrcr),
MLXSW_REG(ralta),
--
2.14.3
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [patch net-next 1/5] mlxsw: reg: add rdpm register
2018-01-11 10:20 ` [patch net-next 1/5] mlxsw: reg: add rdpm register Jiri Pirko
@ 2018-01-11 18:47 ` David Miller
0 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2018-01-11 18:47 UTC (permalink / raw)
To: jiri; +Cc: netdev, nogahf, yuvalm, idosch, mlxsw, jhs, xiyou.wangcong
From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 11 Jan 2018 11:20:58 +0100
> From: Yuval Mintz <yuvalm@mellanox.com>
>
> Add rdpm definition - router DSCP to priority mapping register.
>
> Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
> Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
This doesn't explain why adding a definition for this register...
> @@ -42,7 +42,7 @@
>
> struct mlxsw_item {
> unsigned short offset; /* bytes in container */
> - unsigned short step; /* step in bytes for indexed items */
> + short step; /* step in bytes for indexed items */
> unsigned short in_step_offset; /* offset within one step */
> unsigned char shift; /* shift in bits */
> unsigned char element_size; /* size of element in bit array */
requires making mlxsw_item->step signed.
Please update the commit message or move this change into a more
appropriate patch in this series or even a new one. Whichever
is required.
Thanks.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch net-next 2/5] mlxsw: spectrum_router: Configure default routing priority
2018-01-11 10:20 [patch net-next 0/5] mlxsw: Offload PRIO qdisc Jiri Pirko
2018-01-11 10:20 ` [patch net-next 1/5] mlxsw: reg: add rdpm register Jiri Pirko
@ 2018-01-11 10:20 ` Jiri Pirko
2018-01-11 10:21 ` [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc Jiri Pirko
` (3 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Jiri Pirko @ 2018-01-11 10:20 UTC (permalink / raw)
To: netdev; +Cc: nogahf, yuvalm, davem, idosch, mlxsw, jhs, xiyou.wangcong
From: Yuval Mintz <yuvalm@mellanox.com>
When routing ip packets, the kernel is setting the SKB's priority
based on the tos field of the packet.
Imitate this behavior in the mlxsw router, having the internal
switch priority of a routed packet determined according to its DS
field.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
.../net/ethernet/mellanox/mlxsw/spectrum_router.c | 24 ++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 434b3922b34f..8f115d1c7056 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -7008,6 +7008,24 @@ static int mlxsw_sp_mp_hash_init(struct mlxsw_sp *mlxsw_sp)
}
#endif
+static int mlxsw_sp_dscp_init(struct mlxsw_sp *mlxsw_sp)
+{
+ char rdpm_pl[MLXSW_REG_RDPM_LEN];
+ unsigned int i;
+
+ MLXSW_REG_ZERO(rdpm, rdpm_pl);
+
+ /* HW is determining switch priority based on DSCP-bits, but the
+ * kernel is still doing that based on the ToS. Since there's a
+ * mismatch in bits we need to make sure to translate the right
+ * value ToS would observe, skipping the 2 least-significant ECN bits.
+ */
+ for (i = 0; i < MLXSW_REG_RDPM_DSCP_ENTRY_REC_MAX_COUNT; i++)
+ mlxsw_reg_rdpm_pack(rdpm_pl, i, rt_tos2priority(i << 2));
+
+ return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(rdpm), rdpm_pl);
+}
+
static int __mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
{
char rgcr_pl[MLXSW_REG_RGCR_LEN];
@@ -7020,6 +7038,7 @@ static int __mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
mlxsw_reg_rgcr_pack(rgcr_pl, true, true);
mlxsw_reg_rgcr_max_router_interfaces_set(rgcr_pl, max_rifs);
+ mlxsw_reg_rgcr_usp_set(rgcr_pl, true);
err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(rgcr), rgcr_pl);
if (err)
return err;
@@ -7095,6 +7114,10 @@ int mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
if (err)
goto err_mp_hash_init;
+ err = mlxsw_sp_dscp_init(mlxsw_sp);
+ if (err)
+ goto err_dscp_init;
+
mlxsw_sp->router->fib_nb.notifier_call = mlxsw_sp_router_fib_event;
err = register_fib_notifier(&mlxsw_sp->router->fib_nb,
mlxsw_sp_router_fib_dump_flush);
@@ -7104,6 +7127,7 @@ int mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
return 0;
err_register_fib_notifier:
+err_dscp_init:
err_mp_hash_init:
unregister_netevent_notifier(&mlxsw_sp->router->netevent_nb);
err_register_netevent_notifier:
--
2.14.3
^ permalink raw reply related [flat|nested] 21+ messages in thread* [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
2018-01-11 10:20 [patch net-next 0/5] mlxsw: Offload PRIO qdisc Jiri Pirko
2018-01-11 10:20 ` [patch net-next 1/5] mlxsw: reg: add rdpm register Jiri Pirko
2018-01-11 10:20 ` [patch net-next 2/5] mlxsw: spectrum_router: Configure default routing priority Jiri Pirko
@ 2018-01-11 10:21 ` Jiri Pirko
2018-01-11 23:25 ` Jakub Kicinski
2018-01-11 10:21 ` [patch net-next 4/5] mlxsw: spectrum: qdiscs: Support PRIO qdisc offload Jiri Pirko
` (2 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Jiri Pirko @ 2018-01-11 10:21 UTC (permalink / raw)
To: netdev; +Cc: nogahf, yuvalm, davem, idosch, mlxsw, jhs, xiyou.wangcong
From: Nogah Frankel <nogahf@mellanox.com>
Add the ability to offload PRIO qdisc by using ndo_setup_tc.
There are three commands for PRIO offloading:
* TC_PRIO_REPLACE: handles set and tune
* TC_PRIO_DESTROY: handles qdisc destroy
* TC_PRIO_STATS: updates the qdiscs counters (given as reference)
Like RED qdisc, the indication of whether PRIO is being offloaded is being
set and updated as part of the dump function. It is so because the driver
could decide to offload or not based on the qdisc parent, which could
change without notifying the qdisc.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
include/linux/netdevice.h | 1 +
include/net/pkt_cls.h | 25 ++++++++++++++++++++
net/sched/sch_prio.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 85 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ef7b348e8498..6d95477b962c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -780,6 +780,7 @@ enum tc_setup_type {
TC_SETUP_BLOCK,
TC_SETUP_QDISC_CBS,
TC_SETUP_QDISC_RED,
+ TC_SETUP_QDISC_PRIO,
};
/* These structures hold the attributes of bpf state that are being passed
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 0d1343cba84c..4ba8c3ba3dd4 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -761,4 +761,29 @@ struct tc_red_qopt_offload {
};
};
+enum tc_prio_command {
+ TC_PRIO_REPLACE,
+ TC_PRIO_DESTROY,
+ TC_PRIO_STATS,
+};
+
+struct tc_prio_qopt_offload_params {
+ int bands;
+ u8 priomap[TC_PRIO_MAX + 1];
+ /* In case that a prio qdisc is offloaded and now is changed to a
+ * non-offloadedable config, it needs to update the backlog value
+ * to negate the HW backlog value.
+ */
+ u32 *backlog;
+};
+
+struct tc_prio_qopt_offload {
+ enum tc_prio_command command;
+ u32 handle;
+ u32 parent;
+ union {
+ struct tc_prio_qopt_offload_params replace_params;
+ struct tc_qopt_offload_stats stats;
+ };
+};
#endif
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index fe1510eb111f..3f47a30ce72f 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -142,6 +142,31 @@ prio_reset(struct Qdisc *sch)
sch->q.qlen = 0;
}
+static int prio_offload(struct Qdisc *sch, bool enable)
+{
+ struct prio_sched_data *q = qdisc_priv(sch);
+ struct net_device *dev = qdisc_dev(sch);
+ struct tc_prio_qopt_offload opt = {
+ .handle = sch->handle,
+ .parent = sch->parent,
+ };
+
+ if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
+ return -EOPNOTSUPP;
+
+ if (enable) {
+ opt.command = TC_PRIO_REPLACE;
+ opt.replace_params.bands = q->bands;
+ memcpy(&opt.replace_params.priomap, q->prio2band,
+ TC_PRIO_MAX + 1);
+ opt.replace_params.backlog = &sch->qstats.backlog;
+ } else {
+ opt.command = TC_PRIO_DESTROY;
+ }
+
+ return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_PRIO, &opt);
+}
+
static void
prio_destroy(struct Qdisc *sch)
{
@@ -149,6 +174,7 @@ prio_destroy(struct Qdisc *sch)
struct prio_sched_data *q = qdisc_priv(sch);
tcf_block_put(q->block);
+ prio_offload(sch, false);
for (prio = 0; prio < q->bands; prio++)
qdisc_destroy(q->queues[prio]);
}
@@ -204,6 +230,7 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt,
}
sch_tree_unlock(sch);
+ prio_offload(sch, true);
return 0;
}
@@ -223,15 +250,47 @@ static int prio_init(struct Qdisc *sch, struct nlattr *opt,
return prio_tune(sch, opt, extack);
}
+static int prio_dump_offload(struct Qdisc *sch)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ struct tc_prio_qopt_offload hw_stats = {
+ .handle = sch->handle,
+ .parent = sch->parent,
+ .command = TC_PRIO_STATS,
+ .stats.bstats = &sch->bstats,
+ .stats.qstats = &sch->qstats,
+ };
+ int err;
+
+ sch->flags &= ~TCQ_F_OFFLOADED;
+ if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
+ return 0;
+
+ err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_PRIO,
+ &hw_stats);
+ if (err == -EOPNOTSUPP)
+ return 0;
+
+ if (!err)
+ sch->flags |= TCQ_F_OFFLOADED;
+
+ return err;
+}
+
static int prio_dump(struct Qdisc *sch, struct sk_buff *skb)
{
struct prio_sched_data *q = qdisc_priv(sch);
unsigned char *b = skb_tail_pointer(skb);
struct tc_prio_qopt opt;
+ int err;
opt.bands = q->bands;
memcpy(&opt.priomap, q->prio2band, TC_PRIO_MAX + 1);
+ err = prio_dump_offload(sch);
+ if (err)
+ goto nla_put_failure;
+
if (nla_put(skb, TCA_OPTIONS, sizeof(opt), &opt))
goto nla_put_failure;
--
2.14.3
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
2018-01-11 10:21 ` [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc Jiri Pirko
@ 2018-01-11 23:25 ` Jakub Kicinski
2018-01-11 23:50 ` Yuval Mintz
0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2018-01-11 23:25 UTC (permalink / raw)
To: Jiri Pirko
Cc: netdev, nogahf, yuvalm, davem, idosch, mlxsw, jhs, xiyou.wangcong
On Thu, 11 Jan 2018 11:21:00 +0100, Jiri Pirko wrote:
> +struct tc_prio_qopt_offload_params {
> + int bands;
> + u8 priomap[TC_PRIO_MAX + 1];
> + /* In case that a prio qdisc is offloaded and now is changed to a
> + * non-offloadedable config, it needs to update the backlog value
> + * to negate the HW backlog value.
> + */
> + u32 *backlog;
> +};
Could we please pass the full qstats on replace and destroy. This
simplifies the driver code and allows handling the qlen as well as
backlog. Please see the 2 patch series I sent earlier yesterday.
^ permalink raw reply [flat|nested] 21+ messages in thread* RE: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
2018-01-11 23:25 ` Jakub Kicinski
@ 2018-01-11 23:50 ` Yuval Mintz
2018-01-12 0:00 ` Jakub Kicinski
0 siblings, 1 reply; 21+ messages in thread
From: Yuval Mintz @ 2018-01-11 23:50 UTC (permalink / raw)
To: Jakub Kicinski, Jiri Pirko
Cc: netdev@vger.kernel.org, Nogah Frankel, davem@davemloft.net,
Ido Schimmel, mlxsw, jhs@mojatatu.com, xiyou.wangcong@gmail.com
> > +struct tc_prio_qopt_offload_params {
> > + int bands;
> > + u8 priomap[TC_PRIO_MAX + 1];
> > + /* In case that a prio qdisc is offloaded and now is changed to a
> > + * non-offloadedable config, it needs to update the backlog value
> > + * to negate the HW backlog value.
> > + */
> > + u32 *backlog;
> > +};
>
> Could we please pass the full qstats on replace and destroy. This
> simplifies the driver code and allows handling the qlen as well as
> backlog. Please see the 2 patch series I sent earlier yesterday.
That might give the false impression that offloading driver is expected
to correct all the qstats fields during destruction, whereas for most of
them it doesn't seem appropriate.
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
2018-01-11 23:50 ` Yuval Mintz
@ 2018-01-12 0:00 ` Jakub Kicinski
2018-01-12 0:05 ` Yuval Mintz
0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2018-01-12 0:00 UTC (permalink / raw)
To: Yuval Mintz
Cc: Jiri Pirko, netdev@vger.kernel.org, Nogah Frankel,
davem@davemloft.net, Ido Schimmel, mlxsw, jhs@mojatatu.com,
xiyou.wangcong@gmail.com
On Thu, 11 Jan 2018 23:50:27 +0000, Yuval Mintz wrote:
> > > +struct tc_prio_qopt_offload_params {
> > > + int bands;
> > > + u8 priomap[TC_PRIO_MAX + 1];
> > > + /* In case that a prio qdisc is offloaded and now is changed to a
> > > + * non-offloadedable config, it needs to update the backlog value
> > > + * to negate the HW backlog value.
> > > + */
> > > + u32 *backlog;
> > > +};
> >
> > Could we please pass the full qstats on replace and destroy. This
> > simplifies the driver code and allows handling the qlen as well as
> > backlog. Please see the 2 patch series I sent earlier yesterday.
>
> That might give the false impression that offloading driver is expected
> to correct all the qstats fields during destruction, whereas for most of
> them it doesn't seem appropriate.
The driver is supposed to return the momentary stats to their
original/SW-only value. And the driver knows exactly which stats
those are, just look at your patch 5, you handle backlog completely
differently than other stats already.
^ permalink raw reply [flat|nested] 21+ messages in thread* RE: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
2018-01-12 0:00 ` Jakub Kicinski
@ 2018-01-12 0:05 ` Yuval Mintz
2018-01-12 9:20 ` Nogah Frankel
0 siblings, 1 reply; 21+ messages in thread
From: Yuval Mintz @ 2018-01-12 0:05 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Jiri Pirko, netdev@vger.kernel.org, Nogah Frankel,
davem@davemloft.net, Ido Schimmel, mlxsw, jhs@mojatatu.com,
xiyou.wangcong@gmail.com
> > > > +struct tc_prio_qopt_offload_params {
> > > > + int bands;
> > > > + u8 priomap[TC_PRIO_MAX + 1];
> > > > + /* In case that a prio qdisc is offloaded and now is changed to a
> > > > + * non-offloadedable config, it needs to update the backlog value
> > > > + * to negate the HW backlog value.
> > > > + */
> > > > + u32 *backlog;
> > > > +};
> > >
> > > Could we please pass the full qstats on replace and destroy. This
> > > simplifies the driver code and allows handling the qlen as well as
> > > backlog. Please see the 2 patch series I sent earlier yesterday.
> >
> > That might give the false impression that offloading driver is expected
> > to correct all the qstats fields during destruction, whereas for most of
> > them it doesn't seem appropriate.
>
> The driver is supposed to return the momentary stats to their
> original/SW-only value. And the driver knows exactly which stats
> those are, just look at your patch 5, you handle backlog completely
> differently than other stats already.
*we* surely understand that now. I'm just mentioning it might
confuse future offloaders; No strong objection here.
And I agree the alternative [passing pointers to each momentary stat]
is quite ugly.
^ permalink raw reply [flat|nested] 21+ messages in thread* RE: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
2018-01-12 0:05 ` Yuval Mintz
@ 2018-01-12 9:20 ` Nogah Frankel
2018-01-12 9:42 ` Jakub Kicinski
0 siblings, 1 reply; 21+ messages in thread
From: Nogah Frankel @ 2018-01-12 9:20 UTC (permalink / raw)
To: Yuval Mintz, Jakub Kicinski
Cc: Jiri Pirko, netdev@vger.kernel.org, davem@davemloft.net,
Ido Schimmel, mlxsw, jhs@mojatatu.com, xiyou.wangcong@gmail.com
> -----Original Message-----
> From: Yuval Mintz
> Sent: Friday, January 12, 2018 2:05 AM
> To: Jakub Kicinski <kubakici@wp.pl>
> Cc: Jiri Pirko <jiri@resnulli.us>; netdev@vger.kernel.org; Nogah Frankel
> <nogahf@mellanox.com>; davem@davemloft.net; Ido Schimmel
> <idosch@mellanox.com>; mlxsw <mlxsw@mellanox.com>; jhs@mojatatu.com;
> xiyou.wangcong@gmail.com
> Subject: RE: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
>
> > > > > +struct tc_prio_qopt_offload_params {
> > > > > + int bands;
> > > > > + u8 priomap[TC_PRIO_MAX + 1];
> > > > > + /* In case that a prio qdisc is offloaded and now is changed to a
> > > > > + * non-offloadedable config, it needs to update the backlog value
> > > > > + * to negate the HW backlog value.
> > > > > + */
> > > > > + u32 *backlog;
> > > > > +};
> > > >
> > > > Could we please pass the full qstats on replace and destroy. This
> > > > simplifies the driver code and allows handling the qlen as well as
> > > > backlog. Please see the 2 patch series I sent earlier yesterday.
On replace - no problem.
On destroy, I think it is redundant because the qdisc is going to be
deleted anyway.
(To make sure that we are on the same page, if replace in the HW fails, we
"rollback" only the backlog and qlen. The rest of the counters like TX
count should reflect the number of the packets that passed in the HW while
this qdisc was offloaded)
> > >
> > > That might give the false impression that offloading driver is expected
> > > to correct all the qstats fields during destruction, whereas for most of
> > > them it doesn't seem appropriate.
> >
> > The driver is supposed to return the momentary stats to their
> > original/SW-only value. And the driver knows exactly which stats
> > those are, just look at your patch 5, you handle backlog completely
> > differently than other stats already.
>
> *we* surely understand that now. I'm just mentioning it might
> confuse future offloaders; No strong objection here.
> And I agree the alternative [passing pointers to each momentary stat]
> is quite ugly.
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc
2018-01-12 9:20 ` Nogah Frankel
@ 2018-01-12 9:42 ` Jakub Kicinski
0 siblings, 0 replies; 21+ messages in thread
From: Jakub Kicinski @ 2018-01-12 9:42 UTC (permalink / raw)
To: Nogah Frankel
Cc: Yuval Mintz, Jiri Pirko, netdev@vger.kernel.org,
davem@davemloft.net, Ido Schimmel, mlxsw, jhs@mojatatu.com,
xiyou.wangcong@gmail.com
On Fri, 12 Jan 2018 09:20:46 +0000
Nogah Frankel <nogahf@mellanox.com> wrote:
> > > > > Could we please pass the full qstats on replace and destroy.
> > > > > This simplifies the driver code and allows handling the qlen
> > > > > as well as backlog. Please see the 2 patch series I sent
> > > > > earlier yesterday.
>
> On replace - no problem.
> On destroy, I think it is redundant because the qdisc is going to be
> deleted anyway.
Right, the hope was it wouldn't matter and could make the code easier
to follow, but the replace with new qdisc complicates things.
> (To make sure that we are on the same page, if replace in the HW
> fails, we "rollback" only the backlog and qlen. The rest of the
> counters like TX count should reflect the number of the packets that
> passed in the HW while this qdisc was offloaded)
Agreed!
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch net-next 4/5] mlxsw: spectrum: qdiscs: Support PRIO qdisc offload
2018-01-11 10:20 [patch net-next 0/5] mlxsw: Offload PRIO qdisc Jiri Pirko
` (2 preceding siblings ...)
2018-01-11 10:21 ` [patch net-next 3/5] net: sch: prio: Add offload ability to PRIO qdisc Jiri Pirko
@ 2018-01-11 10:21 ` Jiri Pirko
2018-01-11 10:21 ` [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc Jiri Pirko
2018-01-12 13:27 ` [patch net-next 0/5] mlxsw: Offload " Jamal Hadi Salim
5 siblings, 0 replies; 21+ messages in thread
From: Jiri Pirko @ 2018-01-11 10:21 UTC (permalink / raw)
To: netdev; +Cc: nogahf, yuvalm, davem, idosch, mlxsw, jhs, xiyou.wangcong
From: Nogah Frankel <nogahf@mellanox.com>
Add support for offloading PRIO qdisc as root qdisc.
The support is for up to 8 bands.
Routed packets priority is determined by the DSCP field with the default
translations. Bridged packets priority is determined by the PCP field, if
exist, otherwise it is set to 0.
Since both options have only priorities 0-7, higher priorities mapping are
being ignored.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 2 +
drivers/net/ethernet/mellanox/mlxsw/spectrum.h | 2 +
.../net/ethernet/mellanox/mlxsw/spectrum_qdisc.c | 82 ++++++++++++++++++++++
3 files changed, 86 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 54c7d9202e81..f78bfe394966 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1830,6 +1830,8 @@ static int mlxsw_sp_setup_tc(struct net_device *dev, enum tc_setup_type type,
return mlxsw_sp_setup_tc_block(mlxsw_sp_port, type_data);
case TC_SETUP_QDISC_RED:
return mlxsw_sp_setup_tc_red(mlxsw_sp_port, type_data);
+ case TC_SETUP_QDISC_PRIO:
+ return mlxsw_sp_setup_tc_prio(mlxsw_sp_port, type_data);
default:
return -EOPNOTSUPP;
}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index b6f475e83474..16f8fbda0891 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -565,6 +565,8 @@ int mlxsw_sp_tc_qdisc_init(struct mlxsw_sp_port *mlxsw_sp_port);
void mlxsw_sp_tc_qdisc_fini(struct mlxsw_sp_port *mlxsw_sp_port);
int mlxsw_sp_setup_tc_red(struct mlxsw_sp_port *mlxsw_sp_port,
struct tc_red_qopt_offload *p);
+int mlxsw_sp_setup_tc_prio(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct tc_prio_qopt_offload *p);
/* spectrum_fid.c */
int mlxsw_sp_fid_flood_set(struct mlxsw_sp_fid *fid,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
index 273300b75a68..9e83edde7b35 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
@@ -41,9 +41,12 @@
#include "spectrum.h"
#include "reg.h"
+#define MLXSW_SP_PRIO_BAND_TO_TCLASS(band) (IEEE_8021QAZ_MAX_TCS - band - 1)
+
enum mlxsw_sp_qdisc_type {
MLXSW_SP_QDISC_NO_QDISC,
MLXSW_SP_QDISC_RED,
+ MLXSW_SP_QDISC_PRIO,
};
struct mlxsw_sp_qdisc_ops {
@@ -402,6 +405,85 @@ int mlxsw_sp_setup_tc_red(struct mlxsw_sp_port *mlxsw_sp_port,
}
}
+static int
+mlxsw_sp_qdisc_prio_destroy(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc)
+{
+ int i;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++)
+ mlxsw_sp_port_prio_tc_set(mlxsw_sp_port, i,
+ MLXSW_SP_PORT_DEFAULT_TCLASS);
+
+ return 0;
+}
+
+static int
+mlxsw_sp_qdisc_prio_check_params(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc,
+ void *params)
+{
+ struct tc_prio_qopt_offload_params *p = params;
+
+ if (p->bands > IEEE_8021QAZ_MAX_TCS)
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
+static int
+mlxsw_sp_qdisc_prio_replace(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc,
+ void *params)
+{
+ struct tc_prio_qopt_offload_params *p = params;
+ int tclass, i;
+ int err;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ tclass = MLXSW_SP_PRIO_BAND_TO_TCLASS(p->priomap[i]);
+ err = mlxsw_sp_port_prio_tc_set(mlxsw_sp_port, i, tclass);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static struct mlxsw_sp_qdisc_ops mlxsw_sp_qdisc_ops_prio = {
+ .type = MLXSW_SP_QDISC_PRIO,
+ .check_params = mlxsw_sp_qdisc_prio_check_params,
+ .replace = mlxsw_sp_qdisc_prio_replace,
+ .destroy = mlxsw_sp_qdisc_prio_destroy,
+};
+
+int mlxsw_sp_setup_tc_prio(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct tc_prio_qopt_offload *p)
+{
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc;
+
+ if (p->parent != TC_H_ROOT)
+ return -EOPNOTSUPP;
+
+ mlxsw_sp_qdisc = mlxsw_sp_port->root_qdisc;
+ if (p->command == TC_PRIO_REPLACE)
+ return mlxsw_sp_qdisc_replace(mlxsw_sp_port, p->handle,
+ mlxsw_sp_qdisc,
+ &mlxsw_sp_qdisc_ops_prio,
+ &p->replace_params);
+
+ if (!mlxsw_sp_qdisc_compare(mlxsw_sp_qdisc, p->handle,
+ MLXSW_SP_QDISC_PRIO))
+ return -EOPNOTSUPP;
+
+ switch (p->command) {
+ case TC_PRIO_DESTROY:
+ return mlxsw_sp_qdisc_destroy(mlxsw_sp_port, mlxsw_sp_qdisc);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
int mlxsw_sp_tc_qdisc_init(struct mlxsw_sp_port *mlxsw_sp_port)
{
mlxsw_sp_port->root_qdisc = kzalloc(sizeof(*mlxsw_sp_port->root_qdisc),
--
2.14.3
^ permalink raw reply related [flat|nested] 21+ messages in thread* [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-11 10:20 [patch net-next 0/5] mlxsw: Offload PRIO qdisc Jiri Pirko
` (3 preceding siblings ...)
2018-01-11 10:21 ` [patch net-next 4/5] mlxsw: spectrum: qdiscs: Support PRIO qdisc offload Jiri Pirko
@ 2018-01-11 10:21 ` Jiri Pirko
2018-01-12 0:07 ` Jakub Kicinski
2018-01-12 13:27 ` [patch net-next 0/5] mlxsw: Offload " Jamal Hadi Salim
5 siblings, 1 reply; 21+ messages in thread
From: Jiri Pirko @ 2018-01-11 10:21 UTC (permalink / raw)
To: netdev; +Cc: nogahf, yuvalm, davem, idosch, mlxsw, jhs, xiyou.wangcong
From: Nogah Frankel <nogahf@mellanox.com>
Support basic stats for PRIO qdisc, which includes tx packets and bytes
count, drops count and backlog size. The rest of the stats are irrelevant
for this qdisc offload.
Since backlog is not only incremental but reflecting momentary value, in
case of a qdisc that stops being offloaded but is not destroyed, backlog
value needs to be updated about the un-offloading.
For that reason an unoffload function is being added to the ops struct.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
.../net/ethernet/mellanox/mlxsw/spectrum_qdisc.c | 92 ++++++++++++++++++++++
1 file changed, 92 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
index 9e83edde7b35..272c04951e5d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
@@ -66,6 +66,11 @@ struct mlxsw_sp_qdisc_ops {
void *xstats_ptr);
void (*clean_stats)(struct mlxsw_sp_port *mlxsw_sp_port,
struct mlxsw_sp_qdisc *mlxsw_sp_qdisc);
+ /* unoffload - to be used for a qdisc that stops being offloaded without
+ * being destroyed.
+ */
+ void (*unoffload)(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, void *params);
};
struct mlxsw_sp_qdisc {
@@ -73,6 +78,9 @@ struct mlxsw_sp_qdisc {
u8 tclass_num;
union {
struct red_stats red;
+ struct mlxsw_sp_qdisc_prio_stats {
+ u64 backlog;
+ } prio;
} xstats_base;
struct mlxsw_sp_qdisc_stats {
u64 tx_bytes;
@@ -144,6 +152,9 @@ mlxsw_sp_qdisc_replace(struct mlxsw_sp_port *mlxsw_sp_port, u32 handle,
err_bad_param:
err_config:
+ if (mlxsw_sp_qdisc->handle == handle && ops->unoffload)
+ ops->unoffload(mlxsw_sp_port, mlxsw_sp_qdisc, params);
+
mlxsw_sp_qdisc_destroy(mlxsw_sp_port, mlxsw_sp_qdisc);
return err;
}
@@ -450,11 +461,88 @@ mlxsw_sp_qdisc_prio_replace(struct mlxsw_sp_port *mlxsw_sp_port,
return 0;
}
+static void
+mlxsw_sp_qdisc_prio_unoffload(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc,
+ void *params)
+{
+ struct tc_prio_qopt_offload_params *p = params;
+
+ *p->backlog -= mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp,
+ mlxsw_sp_qdisc->xstats_base.prio.backlog);
+}
+
+static int
+mlxsw_sp_qdisc_get_prio_stats(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc,
+ struct tc_qopt_offload_stats *stats_ptr)
+{
+ u64 tx_bytes, tx_packets, drops = 0, backlog = 0;
+ struct mlxsw_sp_qdisc_prio_stats *prio_base;
+ struct mlxsw_sp_qdisc_stats *stats_base;
+ struct mlxsw_sp_port_xstats *xstats;
+ struct rtnl_link_stats64 *stats;
+ int i;
+
+ prio_base = &mlxsw_sp_qdisc->xstats_base.prio;
+ xstats = &mlxsw_sp_port->periodic_hw_stats.xstats;
+ stats = &mlxsw_sp_port->periodic_hw_stats.stats;
+ stats_base = &mlxsw_sp_qdisc->stats_base;
+
+ tx_bytes = stats->tx_bytes - stats_base->tx_bytes;
+ tx_packets = stats->tx_packets - stats_base->tx_packets;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ drops += xstats->tail_drop[i];
+ backlog += xstats->backlog[i];
+ }
+ drops = drops - stats_base->drops;
+
+ _bstats_update(stats_ptr->bstats, tx_bytes, tx_packets);
+ stats_ptr->qstats->drops += drops;
+ stats_ptr->qstats->backlog +=
+ mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp,
+ backlog) -
+ mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp,
+ prio_base->backlog);
+ prio_base->backlog = backlog;
+ stats_base->drops += drops;
+ stats_base->tx_bytes += tx_bytes;
+ stats_base->tx_packets += tx_packets;
+ return 0;
+}
+
+static void
+mlxsw_sp_setup_tc_qdisc_prio_clean_stats(struct mlxsw_sp_port *mlxsw_sp_port,
+ struct mlxsw_sp_qdisc *mlxsw_sp_qdisc)
+{
+ struct mlxsw_sp_qdisc_stats *stats_base;
+ struct mlxsw_sp_port_xstats *xstats;
+ struct rtnl_link_stats64 *stats;
+ int i;
+
+ xstats = &mlxsw_sp_port->periodic_hw_stats.xstats;
+ stats = &mlxsw_sp_port->periodic_hw_stats.stats;
+ stats_base = &mlxsw_sp_qdisc->stats_base;
+
+ stats_base->tx_packets = stats->tx_packets;
+ stats_base->tx_bytes = stats->tx_bytes;
+
+ stats_base->drops = 0;
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++)
+ stats_base->drops += xstats->tail_drop[i];
+
+ mlxsw_sp_qdisc->xstats_base.prio.backlog = 0;
+}
+
static struct mlxsw_sp_qdisc_ops mlxsw_sp_qdisc_ops_prio = {
.type = MLXSW_SP_QDISC_PRIO,
.check_params = mlxsw_sp_qdisc_prio_check_params,
.replace = mlxsw_sp_qdisc_prio_replace,
+ .unoffload = mlxsw_sp_qdisc_prio_unoffload,
.destroy = mlxsw_sp_qdisc_prio_destroy,
+ .get_stats = mlxsw_sp_qdisc_get_prio_stats,
+ .clean_stats = mlxsw_sp_setup_tc_qdisc_prio_clean_stats,
};
int mlxsw_sp_setup_tc_prio(struct mlxsw_sp_port *mlxsw_sp_port,
@@ -479,6 +567,10 @@ int mlxsw_sp_setup_tc_prio(struct mlxsw_sp_port *mlxsw_sp_port,
switch (p->command) {
case TC_PRIO_DESTROY:
return mlxsw_sp_qdisc_destroy(mlxsw_sp_port, mlxsw_sp_qdisc);
+ case TC_PRIO_STATS:
+ return mlxsw_sp_qdisc_get_stats(mlxsw_sp_port, mlxsw_sp_qdisc,
+ &p->stats);
+
default:
return -EOPNOTSUPP;
}
--
2.14.3
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-11 10:21 ` [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc Jiri Pirko
@ 2018-01-12 0:07 ` Jakub Kicinski
2018-01-12 0:39 ` Yuval Mintz
0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2018-01-12 0:07 UTC (permalink / raw)
To: Jiri Pirko
Cc: netdev, nogahf, yuvalm, davem, idosch, mlxsw, jhs, xiyou.wangcong
On Thu, 11 Jan 2018 11:21:02 +0100, Jiri Pirko wrote:
> From: Nogah Frankel <nogahf@mellanox.com>
>
> Support basic stats for PRIO qdisc, which includes tx packets and bytes
> count, drops count and backlog size. The rest of the stats are irrelevant
> for this qdisc offload.
> Since backlog is not only incremental but reflecting momentary value, in
> case of a qdisc that stops being offloaded but is not destroyed, backlog
> value needs to be updated about the un-offloading.
> For that reason an unoffload function is being added to the ops struct.
>
> Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
> Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
> .../net/ethernet/mellanox/mlxsw/spectrum_qdisc.c | 92 ++++++++++++++++++++++
> 1 file changed, 92 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> index 9e83edde7b35..272c04951e5d 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> @@ -66,6 +66,11 @@ struct mlxsw_sp_qdisc_ops {
> void *xstats_ptr);
> void (*clean_stats)(struct mlxsw_sp_port *mlxsw_sp_port,
> struct mlxsw_sp_qdisc *mlxsw_sp_qdisc);
> + /* unoffload - to be used for a qdisc that stops being offloaded without
> + * being destroyed.
> + */
> + void (*unoffload)(struct mlxsw_sp_port *mlxsw_sp_port,
> + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, void *params);
Hm. You you need this just because you didn't add the backlog pointer
to destroy? AFAIK on destroy we are free to reset stats as well, thus
simplifying your driver... Let me know if I misunderstand.
> };
>
> struct mlxsw_sp_qdisc {
> @@ -73,6 +78,9 @@ struct mlxsw_sp_qdisc {
> u8 tclass_num;
> union {
> struct red_stats red;
> + struct mlxsw_sp_qdisc_prio_stats {
> + u64 backlog;
This is not a prio stat, it's a standard qstat. I've added it to
struct mlxsw_sp_qdisc_stats. The reason you need to treat it
separately is that RED has non-standard backlog handling which I'm
trying to fix...
> + } prio;
> } xstats_base;
> struct mlxsw_sp_qdisc_stats {
> u64 tx_bytes;
> @@ -144,6 +152,9 @@ mlxsw_sp_qdisc_replace(struct mlxsw_sp_port *mlxsw_sp_port, u32 handle,
>
> err_bad_param:
> err_config:
> + if (mlxsw_sp_qdisc->handle == handle && ops->unoffload)
> + ops->unoffload(mlxsw_sp_port, mlxsw_sp_qdisc, params);
> +
> mlxsw_sp_qdisc_destroy(mlxsw_sp_port, mlxsw_sp_qdisc);
> return err;
> }
> @@ -479,6 +567,10 @@ int mlxsw_sp_setup_tc_prio(struct mlxsw_sp_port *mlxsw_sp_port,
> switch (p->command) {
> case TC_PRIO_DESTROY:
> return mlxsw_sp_qdisc_destroy(mlxsw_sp_port, mlxsw_sp_qdisc);
> + case TC_PRIO_STATS:
> + return mlxsw_sp_qdisc_get_stats(mlxsw_sp_port, mlxsw_sp_qdisc,
> + &p->stats);
> +
nit: extra new line intentional? :)
> default:
> return -EOPNOTSUPP;
> }
^ permalink raw reply [flat|nested] 21+ messages in thread* RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-12 0:07 ` Jakub Kicinski
@ 2018-01-12 0:39 ` Yuval Mintz
2018-01-12 8:32 ` Jakub Kicinski
0 siblings, 1 reply; 21+ messages in thread
From: Yuval Mintz @ 2018-01-12 0:39 UTC (permalink / raw)
To: Jakub Kicinski, Jiri Pirko
Cc: netdev@vger.kernel.org, Nogah Frankel, davem@davemloft.net,
Ido Schimmel, mlxsw, jhs@mojatatu.com, xiyou.wangcong@gmail.com
> > Support basic stats for PRIO qdisc, which includes tx packets and bytes
> > count, drops count and backlog size. The rest of the stats are irrelevant
> > for this qdisc offload.
> > Since backlog is not only incremental but reflecting momentary value, in
> > case of a qdisc that stops being offloaded but is not destroyed, backlog
> > value needs to be updated about the un-offloading.
> > For that reason an unoffload function is being added to the ops struct.
> >
> > Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
> > Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
> > Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> > ---
> > .../net/ethernet/mellanox/mlxsw/spectrum_qdisc.c | 92
> ++++++++++++++++++++++
> > 1 file changed, 92 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> > index 9e83edde7b35..272c04951e5d 100644
> > --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> > +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
> > @@ -66,6 +66,11 @@ struct mlxsw_sp_qdisc_ops {
> > void *xstats_ptr);
> > void (*clean_stats)(struct mlxsw_sp_port *mlxsw_sp_port,
> > struct mlxsw_sp_qdisc *mlxsw_sp_qdisc);
> > + /* unoffload - to be used for a qdisc that stops being offloaded
> without
> > + * being destroyed.
> > + */
> > + void (*unoffload)(struct mlxsw_sp_port *mlxsw_sp_port,
> > + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, void
> *params);
>
> Hm. You you need this just because you didn't add the backlog pointer
> to destroy? AFAIK on destroy we are free to reset stats as well, thus
> simplifying your driver... Let me know if I misunderstand.
This is meant exactly for the scenario where qdisc didn't get destroyed
yet is no longer offloaded; E.g., if number of bands increased beyond
What we can offload. So we can't reset the statistics in this case.
[Although I might be the one to misunderstand you,
as the 'not destroyed' was explicitly mentioned twice above]
>
> > };
> >
> > struct mlxsw_sp_qdisc {
> > @@ -73,6 +78,9 @@ struct mlxsw_sp_qdisc {
> > u8 tclass_num;
> > union {
> > struct red_stats red;
> > + struct mlxsw_sp_qdisc_prio_stats {
> > + u64 backlog;
>
> This is not a prio stat, it's a standard qstat. I've added it to
> struct mlxsw_sp_qdisc_stats. The reason you need to treat it
> separately is that RED has non-standard backlog handling which I'm
> trying to fix...
>
> > + } prio;
> > } xstats_base;
> > struct mlxsw_sp_qdisc_stats {
> > u64 tx_bytes;
> > @@ -144,6 +152,9 @@ mlxsw_sp_qdisc_replace(struct mlxsw_sp_port
> *mlxsw_sp_port, u32 handle,
> >
> > err_bad_param:
> > err_config:
> > + if (mlxsw_sp_qdisc->handle == handle && ops->unoffload)
> > + ops->unoffload(mlxsw_sp_port, mlxsw_sp_qdisc, params);
> > +
> > mlxsw_sp_qdisc_destroy(mlxsw_sp_port, mlxsw_sp_qdisc);
> > return err;
> > }
>
> > @@ -479,6 +567,10 @@ int mlxsw_sp_setup_tc_prio(struct mlxsw_sp_port
> *mlxsw_sp_port,
> > switch (p->command) {
> > case TC_PRIO_DESTROY:
> > return mlxsw_sp_qdisc_destroy(mlxsw_sp_port,
> mlxsw_sp_qdisc);
> > + case TC_PRIO_STATS:
> > + return mlxsw_sp_qdisc_get_stats(mlxsw_sp_port,
> mlxsw_sp_qdisc,
> > + &p->stats);
> > +
>
> nit: extra new line intentional? :)
>
> > default:
> > return -EOPNOTSUPP;
> > }
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-12 0:39 ` Yuval Mintz
@ 2018-01-12 8:32 ` Jakub Kicinski
2018-01-12 8:46 ` Yuval Mintz
0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2018-01-12 8:32 UTC (permalink / raw)
To: Yuval Mintz
Cc: Jiri Pirko, netdev@vger.kernel.org, Nogah Frankel,
davem@davemloft.net, Ido Schimmel, mlxsw, jhs@mojatatu.com,
xiyou.wangcong@gmail.com
On Fri, 12 Jan 2018 00:39:26 +0000 Yuval Mintz wrote:
> > Hm. You you need this just because you didn't add the backlog
> > pointer to destroy? AFAIK on destroy we are free to reset stats as
> > well, thus simplifying your driver... Let me know if I
> > misunderstand.
>
> This is meant exactly for the scenario where qdisc didn't get
> destroyed yet is no longer offloaded; E.g., if number of bands
> increased beyond What we can offload. So we can't reset the
> statistics in this case. [Although I might be the one to
> misunderstand you, as the 'not destroyed' was explicitly mentioned
> twice above]
I was trying to take some liberty with handling of destroy but your
approach may actually end up being simpler. I will withdraw my series
for now and reuse your new callback once this series lands.
Do you have any objections to changing RED to behave more like prio
(and other qdiscs) in principle?
^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-12 8:32 ` Jakub Kicinski
@ 2018-01-12 8:46 ` Yuval Mintz
2018-01-12 9:29 ` Nogah Frankel
0 siblings, 1 reply; 21+ messages in thread
From: Yuval Mintz @ 2018-01-12 8:46 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Jiri Pirko, netdev@vger.kernel.org, Nogah Frankel,
davem@davemloft.net, Ido Schimmel, mlxsw, jhs@mojatatu.com,
xiyou.wangcong@gmail.com
> > > Hm. You you need this just because you didn't add the backlog
> > > pointer to destroy? AFAIK on destroy we are free to reset stats as
> > > well, thus simplifying your driver... Let me know if I
> > > misunderstand.
> >
> > This is meant exactly for the scenario where qdisc didn't get
> > destroyed yet is no longer offloaded; E.g., if number of bands
> > increased beyond What we can offload. So we can't reset the
> > statistics in this case. [Although I might be the one to
> > misunderstand you, as the 'not destroyed' was explicitly mentioned
> > twice above]
>
> I was trying to take some liberty with handling of destroy but your
> approach may actually end up being simpler. I will withdraw my series
> for now and reuse your new callback once this series lands.
>
> Do you have any objections to changing RED to behave more like prio
> (and other qdiscs) in principle?
>From statistics' perspective? None.
^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-12 8:46 ` Yuval Mintz
@ 2018-01-12 9:29 ` Nogah Frankel
2018-01-12 9:40 ` Jakub Kicinski
0 siblings, 1 reply; 21+ messages in thread
From: Nogah Frankel @ 2018-01-12 9:29 UTC (permalink / raw)
To: Yuval Mintz, Jakub Kicinski
Cc: Jiri Pirko, netdev@vger.kernel.org, davem@davemloft.net,
Ido Schimmel, mlxsw, jhs@mojatatu.com, xiyou.wangcong@gmail.com
> -----Original Message-----
> From: Yuval Mintz
> Sent: Friday, January 12, 2018 10:47 AM
> To: Jakub Kicinski <kubakici@wp.pl>
> Cc: Jiri Pirko <jiri@resnulli.us>; netdev@vger.kernel.org; Nogah Frankel
> <nogahf@mellanox.com>; davem@davemloft.net; Ido Schimmel
> <idosch@mellanox.com>; mlxsw <mlxsw@mellanox.com>; jhs@mojatatu.com;
> xiyou.wangcong@gmail.com
> Subject: RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
>
> > > > Hm. You you need this just because you didn't add the backlog
> > > > pointer to destroy? AFAIK on destroy we are free to reset stats as
> > > > well, thus simplifying your driver... Let me know if I
> > > > misunderstand.
The problem of doing it in destroy is when one qdisc is replacing another.
I want to be able to destroy the old qdisc to "make room" for the new one
before I get the destroy command for the old qdisc (that will come just
after the replace command for the new qdisc).
If I am saying that the destroy changes the stats, I need to save some data
about the old qdisc till I get the destroy command for it.
> > >
> > > This is meant exactly for the scenario where qdisc didn't get
> > > destroyed yet is no longer offloaded; E.g., if number of bands
> > > increased beyond What we can offload. So we can't reset the
> > > statistics in this case. [Although I might be the one to
> > > misunderstand you, as the 'not destroyed' was explicitly mentioned
> > > twice above]
> >
> > I was trying to take some liberty with handling of destroy but your
> > approach may actually end up being simpler. I will withdraw my series
> > for now and reuse your new callback once this series lands.
> >
> > Do you have any objections to changing RED to behave more like prio
> > (and other qdiscs) in principle?
On the contrary, I think it is great.
>
> From statistics' perspective? None.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-12 9:29 ` Nogah Frankel
@ 2018-01-12 9:40 ` Jakub Kicinski
2018-01-12 10:26 ` Nogah Frankel
0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2018-01-12 9:40 UTC (permalink / raw)
To: Nogah Frankel
Cc: Yuval Mintz, Jiri Pirko, netdev@vger.kernel.org,
davem@davemloft.net, Ido Schimmel, mlxsw, jhs@mojatatu.com,
xiyou.wangcong@gmail.com
On Fri, 12 Jan 2018 09:29:22 +0000 Nogah Frankel wrote:
> > -----Original Message-----
> > From: Yuval Mintz
> > Sent: Friday, January 12, 2018 10:47 AM
> > To: Jakub Kicinski <kubakici@wp.pl>
> > Cc: Jiri Pirko <jiri@resnulli.us>; netdev@vger.kernel.org; Nogah
> > Frankel <nogahf@mellanox.com>; davem@davemloft.net; Ido Schimmel
> > <idosch@mellanox.com>; mlxsw <mlxsw@mellanox.com>; jhs@mojatatu.com;
> > xiyou.wangcong@gmail.com
> > Subject: RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support
> > stats for PRIO qdisc
> > > > > Hm. You you need this just because you didn't add the backlog
> > > > > pointer to destroy? AFAIK on destroy we are free to reset
> > > > > stats as well, thus simplifying your driver... Let me know
> > > > > if I misunderstand.
>
> The problem of doing it in destroy is when one qdisc is replacing
> another. I want to be able to destroy the old qdisc to "make room"
> for the new one before I get the destroy command for the old qdisc
> (that will come just after the replace command for the new qdisc).
> If I am saying that the destroy changes the stats, I need to save
> some data about the old qdisc till I get the destroy command for it.
Agreed, maintaining a coherent destroy behavior would be problematic
when successful replace with a new qdisc (e.g. different handle) is
involved :(
Besides the momentary stats seem to be reset before destroy so not
touching them may be in fact more correct. I need to look into the
propagation done in qdisc_tree_reduce_backlog(), it worries me. If
we start stacking the qdiscs (e.g. red on top of prio) it could mess
with the root's backlog...
^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
2018-01-12 9:40 ` Jakub Kicinski
@ 2018-01-12 10:26 ` Nogah Frankel
0 siblings, 0 replies; 21+ messages in thread
From: Nogah Frankel @ 2018-01-12 10:26 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Yuval Mintz, Jiri Pirko, netdev@vger.kernel.org,
davem@davemloft.net, Ido Schimmel, mlxsw, jhs@mojatatu.com,
xiyou.wangcong@gmail.com
> > > > > > Hm. You you need this just because you didn't add the backlog
> > > > > > pointer to destroy? AFAIK on destroy we are free to reset
> > > > > > stats as well, thus simplifying your driver... Let me know
> > > > > > if I misunderstand.
> >
> > The problem of doing it in destroy is when one qdisc is replacing
> > another. I want to be able to destroy the old qdisc to "make room"
> > for the new one before I get the destroy command for the old qdisc
> > (that will come just after the replace command for the new qdisc).
> > If I am saying that the destroy changes the stats, I need to save
> > some data about the old qdisc till I get the destroy command for it.
>
> Agreed, maintaining a coherent destroy behavior would be problematic
> when successful replace with a new qdisc (e.g. different handle) is
> involved :(
>
> Besides the momentary stats seem to be reset before destroy so not
> touching them may be in fact more correct. I need to look into the
> propagation done in qdisc_tree_reduce_backlog(), it worries me. If
> we start stacking the qdiscs (e.g. red on top of prio) it could mess
> with the root's backlog...
I think it can be solved in the driver, or by using the OFFLOAD flag to
avoid this backlog reduce for offloaded qdiscs.
It might make sense in some point to separate the HW statistics from the
SW, at least for the momentary ones.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [patch net-next 0/5] mlxsw: Offload PRIO qdisc
2018-01-11 10:20 [patch net-next 0/5] mlxsw: Offload PRIO qdisc Jiri Pirko
` (4 preceding siblings ...)
2018-01-11 10:21 ` [patch net-next 5/5] mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc Jiri Pirko
@ 2018-01-12 13:27 ` Jamal Hadi Salim
5 siblings, 0 replies; 21+ messages in thread
From: Jamal Hadi Salim @ 2018-01-12 13:27 UTC (permalink / raw)
To: Jiri Pirko, netdev; +Cc: nogahf, yuvalm, davem, idosch, mlxsw, xiyou.wangcong
On 18-01-11 05:20 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> Add an offload support for PRIO qdisc for mlxsw driver.
> PRIO qdisc is being offloaded by using ndo_setup_tc. It has three
> commands, to set or tune the qdisc, to remove it and to get its stats.
>
> Like RED offloading, offloading this qdisc is not enforced on the driver
> and determining its offload state is done in the dump action, when the
> stats are being updated.
> In the driver, offloading of PRIO is supported as root qdisc only. It
> supports only priorities 0-7 (the range that is used by the current static
> mapping of DSCP to skb prio and by 1:1 PCP values mapping) and up to 8
> bands.
>
> Patches 1-2 offload DSCP to priority mapping in the mlxsw_sp driver.
> Patch 3 adds offload support for PRIO qdisc.
> Patches 4-5 Add PRIO offload support in the mlxsw_sp driver.
Strictly DSCP or 802.1p as well?
cheers,
jamal
^ permalink raw reply [flat|nested] 21+ messages in thread