All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v1 0/5] TC: Introduce qevents
@ 2020-06-26 22:45 Petr Machata
  2020-06-26 22:45 ` [PATCH net-next v1 1/5] net: sched: Pass root lock to Qdisc_ops.enqueue Petr Machata
                   ` (7 more replies)
  0 siblings, 8 replies; 32+ messages in thread
From: Petr Machata @ 2020-06-26 22:45 UTC (permalink / raw)
  To: netdev
  Cc: David Miller, Jakub Kicinski, Cong Wang, Eric Dumazet, jiri,
	idosch, Petr Machata

The Spectrum hardware allows execution of one of several actions as a
result of queue management decisions: tail-dropping, early-dropping,
marking a packet, or passing a configured latency threshold or buffer
size. Such packets can be mirrored, trapped, or sampled.

Modeling the action to be taken as simply a TC action is very attractive,
but it is not obvious where to put these actions. At least with ECN marking
one could imagine a tree of qdiscs and classifiers that effectively
accomplishes this task, albeit in an impractically complex manner. But
there is just no way to match on dropped-ness of a packet, let alone
dropped-ness due to a particular reason.

To allow configuring user-defined actions as a result of inner workings of
a qdisc, this patch set introduces a concept of qevents. Those are attach
points for TC blocks, where filters can be put that are executed as the
packet hits well-defined points in the qdisc algorithms. The attached
blocks can be shared, in a manner similar to clsact ingress and egress
blocks, arbitrary classifiers with arbitrary actions can be put on them,
etc.

For example:

# tc qdisc add dev eth0 root handle 1: \
	red limit 500K avpkt 1K qevent early_drop block 10
# tc filter add block 10 \
	matchall action mirred egress mirror dev eth1

The central patch #2 introduces several helpers to allow easy and uniform
addition of qevents to qdiscs: initialization, destruction, qevent block
number change validation, and qevent handling, i.e. dispatch of the filters
attached to the block bound to a qevent.

Patch #1 adds root_lock argument to qdisc enqueue op. The problem this is
tackling is that if a qevent filter pushes packets to the same qdisc tree
that holds the qevent in the first place, attempt to take qdisc root lock
for the second time will lead to a deadlock. To solve the issue, qevent
handler needs to unlock and relock the root lock around the filter
processing. Passing root_lock around makes it possible to get the lock
where it is needed, and visibly so, such that it is obvious the lock will
be used when invoking a qevent.

The following two patches, #3 and #4, then add two qevents to the RED
qdisc: "early_drop" qevent fires when a packet is early-dropped; "mark"
qevent, when it is ECN-marked.

Patch #5 contains a selftest. I have mentioned this test when pushing the
RED ECN nodrop mode and said that "I have no confidence in its portability
to [...] different configurations". That still holds. The backlog and
packet size are tuned to make the test deterministic. But it is better than
nothing, and on the boxes that I ran it on it does work and shows that
qevents work the way they are supposed to, and that their addition has not
broken the other tested features.

This patch set does not deal with offloading. The idea there is that a
driver will be able to figure out that a given block is used in qevent
context by looking at binder type. A future patch-set will add a qdisc
pointer to struct flow_block_offload, which a driver will be able to
consult to glean the TC or other relevant attributes.

Changes from RFC to v1:
- Move a "q = qdisc_priv(sch)" from patch #3 to patch #4
- Fix deadlock caused by mirroring packet back to the same qdisc tree.
- Rename "tail" qevent to "tail_drop".
- Adapt to the new 100-column standard.
- Add a selftest

Petr Machata (5):
  net: sched: Pass root lock to Qdisc_ops.enqueue
  net: sched: Introduce helpers for qevent blocks
  net: sched: sch_red: Split init and change callbacks
  net: sched: sch_red: Add qevents "early_drop" and "mark"
  selftests: forwarding: Add a RED test for SW datapath

 include/net/flow_offload.h                    |   2 +
 include/net/pkt_cls.h                         |  49 ++
 include/net/sch_generic.h                     |   6 +-
 include/uapi/linux/pkt_sched.h                |   2 +
 net/core/dev.c                                |   4 +-
 net/sched/cls_api.c                           | 119 +++++
 net/sched/sch_atm.c                           |   4 +-
 net/sched/sch_blackhole.c                     |   2 +-
 net/sched/sch_cake.c                          |   2 +-
 net/sched/sch_cbq.c                           |   4 +-
 net/sched/sch_cbs.c                           |  18 +-
 net/sched/sch_choke.c                         |   2 +-
 net/sched/sch_codel.c                         |   2 +-
 net/sched/sch_drr.c                           |   4 +-
 net/sched/sch_dsmark.c                        |   4 +-
 net/sched/sch_etf.c                           |   2 +-
 net/sched/sch_ets.c                           |   4 +-
 net/sched/sch_fifo.c                          |   6 +-
 net/sched/sch_fq.c                            |   2 +-
 net/sched/sch_fq_codel.c                      |   2 +-
 net/sched/sch_fq_pie.c                        |   2 +-
 net/sched/sch_generic.c                       |   4 +-
 net/sched/sch_gred.c                          |   2 +-
 net/sched/sch_hfsc.c                          |   6 +-
 net/sched/sch_hhf.c                           |   2 +-
 net/sched/sch_htb.c                           |   4 +-
 net/sched/sch_multiq.c                        |   4 +-
 net/sched/sch_netem.c                         |   8 +-
 net/sched/sch_pie.c                           |   2 +-
 net/sched/sch_plug.c                          |   2 +-
 net/sched/sch_prio.c                          |   6 +-
 net/sched/sch_qfq.c                           |   4 +-
 net/sched/sch_red.c                           | 102 +++-
 net/sched/sch_sfb.c                           |   4 +-
 net/sched/sch_sfq.c                           |   2 +-
 net/sched/sch_skbprio.c                       |   2 +-
 net/sched/sch_taprio.c                        |   4 +-
 net/sched/sch_tbf.c                           |  10 +-
 net/sched/sch_teql.c                          |   4 +-
 .../selftests/net/forwarding/sch_red.sh       | 492 ++++++++++++++++++
 40 files changed, 822 insertions(+), 84 deletions(-)
 create mode 100755 tools/testing/selftests/net/forwarding/sch_red.sh

-- 
2.20.1


^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks
@ 2020-06-27 21:06 kernel test robot
  0 siblings, 0 replies; 32+ messages in thread
From: kernel test robot @ 2020-06-27 21:06 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 6111 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <79417f27b7c57da5c0eb54bb6d074d3a472d9ebf.1593209494.git.petrm@mellanox.com>
References: <79417f27b7c57da5c0eb54bb6d074d3a472d9ebf.1593209494.git.petrm(a)mellanox.com>
TO: Petr Machata <petrm@mellanox.com>

Hi Petr,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Petr-Machata/TC-Introduce-qevents/20200627-064838
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 18c955b730002afdb0f86be39c0d202450acbbfc
:::::: branch date: 22 hours ago
:::::: commit date: 22 hours ago
config: x86_64-randconfig-s021-20200628 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.2-dirty
        # save the attached .config to linux build tree
        make W=1 C=1 ARCH=x86_64 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

   net/sched/cls_api.c:270:22: sparse: sparse: incorrect type in assignment (different base types) @@     expected restricted __be16 [usertype] protocol @@     got unsigned int [usertype] protocol @@
   net/sched/cls_api.c:270:22: sparse:     expected restricted __be16 [usertype] protocol
   net/sched/cls_api.c:270:22: sparse:     got unsigned int [usertype] protocol
   net/sched/cls_api.c:1672:16: sparse: sparse: incompatible types in comparison expression (different address spaces):
   net/sched/cls_api.c:1672:16: sparse:    struct tcf_proto *
   net/sched/cls_api.c:1672:16: sparse:    struct tcf_proto [noderef] __rcu *
   net/sched/cls_api.c:1772:20: sparse: sparse: incompatible types in comparison expression (different address spaces):
   net/sched/cls_api.c:1772:20: sparse:    struct tcf_proto [noderef] __rcu *
   net/sched/cls_api.c:1772:20: sparse:    struct tcf_proto *
   net/sched/cls_api.c:1734:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   net/sched/cls_api.c:1734:25: sparse:    struct tcf_proto [noderef] __rcu *
   net/sched/cls_api.c:1734:25: sparse:    struct tcf_proto *
   net/sched/cls_api.c:1754:16: sparse: sparse: incompatible types in comparison expression (different address spaces):
   net/sched/cls_api.c:1754:16: sparse:    struct tcf_proto *
   net/sched/cls_api.c:1754:16: sparse:    struct tcf_proto [noderef] __rcu *
   net/sched/cls_api.c:1819:25: sparse: sparse: restricted __be16 degrades to integer
   net/sched/cls_api.c:2494:50: sparse: sparse: restricted __be16 degrades to integer
   net/sched/cls_api.c:3725:9: sparse: sparse: context imbalance in 'tc_setup_flow_action' - different lock contexts for basic block
>> net/sched/cls_api.c:3834:28: sparse: sparse: context imbalance in 'tcf_qevent_handle' - unexpected unlock

# https://github.com/0day-ci/linux/commit/7c5fa194990f3016307b41d64095ee09df104e1a
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 7c5fa194990f3016307b41d64095ee09df104e1a
vim +/tcf_qevent_handle +3834 net/sched/cls_api.c

7c5fa194990f30 Petr Machata 2020-06-27  3821  
7c5fa194990f30 Petr Machata 2020-06-27  3822  struct sk_buff *tcf_qevent_handle(struct tcf_qevent *qe, struct Qdisc *sch, struct sk_buff *skb,
7c5fa194990f30 Petr Machata 2020-06-27  3823  				  spinlock_t *root_lock, struct sk_buff **to_free, int *ret)
7c5fa194990f30 Petr Machata 2020-06-27  3824  {
7c5fa194990f30 Petr Machata 2020-06-27  3825  	struct tcf_result cl_res;
7c5fa194990f30 Petr Machata 2020-06-27  3826  	struct tcf_proto *fl;
7c5fa194990f30 Petr Machata 2020-06-27  3827  
7c5fa194990f30 Petr Machata 2020-06-27  3828  	if (!qe->info.block_index)
7c5fa194990f30 Petr Machata 2020-06-27  3829  		return skb;
7c5fa194990f30 Petr Machata 2020-06-27  3830  
7c5fa194990f30 Petr Machata 2020-06-27  3831  	fl = rcu_dereference_bh(qe->filter_chain);
7c5fa194990f30 Petr Machata 2020-06-27  3832  
7c5fa194990f30 Petr Machata 2020-06-27  3833  	if (root_lock)
7c5fa194990f30 Petr Machata 2020-06-27 @3834  		spin_unlock(root_lock);
7c5fa194990f30 Petr Machata 2020-06-27  3835  
7c5fa194990f30 Petr Machata 2020-06-27  3836  	switch (tcf_classify(skb, fl, &cl_res, false)) {
7c5fa194990f30 Petr Machata 2020-06-27  3837  	case TC_ACT_SHOT:
7c5fa194990f30 Petr Machata 2020-06-27  3838  		qdisc_qstats_drop(sch);
7c5fa194990f30 Petr Machata 2020-06-27  3839  		__qdisc_drop(skb, to_free);
7c5fa194990f30 Petr Machata 2020-06-27  3840  		*ret = __NET_XMIT_BYPASS;
7c5fa194990f30 Petr Machata 2020-06-27  3841  		return NULL;
7c5fa194990f30 Petr Machata 2020-06-27  3842  	case TC_ACT_STOLEN:
7c5fa194990f30 Petr Machata 2020-06-27  3843  	case TC_ACT_QUEUED:
7c5fa194990f30 Petr Machata 2020-06-27  3844  	case TC_ACT_TRAP:
7c5fa194990f30 Petr Machata 2020-06-27  3845  		__qdisc_drop(skb, to_free);
7c5fa194990f30 Petr Machata 2020-06-27  3846  		*ret = __NET_XMIT_STOLEN;
7c5fa194990f30 Petr Machata 2020-06-27  3847  		return NULL;
7c5fa194990f30 Petr Machata 2020-06-27  3848  	case TC_ACT_REDIRECT:
7c5fa194990f30 Petr Machata 2020-06-27  3849  		skb_do_redirect(skb);
7c5fa194990f30 Petr Machata 2020-06-27  3850  		*ret = __NET_XMIT_STOLEN;
7c5fa194990f30 Petr Machata 2020-06-27  3851  		return NULL;
7c5fa194990f30 Petr Machata 2020-06-27  3852  	}
7c5fa194990f30 Petr Machata 2020-06-27  3853  
7c5fa194990f30 Petr Machata 2020-06-27  3854  	if (root_lock)
7c5fa194990f30 Petr Machata 2020-06-27  3855  		spin_lock(root_lock);
7c5fa194990f30 Petr Machata 2020-06-27  3856  
7c5fa194990f30 Petr Machata 2020-06-27  3857  	return skb;
7c5fa194990f30 Petr Machata 2020-06-27  3858  }
7c5fa194990f30 Petr Machata 2020-06-27  3859  EXPORT_SYMBOL(tcf_qevent_handle);
7c5fa194990f30 Petr Machata 2020-06-27  3860  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 35447 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2020-07-14  9:12 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-06-26 22:45 [PATCH net-next v1 0/5] TC: Introduce qevents Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 1/5] net: sched: Pass root lock to Qdisc_ops.enqueue Petr Machata
2020-07-06 19:21   ` Cong Wang
2020-07-07 15:25     ` Petr Machata
2020-07-07 19:41       ` Cong Wang
2020-06-26 22:45 ` [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks Petr Machata
2020-07-06 19:48   ` Cong Wang
2020-07-07 15:22     ` Petr Machata
2020-07-07 19:13       ` Cong Wang
2020-07-08 12:35         ` Petr Machata
2020-07-08 16:21           ` Petr Machata
2020-07-08 19:09             ` Cong Wang
2020-07-08 19:04           ` Cong Wang
2020-07-08 21:04             ` Petr Machata
2020-07-09  0:13               ` Petr Machata
2020-07-09 19:37                 ` Cong Wang
2020-07-10 14:40                   ` Petr Machata
2020-07-14  2:51                     ` Cong Wang
2020-07-14  9:12                       ` Petr Machata
2020-07-07 19:48   ` Cong Wang
2020-07-08  9:19     ` Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 3/5] net: sched: sch_red: Split init and change callbacks Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 4/5] net: sched: sch_red: Add qevents "early_drop" and "mark" Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 5/5] selftests: forwarding: Add a RED test for SW datapath Petr Machata
2020-06-26 22:45 ` [PATCH iproute2-next v1 1/4] uapi: pkt_sched: Add two new RED attributes Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 2/4] tc: Add helpers to support qevent handling Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 3/4] man: tc: Describe qevents Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 4/4] tc: q_red: Add support for qevents "mark" and "early_drop" Petr Machata
2020-06-26 22:56 ` [PATCH net-next v1 0/5] TC: Introduce qevents Stephen Hemminger
2020-06-29 13:21   ` Petr Machata
2020-06-30  0:15 ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2020-06-27 21:06 [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks kernel test robot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.