netdev.vger.kernel.org archive mirror
* [RFC PATCH 00/12] drop the qdisc lock for pfifo_fast/mq
@ 2015-12-30 17:50 John Fastabend
  2015-12-30 17:51 ` [RFC PATCH 01/12] lib: array based lock free queue John Fastabend
                   ` (12 more replies)
  0 siblings, 13 replies; 29+ messages in thread
From: John Fastabend @ 2015-12-30 17:50 UTC
  To: daniel, eric.dumazet, jhs, aduyck, brouer, davem
  Cc: john.r.fastabend, netdev, john.fastabend

Hi,

This is a first take at removing the qdisc lock on the xmit path
where qdiscs actually have queues of skbs. The ingress qdisc,
which is already lockless, was "easy", at least in the sense that
we did not need any lock-free data structures to hold skbs.

The series here is experimental at the moment. I decided to post
it to the netdev list when the list of folks I wanted to send it to
privately grew to three or four. Hopefully more people will take
a look at it and give feedback/criticism/whatever. For now I've
only done very basic performance tests, which showed a slight
performance improvement with pfifo_fast. This is somewhat to be
expected, as the dequeue operation in the qdisc only removes a
single skb at a time. A bulk dequeue would presumably be better,
so I'm tinkering with a pfifo_bulk or an option to pfifo_fast to
make that work. All that said, I ran some traffic overnight and my
kernel didn't crash; I did a few interface resets and up/downs, and
functionally everything is still up and running. On the TODO list,
though, is to review all the code paths into/out of sch_generic and
sch_api. At the moment, no promises I didn't miss a path.

The plan of attack here was

 - use the alf_queue (patch 1 from Jesper) and then convert
   pfifo_fast linked list of skbs over to the alf_queue.

 - fixup all the cases where pfifo_fast uses qstats to be per-cpu

 - fixup qlen to support per cpu operations

 - make the gso_skb logic per cpu so any given cpu can park an
   skb when the driver throws an error or we get a cpu collision

 - wrap all the qdisc_lock calls in the xmit path with a wrapper
   that checks for a NOLOCK flag first

 - set the per cpu stats bit and nolock bit in pfifo_fast and
   see if it works.
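
To make the locking and per cpu qlen steps above concrete, here is a
rough user-space sketch. It is not code from the series: every sketch_*
name, the fixed CPU count, and the atomic_flag standing in for the
qdisc lock are assumptions made for illustration only. It shows a lock
wrapper that checks a NOLOCK-style flag first, and a qlen that is kept
per cpu and summed on read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS  4     /* hypothetical fixed CPU count */
#define SKETCH_F_NOLOCK 0x1u  /* stand-in for the series' NOLOCK flag */

struct sketch_qdisc {
	unsigned int flags;
	atomic_flag lock;                /* stand-in for the qdisc lock */
	int qlen_percpu[SKETCH_NR_CPUS]; /* each cpu touches only its own slot */
};

void sketch_qdisc_init(struct sketch_qdisc *q, unsigned int flags)
{
	q->flags = flags;
	atomic_flag_clear(&q->lock);
	for (int i = 0; i < SKETCH_NR_CPUS; i++)
		q->qlen_percpu[i] = 0;
}

/* Take the lock only if the qdisc has not opted out of it.
 * Returns true when the lock was actually taken. */
bool sketch_xmit_lock(struct sketch_qdisc *q)
{
	if (q->flags & SKETCH_F_NOLOCK)
		return false;
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin until the holder releases */
	return true;
}

/* Release the lock only if sketch_xmit_lock() took it. */
void sketch_xmit_unlock(struct sketch_qdisc *q, bool locked)
{
	if (locked)
		atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Adjust the calling cpu's share of qlen; no shared counter is written. */
void sketch_qlen_add(struct sketch_qdisc *q, int cpu, int delta)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	q->qlen_percpu[cpu] += delta;
}

/* The queue length is the sum over all cpus. */
int sketch_qlen_sum(const struct sketch_qdisc *q)
{
	int sum = 0;

	for (int i = 0; i < SKETCH_NR_CPUS; i++)
		sum += q->qlen_percpu[i];
	return sum;
}
```

Note that a single slot can go negative in this model (enqueue on one
cpu, dequeue on another), so only the sum is meaningful.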

On the TODO list,

 - get some performance numbers for various cases. All I've done
   so far is run some basic pktgen tests with a debug kernel and
   a few 'perf record' runs. Both seem to look positive, but I'll
   do some more tests over the next few days.

 - review the code paths some more

 - have some cleanup/improvements/review to do in alf_queue

 - add helpers to remove nasty **void casts in alf_queue ops

 - support bulk dequeue from qdisc either pfifo_fast or new qdisc

 - support mqprio and multiq. multiq lets me run classifiers/actions,
   and with the lockless bit lets multiple cpus run in parallel
   for performance close to mq and mqprio.
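
As a rough picture of the bulk dequeue item above, the sketch below
models an array based queue where one call drains up to a budget of
packets. It is a single-threaded user-space model, not the alf_queue
API and not lock-free; the sketch_ring names and the ring size are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 8u /* hypothetical; power of two so masking works */

struct sketch_ring {
	void *slot[SKETCH_RING_SIZE];
	unsigned int head; /* next slot to fill */
	unsigned int tail; /* next slot to drain */
};

/* Returns 1 on success, 0 when the ring is full. */
int sketch_ring_enqueue(struct sketch_ring *r, void *pkt)
{
	if (r->head - r->tail == SKETCH_RING_SIZE)
		return 0;
	r->slot[r->head & (SKETCH_RING_SIZE - 1)] = pkt;
	r->head++;
	return 1;
}

/* Drain up to 'budget' packets into 'out' in one call.
 * Returns how many packets were actually dequeued. */
unsigned int sketch_ring_dequeue_bulk(struct sketch_ring *r, void **out,
				      unsigned int budget)
{
	unsigned int avail = r->head - r->tail;
	unsigned int n = budget < avail ? budget : avail;

	assert(out != NULL);
	for (unsigned int i = 0; i < n; i++)
		out[i] = r->slot[(r->tail + i) & (SKETCH_RING_SIZE - 1)];
	r->tail += n; /* indices are published once per batch, not per packet */
	return n;
}
```

The point of the batch form is that the index update (and, in a real
lock-free queue, the atomic operation behind it) is paid once per
batch instead of once per skb.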

Another note: in my original take on this I tried to rework some of
the error handling out of the drivers and cpu_collision paths to drop
the gso_skb logic altogether. By using dql we could/should(?) know
if a pkt can be consumed, at least in the ONETX case. I haven't given
up on this, but it got a bit tricky so I dropped it for now.

---

John Fastabend (12):
      lib: array based lock free queue
      net: sched: free per cpu bstats
      net: sched: allow qdiscs to handle locking
      net: sched: provide per cpu qstat helpers
      net: sched: per cpu gso handlers
      net: sched: support qdisc_reset on NOLOCK qdisc
      net: sched: qdisc_qlen for per cpu logic
      net: sched: a dflt qdisc may be used with per cpu stats
      net: sched: pfifo_fast use alf_queue
      net: sched: helper to sum qlen
      net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mq
      net: sched: pfifo_fast new option to deque multiple pkts


 include/linux/alf_queue.h |  368 +++++++++++++++++++++++++++++++++++++++++++++
 include/net/gen_stats.h   |    3 
 include/net/sch_generic.h |  101 ++++++++++++
 lib/Makefile              |    2 
 lib/alf_queue.c           |   42 +++++
 net/core/dev.c            |   20 +-
 net/core/gen_stats.c      |    9 +
 net/sched/sch_generic.c   |  237 +++++++++++++++++++++--------
 net/sched/sch_mq.c        |   25 ++-
 9 files changed, 717 insertions(+), 90 deletions(-)
 create mode 100644 include/linux/alf_queue.h
 create mode 100644 lib/alf_queue.c

--
Signature


end of thread, other threads:[~2016-01-15 19:44 UTC | newest]

Thread overview: 29+ messages
2015-12-30 17:50 [RFC PATCH 00/12] drop the qdisc lock for pfifo_fast/mq John Fastabend
2015-12-30 17:51 ` [RFC PATCH 01/12] lib: array based lock free queue John Fastabend
2016-01-13 19:28   ` Jesper Dangaard Brouer
2015-12-30 17:51 ` [RFC PATCH 02/12] net: sched: free per cpu bstats John Fastabend
2016-01-04 15:21   ` Daniel Borkmann
2016-01-04 17:32     ` Eric Dumazet
2016-01-04 18:08       ` John Fastabend
2015-12-30 17:51 ` [RFC PATCH 03/12] net: sched: allow qdiscs to handle locking John Fastabend
2015-12-30 17:52 ` [RFC PATCH 04/12] net: sched: provide per cpu qstat helpers John Fastabend
2015-12-30 17:52 ` [RFC PATCH 05/12] net: sched: per cpu gso handlers John Fastabend
2015-12-30 20:26   ` Jesper Dangaard Brouer
2015-12-30 20:42     ` John Fastabend
2015-12-30 17:53 ` [RFC PATCH 06/12] net: sched: support qdisc_reset on NOLOCK qdisc John Fastabend
2016-01-01  2:30   ` Alexei Starovoitov
2016-01-03 19:37     ` John Fastabend
2016-01-13 16:20   ` David Miller
2016-01-13 18:03     ` John Fastabend
2016-01-15 19:44       ` David Miller
2015-12-30 17:53 ` [RFC PATCH 07/12] net: sched: qdisc_qlen for per cpu logic John Fastabend
2015-12-30 17:53 ` [RFC PATCH 08/12] net: sched: a dflt qdisc may be used with per cpu stats John Fastabend
2015-12-30 17:54 ` [RFC PATCH 09/12] net: sched: pfifo_fast use alf_queue John Fastabend
2016-01-13 16:24   ` David Miller
2016-01-13 18:18     ` John Fastabend
2015-12-30 17:54 ` [RFC PATCH 10/12] net: sched: helper to sum qlen John Fastabend
2015-12-30 17:55 ` [RFC PATCH 11/12] net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mq John Fastabend
2015-12-30 17:55 ` [RFC PATCH 12/12] net: sched: pfifo_fast new option to deque multiple pkts John Fastabend
2015-12-30 18:13   ` John Fastabend
2016-01-06 13:14 ` [RFC PATCH 00/12] drop the qdisc lock for pfifo_fast/mq Jamal Hadi Salim
2016-01-07 23:30   ` John Fastabend
