From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Ethy H. Brito" <ethy.brito@inexo.com.br>
Cc: "xdp-newbies@vger.kernel.org" <xdp-newbies@vger.kernel.org>,
brouer@redhat.com,
Robert Chacon <robert.chacon@jackrabbitwireless.com>,
Yoel Caspersen <yoel@kviknet.dk>
Subject: Re: Newbie questions
Date: Tue, 22 Jun 2021 11:18:09 +0200
Message-ID: <20210622111809.16a1431e@carbon>
In-Reply-To: <20210621222809.2d7633cc@babalu>
On Mon, 21 Jun 2021 22:28:09 -0300
"Ethy H. Brito" <ethy.brito@inexo.com.br> wrote:
> On Fri, 18 Jun 2021 17:37:17 -0300
> "Ethy H. Brito" <ethy.brito@inexo.com.br> wrote:
>
> > On Fri, 18 Jun 2021 19:40:17 +0200
> > Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > > On Fri, 18 Jun 2021 13:31:06 -0300
> > > "Ethy H. Brito" <ethy.brito@inexo.com.br> wrote:
> > >
> > > > Hi All.
> > > >
> > > > I've been doing some home work reading the docs and some doubts have raised.
> > > > For reference, my environment is
> > > > Ubuntu 20.04
> > > > kernel 5.4.0-66
> > > > tc utility, iproute2-ss200127.
> > > >
> > > > 1) https://xdp-project.net/areas/cpumap.html#cpumap--Create-script-MQ-HTB-silo-setup says that:
> > > > "XPS (Transmit Packet Steering) will take precedence over any changes to
> > > > skb->queue_mapping. You need to disable *XDP* via mask=00 in files
> > > > /sys/class/net/DEV/queues/tx-*/xps_cpus"
> > > >
> > > > Shouldn't it say I need to disable *XPS* (not XDP) using mask=00??
> > >
> > > You are absolutely right it is a typo. Can I ask you to fix that and
> > > send a GitHub PR?
> > >
> > > The file you need to change is:
> > > https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap.org
>
> File edited. PR sent.
Thanks, merged it.
> > >
> > > > How to set that CPU-0 will deal with mq queue 7FFF:1, CPU-1 will deal
> > > > with 7FFF:2, and so on?
> > >
> > > That is the role of the XDP program that redirect into a cpumap, and
> > > the key in the cpumap is the CPU number.
>
> OK. I see that in source code.
Yes, see the explanation in the source code.
Also read: https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/howto_debug.org
The "tc_queue_mapping_kern.c" program[1] is the simplest solution. It
only sets skb->queue_mapping, and you have to configure Linux to set the
correct TC major:minor number on a per-packet basis (e.g. via iptables,
see the comment in the code).
The "tc_classify_kern.c" program[2] is more advanced and implements an
IP-lookup map that holds this[3] config per entry:
 struct ip_hash_info {
	/* lookup key: __u32 IPv4-address */
	__u32 cpu;
	__u32 tc_handle; /* TC handle MAJOR:MINOR combined in __u32 */
 };
[1] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/tc_queue_mapping_kern.c#L40-L76
[2] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/tc_classify_kern.c#L277
[3] https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/common_kern_user.h#L29-L33
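A note on the tc_handle field: a TC handle is two 16-bit halves, MAJOR
in the upper bits and MINOR in the lower bits. Below is a minimal
userspace sketch of that packing. The helper names are made up for
illustration and are not taken from xdp-cpumap-tc.

```c
#include <stdint.h>

/* Pack a TC handle MAJOR:MINOR into one 32-bit value:
 * MAJOR in the upper 16 bits, MINOR in the lower 16 bits.
 * (Hypothetical helper, not part of xdp-cpumap-tc.) */
static inline uint32_t tc_handle_pack(uint16_t major, uint16_t minor)
{
	return ((uint32_t)major << 16) | minor;
}

/* Extract the MAJOR half of a packed handle. */
static inline uint16_t tc_handle_major(uint32_t handle)
{
	return (uint16_t)(handle >> 16);
}

/* Extract the MINOR half of a packed handle. */
static inline uint16_t tc_handle_minor(uint32_t handle)
{
	return (uint16_t)(handle & 0xFFFF);
}
```

With this layout, the class 7FFF:1 from your earlier question packs to
0x7FFF0001.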
> But I am still pretty in the dark here to start using XDP.
Okay, then let me explain some basic concepts for xdp-cpumap-tc.
 1. XDP needs to run on a physical NIC with a driver that supports
    (native) XDP.
 2. XDP is a layer before the network stack, before the "SKB" is created.
 3. XDP redirects the raw frame to another CPU by XDP_REDIRECT'ing it
    into a cpumap.
 4. The cpumap (kthread) running on the remote CPU will create the SKB
    and call the normal network stack on this CPU.
 5. The TC-BPF program running on the remote CPU updates
    skb->queue_mapping (and possibly skb->priority) to map the packet
    into the TC-queue of your choosing.
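To make steps 3 and 5 more concrete, here is a plain userspace C model
of the two decisions. It is not BPF code and not the code from the repo:
the array lookup stands in for the BPF hash map, the names ip_entry,
skb_model, xdp_pick_cpu and tc_classify_model are hypothetical, and the
rule "queue_mapping = cpu + 1" is an assumption made only for this
sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same two values as struct ip_hash_info. The key (IPv4 address) is
 * stored inline only because this model uses an array, not a BPF map. */
struct ip_entry {
	uint32_t ip;        /* lookup key: IPv4 address */
	uint32_t cpu;       /* CPU that should handle this customer */
	uint32_t tc_handle; /* TC handle MAJOR:MINOR combined */
};

/* The two SKB fields the TC-BPF step touches (model only). */
struct skb_model {
	uint16_t queue_mapping;
	uint32_t priority;
};

/* Stand-in for the BPF map lookup; returns NULL on a miss. */
static const struct ip_entry *ip_lookup(const struct ip_entry *tbl,
					size_t n, uint32_t ip)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].ip == ip)
			return &tbl[i];
	return NULL;
}

/* Step 3 (XDP side): pick the cpumap key, i.e. the destination CPU.
 * Returns -1 when the address is unknown. */
static int xdp_pick_cpu(const struct ip_entry *tbl, size_t n, uint32_t ip)
{
	const struct ip_entry *e = ip_lookup(tbl, n, ip);

	return e ? (int)e->cpu : -1;
}

/* Step 5 (TC side): fill in the SKB fields on the remote CPU.
 * Returns 0 on success, -1 when the address is unknown. */
static int tc_classify_model(const struct ip_entry *tbl, size_t n,
			     uint32_t ip, struct skb_model *skb)
{
	const struct ip_entry *e = ip_lookup(tbl, n, ip);

	if (!e)
		return -1;
	/* ASSUMPTION for this sketch: TX queue N+1 belongs to CPU N. */
	skb->queue_mapping = (uint16_t)(e->cpu + 1);
	/* The TC handle goes into skb->priority to select the class. */
	skb->priority = e->tc_handle;
	return 0;
}
```

The point of the model is that both stages consult the same per-IP
entry: the XDP stage uses only the cpu value, while the TC stage uses
both the cpu and the tc_handle.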
Notice that for your scenario there are 4 BPF-progs running: two XDP and
two TC-BPF. See what is running via the cmdline: "bpftool net"
# bpftool net
xdp:
eno49(4) driver id 22
eno50(5) driver id 26
tc:
eno49(4) clsact/egress tc_classify_kern.o:[tc_classify] id 42
eno50(5) clsact/egress tc_classify_kern.o:[tc_classify] id 43
All the BPF-programs share BPF-maps to have the same config.
Maps pinned:
# ls -1 /sys/fs/bpf/tc/globals/
map_ifindex_type
map_ip_hash
map_txq_config
> More newbie questions are necessary.
>
> My goal is simple: to control the bandwidth of a few (or a lot)
> thousands users using an of-the-shelf (almost) box. Two 10Gbps ether
> interface. One internal, one external.
I have access to a production system that has 2x 25Gbit/s NICs (plus
VLANs for each apartment building); let me check how many customers
they have added. They are using "tc_classify_kern.c"[2] and their
IP-map contains 6086 entries (more than I expected, actually).
> What come in thru eth0 goes out to eth0 or eth1 and what comes in
> thru eth1 comes out to eth0.
>
> Is there a road map about what to execute and in what order to
> achieve this task using xdp-cpumap-tc?
This is already available today, and running in production at an ISP.
Sorry for the lack of documentation on how to use it, but it is done.
> I have cloned xdp-cpumap-tc to try figuring it out reading the source code.
> But things did not get together.
>
> For instance, tc_classify_kern.c (as tc_queue_mapping_kern.c) "talks" about a "manuel" (sic)
> setup:
>
> tc qdisc add dev ixgbe2 clsact
> tc filter add dev ixgbe2 egress bpf da obj tc_classify_kern.o sec tc_classify
>
> At what point these commands are to be executed?
> They are not mentioned anywhere else. (tc_mq_htb_setup_example.sh forgot these perhaps?)
This is handled by: tc_classify_user
https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/tc_classify_user.c
The TC commands are called from C-code in this file:
https://github.com/xdp-project/xdp-cpumap-tc/blob/master/src/common_user.c
The roadmap is to convert this to use the new libbpf TC API instead, as
it is a mess to have a dependency on the right iproute2 version.
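For illustration, this is roughly what "calling the TC commands from
C-code" can look like. It is a minimal sketch that only formats the two
command lines you quoted; the function names are hypothetical and this
is not the actual code in common_user.c.

```c
#include <stdio.h>

/* Hypothetical helper: format the command that attaches the clsact
 * qdisc to a device. Returns snprintf()'s result, i.e. the length the
 * full string needs (negative on error). */
static int fmt_clsact_cmd(char *buf, size_t len, const char *dev)
{
	return snprintf(buf, len, "tc qdisc add dev %s clsact", dev);
}

/* Hypothetical helper: format the command that loads a BPF object as a
 * direct-action egress filter from the given ELF section. */
static int fmt_egress_filter_cmd(char *buf, size_t len, const char *dev,
				 const char *obj, const char *sec)
{
	return snprintf(buf, len,
			"tc filter add dev %s egress bpf da obj %s sec %s",
			dev, obj, sec);
}
```

A caller would check that the return value is non-negative and smaller
than len (no truncation) before handing the string to something like
system(3). Going through the tc binary this way is exactly what creates
the iproute2 version dependency.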
> Which one is be to loaded tc_classify_kern or tc_queue_mapping_kern?
> Or both? None? After and before what?
Actually, due to a limitation in the iproute2 loader, you should load
the XDP programs first (as that will create the maps with BTF info).
You cannot load tc_classify_kern and tc_queue_mapping_kern simultaneously.
>
> In the file tc_classify_kern.c, map_ifindex_type is defined
> differently from xdp_iphash_to_cpu_kern.c.
>
> ".size_value = sizeof(struct txq_config)" in the former
> and
> ".size_value = sizeof(__u32)" int the later.
>
> Is this a "Cut and paste" typo? Are they really meant to be two
> different maps?
Hmm... this looks like a copy-paste error. The tc_classify_kern.c
map_ifindex_type should have size_value = sizeof(__u32). It happens to
work because sizeof(struct txq_config) is also 4 bytes.
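A compile-time check is one way to catch this kind of mismatch. The
sketch below assumes that struct txq_config consists of two 16-bit
fields; the field names are my assumption here, so check the real
definition in common_kern_user.h before relying on it.

```c
#include <stdint.h>

/* ASSUMPTION: modelled on struct txq_config as two 16-bit fields;
 * verify the real layout in common_kern_user.h. */
struct txq_config_sketch {
	uint16_t queue_mapping;
	uint16_t htb_major;
};

/* Fails the build if the struct stops being the same size as a
 * 32-bit value, which is the coincidence the two map definitions
 * currently depend on. */
_Static_assert(sizeof(struct txq_config_sketch) == sizeof(uint32_t),
	       "map value sizes differ between the two definitions");
```

The cleaner fix is still to use sizeof(__u32) in both files, so that
nothing depends on the two sizes happening to match.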
> Anyway, a step by step guide would be appreciated.
I'm hoping you will create/document that once you learn how to use these
programs ;-)
> Maybe it is time to start populating that BNG-router repo I was told about.
> How can I start helping with that? Worth doing it?
I think we need to convince other ISPs to join in...
... let me CC those guys again.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer