From: Leon Hwang <hffilwlqm@gmail.com>
To: bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
Leon Hwang <hffilwlqm@gmail.com>
Subject: [RFC PATCH bpf-next 1/2] bpf: Add generic kfunc bpf_ffs64()
Date: Wed, 31 Jan 2024 23:56:06 +0800 [thread overview]
Message-ID: <20240131155607.51157-2-hffilwlqm@gmail.com> (raw)
In-Reply-To: <20240131155607.51157-1-hffilwlqm@gmail.com>
On XDP-based virtual network gateway, ffs (aka find first set) algorithm
is used to find the index of the very first 1-value bit in a bitmap,
which is an array of u64, in the gateway's ACL module.
The ACL module was designed from these two papers:
* "eBPF / XDP based firewall and packet filtering"[1]
* "Securing Linux with a Faster and Scalable Iptables"[2]
In the ACL module, the key details are:
1. Match source address to get a bitmap.
2. Match destination address to get a bitmap.
3. Match l4 protocol to get a bitmap.
4. Match source port to get a bitmap.
5. Match destination port to get a bitmap.
Finally, by traversing these 5 bitmaps and doing bitwise-and on 5 u64s
meanwhile, for every bitwise-and result, an u64, if it's not zero, do
ffs to find the index of the very first 1-value bit in the result. When
the index is found, convert it to a rule index of a rule policy bpf map,
whose type is BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_PERCPU_ARRAY.
If __ffs64() kernel function can be reused in bpf, it can save some time in
finding the index of the very first 1-value bit in an u64.
Like AVX2, __ffs64() will be compiled to one instruction, "rep bsf", on
x86.
Then, I do compare bpf-implemented __ffs64() with this kfunc bpf_ffs64()
with following bpf code snippet:
#include "vmlinux.h"
#include "bpf/bpf_helpers.h"
unsigned long bpf_ffs64(u64 word) __ksym;
static __noinline __u64
__ffs64(__u64 word)
{
__u64 shift = 0;
if ((word & 0xffffffff) == 0) {
word >>= 32;
shift += 32;
}
if ((word & 0xffff) == 0) {
word >>= 16;
shift += 16;
}
if ((word & 0xff) == 0) {
word >>= 8;
shift += 8;
}
if ((word & 0xf) == 0) {
word >>= 4;
shift += 4;
}
if ((word & 0x3) == 0) {
word >>= 2;
shift += 2;
}
if ((word & 0x1) == 0) {
shift += 1;
}
return shift;
}
SEC("tc")
int tc_ffs1(struct __sk_buff *skb)
{
void *data_end = (void *)(long) skb->data_end;
u64 *data = (u64 *)(long) skb->data;
if ((void *)(u64) (data + 1) > data_end)
return 0;
return __ffs64(*data);
}
SEC("tc")
int tc_ffs2(struct __sk_buff *skb)
{
void *data_end = (void *)(long) skb->data_end;
u64 *data = (u64 *)(long) skb->data;
if ((void *)(u64) (data + 1) > data_end)
return 0;
return bpf_ffs64(*data);
}
char _license[] SEC("license") = "GPL";
Then, I run them on a KVM-based VM, which runs on a 48 cores and "Intel(R)
Xeon(R) Silver 4116 CPU @ 2.10GHz" CPU server.
As for the 1-value bit offset is 0, and for every time the bpf progs run
for 10000000 times, the average time cost data of bpf progs running is:
+----------+---------------+-------------------+
| Nth time | bpf __ffs64() | kfunc bpf_ffs64() |
+----------+---------------+-------------------+
| 1 | 164ns | 154ns |
| 2 | 166ns | 155ns |
| 3 | 160ns | 154ns |
| 4 | 161ns | 157ns |
| 5 | 161ns | 155ns |
| 6 | 163ns | 155ns |
| 7 | 164ns | 155ns |
| 8 | 159ns | 159ns |
| 9 | 171ns | 154ns |
| 10 | 164ns | 156ns |
| 11 | 161ns | 155ns |
| 12 | 160ns | 155ns |
| 13 | 161ns | 154ns |
| 14 | 165ns | 154ns |
| 15 | 161ns | 162ns |
| 16 | 161ns | 157ns |
| 17 | 164ns | 154ns |
| 18 | 162ns | 154ns |
| 19 | 159ns | 156ns |
| 20 | 160ns | 154ns |
+----------+---------------+-------------------+
As for the 1-value bit offset is 63, and for every time the bpf progs run
for 10000000 times, the average time cost data of bpf progs running is:
+----------+---------------+-------------------+
| Nth time | bpf __ffs64() | kfunc bpf_ffs64() |
+----------+---------------+-------------------+
| 1 | 163ns | 157ns |
| 2 | 163ns | 154ns |
| 3 | 165ns | 155ns |
| 4 | 167ns | 155ns |
| 5 | 165ns | 155ns |
| 6 | 163ns | 155ns |
| 7 | 162ns | 155ns |
| 8 | 162ns | 156ns |
| 9 | 174ns | 155ns |
| 10 | 162ns | 156ns |
| 11 | 168ns | 155ns |
| 12 | 169ns | 156ns |
| 13 | 162ns | 155ns |
| 14 | 169ns | 155ns |
| 15 | 162ns | 154ns |
| 16 | 163ns | 155ns |
| 17 | 162ns | 154ns |
| 18 | 166ns | 154ns |
| 19 | 165ns | 154ns |
| 20 | 165ns | 154ns |
+----------+---------------+-------------------+
As we can see, for every time, bpf __ffs64() costs around 165ns, and
kfunc bpf_ffs64() costs around 155ns. It seems that kfunc bpf_ffs64()
saves 10ns for every time.
If there is 1m PPS on the gateway, kfunc bpf_ffs64() will save much CPU
resource.
Links:
[1] http://vger.kernel.org/lpc_net2018_talks/ebpf-firewall-paper-LPC.pdf
[2] https://mbertrone.github.io/documents/21-Securing_Linux_with_a_Faster_and_Scalable_Iptables.pdf
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
---
kernel/bpf/helpers.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index bcb951a2ecf4b..4db48a6a04a90 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -23,6 +23,7 @@
#include <linux/btf_ids.h>
#include <linux/bpf_mem_alloc.h>
#include <linux/kasan.h>
+#include <linux/bitops.h>
#include "../../lib/kstrtox.h"
@@ -2542,6 +2543,11 @@ __bpf_kfunc void bpf_throw(u64 cookie)
WARN(1, "A call to BPF exception callback should never return\n");
}
+__bpf_kfunc unsigned long bpf_ffs64(u64 word)
+{
+ return __ffs64(word);
+}
+
__bpf_kfunc_end_defs();
BTF_SET8_START(generic_btf_ids)
@@ -2573,6 +2579,7 @@ BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
#endif
BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_throw)
+BTF_ID_FLAGS(func, bpf_ffs64)
BTF_SET8_END(generic_btf_ids)
static const struct btf_kfunc_id_set generic_kfunc_set = {
--
2.42.1
next prev parent reply other threads:[~2024-01-31 15:56 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-31 15:56 [RFC PATCH bpf-next 0/2] bpf: Add generic kfunc bpf_ffs64() Leon Hwang
2024-01-31 15:56 ` Leon Hwang [this message]
2024-01-31 15:56 ` [RFC PATCH bpf-next 2/2] selftests/bpf: Add testcases for " Leon Hwang
2024-02-02 22:18 ` [RFC PATCH bpf-next 0/2] bpf: Add " Andrii Nakryiko
2024-02-04 19:19 ` Yonghong Song
2024-02-05 18:18 ` Andrii Nakryiko
2024-02-05 18:34 ` Yonghong Song
2024-03-03 13:18 ` Leon Hwang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240131155607.51157-2-hffilwlqm@gmail.com \
--to=hffilwlqm@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox