All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Hwang <hffilwlqm@gmail.com>
To: bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	Leon Hwang <hffilwlqm@gmail.com>
Subject: [RFC PATCH bpf-next 1/2] bpf: Add generic kfunc bpf_ffs64()
Date: Wed, 31 Jan 2024 23:56:06 +0800	[thread overview]
Message-ID: <20240131155607.51157-2-hffilwlqm@gmail.com> (raw)
In-Reply-To: <20240131155607.51157-1-hffilwlqm@gmail.com>

On XDP-based virtual network gateway, ffs (aka find first set) algorithm
is used to find the index of the very first 1-value bit in a bitmap,
which is an array of u64, in the gateway's ACL module.

The ACL module was designed from these two papers:

* "eBPF / XDP based firewall and packet filtering"[1]
* "Securing Linux with a Faster and Scalable Iptables"[2]

In the ACL module, the key details are:

1. Match source address to get a bitmap.
2. Match destination address to get a bitmap.
3. Match l4 protocol to get a bitmap.
4. Match source port to get a bitmap.
5. Match destination port to get a bitmap.

Finally, by traversing these 5 bitmaps and doing bitwise-and on 5 u64s
meanwhile, for every bitwise-and result, an u64, if it's not zero, do
ffs to find the index of the very first 1-value bit in the result. When
the index is found, convert it to a rule index of a rule policy bpf map,
whose type is BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_PERCPU_ARRAY.

If __ffs64() kernel function can be reused in bpf, it can save some time in
finding the index of the very first 1-value bit in an u64.

Like AVX2, __ffs64() will be compiled to one instruction, "rep bsf", on
x86.

Then, I do compare bpf-implemented __ffs64() with this kfunc bpf_ffs64()
with following bpf code snippet:

#include "vmlinux.h"

#include "bpf/bpf_helpers.h"

unsigned long bpf_ffs64(u64 word) __ksym;

static __noinline __u64
__ffs64(__u64 word)
{
	__u64 shift = 0;
	if ((word & 0xffffffff) == 0) {
		word >>= 32;
		shift += 32;
	}
	if ((word & 0xffff) == 0) {
		word >>= 16;
		shift += 16;
	}
	if ((word & 0xff) == 0) {
		word >>= 8;
		shift += 8;
	}
	if ((word & 0xf) == 0) {
		word >>= 4;
		shift += 4;
	}
	if ((word & 0x3) == 0) {
		word >>= 2;
		shift += 2;
	}
	if ((word & 0x1) == 0) {
		shift += 1;
	}

	return shift;
}

SEC("tc")
int tc_ffs1(struct __sk_buff *skb)
{
	void *data_end = (void *)(long) skb->data_end;
	u64 *data = (u64 *)(long) skb->data;

	if ((void *)(u64) (data + 1) > data_end)
		return 0;

	return __ffs64(*data);
}

SEC("tc")
int tc_ffs2(struct __sk_buff *skb)
{
	void *data_end = (void *)(long) skb->data_end;
	u64 *data = (u64 *)(long) skb->data;

	if ((void *)(u64) (data + 1) > data_end)
		return 0;

	return bpf_ffs64(*data);
}

char _license[] SEC("license") = "GPL";

Then, I run them on a KVM-based VM, which runs on a 48 cores and "Intel(R)
Xeon(R) Silver 4116 CPU @ 2.10GHz" CPU server.

As for the 1-value bit offset is 0, and for every time the bpf progs run
for 10000000 times, the average time cost data of bpf progs running is:

+----------+---------------+-------------------+
| Nth time | bpf __ffs64() | kfunc bpf_ffs64() |
+----------+---------------+-------------------+
|        1 | 164ns         | 154ns             |
|        2 | 166ns         | 155ns             |
|        3 | 160ns         | 154ns             |
|        4 | 161ns         | 157ns             |
|        5 | 161ns         | 155ns             |
|        6 | 163ns         | 155ns             |
|        7 | 164ns         | 155ns             |
|        8 | 159ns         | 159ns             |
|        9 | 171ns         | 154ns             |
|       10 | 164ns         | 156ns             |
|       11 | 161ns         | 155ns             |
|       12 | 160ns         | 155ns             |
|       13 | 161ns         | 154ns             |
|       14 | 165ns         | 154ns             |
|       15 | 161ns         | 162ns             |
|       16 | 161ns         | 157ns             |
|       17 | 164ns         | 154ns             |
|       18 | 162ns         | 154ns             |
|       19 | 159ns         | 156ns             |
|       20 | 160ns         | 154ns             |
+----------+---------------+-------------------+

As for the 1-value bit offset is 63, and for every time the bpf progs run
for 10000000 times, the average time cost data of bpf progs running is:

+----------+---------------+-------------------+
| Nth time | bpf __ffs64() | kfunc bpf_ffs64() |
+----------+---------------+-------------------+
|        1 | 163ns         | 157ns             |
|        2 | 163ns         | 154ns             |
|        3 | 165ns         | 155ns             |
|        4 | 167ns         | 155ns             |
|        5 | 165ns         | 155ns             |
|        6 | 163ns         | 155ns             |
|        7 | 162ns         | 155ns             |
|        8 | 162ns         | 156ns             |
|        9 | 174ns         | 155ns             |
|       10 | 162ns         | 156ns             |
|       11 | 168ns         | 155ns             |
|       12 | 169ns         | 156ns             |
|       13 | 162ns         | 155ns             |
|       14 | 169ns         | 155ns             |
|       15 | 162ns         | 154ns             |
|       16 | 163ns         | 155ns             |
|       17 | 162ns         | 154ns             |
|       18 | 166ns         | 154ns             |
|       19 | 165ns         | 154ns             |
|       20 | 165ns         | 154ns             |
+----------+---------------+-------------------+

As we can see, for every time, bpf __ffs64() costs around 165ns, and
kfunc bpf_ffs64() costs around 155ns. It seems that kfunc bpf_ffs64()
saves 10ns for every time.

If there is 1m PPS on the gateway, kfunc bpf_ffs64() will save much CPU
resource.

Links:

[1] http://vger.kernel.org/lpc_net2018_talks/ebpf-firewall-paper-LPC.pdf
[2] https://mbertrone.github.io/documents/21-Securing_Linux_with_a_Faster_and_Scalable_Iptables.pdf

Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
---
 kernel/bpf/helpers.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index bcb951a2ecf4b..4db48a6a04a90 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -23,6 +23,7 @@
 #include <linux/btf_ids.h>
 #include <linux/bpf_mem_alloc.h>
 #include <linux/kasan.h>
+#include <linux/bitops.h>
 
 #include "../../lib/kstrtox.h"
 
@@ -2542,6 +2543,11 @@ __bpf_kfunc void bpf_throw(u64 cookie)
 	WARN(1, "A call to BPF exception callback should never return\n");
 }
 
+__bpf_kfunc unsigned long bpf_ffs64(u64 word)
+{
+	return __ffs64(word);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_SET8_START(generic_btf_ids)
@@ -2573,6 +2579,7 @@ BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
 #endif
 BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_throw)
+BTF_ID_FLAGS(func, bpf_ffs64)
 BTF_SET8_END(generic_btf_ids)
 
 static const struct btf_kfunc_id_set generic_kfunc_set = {
-- 
2.42.1


  reply	other threads:[~2024-01-31 15:56 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-31 15:56 [RFC PATCH bpf-next 0/2] bpf: Add generic kfunc bpf_ffs64() Leon Hwang
2024-01-31 15:56 ` Leon Hwang [this message]
2024-01-31 15:56 ` [RFC PATCH bpf-next 2/2] selftests/bpf: Add testcases for " Leon Hwang
2024-02-02 22:18 ` [RFC PATCH bpf-next 0/2] bpf: Add " Andrii Nakryiko
2024-02-04 19:19   ` Yonghong Song
2024-02-05 18:18     ` Andrii Nakryiko
2024-02-05 18:34       ` Yonghong Song
2024-03-03 13:18         ` Leon Hwang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240131155607.51157-2-hffilwlqm@gmail.com \
    --to=hffilwlqm@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.