[PATCH v8 net-next 0/2] load imm64 insn and uapi/linux/bpf.h

* [PATCH v8 net-next 0/2] load imm64 insn and uapi/linux/bpf.h
@ 2014-08-27 20:37 Alexei Starovoitov
  2014-08-27 20:37 ` [PATCH v8 net-next 1/2] net: filter: add "load 64-bit immediate" eBPF instruction Alexei Starovoitov
       [not found] ` <1409171833-6979-1-git-send-email-ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
  0 siblings, 2 replies; 9+ messages in thread
From: Alexei Starovoitov @ 2014-08-27 20:37 UTC (permalink / raw)
  To: David S. Miller
  Cc: Ingo Molnar, Linus Torvalds, Andy Lutomirski, Steven Rostedt,
	Daniel Borkmann, Chema Gonzalez, Eric Dumazet, Peter Zijlstra,
	Brendan Gregg, Namhyung Kim, H. Peter Anvin, Andrew Morton,
	Kees Cook, linux-api, netdev, linux-kernel

Hi David,

I've been thinking about the minimum first patch set
and came up with the following two patches:

The 1st patch adds the 'load 64-bit immediate' instruction, which by itself
is harmless and is used to load constants only. In the future we may
add pseudo variants of this insn, so user space can request an
internal kernel pointer. A more detailed explanation is in the commit log.

The 2nd patch exposes the eBPF ISA to user space. It moves 55 lines from
filter.h into uapi/linux/bpf.h.
Though there is no way currently to load eBPF programs from user
space, this patch shows the intent that eventually it will be possible.
The main goal here is to unblock LLVM upstreaming process.
Once these two are in, I can start posting LLVM RFCs to llvmdev list
and getting compiler bits in, so by the time bpf syscall and verifier
are in, we may have LLVM backend upstreamed as well.
LLVM wouldn't care what eBPF is used for, whether syscall is used
or some other mechanism. It just compiles C into eBPF ISA.
So these two patches are sufficient to start LLVM upstreaming.

All,
why do we need all of these?
Same reason why we're still using classic BPF and keep trying to extend it.
There are places in kernel where safe dynamic programs are mandatory.
network traffic capture needs in-kernel filtering,
seccomp needs safe mini programs to sandbox applications,
tracing needs them to filter events and so on.

A few LWN articles that explain things way better than my commit logs:
http://lwn.net/Articles/599755/
http://lwn.net/Articles/603983/
http://lwn.net/Articles/606089/
http://lwn.net/Articles/575531/

The first target for eBPF is to have a dtrace equivalent that can be
used in _production_. Safety of programs is paramount, and so is
performance: eBPF programs in tracing should not affect the
performance of production servers, hence the huge effort on optimizing
every last bit.

ebpf+tracing, ebpf+seccomp, ebpf+sockets are the most obvious use cases.
ebpf+ovs is the one we use in large kvm hypervisors.
40Gbps of traffic goes through these programs, so performance
and safety are vital. Performance requirements rule out run-time checks
in the critical path, so the verifier is large mainly because it needs to do
all the checks during static analysis.

I think the next patch set will include a syscall shell with minimal
functionality, a syscall doc and a simple test.

Alexei Starovoitov (2):
  net: filter: add "load 64-bit immediate" eBPF instruction
  net: filter: split filter.h and expose eBPF to user space

 Documentation/networking/filter.txt |    8 +++-
 arch/x86/net/bpf_jit_comp.c         |   17 ++++++++
 include/linux/filter.h              |   74 +++++++++--------------------------
 include/uapi/linux/Kbuild           |    1 +
 include/uapi/linux/bpf.h            |   65 ++++++++++++++++++++++++++++++
 kernel/bpf/core.c                   |    5 +++
 lib/test_bpf.c                      |   21 ++++++++++
 7 files changed, 135 insertions(+), 56 deletions(-)
 create mode 100644 include/uapi/linux/bpf.h

-- 
1.7.9.5


* [PATCH v8 net-next 1/2] net: filter: add "load 64-bit immediate" eBPF instruction
  2014-08-27 20:37 [PATCH v8 net-next 0/2] load imm64 insn and uapi/linux/bpf.h Alexei Starovoitov
@ 2014-08-27 20:37 ` Alexei Starovoitov
       [not found] ` <1409171833-6979-1-git-send-email-ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
  1 sibling, 0 replies; 9+ messages in thread
From: Alexei Starovoitov @ 2014-08-27 20:37 UTC (permalink / raw)
  To: David S. Miller
  Cc: Ingo Molnar, Linus Torvalds, Andy Lutomirski, Steven Rostedt,
	Daniel Borkmann, Chema Gonzalez, Eric Dumazet, Peter Zijlstra,
	Brendan Gregg, Namhyung Kim, H. Peter Anvin, Andrew Morton,
	Kees Cook, linux-api, netdev, linux-kernel

add BPF_LD_IMM64 instruction to load a 64-bit immediate value into a register.
All previous instructions were 8 bytes long. This is the first 16-byte instruction.
Two consecutive 'struct bpf_insn' blocks are interpreted as a single instruction:
insn[0].code = BPF_LD | BPF_DW | BPF_IMM
insn[0].dst_reg = destination register
insn[0].imm = lower 32-bit
insn[1].code = 0
insn[1].imm = upper 32-bit
All unused fields must be zero.
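
A minimal sketch of the two-slot encoding above, not code from this patch.
'struct insn_sketch' is a local stand-in that mirrors 'struct bpf_insn', and
encode_ld_imm64()/decode_ld_imm64() are hypothetical helper names. BPF_LD is
assumed to be 0x00, as in classic BPF:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for 'struct bpf_insn'; field layout follows the patch. */
struct insn_sketch {
	uint8_t code;       /* opcode */
	uint8_t dst_reg:4;  /* dest register */
	uint8_t src_reg:4;  /* source register */
	int16_t off;        /* signed offset */
	int32_t imm;        /* signed immediate constant */
};

/* Opcode pieces: class BPF_LD (assumed 0x00), size BPF_DW, mode BPF_IMM. */
#define SK_BPF_LD  0x00
#define SK_BPF_DW  0x18
#define SK_BPF_IMM 0x00

/* Fill two consecutive slots with one 'load 64-bit immediate' insn. */
void encode_ld_imm64(struct insn_sketch out[2], uint8_t dst, uint64_t imm64)
{
	assert(dst < 16); /* dst_reg is a 4-bit field */

	out[0] = (struct insn_sketch) {
		.code    = SK_BPF_LD | SK_BPF_DW | SK_BPF_IMM,
		.dst_reg = dst,
		.imm     = (int32_t)(uint32_t)imm64,         /* lower 32 bits */
	};
	/* Second slot: opcode 0, every field zero except imm. */
	out[1] = (struct insn_sketch) {
		.imm     = (int32_t)(uint32_t)(imm64 >> 32), /* upper 32 bits */
	};
}

/* Rebuild the 64-bit value; the u32 casts avoid sign extension. */
uint64_t decode_ld_imm64(const struct insn_sketch in[2])
{
	return (uint64_t)(uint32_t)in[0].imm |
	       ((uint64_t)(uint32_t)in[1].imm << 32);
}
```

Encoding 0x567800001234 this way should leave 0x1234 in the first imm field
and 0x5678 in the second, which is the split the test case in this patch checks.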

Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM
which loads 32-bit immediate value into a register.

x64 JITs it as a single 'movabsq $imm64, %rax'
arm64 may JIT it as a sequence of four 'movk x0, #imm16, lsl #shift' insns
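
To make the two JIT strategies concrete, here is a rough sketch. It is not the
JIT code from this patch. emit_movabs_rax() and split_imm16() are hypothetical
helpers. The x86-64 part is fixed to %rax (register number 0) so that no
BPF-to-x86 register mapping is needed. The arm64 part only computes the four
16-bit chunks and leaves instruction encoding out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * x86-64: REX.W prefix (0x48), opcode 0xB8 + register number, then the
 * 64-bit immediate in little-endian byte order. 10 bytes in total.
 */
size_t emit_movabs_rax(uint8_t *buf, size_t cap, uint64_t imm64)
{
	size_t n = 0;
	int i;

	assert(cap >= 10); /* caller must provide room for the full insn */

	buf[n++] = 0x48;   /* REX.W */
	buf[n++] = 0xB8;   /* mov r64, imm64 with r64 = rax */
	for (i = 0; i < 8; i++)
		buf[n++] = (uint8_t)(imm64 >> (8 * i));
	return n;
}

/*
 * arm64: split the immediate into four 16-bit chunks; chunk[i] is the
 * '#imm16' that pairs with 'lsl #(16 * i)'.
 */
void split_imm16(uint64_t imm64, uint16_t chunk[4])
{
	int i;

	for (i = 0; i < 4; i++)
		chunk[i] = (uint16_t)(imm64 >> (16 * i));
}
```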

Note that old eBPF programs are binary compatible with new interpreter.

It helps eBPF programs load 64-bit constant into a register with one
instruction instead of using two registers and 4 instructions:
BPF_MOV32_IMM(R1, imm32)
BPF_ALU64_IMM(BPF_LSH, R1, 32)
BPF_MOV32_IMM(R2, imm32)
BPF_ALU64_REG(BPF_OR, R1, R2)
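
A small simulation of why the two forms are interchangeable, as a sketch only.
It assumes that a 32-bit mov leaves the upper half of the destination register
zero, and that the first imm32 above is the upper half and the second the lower
half. All function names here are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* The four-instruction sequence, one C statement per eBPF insn. */
uint64_t sk_imm64_four_insns(uint32_t upper, uint32_t lower)
{
	uint64_t r1, r2;

	r1 = upper;   /* BPF_MOV32_IMM(R1, upper): upper half of R1 is zero */
	r1 <<= 32;    /* BPF_ALU64_IMM(BPF_LSH, R1, 32) */
	r2 = lower;   /* BPF_MOV32_IMM(R2, lower) */
	r1 |= r2;     /* BPF_ALU64_REG(BPF_OR, R1, R2) */
	return r1;
}

/* The single 16-byte insn: combine insn[0].imm and insn[1].imm. */
uint64_t sk_imm64_one_insn(int32_t imm0, int32_t imm1)
{
	return (uint64_t)(uint32_t)imm0 | ((uint64_t)(uint32_t)imm1 << 32);
}

/* Check that both routes produce the same register value. */
void sk_check_equivalent(uint64_t value)
{
	uint32_t lower = (uint32_t)value;
	uint32_t upper = (uint32_t)(value >> 32);

	assert(sk_imm64_four_insns(upper, lower) == value);
	assert(sk_imm64_one_insn((int32_t)lower, (int32_t)upper) == value);
}
```

The single-instruction form needs no scratch register (R2 in the sequence
above), which is the saving the commit message refers to.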

User space generated programs will use this instruction to load constants only.

To tell kernel that user space needs a pointer the _pseudo_ variant of
this instruction may be added later, which will use extra bits of encoding
to indicate what type of pointer user space is asking kernel to provide.
For example 'off' or 'src_reg' fields can be used for such purpose.
src_reg = 1 could mean that user space is asking kernel to validate and
load in-kernel map pointer.
src_reg = 2 could mean that user space needs readonly data section pointer
src_reg = 3 could mean that user space needs a pointer to per-cpu local data
All such future pseudo instructions will not be carrying the actual pointer
as part of the instruction, but rather will be treated as a request to kernel
to provide one. The kernel will verify the request_for_a_pointer, then
will drop pseudo marking and will store actual internal pointer inside
the instruction, so the end result is the interpreter and JITs never
see pseudo BPF_LD_IMM64 insns and only operate on single generic BPF_LD_IMM64.
User space never operates on direct pointers and verifier can easily
recognize request_for_pointer_pseudo_insn vs other instructions.
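
A purely hypothetical sketch of how such a pseudo insn could be resolved before
the interpreter or JIT sees it. None of this is in the patch: the src_reg = 1
meaning, the table of "map pointers", the use of imm as a handle and all names
below are assumptions made for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for 'struct bpf_insn'. */
struct insn_sketch {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

#define SK_LD_IMM64   0x18  /* BPF_LD | BPF_DW | BPF_IMM */
#define SK_PSEUDO_MAP 1     /* hypothetical: "give me a map pointer" */
#define SK_NR_MAPS    2

/* Made-up values standing in for internal kernel pointers. */
static const uint64_t sk_map_ptrs[SK_NR_MAPS] = {
	0xffff880012340000ULL,
	0xffff880056780000ULL,
};

/*
 * Validate one BPF_LD_IMM64 pair. A pseudo variant is rewritten into a
 * plain BPF_LD_IMM64 that carries the resolved pointer.
 * Returns 0 on success, -1 if the insn is malformed or the request is unknown.
 */
int sk_resolve_ld_imm64(struct insn_sketch insn[2])
{
	uint64_t ptr;
	uint32_t handle;

	assert(insn != NULL);

	if (insn[0].code != SK_LD_IMM64)
		return -1;
	/* All unused fields must be zero. */
	if (insn[0].off != 0 || insn[1].code != 0 || insn[1].dst_reg != 0 ||
	    insn[1].src_reg != 0 || insn[1].off != 0)
		return -1;

	if (insn[0].src_reg == 0)
		return 0;  /* plain constant load, nothing to resolve */
	if (insn[0].src_reg != SK_PSEUDO_MAP)
		return -1; /* unknown request type */

	/* In this sketch user space passes a small handle in imm. */
	handle = (uint32_t)insn[0].imm;
	if (handle >= SK_NR_MAPS)
		return -1;

	ptr = sk_map_ptrs[handle];
	insn[0].imm = (int32_t)(uint32_t)ptr;
	insn[1].imm = (int32_t)(uint32_t)(ptr >> 32);
	insn[0].src_reg = 0; /* drop the pseudo marking */
	return 0;
}
```

After a successful call the pair is an ordinary BPF_LD_IMM64, so later stages
need no knowledge of the pseudo encoding.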

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 Documentation/networking/filter.txt |    8 +++++++-
 arch/x86/net/bpf_jit_comp.c         |   17 +++++++++++++++++
 include/linux/filter.h              |   18 ++++++++++++++++++
 kernel/bpf/core.c                   |    5 +++++
 lib/test_bpf.c                      |   21 +++++++++++++++++++++
 5 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index c48a9704bda8..81916ab5d96f 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -951,7 +951,7 @@ Size modifier is one of ...
 
 Mode modifier is one of:
 
-  BPF_IMM  0x00  /* classic BPF only, reserved in eBPF */
+  BPF_IMM  0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
   BPF_ABS  0x20
   BPF_IND  0x40
   BPF_MEM  0x60
@@ -995,6 +995,12 @@ BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
 Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
 2 byte atomic increments are not supported.
 
+eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
+of two consecutive 'struct bpf_insn' 8-byte blocks and interpreted as single
+instruction that loads 64-bit immediate value into a dst_reg.
+Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM which loads
+32-bit immediate value into a register.
+
 Testing
 -------
 
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b08a98c59530..98837147ee57 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -393,6 +393,23 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
 			EMIT1_off32(add_1reg(0xB8, dst_reg), imm32);
 			break;
 
+		case BPF_LD | BPF_IMM | BPF_DW:
+			if (insn[1].code != 0 || insn[1].src_reg != 0 ||
+			    insn[1].dst_reg != 0 || insn[1].off != 0) {
+				/* verifier must catch invalid insns */
+				pr_err("invalid BPF_LD_IMM64 insn\n");
+				return -EINVAL;
+			}
+
+			/* movabsq %rax, imm64 */
+			EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
+			EMIT(insn[0].imm, 4);
+			EMIT(insn[1].imm, 4);
+
+			insn++;
+			i++;
+			break;
+
 			/* dst %= src, dst /= src, dst %= imm32, dst /= imm32 */
 		case BPF_ALU | BPF_MOD | BPF_X:
 		case BPF_ALU | BPF_DIV | BPF_X:
diff --git a/include/linux/filter.h b/include/linux/filter.h
index a5227ab8ccb1..f3262b598262 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -161,6 +161,24 @@ enum {
 		.off   = 0,					\
 		.imm   = IMM })
 
+/* BPF_LD_IMM64 macro encodes single 'load 64-bit immediate' insn */
+#define BPF_LD_IMM64(DST, IMM)					\
+	BPF_LD_IMM64_RAW(DST, 0, IMM)
+
+#define BPF_LD_IMM64_RAW(DST, SRC, IMM)				\
+	((struct bpf_insn) {					\
+		.code  = BPF_LD | BPF_DW | BPF_IMM,		\
+		.dst_reg = DST,					\
+		.src_reg = SRC,					\
+		.off   = 0,					\
+		.imm   = (__u32) (IMM) }),			\
+	((struct bpf_insn) {					\
+		.code  = 0, /* zero is reserved opcode */	\
+		.dst_reg = 0,					\
+		.src_reg = 0,					\
+		.off   = 0,					\
+		.imm   = ((__u64) (IMM)) >> 32 })
+
 /* Short form of mov based on type, BPF_X: dst_reg = src_reg, BPF_K: dst_reg = imm32 */
 
 #define BPF_MOV64_RAW(TYPE, DST, SRC, IMM)			\
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 7f0dbcbb34af..0434c2170f2b 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -180,6 +180,7 @@ static unsigned int __bpf_prog_run(void *ctx, const struct bpf_insn *insn)
 		[BPF_LD | BPF_IND | BPF_W] = &&LD_IND_W,
 		[BPF_LD | BPF_IND | BPF_H] = &&LD_IND_H,
 		[BPF_LD | BPF_IND | BPF_B] = &&LD_IND_B,
+		[BPF_LD | BPF_IMM | BPF_DW] = &&LD_IMM_DW,
 	};
 	void *ptr;
 	int off;
@@ -239,6 +240,10 @@ select_insn:
 	ALU64_MOV_K:
 		DST = IMM;
 		CONT;
+	LD_IMM_DW:
+		DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
+		insn++;
+		CONT;
 	ALU64_ARSH_X:
 		(*(s64 *) &DST) >>= SRC;
 		CONT;
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 8c66c6aace04..46ab1a7ef135 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -1735,6 +1735,27 @@ static struct bpf_test tests[] = {
 		{ },
 		{ { 1, 0 } },
 	},
+	{
+		"load 64-bit immediate",
+		.u.insns_int = {
+			BPF_LD_IMM64(R1, 0x567800001234L),
+			BPF_MOV64_REG(R2, R1),
+			BPF_MOV64_REG(R3, R2),
+			BPF_ALU64_IMM(BPF_RSH, R2, 32),
+			BPF_ALU64_IMM(BPF_LSH, R3, 32),
+			BPF_ALU64_IMM(BPF_RSH, R3, 32),
+			BPF_ALU64_IMM(BPF_MOV, R0, 0),
+			BPF_JMP_IMM(BPF_JEQ, R2, 0x5678, 1),
+			BPF_EXIT_INSN(),
+			BPF_JMP_IMM(BPF_JEQ, R3, 0x1234, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU64_IMM(BPF_MOV, R0, 1),
+			BPF_EXIT_INSN(),
+		},
+		INTERNAL,
+		{ },
+		{ { 0, 1 } }
+	},
 };
 
 static struct net_device dev;
-- 
1.7.9.5


* [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space
       [not found] ` <1409171833-6979-1-git-send-email-ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
@ 2014-08-27 20:37   ` Alexei Starovoitov
  2014-08-29 17:39     ` Daniel Borkmann
  0 siblings, 1 reply; 9+ messages in thread
From: Alexei Starovoitov @ 2014-08-27 20:37 UTC (permalink / raw)
  To: David S. Miller
  Cc: Ingo Molnar, Linus Torvalds, Andy Lutomirski, Steven Rostedt,
	Daniel Borkmann, Chema Gonzalez, Eric Dumazet, Peter Zijlstra,
	Brendan Gregg, Namhyung Kim, H. Peter Anvin, Andrew Morton,
	Kees Cook, linux-api, netdev, linux-kernel

allow user space to generate eBPF programs

uapi/linux/bpf.h: eBPF instruction set definition

linux/filter.h: the rest

This patch only moves macro definitions, but practically it freezes existing
eBPF instruction set, though new instructions can still be added in the future.

These eBPF definitions cannot go into uapi/linux/filter.h, since the names
may conflict with existing applications.

Full eBPF ISA description is in Documentation/networking/filter.txt

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/filter.h    |   56 +-------------------------------------
 include/uapi/linux/Kbuild |    1 +
 include/uapi/linux/bpf.h  |   65 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+), 55 deletions(-)
 create mode 100644 include/uapi/linux/bpf.h

diff --git a/include/linux/filter.h b/include/linux/filter.h
index f3262b598262..3150666cd4b9 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -9,53 +9,7 @@
 #include <linux/skbuff.h>
 #include <linux/workqueue.h>
 #include <uapi/linux/filter.h>
-
-/* Internally used and optimized filter representation with extended
- * instruction set based on top of classic BPF.
- */
-
-/* instruction classes */
-#define BPF_ALU64	0x07	/* alu mode in double word width */
-
-/* ld/ldx fields */
-#define BPF_DW		0x18	/* double word */
-#define BPF_XADD	0xc0	/* exclusive add */
-
-/* alu/jmp fields */
-#define BPF_MOV		0xb0	/* mov reg to reg */
-#define BPF_ARSH	0xc0	/* sign extending arithmetic shift right */
-
-/* change endianness of a register */
-#define BPF_END		0xd0	/* flags for endianness conversion: */
-#define BPF_TO_LE	0x00	/* convert to little-endian */
-#define BPF_TO_BE	0x08	/* convert to big-endian */
-#define BPF_FROM_LE	BPF_TO_LE
-#define BPF_FROM_BE	BPF_TO_BE
-
-#define BPF_JNE		0x50	/* jump != */
-#define BPF_JSGT	0x60	/* SGT is signed '>', GT in x86 */
-#define BPF_JSGE	0x70	/* SGE is signed '>=', GE in x86 */
-#define BPF_CALL	0x80	/* function call */
-#define BPF_EXIT	0x90	/* function return */
-
-/* Register numbers */
-enum {
-	BPF_REG_0 = 0,
-	BPF_REG_1,
-	BPF_REG_2,
-	BPF_REG_3,
-	BPF_REG_4,
-	BPF_REG_5,
-	BPF_REG_6,
-	BPF_REG_7,
-	BPF_REG_8,
-	BPF_REG_9,
-	BPF_REG_10,
-	__MAX_BPF_REG,
-};
-
-/* BPF has 10 general purpose 64-bit registers and stack frame. */
-#define MAX_BPF_REG	__MAX_BPF_REG
+#include <uapi/linux/bpf.h>
 
 /* ArgX, context and stack frame pointer register positions. Note,
  * Arg1, Arg2, Arg3, etc are used as argument mappings of function
@@ -317,14 +271,6 @@ enum {
 #define SK_RUN_FILTER(filter, ctx) \
 	(*filter->prog->bpf_func)(ctx, filter->prog->insnsi)
 
-struct bpf_insn {
-	__u8	code;		/* opcode */
-	__u8	dst_reg:4;	/* dest register */
-	__u8	src_reg:4;	/* source register */
-	__s16	off;		/* signed offset */
-	__s32	imm;		/* signed immediate constant */
-};
-
 #ifdef CONFIG_COMPAT
 /* A struct sock_filter is architecture independent. */
 struct compat_sock_fprog {
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 24e9033f8b3f..fb3f7b675229 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -67,6 +67,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += bpf.h
 header-y += bpqether.h
 header-y += bsg.h
 header-y += btrfs.h
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
new file mode 100644
index 000000000000..479ed0b6be16
--- /dev/null
+++ b/include/uapi/linux/bpf.h
@@ -0,0 +1,65 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#ifndef _UAPI__LINUX_BPF_H__
+#define _UAPI__LINUX_BPF_H__
+
+#include <linux/types.h>
+
+/* Extended instruction set based on top of classic BPF */
+
+/* instruction classes */
+#define BPF_ALU64	0x07	/* alu mode in double word width */
+
+/* ld/ldx fields */
+#define BPF_DW		0x18	/* double word */
+#define BPF_XADD	0xc0	/* exclusive add */
+
+/* alu/jmp fields */
+#define BPF_MOV		0xb0	/* mov reg to reg */
+#define BPF_ARSH	0xc0	/* sign extending arithmetic shift right */
+
+/* change endianness of a register */
+#define BPF_END		0xd0	/* flags for endianness conversion: */
+#define BPF_TO_LE	0x00	/* convert to little-endian */
+#define BPF_TO_BE	0x08	/* convert to big-endian */
+#define BPF_FROM_LE	BPF_TO_LE
+#define BPF_FROM_BE	BPF_TO_BE
+
+#define BPF_JNE		0x50	/* jump != */
+#define BPF_JSGT	0x60	/* SGT is signed '>', GT in x86 */
+#define BPF_JSGE	0x70	/* SGE is signed '>=', GE in x86 */
+#define BPF_CALL	0x80	/* function call */
+#define BPF_EXIT	0x90	/* function return */
+
+/* Register numbers */
+enum {
+	BPF_REG_0 = 0,
+	BPF_REG_1,
+	BPF_REG_2,
+	BPF_REG_3,
+	BPF_REG_4,
+	BPF_REG_5,
+	BPF_REG_6,
+	BPF_REG_7,
+	BPF_REG_8,
+	BPF_REG_9,
+	BPF_REG_10,
+	__MAX_BPF_REG,
+};
+
+/* BPF has 10 general purpose 64-bit registers and stack frame. */
+#define MAX_BPF_REG	__MAX_BPF_REG
+
+struct bpf_insn {
+	__u8	code;		/* opcode */
+	__u8	dst_reg:4;	/* dest register */
+	__u8	src_reg:4;	/* source register */
+	__s16	off;		/* signed offset */
+	__s32	imm;		/* signed immediate constant */
+};
+
+#endif /* _UAPI__LINUX_BPF_H__ */
-- 
1.7.9.5


* Re: [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space
  2014-08-27 20:37   ` [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space Alexei Starovoitov
@ 2014-08-29 17:39     ` Daniel Borkmann
       [not found]       ` <5400BAB7.80001-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Borkmann @ 2014-08-29 17:39 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Ingo Molnar, Linus Torvalds, Andy Lutomirski,
	Steven Rostedt, Chema Gonzalez, Eric Dumazet, Peter Zijlstra,
	Brendan Gregg, Namhyung Kim, H. Peter Anvin, Andrew Morton,
	Kees Cook, linux-api, netdev, linux-kernel

On 08/27/2014 10:37 PM, Alexei Starovoitov wrote:
> allow user space to generate eBPF programs
>
> uapi/linux/bpf.h: eBPF instruction set definition
>
> linux/filter.h: the rest

Very sorry for being late, but just a thought since we're touching user
space headers anyway ...

Wouldn't it be more consistent to have it organized as follows ...

  - uapi/linux/bpf.h    : classic BPF instruction set parts only
  - uapi/linux/ebpf.h   : eBPF instruction set definition (which also
                          includes uapi/linux/bpf.h though)
... and have ...

  - uapi/linux/filter.h : just include uapi/linux/bpf.h but rest is empty

That way, it would be more consistent ...

Old legacy application can stay with linux/filter.h; new applications
based on their needs can choose between linux/{e,}bpf.h and in the kernel,
we can just include linux/ebpf.h.

Right now, it seems, an eBPF user space program would need to include
2 header files in user space (linux/filter.h, linux/bpf.h) which I find
a bit confusing.

If you want, I could also take care of that later, but just thinking out
loud ...

> This patch only moves macro definitions, but practically it freezes existing
> eBPF instruction set, though new instructions can still be added in the future.
>
> These eBPF definitions cannot go into uapi/linux/filter.h, since the names
> may conflict with existing applications.
>
> Full eBPF ISA description is in Documentation/networking/filter.txt
>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>


* Re: [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space
       [not found]       ` <5400BAB7.80001-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2014-08-29 18:02         ` Alexei Starovoitov
       [not found]           ` <CAADnVQJbgiUK1vt_SDEG6Yee-Ht67e2M82PrHb3Kx533BOF-rg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Alexei Starovoitov @ 2014-08-29 18:02 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, David S. Miller, Ingo Molnar, Linus Torvalds,
	Andy Lutomirski, Steven Rostedt, Chema Gonzalez, Eric Dumazet,
	Peter Zijlstra, Brendan Gregg, Namhyung Kim, H. Peter Anvin,
	Andrew Morton, Kees Cook, Linux API,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Fri, Aug 29, 2014 at 10:39 AM, Daniel Borkmann <dborkman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 08/27/2014 10:37 PM, Alexei Starovoitov wrote:
>>
>> allow user space to generate eBPF programs
>>
>> uapi/linux/bpf.h: eBPF instruction set definition
>>
>> linux/filter.h: the rest
>
>
> Very sorry for being late, but just a thought since we're touching user
> space headers anyway ...
>
> Wouldn't it be more consistent to have it organized as follows ...
>
>  - uapi/linux/bpf.h    : classic BPF instruction set parts only
>  - uapi/linux/ebpf.h   : eBPF instruction set definition (which also
>                          includes uapi/linux/bpf.h though)
> ... and have ...
>
>  - uapi/linux/filter.h : just include uapi/linux/bpf.h but rest is empty
>
> That way, it would be more consistent ...
>
> Old legacy application can stay with linux/filter.h; new applications
> based on their needs can choose between linux/{e,}bpf.h and in the kernel,
> we can just include linux/ebpf.h.
>
> Right now, it seems, an eBPF user space program would need to include
> 2 header files in user space (linux/filter.h, linux/bpf.h) which I find
> a bit confusing.

It's been bugging me as well, but I suspect having it the way you
described won't work. Mainly because we cannot do include <uapi/..>
inside uapi/*.h, so we would need to do include <linux/bpf.h>
inside uapi/linux/filter.h, but that will cause serious include path
confusion. That was the reason I didn't simply do include <linux/filter.h>
inside uapi/linux/bpf.h

Also I really dislike 'ebpf' name in all lower case. If we make such header
file name, we would need to rename all macros and function names
to EBPF_... which I find very ugly looking. I think all good abbreviations are
three letters :)
So I very much prefer bpf.h as a main file name.
Later we can move some of old classic BPF defines into
uapi/linux/bpf_common.h and then include it in both uapi/linux/bpf.h
and in uapi/linux/filter.h, then the nuisance of two include files for
user space will go away. Classic users will keep using linux/filter.h
and new apps will include linux/bpf.h only.
I think we should probably do such header optimization later and very carefully.
I'm a bit afraid to touch uapi/linux/filter.h since it's used in so
many user apps.


* Re: [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space
       [not found]           ` <CAADnVQJbgiUK1vt_SDEG6Yee-Ht67e2M82PrHb3Kx533BOF-rg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-08-29 22:24             ` Daniel Borkmann
       [not found]               ` <5400FDA0.7000704-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Borkmann @ 2014-08-29 22:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S. Miller, Ingo Molnar, Linus Torvalds,
	Andy Lutomirski, Steven Rostedt, Chema Gonzalez, Eric Dumazet,
	Peter Zijlstra, Brendan Gregg, Namhyung Kim, H. Peter Anvin,
	Andrew Morton, Kees Cook, Linux API,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 08/29/2014 08:02 PM, Alexei Starovoitov wrote:
> On Fri, Aug 29, 2014 at 10:39 AM, Daniel Borkmann <dborkman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> On 08/27/2014 10:37 PM, Alexei Starovoitov wrote:
>>>
>>> allow user space to generate eBPF programs
>>>
>>> uapi/linux/bpf.h: eBPF instruction set definition
>>>
>>> linux/filter.h: the rest
>>
>> Very sorry for being late, but just a thought since we're touching user
>> space headers anyway ...
>>
>> Wouldn't it be more consistent to have it organized as follows ...
>>
>>   - uapi/linux/bpf.h    : classic BPF instruction set parts only
>>   - uapi/linux/ebpf.h   : eBPF instruction set definition (which also
>>                           includes uapi/linux/bpf.h though)
>> ... and have ...
>>
>>   - uapi/linux/filter.h : just include uapi/linux/bpf.h but rest is empty
>>
>> That way, it would be more consistent ...
>>
>> Old legacy application can stay with linux/filter.h; new applications
>> based on their needs can choose between linux/{e,}bpf.h and in the kernel,
>> we can just include linux/ebpf.h.
>>
>> Right now, it seems, an eBPF user space program would need to include
>> 2 header files in user space (linux/filter.h, linux/bpf.h) which I find
>> a bit confusing.
>
> It's been bugging me as well, but I suspect having it the way you
> described won't work. Mainly because we cannot do include <uapi/..>
> inside uapi/*.h, so we would need to do include <linux/bpf.h>
> inside uapi/linux/filter.h, but that will cause serious include path
> confusion. That was the reason I didn't simply do include <linux/filter.h>
> inside uapi/linux/bpf.h
>
> Also I really dislike 'ebpf' name in all lower case. If we make such header
> file name, we would need to rename all macros and function names
> to EBPF_... which I find very ugly looking. I think all good abbreviations are
> three letters :)

I don't think we would have to name defines that way, really, that would be
terrible. We can keep them simply *as is*. Not sure though why bpf.h + ebpf.h
would be that bad. ;) I haven't tried it out yet, but if we would indeed run
into a name collision, above proposal would resolve that.


* Re: [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space
       [not found]               ` <5400FDA0.7000704-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2014-08-29 23:01                 ` Alexei Starovoitov
  2014-08-30  6:22                   ` Daniel Borkmann
       [not found]                   ` <CAMEtUuyRUujYhRsH9aUx0h7wvU1DrKRHNWZtoOYEgHVfKdCTxw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 2 replies; 9+ messages in thread
From: Alexei Starovoitov @ 2014-08-29 23:01 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, David S. Miller, Ingo Molnar, Linus Torvalds,
	Andy Lutomirski, Steven Rostedt, Chema Gonzalez, Eric Dumazet,
	Peter Zijlstra, Brendan Gregg, Namhyung Kim, H. Peter Anvin,
	Andrew Morton, Kees Cook, Linux API,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Fri, Aug 29, 2014 at 3:24 PM, Daniel Borkmann <dborkman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>>
>> Also I really dislike 'ebpf' name in all lower case. If we make such
>> header
>> file name, we would need to rename all macros and function names
>> to EBPF_... which I find very ugly looking. I think all good abbreviations
>> are
>> three letters :)
>
>
> I don't think we would have to name defines that way, really, that would be
> terrible. We can keep them simply *as is*. Not sure though why bpf.h +
> ebpf.h
> would be that bad. ;) I haven't tried it out yet, but if we would indeed run
> into a name collision, above proposal would resolve that.

imo it's a consistency issue. If main uapi header is ebpf.h then
corresponding kernel internal header should be ebpf.h as well
and kernel/ebpf/ directory and so on.
That's why I insist on uapi/linux/bpf.h and no other name.
Note I didn't move any of the BPF_ALU64_REG, BPF_ALU32_IMM
macros from linux/filter.h. Without them my verifier testsuite
won't compile, so more lines would be added to bpf.h in the future.
At that time we can take 45 lines out of uapi/linux/filter.h and move
them into bpf_common.h. My request is let's not fight about it
right now. We didn't even cross the bridge yet and arguing
about beauty of user apps that come in 30 patches from now...

These two patches are about _intent_ of making eBPF usable
from userspace, so I can move along with llvm.
Also worth noting that llvm will not be including this uapi/linux/bpf.h
It has its own infra to generate instructions. Look at:
tools/bpf/llvm/lib/Target/BPF/BPFInstrInfo.td
it's a special 'table definition' language for describing bits and fields
of instructions.
So these two patches are mainly establishing _intent_ and bpf.h file
name. That's why I'm so paranoid about naming.

btw, I've spent last two days writing syscall manpage :(
What is the best way to present it for review?
If I just attach it raw, it's unreadable... I can include a link
to html page, but man2html produces ugly pages comparing
to what 'man' command shows. Any nice man converters
that generate stuff seen on man7.org ?


* Re: [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space
  2014-08-29 23:01                 ` Alexei Starovoitov
@ 2014-08-30  6:22                   ` Daniel Borkmann
       [not found]                   ` <CAMEtUuyRUujYhRsH9aUx0h7wvU1DrKRHNWZtoOYEgHVfKdCTxw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 0 replies; 9+ messages in thread
From: Daniel Borkmann @ 2014-08-30  6:22 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S. Miller, Ingo Molnar, Linus Torvalds,
	Andy Lutomirski, Steven Rostedt, Chema Gonzalez, Eric Dumazet,
	Peter Zijlstra, Brendan Gregg, Namhyung Kim, H. Peter Anvin,
	Andrew Morton, Kees Cook, Linux API, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 08/30/2014 01:01 AM, Alexei Starovoitov wrote:
...
> btw, I've spent last two days writing syscall manpage :(
> What is the best way to present it for review?
> If I just attach it raw, it's unreadable... I can include a link
> to html page, but man2html produces ugly pages comparing
> to what 'man' command shows. Any nice man converters
> that generate stuff seen on man7.org ?

What about :

   man foo > bar

And then copy that into your mail client?


* Re: [PATCH v8 net-next 2/2] net: filter: split filter.h and expose eBPF to user space
       [not found]                   ` <CAMEtUuyRUujYhRsH9aUx0h7wvU1DrKRHNWZtoOYEgHVfKdCTxw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-08-30  7:48                     ` Daniel Borkmann
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Borkmann @ 2014-08-30  7:48 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S. Miller, Ingo Molnar, Linus Torvalds,
	Andy Lutomirski, Steven Rostedt, Chema Gonzalez, Eric Dumazet,
	Peter Zijlstra, Brendan Gregg, Namhyung Kim, H. Peter Anvin,
	Andrew Morton, Kees Cook, Linux API,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]

On 08/30/2014 01:01 AM, Alexei Starovoitov wrote:
...
> imo it's a consistency issue. If main uapi header is ebpf.h then
> corresponding kernel internal header should be ebpf.h as well
> and kernel/ebpf/ directory and so on.

I don't think that has to be enforced, but fair enough, if you
feel that way.

> That's why I insist on uapi/linux/bpf.h and no other name.
...
> them into bpf_common.h. My request is let's not fight about it
> right now. We didn't even cross the bridge yet and arguing
> about beauty of user apps that come in 30 patches from now...
...
> So these two patches are mainly establishing _intent_ and bpf.h file
> name. That's why I'm so paranoid about naming.

I understand, and that's why I said in my previous email that it could
also be resolved later (at the latest before it ships, though). I just
wanted to give this some thought ...

I have attached one example. It doesn't have to be that way, but it's
one possibility if you want to stay with linux/bpf.h only.
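
For readers skimming the attachment, the mechanism it relies on can be
sketched in a single file. This is only an illustration, not part of the
patch: every DEMO_* name and demo_ebpf_visible() are hypothetical
stand-ins. In the attached patch the opt-out macro is __WITHOUT_EBPF,
defined by linux/filter.h before it includes linux/bpf.h.

```c
#include <assert.h>

/* Stand-in for what linux/filter.h does in the attached patch:
 * define the opt-out macro before pulling in the shared header.
 * (DEMO_WITHOUT_EBPF is a hypothetical name for this sketch.)
 */
#define DEMO_WITHOUT_EBPF

/* Stand-in for the body of the proposed linux/bpf.h. */
#ifndef DEMO_UAPI_BPF_H
#define DEMO_UAPI_BPF_H

#define DEMO_BPF_ALU	0x04	/* classic definition: always visible */

#ifndef DEMO_WITHOUT_EBPF
#define DEMO_BPF_ALU64	0x07	/* extended definition: hidden on opt-out */
#endif

#endif /* DEMO_UAPI_BPF_H */

/* Returns 1 if the extended definitions are visible in this
 * translation unit, 0 otherwise.
 */
int demo_ebpf_visible(void)
{
#ifdef DEMO_BPF_ALU64
	return 1;
#else
	return 0;
#endif
}
```

With the opt-out macro defined first, only the classic definition is
visible. One consequence of this layout, as far as the include guard in
the attachment goes: a file that includes linux/filter.h before
linux/bpf.h would not see the eBPF part either, because the guard is
already set by the first inclusion.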

[-- Attachment #2: 0001-net-filter-split-filter.h-and-expose-eBPF-to-user-sp.patch --]
[-- Type: text/x-patch, Size: 12291 bytes --]

From b359aeec95b81262f352f7613178949b94b9a097 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
Date: Wed, 27 Aug 2014 13:37:13 -0700
Subject: [PATCH] net: filter: split filter.h and expose eBPF to user space

Signed-off-by: Alexei Starovoitov <ast-uqk4Ao+rVK5Wk0Htik3J/w@public.gmane.org>
---
 include/linux/filter.h      |  57 +-------------
 include/uapi/linux/Kbuild   |   1 +
 include/uapi/linux/bpf.h    | 179 ++++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/filter.h | 142 ++---------------------------------
 4 files changed, 186 insertions(+), 193 deletions(-)
 create mode 100644 include/uapi/linux/bpf.h

diff --git a/include/linux/filter.h b/include/linux/filter.h
index f3262b5..f2dd63a 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -8,54 +8,7 @@
 #include <linux/compat.h>
 #include <linux/skbuff.h>
 #include <linux/workqueue.h>
-#include <uapi/linux/filter.h>
-
-/* Internally used and optimized filter representation with extended
- * instruction set based on top of classic BPF.
- */
-
-/* instruction classes */
-#define BPF_ALU64	0x07	/* alu mode in double word width */
-
-/* ld/ldx fields */
-#define BPF_DW		0x18	/* double word */
-#define BPF_XADD	0xc0	/* exclusive add */
-
-/* alu/jmp fields */
-#define BPF_MOV		0xb0	/* mov reg to reg */
-#define BPF_ARSH	0xc0	/* sign extending arithmetic shift right */
-
-/* change endianness of a register */
-#define BPF_END		0xd0	/* flags for endianness conversion: */
-#define BPF_TO_LE	0x00	/* convert to little-endian */
-#define BPF_TO_BE	0x08	/* convert to big-endian */
-#define BPF_FROM_LE	BPF_TO_LE
-#define BPF_FROM_BE	BPF_TO_BE
-
-#define BPF_JNE		0x50	/* jump != */
-#define BPF_JSGT	0x60	/* SGT is signed '>', GT in x86 */
-#define BPF_JSGE	0x70	/* SGE is signed '>=', GE in x86 */
-#define BPF_CALL	0x80	/* function call */
-#define BPF_EXIT	0x90	/* function return */
-
-/* Register numbers */
-enum {
-	BPF_REG_0 = 0,
-	BPF_REG_1,
-	BPF_REG_2,
-	BPF_REG_3,
-	BPF_REG_4,
-	BPF_REG_5,
-	BPF_REG_6,
-	BPF_REG_7,
-	BPF_REG_8,
-	BPF_REG_9,
-	BPF_REG_10,
-	__MAX_BPF_REG,
-};
-
-/* BPF has 10 general purpose 64-bit registers and stack frame. */
-#define MAX_BPF_REG	__MAX_BPF_REG
+#include <uapi/linux/bpf.h>
 
 /* ArgX, context and stack frame pointer register positions. Note,
  * Arg1, Arg2, Arg3, etc are used as argument mappings of function
@@ -317,14 +270,6 @@ enum {
 #define SK_RUN_FILTER(filter, ctx) \
 	(*filter->prog->bpf_func)(ctx, filter->prog->insnsi)
 
-struct bpf_insn {
-	__u8	code;		/* opcode */
-	__u8	dst_reg:4;	/* dest register */
-	__u8	src_reg:4;	/* source register */
-	__s16	off;		/* signed offset */
-	__s32	imm;		/* signed immediate constant */
-};
-
 #ifdef CONFIG_COMPAT
 /* A struct sock_filter is architecture independent. */
 struct compat_sock_fprog {
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 24e9033..fb3f7b6 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -67,6 +67,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += bpf.h
 header-y += bpqether.h
 header-y += bsg.h
 header-y += btrfs.h
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
new file mode 100644
index 0000000..76138c2
--- /dev/null
+++ b/include/uapi/linux/bpf.h
@@ -0,0 +1,179 @@
+#ifndef __UAPI_BPF_H
+#define __UAPI_BPF_H
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+/* Current version of the filter code architecture. */
+#define BPF_MAJOR_VERSION 1
+#define BPF_MINOR_VERSION 1
+
+/* Try and keep these values and structures similar to BSD,
+ * especially the BPF code definitions which need to match
+ * so you can share filters.
+ */
+struct sock_filter {	/* Filter block */
+	__u16	code;	/* Actual filter code */
+	__u8	jt;	/* Jump true */
+	__u8	jf;	/* Jump false */
+	__u32	k;	/* Generic multiuse field */
+};
+
+struct sock_fprog {			/* Required for SO_ATTACH_FILTER. */
+	unsigned short		len;	/* Number of filter blocks */
+	struct sock_filter __user *filter;
+};
+
+/* Instruction classes */
+#define BPF_CLASS(code) ((code) & 0x07)
+#define  BPF_LD			  0x00
+#define  BPF_LDX		  0x01
+#define  BPF_ST			  0x02
+#define  BPF_STX		  0x03
+#define  BPF_ALU		  0x04
+#define  BPF_JMP		  0x05
+#define  BPF_RET		  0x06
+#define  BPF_MISC		  0x07
+
+/* ld/ldx fields */
+#define BPF_SIZE(code)  ((code) & 0x18)
+#define  BPF_W			  0x00
+#define  BPF_H			  0x08
+#define  BPF_B			  0x10
+#define BPF_MODE(code)  ((code) & 0xe0)
+#define  BPF_IMM		  0x00
+#define  BPF_ABS		  0x20
+#define  BPF_IND		  0x40
+#define  BPF_MEM		  0x60
+#define  BPF_LEN		  0x80
+#define  BPF_MSH		  0xa0
+
+/* alu/jmp fields */
+#define BPF_OP(code)    ((code) & 0xf0)
+#define  BPF_ADD		  0x00
+#define  BPF_SUB		  0x10
+#define  BPF_MUL		  0x20
+#define  BPF_DIV		  0x30
+#define  BPF_OR			  0x40
+#define  BPF_AND		  0x50
+#define  BPF_LSH		  0x60
+#define  BPF_RSH		  0x70
+#define  BPF_NEG		  0x80
+#define  BPF_MOD		  0x90
+#define  BPF_XOR		  0xa0
+
+#define  BPF_JA			  0x00
+#define  BPF_JEQ		  0x10
+#define  BPF_JGT		  0x20
+#define  BPF_JGE		  0x30
+#define  BPF_JSET		  0x40
+#define BPF_SRC(code)   ((code) & 0x08)
+#define  BPF_K			  0x00
+#define  BPF_X			  0x08
+
+/* ret - BPF_K and BPF_X also apply */
+#define BPF_RVAL(code)  ((code) & 0x18)
+#define  BPF_A			  0x10
+
+/* misc */
+#define BPF_MISCOP(code) ((code) & 0xf8)
+#define  BPF_TAX		   0x00
+#define  BPF_TXA		   0x80
+
+#ifndef __WITHOUT_EBPF
+/* Extended instruction set based on top of classic BPF */
+
+/* Instruction classes */
+#define BPF_ALU64	0x07	/* ALU mode in double word width */
+
+/* ld/ldx fields */
+#define BPF_DW		0x18	/* Double word */
+#define BPF_XADD	0xc0	/* Exclusive add */
+
+/* alu/jmp fields */
+#define BPF_MOV		0xb0	/* mov reg to reg */
+#define BPF_ARSH	0xc0	/* Sign extending arithmetic shift right */
+
+/* Change endianness of a register */
+#define BPF_END		0xd0	/* Flags for endianness conversion: */
+#define BPF_TO_LE	0x00	/* Convert to little-endian */
+#define BPF_TO_BE	0x08	/* Convert to big-endian */
+#define BPF_FROM_LE	BPF_TO_LE
+#define BPF_FROM_BE	BPF_TO_BE
+
+#define BPF_JNE		0x50	/* jump != */
+#define BPF_JSGT	0x60	/* SGT is signed '>', GT in x86 */
+#define BPF_JSGE	0x70	/* SGE is signed '>=', GE in x86 */
+#define BPF_CALL	0x80	/* Function call */
+#define BPF_EXIT	0x90	/* Function return */
+
+/* Register numbers */
+enum {
+	BPF_REG_0 = 0,
+	BPF_REG_1,
+	BPF_REG_2,
+	BPF_REG_3,
+	BPF_REG_4,
+	BPF_REG_5,
+	BPF_REG_6,
+	BPF_REG_7,
+	BPF_REG_8,
+	BPF_REG_9,
+	BPF_REG_10,
+	__MAX_BPF_REG,
+};
+
+/* BPF has 10 general purpose 64-bit registers and stack frame. */
+#define MAX_BPF_REG	__MAX_BPF_REG
+
+struct bpf_insn {
+	__u8	code;		/* Opcode */
+	__u8	dst_reg:4;	/* Dest register */
+	__u8	src_reg:4;	/* Source register */
+	__s16	off;		/* Signed offset */
+	__s32	imm;		/* Signed immediate constant */
+};
+
+#endif /* __WITHOUT_EBPF */
+
+#ifndef BPF_MAXINSNS
+# define BPF_MAXINSNS	4096
+#endif
+
+/* Macros for filter block array initializers. */
+#ifndef BPF_STMT
+# define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
+#endif
+#ifndef BPF_JUMP
+# define BPF_JUMP(code, k, jt, jf) { (unsigned short)(code), jt, jf, k }
+#endif
+
+/* Number of scratch memory words for: BPF_ST and BPF_STX */
+#define BPF_MEMWORDS 16
+
+/* Rationale: Negative offsets are invalid in BPF. We use
+ * them to reference ancillary data. Unlike introducing new
+ * instructions, it does not break existing compilers /
+ * optimizers.
+ */
+#define SKF_AD_OFF		(-0x1000)
+#define SKF_AD_PROTOCOL		0
+#define SKF_AD_PKTTYPE		4
+#define SKF_AD_IFINDEX		8
+#define SKF_AD_NLATTR		12
+#define SKF_AD_NLATTR_NEST	16
+#define SKF_AD_MARK		20
+#define SKF_AD_QUEUE		24
+#define SKF_AD_HATYPE		28
+#define SKF_AD_RXHASH		32
+#define SKF_AD_CPU		36
+#define SKF_AD_ALU_XOR_X	40
+#define SKF_AD_VLAN_TAG		44
+#define SKF_AD_VLAN_TAG_PRESENT	48
+#define SKF_AD_PAY_OFFSET	52
+#define SKF_AD_RANDOM		56
+#define SKF_AD_MAX		60
+#define SKF_NET_OFF		(-0x100000)
+#define SKF_LL_OFF		(-0x200000)
+
+#endif /* __UAPI_BPF_H */
diff --git a/include/uapi/linux/filter.h b/include/uapi/linux/filter.h
index 253b4d4..f7207bd 100644
--- a/include/uapi/linux/filter.h
+++ b/include/uapi/linux/filter.h
@@ -1,139 +1,7 @@
-/*
- * Linux Socket Filter Data Structures
- */
+#ifndef __UAPI_FILTER_H
+#define __UAPI_FILTER_H
 
-#ifndef _UAPI__LINUX_FILTER_H__
-#define _UAPI__LINUX_FILTER_H__
+#define __WITHOUT_EBPF
+#include <linux/bpf.h>
 
-#include <linux/compiler.h>
-#include <linux/types.h>
-
-
-/*
- * Current version of the filter code architecture.
- */
-#define BPF_MAJOR_VERSION 1
-#define BPF_MINOR_VERSION 1
-
-/*
- *	Try and keep these values and structures similar to BSD, especially
- *	the BPF code definitions which need to match so you can share filters
- */
- 
-struct sock_filter {	/* Filter block */
-	__u16	code;   /* Actual filter code */
-	__u8	jt;	/* Jump true */
-	__u8	jf;	/* Jump false */
-	__u32	k;      /* Generic multiuse field */
-};
-
-struct sock_fprog {	/* Required for SO_ATTACH_FILTER. */
-	unsigned short		len;	/* Number of filter blocks */
-	struct sock_filter __user *filter;
-};
-
-/*
- * Instruction classes
- */
-
-#define BPF_CLASS(code) ((code) & 0x07)
-#define         BPF_LD          0x00
-#define         BPF_LDX         0x01
-#define         BPF_ST          0x02
-#define         BPF_STX         0x03
-#define         BPF_ALU         0x04
-#define         BPF_JMP         0x05
-#define         BPF_RET         0x06
-#define         BPF_MISC        0x07
-
-/* ld/ldx fields */
-#define BPF_SIZE(code)  ((code) & 0x18)
-#define         BPF_W           0x00
-#define         BPF_H           0x08
-#define         BPF_B           0x10
-#define BPF_MODE(code)  ((code) & 0xe0)
-#define         BPF_IMM         0x00
-#define         BPF_ABS         0x20
-#define         BPF_IND         0x40
-#define         BPF_MEM         0x60
-#define         BPF_LEN         0x80
-#define         BPF_MSH         0xa0
-
-/* alu/jmp fields */
-#define BPF_OP(code)    ((code) & 0xf0)
-#define         BPF_ADD         0x00
-#define         BPF_SUB         0x10
-#define         BPF_MUL         0x20
-#define         BPF_DIV         0x30
-#define         BPF_OR          0x40
-#define         BPF_AND         0x50
-#define         BPF_LSH         0x60
-#define         BPF_RSH         0x70
-#define         BPF_NEG         0x80
-#define		BPF_MOD		0x90
-#define		BPF_XOR		0xa0
-
-#define         BPF_JA          0x00
-#define         BPF_JEQ         0x10
-#define         BPF_JGT         0x20
-#define         BPF_JGE         0x30
-#define         BPF_JSET        0x40
-#define BPF_SRC(code)   ((code) & 0x08)
-#define         BPF_K           0x00
-#define         BPF_X           0x08
-
-/* ret - BPF_K and BPF_X also apply */
-#define BPF_RVAL(code)  ((code) & 0x18)
-#define         BPF_A           0x10
-
-/* misc */
-#define BPF_MISCOP(code) ((code) & 0xf8)
-#define         BPF_TAX         0x00
-#define         BPF_TXA         0x80
-
-#ifndef BPF_MAXINSNS
-#define BPF_MAXINSNS 4096
-#endif
-
-/*
- * Macros for filter block array initializers.
- */
-#ifndef BPF_STMT
-#define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
-#endif
-#ifndef BPF_JUMP
-#define BPF_JUMP(code, k, jt, jf) { (unsigned short)(code), jt, jf, k }
-#endif
-
-/*
- * Number of scratch memory words for: BPF_ST and BPF_STX
- */
-#define BPF_MEMWORDS 16
-
-/* RATIONALE. Negative offsets are invalid in BPF.
-   We use them to reference ancillary data.
-   Unlike introduction new instructions, it does not break
-   existing compilers/optimizers.
- */
-#define SKF_AD_OFF    (-0x1000)
-#define SKF_AD_PROTOCOL 0
-#define SKF_AD_PKTTYPE 	4
-#define SKF_AD_IFINDEX 	8
-#define SKF_AD_NLATTR	12
-#define SKF_AD_NLATTR_NEST	16
-#define SKF_AD_MARK 	20
-#define SKF_AD_QUEUE	24
-#define SKF_AD_HATYPE	28
-#define SKF_AD_RXHASH	32
-#define SKF_AD_CPU	36
-#define SKF_AD_ALU_XOR_X	40
-#define SKF_AD_VLAN_TAG	44
-#define SKF_AD_VLAN_TAG_PRESENT 48
-#define SKF_AD_PAY_OFFSET	52
-#define SKF_AD_RANDOM	56
-#define SKF_AD_MAX	60
-#define SKF_NET_OFF   (-0x100000)
-#define SKF_LL_OFF    (-0x200000)
-
-
-#endif /* _UAPI__LINUX_FILTER_H__ */
+#endif /* __UAPI_FILTER_H */
-- 
1.7.11.7
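
As a rough illustration of how the opcode fields in the patch above
combine, the sketch below builds one classic and one extended opcode and
takes them apart again. The macro values are copied from the patch. The
struct demo_bpf_insn type and the demo_* helpers are hypothetical names
for this example only, and the <stdint.h> types stand in for the kernel's
__u8/__s16/__s32. Bit-fields on uint8_t are accepted by GCC and Clang but
are implementation-defined in ISO C.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors and values copied from the patch above. */
#define BPF_CLASS(code)	((code) & 0x07)
#define BPF_SIZE(code)	((code) & 0x18)
#define BPF_MODE(code)	((code) & 0xe0)
#define BPF_OP(code)	((code) & 0xf0)
#define BPF_SRC(code)	((code) & 0x08)

#define BPF_LD		0x00	/* class: load */
#define BPF_H		0x08	/* size: half word */
#define BPF_ABS		0x20	/* mode: absolute offset */
#define BPF_X		0x08	/* source: register */
#define BPF_ALU64	0x07	/* class: ALU in double word width */
#define BPF_MOV		0xb0	/* op: mov reg to reg */

/* Same field order as struct bpf_insn in the patch. */
struct demo_bpf_insn {
	uint8_t	code;		/* opcode */
	uint8_t	dst_reg:4;	/* dest register */
	uint8_t	src_reg:4;	/* source register */
	int16_t	off;		/* signed offset */
	int32_t	imm;		/* signed immediate constant */
};

/* Classic opcode: load a half word from an absolute packet offset. */
unsigned int demo_classic_ldh_abs(void)
{
	return BPF_LD | BPF_H | BPF_ABS;
}

/* Extended instruction: 64-bit register-to-register move, dst = src. */
struct demo_bpf_insn demo_mov64_reg(uint8_t dst, uint8_t src)
{
	struct demo_bpf_insn insn = {
		.code	 = BPF_ALU64 | BPF_MOV | BPF_X,
		.dst_reg = dst,
		.src_reg = src,
		.off	 = 0,
		.imm	 = 0,
	};

	return insn;
}
```

The class always sits in the low three bits of the opcode, which is why
BPF_ALU64 can reuse 0x07 (BPF_MISC in classic BPF) without changing the
classic encoding of the other classes.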



