Netdev List
* Re: [PATCH RFC 2/9] veth: Add driver XDP
From: Toshiaki Makita @ 2018-04-26 10:46 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Toshiaki Makita, netdev
In-Reply-To: <20180425223852.0be2fd67@brouer.com>

Hi Jesper,

Thanks for taking a look!

On 2018/04/26 5:39, Jesper Dangaard Brouer wrote:
> On Tue, 24 Apr 2018 23:39:16 +0900
> Toshiaki Makita <toshiaki.makita1@gmail.com> wrote:
> 
>> This is a basic implementation of veth driver XDP.
>>
>> Incoming packets are sent from the peer veth device in the form of skb,
>> so this is generally doing the same thing as generic XDP.
> 
> I'm unsure what context you are calling veth_xdp_rcv_skb() from.  The
> XDP (RX side) depends heavily on the protection provided by NAPI context.
> It looks like you are adding a NAPI handler later.

This is called from softirq or bh-disabled context.
I can see that XDP REDIRECT depends on NAPI since it uses per-cpu temporary
storage which is used in ndo_xdp_flush. I thought DROP and PASS are safe
here. Also, this is basically the same context as generic XDP, which is
called from netif_rx_internal.

Anyway, this is a temporary state and not needed. It looks like this does
not help review, so I'll squash this and patch 4 (napi patch).

-- 
Toshiaki Makita


* Re: [PATCH] net: dwc-xlgmac: fix xlgmac_xmit()'s return type
From: Jose Abreu @ 2018-04-26 10:42 UTC (permalink / raw)
  To: Luc Van Oostenryck, linux-kernel; +Cc: Jose Abreu, netdev
In-Reply-To: <20180424131733.4510-1-luc.vanoostenryck@gmail.com>

On 24-04-2018 14:17, Luc Van Oostenryck wrote:
> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, but the implementation in this
> driver returns an 'int'.
>
> Fix this by returning 'netdev_tx_t' in this driver too.
>
> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
> ---

I wouldn't do this, for at least two reasons:
    - xlgmac_xmit() calls xlgmac_maybe_stop_tx_queue() and
xlgmac_prep_tso(), and the latter can return a negative error
code. I expect some other drivers to have similar behavior.
    - If you look through the net subsystem you will see that this enum
is directly converted to an int in later stages.

So, given that you sent a large number of patches about this,
perhaps it would be clearer to change the function definition?

Thanks and Best Regards,
Jose Miguel Abreu
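
To make the first point above concrete, here is a minimal user-space sketch of
the mismatch between a helper that returns a negative errno and a function
whose return type is netdev_tx_t. The enum values are copied from my reading of
include/linux/netdevice.h and should be treated as an assumption; the demo_*
functions are hypothetical and are not the xlgmac code. Dropping the packet and
returning NETDEV_TX_OK is only one possible way to keep an errno out of the
enum, not something proposed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* User-space copy of the kernel enum (values assumed, see lead-in). */
enum netdev_tx {
	__NETDEV_TX_MIN = INT_MIN,	/* forces a signed underlying type */
	NETDEV_TX_OK    = 0x00,		/* driver took care of the packet */
	NETDEV_TX_BUSY  = 0x10,		/* driver tx path was busy */
};
typedef enum netdev_tx netdev_tx_t;

/* Hypothetical stand-in for a helper such as xlgmac_prep_tso():
 * returns 0 on success or a negative errno.
 */
static int demo_prep_tso(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* True only for the two named transmit codes. */
static int demo_is_tx_code(int v)
{
	return v == NETDEV_TX_OK || v == NETDEV_TX_BUSY;
}

/* Hypothetical xmit: a helper failure is counted as a drop and the
 * packet is reported as consumed, so no errno leaks through the enum.
 */
static netdev_tx_t demo_xmit(int should_fail, unsigned int *tx_dropped)
{
	assert(tx_dropped != NULL);
	if (demo_prep_tso(should_fail) < 0)
		(*tx_dropped)++;
	return NETDEV_TX_OK;
}
```

A caller would pass a drop counter, e.g. demo_xmit(1, &dropped), and could use
demo_is_tx_code() to see that -ENOMEM is not one of the named enumerators.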


* Re: [PATCH] NET: usb: qmi_wwan: add support for ublox R410M PID 0x90b2
From: Bjørn Mork @ 2018-04-26 10:41 UTC (permalink / raw)
  To: SZ Lin (林上智); +Cc: stable, netdev, linux-usb, linux-kernel
In-Reply-To: <20180426063013.453-1-sz.lin@moxa.com>

"SZ Lin (林上智)" <sz.lin@moxa.com> writes:

> This patch adds support for PID 0x90b2 of ublox R410M.

Acked-by: Bjørn Mork <bjorn@mork.no>


* Re: WARNING: kobject bug in br_add_if
From: Hangbin Liu @ 2018-04-26 10:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, bridge, David Miller, LKML, netdev, stephen hemminger,
	syzkaller-bugs, Greg Kroah-Hartman
In-Reply-To: <CACT4Y+aDko-kKP-u2S_UGBbB-uwnGw4dWTOSmXVDR3=osLJeFg@mail.gmail.com>

On Thu, Apr 26, 2018 at 10:04:16AM +0200, Dmitry Vyukov wrote:
> On Thu, Apr 26, 2018 at 8:13 AM, Hangbin Liu <liuhangbin@gmail.com> wrote:
> > On Wed, Apr 11, 2018 at 05:18:23PM +0200, Dmitry Vyukov wrote:
> >> On Wed, Apr 11, 2018 at 5:15 PM, syzbot
> >> <syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com> wrote:
> >> > Hello,
> >> >
> >> > syzbot hit the following crash on upstream commit
> >> > 10b84daddbec72c6b440216a69de9a9605127f7a (Sat Mar 31 17:59:00 2018 +0000)
> >> > Merge branch 'perf-urgent-for-linus' of
> >> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> >> > syzbot dashboard link:
> >> > https://syzkaller.appspot.com/bug?extid=de73361ee4971b6e6f75
> >> >
> >> > So far this crash happened 4 times on net-next, upstream.
> >> > Unfortunately, I don't have any reproducer for this crash yet.
> >> > Raw console output:
> >> > https://syzkaller.appspot.com/x/log.txt?id=5007286875455488
> >> > Kernel config:
> >> > https://syzkaller.appspot.com/x/.config?id=-2760467897697295172
> >> > compiler: gcc (GCC) 7.1.1 20170620
> >> >
> >> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> > Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com
> >> > It will help syzbot understand when the bug is fixed. See footer for
> >> > details.
> >> > If you forward the report, please keep this part and the footer.
> >>
> >> +Greg
> >>
> >> The plan is to remove this WARNING from kobject_add, if there are no objections.
> >
> > Hi Dmitry,
> >
> > For this bug, why should we remove the WARNING instead of adding a check in
> > br_add_if()? Something like
> 
> 
> Mainly because nobody wants to fix these.
> If you think this is a real bug and you are ready to fix it, please
> mail an official patch.
> 
> >> > ------------[ cut here ]------------
> >> > binder: 23650:23651 unknown command 1078223622
> >> > kobject_add_internal failed for brport (error: -12 parent: bond0)

Re-checked the error. This is -ENOMEM, so normally we could ignore it.

On the other hand, although we could find out the slave iface's
master in netdev_master_upper_dev_link(), by that point we have already
gone much further, allocating resources and changing the iface state, e.g.

[54273.968516] br0: port 1(em1) entered blocking state
[54273.973979] br0: port 1(em1) entered disabled state

So I think we'd better return as early as possible. I will post a fix
for this.

Thanks
Hangbin
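
For reference, a small user-space sketch of how "error: -12" maps to -ENOMEM.
It also decodes the value 0xfffffff4 that appears in R13 of the quoted register
dump, which is the same number when read as a signed 32-bit integer; that R13
actually carries the error code at that point is my assumption. The demo_*
names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Interpret a raw 32-bit register value as a signed return code
 * (two's complement), without relying on implementation-defined casts.
 */
static int demo_reg_to_ret(uint32_t raw)
{
	if (raw & 0x80000000u)
		return -(int)(0xffffffffu - raw) - 1;
	return (int)raw;
}

/* Human-readable text for a negative errno-style return code. */
static const char *demo_errno_text(int ret)
{
	assert(ret < 0);
	return strerror(-ret);
}
```

On Linux, demo_reg_to_ret(0xfffffff4u) gives -12, and ENOMEM is defined as 12,
so demo_errno_text(-12) returns the out-of-memory message.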

> >> > binder: 23650:23651 ioctl c0306201 2000dfd0 returned -22
> >> > WARNING: CPU: 1 PID: 23647 at lib/kobject.c:242
> >> > kobject_add_internal+0x3f6/0xbc0 lib/kobject.c:240
> >> > Kernel panic - not syncing: panic_on_warn set ...
> >> >
> >> > CPU: 1 PID: 23647 Comm: syz-executor7 Not tainted 4.16.0-rc7+ #374
> >> > binder: BINDER_SET_CONTEXT_MGR already set
> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> > Google 01/01/2011
> >> > Call Trace:
> >> >  __dump_stack lib/dump_stack.c:17 [inline]
> >> >  dump_stack+0x194/0x24d lib/dump_stack.c:53
> >> >  panic+0x1e4/0x41c kernel/panic.c:183
> >> >  __warn+0x1dc/0x200 kernel/panic.c:547
> >> >  report_bug+0x1f4/0x2b0 lib/bug.c:186
> >> >  fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178
> >> >  fixup_bug arch/x86/kernel/traps.c:247 [inline]
> >> >  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
> >> >  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
> >> >  invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
> >> > RIP: 0010:kobject_add_internal+0x3f6/0xbc0 lib/kobject.c:240
> >> > RSP: 0018:ffff8801d089f560 EFLAGS: 00010286
> >> > RAX: dffffc0000000008 RBX: ffff8801adbee178 RCX: ffffffff815b193e
> >> > RDX: 0000000000040000 RSI: ffffc900022aa000 RDI: 1ffff1003a113e31
> >> > RBP: ffff8801d089f658 R08: 1ffff1003a113df3 R09: 0000000000000000
> >> > R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1003a113eb2
> >> > R13: 00000000fffffff4 R14: ffff8801abd88828 R15: ffff8801d75a1e00
> >> >  kobject_add_varg lib/kobject.c:364 [inline]
> >> >  kobject_init_and_add+0xf9/0x150 lib/kobject.c:436
> >> >  br_add_if+0x79a/0x1a70 net/bridge/br_if.c:533
> >> >  add_del_if+0xf4/0x140 net/bridge/br_ioctl.c:101
> >> >  br_dev_ioctl+0xa2/0xc0 net/bridge/br_ioctl.c:396
> >> >  dev_ifsioc+0x333/0x9b0 net/core/dev_ioctl.c:334
> >> >  dev_ioctl+0x176/0xbe0 net/core/dev_ioctl.c:500
> >> >  sock_do_ioctl+0x1ba/0x390 net/socket.c:981
> >> >  sock_ioctl+0x367/0x670 net/socket.c:1081
> >> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >> >  do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
> >> >  SYSC_ioctl fs/ioctl.c:701 [inline]
> >> >  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
> >> >  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
> >> >  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> >> > RIP: 0033:0x454e79
> >> > RSP: 002b:00007eff7dab7c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> >> > RAX: ffffffffffffffda RBX: 00007eff7dab86d4 RCX: 0000000000454e79
> >> > RDX: 0000000020000000 RSI: 00000000000089a2 RDI: 0000000000000014
> >> > RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
> >> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000015
> >> > R13: 0000000000000369 R14: 00000000006f7278 R15: 0000000000000006
> >> > Dumping ftrace buffer:
> >> >    (ftrace buffer empty)
> >> > Kernel Offset: disabled
> >> > Rebooting in 86400 seconds..


* [PATCH net-next] geneve: fix build with modular IPV6
From: Tobias Regnery @ 2018-04-26 10:36 UTC (permalink / raw)
  To: davem, netdev, linux-kernel; +Cc: alexey.kodanev, Tobias Regnery

Commit c40e89fd358e ("geneve: configure MTU based on a lower device") added
an IS_ENABLED(CONFIG_IPV6) to geneve, leading to the following link error
with CONFIG_GENEVE=y and CONFIG_IPV6=m:

drivers/net/geneve.o: In function `geneve_link_config':
geneve.c:(.text+0x14c): undefined reference to `rt6_lookup'

Fix this by adding a Kconfig dependency and forcing GENEVE to be a module
when IPV6 is a module.

Fixes: c40e89fd358e ("geneve: configure MTU based on a lower device")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
---
 drivers/net/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 891846655000..a029b27fd002 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -198,6 +198,7 @@ config VXLAN
 config GENEVE
        tristate "Generic Network Virtualization Encapsulation"
        depends on INET && NET_UDP_TUNNEL
+       depends on IPV6 || !IPV6
        select NET_IP_TUNNEL
        select GRO_CELLS
        ---help---
-- 
2.17.0
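
The "depends on IPV6 || !IPV6" line in the patch above looks like a tautology,
but it is not one in Kconfig's three-valued logic. A minimal sketch of that
arithmetic, assuming the usual Kconfig rules (n=0, m=1, y=2, "!" is 2 minus the
value, "||" is the maximum); the tri_* and geneve_limit names are hypothetical:

```c
#include <assert.h>

enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig "!": 2 - value */
static enum tristate tri_not(enum tristate a)
{
	return (enum tristate)(2 - (int)a);
}

/* Kconfig "||": maximum of the two values */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* Upper bound that "depends on IPV6 || !IPV6" places on GENEVE. */
static enum tristate geneve_limit(enum tristate ipv6)
{
	assert(ipv6 == TRI_N || ipv6 == TRI_M || ipv6 == TRI_Y);
	return tri_or(ipv6, tri_not(ipv6));
}
```

With IPV6=y or IPV6=n the limit is y, so GENEVE may be built in. With IPV6=m
the expression evaluates to m, which caps GENEVE at m and avoids the
built-in-calls-module link error described in the commit message.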


* [PATCH v4] bpf, x86_32: add eBPF JIT compiler for ia32
From: Wang YanQing @ 2018-04-26 10:12 UTC (permalink / raw)
  To: daniel
  Cc: ast, illusionist.neo, tglx, mingo, hpa, davem, x86, netdev,
	linux-kernel

The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it supports
eBPF only. Classic BPF is supported through the conversion done by the BPF core.
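
As a rough illustration of what emitting an ia32 instruction means here, the
user-space sketch below builds the three bytes of "mov eax,dword ptr [ebp+off]"
with the same arithmetic that add_2reg() and EMIT3() use in this patch. The
demo_* names, the buffer interface and the restriction to an 8-bit
displacement are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EAX 0x0
#define DEMO_EBP 0x5

/* Same arithmetic as add_2reg() in the patch: fold two register
 * numbers into a ModRM byte on top of a base value.
 */
static uint8_t demo_add_2reg(uint8_t byte, uint32_t dst_reg, uint32_t src_reg)
{
	return (uint8_t)(byte + dst_reg + (src_reg << 3));
}

/* Write "mov eax,dword ptr [ebp+off]" into buf: opcode 0x8B, a ModRM
 * byte with mod=01 (disp8), then the displacement. Returns bytes used.
 */
static size_t demo_emit_load_eax(uint8_t *buf, size_t cap, int8_t off)
{
	assert(buf != NULL && cap >= 3);
	buf[0] = 0x8B;
	buf[1] = demo_add_2reg(0x40, DEMO_EBP, DEMO_EAX);
	buf[2] = (uint8_t)off;
	return 3;
}
```

For off = 8 this yields the bytes 8B 45 08, which is how the scratch-space
slots relative to EBP are read into a temporary register.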

Almost all instructions from the eBPF ISA are supported except the following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI. I use
EAX|EDX|ECX|EBX as temporary registers to simulate instructions in the eBPF
ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding. All other
eBPF registers, R0-R10, are simulated through scratch space on the stack.
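
The mapping described above can be sketched in user space as follows. The
stack offsets and the ESI/EDI numbers follow the bpf2ia32[] table in this
patch; the register numbering (FP=10, AX=11, tail call count=12) and all
demo_* names are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { DEMO_REG_FP = 10, DEMO_REG_AX = 11, DEMO_TCALL_CNT = 12 };
enum { DEMO_ESI = 6, DEMO_EDI = 7 };

/* Where the low and high 32-bit halves of a 64-bit eBPF register live:
 * either two hardware register numbers or two stack offsets.
 */
struct demo_loc {
	bool on_stack;
	uint8_t lo;
	uint8_t hi;
};

static struct demo_loc demo_locate(unsigned int reg)
{
	struct demo_loc loc;

	assert(reg <= DEMO_TCALL_CNT);
	if (reg == DEMO_REG_AX) {
		/* BPF_REG_AX is the only one kept in hardware registers */
		loc.on_stack = false;
		loc.lo = DEMO_ESI;
		loc.hi = DEMO_EDI;
	} else {
		/* AX takes no stack slot, so entries after it shift down */
		unsigned int slot = reg < DEMO_REG_AX ? reg : reg - 1;

		loc.on_stack = true;
		loc.lo = (uint8_t)(slot * 8);
		loc.hi = (uint8_t)(slot * 8 + 4);
	}
	return loc;
}
```

For example, demo_locate(0) reports stack offsets 0 and 4 for R0, while
demo_locate(DEMO_REG_AX) reports the ESI/EDI pair.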

The reasons behind the hardware register allocation policy are:
1:MUL needs EAX:EDX and shift operations need ECX, so they aren't suitable
  for general eBPF 64bit register simulation.
2:We need at least 4 registers to simulate most eBPF ISA operations
  on register operands instead of on register&memory operands.
3:We need to put BPF_REG_AX in hardware registers, or constant blinding
  will degrade jit performance heavily.
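
To illustrate points 1 and 2, here is a small user-space model of the two
building blocks involved: a 64-bit add done as a 32-bit add on the low halves
followed by an add-with-carry on the high halves, and a 32x32 multiply whose
64-bit product is split across two 32-bit results the way MUL leaves it in
EDX:EAX. The struct and function names are hypothetical; this models the
semantics only, not the code the JIT emits.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit eBPF register held as two 32-bit halves. */
struct demo_reg64 {
	uint32_t lo;
	uint32_t hi;
};

/* dst += src using only 32-bit operations: "add" on the low halves,
 * then "adc" (add plus carry) on the high halves.
 */
static void demo_add64(struct demo_reg64 *dst, const struct demo_reg64 *src)
{
	uint32_t lo;
	uint32_t carry;

	assert(dst && src);
	lo = dst->lo + src->lo;
	carry = lo < dst->lo;	/* unsigned wrap-around means carry out */
	dst->lo = lo;
	dst->hi = dst->hi + src->hi + carry;
}

/* 32x32 -> 64 multiply; the product is returned as low half (EAX)
 * and high half (EDX).
 */
static void demo_mul32(uint32_t a, uint32_t b, uint32_t *eax, uint32_t *edx)
{
	uint64_t p = (uint64_t)a * b;

	assert(eax && edx);
	*eax = (uint32_t)p;
	*edx = (uint32_t)(p >> 32);
}
```

Because MUL always writes both EAX and EDX, and variable shifts take their
count from ECX (CL), those registers cannot hold long-lived eBPF values,
which is the constraint point 1 refers to.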

Tested on PC (Intel(R) Core(TM) i5-5200U CPU).
Testing results on i5-5200U:
1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.

The above tests were all run under each of the following two conditions:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I ran test_progs in kselftest 100 times in a row for every condition;
  the numbers are in the format total/times=avg.
  The numbers that test_bpf reports show almost the same relation.

a:jit_enable=0 and jit_harden=0            b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156    test_pkt_access:PASS:ipv4:10674/100=106
  test_pkt_access:PASS:ipv6:9130/100=91      test_pkt_access:PASS:ipv6:4855/100=48
  test_xdp:PASS:ipv4:240198/100=2401         test_xdp:PASS:ipv4:138912/100=1389
  test_xdp:PASS:ipv6:137326/100=1373         test_xdp:PASS:ipv6:68542/100=685
  test_l4lb:PASS:ipv4:61100/100=611          test_l4lb:PASS:ipv4:37302/100=373
  test_l4lb:PASS:ipv6:101000/100=1010        test_l4lb:PASS:ipv6:55030/100=550

c:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:10558/100=105
  test_pkt_access:PASS:ipv6:5092/100=50
  test_xdp:PASS:ipv4:131902/100=1319
  test_xdp:PASS:ipv6:77932/100=779
  test_l4lb:PASS:ipv4:38924/100=389
  test_l4lb:PASS:ipv6:57520/100=575

The numbers show we get 30%~50% improvement.
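
As a quick cross-check of the 30%~50% figure, the sketch below computes the
relative reduction from the totals in the tables above (column a versus
column b). The helper name is hypothetical, and the unit of the totals is not
stated in this mail, so only the ratio is meaningful.

```c
#include <assert.h>

/* Percentage reduction going from 'before' to 'after', rounded down. */
static unsigned int demo_reduction_pct(unsigned long before, unsigned long after)
{
	assert(before > 0 && after <= before);
	return (unsigned int)(((before - after) * 100UL) / before);
}
```

For test_pkt_access ipv4, demo_reduction_pct(15622, 10674) gives 31, and for
test_xdp ipv6, demo_reduction_pct(137326, 68542) gives 50; the other rows of
columns a and b fall between these two.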

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udknight@gmail.com>
---
 Changes v3-v4:
 1:Fix changelog in commit.
   I installed llvm-6.0, and test_progs no longer reports errors.
   I submitted another patch:
   "bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform"
   to fix another problem; with that patch, test_verifier no longer reports errors either.
 2:Fix clearing r0[1] twice unnecessarily in *BPF_IND|BPF_ABS* simulation.

 Changes v2-v3:
 1:Move BPF_REG_AX to real hardware registers for performance reasons.
 3:Using bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
 4:Delete partial code in 1c2a088a6626, suggested by Daniel Borkmann.
 5:Some bug fixes and comment improvements.

 Changes v1-v2:
 1:Fix bug in emit_ia32_neg64.
 2:Fix bug in emit_ia32_arsh_r64.
 3:Delete filename in top level comment, suggested by Thomas Gleixner.
 4:Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
 5:Rewrite some words in changelog.
 6:CodingStyle improvements and a few more comments.

 Hi all!
 This is the fourth version of this patch, and I think it is good enough
 to merge. Any more suggestions?

 Thanks.

 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile                |    9 +-
 arch/x86/net/bpf_jit_comp32.c        | 2530 ++++++++++++++++++++++++++++++++++
 4 files changed, 2562 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_EXIT_THREAD
 	select HAVE_FENTRY			if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c7ca4 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -291,14 +291,17 @@ static inline void indirect_branch_prediction_barrier(void)
  *    lfence
  *    jmp spec_trap
  *  do_rop:
- *    mov %rax,(%rsp)
+ *    mov %rax,(%rsp) for x86_64
+ *    mov %edx,(%esp) for x86_32
  *    retq
  *
  * Without retpolines configured:
  *
- *    jmp *%rax
+ *    jmp *%rax for x86_64
+ *    jmp *%edx for x86_32
  */
 #ifdef CONFIG_RETPOLINE
+#ifdef CONFIG_X86_64
 # define RETPOLINE_RAX_BPF_JIT_SIZE	17
 # define RETPOLINE_RAX_BPF_JIT()				\
 	EMIT1_off32(0xE8, 7);	 /* callq do_rop */		\
@@ -310,9 +313,28 @@ static inline void indirect_branch_prediction_barrier(void)
 	EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */	\
 	EMIT1(0xC3);             /* retq */
 #else
+# define RETPOLINE_EDX_BPF_JIT()				\
+do {								\
+	EMIT1_off32(0xE8, 7);	 /* call do_rop */		\
+	/* spec_trap: */					\
+	EMIT2(0xF3, 0x90);       /* pause */			\
+	EMIT3(0x0F, 0xAE, 0xE8); /* lfence */			\
+	EMIT2(0xEB, 0xF9);       /* jmp spec_trap */		\
+	/* do_rop: */						\
+	EMIT3(0x89, 0x14, 0x24); /* mov %edx,(%esp) */		\
+	EMIT1(0xC3);             /* ret */			\
+} while (0)
+#endif
+#else /* !CONFIG_RETPOLINE */
+
+#ifdef CONFIG_X86_64
 # define RETPOLINE_RAX_BPF_JIT_SIZE	2
 # define RETPOLINE_RAX_BPF_JIT()				\
 	EMIT2(0xFF, 0xE0);	 /* jmp *%rax */
+#else
+# define RETPOLINE_EDX_BPF_JIT()				\
+	EMIT2(0xFF, 0xE2) /* jmp *%edx */
+#endif
 #endif
 
 #endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
diff --git a/arch/x86/net/Makefile b/arch/x86/net/Makefile
index fefb4b6..f54c9d4 100644
--- a/arch/x86/net/Makefile
+++ b/arch/x86/net/Makefile
@@ -1,6 +1,11 @@
 #
 # Arch-specific network modules
 #
-OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
 
-obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
+
+ifeq ($(CONFIG_X86_32),y)
+        obj-$(CONFIG_BPF_JIT) += bpf_jit_comp32.o
+else
+        OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
+        obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
+endif
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
new file mode 100644
index 0000000..376bf95
--- /dev/null
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -0,0 +1,2530 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Just-In-Time compiler for eBPF filters on IA32 (32bit x86)
+ *
+ * Author: Wang YanQing (udknight@gmail.com)
+ * The code is based on code and ideas from:
+ * Eric Dumazet (eric.dumazet@gmail.com)
+ * and from:
+ * Shubham Bansal <illusionist.neo@gmail.com>
+ */
+
+#include <linux/netdevice.h>
+#include <linux/filter.h>
+#include <linux/if_vlan.h>
+#include <asm/cacheflush.h>
+#include <asm/set_memory.h>
+#include <asm/nospec-branch.h>
+#include <linux/bpf.h>
+
+/*
+ * eBPF prog stack layout:
+ *
+ *                         high
+ * original ESP =>        +-----+
+ *                        |     | callee saved registers
+ *                        +-----+
+ *                        | ... | eBPF JIT scratch space
+ * BPF_FP,IA32_EBP  =>    +-----+
+ *                        | ... | eBPF prog stack
+ *                        +-----+
+ *                        |RSVD | JIT scratchpad
+ * current ESP =>         +-----+
+ *                        |     |
+ *                        | ... | Function call stack
+ *                        |     |
+ *                        +-----+
+ *                          low
+ *
+ * The callee saved registers:
+ *
+ *                                high
+ * original ESP =>        +------------------+ \
+ *                        |        ebp       | |
+ * current EBP =>         +------------------+ } callee saved registers
+ *                        |    ebx,esi,edi   | |
+ *                        +------------------+ /
+ *                                low
+ */
+
+static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
+{
+	if (len == 1)
+		*ptr = bytes;
+	else if (len == 2)
+		*(u16 *)ptr = bytes;
+	else {
+		*(u32 *)ptr = bytes;
+		barrier();
+	}
+	return ptr + len;
+}
+
+#define EMIT(bytes, len) \
+	do { prog = emit_code(prog, bytes, len); cnt += len; } while (0)
+
+#define EMIT1(b1)		EMIT(b1, 1)
+#define EMIT2(b1, b2)		EMIT((b1) + ((b2) << 8), 2)
+#define EMIT3(b1, b2, b3)	EMIT((b1) + ((b2) << 8) + ((b3) << 16), 3)
+#define EMIT4(b1, b2, b3, b4)   \
+	EMIT((b1) + ((b2) << 8) + ((b3) << 16) + ((b4) << 24), 4)
+
+#define EMIT1_off32(b1, off) \
+	do {EMIT1(b1); EMIT(off, 4); } while (0)
+#define EMIT2_off32(b1, b2, off) \
+	do {EMIT2(b1, b2); EMIT(off, 4); } while (0)
+#define EMIT3_off32(b1, b2, b3, off) \
+	do {EMIT3(b1, b2, b3); EMIT(off, 4); } while (0)
+#define EMIT4_off32(b1, b2, b3, b4, off) \
+	do {EMIT4(b1, b2, b3, b4); EMIT(off, 4); } while (0)
+
+#define jmp_label(label, jmp_insn_len) (label - cnt - jmp_insn_len)
+
+static bool is_imm8(int value)
+{
+	return value <= 127 && value >= -128;
+}
+
+static bool is_simm32(s64 value)
+{
+	return value == (s64) (s32) value;
+}
+
+#define STACK_OFFSET(k)	(k)
+#define TCALL_CNT	(MAX_BPF_JIT_REG + 0)	/* Tail Call Count */
+
+#define IA32_EAX	(0x0)
+#define IA32_EBX	(0x3)
+#define IA32_ECX	(0x1)
+#define IA32_EDX	(0x2)
+#define IA32_ESI	(0x6)
+#define IA32_EDI	(0x7)
+#define IA32_EBP	(0x5)
+#define IA32_ESP	(0x4)
+
+/* list of x86 cond jumps opcodes (. + s8)
+ * Add 0x10 (and an extra 0x0f) to generate far jumps (. + s32)
+ */
+#define IA32_JB  0x72
+#define IA32_JAE 0x73
+#define IA32_JE  0x74
+#define IA32_JNE 0x75
+#define IA32_JBE 0x76
+#define IA32_JA  0x77
+#define IA32_JL  0x7C
+#define IA32_JGE 0x7D
+#define IA32_JLE 0x7E
+#define IA32_JG  0x7F
+
+/*
+ * Map eBPF registers to x86_32 32bit registers or stack scratch space.
+ *
+ * 1. All the registers, R0-R10, are mapped to scratch space on stack.
+ * 2. We need two 64 bit temp registers to do complex operations on eBPF
+ *    registers.
+ * 3. For performance reasons, BPF_REG_AX, used for constant blinding, is
+ *    mapped to a real hardware register pair, IA32_ESI and IA32_EDI.
+ *
+ * As the eBPF registers are all 64 bit registers and x86_32 has only 32 bit
+ * registers, we have to map each eBPF register to two x86_32 32 bit regs
+ * or scratch memory space and we have to build eBPF 64 bit register from those.
+ *
+ * We use IA32_EAX, IA32_EDX, IA32_ECX, IA32_EBX as temporary registers.
+ */
+static const u8 bpf2ia32[][2] = {
+	/* return value from in-kernel function, and exit value from eBPF */
+	[BPF_REG_0] = {STACK_OFFSET(0), STACK_OFFSET(4)},
+
+	/* arguments from eBPF program to in-kernel function */
+	/* Stored on stack scratch space */
+	[BPF_REG_1] = {STACK_OFFSET(8), STACK_OFFSET(12)},
+	[BPF_REG_2] = {STACK_OFFSET(16), STACK_OFFSET(20)},
+	[BPF_REG_3] = {STACK_OFFSET(24), STACK_OFFSET(28)},
+	[BPF_REG_4] = {STACK_OFFSET(32), STACK_OFFSET(36)},
+	[BPF_REG_5] = {STACK_OFFSET(40), STACK_OFFSET(44)},
+
+	/* callee saved registers that in-kernel function will preserve */
+	/* Stored on stack scratch space */
+	[BPF_REG_6] = {STACK_OFFSET(48), STACK_OFFSET(52)},
+	[BPF_REG_7] = {STACK_OFFSET(56), STACK_OFFSET(60)},
+	[BPF_REG_8] = {STACK_OFFSET(64), STACK_OFFSET(68)},
+	[BPF_REG_9] = {STACK_OFFSET(72), STACK_OFFSET(76)},
+
+	/* Read only Frame Pointer to access Stack */
+	[BPF_REG_FP] = {STACK_OFFSET(80), STACK_OFFSET(84)},
+
+	/* temporary register for blinding constants. */
+	[BPF_REG_AX] = {IA32_ESI, IA32_EDI},
+
+	/* Tail call count. Stored on stack scratch space. */
+	[TCALL_CNT] = {STACK_OFFSET(88), STACK_OFFSET(92)},
+};
+
+#define dst_lo	dst[0]
+#define dst_hi	dst[1]
+#define src_lo	src[0]
+#define src_hi	src[1]
+
+#define STACK_ALIGNMENT	8
+/* Stack space for BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3,
+ * BPF_REG_4, BPF_REG_5, BPF_REG_6, BPF_REG_7, BPF_REG_8, BPF_REG_9,
+ * BPF_REG_FP and the tail call count (12 slots of 8 bytes each).
+ */
+#define SCRATCH_SIZE 96
+
+/* total stack size used in JITed code */
+#define _STACK_SIZE \
+	(stack_depth + \
+	 + SCRATCH_SIZE + \
+	 + 4 /* extra for skb_copy_bits buffer */)
+
+#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
+
+/* Get the offset of eBPF REGISTERs stored on scratch space. */
+#define STACK_VAR(off) (off)
+
+/* Offset of skb_copy_bits buffer */
+#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)
+
+/* encode 'dst_reg' register into x86_32 opcode 'byte' */
+static u8 add_1reg(u8 byte, u32 dst_reg)
+{
+	return byte + dst_reg;
+}
+
+/* encode 'dst_reg' and 'src_reg' registers into x86_32 opcode 'byte' */
+static u8 add_2reg(u8 byte, u32 dst_reg, u32 src_reg)
+{
+	return byte + dst_reg + (src_reg << 3);
+}
+
+static void jit_fill_hole(void *area, unsigned int size)
+{
+	/* fill whole space with int3 instructions */
+	memset(area, 0xcc, size);
+}
+
+/* Checks whether BPF register is on scratch stack space or not. */
+static inline bool is_on_stack(u8 bpf_reg)
+{
+	static u8 stack_regs[] = {BPF_REG_AX};
+	int i, reg_len = sizeof(stack_regs);
+
+	for (i = 0 ; i < reg_len ; i++) {
+		if (bpf_reg == stack_regs[i])
+			return false;
+	}
+	return true;
+}
+
+static inline void emit_ia32_mov_i(const u8 dst, const u32 val, bool dstk,
+				   u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	if (dstk) {
+		if (val == 0) {
+			/* xor eax,eax */
+			EMIT2(0x33, add_2reg(0xC0, IA32_EAX, IA32_EAX));
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(dst));
+		} else {
+			EMIT3_off32(0xC7, add_1reg(0x40, IA32_EBP),
+				    STACK_VAR(dst), val);
+		}
+	} else {
+		if (val == 0)
+			EMIT2(0x33, add_2reg(0xC0, dst, dst));
+		else
+			EMIT2_off32(0xC7, add_1reg(0xC0, dst),
+				    val);
+	}
+	*pprog = prog;
+}
+
+/* dst = src (4 bytes) */
+static inline void emit_ia32_mov_r(const u8 dst, const u8 src, bool dstk,
+				   bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 sreg = sstk ? IA32_EAX : src;
+
+	if (sstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(src));
+	if (dstk)
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, sreg), STACK_VAR(dst));
+	else
+		/* mov dst,sreg */
+		EMIT2(0x89, add_2reg(0xC0, dst, sreg));
+
+	*pprog = prog;
+}
+
+/* dst = src */
+static inline void emit_ia32_mov_r64(const bool is64, const u8 dst[],
+				     const u8 src[], bool dstk,
+				     bool sstk, u8 **pprog)
+{
+	emit_ia32_mov_r(dst_lo, src_lo, dstk, sstk, pprog);
+	if (is64)
+		/* complete 8 byte move */
+		emit_ia32_mov_r(dst_hi, src_hi, dstk, sstk, pprog);
+	else
+		/* zero out high 4 bytes */
+		emit_ia32_mov_i(dst_hi, 0, dstk, pprog);
+}
+
+/* Sign extended move */
+static inline void emit_ia32_mov_i64(const bool is64, const u8 dst[],
+				     const u32 val, bool dstk, u8 **pprog)
+{
+	u32 hi = 0;
+
+	if (is64 && (val & (1<<31)))
+		hi = (u32)~0;
+	emit_ia32_mov_i(dst_lo, val, dstk, pprog);
+	emit_ia32_mov_i(dst_hi, hi, dstk, pprog);
+}
+
+/* ALU operation (32 bit)
+ * dst = dst * src
+ */
+static inline void emit_ia32_mul_r(const u8 dst, const u8 src, bool dstk,
+				   bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 sreg = sstk ? IA32_ECX : src;
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(dst));
+	else
+		/* mov eax,dst */
+		EMIT2(0x8B, add_2reg(0xC0, dst, IA32_EAX));
+
+
+	EMIT2(0xF7, add_1reg(0xE0, sreg));
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst));
+	else
+		/* mov dst,eax */
+		EMIT2(0x89, add_2reg(0xC0, dst, IA32_EAX));
+
+	*pprog = prog;
+}
+
+static inline void emit_ia32_to_le_r64(const u8 dst[], s32 val,
+					 bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk && val != 64) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	switch (val) {
+	case 16:
+		/* emit 'movzwl eax,ax' to zero extend 16-bit
+		 * into 64 bit
+		 */
+		EMIT2(0x0F, 0xB7);
+		EMIT1(add_2reg(0xC0, dreg_lo, dreg_lo));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 32:
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 64:
+		/* nop */
+		break;
+	}
+
+	if (dstk && val != 64) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+static inline void emit_ia32_to_be_r64(const u8 dst[], s32 val,
+				       bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	switch (val) {
+	case 16:
+		/* emit 'ror %ax, 8' to swap lower 2 bytes */
+		EMIT1(0x66);
+		EMIT3(0xC1, add_1reg(0xC8, dreg_lo), 8);
+
+		EMIT2(0x0F, 0xB7);
+		EMIT1(add_2reg(0xC0, dreg_lo, dreg_lo));
+
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 32:
+		/* emit 'bswap eax' to swap lower 4 bytes */
+		EMIT1(0x0F);
+		EMIT1(add_1reg(0xC8, dreg_lo));
+
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 64:
+		/* emit 'bswap eax' to swap lower 4 bytes */
+		EMIT1(0x0F);
+		EMIT1(add_1reg(0xC8, dreg_lo));
+
+		/* emit 'bswap edx' to swap lower 4 bytes */
+		EMIT1(0x0F);
+		EMIT1(add_1reg(0xC8, dreg_hi));
+
+		/* mov ecx,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, IA32_ECX, dreg_hi));
+		/* mov dreg_hi,dreg_lo */
+		EMIT2(0x89, add_2reg(0xC0, dreg_hi, dreg_lo));
+		/* mov dreg_lo,ecx */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, IA32_ECX));
+
+		break;
+	}
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (div|mod) src
+ */
+static inline void emit_ia32_div_mod_r(const u8 op, const u8 dst, const u8 src,
+				       bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src));
+	else if (src != IA32_ECX)
+		/* mov ecx,src */
+		EMIT2(0x8B, add_2reg(0xC0, src, IA32_ECX));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst));
+	else
+		/* mov eax,dst */
+		EMIT2(0x8B, add_2reg(0xC0, dst, IA32_EAX));
+
+	/* xor edx,edx */
+	EMIT2(0x31, add_2reg(0xC0, IA32_EDX, IA32_EDX));
+	/* div ecx */
+	EMIT2(0xF7, add_1reg(0xF0, IA32_ECX));
+
+	if (op == BPF_MOD) {
+		if (dstk)
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(dst));
+		else
+			EMIT2(0x89, add_2reg(0xC0, dst, IA32_EDX));
+	} else {
+		if (dstk)
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(dst));
+		else
+			EMIT2(0x89, add_2reg(0xC0, dst, IA32_EAX));
+	}
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (shift) src
+ */
+static inline void emit_ia32_shift_r(const u8 op, const u8 dst, const u8 src,
+				     bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg = dstk ? IA32_EAX : dst;
+	u8 b2;
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(dst));
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src));
+	else if (src != IA32_ECX)
+		/* mov ecx,src */
+		EMIT2(0x8B, add_2reg(0xC0, src, IA32_ECX));
+
+	switch (op) {
+	case BPF_LSH:
+		b2 = 0xE0; break;
+	case BPF_RSH:
+		b2 = 0xE8; break;
+	case BPF_ARSH:
+		b2 = 0xF8; break;
+	default:
+		return;
+	}
+	EMIT2(0xD3, add_1reg(b2, dreg));
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],dreg */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg), STACK_VAR(dst));
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (op) src
+ */
+static inline void emit_ia32_alu_r(const bool is64, const bool hi, const u8 op,
+				   const u8 dst, const u8 src, bool dstk,
+				   bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 sreg = sstk ? IA32_EAX : src;
+	u8 dreg = dstk ? IA32_EDX : dst;
+
+	if (sstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(src));
+
+	if (dstk)
+		/* mov edx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(dst));
+
+	switch (BPF_OP(op)) {
+	/* dst = dst + src */
+	case BPF_ADD:
+		if (hi && is64)
+			EMIT2(0x11, add_2reg(0xC0, dreg, sreg));
+		else
+			EMIT2(0x01, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst - src */
+	case BPF_SUB:
+		if (hi && is64)
+			EMIT2(0x19, add_2reg(0xC0, dreg, sreg));
+		else
+			EMIT2(0x29, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst | src */
+	case BPF_OR:
+		EMIT2(0x09, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst & src */
+	case BPF_AND:
+		EMIT2(0x21, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst ^ src */
+	case BPF_XOR:
+		EMIT2(0x31, add_2reg(0xC0, dreg, sreg));
+		break;
+	}
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],dreg */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg),
+		      STACK_VAR(dst));
+	*pprog = prog;
+}
+
+/* ALU operation (64 bit) */
+static inline void emit_ia32_alu_r64(const bool is64, const u8 op,
+				     const u8 dst[], const u8 src[],
+				     bool dstk,  bool sstk,
+				     u8 **pprog)
+{
+	u8 *prog = *pprog;
+
+	emit_ia32_alu_r(is64, false, op, dst_lo, src_lo, dstk, sstk, &prog);
+	if (is64)
+		emit_ia32_alu_r(is64, true, op, dst_hi, src_hi, dstk, sstk,
+				&prog);
+	else
+		emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (op) val
+ */
+static inline void emit_ia32_alu_i(const bool is64, const bool hi, const u8 op,
+				   const u8 dst, const s32 val, bool dstk,
+				   u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg = dstk ? IA32_EAX : dst;
+	u8 sreg = IA32_EDX;
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(dst));
+
+	if (!is_imm8(val))
+		/* mov edx,imm32 */
+		EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EDX), val);
+
+	switch (op) {
+	/* dst = dst + val */
+	case BPF_ADD:
+		if (hi && is64) {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xD0, dreg), val);
+			else
+				EMIT2(0x11, add_2reg(0xC0, dreg, sreg));
+		} else {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xC0, dreg), val);
+			else
+				EMIT2(0x01, add_2reg(0xC0, dreg, sreg));
+		}
+		break;
+	/* dst = dst - val */
+	case BPF_SUB:
+		if (hi && is64) {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xD8, dreg), val);
+			else
+				EMIT2(0x19, add_2reg(0xC0, dreg, sreg));
+		} else {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xE8, dreg), val);
+			else
+				EMIT2(0x29, add_2reg(0xC0, dreg, sreg));
+		}
+		break;
+	/* dst = dst | val */
+	case BPF_OR:
+		if (is_imm8(val))
+			EMIT3(0x83, add_1reg(0xC8, dreg), val);
+		else
+			EMIT2(0x09, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst & val */
+	case BPF_AND:
+		if (is_imm8(val))
+			EMIT3(0x83, add_1reg(0xE0, dreg), val);
+		else
+			EMIT2(0x21, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst ^ val */
+	case BPF_XOR:
+		if (is_imm8(val))
+			EMIT3(0x83, add_1reg(0xF0, dreg), val);
+		else
+			EMIT2(0x31, add_2reg(0xC0, dreg, sreg));
+		break;
+	case BPF_NEG:
+		EMIT2(0xF7, add_1reg(0xD8, dreg));
+		break;
+	}
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],dreg */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg),
+		      STACK_VAR(dst));
+	*pprog = prog;
+}
+
+/* ALU operation (64 bit) */
+static inline void emit_ia32_alu_i64(const bool is64, const u8 op,
+				     const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	u32 hi = 0;
+
+	if (is64 && (val & (1U << 31)))
+		hi = (u32)~0;
+
+	emit_ia32_alu_i(is64, false, op, dst_lo, val, dstk, &prog);
+	if (is64)
+		emit_ia32_alu_i(is64, true, op, dst_hi, hi, dstk, &prog);
+	else
+		emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+
+	*pprog = prog;
+}
+
+/* dst = -dst (64 bit) */
+static inline void emit_ia32_neg64(const u8 dst[], bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	/* neg dreg_lo */
+	EMIT2(0xF7, add_1reg(0xD8, dreg_lo));
+	/* adc dreg_hi,0x0 */
+	EMIT3(0x83, add_1reg(0xD0, dreg_hi), 0x00);
+	/* neg dreg_hi */
+	EMIT2(0xF7, add_1reg(0xD8, dreg_hi));
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* dst = dst << src */
+static inline void emit_ia32_lsh_r64(const u8 dst[], const u8 src[],
+				     bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	static int jmp_label1 = -1;
+	static int jmp_label2 = -1;
+	static int jmp_label3 = -1;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src_lo));
+	else
+		/* mov ecx,src_lo */
+		EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_ECX));
+
+	/* cmp ecx,32 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 32);
+	/* jumps when >= 32 */
+	if (is_imm8(jmp_label(jmp_label1, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label1, 6));
+
+	/* < 32 */
+	/* shl dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xE0, dreg_hi));
+	/* mov ebx,dreg_lo */
+	EMIT2(0x8B, add_2reg(0xC0, dreg_lo, IA32_EBX));
+	/* shl dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE0, dreg_lo));
+
+	/* IA32_ECX = -IA32_ECX + 32 */
+	/* neg ecx */
+	EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+	/* add ecx,32 */
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+	/* shr ebx,cl */
+	EMIT2(0xD3, add_1reg(0xE8, IA32_EBX));
+	/* or dreg_hi,ebx */
+	EMIT2(0x09, add_2reg(0xC0, dreg_hi, IA32_EBX));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 32 */
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+
+	/* cmp ecx,64 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 64);
+	/* jumps when >= 64 */
+	if (is_imm8(jmp_label(jmp_label2, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label2, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label2, 6));
+
+	/* >= 32 && < 64 */
+	/* sub ecx,32 */
+	EMIT3(0x83, add_1reg(0xE8, IA32_ECX), 32);
+	/* shl dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE0, dreg_lo));
+	/* mov dreg_hi,dreg_lo */
+	EMIT2(0x89, add_2reg(0xC0, dreg_hi, dreg_lo));
+
+	/* xor dreg_lo,dreg_lo */
+	EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 64 */
+	if (jmp_label2 == -1)
+		jmp_label2 = cnt;
+	/* xor dreg_lo,dreg_lo */
+	EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+	/* xor dreg_hi,dreg_hi */
+	EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+
+	if (jmp_label3 == -1)
+		jmp_label3 = cnt;
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	/* out: */
+	*pprog = prog;
+}
+
+/* dst = dst >> src (signed) */
+static inline void emit_ia32_arsh_r64(const u8 dst[], const u8 src[],
+				      bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	static int jmp_label1 = -1;
+	static int jmp_label2 = -1;
+	static int jmp_label3 = -1;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src_lo));
+	else
+		/* mov ecx,src_lo */
+		EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_ECX));
+
+	/* cmp ecx,32 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 32);
+	/* jumps when >= 32 */
+	if (is_imm8(jmp_label(jmp_label1, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label1, 6));
+
+	/* < 32 */
+	/* shr dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_lo));
+	/* mov ebx,dreg_hi */
+	EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+	/* ashr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xF8, dreg_hi));
+
+	/* IA32_ECX = -IA32_ECX + 32 */
+	/* neg ecx */
+	EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+	/* add ecx,32 */
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+	/* shl ebx,cl */
+	EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+	/* or dreg_lo,ebx */
+	EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 32 */
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+
+	/* cmp ecx,64 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 64);
+	/* jumps when >= 64 */
+	if (is_imm8(jmp_label(jmp_label2, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label2, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label2, 6));
+
+	/* >= 32 && < 64 */
+	/* sub ecx,32 */
+	EMIT3(0x83, add_1reg(0xE8, IA32_ECX), 32);
+	/* ashr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xF8, dreg_hi));
+	/* mov dreg_lo,dreg_hi */
+	EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+
+	/* ashr dreg_hi,imm8 */
+	EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 64 */
+	if (jmp_label2 == -1)
+		jmp_label2 = cnt;
+	/* ashr dreg_hi,imm8 */
+	EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+	/* mov dreg_lo,dreg_hi */
+	EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+
+	if (jmp_label3 == -1)
+		jmp_label3 = cnt;
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	/* out: */
+	*pprog = prog;
+}
+
+/* dst = dst >> src */
+static inline void emit_ia32_rsh_r64(const u8 dst[], const u8 src[], bool dstk,
+				     bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	static int jmp_label1 = -1;
+	static int jmp_label2 = -1;
+	static int jmp_label3 = -1;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src_lo));
+	else
+		/* mov ecx,src_lo */
+		EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_ECX));
+
+	/* cmp ecx,32 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 32);
+	/* jumps when >= 32 */
+	if (is_imm8(jmp_label(jmp_label1, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label1, 6));
+
+	/* < 32 */
+	/* shr dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_lo));
+	/* mov ebx,dreg_hi */
+	EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+	/* shr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_hi));
+
+	/* IA32_ECX = -IA32_ECX + 32 */
+	/* neg ecx */
+	EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+	/* add ecx,32 */
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+	/* shl ebx,cl */
+	EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+	/* or dreg_lo,ebx */
+	EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 32 */
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+	/* cmp ecx,64 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 64);
+	/* jumps when >= 64 */
+	if (is_imm8(jmp_label(jmp_label2, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label2, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label2, 6));
+
+	/* >= 32 && < 64 */
+	/* sub ecx,32 */
+	EMIT3(0x83, add_1reg(0xE8, IA32_ECX), 32);
+	/* shr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_hi));
+	/* mov dreg_lo,dreg_hi */
+	EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+	/* xor dreg_hi,dreg_hi */
+	EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 64 */
+	if (jmp_label2 == -1)
+		jmp_label2 = cnt;
+	/* xor dreg_lo,dreg_lo */
+	EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+	/* xor dreg_hi,dreg_hi */
+	EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+
+	if (jmp_label3 == -1)
+		jmp_label3 = cnt;
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	/* out: */
+	*pprog = prog;
+}
+
+/* dst = dst << val */
+static inline void emit_ia32_lsh_i64(const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	/* Do LSH operation */
+	if (val < 32) {
+		/* shl dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xE0, dreg_hi), val);
+		/* mov ebx,dreg_lo */
+		EMIT2(0x8B, add_2reg(0xC0, dreg_lo, IA32_EBX));
+		/* shl dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE0, dreg_lo), val);
+
+		/* IA32_ECX = 32 - val */
+		/* mov cl,val */
+		EMIT2(0xB1, val);
+		/* movzx ecx,cl */
+		EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+		/* neg ecx */
+		EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+		/* add ecx,32 */
+		EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+		/* shr ebx,cl */
+		EMIT2(0xD3, add_1reg(0xE8, IA32_EBX));
+		/* or dreg_hi,ebx */
+		EMIT2(0x09, add_2reg(0xC0, dreg_hi, IA32_EBX));
+	} else if (val >= 32 && val < 64) {
+		u32 value = val - 32;
+
+		/* shl dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE0, dreg_lo), value);
+		/* mov dreg_hi,dreg_lo */
+		EMIT2(0x89, add_2reg(0xC0, dreg_hi, dreg_lo));
+		/* xor dreg_lo,dreg_lo */
+		EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+	} else {
+		/* xor dreg_lo,dreg_lo */
+		EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+	}
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* dst = dst >> val */
+static inline void emit_ia32_rsh_i64(const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	/* Do RSH operation */
+	if (val < 32) {
+		/* shr dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_lo), val);
+		/* mov ebx,dreg_hi */
+		EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+		/* shr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_hi), val);
+
+		/* IA32_ECX = 32 - val */
+		/* mov cl,val */
+		EMIT2(0xB1, val);
+		/* movzx ecx,cl */
+		EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+		/* neg ecx */
+		EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+		/* add ecx,32 */
+		EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+		/* shl ebx,cl */
+		EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+		/* or dreg_lo,ebx */
+		EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+	} else if (val >= 32 && val < 64) {
+		u32 value = val - 32;
+
+		/* shr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_hi), value);
+		/* mov dreg_lo,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+	} else {
+		/* xor dreg_lo,dreg_lo */
+		EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+	}
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* dst = dst >> val (signed) */
+static inline void emit_ia32_arsh_i64(const u8 dst[], const u32 val,
+				      bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	/* Do ARSH operation */
+	if (val < 32) {
+		/* shr dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_lo), val);
+		/* mov ebx,dreg_hi */
+		EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), val);
+
+		/* IA32_ECX = 32 - val */
+		/* mov cl,val */
+		EMIT2(0xB1, val);
+		/* movzx ecx,cl */
+		EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+		/* neg ecx */
+		EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+		/* add ecx,32 */
+		EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+		/* shl ebx,cl */
+		EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+		/* or dreg_lo,ebx */
+		EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+	} else if (val >= 32 && val < 64) {
+		u32 value = val - 32;
+
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), value);
+		/* mov dreg_lo,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+	} else {
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+		/* mov dreg_lo,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+	}
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+static inline void emit_ia32_mul_r64(const u8 dst[], const u8 src[], bool dstk,
+				     bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_hi));
+	else
+		/* mov eax,dst_hi */
+		EMIT2(0x8B, add_2reg(0xC0, dst_hi, IA32_EAX));
+
+	if (sstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(src_lo));
+	else
+		/* mul src_lo */
+		EMIT2(0xF7, add_1reg(0xE0, src_lo));
+
+	/* mov ecx,eax */
+	EMIT2(0x89, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+	else
+		/* mov eax,dst_lo */
+		EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+	if (sstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(src_hi));
+	else
+		/* mul src_hi */
+		EMIT2(0xF7, add_1reg(0xE0, src_hi));
+
+	/* add ecx,eax */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+	else
+		/* mov eax,dst_lo */
+		EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+	if (sstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(src_lo));
+	else
+		/* mul src_lo */
+		EMIT2(0xF7, add_1reg(0xE0, src_lo));
+
+	/* add ecx,edx */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EDX));
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],ecx */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(dst_hi));
+	} else {
+		/* mov dst_lo,eax */
+		EMIT2(0x89, add_2reg(0xC0, dst_lo, IA32_EAX));
+		/* mov dst_hi,ecx */
+		EMIT2(0x89, add_2reg(0xC0, dst_hi, IA32_ECX));
+	}
+
+	*pprog = prog;
+}
+
+static inline void emit_ia32_mul_i64(const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u32 hi;
+
+	hi = (val & (1U << 31)) ? (u32)~0 : 0;
+	/* movl eax,imm32 */
+	EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EAX), val);
+	if (dstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(dst_hi));
+	else
+		/* mul dst_hi */
+		EMIT2(0xF7, add_1reg(0xE0, dst_hi));
+
+	/* mov ecx,eax */
+	EMIT2(0x89, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	/* movl eax,imm32 */
+	EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EAX), hi);
+	if (dstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(dst_lo));
+	else
+		/* mul dst_lo */
+		EMIT2(0xF7, add_1reg(0xE0, dst_lo));
+	/* add ecx,eax */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	/* movl eax,imm32 */
+	EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EAX), val);
+	if (dstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(dst_lo));
+	else
+		/* mul dst_lo */
+		EMIT2(0xF7, add_1reg(0xE0, dst_lo));
+
+	/* add ecx,edx */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EDX));
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],ecx */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(dst_hi));
+	} else {
+		/* mov dst_lo,eax */
+		EMIT2(0x89, add_2reg(0xC0, dst_lo, IA32_EAX));
+		/* mov dst_hi,ecx */
+		EMIT2(0x89, add_2reg(0xC0, dst_hi, IA32_ECX));
+	}
+
+	*pprog = prog;
+}
+
+static int bpf_size_to_x86_bytes(int bpf_size)
+{
+	if (bpf_size == BPF_W)
+		return 4;
+	else if (bpf_size == BPF_H)
+		return 2;
+	else if (bpf_size == BPF_B)
+		return 1;
+	else if (bpf_size == BPF_DW)
+		return 4; /* imm32 */
+	else
+		return 0;
+}
+
+struct jit_context {
+	int cleanup_addr; /* epilogue code offset */
+};
+
+/* maximum number of bytes emitted while JITing one eBPF insn */
+#define BPF_MAX_INSN_SIZE	128
+#define BPF_INSN_SAFETY		64
+
+#define PROLOGUE_SIZE 35
+
+/* Emit prologue code for BPF program and check its size.
+ * bpf_tail_call helper will skip it while jumping into another program
+ */
+static void emit_prologue(u8 **pprog, u32 stack_depth)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	const u8 *r1 = bpf2ia32[BPF_REG_1];
+	const u8 fplo = bpf2ia32[BPF_REG_FP][0];
+	const u8 fphi = bpf2ia32[BPF_REG_FP][1];
+	const u8 *tcc = bpf2ia32[TCALL_CNT];
+
+	/* push ebp */
+	EMIT1(0x55);
+	/* mov ebp,esp */
+	EMIT2(0x89, 0xE5);
+	/* push edi */
+	EMIT1(0x57);
+	/* push esi */
+	EMIT1(0x56);
+	/* push ebx */
+	EMIT1(0x53);
+
+	/* sub esp,STACK_SIZE */
+	EMIT2_off32(0x81, 0xEC, STACK_SIZE);
+	/* sub ebp,SCRATCH_SIZE+4+12 */
+	EMIT3(0x83, add_1reg(0xE8, IA32_EBP), SCRATCH_SIZE + 16);
+	/* xor ebx,ebx */
+	EMIT2(0x31, add_2reg(0xC0, IA32_EBX, IA32_EBX));
+
+	/* Set up BPF prog stack base register */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBP), STACK_VAR(fplo));
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(fphi));
+
+	/* Move BPF_CTX (EAX) to BPF_REG_R1 */
+	/* mov dword ptr [ebp+off],eax */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r1[0]));
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(r1[1]));
+
+	/* Initialize Tail Count */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[0]));
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[1]));
+
+	BUILD_BUG_ON(cnt != PROLOGUE_SIZE);
+	*pprog = prog;
+}
+
+/* Emit epilogue code for BPF program */
+static void emit_epilogue(u8 **pprog, u32 stack_depth)
+{
+	u8 *prog = *pprog;
+	const u8 *r0 = bpf2ia32[BPF_REG_0];
+	int cnt = 0;
+
+	/* mov eax,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r0[0]));
+	/* mov edx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(r0[1]));
+
+	/* add ebp,SCRATCH_SIZE+4+12 */
+	EMIT3(0x83, add_1reg(0xC0, IA32_EBP), SCRATCH_SIZE + 16);
+
+	/* mov ebx,dword ptr [ebp-12] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX), -12);
+	/* mov esi,dword ptr [ebp-8] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ESI), -8);
+	/* mov edi,dword ptr [ebp-4] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDI), -4);
+
+	EMIT1(0xC9); /* leave */
+	EMIT1(0xC3); /* ret */
+	*pprog = prog;
+}
+
+/* generate the following code:
+ * ... bpf_tail_call(void *ctx, struct bpf_array *array, u64 index) ...
+ *   if (index >= array->map.max_entries)
+ *     goto out;
+ *   if (++tail_call_cnt > MAX_TAIL_CALL_CNT)
+ *     goto out;
+ *   prog = array->ptrs[index];
+ *   if (prog == NULL)
+ *     goto out;
+ *   goto *(prog->bpf_func + prologue_size);
+ * out:
+ */
+static void emit_bpf_tail_call(u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	const u8 *r1 = bpf2ia32[BPF_REG_1];
+	const u8 *r2 = bpf2ia32[BPF_REG_2];
+	const u8 *r3 = bpf2ia32[BPF_REG_3];
+	const u8 *tcc = bpf2ia32[TCALL_CNT];
+	u32 lo, hi;
+	static int jmp_label1 = -1;
+
+	/* if (index >= array->map.max_entries)
+	 *   goto out;
+	 */
+	/* mov eax,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r2[0]));
+	/* mov edx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(r3[0]));
+
+	/* cmp dword ptr [eax+off],edx */
+	EMIT3(0x39, add_2reg(0x40, IA32_EAX, IA32_EDX),
+	      offsetof(struct bpf_array, map.max_entries));
+	/* jbe out */
+	EMIT2(IA32_JBE, jmp_label(jmp_label1, 2));
+
+	/* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
+	 *   goto out;
+	 */
+	lo = (u32)MAX_TAIL_CALL_CNT;
+	hi = (u32)((u64)MAX_TAIL_CALL_CNT >> 32);
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(tcc[0]));
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[1]));
+
+	EMIT3(0x83, add_1reg(0xF8, IA32_EBX), hi);   /* cmp ebx,hi */
+	EMIT2(IA32_JNE, 3);
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), lo);   /* cmp ecx,lo */
+
+	EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));   /* jae out */
+
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 0x01);   /* add ecx,0x1 */
+	EMIT3(0x83, add_1reg(0xD0, IA32_EBX), 0x00);   /* adc ebx,0x0 */
+
+	/* mov dword ptr [ebp+off],ecx */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(tcc[0]));
+	/* mov dword ptr [ebp+off],ebx */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[1]));
+
+	/* prog = array->ptrs[index]; */
+	/* mov edx, [eax + edx * 4 + offsetof(...)] */
+	EMIT3_off32(0x8B, 0x94, 0x90, offsetof(struct bpf_array, ptrs));
+
+	/* if (prog == NULL)
+	 *   goto out;
+	 */
+	EMIT2(0x85, add_2reg(0xC0, IA32_EDX, IA32_EDX)); /* test edx,edx */
+	EMIT2(IA32_JE, jmp_label(jmp_label1, 2)); /* je out */
+
+	/* goto *(prog->bpf_func + prologue_size); */
+	/* mov edx,dword ptr [edx+offsetof(struct bpf_prog, bpf_func)] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EDX, IA32_EDX),
+	      offsetof(struct bpf_prog, bpf_func));
+	/* add edx, prologue_size */
+	EMIT3(0x83, add_1reg(0xC0, IA32_EDX), PROLOGUE_SIZE);
+
+	/* mov eax,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r1[0]));
+
+	/* now we're ready to jump into next BPF program
+	 * eax == ctx (1st arg)
+	 * edx == prog->bpf_func + prologue_size
+	 */
+	RETPOLINE_EDX_BPF_JIT();
+
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+
+	/* out: */
+	*pprog = prog;
+}
+
+/* Push the scratch stack register on top of the stack. */
+static inline void emit_push_r64(const u8 src[], u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	/* mov ecx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src_hi));
+	/* push ecx */
+	EMIT1(0x51);
+
+	/* mov ecx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src_lo));
+	/* push ecx */
+	EMIT1(0x51);
+
+	*pprog = prog;
+}
+
+static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
+		  int oldproglen, struct jit_context *ctx)
+{
+	struct bpf_insn *insn = bpf_prog->insnsi;
+	int insn_cnt = bpf_prog->len;
+	bool seen_exit = false;
+	u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
+	int i, cnt = 0;
+	int proglen = 0;
+	u8 *prog = temp;
+
+	emit_prologue(&prog, bpf_prog->aux->stack_depth);
+
+	for (i = 0; i < insn_cnt; i++, insn++) {
+		const s32 imm32 = insn->imm;
+		const bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
+		const bool dstk = is_on_stack(insn->dst_reg);
+		const bool sstk = is_on_stack(insn->src_reg);
+		const u8 code = insn->code;
+		const u8 *dst = bpf2ia32[insn->dst_reg];
+		const u8 *src = bpf2ia32[insn->src_reg];
+		const u8 *r0 = bpf2ia32[BPF_REG_0];
+		s64 jmp_offset;
+		u8 jmp_cond;
+		int ilen;
+		u8 *func;
+
+		switch (code) {
+		/* ALU operations */
+		/* dst = src */
+		case BPF_ALU | BPF_MOV | BPF_K:
+		case BPF_ALU | BPF_MOV | BPF_X:
+		case BPF_ALU64 | BPF_MOV | BPF_K:
+		case BPF_ALU64 | BPF_MOV | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_mov_r64(is64, dst, src, dstk,
+						  sstk, &prog);
+				break;
+			case BPF_K:
+				/* Sign-extend immediate value to dst reg */
+				emit_ia32_mov_i64(is64, dst, imm32,
+						  dstk, &prog);
+				break;
+			}
+			break;
+		/* dst = dst + src/imm */
+		/* dst = dst - src/imm */
+		/* dst = dst | src/imm */
+		/* dst = dst & src/imm */
+		/* dst = dst ^ src/imm */
+		case BPF_ALU | BPF_ADD | BPF_K:
+		case BPF_ALU | BPF_ADD | BPF_X:
+		case BPF_ALU | BPF_SUB | BPF_K:
+		case BPF_ALU | BPF_SUB | BPF_X:
+		case BPF_ALU | BPF_OR | BPF_K:
+		case BPF_ALU | BPF_OR | BPF_X:
+		case BPF_ALU | BPF_AND | BPF_K:
+		case BPF_ALU | BPF_AND | BPF_X:
+		case BPF_ALU | BPF_XOR | BPF_K:
+		case BPF_ALU | BPF_XOR | BPF_X:
+		case BPF_ALU64 | BPF_ADD | BPF_K:
+		case BPF_ALU64 | BPF_ADD | BPF_X:
+		case BPF_ALU64 | BPF_SUB | BPF_K:
+		case BPF_ALU64 | BPF_SUB | BPF_X:
+		case BPF_ALU64 | BPF_OR | BPF_K:
+		case BPF_ALU64 | BPF_OR | BPF_X:
+		case BPF_ALU64 | BPF_AND | BPF_K:
+		case BPF_ALU64 | BPF_AND | BPF_X:
+		case BPF_ALU64 | BPF_XOR | BPF_K:
+		case BPF_ALU64 | BPF_XOR | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_alu_r64(is64, BPF_OP(code), dst,
+						  src, dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				emit_ia32_alu_i64(is64, BPF_OP(code), dst,
+						  imm32, dstk, &prog);
+				break;
+			}
+			break;
+		case BPF_ALU | BPF_MUL | BPF_K:
+		case BPF_ALU | BPF_MUL | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_mul_r(dst_lo, src_lo, dstk,
+						sstk, &prog);
+				break;
+			case BPF_K:
+				/* mov ecx,imm32 */
+				EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX),
+					    imm32);
+				emit_ia32_mul_r(dst_lo, IA32_ECX, dstk,
+						false, &prog);
+				break;
+			}
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		case BPF_ALU | BPF_LSH | BPF_X:
+		case BPF_ALU | BPF_RSH | BPF_X:
+		case BPF_ALU | BPF_ARSH | BPF_K:
+		case BPF_ALU | BPF_ARSH | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_shift_r(BPF_OP(code), dst_lo, src_lo,
+						  dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				/* mov ecx,imm32 */
+				EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX),
+					    imm32);
+				emit_ia32_shift_r(BPF_OP(code), dst_lo,
+						  IA32_ECX, dstk, false,
+						  &prog);
+				break;
+			}
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		/* dst = dst / src(imm) */
+		/* dst = dst % src(imm) */
+		case BPF_ALU | BPF_DIV | BPF_K:
+		case BPF_ALU | BPF_DIV | BPF_X:
+		case BPF_ALU | BPF_MOD | BPF_K:
+		case BPF_ALU | BPF_MOD | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_div_mod_r(BPF_OP(code), dst_lo,
+						    src_lo, dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				/* mov ecx,imm32*/
+				EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX),
+					    imm32);
+				emit_ia32_div_mod_r(BPF_OP(code), dst_lo,
+						    IA32_ECX, dstk, false,
+						    &prog);
+				break;
+			}
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		case BPF_ALU64 | BPF_DIV | BPF_K:
+		case BPF_ALU64 | BPF_DIV | BPF_X:
+		case BPF_ALU64 | BPF_MOD | BPF_K:
+		case BPF_ALU64 | BPF_MOD | BPF_X:
+			goto notyet;
+		/* dst = dst >> imm */
+		/* dst = dst << imm */
+		case BPF_ALU | BPF_RSH | BPF_K:
+		case BPF_ALU | BPF_LSH | BPF_K:
+			if (unlikely(imm32 > 31))
+				return -EINVAL;
+			/* mov ecx,imm32*/
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
+			emit_ia32_shift_r(BPF_OP(code), dst_lo, IA32_ECX, dstk,
+					  false, &prog);
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		/* dst = dst << imm */
+		case BPF_ALU64 | BPF_LSH | BPF_K:
+			if (unlikely(imm32 > 63))
+				return -EINVAL;
+			emit_ia32_lsh_i64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = dst >> imm */
+		case BPF_ALU64 | BPF_RSH | BPF_K:
+			if (unlikely(imm32 > 63))
+				return -EINVAL;
+			emit_ia32_rsh_i64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = dst << src */
+		case BPF_ALU64 | BPF_LSH | BPF_X:
+			emit_ia32_lsh_r64(dst, src, dstk, sstk, &prog);
+			break;
+		/* dst = dst >> src */
+		case BPF_ALU64 | BPF_RSH | BPF_X:
+			emit_ia32_rsh_r64(dst, src, dstk, sstk, &prog);
+			break;
+		/* dst = dst >> src (signed) */
+		case BPF_ALU64 | BPF_ARSH | BPF_X:
+			emit_ia32_arsh_r64(dst, src, dstk, sstk, &prog);
+			break;
+		/* dst = dst >> imm (signed) */
+		case BPF_ALU64 | BPF_ARSH | BPF_K:
+			if (unlikely(imm32 > 63))
+				return -EINVAL;
+			emit_ia32_arsh_i64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = -dst */
+		case BPF_ALU | BPF_NEG:
+			emit_ia32_alu_i(is64, false, BPF_OP(code),
+					dst_lo, 0, dstk, &prog);
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		/* dst = -dst (64 bit) */
+		case BPF_ALU64 | BPF_NEG:
+			emit_ia32_neg64(dst, dstk, &prog);
+			break;
+		/* dst = dst * src/imm */
+		case BPF_ALU64 | BPF_MUL | BPF_X:
+		case BPF_ALU64 | BPF_MUL | BPF_K:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_mul_r64(dst, src, dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				emit_ia32_mul_i64(dst, imm32, dstk, &prog);
+				break;
+			}
+			break;
+		/* dst = htole(dst) */
+		case BPF_ALU | BPF_END | BPF_FROM_LE:
+			emit_ia32_to_le_r64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = htobe(dst) */
+		case BPF_ALU | BPF_END | BPF_FROM_BE:
+			emit_ia32_to_be_r64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = imm64 */
+		case BPF_LD | BPF_IMM | BPF_DW: {
+			s32 hi, lo = imm32;
+
+			hi = insn[1].imm;
+			emit_ia32_mov_i(dst_lo, lo, dstk, &prog);
+			emit_ia32_mov_i(dst_hi, hi, dstk, &prog);
+			insn++;
+			i++;
+			break;
+		}
+		/* ST: *(u8*)(dst_reg + off) = imm */
+		case BPF_ST | BPF_MEM | BPF_H:
+		case BPF_ST | BPF_MEM | BPF_B:
+		case BPF_ST | BPF_MEM | BPF_W:
+		case BPF_ST | BPF_MEM | BPF_DW:
+			if (dstk)
+				/* mov eax,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+			else
+				/* mov eax,dst_lo */
+				EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+				EMIT(0xC6, 1); break;
+			case BPF_H:
+				EMIT2(0x66, 0xC7); break;
+			case BPF_W:
+			case BPF_DW:
+				EMIT(0xC7, 1); break;
+			}
+
+			if (is_imm8(insn->off))
+				EMIT2(add_1reg(0x40, IA32_EAX), insn->off);
+			else
+				EMIT1_off32(add_1reg(0x80, IA32_EAX),
+					    insn->off);
+			EMIT(imm32, bpf_size_to_x86_bytes(BPF_SIZE(code)));
+
+			if (BPF_SIZE(code) == BPF_DW) {
+				u32 hi;
+
+				hi = imm32 & (1<<31) ? (u32)~0 : 0;
+				EMIT2_off32(0xC7, add_1reg(0x80, IA32_EAX),
+					    insn->off + 4);
+				EMIT(hi, 4);
+			}
+			break;
+
+		/* STX: *(u8*)(dst_reg + off) = src_reg */
+		case BPF_STX | BPF_MEM | BPF_B:
+		case BPF_STX | BPF_MEM | BPF_H:
+		case BPF_STX | BPF_MEM | BPF_W:
+		case BPF_STX | BPF_MEM | BPF_DW:
+			if (dstk)
+				/* mov eax,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+			else
+				/* mov eax,dst_lo */
+				EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+			if (sstk)
+				/* mov edx,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(src_lo));
+			else
+				/* mov edx,src_lo */
+				EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_EDX));
+
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+				EMIT(0x88, 1); break;
+			case BPF_H:
+				EMIT2(0x66, 0x89); break;
+			case BPF_W:
+			case BPF_DW:
+				EMIT(0x89, 1); break;
+			}
+
+			if (is_imm8(insn->off))
+				EMIT2(add_2reg(0x40, IA32_EAX, IA32_EDX),
+				      insn->off);
+			else
+				EMIT1_off32(add_2reg(0x80, IA32_EAX, IA32_EDX),
+					    insn->off);
+
+			if (BPF_SIZE(code) == BPF_DW) {
+				if (sstk)
+					/* mov edx,dword ptr [ebp+off] */
+					EMIT3(0x8B, add_2reg(0x40, IA32_EBP,
+							     IA32_EDX),
+					      STACK_VAR(src_hi));
+				else
+					/* mov edx,src_hi */
+					EMIT2(0x8B, add_2reg(0xC0, src_hi,
+							     IA32_EDX));
+				EMIT1(0x89);
+				if (is_imm8(insn->off + 4)) {
+					EMIT2(add_2reg(0x40, IA32_EAX,
+						       IA32_EDX),
+					      insn->off + 4);
+				} else {
+					EMIT1(add_2reg(0x80, IA32_EAX,
+						       IA32_EDX));
+					EMIT(insn->off + 4, 4);
+				}
+			}
+			break;
+
+		/* LDX: dst_reg = *(u8*)(src_reg + off) */
+		case BPF_LDX | BPF_MEM | BPF_B:
+		case BPF_LDX | BPF_MEM | BPF_H:
+		case BPF_LDX | BPF_MEM | BPF_W:
+		case BPF_LDX | BPF_MEM | BPF_DW:
+			if (sstk)
+				/* mov eax,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(src_lo));
+			else
+				/* mov eax,src_lo */
+				EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_EAX));
+
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+				EMIT2(0x0F, 0xB6); break;
+			case BPF_H:
+				EMIT2(0x0F, 0xB7); break;
+			case BPF_W:
+			case BPF_DW:
+				EMIT(0x8B, 1); break;
+			}
+
+			if (is_imm8(insn->off))
+				EMIT2(add_2reg(0x40, IA32_EAX, IA32_EDX),
+				      insn->off);
+			else
+				EMIT1_off32(add_2reg(0x80, IA32_EAX, IA32_EDX),
+					    insn->off);
+
+			if (dstk)
+				/* mov dword ptr [ebp+off],edx */
+				EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_lo));
+			else
+				/* mov dst_lo,edx */
+				EMIT2(0x89, add_2reg(0xC0, dst_lo, IA32_EDX));
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+			case BPF_H:
+			case BPF_W:
+				if (dstk) {
+					EMIT3(0xC7, add_1reg(0x40, IA32_EBP),
+					      STACK_VAR(dst_hi));
+					EMIT(0x0, 4);
+				} else {
+					EMIT3(0xC7, add_1reg(0xC0, dst_hi), 0);
+				}
+				break;
+			case BPF_DW:
+				EMIT2_off32(0x8B,
+					    add_2reg(0x80, IA32_EAX, IA32_EDX),
+					    insn->off + 4);
+				if (dstk)
+					EMIT3(0x89,
+					      add_2reg(0x40, IA32_EBP,
+						       IA32_EDX),
+					      STACK_VAR(dst_hi));
+				else
+					EMIT2(0x89,
+					      add_2reg(0xC0, dst_hi, IA32_EDX));
+				break;
+			default:
+				break;
+			}
+			break;
+		/* call */
+		case BPF_JMP | BPF_CALL:
+		{
+			const u8 *r1 = bpf2ia32[BPF_REG_1];
+			const u8 *r2 = bpf2ia32[BPF_REG_2];
+			const u8 *r3 = bpf2ia32[BPF_REG_3];
+			const u8 *r4 = bpf2ia32[BPF_REG_4];
+			const u8 *r5 = bpf2ia32[BPF_REG_5];
+
+			if (insn->src_reg == BPF_PSEUDO_CALL)
+				goto notyet;
+
+			func = (u8 *) __bpf_call_base + imm32;
+			jmp_offset = func - (image + addrs[i]);
+
+			if (!imm32 || !is_simm32(jmp_offset)) {
+				pr_err("unsupported bpf func %d addr %p image %p\n",
+				       imm32, func, image);
+				return -EINVAL;
+			}
+
+			/* mov eax,dword ptr [ebp+off] */
+			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r1[0]));
+			/* mov edx,dword ptr [ebp+off] */
+			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r1[1]));
+
+			emit_push_r64(r5, &prog);
+			emit_push_r64(r4, &prog);
+			emit_push_r64(r3, &prog);
+			emit_push_r64(r2, &prog);
+
+			EMIT1_off32(0xE8, jmp_offset + 9);
+
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r0[0]));
+			/* mov dword ptr [ebp+off],edx */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r0[1]));
+
+			/* add esp,32 */
+			EMIT3(0x83, add_1reg(0xC0, IA32_ESP), 32);
+			break;
+		}
+		case BPF_JMP | BPF_TAIL_CALL:
+			emit_bpf_tail_call(&prog);
+			break;
+
+		/* cond jump */
+		case BPF_JMP | BPF_JEQ | BPF_X:
+		case BPF_JMP | BPF_JNE | BPF_X:
+		case BPF_JMP | BPF_JGT | BPF_X:
+		case BPF_JMP | BPF_JLT | BPF_X:
+		case BPF_JMP | BPF_JGE | BPF_X:
+		case BPF_JMP | BPF_JLE | BPF_X:
+		case BPF_JMP | BPF_JSGT | BPF_X:
+		case BPF_JMP | BPF_JSLE | BPF_X:
+		case BPF_JMP | BPF_JSLT | BPF_X:
+		case BPF_JMP | BPF_JSGE | BPF_X: {
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = sstk ? IA32_ECX : src_lo;
+			u8 sreg_hi = sstk ? IA32_EBX : src_hi;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+
+			if (sstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+				      STACK_VAR(src_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX),
+				      STACK_VAR(src_hi));
+			}
+
+			/* cmp dreg_hi,sreg_hi */
+			EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
+			EMIT2(IA32_JNE, 2);
+			/* cmp dreg_lo,sreg_lo */
+			EMIT2(0x39, add_2reg(0xC0, dreg_lo, sreg_lo));
+			goto emit_cond_jmp;
+		}
+		case BPF_JMP | BPF_JSET | BPF_X: {
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = sstk ? IA32_ECX : src_lo;
+			u8 sreg_hi = sstk ? IA32_EBX : src_hi;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+
+			if (sstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+				      STACK_VAR(src_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX),
+				      STACK_VAR(src_hi));
+			}
+			/* and dreg_lo,sreg_lo */
+			EMIT2(0x23, add_2reg(0xC0, sreg_lo, dreg_lo));
+			/* and dreg_hi,sreg_hi */
+			EMIT2(0x23, add_2reg(0xC0, sreg_hi, dreg_hi));
+			/* or dreg_lo,dreg_hi */
+			EMIT2(0x09, add_2reg(0xC0, dreg_lo, dreg_hi));
+			goto emit_cond_jmp;
+		}
+		case BPF_JMP | BPF_JSET | BPF_K: {
+			u32 hi;
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = IA32_ECX;
+			u8 sreg_hi = IA32_EBX;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+			hi = imm32 & (1<<31) ? (u32)~0 : 0;
+
+			/* mov ecx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
+			/* mov ebx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX), hi);
+
+			/* and dreg_lo,sreg_lo */
+			EMIT2(0x23, add_2reg(0xC0, sreg_lo, dreg_lo));
+			/* and dreg_hi,sreg_hi */
+			EMIT2(0x23, add_2reg(0xC0, sreg_hi, dreg_hi));
+			/* or dreg_lo,dreg_hi */
+			EMIT2(0x09, add_2reg(0xC0, dreg_lo, dreg_hi));
+			goto emit_cond_jmp;
+		}
+		case BPF_JMP | BPF_JEQ | BPF_K:
+		case BPF_JMP | BPF_JNE | BPF_K:
+		case BPF_JMP | BPF_JGT | BPF_K:
+		case BPF_JMP | BPF_JLT | BPF_K:
+		case BPF_JMP | BPF_JGE | BPF_K:
+		case BPF_JMP | BPF_JLE | BPF_K:
+		case BPF_JMP | BPF_JSGT | BPF_K:
+		case BPF_JMP | BPF_JSLE | BPF_K:
+		case BPF_JMP | BPF_JSLT | BPF_K:
+		case BPF_JMP | BPF_JSGE | BPF_K: {
+			u32 hi;
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = IA32_ECX;
+			u8 sreg_hi = IA32_EBX;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+
+			hi = imm32 & (1<<31) ? (u32)~0 : 0;
+			/* mov ecx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
+			/* mov ebx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX), hi);
+
+			/* cmp dreg_hi,sreg_hi */
+			EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
+			EMIT2(IA32_JNE, 2);
+			/* cmp dreg_lo,sreg_lo */
+			EMIT2(0x39, add_2reg(0xC0, dreg_lo, sreg_lo));
+
+emit_cond_jmp:		/* convert BPF opcode to x86 */
+			switch (BPF_OP(code)) {
+			case BPF_JEQ:
+				jmp_cond = IA32_JE;
+				break;
+			case BPF_JSET:
+			case BPF_JNE:
+				jmp_cond = IA32_JNE;
+				break;
+			case BPF_JGT:
+				/* GT is unsigned '>', JA in x86 */
+				jmp_cond = IA32_JA;
+				break;
+			case BPF_JLT:
+				/* LT is unsigned '<', JB in x86 */
+				jmp_cond = IA32_JB;
+				break;
+			case BPF_JGE:
+				/* GE is unsigned '>=', JAE in x86 */
+				jmp_cond = IA32_JAE;
+				break;
+			case BPF_JLE:
+				/* LE is unsigned '<=', JBE in x86 */
+				jmp_cond = IA32_JBE;
+				break;
+			case BPF_JSGT:
+				/* signed '>', GT in x86 */
+				jmp_cond = IA32_JG;
+				break;
+			case BPF_JSLT:
+				/* signed '<', LT in x86 */
+				jmp_cond = IA32_JL;
+				break;
+			case BPF_JSGE:
+				/* signed '>=', GE in x86 */
+				jmp_cond = IA32_JGE;
+				break;
+			case BPF_JSLE:
+				/* signed '<=', LE in x86 */
+				jmp_cond = IA32_JLE;
+				break;
+			default: /* to silence gcc warning */
+				return -EFAULT;
+			}
+			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (is_imm8(jmp_offset)) {
+				EMIT2(jmp_cond, jmp_offset);
+			} else if (is_simm32(jmp_offset)) {
+				EMIT2_off32(0x0F, jmp_cond + 0x10, jmp_offset);
+			} else {
+				pr_err("cond_jmp gen bug %llx\n", jmp_offset);
+				return -EFAULT;
+			}
+
+			break;
+		}
+		case BPF_JMP | BPF_JA:
+			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (!jmp_offset)
+				/* optimize out nop jumps */
+				break;
+emit_jmp:
+			if (is_imm8(jmp_offset)) {
+				EMIT2(0xEB, jmp_offset);
+			} else if (is_simm32(jmp_offset)) {
+				EMIT1_off32(0xE9, jmp_offset);
+			} else {
+				pr_err("jmp gen bug %llx\n", jmp_offset);
+				return -EFAULT;
+			}
+			break;
+
+		case BPF_LD | BPF_ABS | BPF_W:
+		case BPF_LD | BPF_ABS | BPF_H:
+		case BPF_LD | BPF_ABS | BPF_B:
+		case BPF_LD | BPF_IND | BPF_W:
+		case BPF_LD | BPF_IND | BPF_H:
+		case BPF_LD | BPF_IND | BPF_B:
+		{
+			int size;
+			const u8 *r6 = bpf2ia32[BPF_REG_6];
+
+			/* Setting up first argument */
+			/* mov eax,dword ptr [ebp+off] */
+			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r6[0]));
+
+			/* Setting up second argument */
+			if (BPF_MODE(code) == BPF_ABS) {
+				/* mov %edx, imm32 */
+				EMIT1_off32(0xBA, imm32);
+			} else {
+				if (sstk)
+					/* mov edx,dword ptr [ebp+off] */
+					EMIT3(0x8B, add_2reg(0x40, IA32_EBP,
+							     IA32_EDX),
+					      STACK_VAR(src_lo));
+				else
+					/* mov edx,src_lo */
+					EMIT2(0x8B, add_2reg(0xC0, src_lo,
+							     IA32_EDX));
+				if (imm32) {
+					if (is_imm8(imm32))
+						/* add %edx, imm8 */
+						EMIT3(0x83, 0xC2, imm32);
+					else
+						/* add %edx, imm32 */
+						EMIT2_off32(0x81, 0xC2, imm32);
+				}
+			}
+
+			/* Setting up third argument */
+			switch (BPF_SIZE(code)) {
+			case BPF_W:
+				size = 4;
+				break;
+			case BPF_H:
+				size = 2;
+				break;
+			case BPF_B:
+				size = 1;
+				break;
+			default:
+				return -EINVAL;
+			}
+			/* mov ecx,val */
+			EMIT2(0xB1, size);
+			/* movzx ecx,ecx */
+			EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+
+			/* mov ebx,ebp */
+			EMIT2(0x8B, add_2reg(0xC0, IA32_EBP, IA32_EBX));
+			/* add %ebx,imm8 */
+			EMIT3(0x83, add_1reg(0xC0, IA32_EBX), SKB_BUFFER);
+			/* push ebx */
+			EMIT1(0x53);
+
+			/* Setting up function pointer to call */
+			/* mov ebx,imm32*/
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX),
+				    (unsigned int)bpf_load_pointer);
+
+			EMIT2(0xFF, add_1reg(0xD0, IA32_EBX));
+			/* add %esp,4 */
+			EMIT3(0x83, add_1reg(0xC0, IA32_ESP), 4);
+			/* xor edx,edx */
+			EMIT2(0x33, add_2reg(0xC0, IA32_EDX, IA32_EDX));
+
+			/* mov dword ptr [ebp+off],edx */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r0[0]));
+			/* mov dword ptr [ebp+off],edx */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r0[1]));
+
+			/* Check whether the returned pointer is NULL.
+			 * If it is NULL, jump to the epilogue;
+			 * otherwise load the value from the returned address.
+			 */
+			EMIT3(0x83, add_1reg(0xF8, IA32_EAX), 0);
+			jmp_offset = ctx->cleanup_addr - addrs[i];
+
+			switch (BPF_SIZE(code)) {
+			case BPF_W:
+				jmp_offset += 7;
+				break;
+			case BPF_H:
+				jmp_offset += 10;
+				break;
+			case BPF_B:
+				jmp_offset += 6;
+				break;
+			}
+
+			EMIT2_off32(0x0F, IA32_JE + 0x10, jmp_offset);
+			/* Load value from the address */
+			switch (BPF_SIZE(code)) {
+			case BPF_W:
+				/* mov eax,[eax] */
+				EMIT2(0x8B, 0x0);
+				/* emit 'bswap eax' */
+				EMIT2(0x0F, add_1reg(0xC8, IA32_EAX));
+				break;
+			case BPF_H:
+				EMIT3(0x0F, 0xB7, 0x0);
+				EMIT1(0x66);
+				EMIT3(0xC1, add_1reg(0xC8, IA32_EAX), 8);
+				break;
+			case BPF_B:
+				EMIT3(0x0F, 0xB6, 0x0);
+				break;
+			}
+
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r0[0]));
+			break;
+		}
+		/* STX XADD: lock *(u32 *)(dst + off) += src */
+		case BPF_STX | BPF_XADD | BPF_W:
+		/* STX XADD: lock *(u64 *)(dst + off) += src */
+		case BPF_STX | BPF_XADD | BPF_DW:
+			goto notyet;
+		case BPF_JMP | BPF_EXIT:
+			if (seen_exit) {
+				jmp_offset = ctx->cleanup_addr - addrs[i];
+				goto emit_jmp;
+			}
+			seen_exit = true;
+			/* update cleanup_addr */
+			ctx->cleanup_addr = proglen;
+			emit_epilogue(&prog, bpf_prog->aux->stack_depth);
+			break;
+notyet:
+			pr_info_once("*** NOT YET: opcode %02x ***\n", code);
+			return -EFAULT;
+		default:
+			/* This error will be seen if new instruction was added
+			 * to interpreter, but not to JIT
+			 * or if there is junk in bpf_prog
+			 */
+			pr_err("bpf_jit: unknown opcode %02x\n", code);
+			return -EINVAL;
+		}
+
+		ilen = prog - temp;
+		if (ilen > BPF_MAX_INSN_SIZE) {
+			pr_err("bpf_jit: fatal insn size error\n");
+			return -EFAULT;
+		}
+
+		if (image) {
+			if (unlikely(proglen + ilen > oldproglen)) {
+				pr_err("bpf_jit: fatal error\n");
+				return -EFAULT;
+			}
+			memcpy(image + proglen, temp, ilen);
+		}
+		proglen += ilen;
+		addrs[i] = proglen;
+		prog = temp;
+	}
+	return proglen;
+}
+
+struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+{
+	struct bpf_binary_header *header = NULL;
+	struct bpf_prog *tmp, *orig_prog = prog;
+	int proglen, oldproglen = 0;
+	struct jit_context ctx = {};
+	bool tmp_blinded = false;
+	u8 *image = NULL;
+	int *addrs;
+	int pass;
+	int i;
+
+	if (!prog->jit_requested)
+		return orig_prog;
+
+	tmp = bpf_jit_blind_constants(prog);
+	/* If blinding was requested and we failed during blinding,
+	 * we must fall back to the interpreter.
+	 */
+	if (IS_ERR(tmp))
+		return orig_prog;
+	if (tmp != prog) {
+		tmp_blinded = true;
+		prog = tmp;
+	}
+
+	addrs = kmalloc(prog->len * sizeof(*addrs), GFP_KERNEL);
+	if (!addrs) {
+		prog = orig_prog;
+		goto out;
+	}
+
+	/* Before first pass, make a rough estimation of addrs[]
+	 * each bpf instruction is translated to less than 64 bytes
+	 */
+	for (proglen = 0, i = 0; i < prog->len; i++) {
+		proglen += 64;
+		addrs[i] = proglen;
+	}
+	ctx.cleanup_addr = proglen;
+
+	/* JITed image shrinks with every pass and the loop iterates
+	 * until the image stops shrinking. Very large bpf programs
+	 * may converge on the last pass. In such a case, do one more
+	 * pass to emit the final image.
+	 */
+	for (pass = 0; pass < 20 || image; pass++) {
+		proglen = do_jit(prog, addrs, image, oldproglen, &ctx);
+		if (proglen <= 0) {
+			image = NULL;
+			if (header)
+				bpf_jit_binary_free(header);
+			prog = orig_prog;
+			goto out_addrs;
+		}
+		if (image) {
+			if (proglen != oldproglen) {
+				pr_err("bpf_jit: proglen=%d != oldproglen=%d\n",
+				       proglen, oldproglen);
+				prog = orig_prog;
+				goto out_addrs;
+			}
+			break;
+		}
+		if (proglen == oldproglen) {
+			header = bpf_jit_binary_alloc(proglen, &image,
+						      1, jit_fill_hole);
+			if (!header) {
+				prog = orig_prog;
+				goto out_addrs;
+			}
+		}
+		oldproglen = proglen;
+		cond_resched();
+	}
+
+	if (bpf_jit_enable > 1)
+		bpf_jit_dump(prog->len, proglen, pass + 1, image);
+
+	if (image) {
+		bpf_jit_binary_lock_ro(header);
+		prog->bpf_func = (void *)image;
+		prog->jited = 1;
+		prog->jited_len = proglen;
+	} else {
+		prog = orig_prog;
+	}
+
+out_addrs:
+	kfree(addrs);
+out:
+	if (tmp_blinded)
+		bpf_jit_prog_release_other(prog, prog == orig_prog ?
+					   tmp : orig_prog);
+	return prog;
+}
-- 
1.8.5.6.2.g3d8a54e.dirty
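
The convergence loop in bpf_int_jit_compile() above can be sketched in
isolation. This is a toy model under stated assumptions, not the kernel
code: toy_insn, toy_pass() and toy_converge() are hypothetical names, only
jumps change size (2 bytes when the offset fits in imm8, 5 bytes otherwise,
mirroring the 0xEB/0xE9 encodings in emit_jmp), and no image is allocated.
As in the patch, addrs[i] holds the end offset of instruction i and starts
from a pessimistic 64 bytes per instruction.

```c
#include <assert.h>

/* Hypothetical toy instruction: either a jump or a fixed-size body. */
struct toy_insn {
	int is_jump;	/* non-zero for an unconditional jump */
	int off;	/* relative target, used like insn->off */
	int size;	/* encoded size in bytes for non-jumps */
};

static int toy_fits_imm8(int v)
{
	return v >= -128 && v <= 127;
}

/* One pass: recompute addrs[] from the previous estimate, return length. */
static int toy_pass(const struct toy_insn *insn, int n, int *addrs)
{
	int proglen = 0;

	for (int i = 0; i < n; i++) {
		int ilen = insn[i].size;

		if (insn[i].is_jump) {
			int target = i + insn[i].off;

			assert(target >= 0 && target < n);
			/* short jump if the estimated offset fits in 8 bits */
			ilen = toy_fits_imm8(addrs[target] - addrs[i]) ? 2 : 5;
		}
		proglen += ilen;
		addrs[i] = proglen;
	}
	return proglen;
}

/* Iterate until the length stops changing; -1 if it never settles. */
static int toy_converge(const struct toy_insn *insn, int n, int *addrs,
			int max_passes, int *passes_used)
{
	int proglen = 0, oldproglen = 0;

	/* rough first estimate: 64 bytes per instruction */
	for (int i = 0; i < n; i++) {
		proglen += 64;
		addrs[i] = proglen;
	}

	for (int pass = 0; pass < max_passes; pass++) {
		proglen = toy_pass(insn, n, addrs);
		if (proglen == oldproglen) {
			*passes_used = pass + 1;
			return proglen;
		}
		oldproglen = proglen;
	}
	return -1;
}
```

In this model the first pass sees the inflated estimate and picks the long
jump form; later passes see real offsets, the jump shrinks, and the loop
stops once two consecutive passes report the same length. The real JIT then
runs one more pass to write the image.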

* pull-request: wireless-drivers 2018-04-26
From: Kalle Valo @ 2018-04-26 10:12 UTC (permalink / raw)
  To: David Miller; +Cc: linux-wireless, netdev, linux-kernel

Hi Dave,

here's a pull request to net tree, more info below. Please let me know
if you have any problems.

Kalle

The following changes since commit 4608f064532c28c0ea3c03fe26a3a5909852811a:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next (2018-04-03 14:08:58 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2018-04-26

for you to fetch changes up to af8a41cccf8f469165c6debc8fe07c5fd2ca501a:

  rtlwifi: cleanup 8723be ant_sel definition (2018-04-24 13:15:08 +0300)

----------------------------------------------------------------
wireless-drivers fixes for 4.17

A few fixes for 4.17 but nothing really special. The new ETSI WMM
parameter support for iwlwifi is not technically a bugfix but
important for regulatory compliance.

iwlwifi

* use new ETSI WMM parameters from regulatory database

* fix a regression with the older firmware API 31 (e.g. 31.560484.0)

brcmfmac

* fix a double free if nvram loading fails

rtlwifi

* yet another fix for ant_sel module parameter

----------------------------------------------------------------
Arend Van Spriel (1):
      brcmfmac: fix firmware request processing if nvram load fails

Haim Dreyfuss (1):
      iwlwifi: mvm: query regdb for wmm rule if needed

Luca Coelho (1):
      iwlwifi: mvm: fix old scan version sizes

Ping-Ke Shih (1):
      rtlwifi: cleanup 8723be ant_sel definition

 .../broadcom/brcm80211/brcmfmac/firmware.c         |  36 ++++---
 drivers/net/wireless/intel/iwlwifi/fw/api/scan.h   |  13 +--
 drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c | 111 ++++++++++++++++++---
 drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.h |   6 +-
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c  |   3 +-
 .../realtek/rtlwifi/btcoexist/halbtcoutsrc.c       |  15 ---
 .../net/wireless/realtek/rtlwifi/rtl8723be/hw.c    |  11 +-
 drivers/net/wireless/realtek/rtlwifi/wifi.h        |   5 +
 8 files changed, 138 insertions(+), 62 deletions(-)

* [PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform
From: Wang YanQing @ 2018-04-26  9:57 UTC (permalink / raw)
  To: daniel; +Cc: ast, netdev, linux-kernel

All the test cases for the BPF_PROG_TYPE_PERF_EVENT program type in
test_verifier (kselftest) report the errors below on x86_32:
"
172/p unpriv: spill/fill of different pointers ldx FAIL
Unexpected error message!
0: (bf) r6 = r10
1: (07) r6 += -8
2: (15) if r1 == 0x0 goto pc+3
R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
3: (bf) r2 = r10
4: (07) r2 += -76
5: (7b) *(u64 *)(r6 +0) = r2
6: (55) if r1 != 0x0 goto pc+1
R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 fp-8=fp
7: (7b) *(u64 *)(r6 +0) = r1
8: (79) r1 = *(u64 *)(r6 +0)
9: (79) r1 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (71) r0 = *(u8 *)(r1 +68)
invalid bpf_context access off=68 size=1

379/p check bpf_perf_event_data->sample_period half load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (69) r0 = *(u16 *)(r1 +68)
invalid bpf_context access off=68 size=2

380/p check bpf_perf_event_data->sample_period word load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (61) r0 = *(u32 *)(r1 +68)
invalid bpf_context access off=68 size=4

381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (79) r0 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8
"

This patch fixes it. The fix is not only necessary for x86_32; it also
fixes the same problem on any other platform where the size of
bpf_user_pt_regs_t is not a multiple of 8.
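
A minimal sketch of the layout problem, under stated assumptions: toy_regs
stands in for bpf_user_pt_regs_t on x86_32 and is assumed to be 68 bytes
(17 32-bit registers, consistent with off=68 in the log above), and toy_u64
is a hypothetical typedef that models the i386 rule that a 64-bit integer
inside a struct is only 4-byte aligned, so the offsets come out the same on
a 64-bit build host. It relies on the GCC/Clang aligned attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: model i386, where a u64 struct member is 4-byte aligned. */
typedef uint64_t toy_u64 __attribute__((aligned(4)));

/* Stand-in for bpf_user_pt_regs_t on x86_32: 17 * 4 = 68 bytes. */
struct toy_regs {
	uint32_t r[17];
};

/* Layout before the patch: sample_period directly follows regs. */
struct toy_data_before {
	struct toy_regs regs;
	toy_u64 sample_period;
	toy_u64 addr;
};

/* Layout after the patch: sample_period is forced to 8-byte alignment. */
struct toy_data_after {
	struct toy_regs regs;
	toy_u64 sample_period __attribute__((aligned(8)));
	toy_u64 addr;
};

/* The patched layout must place the field on an 8-byte boundary. */
static_assert(offsetof(struct toy_data_after, sample_period) % 8 == 0,
	      "sample_period must be 8-byte aligned");

static size_t toy_offset_before(void)
{
	return offsetof(struct toy_data_before, sample_period);
}

static size_t toy_offset_after(void)
{
	return offsetof(struct toy_data_after, sample_period);
}
```

Without the attribute, sample_period sits at offset 68, which is not a
multiple of 8. With it, the compiler inserts 4 bytes of padding after regs
and the field moves to offset 72.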

Signed-off-by: Wang YanQing <udknight@gmail.com>
---
 Hi all!
 After mainline accepts this patch, we need to submit a sync patch
 to update tools/include/uapi/linux/bpf_perf_event.h.

 Thanks.

 include/uapi/linux/bpf_perf_event.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf_perf_event.h b/include/uapi/linux/bpf_perf_event.h
index eb1b9d2..ff4c092 100644
--- a/include/uapi/linux/bpf_perf_event.h
+++ b/include/uapi/linux/bpf_perf_event.h
@@ -12,7 +12,7 @@
 
 struct bpf_perf_event_data {
 	bpf_user_pt_regs_t regs;
-	__u64 sample_period;
+	__u64 sample_period __attribute__((aligned(8)));
 	__u64 addr;
 };
 
-- 
1.8.5.6.2.g3d8a54e.dirty

* [PATCH][next] ath10k: fix spelling mistake: "servive" -> "service"
From: Colin King @ 2018-04-26  9:12 UTC (permalink / raw)
  To: Kalle Valo, ath10k, linux-wireless, netdev; +Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Trivial fix to spelling mistake in ath10k_warn warning message text

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/wireless/ath/ath10k/wmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index df2e92a6c9bd..622e026ae940 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -5280,7 +5280,7 @@ void ath10k_wmi_event_service_available(struct ath10k *ar, struct sk_buff *skb)
 
 	ret = ath10k_wmi_pull_svc_avail(ar, skb, &arg);
 	if (ret) {
-		ath10k_warn(ar, "failed to parse servive available event: %d\n",
+		ath10k_warn(ar, "failed to parse service available event: %d\n",
 			    ret);
 	}
 
-- 
2.17.0

* [PATCH net-next 6/6] mlxsw: spectrum_span: Allow bridge for gretap mirror
From: Ido Schimmel @ 2018-04-26  9:06 UTC (permalink / raw)
  To: netdev, bridge; +Cc: davem, stephen, jiri, nikolay, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180426090637.25262-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

When handling mirroring to a gretap or ip6gretap netdevice in mlxsw, the
underlay address (i.e. the remote address of the tunnel) may be routed
to a bridge.

In that case, look up the resolved neighbor Ethernet address in that
bridge's FDB. Then configure the offload to direct the mirrored traffic
to that port, possibly with tagging.
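
The lookup described above can be sketched as a small user-space model.
Everything here is hypothetical (toy_fdb_entry, toy_fdb_find() and
toy_span_bridge() are not kernel APIs); it only illustrates the decision
flow, assuming a flat FDB array where each entry already records whether
its port egresses the VLAN untagged and whether the port is forwarding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened FDB entry for illustration only. */
struct toy_fdb_entry {
	uint8_t mac[6];
	uint16_t vid;		/* 0 on a VLAN-unaware bridge */
	int port;		/* egress port number */
	bool untagged;		/* port sends this VLAN untagged */
	bool forwarding;	/* port STP state is forwarding */
};

static const struct toy_fdb_entry *
toy_fdb_find(const struct toy_fdb_entry *fdb, size_t n,
	     const uint8_t mac[6], uint16_t vid)
{
	for (size_t i = 0; i < n; i++)
		if (fdb[i].vid == vid && memcmp(fdb[i].mac, mac, 6) == 0)
			return &fdb[i];
	return NULL;
}

/*
 * Resolve the egress port for a mirrored packet whose next hop is a bridge.
 * Returns the port, or -1 if the mirror cannot be offloaded. *p_vid is
 * written only when the packet has to leave the port tagged.
 */
static int toy_span_bridge(const struct toy_fdb_entry *fdb, size_t n,
			   bool vlan_enabled, uint16_t pvid,
			   const uint8_t mac[6], uint16_t *p_vid)
{
	const struct toy_fdb_entry *e;

	assert(p_vid != NULL);

	/* A VLAN-aware bridge without a PVID cannot accept the packet. */
	if (vlan_enabled && !pvid)
		return -1;

	e = toy_fdb_find(fdb, n, mac, vlan_enabled ? pvid : 0);
	if (!e || !e->forwarding)
		return -1;

	if (vlan_enabled && !e->untagged)
		*p_vid = pvid;
	return e->port;
}
```

A caller would initialize the VLAN ID to 0 first, as the patch does with
"u16 vid = 0", so that a zero value after the call means "send untagged".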

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum_span.c    | 102 +++++++++++++++++++--
 .../net/ethernet/mellanox/mlxsw/spectrum_span.h    |   1 +
 2 files changed, 97 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
index 65a77708ff61..92fb81839852 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -32,6 +32,7 @@
  * POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include <linux/if_bridge.h>
 #include <linux/list.h>
 #include <net/arp.h>
 #include <net/gre.h>
@@ -39,8 +40,9 @@
 #include <net/ip6_tunnel.h>
 
 #include "spectrum.h"
-#include "spectrum_span.h"
 #include "spectrum_ipip.h"
+#include "spectrum_span.h"
+#include "spectrum_switchdev.h"
 
 int mlxsw_sp_span_init(struct mlxsw_sp *mlxsw_sp)
 {
@@ -167,6 +169,79 @@ mlxsw_sp_span_entry_unoffloadable(struct mlxsw_sp_span_parms *sparmsp)
 	return 0;
 }
 
+static struct net_device *
+mlxsw_sp_span_entry_bridge_8021q(const struct net_device *br_dev,
+				 unsigned char *dmac,
+				 u16 *p_vid)
+{
+	struct net_bridge_vlan_group *vg = br_vlan_group_rtnl(br_dev);
+	u16 pvid = br_vlan_group_pvid(vg);
+	struct net_device *edev = NULL;
+	struct net_bridge_vlan *v;
+
+	if (pvid)
+		edev = br_fdb_find_port_hold(br_dev, dmac, pvid);
+	if (!edev)
+		return NULL;
+
+	/* RTNL prevents edev from being removed. */
+	dev_put(edev);
+
+	vg = br_port_vlan_group_rtnl(edev);
+	v = br_vlan_find(vg, pvid);
+	if (!v)
+		return NULL;
+	if (!(br_vlan_flags(v) & BRIDGE_VLAN_INFO_UNTAGGED))
+		*p_vid = pvid;
+	return edev;
+}
+
+static struct net_device *
+mlxsw_sp_span_entry_bridge_8021d(const struct net_device *br_dev,
+				 unsigned char *dmac)
+{
+	struct net_device *edev = br_fdb_find_port_hold(br_dev, dmac, 0);
+
+	/* RTNL prevents edev from being removed. */
+	dev_put(edev);
+
+	return edev;
+}
+
+static struct net_device *
+mlxsw_sp_span_entry_bridge(const struct net_device *br_dev,
+			   unsigned char dmac[ETH_ALEN],
+			   u16 *p_vid)
+{
+	struct mlxsw_sp_bridge_port *bridge_port;
+	enum mlxsw_reg_spms_state spms_state;
+	struct mlxsw_sp_port *port;
+	struct net_device *dev;
+	u8 stp_state;
+
+	if (br_vlan_enabled(br_dev))
+		dev = mlxsw_sp_span_entry_bridge_8021q(br_dev, dmac, p_vid);
+	else
+		dev = mlxsw_sp_span_entry_bridge_8021d(br_dev, dmac);
+	if (!dev)
+		return NULL;
+
+	port = mlxsw_sp_port_dev_lower_find(dev);
+	if (!port)
+		return NULL;
+
+	bridge_port = mlxsw_sp_bridge_port_find(port->mlxsw_sp->bridge, dev);
+	if (!bridge_port)
+		return NULL;
+
+	stp_state = mlxsw_sp_bridge_port_stp_state(bridge_port);
+	spms_state = mlxsw_sp_stp_spms_state(stp_state);
+	if (spms_state != MLXSW_REG_SPMS_STATE_FORWARDING)
+		return NULL;
+
+	return dev;
+}
+
 static __maybe_unused int
 mlxsw_sp_span_entry_tunnel_parms_common(struct net_device *l3edev,
 					union mlxsw_sp_l3addr saddr,
@@ -177,13 +252,22 @@ mlxsw_sp_span_entry_tunnel_parms_common(struct net_device *l3edev,
 					struct mlxsw_sp_span_parms *sparmsp)
 {
 	unsigned char dmac[ETH_ALEN];
+	u16 vid = 0;
 
 	if (mlxsw_sp_l3addr_is_zero(gw))
 		gw = daddr;
 
-	if (!l3edev || !mlxsw_sp_port_dev_check(l3edev) ||
-	    mlxsw_sp_span_dmac(tbl, &gw, l3edev, dmac))
-		return mlxsw_sp_span_entry_unoffloadable(sparmsp);
+	if (!l3edev || mlxsw_sp_span_dmac(tbl, &gw, l3edev, dmac))
+		goto unoffloadable;
+
+	if (netif_is_bridge_master(l3edev)) {
+		l3edev = mlxsw_sp_span_entry_bridge(l3edev, dmac, &vid);
+		if (!l3edev)
+			goto unoffloadable;
+	}
+
+	if (!mlxsw_sp_port_dev_check(l3edev))
+		goto unoffloadable;
 
 	sparmsp->dest_port = netdev_priv(l3edev);
 	sparmsp->ttl = ttl;
@@ -191,7 +275,11 @@ mlxsw_sp_span_entry_tunnel_parms_common(struct net_device *l3edev,
 	memcpy(sparmsp->smac, l3edev->dev_addr, ETH_ALEN);
 	sparmsp->saddr = saddr;
 	sparmsp->daddr = daddr;
+	sparmsp->vid = vid;
 	return 0;
+
+unoffloadable:
+	return mlxsw_sp_span_entry_unoffloadable(sparmsp);
 }
 
 #if IS_ENABLED(CONFIG_NET_IPGRE)
@@ -268,9 +356,10 @@ mlxsw_sp_span_entry_gretap4_configure(struct mlxsw_sp_span_entry *span_entry,
 	/* Create a new port analayzer entry for local_port. */
 	mlxsw_reg_mpat_pack(mpat_pl, pa_id, local_port, true,
 			    MLXSW_REG_MPAT_SPAN_TYPE_REMOTE_ETH_L3);
+	mlxsw_reg_mpat_eth_rspan_pack(mpat_pl, sparms.vid);
 	mlxsw_reg_mpat_eth_rspan_l2_pack(mpat_pl,
 				    MLXSW_REG_MPAT_ETH_RSPAN_VERSION_NO_HEADER,
-				    sparms.dmac, false);
+				    sparms.dmac, !!sparms.vid);
 	mlxsw_reg_mpat_eth_rspan_l3_ipv4_pack(mpat_pl,
 					      sparms.ttl, sparms.smac,
 					      be32_to_cpu(sparms.saddr.addr4),
@@ -368,9 +457,10 @@ mlxsw_sp_span_entry_gretap6_configure(struct mlxsw_sp_span_entry *span_entry,
 	/* Create a new port analayzer entry for local_port. */
 	mlxsw_reg_mpat_pack(mpat_pl, pa_id, local_port, true,
 			    MLXSW_REG_MPAT_SPAN_TYPE_REMOTE_ETH_L3);
+	mlxsw_reg_mpat_eth_rspan_pack(mpat_pl, sparms.vid);
 	mlxsw_reg_mpat_eth_rspan_l2_pack(mpat_pl,
 				    MLXSW_REG_MPAT_ETH_RSPAN_VERSION_NO_HEADER,
-				    sparms.dmac, false);
+				    sparms.dmac, !!sparms.vid);
 	mlxsw_reg_mpat_eth_rspan_l3_ipv6_pack(mpat_pl, sparms.ttl, sparms.smac,
 					      sparms.saddr.addr6,
 					      sparms.daddr.addr6);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h
index 4b87ec20e658..14a6de904db1 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h
@@ -63,6 +63,7 @@ struct mlxsw_sp_span_parms {
 	unsigned char smac[ETH_ALEN];
 	union mlxsw_sp_l3addr daddr;
 	union mlxsw_sp_l3addr saddr;
+	u16 vid;
 };
 
 struct mlxsw_sp_span_entry_ops;
-- 
2.14.3

* [PATCH net-next 5/6] mlxsw: Respin SPAN on switchdev events
From: Ido Schimmel @ 2018-04-26  9:06 UTC (permalink / raw)
  To: netdev, bridge; +Cc: davem, stephen, jiri, nikolay, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180426090637.25262-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

Changes to switchdev artifacts can make a SPAN entry offloadable or
unoffloadable. To that end:

- Listen to SWITCHDEV_FDB_*_TO_BRIDGE notifications in addition to
  the *_TO_DEVICE ones, to catch whatever activity is sent to the
  bridge (likely by mlxsw itself).

  On each FDB notification, respin SPAN to reconcile it with the FDB
  changes.

- Also respin on switchdev port attribute changes (which currently
  covers changes to STP state of ports) and port object additions and
  removals.
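
As a rough illustration of the FDB part of the above (a userspace sketch
with made-up names such as toy_switch and toy_fdb_event_work, not the
driver code), the *_TO_BRIDGE events need no hardware programming of
their own, but every FDB event ends in a respin:

```c
/* Hypothetical event IDs and counters; not the switchdev definitions. */
enum toy_fdb_event {
	TOY_FDB_ADD_TO_DEVICE,
	TOY_FDB_DEL_TO_DEVICE,
	TOY_FDB_ADD_TO_BRIDGE,
	TOY_FDB_DEL_TO_BRIDGE,
};

struct toy_switch {
	int hw_fdb_updates; /* times the hardware FDB was programmed */
	int span_respins;   /* times mirroring was re-evaluated */
};

static void toy_span_respin(struct toy_switch *sw)
{
	sw->span_respins++;
}

/* Handle one FDB notification, then reconcile SPAN with the new state. */
static void toy_fdb_event_work(struct toy_switch *sw, enum toy_fdb_event event)
{
	switch (event) {
	case TOY_FDB_ADD_TO_DEVICE: /* fall through */
	case TOY_FDB_DEL_TO_DEVICE:
		sw->hw_fdb_updates++;
		break;
	case TOY_FDB_ADD_TO_BRIDGE: /* fall through */
	case TOY_FDB_DEL_TO_BRIDGE:
		/* Nothing to program; only the respin below matters. */
		break;
	}

	toy_span_respin(sw);
}
```

The respin is deliberately unconditional in this sketch: it is cheaper
to re-evaluate than to reason about which FDB change could matter.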

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   | 63 ++++++++++++++++++++--
 1 file changed, 59 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index db4aea0f8996..1af99fe5fd32 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -49,6 +49,7 @@
 #include <linux/netlink.h>
 #include <net/switchdev.h>
 
+#include "spectrum_span.h"
 #include "spectrum_router.h"
 #include "spectrum_switchdev.h"
 #include "spectrum.h"
@@ -923,6 +924,9 @@ static int mlxsw_sp_port_attr_set(struct net_device *dev,
 		break;
 	}
 
+	if (switchdev_trans_ph_commit(trans))
+		mlxsw_sp_span_respin(mlxsw_sp_port->mlxsw_sp);
+
 	return err;
 }
 
@@ -1647,18 +1651,57 @@ mlxsw_sp_port_mrouter_update_mdb(struct mlxsw_sp_port *mlxsw_sp_port,
 	}
 }
 
+struct mlxsw_sp_span_respin_work {
+	struct work_struct work;
+	struct mlxsw_sp *mlxsw_sp;
+};
+
+static void mlxsw_sp_span_respin_work(struct work_struct *work)
+{
+	struct mlxsw_sp_span_respin_work *respin_work =
+		container_of(work, struct mlxsw_sp_span_respin_work, work);
+
+	rtnl_lock();
+	mlxsw_sp_span_respin(respin_work->mlxsw_sp);
+	rtnl_unlock();
+	kfree(respin_work);
+}
+
+static void mlxsw_sp_span_respin_schedule(struct mlxsw_sp *mlxsw_sp)
+{
+	struct mlxsw_sp_span_respin_work *respin_work;
+
+	respin_work = kzalloc(sizeof(*respin_work), GFP_ATOMIC);
+	if (!respin_work)
+		return;
+
+	INIT_WORK(&respin_work->work, mlxsw_sp_span_respin_work);
+	respin_work->mlxsw_sp = mlxsw_sp;
+
+	mlxsw_core_schedule_work(&respin_work->work);
+}
+
 static int mlxsw_sp_port_obj_add(struct net_device *dev,
 				 const struct switchdev_obj *obj,
 				 struct switchdev_trans *trans)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
+	const struct switchdev_obj_port_vlan *vlan;
 	int err = 0;
 
 	switch (obj->id) {
 	case SWITCHDEV_OBJ_ID_PORT_VLAN:
-		err = mlxsw_sp_port_vlans_add(mlxsw_sp_port,
-					      SWITCHDEV_OBJ_PORT_VLAN(obj),
-					      trans);
+		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
+		err = mlxsw_sp_port_vlans_add(mlxsw_sp_port, vlan, trans);
+
+		if (switchdev_trans_ph_commit(trans)) {
+			/* The event is emitted before the changes are actually
+			 * applied to the bridge. Therefore schedule the respin
+			 * call for later, so that the respin logic sees the
+			 * updated bridge state.
+			 */
+			mlxsw_sp_span_respin_schedule(mlxsw_sp_port->mlxsw_sp);
+		}
 		break;
 	case SWITCHDEV_OBJ_ID_PORT_MDB:
 		err = mlxsw_sp_port_mdb_add(mlxsw_sp_port,
@@ -1809,6 +1852,8 @@ static int mlxsw_sp_port_obj_del(struct net_device *dev,
 		break;
 	}
 
+	mlxsw_sp_span_respin(mlxsw_sp_port->mlxsw_sp);
+
 	return err;
 }
 
@@ -2236,8 +2281,16 @@ static void mlxsw_sp_switchdev_event_work(struct work_struct *work)
 		fdb_info = &switchdev_work->fdb_info;
 		mlxsw_sp_port_fdb_set(mlxsw_sp_port, fdb_info, false);
 		break;
+	case SWITCHDEV_FDB_ADD_TO_BRIDGE: /* fall through */
+	case SWITCHDEV_FDB_DEL_TO_BRIDGE:
+		/* These events are only used to potentially update an existing
+		 * SPAN mirror.
+		 */
+		break;
 	}
 
+	mlxsw_sp_span_respin(mlxsw_sp_port->mlxsw_sp);
+
 out:
 	rtnl_unlock();
 	kfree(switchdev_work->fdb_info.addr);
@@ -2266,7 +2319,9 @@ static int mlxsw_sp_switchdev_event(struct notifier_block *unused,
 
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_DEVICE: /* fall through */
-	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE: /* fall through */
+	case SWITCHDEV_FDB_ADD_TO_BRIDGE: /* fall through */
+	case SWITCHDEV_FDB_DEL_TO_BRIDGE:
 		memcpy(&switchdev_work->fdb_info, ptr,
 		       sizeof(switchdev_work->fdb_info));
 		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
-- 
2.14.3

* [PATCH net-next 4/6] mlxsw: spectrum: Register SPAN before switchdev
From: Ido Schimmel @ 2018-04-26  9:06 UTC (permalink / raw)
  To: netdev, bridge; +Cc: davem, stephen, jiri, nikolay, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180426090637.25262-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

Since switchdev events can trigger a SPAN respin, the SPAN data
structures must already be available when switchdev is initialized.
Register SPAN first, and add a comment describing the dependencies.
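
The ordering rule can be pictured with a small userspace sketch. The
step names and the toy_trace bookkeeping are assumptions made for
illustration, not the mlxsw init code: components come up in dependency
order, and a failure unwinds whatever is already up in reverse order.

```c
#include <stddef.h>

/* Hypothetical stand-ins for driver components; TOY_NSTEPS means "no failure". */
enum toy_step { TOY_LAG, TOY_SPAN, TOY_SWITCHDEV, TOY_ROUTER, TOY_NSTEPS };

struct toy_trace {
	enum toy_step fini_order[TOY_NSTEPS]; /* teardown calls, in call order */
	size_t nfini;
};

static int toy_step_init(enum toy_step s, enum toy_step fail_at)
{
	return s == fail_at ? -1 : 0;
}

static void toy_step_fini(struct toy_trace *t, enum toy_step s)
{
	t->fini_order[t->nfini++] = s;
}

/* SPAN is initialized before switchdev and router, and torn down after them. */
static int toy_init(struct toy_trace *t, enum toy_step fail_at)
{
	int err;

	err = toy_step_init(TOY_LAG, fail_at);
	if (err)
		return err;
	err = toy_step_init(TOY_SPAN, fail_at);
	if (err)
		goto err_span_init;
	err = toy_step_init(TOY_SWITCHDEV, fail_at);
	if (err)
		goto err_switchdev_init;
	err = toy_step_init(TOY_ROUTER, fail_at);
	if (err)
		goto err_router_init;
	return 0;

err_router_init:
	toy_step_fini(t, TOY_SWITCHDEV);
err_switchdev_init:
	toy_step_fini(t, TOY_SPAN);
err_span_init:
	toy_step_fini(t, TOY_LAG);
	return err;
}
```

Moving an init call therefore means moving its error label and its
fini call as well, which is why the patch touches three places.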

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 7317fb8079d1..94132f6cec61 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -3666,6 +3666,15 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
 		goto err_lag_init;
 	}
 
+	/* Initialize SPAN before router and switchdev, so that those components
+	 * can call mlxsw_sp_span_respin().
+	 */
+	err = mlxsw_sp_span_init(mlxsw_sp);
+	if (err) {
+		dev_err(mlxsw_sp->bus_info->dev, "Failed to init span system\n");
+		goto err_span_init;
+	}
+
 	err = mlxsw_sp_switchdev_init(mlxsw_sp);
 	if (err) {
 		dev_err(mlxsw_sp->bus_info->dev, "Failed to initialize switchdev\n");
@@ -3684,15 +3693,6 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
 		goto err_afa_init;
 	}
 
-	err = mlxsw_sp_span_init(mlxsw_sp);
-	if (err) {
-		dev_err(mlxsw_sp->bus_info->dev, "Failed to init span system\n");
-		goto err_span_init;
-	}
-
-	/* Initialize router after SPAN is initialized, so that the FIB and
-	 * neighbor event handlers can issue SPAN respin.
-	 */
 	err = mlxsw_sp_router_init(mlxsw_sp);
 	if (err) {
 		dev_err(mlxsw_sp->bus_info->dev, "Failed to initialize router\n");
@@ -3739,14 +3739,14 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
 err_netdev_notifier:
 	mlxsw_sp_router_fini(mlxsw_sp);
 err_router_init:
-	mlxsw_sp_span_fini(mlxsw_sp);
-err_span_init:
 	mlxsw_sp_afa_fini(mlxsw_sp);
 err_afa_init:
 	mlxsw_sp_counter_pool_fini(mlxsw_sp);
 err_counter_pool_init:
 	mlxsw_sp_switchdev_fini(mlxsw_sp);
 err_switchdev_init:
+	mlxsw_sp_span_fini(mlxsw_sp);
+err_span_init:
 	mlxsw_sp_lag_fini(mlxsw_sp);
 err_lag_init:
 	mlxsw_sp_buffers_fini(mlxsw_sp);
@@ -3768,10 +3768,10 @@ static void mlxsw_sp_fini(struct mlxsw_core *mlxsw_core)
 	mlxsw_sp_acl_fini(mlxsw_sp);
 	unregister_netdevice_notifier(&mlxsw_sp->netdevice_nb);
 	mlxsw_sp_router_fini(mlxsw_sp);
-	mlxsw_sp_span_fini(mlxsw_sp);
 	mlxsw_sp_afa_fini(mlxsw_sp);
 	mlxsw_sp_counter_pool_fini(mlxsw_sp);
 	mlxsw_sp_switchdev_fini(mlxsw_sp);
+	mlxsw_sp_span_fini(mlxsw_sp);
 	mlxsw_sp_lag_fini(mlxsw_sp);
 	mlxsw_sp_buffers_fini(mlxsw_sp);
 	mlxsw_sp_traps_fini(mlxsw_sp);
-- 
2.14.3

* [PATCH net-next 3/6] mlxsw: spectrum_switchdev: Publish two functions
From: Ido Schimmel @ 2018-04-26  9:06 UTC (permalink / raw)
  To: netdev, bridge; +Cc: davem, stephen, jiri, nikolay, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180426090637.25262-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

Publish the existing function mlxsw_sp_bridge_port_find(), and add
another service accessor mlxsw_sp_bridge_port_stp_state(). Publish both
in a new file spectrum_switchdev.h.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   |  9 ++++-
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.h   | 43 ++++++++++++++++++++++
 2 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.h

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index c11c9a635866..db4aea0f8996 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -50,6 +50,7 @@
 #include <net/switchdev.h>
 
 #include "spectrum_router.h"
+#include "spectrum_switchdev.h"
 #include "spectrum.h"
 #include "core.h"
 #include "reg.h"
@@ -239,7 +240,7 @@ __mlxsw_sp_bridge_port_find(const struct mlxsw_sp_bridge_device *bridge_device,
 	return NULL;
 }
 
-static struct mlxsw_sp_bridge_port *
+struct mlxsw_sp_bridge_port *
 mlxsw_sp_bridge_port_find(struct mlxsw_sp_bridge *bridge,
 			  struct net_device *brport_dev)
 {
@@ -2297,6 +2298,12 @@ static struct notifier_block mlxsw_sp_switchdev_notifier = {
 	.notifier_call = mlxsw_sp_switchdev_event,
 };
 
+u8
+mlxsw_sp_bridge_port_stp_state(struct mlxsw_sp_bridge_port *bridge_port)
+{
+	return bridge_port->stp_state;
+}
+
 static int mlxsw_sp_fdb_init(struct mlxsw_sp *mlxsw_sp)
 {
 	struct mlxsw_sp_bridge *bridge = mlxsw_sp->bridge;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.h
new file mode 100644
index 000000000000..bc44d5effc28
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
+ * drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.h
+ * Copyright (c) 2018 Mellanox Technologies. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the names of the copyright holders nor the names of its
+ *    contributors may be used to endorse or promote products derived from
+ *    this software without specific prior written permission.
+ *
+ * Alternatively, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") version 2 as published by the Free
+ * Software Foundation.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/netdevice.h>
+
+struct mlxsw_sp_bridge;
+struct mlxsw_sp_bridge_port;
+
+struct mlxsw_sp_bridge_port *
+mlxsw_sp_bridge_port_find(struct mlxsw_sp_bridge *bridge,
+			  struct net_device *brport_dev);
+
+u8 mlxsw_sp_bridge_port_stp_state(struct mlxsw_sp_bridge_port *bridge_port);
-- 
2.14.3

* [PATCH net-next 2/6] mlxsw: spectrum: Extract mlxsw_sp_stp_spms_state()
From: Ido Schimmel @ 2018-04-26  9:06 UTC (permalink / raw)
  To: netdev, bridge; +Cc: davem, stephen, jiri, nikolay, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180426090637.25262-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

Instead of duplicating the decision regarding port forwarding state made
by mlxsw_sp_port_vid_stp_set(), extract the decision-making into a new
function and reuse it.
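
The shape of the refactoring, as a userspace sketch: the enums and
function names below are invented for illustration and are not the
bridge or register definitions. One pure mapping function serves both
a caller that programs a state and a caller that only asks whether a
port forwards.

```c
#include <stdbool.h>

/* Hypothetical enums standing in for bridge STP and hardware port states. */
enum toy_stp_state {
	TOY_STP_DISABLED,
	TOY_STP_LISTENING,
	TOY_STP_LEARNING,
	TOY_STP_FORWARDING,
	TOY_STP_BLOCKING,
};

enum toy_hw_state { TOY_HW_DISCARDING, TOY_HW_LEARNING, TOY_HW_FORWARDING };

/* The extracted decision: a pure function with no side effects. */
static enum toy_hw_state toy_stp_to_hw(enum toy_stp_state state)
{
	switch (state) {
	case TOY_STP_FORWARDING:
		return TOY_HW_FORWARDING;
	case TOY_STP_LEARNING:
		return TOY_HW_LEARNING;
	case TOY_STP_LISTENING: /* fall through */
	case TOY_STP_DISABLED: /* fall through */
	case TOY_STP_BLOCKING:
		return TOY_HW_DISCARDING;
	}
	/* Unknown value: this sketch discards, whereas the patch calls BUG(). */
	return TOY_HW_DISCARDING;
}

/* Caller 1: program the state (here simply stored in a variable). */
static void toy_port_stp_set(enum toy_hw_state *hw_reg, enum toy_stp_state state)
{
	*hw_reg = toy_stp_to_hw(state);
}

/* Caller 2: query only, no programming. */
static bool toy_port_is_forwarding(enum toy_stp_state state)
{
	return toy_stp_to_hw(state) == TOY_HW_FORWARDING;
}
```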

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 26 +++++++++++++-------------
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h |  1 +
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index ca38a30fbe91..7317fb8079d1 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -441,29 +441,29 @@ static void mlxsw_sp_txhdr_construct(struct sk_buff *skb,
 	mlxsw_tx_hdr_type_set(txhdr, MLXSW_TXHDR_TYPE_CONTROL);
 }
 
-int mlxsw_sp_port_vid_stp_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid,
-			      u8 state)
+enum mlxsw_reg_spms_state mlxsw_sp_stp_spms_state(u8 state)
 {
-	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
-	enum mlxsw_reg_spms_state spms_state;
-	char *spms_pl;
-	int err;
-
 	switch (state) {
 	case BR_STATE_FORWARDING:
-		spms_state = MLXSW_REG_SPMS_STATE_FORWARDING;
-		break;
+		return MLXSW_REG_SPMS_STATE_FORWARDING;
 	case BR_STATE_LEARNING:
-		spms_state = MLXSW_REG_SPMS_STATE_LEARNING;
-		break;
+		return MLXSW_REG_SPMS_STATE_LEARNING;
 	case BR_STATE_LISTENING: /* fall-through */
 	case BR_STATE_DISABLED: /* fall-through */
 	case BR_STATE_BLOCKING:
-		spms_state = MLXSW_REG_SPMS_STATE_DISCARDING;
-		break;
+		return MLXSW_REG_SPMS_STATE_DISCARDING;
 	default:
 		BUG();
 	}
+}
+
+int mlxsw_sp_port_vid_stp_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid,
+			      u8 state)
+{
+	enum mlxsw_reg_spms_state spms_state = mlxsw_sp_stp_spms_state(state);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+	char *spms_pl;
+	int err;
 
 	spms_pl = kmalloc(MLXSW_REG_SPMS_LEN, GFP_KERNEL);
 	if (!spms_pl)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 804d4d2c8031..4a519d8edec8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -364,6 +364,7 @@ int __mlxsw_sp_port_headroom_set(struct mlxsw_sp_port *mlxsw_sp_port, int mtu,
 int mlxsw_sp_port_ets_maxrate_set(struct mlxsw_sp_port *mlxsw_sp_port,
 				  enum mlxsw_reg_qeec_hr hr, u8 index,
 				  u8 next_index, u32 maxrate);
+enum mlxsw_reg_spms_state mlxsw_sp_stp_spms_state(u8 stp_state);
 int mlxsw_sp_port_vid_stp_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid,
 			      u8 state);
 int mlxsw_sp_port_vp_mode_set(struct mlxsw_sp_port *mlxsw_sp_port, bool enable);
-- 
2.14.3

* [PATCH net-next 1/6] net: bridge: Publish bridge accessor functions
From: Ido Schimmel @ 2018-04-26  9:06 UTC (permalink / raw)
  To: netdev, bridge; +Cc: davem, stephen, jiri, nikolay, petrm, mlxsw, Ido Schimmel
In-Reply-To: <20180426090637.25262-1-idosch@mellanox.com>

From: Petr Machata <petrm@mellanox.com>

To allow querying FDB and vlan settings of a bridge, publish several
existing functions and add some new ones.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 include/linux/if_bridge.h | 55 +++++++++++++++++++++++++++++++++++++++++++++++
 net/bridge/br_fdb.c       | 25 +++++++++++++++++++++
 net/bridge/br_private.h   | 17 +++++++++------
 net/bridge/br_vlan.c      | 32 +++++++++++++++++++++++++++
 4 files changed, 123 insertions(+), 6 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 02639ebea2f0..2020f61505b9 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -93,11 +93,66 @@ static inline bool br_multicast_router(const struct net_device *dev)
 
 #if IS_ENABLED(CONFIG_BRIDGE) && IS_ENABLED(CONFIG_BRIDGE_VLAN_FILTERING)
 bool br_vlan_enabled(const struct net_device *dev);
+
+struct net_bridge_vlan_group *
+br_vlan_group_rtnl(const struct net_device *br_dev);
+
+struct net_bridge_vlan_group *
+br_port_vlan_group_rtnl(const struct net_device *dev);
+
+u16 br_vlan_group_pvid(const struct net_bridge_vlan_group *vg);
+
+struct net_bridge_vlan *br_vlan_find(struct net_bridge_vlan_group *vg, u16 vid);
+
+u16 br_vlan_flags(const struct net_bridge_vlan *v);
+
 #else
 static inline bool br_vlan_enabled(const struct net_device *dev)
 {
 	return false;
 }
+
+static inline struct net_bridge_vlan_group *
+br_vlan_group_rtnl(const struct net_device *br_dev)
+{
+	return NULL;
+}
+
+static inline struct net_bridge_vlan_group *
+br_port_vlan_group_rtnl(const struct net_device *dev)
+{
+	return NULL;
+}
+
+static inline u16 br_vlan_group_pvid(const struct net_bridge_vlan_group *vg)
+{
+	return 0;
+}
+
+static inline struct net_bridge_vlan *
+br_vlan_find(struct net_bridge_vlan_group *vg, u16 vid)
+{
+	return NULL;
+}
+
+static inline u16 br_vlan_flags(const struct net_bridge_vlan *v)
+{
+	return 0;
+}
+#endif
+
+#if IS_ENABLED(CONFIG_BRIDGE)
+struct net_device *br_fdb_find_port_hold(const struct net_device *br_dev,
+					 const unsigned char *addr,
+					 __u16 vid);
+#else
+static inline struct net_device *
+br_fdb_find_port_hold(const struct net_device *br_dev,
+		      const unsigned char *addr,
+		      __u16 vid)
+{
+	return NULL;
+}
 #endif
 
 #endif
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index d9e69e4514be..cbdcf0e95224 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -121,6 +121,31 @@ static struct net_bridge_fdb_entry *br_fdb_find(struct net_bridge *br,
 	return fdb;
 }
 
+struct net_device *br_fdb_find_port_hold(const struct net_device *br_dev,
+					 const unsigned char *addr,
+					 __u16 vid)
+{
+	struct net_bridge_fdb_entry *f;
+	struct net_device *dev = NULL;
+	struct net_bridge *br;
+
+	if (!netif_is_bridge_master(br_dev))
+		return NULL;
+
+	br = netdev_priv(br_dev);
+
+	spin_lock_bh(&br->hash_lock);
+	f = br_fdb_find(br, addr, vid);
+	if (f && f->dst) {
+		dev = f->dst->dev;
+		dev_hold(dev);
+	}
+	spin_unlock_bh(&br->hash_lock);
+
+	return dev;
+}
+EXPORT_SYMBOL_GPL(br_fdb_find_port_hold);
+
 struct net_bridge_fdb_entry *br_fdb_find_rcu(struct net_bridge *br,
 					     const unsigned char *addr,
 					     __u16 vid)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index a7cb3ece5031..3c929d587171 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -594,11 +594,22 @@ static inline bool br_rx_handler_check_rcu(const struct net_device *dev)
 	return rcu_dereference(dev->rx_handler) == br_handle_frame;
 }
 
+static inline bool br_rx_handler_check_rtnl(const struct net_device *dev)
+{
+	return rcu_dereference_rtnl(dev->rx_handler) == br_handle_frame;
+}
+
 static inline struct net_bridge_port *br_port_get_check_rcu(const struct net_device *dev)
 {
 	return br_rx_handler_check_rcu(dev) ? br_port_get_rcu(dev) : NULL;
 }
 
+static inline struct net_bridge_port *
+br_port_get_check_rtnl(const struct net_device *dev)
+{
+	return br_rx_handler_check_rtnl(dev) ? br_port_get_rtnl_rcu(dev) : NULL;
+}
+
 /* br_ioctl.c */
 int br_dev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
 int br_ioctl_deviceless_stub(struct net *net, unsigned int cmd,
@@ -955,12 +966,6 @@ static inline void nbp_vlan_flush(struct net_bridge_port *port)
 {
 }
 
-static inline struct net_bridge_vlan *br_vlan_find(struct net_bridge_vlan_group *vg,
-						   u16 vid)
-{
-	return NULL;
-}
-
 static inline int nbp_vlan_init(struct net_bridge_port *port)
 {
 	return 0;
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 9896f4975353..1c118c190653 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -671,6 +671,7 @@ struct net_bridge_vlan *br_vlan_find(struct net_bridge_vlan_group *vg, u16 vid)
 
 	return br_vlan_lookup(&vg->vlan_hash, vid);
 }
+EXPORT_SYMBOL_GPL(br_vlan_find);
 
 /* Must be protected by RTNL. */
 static void recalculate_group_addr(struct net_bridge *br)
@@ -1149,3 +1150,34 @@ void br_vlan_get_stats(const struct net_bridge_vlan *v,
 		stats->tx_packets += txpackets;
 	}
 }
+
+struct net_bridge_vlan_group *
+br_vlan_group_rtnl(const struct net_device *br_dev)
+{
+	if (netif_is_bridge_master(br_dev))
+		return br_vlan_group(netdev_priv(br_dev));
+	else
+		return NULL;
+}
+EXPORT_SYMBOL_GPL(br_vlan_group_rtnl);
+
+struct net_bridge_vlan_group *
+br_port_vlan_group_rtnl(const struct net_device *dev)
+{
+	struct net_bridge_port *p = br_port_get_check_rtnl(dev);
+
+	return p ? nbp_vlan_group(p) : NULL;
+}
+EXPORT_SYMBOL_GPL(br_port_vlan_group_rtnl);
+
+u16 br_vlan_group_pvid(const struct net_bridge_vlan_group *vg)
+{
+	return br_get_pvid(vg);
+}
+EXPORT_SYMBOL_GPL(br_vlan_group_pvid);
+
+u16 br_vlan_flags(const struct net_bridge_vlan *v)
+{
+	return v->flags;
+}
+EXPORT_SYMBOL_GPL(br_vlan_flags);
-- 
2.14.3

* [PATCH net-next 0/6] mlxsw: SPAN: Support routes pointing at bridges
From: Ido Schimmel @ 2018-04-26  9:06 UTC (permalink / raw)
  To: netdev, bridge; +Cc: davem, stephen, jiri, nikolay, petrm, mlxsw, Ido Schimmel

Petr says:

When mirroring to a gretap or ip6gretap netdevice, the route that
directs the encapsulated packets can reference a bridge. In that case,
in the software model, the packet is switched.

Thus when offloading mirroring like that, take into consideration FDB,
STP, PVID configured at the bridge, and whether that VLAN ID should be
tagged on egress.

Patch #1 introduces a suite of functions to query various bridge bits of
configuration: FDB, VLAN groups, etc.

Patches #2 and #3 refactor some existing code and introduce a new
accessor function.

With patches #4 and #5 mlxsw calls mlxsw_sp_span_respin() on switchdev
events as well. There is no impact yet, because bridge as an underlay
device is still not allowed.

That is implemented in patch #6, which uses the new interfaces to figure
out on which one port the mirroring should be configured, and whether
the mirrored packets should be VLAN-tagged and how.
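
To make the decision concrete, here is a toy userspace model of the
resolution step. Every type and name in it (toy_bridge, toy_resolve,
and so on) is an assumption made for illustration and none of it is
the kernel or mlxsw API. It picks the VLAN from the bridge PVID, looks
up the destination MAC in the FDB, rejects ports that are not
forwarding, and reports whether the mirrored packet carries a tag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6

struct toy_port {
	int id;
	bool stp_forwarding;  /* port is in the STP forwarding state */
	bool egress_untagged; /* the VLAN leaves this port untagged */
};

struct toy_fdb_entry {
	uint8_t mac[TOY_ETH_ALEN];
	uint16_t vid;         /* 0 for a VLAN-unaware bridge */
	struct toy_port *port;
};

struct toy_bridge {
	bool vlan_enabled;
	uint16_t pvid;        /* 0 means no PVID is configured */
	const struct toy_fdb_entry *fdb;
	size_t fdb_len;
};

/* Return the one port to mirror to, or NULL if the mirror is unoffloadable.
 * On success *p_vid is the tag to push, or 0 for untagged.
 */
static struct toy_port *toy_resolve(const struct toy_bridge *br,
				    const uint8_t dmac[TOY_ETH_ALEN],
				    uint16_t *p_vid)
{
	uint16_t vid = 0;
	size_t i;

	if (br->vlan_enabled) {
		vid = br->pvid;
		if (!vid)
			return NULL;
	}

	for (i = 0; i < br->fdb_len; i++) {
		const struct toy_fdb_entry *e = &br->fdb[i];

		if (e->vid != vid || memcmp(e->mac, dmac, TOY_ETH_ALEN))
			continue;
		if (!e->port->stp_forwarding)
			return NULL;
		*p_vid = e->port->egress_untagged ? 0 : vid;
		return e->port;
	}

	return NULL; /* No FDB entry: the software path would flood. */
}
```

Any of the inputs can change at run time (FDB, STP state, VLAN
membership), which is why the result has to be recomputed on the
corresponding events rather than cached.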

Petr Machata (6):
  net: bridge: Publish bridge accessor functions
  mlxsw: spectrum: Extract mlxsw_sp_stp_spms_state()
  mlxsw: spectrum_switchdev: Publish two functions
  mlxsw: spectrum: Register SPAN before switchdev
  mlxsw: Respin SPAN on switchdev events
  mlxsw: spectrum_span: Allow bridge for gretap mirror

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     |  50 +++++-----
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |   1 +
 .../net/ethernet/mellanox/mlxsw/spectrum_span.c    | 102 +++++++++++++++++++--
 .../net/ethernet/mellanox/mlxsw/spectrum_span.h    |   1 +
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   |  72 ++++++++++++++-
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.h   |  43 +++++++++
 include/linux/if_bridge.h                          |  55 +++++++++++
 net/bridge/br_fdb.c                                |  25 +++++
 net/bridge/br_private.h                            |  17 ++--
 net/bridge/br_vlan.c                               |  32 +++++++
 10 files changed, 356 insertions(+), 42 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.h

-- 
2.14.3

* [PATCH net] mlxsw: spectrum_switchdev: Do not remove mrouter port from MDB's ports list
From: Ido Schimmel @ 2018-04-26  8:46 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, nogahf, colin.king, mlxsw, Ido Schimmel

When IGMP snooping is enabled on a bridge, traffic forwarded by an MDB
entry should be sent both to the ports that are members of the MDB's
ports list and to mrouter ports.

In case a port needs to be removed from an MDB's ports list, but this
port is also configured as an mrouter port, then do not update the
device so that it will continue to forward traffic through that port.

Fix a copy-paste error that checked that IGMP snooping is enabled twice
instead of checking the port's mrouter state.
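
Reduced to its essentials, the corrected condition looks like the
following. This is a standalone sketch with an invented struct, not
the driver's data structures.

```c
#include <stdbool.h>

/* Hypothetical model of the state consulted when a port leaves an MDB entry. */
struct toy_bridge_port {
	bool multicast_enabled; /* IGMP snooping is enabled on the bridge */
	bool mrouter;           /* port is configured as an mrouter port */
};

/* True when the device should stop forwarding the group through the port.
 * An mrouter port keeps receiving the traffic even after leaving the list.
 */
static bool toy_should_remove_from_device(const struct toy_bridge_port *bp)
{
	return bp->multicast_enabled && !bp->mrouter;
}
```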

Fixes: ded711c87a04 ("mlxsw: spectrum_switchdev: Consider mrouter status for mdb changes")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Colin King <colin.king@canonical.com>
Reviewed-by: Nogah Frankel <nogahf@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index c11c9a635866..4ed01182a82c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -1718,13 +1718,11 @@ __mlxsw_sp_port_mdb_del(struct mlxsw_sp_port *mlxsw_sp_port,
 	struct net_device *dev = mlxsw_sp_port->dev;
 	int err;
 
-	if (bridge_port->bridge_device->multicast_enabled) {
-		if (bridge_port->bridge_device->multicast_enabled) {
-			err = mlxsw_sp_port_smid_set(mlxsw_sp_port, mid->mid,
-						     false);
-			if (err)
-				netdev_err(dev, "Unable to remove port from SMID\n");
-		}
+	if (bridge_port->bridge_device->multicast_enabled &&
+	    !bridge_port->mrouter) {
+		err = mlxsw_sp_port_smid_set(mlxsw_sp_port, mid->mid, false);
+		if (err)
+			netdev_err(dev, "Unable to remove port from SMID\n");
 	}
 
 	err = mlxsw_sp_port_remove_from_mid(mlxsw_sp_port, mid);
-- 
2.14.3

* Re: [PATCH 2/2] net: stmmac: dwmac-meson: extend phy mode setting
From: Jerome Brunet @ 2018-04-26  8:47 UTC (permalink / raw)
  To: Yixun Lan, David S. Miller, netdev
  Cc: Kevin Hilman, Carlo Caione, Rob Herring, Martin Blumenstingl,
	linux-amlogic, linux-arm-kernel, linux-kernel
In-Reply-To: <20180426160508.29380-3-yixun.lan@amlogic.com>

On Thu, 2018-04-26 at 16:05 +0000, Yixun Lan wrote:
>   In the Meson-AXG SoC, the phy mode setting of PRG_ETH0 in the glue layer
> is extended from bit[0] to bit[2:0].
>   There is no problem if we configure it to the RGMII 1000M PHY mode,
> since the register setting is coincidentally compatible with previous one,
> but for the RMII 100M PHY mode, the configuration need to be changed to
> value - b100.
>   This patch was verified with a RTL8201F 100M ethernet PHY.
> 
> Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
> ---
>  .../ethernet/stmicro/stmmac/dwmac-meson8b.c   | 95 ++++++++++++++++---
>  1 file changed, 84 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
> index 7cb794094a70..e3688b6dd87c 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c
> @@ -18,6 +18,7 @@
>  #include <linux/io.h>
>  #include <linux/ioport.h>
>  #include <linux/module.h>
> +#include <linux/of_device.h>
>  #include <linux/of_net.h>
>  #include <linux/mfd/syscon.h>
>  #include <linux/platform_device.h>
> @@ -29,6 +30,10 @@
>  
>  #define PRG_ETH0_RGMII_MODE		BIT(0)
>  
> +#define PRG_ETH0_EXT_PHY_MODE_MASK	GENMASK(2, 0)
> +#define PRG_ETH0_EXT_RGMII_MODE		1
> +#define PRG_ETH0_EXT_RMII_MODE		4
> +
>  /* mux to choose between fclk_div2 (bit unset) and mpll2 (bit set) */
>  #define PRG_ETH0_CLK_M250_SEL_SHIFT	4
>  #define PRG_ETH0_CLK_M250_SEL_MASK	GENMASK(4, 4)
> @@ -46,10 +51,16 @@
>  #define PRG_ETH0_TX_AND_PHY_REF_CLK	BIT(12)
>  
>  #define MUX_CLK_NUM_PARENTS		2
> +struct meson8b_dwmac_data {
> +	bool ext_phy_mode;
> +};
>  
>  struct meson8b_dwmac {
>  	struct device		*dev;
>  	void __iomem		*regs;
> +
> +	const struct meson8b_dwmac_data *data;
> +
>  	phy_interface_t		phy_mode;
>  	struct clk		*rgmii_tx_clk;
>  	u32			tx_delay_ns;
> @@ -171,6 +182,46 @@ static int meson8b_init_rgmii_tx_clk(struct meson8b_dwmac *dwmac)
>  	return 0;
>  }
>  
> +static int meson8b_init_set_mode(struct meson8b_dwmac *dwmac)
> +{
> +	bool ext_phy_mode = dwmac->data->ext_phy_mode;
> +
> +	switch (dwmac->phy_mode) {
> +	case PHY_INTERFACE_MODE_RGMII:
> +	case PHY_INTERFACE_MODE_RGMII_RXID:
> +	case PHY_INTERFACE_MODE_RGMII_ID:
> +	case PHY_INTERFACE_MODE_RGMII_TXID:
> +		/* enable RGMII mode */
> +		if (ext_phy_mode)

It looks weird to have this 'if' target a specific SoC within a function
named after another SoC.

Couldn't you make one function per SoC type, and pass that function pointer in
the match data?
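
Something along these lines, as a rough userspace sketch only: the
toy_* names, the register model and the lookup helper are made up for
illustration, and only the bit values and compatible strings are taken
from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_RGMII_MODE_BIT    0x1u /* bit[0] on the older SoCs */
#define TOY_EXT_PHY_MODE_MASK 0x7u /* bit[2:0] on Meson-AXG */
#define TOY_EXT_RGMII_MODE    1u
#define TOY_EXT_RMII_MODE     4u

enum toy_phy_mode { TOY_RGMII, TOY_RMII };

struct toy_dwmac {
	uint32_t prg_eth0; /* stands in for the PRG_ETH0 register */
	enum toy_phy_mode phy_mode;
};

/* Per-SoC data carries the function instead of a bool flag. */
struct toy_dwmac_data {
	int (*set_phy_mode)(struct toy_dwmac *dwmac);
};

static void toy_mask_bits(uint32_t *reg, uint32_t mask, uint32_t value)
{
	*reg = (*reg & ~mask) | (value & mask);
}

static int toy_meson8b_set_phy_mode(struct toy_dwmac *dwmac)
{
	uint32_t val = dwmac->phy_mode == TOY_RGMII ? TOY_RGMII_MODE_BIT : 0;

	toy_mask_bits(&dwmac->prg_eth0, TOY_RGMII_MODE_BIT, val);
	return 0;
}

static int toy_axg_set_phy_mode(struct toy_dwmac *dwmac)
{
	uint32_t val = dwmac->phy_mode == TOY_RGMII ? TOY_EXT_RGMII_MODE :
						      TOY_EXT_RMII_MODE;

	toy_mask_bits(&dwmac->prg_eth0, TOY_EXT_PHY_MODE_MASK, val);
	return 0;
}

static const struct toy_dwmac_data toy_meson8b_data = {
	.set_phy_mode = toy_meson8b_set_phy_mode,
};

static const struct toy_dwmac_data toy_axg_data = {
	.set_phy_mode = toy_axg_set_phy_mode,
};

struct toy_match {
	const char *compatible;
	const struct toy_dwmac_data *data;
};

static const struct toy_match toy_match_table[] = {
	{ "amlogic,meson8b-dwmac", &toy_meson8b_data },
	{ "amlogic,meson-axg-dwmac", &toy_axg_data },
};

/* Toy replacement for the match-data lookup done at probe time. */
static const struct toy_dwmac_data *toy_get_match_data(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(toy_match_table) / sizeof(toy_match_table[0]); i++)
		if (!strcmp(toy_match_table[i].compatible, compatible))
			return toy_match_table[i].data;
	return NULL;
}
```

The probe path would then just call data->set_phy_mode(dwmac), and no
function has to test which SoC it is running on.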

> +			meson8b_dwmac_mask_bits(dwmac, PRG_ETH0,
> +						PRG_ETH0_EXT_PHY_MODE_MASK,
> +						PRG_ETH0_EXT_RGMII_MODE);
> +		else
> +			meson8b_dwmac_mask_bits(dwmac, PRG_ETH0,
> +						PRG_ETH0_RGMII_MODE,
> +						PRG_ETH0_RGMII_MODE);
> +
> +		break;
> +	case PHY_INTERFACE_MODE_RMII:
> +		/* disable RGMII mode -> enables RMII mode */
> +		if (ext_phy_mode)
> +			meson8b_dwmac_mask_bits(dwmac, PRG_ETH0,
> +						PRG_ETH0_EXT_PHY_MODE_MASK,
> +						PRG_ETH0_EXT_RMII_MODE);
> +		else
> +			meson8b_dwmac_mask_bits(dwmac, PRG_ETH0,
> +						PRG_ETH0_RGMII_MODE, 0);
> +
> +		break;
> +	default:
> +		dev_err(dwmac->dev, "fail to set phy-mode %s\n",
> +			phy_modes(dwmac->phy_mode));
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
>  static int meson8b_init_prg_eth(struct meson8b_dwmac *dwmac)
>  {
>  	int ret;
> @@ -188,10 +239,6 @@ static int meson8b_init_prg_eth(struct meson8b_dwmac *dwmac)
>  
>  	case PHY_INTERFACE_MODE_RGMII_ID:
>  	case PHY_INTERFACE_MODE_RGMII_TXID:
> -		/* enable RGMII mode */
> -		meson8b_dwmac_mask_bits(dwmac, PRG_ETH0, PRG_ETH0_RGMII_MODE,
> -					PRG_ETH0_RGMII_MODE);
> -
>  		/* only relevant for RMII mode -> disable in RGMII mode */
>  		meson8b_dwmac_mask_bits(dwmac, PRG_ETH0,
>  					PRG_ETH0_INVERTED_RMII_CLK, 0);
> @@ -224,10 +271,6 @@ static int meson8b_init_prg_eth(struct meson8b_dwmac *dwmac)
>  		break;
>  
>  	case PHY_INTERFACE_MODE_RMII:
> -		/* disable RGMII mode -> enables RMII mode */
> -		meson8b_dwmac_mask_bits(dwmac, PRG_ETH0, PRG_ETH0_RGMII_MODE,
> -					0);
> -
>  		/* invert internal clk_rmii_i to generate 25/2.5 tx_rx_clk */
>  		meson8b_dwmac_mask_bits(dwmac, PRG_ETH0,
>  					PRG_ETH0_INVERTED_RMII_CLK,
> @@ -274,6 +317,11 @@ static int meson8b_dwmac_probe(struct platform_device *pdev)
>  		goto err_remove_config_dt;
>  	}
>  
> +	dwmac->data = (const struct meson8b_dwmac_data *)
> +		of_device_get_match_data(&pdev->dev);
> +	if (!dwmac->data)
> +		return -EINVAL;
> +
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>  	dwmac->regs = devm_ioremap_resource(&pdev->dev, res);
>  	if (IS_ERR(dwmac->regs)) {
> @@ -298,6 +346,10 @@ static int meson8b_dwmac_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto err_remove_config_dt;
>  
> +	ret = meson8b_init_set_mode(dwmac);
> +	if (ret)
> +		goto err_remove_config_dt;
> +
>  	ret = meson8b_init_prg_eth(dwmac);
>  	if (ret)
>  		goto err_remove_config_dt;
> @@ -316,10 +368,31 @@ static int meson8b_dwmac_probe(struct platform_device *pdev)
>  	return ret;
>  }
>  
> +static const struct meson8b_dwmac_data meson8b_dwmac_data = {
> +	.ext_phy_mode = false,
> +};
> +
> +static const struct meson8b_dwmac_data meson_axg_dwmac_data = {
> +	.ext_phy_mode = true,
> +};
> +
>  static const struct of_device_id meson8b_dwmac_match[] = {
> -	{ .compatible = "amlogic,meson8b-dwmac" },
> -	{ .compatible = "amlogic,meson8m2-dwmac" },
> -	{ .compatible = "amlogic,meson-gxbb-dwmac" },
> +	{
> +		.compatible = "amlogic,meson8b-dwmac",
> +		.data = &meson8b_dwmac_data,
> +	},
> +	{
> +		.compatible = "amlogic,meson8m2-dwmac",
> +		.data = &meson8b_dwmac_data,
> +	},
> +	{
> +		.compatible = "amlogic,meson-gxbb-dwmac",
> +		.data = &meson8b_dwmac_data,
> +	},
> +	{
> +		.compatible = "amlogic,meson-axg-dwmac",
> +		.data = &meson_axg_dwmac_data,
> +	},
>  	{ }
>  };
>  MODULE_DEVICE_TABLE(of, meson8b_dwmac_match);

^ permalink raw reply

* Re: [bpf PATCH v2] bpf: fix for lex/yacc build error with gcc-5
From: Jiri Benc @ 2018-04-26  8:43 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev
In-Reply-To: <20180425212245.25999.21181.stgit@john-Precision-Tower-5810>

On Wed, 25 Apr 2018 14:22:45 -0700, John Fastabend wrote:
> --- a/tools/bpf/Makefile
> +++ b/tools/bpf/Makefile
> @@ -76,6 +76,8 @@ $(OUTPUT)bpf_asm: $(OUTPUT)bpf_asm.o $(OUTPUT)bpf_exp.yacc.o $(OUTPUT)bpf_exp.le
>  	$(QUIET_LINK)$(CC) $(CFLAGS) -o $@ $^
>  
>  $(OUTPUT)bpf_exp.lex.c: $(OUTPUT)bpf_exp.yacc.c
> +$(OUTPUT)bpf_exp.yacc.o: $(OUTPUT)bpf_exp.yacc.c
> +$(OUTPUT)bpf_exp.lex.o: $(OUTPUT)bpf_exp.lex.c

Looks better than v1, the first dependency is important.

For some reason, I did not need the other two rules.

By the way, make invoked from tools/bpf/ has never really worked, even
before my patchset. This works correctly:

cd tools
make bpf

 Jiri

^ permalink raw reply

* Re: [PATCH v4 ipsec-next] xfrm: remove VLA usage in __xfrm6_sort()
From: Steffen Klassert @ 2018-04-26  8:33 UTC (permalink / raw)
  To: Stefano Brivio
  Cc: Kees Cook, Andreas Christoforou, kernel-hardening, Herbert Xu,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev,
	linux-kernel
In-Reply-To: <20180425165852.6c7460c2@epycfail>

On Wed, Apr 25, 2018 at 04:58:52PM +0200, Stefano Brivio wrote:
> On Wed, 25 Apr 2018 07:46:39 -0700
> Kees Cook <keescook@chromium.org> wrote:
> 
> > In the quest to remove all stack VLA usage from the kernel[1],
> > just use XFRM_MAX_DEPTH as already done for the "class" array. In one
> > case, it'll do this loop up to 5, the other caller up to 6.
> > 
> > [1] https://lkml.org/lkml/2018/3/7/621
> > 
> > Co-developed-by: Andreas Christoforou <andreaschristofo@gmail.com>
> > Signed-off-by: Kees Cook <keescook@chromium.org>
> > ---
> > v4:
> > - actually remove memset(). :)
> > v3:
> > - adjust Subject and commit log (Steffen)
> > - use "= { }" instead of memset() (Stefano)
> > v2:
> > - use XFRM_MAX_DEPTH for "count" array (Steffen and Mathias).
> > ---
> 
> Acked-by: Stefano Brivio <sbrivio@redhat.com>

Applied to ipsec-next, thanks everyone!
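
For readers following along, the pattern the patch moves to can be sketched in
userspace roughly as follows. This is not the kernel's __xfrm6_sort():
SKETCH_MAX_DEPTH, sort_by_class() and class_mod3() are hypothetical names, the
bound of 6 is assumed from the "up to 6" remark in the commit log, and
"= { 0 }" is used as the portable spelling of the "= { }" initializer.

```c
#include <stddef.h>

#define SKETCH_MAX_DEPTH 6	/* assumed bound, stands in for XFRM_MAX_DEPTH */

/* Example classifier used only for illustration. */
static int class_mod3(int v)
{
	return v % 3;
}

/*
 * Stable counting sort of n values by class. Both scratch arrays are sized
 * by the compile-time maximum instead of by a runtime value, so no VLA is
 * needed; the runtime arguments are range-checked against that maximum.
 */
static int sort_by_class(int *dst, const int *src, int n,
			 int (*classify)(int v), int maxclass)
{
	int count[SKETCH_MAX_DEPTH] = { 0 };	/* zeroed without memset() */
	int class[SKETCH_MAX_DEPTH];
	int start = 0;
	int i;

	if (n < 0 || n > SKETCH_MAX_DEPTH ||
	    maxclass < 1 || maxclass > SKETCH_MAX_DEPTH)
		return -1;

	/* Pass 1: record each element's class and count class sizes. */
	for (i = 0; i < n; i++) {
		int c = classify(src[i]);

		if (c < 0 || c >= maxclass)
			return -1;
		class[i] = c;
		count[c]++;
	}

	/* Pass 2: convert counts into the start offset of each class. */
	for (i = 0; i < maxclass; i++) {
		int cnt = count[i];

		count[i] = start;
		start += cnt;
	}

	/* Pass 3: place elements, keeping input order within a class. */
	for (i = 0; i < n; i++)
		dst[count[class[i]]++] = src[i];

	return 0;
}
```

The cost is a few unused array slots when a caller needs fewer than the
maximum, in exchange for a stack frame whose size is known at compile time.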

^ permalink raw reply

* [PATCH 3/3] selftests/bpf: .gitignore: add test_btf
From: Sirio Balmelli @ 2018-04-26  8:31 UTC (permalink / raw)
  To: ast; +Cc: daniel, netdev

Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
---
 tools/testing/selftests/bpf/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index 5e1ab2f..9513c77 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -12,6 +12,7 @@ test_tcpbpf_user
 test_verifier_log
 feature
 test_libbpf_open
+test_btf
 test_sock
 test_sock_addr
 urandom_read
-- 
2.7.4

^ permalink raw reply related

* [PATCH 2/3] selftests/bpf: test_xdp_noinline.c: fix 'noinline' macro expansion
From: Sirio Balmelli @ 2018-04-26  8:31 UTC (permalink / raw)
  To: ast; +Cc: daniel, netdev

Compiling with clang 7.0.0 yields:
test_xdp_noinline.c:470:24: warning: unknown attribute '__attribute__' ignored [-Wunknown-attributes]
../../../include/linux/compiler-gcc.h:24:19: note: expanded from macro 'noinline'
                        ^
test_xdp_noinline.c:494:24: error: use of undeclared identifier 'noinline'; did you mean 'inline'?
static __attribute__ ((noinline))

This appears to be the 'noinline' attribute being itself macro-expanded,
so the compiler sees '__attribute__ ((__attribute__((noinline))))'.

Fix using an #ifndef.
Homogenize function declarations.
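
A minimal standalone sketch of the guard, assuming a GCC- or Clang-compatible
compiler; rol32_sketch() is a hypothetical helper used only to show the macro
in a declaration. The macro body has no outer parentheses because it is used
as a declaration specifier.

```c
#include <stdint.h>

/* Define noinline only if no earlier header already provides it. */
#ifndef noinline
#define noinline __attribute__((noinline))
#endif

/*
 * Using the bare 'noinline' token avoids spelling out
 * __attribute__ ((noinline)) by hand, which is what gets re-expanded
 * when 'noinline' is itself a macro.
 */
static noinline uint32_t rol32_sketch(uint32_t word, unsigned int shift)
{
	return (word << shift) | (word >> ((-shift) & 31));
}
```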

Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
---
 tools/testing/selftests/bpf/test_xdp_noinline.c | 79 +++++++++++++------------
 1 file changed, 42 insertions(+), 37 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_xdp_noinline.c b/tools/testing/selftests/bpf/test_xdp_noinline.c
index 5e4aac7..5b5f3f2 100644
--- a/tools/testing/selftests/bpf/test_xdp_noinline.c
+++ b/tools/testing/selftests/bpf/test_xdp_noinline.c
@@ -15,6 +15,11 @@
 #include <linux/udp.h>
 #include "bpf_helpers.h"
 
+/* some compiler-specific header might define this */
+#ifndef noinline
+#define noinline __attribute__ ((noinline))
+#endif
+
 #define bpf_printk(fmt, ...)				\
 ({							\
 	char ____fmt[] = fmt;				\
@@ -55,7 +60,7 @@ static __u32 rol32(__u32 word, unsigned int shift)
 
 typedef unsigned int u32;
 
-static __attribute__ ((noinline))
+static noinline
 u32 jhash(const void *key, u32 length, u32 initval)
 {
 	u32 a, b, c;
@@ -92,7 +97,7 @@ u32 jhash(const void *key, u32 length, u32 initval)
 	return c;
 }
 
-static __attribute__ ((noinline))
+static noinline
 u32 __jhash_nwords(u32 a, u32 b, u32 c, u32 initval)
 {
 	a += initval;
@@ -102,7 +107,7 @@ u32 __jhash_nwords(u32 a, u32 b, u32 c, u32 initval)
 	return c;
 }
 
-static __attribute__ ((noinline))
+static noinline
 u32 jhash_2words(u32 a, u32 b, u32 initval)
 {
 	return __jhash_nwords(a, b, 0, initval + JHASH_INITVAL + (2 << 2));
@@ -239,7 +244,7 @@ static inline __u64 calc_offset(bool is_ipv6, bool is_icmp)
 	return off;
 }
 
-static __attribute__ ((noinline))
+static noinline
 bool parse_udp(void *data, void *data_end,
 	       bool is_ipv6, struct packet_description *pckt)
 {
@@ -261,7 +266,7 @@ bool parse_udp(void *data, void *data_end,
 	return 1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 bool parse_tcp(void *data, void *data_end,
 	       bool is_ipv6, struct packet_description *pckt)
 {
@@ -285,7 +290,7 @@ bool parse_tcp(void *data, void *data_end,
 	return 1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 bool encap_v6(struct xdp_md *xdp, struct ctl_value *cval,
 	      struct packet_description *pckt,
 	      struct real_definition *dst, __u32 pkt_bytes)
@@ -328,7 +333,7 @@ bool encap_v6(struct xdp_md *xdp, struct ctl_value *cval,
 	return 1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 bool encap_v4(struct xdp_md *xdp, struct ctl_value *cval,
 	      struct packet_description *pckt,
 	      struct real_definition *dst, __u32 pkt_bytes)
@@ -382,7 +387,7 @@ bool encap_v4(struct xdp_md *xdp, struct ctl_value *cval,
 	return 1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 bool decap_v6(struct xdp_md *xdp, void **data, void **data_end, bool inner_v4)
 {
 	struct eth_hdr *new_eth;
@@ -403,7 +408,7 @@ bool decap_v6(struct xdp_md *xdp, void **data, void **data_end, bool inner_v4)
 	return 1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 bool decap_v4(struct xdp_md *xdp, void **data, void **data_end)
 {
 	struct eth_hdr *new_eth;
@@ -421,7 +426,7 @@ bool decap_v4(struct xdp_md *xdp, void **data, void **data_end)
 	return 1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 int swap_mac_and_send(void *data, void *data_end)
 {
 	unsigned char tmp_mac[6];
@@ -434,7 +439,7 @@ int swap_mac_and_send(void *data, void *data_end)
 	return XDP_TX;
 }
 
-static __attribute__ ((noinline))
+static noinline
 int send_icmp_reply(void *data, void *data_end)
 {
 	struct icmphdr *icmp_hdr;
@@ -467,7 +472,7 @@ int send_icmp_reply(void *data, void *data_end)
 	return swap_mac_and_send(data, data_end);
 }
 
-static __attribute__ ((noinline))
+static noinline
 int send_icmp6_reply(void *data, void *data_end)
 {
 	struct icmp6hdr *icmp_hdr;
@@ -491,7 +496,7 @@ int send_icmp6_reply(void *data, void *data_end)
 	return swap_mac_and_send(data, data_end);
 }
 
-static __attribute__ ((noinline))
+static noinline
 int parse_icmpv6(void *data, void *data_end, __u64 off,
 		 struct packet_description *pckt)
 {
@@ -516,7 +521,7 @@ int parse_icmpv6(void *data, void *data_end, __u64 off,
 	return -1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 int parse_icmp(void *data, void *data_end, __u64 off,
 	       struct packet_description *pckt)
 {
@@ -543,7 +548,7 @@ int parse_icmp(void *data, void *data_end, __u64 off,
 	return -1;
 }
 
-static __attribute__ ((noinline))
+static noinline
 __u32 get_packet_hash(struct packet_description *pckt,
 		      bool hash_16bytes)
 {
@@ -555,11 +560,11 @@ __u32 get_packet_hash(struct packet_description *pckt,
 				    24);
 }
 
-__attribute__ ((noinline))
-static bool get_packet_dst(struct real_definition **real,
-			   struct packet_description *pckt,
-			   struct vip_meta *vip_info,
-			   bool is_ipv6, void *lru_map)
+static noinline
+bool get_packet_dst(struct real_definition **real,
+		    struct packet_description *pckt,
+		    struct vip_meta *vip_info,
+		    bool is_ipv6, void *lru_map)
 {
 	struct real_pos_lru new_dst_lru = { };
 	bool hash_16bytes = is_ipv6;
@@ -608,10 +613,10 @@ static bool get_packet_dst(struct real_definition **real,
 	return 1;
 }
 
-__attribute__ ((noinline))
-static void connection_table_lookup(struct real_definition **real,
-				    struct packet_description *pckt,
-				    void *lru_map)
+static noinline
+void connection_table_lookup(struct real_definition **real,
+			     struct packet_description *pckt,
+			     void *lru_map)
 {
 
 	struct real_pos_lru *dst_lru;
@@ -635,11 +640,11 @@ static void connection_table_lookup(struct real_definition **real,
  * below function has 6 arguments whereas bpf and llvm allow maximum of 5
  * but since it's _static_ llvm can optimize one argument away
  */
-__attribute__ ((noinline))
-static int process_l3_headers_v6(struct packet_description *pckt,
-				 __u8 *protocol, __u64 off,
-				 __u16 *pkt_bytes, void *data,
-				 void *data_end)
+static noinline
+int process_l3_headers_v6(struct packet_description *pckt,
+			  __u8 *protocol, __u64 off,
+			  __u16 *pkt_bytes, void *data,
+			  void *data_end)
 {
 	struct ipv6hdr *ip6h;
 	__u64 iph_len;
@@ -666,11 +671,11 @@ static int process_l3_headers_v6(struct packet_description *pckt,
 	return -1;
 }
 
-__attribute__ ((noinline))
-static int process_l3_headers_v4(struct packet_description *pckt,
-				 __u8 *protocol, __u64 off,
-				 __u16 *pkt_bytes, void *data,
-				 void *data_end)
+static noinline
+int process_l3_headers_v4(struct packet_description *pckt,
+			  __u8 *protocol, __u64 off,
+			  __u16 *pkt_bytes, void *data,
+			  void *data_end)
 {
 	struct iphdr *iph;
 	__u64 iph_len;
@@ -698,9 +703,9 @@ static int process_l3_headers_v4(struct packet_description *pckt,
 	return -1;
 }
 
-__attribute__ ((noinline))
-static int process_packet(void *data, __u64 off, void *data_end,
-			  bool is_ipv6, struct xdp_md *xdp)
+static inline
+int process_packet(void *data, __u64 off, void *data_end,
+		   bool is_ipv6, struct xdp_md *xdp)
 {
 
 	struct real_definition *dst = NULL;
-- 
2.7.4

^ permalink raw reply related

* [PATCH 1/3] selftests/bpf: Makefile: add includes to fix broken test build
From: Sirio Balmelli @ 2018-04-26  8:31 UTC (permalink / raw)
  To: ast; +Cc: daniel, netdev

Several bpf tests fail to build with clang 7.0.0:
...
In file included from ../../../include/uapi/linux/bpf.h:11:
In file included from ./include/uapi/linux/types.h:5:
/usr/include/asm-generic/int-ll64.h:11:10: fatal error: 'asm/bitsperlong.h' file not found

/usr/include/asm-generic/int-ll64.h comes from outside the kernel repo, so it
is probably better to point the include path at -I$(ROOT)/include/uapi instead.
asm/bitsperlong.h is architecture-specific; cater for this with an
architecture-specific include, -I$(ROOT)/$(ARCH)/include/uapi.

Re-building now yields:
../../../../include/uapi/linux/stddef.h:2:10: fatal error: 'linux/compiler_types.h' file not found

Fix this with -I$(ROOT)/include.

Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
---
 tools/testing/selftests/bpf/Makefile | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 0b72cc7..6a8cfaf 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -80,8 +80,14 @@ else
   CPU ?= generic
 endif
 
-CLANG_FLAGS = -I. -I./include/uapi -I../../../include/uapi \
-	      -Wno-compare-distinct-pointer-types
+ARCH := arch/$(subst _64,,$(shell uname -p))
+ROOT :=../../../..
+TOOLS :=../../..
+CLANG_FLAGS = -I. -I./include/uapi \
+	-I$(TOOLS)/include/uapi -I$(TOOLS)/include \
+	-I$(ROOT)/$(ARCH)/include/uapi \
+	-I$(ROOT)/include/uapi -I$(ROOT)/include \
+	-Wno-compare-distinct-pointer-types
 
 $(OUTPUT)/test_l4lb_noinline.o: CLANG_FLAGS += -fno-inline
 $(OUTPUT)/test_xdp_noinline.o: CLANG_FLAGS += -fno-inline
-- 
2.7.4

^ permalink raw reply related

* [PATCH 0/3] selftests/bpf
From: Sirio Balmelli @ 2018-04-26  8:30 UTC (permalink / raw)
  To: ast; +Cc: daniel, netdev

Fix some errors getting BPF selftests to build in 4.17.0-rc2 from net-next.

uname -a
Linux vm4 4.17.0-rc2-net-next+ #2 SMP Wed Apr 25 22:51:20 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux

clang --version
clang version 7.0.0 (trunk 330819)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin

llc --version
LLVM (http://llvm.org/):
  LLVM version 7.0.0svn
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: haswell

Sirio Balmelli (3):
  selftests/bpf: Makefile: add includes to fix broken test build
  selftests/bpf: test_xdp_noinline.c: fix 'noinline' macro expansion
  selftests/bpf: .gitignore: add test_btf

 tools/testing/selftests/bpf/.gitignore          |  1 +
 tools/testing/selftests/bpf/Makefile            | 10 +++-
 tools/testing/selftests/bpf/test_xdp_noinline.c | 79 +++++++++++++------------
 3 files changed, 51 insertions(+), 39 deletions(-)

-- 
2.7.4

^ permalink raw reply

* Re: [net-next v3] ipv6: sr: Compute flowlabel for outer IPv6 header of seg6 encap mode
From: Ahmed Abdelsalam @ 2018-04-26  8:23 UTC (permalink / raw)
  To: David Miller; +Cc: dav.lebrun, kuznet, yoshfuji, netdev, linux-kernel
In-Reply-To: <20180425.130304.1605881647944477071.davem@davemloft.net>

On Wed, 25 Apr 2018 13:03:04 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Ahmed Abdelsalam <amsalam20@gmail.com>
> Date: Tue, 24 Apr 2018 20:23:16 +0200
> 
> > ECMP (equal-cost multipath) hashes are typically computed on the packets'
> > 5-tuple (src IP, dst IP, src port, dst port, L4 proto).
> > 
> > For encapsulated packets, the L4 data is not readily available and ECMP
> > hashing will often revert to (src IP, dst IP). This will lead to traffic
> > polarization on a single ECMP path, causing congestion and waste of network
> > capacity.
> > 
> > In IPv6, the 20-bit flow label field is also used as part of the ECMP hash.
> > In the absence of L4 data, the hashing will be on (src IP, dst IP, flow
> > label). Having a non-zero flow label is thus important for proper traffic
> > load balancing when L4 data is unavailable (i.e., when packets are
> > encapsulated).
> > 
> > Currently, the seg6_do_srh_encap() function extracts the original packet's
> > flow label and sets it as the outer IPv6 flow label. There are two issues
> > with this behaviour:
> > 
> > a) There is no guarantee that the inner flow label is set by the source.
> > b) If the original packet is not IPv6, the flow label will be set to
> > zero (e.g., IPv4 or L2 encap).
> > 
> > This patch adds a function, named seg6_make_flowlabel(), that computes a
> > flow label from a given skb. It supports IPv6, IPv4 and L2 payloads, and
> > leverages the per-namespace 'seg6_flowlabel' sysctl value.
> > 
> > The currently supported behaviours are as follows:
> > -1 set flowlabel to zero.
> > 0 copy flowlabel from the inner packet in case of inner IPv6
> > (set flowlabel to 0 in case of IPv4/L2)
> > 1 compute the flowlabel using seg6_make_flowlabel()
> > 
> > This patch has been tested for IPv6, IPv4, and L2 traffic.
> > 
> > Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com>
> 
> Applied.
> 
> Please submit a patch which adds appropriate documentation for this new sysctl
> to Documentation/networking/ip-sysctl.txt

Thanks, David.
I will send you a patch for ip-sysctl.txt.
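
As a starting point for that documentation, the three sysctl behaviours quoted
above can be sketched like this. It is an illustration only, not the code in
the patch: outer_flowlabel(), enum pkt_kind and FLOWLABEL_MASK are hypothetical
names, and deriving the computed label by masking a caller-supplied flow hash
to 20 bits is an assumption about what "compute" means here.

```c
#include <stdint.h>

#define FLOWLABEL_MASK 0x000FFFFFu	/* the IPv6 flow label is 20 bits wide */

/* Hypothetical classification of the inner (encapsulated) packet. */
enum pkt_kind { PKT_IPV6, PKT_IPV4, PKT_L2 };

/*
 * Pick the outer IPv6 flow label from the sysctl mode:
 *   -1: always zero
 *    0: copy the inner label for inner IPv6, zero for IPv4/L2
 *    1: derive the label from a flow hash (assumed: hash masked to 20 bits)
 */
static uint32_t outer_flowlabel(int mode, enum pkt_kind kind,
				uint32_t inner_label, uint32_t flow_hash)
{
	switch (mode) {
	case 1:
		return flow_hash & FLOWLABEL_MASK;
	case 0:
		return kind == PKT_IPV6 ? (inner_label & FLOWLABEL_MASK) : 0;
	default:
		return 0;
	}
}
```

Only mode 1 yields a label that varies per flow for IPv4 and L2 payloads,
which is the case the ECMP argument in the commit log is about.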

-- 
Ahmed Abdelsalam <amsalam20@gmail.com>

^ permalink raw reply

