Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH bpf-next 1/5] samples/bpf: Fix typo in comment
From: Leo Yan @ 2018-04-25 10:01 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Daniel Thompson, Alexei Starovoitov, Daniel Borkmann, netdev,
	linux-kernel
In-Reply-To: <20180420155213.2867fcf5@redhat.com>

On Fri, Apr 20, 2018 at 03:52:13PM +0200, Jesper Dangaard Brouer wrote:
> On Fri, 20 Apr 2018 14:21:16 +0100
> Daniel Thompson <daniel.thompson@linaro.org> wrote:
> 
> > On Fri, Apr 20, 2018 at 02:10:04PM +0200, Jesper Dangaard Brouer wrote:
> > > 
> > > On Thu, 19 Apr 2018 09:34:02 +0800 Leo Yan <leo.yan@linaro.org> wrote:
> > >   
> > > > Fix typo by replacing 'iif' with 'if'.
> > > > 
> > > > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > > > ---
> > > >  samples/bpf/bpf_load.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
> > > > index bebe418..28e4678 100644
> > > > --- a/samples/bpf/bpf_load.c
> > > > +++ b/samples/bpf/bpf_load.c
> > > > @@ -393,7 +393,7 @@ static int load_elf_maps_section(struct bpf_map_data *maps, int maps_shndx,
> > > >  			continue;
> > > >  		if (sym[nr_maps].st_shndx != maps_shndx)
> > > >  			continue;
> > > > -		/* Only increment iif maps section */
> > > > +		/* Only increment if maps section */
> > > >  		nr_maps++;
> > > >  	}  
> > > 
> > > This was actually not a typo from my side.
> > > 
> > > With 'iif' I mean 'if and only if' ... but it doesn't matter much.  
> > 
> > I think 'if and only if' is more commonly abbreviated 'iff' isn't it?
> 
> Ah, yes![1]  -- then it *is* actually a typo! - LOL
> 
> I'm fine with changing this to "if" :-)

Thanks for the reviewing, Daniel & Jesper.
I also learn it from the discussion :)

> [1] https://en.wikipedia.org/wiki/If_and_only_if
> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH 0/3] bpf: Store/dump license string for loaded program
From: Jiri Olsa @ 2018-04-25 10:17 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, lkml, netdev,
	Quentin Monnet
In-Reply-To: <20180423201134.r56ts7usec4jjnbh@ast-mbp>

On Mon, Apr 23, 2018 at 02:11:36PM -0600, Alexei Starovoitov wrote:
> On Mon, Apr 23, 2018 at 08:59:24AM +0200, Jiri Olsa wrote:
> > hi,
> > sending the change to store and dump the license
> > info for loaded BPF programs. It's important for
> > us get the license info, when investigating on
> > screwed up machine.
> 
> hmm. boolean flag whether bpf prog is gpl or not
> is already exposed via bpf_prog_info.

hum, I can't see that (on bpf-next/master)
would the attached change be ok with you?

jirka


---
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e6679393b687..2ce9c9d41c2b 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1062,6 +1062,7 @@ struct bpf_prog_info {
 	__u32 ifindex;
 	__u64 netns_dev;
 	__u64 netns_ino;
+	__u16 gpl_compatible:1;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index fe23dc5a3ec4..7bb4ff1c770a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1914,6 +1914,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	info.load_time = prog->aux->load_time;
 	info.created_by_uid = from_kuid_munged(current_user_ns(),
 					       prog->aux->user->uid);
+	info.gpl_compatible = prog->gpl_compatible;
 
 	memcpy(info.tag, prog->tag, sizeof(prog->tag));
 	memcpy(info.name, prog->aux->name, sizeof(prog->aux->name));

^ permalink raw reply related

* Re: [PATCH stable v4.4+] r8152: add Linksys USB3GIGV1 id
From: Greg KH @ 2018-04-25 10:30 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Oliver Neukum, David S. Miller, linux-usb, netdev, linux-kernel,
	Grant Grundler
In-Reply-To: <1524650077-13270-1-git-send-email-krzk@kernel.org>

On Wed, Apr 25, 2018 at 11:54:37AM +0200, Krzysztof Kozlowski wrote:
> commit 90841047a01b452cc8c3f9b990698b264143334a upstream
> 
> This linksys dongle by default comes up in cdc_ether mode.
> This patch allows r8152 to claim the device:
>    Bus 002 Device 002: ID 13b1:0041 Linksys
> 
> Signed-off-by: Grant Grundler <grundler@chromium.org>
> Reviewed-by: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> [krzk: Rebase on v4.4]

Also added to 4.9.y, thanks.

greg k-h

^ permalink raw reply

* [PATCH] net: amd8111e: remove redundant duplicated if statement
From: Colin King @ 2018-04-25 10:31 UTC (permalink / raw)
  To: David S . Miller, Allen Pais, netdev; +Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

There are two identical nested if statements, the second is redundant
and can be removed. Also clean up white space formatting.

Cleans up cppcheck warning:
drivers/net/ethernet/amd/amd8111e.c:1080: (warning) Identical inner 'if'
condition is always true.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/ethernet/amd/amd8111e.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/amd/amd8111e.c b/drivers/net/ethernet/amd/amd8111e.c
index c99e3e845ac0..a90080f12e67 100644
--- a/drivers/net/ethernet/amd/amd8111e.c
+++ b/drivers/net/ethernet/amd/amd8111e.c
@@ -1074,16 +1074,12 @@ static int amd8111e_calc_coalesce(struct net_device *dev)
 				amd8111e_set_coalesce(dev,TX_INTR_COAL);
 				coal_conf->tx_coal_type = MEDIUM_COALESCE;
 			}
-
-		}
-		else if(tx_pkt_size >= 1024){
-			if (tx_pkt_size >= 1024){
-				if(coal_conf->tx_coal_type !=  HIGH_COALESCE){
-					coal_conf->tx_timeout = 4;
-					coal_conf->tx_event_count = 8;
-					amd8111e_set_coalesce(dev,TX_INTR_COAL);
-					coal_conf->tx_coal_type = HIGH_COALESCE;
-				}
+		} else if (tx_pkt_size >= 1024) {
+			if (coal_conf->tx_coal_type != HIGH_COALESCE) {
+				coal_conf->tx_timeout = 4;
+				coal_conf->tx_event_count = 8;
+				amd8111e_set_coalesce(dev, TX_INTR_COAL);
+				coal_conf->tx_coal_type = HIGH_COALESCE;
 			}
 		}
 	}
-- 
2.17.0

^ permalink raw reply related

* [PATCH] mkiss: remove redundant check for len > 0
From: Colin King @ 2018-04-25 10:43 UTC (permalink / raw)
  To: David S . Miller, Johannes Berg, Stephen Hemminger, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The check for len > 0 is always true and hence is redundant as
this check is already being made to execute the code inside the
while-loop. Hence it is redundant and can be removed.

Cleans up cppcheck warning:
drivers/net/hamradio/mkiss.c:220: (warning) Identical inner 'if'
condition is always true.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/hamradio/mkiss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/hamradio/mkiss.c b/drivers/net/hamradio/mkiss.c
index c180b480f8ef..13e4c1eff353 100644
--- a/drivers/net/hamradio/mkiss.c
+++ b/drivers/net/hamradio/mkiss.c
@@ -217,7 +217,7 @@ static int kiss_esc_crc(unsigned char *s, unsigned char *d, unsigned short crc,
 			c = *s++;
 		else if (len > 1)
 			c = crc >> 8;
-		else if (len > 0)
+		else
 			c = crc & 0xff;
 
 		len--;
-- 
2.17.0

^ permalink raw reply related

* [PATCH net 1/1] net/smc: keep clcsock reference in smc_tcp_listen_work()
From: Ursula Braun @ 2018-04-25 10:48 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun

The internal CLC socket should exist till the SMC-socket is released.
Function tcp_listen_worker() releases the internal CLC socket of a
listen socket, if an smc_close_active() is called. This function
is called for the final release(), but it is called for shutdown
SHUT_RDWR as well. This opens a door for protection faults, if
socket calls using the internal CLC socket are called for a
shutdown listen socket.

With the changes of
commit 3d502067599f ("net/smc: simplify wait when closing listen socket")
there is no need anymore to release the internal CLC socket in
function tcp_listen_worker((). It is sufficient to release it in
smc_release().

Fixes: 127f49705823 ("net/smc: release clcsock from tcp_listen_worker")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Reported-by: syzbot+9045fc589fcd196ef522@syzkaller.appspotmail.com
Reported-by: syzbot+28a2c86cf19c81d871fa@syzkaller.appspotmail.com
Reported-by: syzbot+9605e6cace1b5efd4a0a@syzkaller.appspotmail.com
Reported-by: syzbot+cf9012c597c8379d535c@syzkaller.appspotmail.com
---
 net/smc/af_smc.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index f5d4b69dbabc..4470501374bf 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -978,10 +978,6 @@ static void smc_tcp_listen_work(struct work_struct *work)
 	}
 
 out:
-	if (lsmc->clcsock) {
-		sock_release(lsmc->clcsock);
-		lsmc->clcsock = NULL;
-	}
 	release_sock(lsk);
 	sock_put(&lsmc->sk); /* sock_hold in smc_listen */
 }
-- 
2.13.5

^ permalink raw reply related

* Re: [PATCH bpf-next 14/15] xsk: statistics support
From: Magnus Karlsson @ 2018-04-25 10:50 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Björn Töpel, Karlsson, Magnus, Alexander Duyck,
	Alexander Duyck, John Fastabend, Alexei Starovoitov,
	Jesper Dangaard Brouer, Daniel Borkmann, Michael S. Tsirkin,
	Network Development, michael.lundkvist, Brandeburg, Jesse,
	Singhai, Anjali, Zhang, Qi Z
In-Reply-To: <CAF=yD-KSZnnA+ntcBnW5Ohrb--OedbSuhqWVhcrxze6zH0-mFw@mail.gmail.com>

On Tue, Apr 24, 2018 at 6:58 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> On Mon, Apr 23, 2018 at 9:56 AM, Björn Töpel <bjorn.topel@gmail.com> wrote:
>> From: Magnus Karlsson <magnus.karlsson@intel.com>
>>
>> In this commit, a new getsockopt is added: XDP_STATISTICS. This is
>> used to obtain stats from the sockets.
>>
>> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>
>> +static int xsk_getsockopt(struct socket *sock, int level, int optname,
>> +                         char __user *optval, int __user *optlen)
>> +{
>> +       struct sock *sk = sock->sk;
>> +       struct xdp_sock *xs = xdp_sk(sk);
>> +       int len;
>> +
>> +       if (level != SOL_XDP)
>> +               return -ENOPROTOOPT;
>> +
>> +       if (get_user(len, optlen))
>> +               return -EFAULT;
>> +       if (len < 0)
>> +               return -EINVAL;
>> +
>> +       switch (optname) {
>> +       case XDP_STATISTICS:
>> +       {
>> +               struct xdp_statistics stats;
>> +
>> +               if (len != sizeof(stats))
>> +                       return -EINVAL;
>> +
>> +               mutex_lock(&xs->mutex);
>> +               stats.rx_dropped = xs->rx_dropped;
>> +               stats.rx_invalid_descs = xskq_nb_invalid_descs(xs->rx);
>> +               stats.tx_invalid_descs = xskq_nb_invalid_descs(xs->tx);
>> +               mutex_unlock(&xs->mutex);
>> +
>> +               if (copy_to_user(optval, &stats, sizeof(stats)))
>> +                       return -EFAULT;
>> +               return 0;
>
> For forward compatibility, could allow caller to pass a struct larger
> than stats and return the number of bytes filled in.

Yes definitely. Will fix right away.

> The lock can also be elided with something like gnet_stats, but it is probably
> taken rarely enough that that is not worth the effort, at least right now.

Will put this on the ever expanding todo list for future patches ;-).

Thanks: Magnus

^ permalink raw reply

* Re: [PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32
From: Wang YanQing @ 2018-04-25 11:00 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: ast, illusionist.neo, tglx, mingo, hpa, davem, x86, netdev,
	linux-kernel
In-Reply-To: <20e1eabd-e821-8240-cb4c-6da253c49585@iogearbox.net>

On Wed, Apr 25, 2018 at 02:11:16AM +0200, Daniel Borkmann wrote:
> On 04/19/2018 05:54 PM, Wang YanQing wrote:
> > The JIT compiler emits ia32 bit instructions. Currently, It supports
> > eBPF only. Classic BPF is supported because of the conversion by BPF core.
> > 
> > Almost all instructions from eBPF ISA supported except the following:
> > BPF_ALU64 | BPF_DIV | BPF_K
> > BPF_ALU64 | BPF_DIV | BPF_X
> > BPF_ALU64 | BPF_MOD | BPF_K
> > BPF_ALU64 | BPF_MOD | BPF_X
> > BPF_STX | BPF_XADD | BPF_W
> > BPF_STX | BPF_XADD | BPF_DW
> > 
> > It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL too.
> > 
> > ia32 has few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI,
> > and in these six registers, we can't treat all of them as real
> > general purpose registers in jit:
> > MUL instructions need EAX:EDX, shift instructions need ECX.
> > 
> > So I decide to use stack to emulate all eBPF 64 registers, this will
> > simplify the implementation a lot, because we don't need to face the
> > flexible memory address modes on ia32, for example, we don't need to
> > write below code pattern for one BPF_ADD instruction:
> > 
> >     if (src is a register && dst is a register)
> >     {
> >        //one instruction encoding for ADD instruction
> >     } else if (only src is a register)
> >     {
> >        //another different instruction encoding for ADD instruction
> >     } else if (only dst is a register)
> >     {
> >        //another different instruction encoding for ADD instruction
> >     } else
> >     {
> >        //src and dst are all on stack.
> >        //move src or dst to temporary registers
> >     }
> > 
> > If the above example if-else-else-else isn't so painful, try to think
> > it for BPF_ALU64|BPF_*SHIFT* instruction which we need to use many
> > native instructions to emulate.
> > 
> > Tested on my PC (Intel(R) Core(TM) i5-5200U CPU) and virtualbox.
> 
> Just out of plain curiosity, do you have a specific use case on the
> JIT for x86_32? Are you targeting this to be used for e.g. Atom or
> the like (sorry, don't really have a good overview where x86_32 is
> still in use these days)?
Yes, you are right that x86_32 isn't the BIG DEAL anymore, but they willn't
disappear completely in near future, the same as 32-bit linux kernel on
64bit machine.

For me, because I use 32-bit linux on development/hacking notebook for many years,
I try to trace/perf the kernel, then I meet eBPF, it is strange that there isn't
a jit for it, so I try to fill the empty.
> 
> > Testing results on i5-5200U:
> > 
> > 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
> > 2) test_progs: Summary: 81 PASSED, 2 FAILED.
> >    test_progs report "libbpf: incorrect bpf_call opcode" for
> >    test_l4lb_noinline and test_xdp_noinline, because there is
> >    no llvm-6.0 on my machine, and current implementation doesn't
> >    support BPF_PSEUDO_CALL, so I think we can ignore the two failed
> >    testcases.
> > 3) test_lpm: OK
> > 4) test_lru_map: OK
> > 5) test_verifier: Summary: 823 PASSED, 5 FAILED
> >    test_verifier report "invalid bpf_context access off=68 size=1/2/4/8"
> >    for all the 5 FAILED testcases with/without jit, we need to fix the
> >    failed testcases themself instead of this jit.
> 
> Can you elaborate further on these? Looks like this definitely needs
> fixing on 32 bit. Would be great to get a better understanding of the
> underlying bug(s) and properly fix them.
Ok! I will try to dig them out.
> 
> > Above tests are all done with following flags settings discretely:
> > 1:bpf_jit_enable=1 and bpf_jit_harden=0
> > 2:bpf_jit_enable=1 and bpf_jit_harden=2
> > 
> > Below are some numbers for this jit implementation:
> > Note:
> >   I run test_progs in kselftest 100 times continuously for every testcase,
> >   the numbers are in format: total/times=avg.
> >   The numbers that test_bpf reports show almost the same relation.
> > 
> > a:jit_enable=0 and jit_harden=0            b:jit_enable=1 and jit_harden=0
> >   test_pkt_access:PASS:ipv4:15622/100=156  test_pkt_access:PASS:ipv4:10057/100=100
> >   test_pkt_access:PASS:ipv6:9130/100=91    test_pkt_access:PASS:ipv6:5055/100=50
> >   test_xdp:PASS:ipv4:240198/100=2401       test_xdp:PASS:ipv4:145945/100=1459
> >   test_xdp:PASS:ipv6:137326/100=1373       test_xdp:PASS:ipv6:67337/100=673
> >   test_l4lb:PASS:ipv4:61100/100=611        test_l4lb:PASS:ipv4:38137/100=381
> >   test_l4lb:PASS:ipv6:101000/100=1010      test_l4lb:PASS:ipv6:57779/100=577
> > 
> > c:jit_enable=1 and jit_harden=2
> >   test_pkt_access:PASS:ipv4:12650/100=126
> >   test_pkt_access:PASS:ipv6:7074/100=70
> >   test_xdp:PASS:ipv4:147211/100=1472
> >   test_xdp:PASS:ipv6:85783/100=857
> >   test_l4lb:PASS:ipv4:53222/100=532
> >   test_l4lb:PASS:ipv6:76322/100=763
> > 
> > Yes, the numbers are pretty when turn off jit_harden, if we want to speedup
> > jit_harden, then we need to move BPF_REG_AX to *real* register instead of stack
> > emulation, but when we do it, we need to face all the pain I describe above. We
> > can do it in next step.
> > 
> > See Documentation/networking/filter.txt for more information.
> > 
> > Signed-off-by: Wang YanQing <udknight@gmail.com>
> [...]
> > +		/* call */
> > +		case BPF_JMP | BPF_CALL:
> > +		{
> > +			const u8 *r1 = bpf2ia32[BPF_REG_1];
> > +			const u8 *r2 = bpf2ia32[BPF_REG_2];
> > +			const u8 *r3 = bpf2ia32[BPF_REG_3];
> > +			const u8 *r4 = bpf2ia32[BPF_REG_4];
> > +			const u8 *r5 = bpf2ia32[BPF_REG_5];
> > +
> > +			if (insn->src_reg == BPF_PSEUDO_CALL)
> > +				goto notyet;
> 
> Here you bail out on BPF_PSEUDO_CALL, but in below bpf_int_jit_compile() you
> implement half of e.g. 1c2a088a6626 ("bpf: x64: add JIT support for multi-function
> programs"), which is rather confusing to leave it in this half complete state;
> either remove altogether or implement everything.
Done.
> 
> > +			func = (u8 *) __bpf_call_base + imm32;
> > +			jmp_offset = func - (image + addrs[i]);
> > +
> > +			if (!imm32 || !is_simm32(jmp_offset)) {
> > +				pr_err("unsupported bpf func %d addr %p image %p\n",
> > +				       imm32, func, image);
> > +				return -EINVAL;
> > +			}
> > +
> > +			/* mov eax,dword ptr [ebp+off] */
> > +			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp2[0]),
> > +			      STACK_VAR(r1[0]));
> > +			/* mov edx,dword ptr [ebp+off] */
> > +			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp2[1]),
> > +			      STACK_VAR(r1[1]));
> > +
> > +			emit_push_r64(r5, &prog);
> > +			emit_push_r64(r4, &prog);
> > +			emit_push_r64(r3, &prog);
> > +			emit_push_r64(r2, &prog);
> > +
> > +			EMIT1_off32(0xE8, jmp_offset + 9);
> > +
> > +			/* mov dword ptr [ebp+off],eax */
> > +			EMIT3(0x89, add_2reg(0x40, IA32_EBP, tmp2[0]),
> > +			      STACK_VAR(r0[0]));
> > +			/* mov dword ptr [ebp+off],edx */
> > +			EMIT3(0x89, add_2reg(0x40, IA32_EBP, tmp2[1]),
> > +			      STACK_VAR(r0[1]));
> > +
> > +			/* add esp,32 */
> > +			EMIT3(0x83, add_1reg(0xC0, IA32_ESP), 32);
> > +			break;
> > +		}
> > +		case BPF_JMP | BPF_TAIL_CALL:
> > +			emit_bpf_tail_call(&prog);
> > +			break;
> > +
> > +		/* cond jump */
> > +		case BPF_JMP | BPF_JEQ | BPF_X:
> > +		case BPF_JMP | BPF_JNE | BPF_X:
> > +		case BPF_JMP | BPF_JGT | BPF_X:
> > +		case BPF_JMP | BPF_JLT | BPF_X:
> > +		case BPF_JMP | BPF_JGE | BPF_X:
> > +		case BPF_JMP | BPF_JLE | BPF_X:
> > +		case BPF_JMP | BPF_JSGT | BPF_X:
> > +		case BPF_JMP | BPF_JSLE | BPF_X:
> > +		case BPF_JMP | BPF_JSLT | BPF_X:
> > +		case BPF_JMP | BPF_JSGE | BPF_X:
> > +			/* mov esi,dword ptr [ebp+off] */
> > +			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(src_hi));
> > +			/* cmp dword ptr [ebp+off], esi */
> > +			EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(dst_hi));
> > +
> > +			EMIT2(IA32_JNE, 6);
> > +			/* mov esi,dword ptr [ebp+off] */
> > +			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(src_lo));
> > +			/* cmp dword ptr [ebp+off], esi */
> > +			EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(dst_lo));
> > +			goto emit_cond_jmp;
> > +
> > +		case BPF_JMP | BPF_JSET | BPF_X:
> > +			/* mov esi,dword ptr [ebp+off] */
> > +			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(dst_lo));
> > +			/* and esi,dword ptr [ebp+off]*/
> > +			EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(src_lo));
> > +
> > +			/* mov edi,dword ptr [ebp+off] */
> > +			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, tmp[1]),
> > +			      STACK_VAR(dst_hi));
> > +			/* and edi,dword ptr [ebp+off] */
> > +			EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[1]),
> > +			      STACK_VAR(src_hi));
> > +			/* or esi,edi */
> > +			EMIT2(0x09, add_2reg(0xC0, tmp[0], tmp[1]));
> > +			goto emit_cond_jmp;
> > +
> > +		case BPF_JMP | BPF_JSET | BPF_K: {
> > +			u32 hi;
> > +
> > +			hi = imm32 & (1<<31) ? (u32)~0 : 0;
> > +			/* mov esi,imm32 */
> > +			EMIT2_off32(0xC7, add_1reg(0xC0, tmp[0]), imm32);
> > +			/* and esi,dword ptr [ebp+off]*/
> > +			EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(dst_lo));
> > +
> > +			/* mov esi,imm32 */
> > +			EMIT2_off32(0xC7, add_1reg(0xC0, tmp[1]), hi);
> > +			/* and esi,dword ptr [ebp+off] */
> > +			EMIT3(0x23, add_2reg(0x40, IA32_EBP, tmp[1]),
> > +			      STACK_VAR(dst_hi));
> > +			/* or esi,edi */
> > +			EMIT2(0x09, add_2reg(0xC0, tmp[0], tmp[1]));
> > +			goto emit_cond_jmp;
> > +		}
> > +		case BPF_JMP | BPF_JEQ | BPF_K:
> > +		case BPF_JMP | BPF_JNE | BPF_K:
> > +		case BPF_JMP | BPF_JGT | BPF_K:
> > +		case BPF_JMP | BPF_JLT | BPF_K:
> > +		case BPF_JMP | BPF_JGE | BPF_K:
> > +		case BPF_JMP | BPF_JLE | BPF_K:
> > +		case BPF_JMP | BPF_JSGT | BPF_K:
> > +		case BPF_JMP | BPF_JSLE | BPF_K:
> > +		case BPF_JMP | BPF_JSLT | BPF_K:
> > +		case BPF_JMP | BPF_JSGE | BPF_K: {
> > +			u32 hi;
> > +
> > +			hi = imm32 & (1<<31) ? (u32)~0 : 0;
> > +			/* mov esi,imm32 */
> > +			EMIT2_off32(0xC7, add_1reg(0xC0, tmp[0]), hi);
> > +			/* cmp dword ptr [ebp+off],esi */
> > +			EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(dst_hi));
> > +
> > +			EMIT2(IA32_JNE, 6);
> > +			/* mov esi,imm32 */
> > +			EMIT2_off32(0xC7, add_1reg(0xC0, tmp[0]), imm32);
> > +			/* cmp dword ptr [ebp+off],esi */
> > +			EMIT3(0x39, add_2reg(0x40, IA32_EBP, tmp[0]),
> > +			      STACK_VAR(dst_lo));
> > +
> > +emit_cond_jmp:		/* convert BPF opcode to x86 */
> > +			switch (BPF_OP(code)) {
> > +			case BPF_JEQ:
> > +				jmp_cond = IA32_JE;
> > +				break;
> > +			case BPF_JSET:
> > +			case BPF_JNE:
> > +				jmp_cond = IA32_JNE;
> > +				break;
> > +			case BPF_JGT:
> > +				/* GT is unsigned '>', JA in x86 */
> > +				jmp_cond = IA32_JA;
> > +				break;
> > +			case BPF_JLT:
> > +				/* LT is unsigned '<', JB in x86 */
> > +				jmp_cond = IA32_JB;
> > +				break;
> > +			case BPF_JGE:
> > +				/* GE is unsigned '>=', JAE in x86 */
> > +				jmp_cond = IA32_JAE;
> > +				break;
> > +			case BPF_JLE:
> > +				/* LE is unsigned '<=', JBE in x86 */
> > +				jmp_cond = IA32_JBE;
> > +				break;
> > +			case BPF_JSGT:
> > +				/* signed '>', GT in x86 */
> > +				jmp_cond = IA32_JG;
> > +				break;
> > +			case BPF_JSLT:
> > +				/* signed '<', LT in x86 */
> > +				jmp_cond = IA32_JL;
> > +				break;
> > +			case BPF_JSGE:
> > +				/* signed '>=', GE in x86 */
> > +				jmp_cond = IA32_JGE;
> > +				break;
> > +			case BPF_JSLE:
> > +				/* signed '<=', LE in x86 */
> > +				jmp_cond = IA32_JLE;
> > +				break;
> > +			default: /* to silence gcc warning */
> > +				return -EFAULT;
> > +			}
> > +			jmp_offset = addrs[i + insn->off] - addrs[i];
> > +			if (is_imm8(jmp_offset)) {
> > +				EMIT2(jmp_cond, jmp_offset);
> > +			} else if (is_simm32(jmp_offset)) {
> > +				EMIT2_off32(0x0F, jmp_cond + 0x10, jmp_offset);
> > +			} else {
> > +				pr_err("cond_jmp gen bug %llx\n", jmp_offset);
> > +				return -EFAULT;
> > +			}
> > +
> > +			break;
> > +		}
> > +		case BPF_JMP | BPF_JA:
> > +			jmp_offset = addrs[i + insn->off] - addrs[i];
> > +			if (!jmp_offset)
> > +				/* optimize out nop jumps */
> > +				break;
> > +emit_jmp:
> > +			if (is_imm8(jmp_offset)) {
> > +				EMIT2(0xEB, jmp_offset);
> > +			} else if (is_simm32(jmp_offset)) {
> > +				EMIT1_off32(0xE9, jmp_offset);
> > +			} else {
> > +				pr_err("jmp gen bug %llx\n", jmp_offset);
> > +				return -EFAULT;
> > +			}
> > +			break;
> > +
> > +		case BPF_LD | BPF_IND | BPF_W:
> > +			func = sk_load_word;
> > +			goto common_load;
> > +		case BPF_LD | BPF_ABS | BPF_W:
> > +			func = CHOOSE_LOAD_FUNC(imm32, sk_load_word);
> > +common_load:
> > +			jmp_offset = func - (image + addrs[i]);
> > +			if (!func || !is_simm32(jmp_offset)) {
> > +				pr_err("unsupported bpf func %d addr %p image %p\n",
> > +				       imm32, func, image);
> > +				return -EINVAL;
> > +			}
> > +			if (BPF_MODE(code) == BPF_ABS) {
> > +				/* mov %edx, imm32 */
> > +				EMIT1_off32(0xBA, imm32);
> > +			} else {
> > +				/* mov edx,dword ptr [ebp+off] */
> > +				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
> > +				      STACK_VAR(src_lo));
> > +				if (imm32) {
> > +					if (is_imm8(imm32))
> > +						/* add %edx, imm8 */
> > +						EMIT3(0x83, 0xC2, imm32);
> > +					else
> > +						/* add %edx, imm32 */
> > +						EMIT2_off32(0x81, 0xC2, imm32);
> > +				}
> > +			}
> > +			emit_load_skb_data_hlen(&prog);
> 
> The only place where you call the emit_load_skb_data_hlen() is here, so it
> effectively doesn't cache anything as with the other JITs. I would suggest
> to keep the complexity of the BPF_ABS/IND rather very low, remove the
> arch/x86/net/bpf_jit32.S handling altogether and just follow the model the
> way arm64 implemented this in the arch/arm64/net/bpf_jit_comp.c JIT. For
> eBPF these instructions are legacy and have been source of many subtle
> bugs in the various BPF JITs in the past, plus nobody complained about the
> current interpreter speed for LD_ABS/IND on x86_32 anyway so far. ;) We're
> also very close to have a rewrite of LD_ABS/IND out of native eBPF with
> similar performance to be able to finally remove them from the core.
Done.

Thanks.

^ permalink raw reply

* [PATCH net 1/3] net: mvpp2: Fix clk error path in mvpp2_probe
From: Maxime Chevallier @ 2018-04-25 11:07 UTC (permalink / raw)
  To: davem
  Cc: ymarkman, Antoine Tenart, netdev, gregory.clement, linux-kernel,
	Maxime Chevallier, nadavh, linux-arm-kernel, thomas.petazzoni,
	miquel.raynal, stefanc, mw, linux
In-Reply-To: <20180425110731.20153-1-maxime.chevallier@bootlin.com>

When clk_prepare_enable fails for the axi_clk, the mg_clk isn't properly
cleaned up. Add another jump label to handle that case, and make sure we
jump to it in the later error cases.

Fixes: 4792ea04bcd0 ("net: mvpp2: Fix clock resource by adding an optional bus clock")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
---
 drivers/net/ethernet/marvell/mvpp2.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 4202f9b5b966..0c2f04813d42 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -8774,12 +8774,12 @@ static int mvpp2_probe(struct platform_device *pdev)
 		if (IS_ERR(priv->axi_clk)) {
 			err = PTR_ERR(priv->axi_clk);
 			if (err == -EPROBE_DEFER)
-				goto err_gop_clk;
+				goto err_mg_clk;
 			priv->axi_clk = NULL;
 		} else {
 			err = clk_prepare_enable(priv->axi_clk);
 			if (err < 0)
-				goto err_gop_clk;
+				goto err_mg_clk;
 		}
 
 		/* Get system's tclk rate */
@@ -8793,7 +8793,7 @@ static int mvpp2_probe(struct platform_device *pdev)
 	if (priv->hw_version == MVPP22) {
 		err = dma_set_mask(&pdev->dev, MVPP2_DESC_DMA_MASK);
 		if (err)
-			goto err_mg_clk;
+			goto err_axi_clk;
 		/* Sadly, the BM pools all share the same register to
 		 * store the high 32 bits of their address. So they
 		 * must all have the same high 32 bits, which forces
@@ -8801,14 +8801,14 @@ static int mvpp2_probe(struct platform_device *pdev)
 		 */
 		err = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
 		if (err)
-			goto err_mg_clk;
+			goto err_axi_clk;
 	}
 
 	/* Initialize network controller */
 	err = mvpp2_init(pdev, priv);
 	if (err < 0) {
 		dev_err(&pdev->dev, "failed to initialize controller\n");
-		goto err_mg_clk;
+		goto err_axi_clk;
 	}
 
 	/* Initialize ports */
@@ -8821,7 +8821,7 @@ static int mvpp2_probe(struct platform_device *pdev)
 	if (priv->port_count == 0) {
 		dev_err(&pdev->dev, "no ports enabled\n");
 		err = -ENODEV;
-		goto err_mg_clk;
+		goto err_axi_clk;
 	}
 
 	/* Statistics must be gathered regularly because some of them (like
@@ -8849,8 +8849,9 @@ static int mvpp2_probe(struct platform_device *pdev)
 			mvpp2_port_remove(priv->port_list[i]);
 		i++;
 	}
-err_mg_clk:
+err_axi_clk:
 	clk_disable_unprepare(priv->axi_clk);
+err_mg_clk:
 	if (priv->hw_version == MVPP22)
 		clk_disable_unprepare(priv->mg_clk);
 err_gop_clk:
-- 
2.11.0

^ permalink raw reply related

* [PATCH net 0/3] net: mvpp2: Fix hangs when starting some interfaces on 7k/8k
From: Maxime Chevallier @ 2018-04-25 11:07 UTC (permalink / raw)
  To: davem
  Cc: Maxime Chevallier, netdev, linux-kernel, Antoine Tenart,
	thomas.petazzoni, gregory.clement, miquel.raynal, nadavh, stefanc,
	ymarkman, mw, linux, linux-arm-kernel

Armada 7K / 8K clock management has recently been reworked, see :

commit c7e92def1ef4 ("clk: mvebu: cp110: Fix clock tree representation")

I have been experiencing overall system hangs on MacchiatoBin when starting
the eth1 interface since then. It turns out some clocks dependencies were
missing in the PPv2 and xmdio driver, the clock rework made this visible.

This series adds the missing clocks in the CP-110 DT bindings for these 2
controllers, updating the documentation accordingly. It also adds support
for the missing 'MG Core clock' in mvpp2.

Thanks to Gregory Clement for finding the root cause of this bug.

Maxime Chevallier (3):
  net: mvpp2: Fix clk error path in mvpp2_probe
  net: mvpp2: Fix clock resource by adding missing mg_core_clk
  ARM64: dts: marvell: armada-cp110: Add clocks for the xmdio node

 .../devicetree/bindings/net/marvell-pp2.txt        |  9 ++++---
 arch/arm64/boot/dts/marvell/armada-cp110.dtsi      |  7 +++--
 drivers/net/ethernet/marvell/mvpp2.c               | 31 +++++++++++++++++-----
 3 files changed, 34 insertions(+), 13 deletions(-)

-- 
2.11.0

^ permalink raw reply

* [PATCH net 2/3] net: mvpp2: Fix clock resource by adding missing mg_core_clk
From: Maxime Chevallier @ 2018-04-25 11:07 UTC (permalink / raw)
  To: davem
  Cc: Maxime Chevallier, netdev, linux-kernel, Antoine Tenart,
	thomas.petazzoni, gregory.clement, miquel.raynal, nadavh, stefanc,
	ymarkman, mw, linux, linux-arm-kernel
In-Reply-To: <20180425110731.20153-1-maxime.chevallier@bootlin.com>

Marvell's PPv2.2 IP needs an additional clock named "MG Core clock".
This is required on Armada 7K and 8K.

This commit adds the required clock, updates the devicetree and its
documentation accordingly, also fixing a small typo in the
marvell-mpp2.txt examples.

Fixes: c7e92def1ef4 ("clk: mvebu: cp110: Fix clock tree representation")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
---
 .../devicetree/bindings/net/marvell-pp2.txt          |  9 +++++----
 arch/arm64/boot/dts/marvell/armada-cp110.dtsi        |  5 +++--
 drivers/net/ethernet/marvell/mvpp2.c                 | 20 ++++++++++++++++++--
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/marvell-pp2.txt b/Documentation/devicetree/bindings/net/marvell-pp2.txt
index 1814fa13f6ab..fc019df0d863 100644
--- a/Documentation/devicetree/bindings/net/marvell-pp2.txt
+++ b/Documentation/devicetree/bindings/net/marvell-pp2.txt
@@ -21,9 +21,10 @@ Required properties:
 	- main controller clock (for both armada-375-pp2 and armada-7k-pp2)
 	- GOP clock (for both armada-375-pp2 and armada-7k-pp2)
 	- MG clock (only for armada-7k-pp2)
+	- MG Core clock (only for armada-7k-pp2)
 	- AXI clock (only for armada-7k-pp2)
-- clock-names: names of used clocks, must be "pp_clk", "gop_clk", "mg_clk"
-  and "axi_clk" (the 2 latter only for armada-7k-pp2).
+- clock-names: names of used clocks, must be "pp_clk", "gop_clk", "mg_clk",
+  "mg_core_clk" and "axi_clk" (the 3 latter only for armada-7k-pp2).
 
 The ethernet ports are represented by subnodes. At least one port is
 required.
@@ -80,8 +81,8 @@ cpm_ethernet: ethernet@0 {
 	compatible = "marvell,armada-7k-pp22";
 	reg = <0x0 0x100000>, <0x129000 0xb000>;
 	clocks = <&cpm_syscon0 1 3>, <&cpm_syscon0 1 9>,
-		 <&cpm_syscon0 1 5>, <&cpm_syscon0 1 18>;
-	clock-names = "pp_clk", "gop_clk", "gp_clk", "axi_clk";
+		 <&cpm_syscon0 1 5>, <&cpm_syscon0 1 6>, <&cpm_syscon0 1 18>;
+	clock-names = "pp_clk", "gop_clk", "mg_clk", "mg_core_clk", "axi_clk";
 
 	eth0: eth0 {
 		interrupts = <ICU_GRP_NSR 39 IRQ_TYPE_LEVEL_HIGH>,
diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
index 48cad7919efa..6c137ac656e9 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
@@ -38,9 +38,10 @@
 			compatible = "marvell,armada-7k-pp22";
 			reg = <0x0 0x100000>, <0x129000 0xb000>;
 			clocks = <&CP110_LABEL(clk) 1 3>, <&CP110_LABEL(clk) 1 9>,
-				 <&CP110_LABEL(clk) 1 5>, <&CP110_LABEL(clk) 1 18>;
+				 <&CP110_LABEL(clk) 1 5>, <&CP110_LABEL(clk) 1 6>,
+				 <&CP110_LABEL(clk) 1 18>;
 			clock-names = "pp_clk", "gop_clk",
-				      "mg_clk", "axi_clk";
+				      "mg_clk", "mg_core_clk", "axi_clk";
 			marvell,system-controller = <&CP110_LABEL(syscon0)>;
 			status = "disabled";
 			dma-coherent;
diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 0c2f04813d42..6a17b67a0d61 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -942,6 +942,7 @@ struct mvpp2 {
 	struct clk *pp_clk;
 	struct clk *gop_clk;
 	struct clk *mg_clk;
+	struct clk *mg_core_clk;
 	struct clk *axi_clk;
 
 	/* List of pointers to port structures */
@@ -8768,18 +8769,28 @@ static int mvpp2_probe(struct platform_device *pdev)
 			err = clk_prepare_enable(priv->mg_clk);
 			if (err < 0)
 				goto err_gop_clk;
+
+			priv->mg_core_clk = devm_clk_get(&pdev->dev, "mg_core_clk");
+			if (IS_ERR(priv->mg_core_clk)) {
+				err = PTR_ERR(priv->mg_core_clk);
+				goto err_mg_clk;
+			}
+
+			err = clk_prepare_enable(priv->mg_core_clk);
+			if (err < 0)
+				goto err_mg_clk;
 		}
 
 		priv->axi_clk = devm_clk_get(&pdev->dev, "axi_clk");
 		if (IS_ERR(priv->axi_clk)) {
 			err = PTR_ERR(priv->axi_clk);
 			if (err == -EPROBE_DEFER)
-				goto err_mg_clk;
+				goto err_mg_core_clk;
 			priv->axi_clk = NULL;
 		} else {
 			err = clk_prepare_enable(priv->axi_clk);
 			if (err < 0)
-				goto err_mg_clk;
+				goto err_mg_core_clk;
 		}
 
 		/* Get system's tclk rate */
@@ -8851,6 +8862,10 @@ static int mvpp2_probe(struct platform_device *pdev)
 	}
 err_axi_clk:
 	clk_disable_unprepare(priv->axi_clk);
+
+err_mg_core_clk:
+	if (priv->hw_version == MVPP22)
+		clk_disable_unprepare(priv->mg_core_clk);
 err_mg_clk:
 	if (priv->hw_version == MVPP22)
 		clk_disable_unprepare(priv->mg_clk);
@@ -8898,6 +8913,7 @@ static int mvpp2_remove(struct platform_device *pdev)
 		return 0;
 
 	clk_disable_unprepare(priv->axi_clk);
+	clk_disable_unprepare(priv->mg_core_clk);
 	clk_disable_unprepare(priv->mg_clk);
 	clk_disable_unprepare(priv->pp_clk);
 	clk_disable_unprepare(priv->gop_clk);
-- 
2.11.0

^ permalink raw reply related

* [PATCH net 3/3] ARM64: dts: marvell: armada-cp110: Add clocks for the xmdio node
From: Maxime Chevallier @ 2018-04-25 11:07 UTC (permalink / raw)
  To: davem
  Cc: Maxime Chevallier, netdev, linux-kernel, Antoine Tenart,
	thomas.petazzoni, gregory.clement, miquel.raynal, nadavh, stefanc,
	ymarkman, mw, linux, linux-arm-kernel
In-Reply-To: <20180425110731.20153-1-maxime.chevallier@bootlin.com>

The Marvell XSMI controller needs 3 clocks to operate correctly :
 - The MG clock (clk 5)
 - The MG Core clock (clk 6)
 - The GOP clock (clk 18)

 This commit adds them, to avoid system hangs when using these
 interfaces.

Fixes: c7e92def1ef4 ("clk: mvebu: cp110: Fix clock tree representation")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
---
 arch/arm64/boot/dts/marvell/armada-cp110.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
index 6c137ac656e9..ed2f1237ea1e 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
@@ -142,6 +142,8 @@
 			#size-cells = <0>;
 			compatible = "marvell,xmdio";
 			reg = <0x12a600 0x10>;
+			clocks = <&CP110_LABEL(clk) 1 5>,
+				 <&CP110_LABEL(clk) 1 6>, <&CP110_LABEL(clk) 1 18>;
 			status = "disabled";
 		};
 
-- 
2.11.0

^ permalink raw reply related

* [PATCH v3] bpf, x86_32: add eBPF JIT compiler for ia32
From: Wang YanQing @ 2018-04-25 11:11 UTC (permalink / raw)
  To: daniel
  Cc: ast, illusionist.neo, tglx, mingo, hpa, davem, x86, netdev,
	linux-kernel

The JIT compiler emits ia32 bit instructions. Currently, It supports eBPF
only. Classic BPF is supported because of the conversion by BPF core.

Almost all instructions from eBPF ISA supported except the following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI. I use
EAX|EDX|ECX|EBX as temporary registers to simulate instructions in eBPF
ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding, all others
eBPF registers, R0-R10, are simulated through scratch space on stack.

The reasons behind the hardware registers allocation policy are:
1:MUL need EAX:EDX, shift operation need ECX, so they aren't fit
  for general eBPF 64bit register simulation.
2:We need at least 4 registers to simulate most eBPF ISA operations
  on registers operands instead of on register&memory operands.
3:We need to put BPF_REG_AX on hardware registers, or constant blinding
  will degrade jit performance heavily.

Tested on PC (Intel(R) Core(TM) i5-5200U CPU).
Testing results on i5-5200U:
1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 81 PASSED, 2 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 823 PASSED, 5 FAILED
Note: these failed testcases are caused by unconcerned reasons, we need
      to fix the testcases themself:
      "invalid bpf_context access" and "libbpf: incorrect bpf_call opcode"

Above tests are all done in following two conditions separately:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I run test_progs in kselftest 100 times continuously for every condition,
  the numbers are in format: total/times=avg.
  The numbers that test_bpf reports show almost the same relation.

a:jit_enable=0 and jit_harden=0            b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156    test_pkt_access:PASS:ipv4:10674/100=106
  test_pkt_access:PASS:ipv6:9130/100=91      test_pkt_access:PASS:ipv6:4855/100=48
  test_xdp:PASS:ipv4:240198/100=2401         test_xdp:PASS:ipv4:138912/100=1389
  test_xdp:PASS:ipv6:137326/100=1373         test_xdp:PASS:ipv6:68542/100=685
  test_l4lb:PASS:ipv4:61100/100=611          test_l4lb:PASS:ipv4:37302/100=373
  test_l4lb:PASS:ipv6:101000/100=1010        test_l4lb:PASS:ipv6:55030/100=550

c:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:10558/100=105
  test_pkt_access:PASS:ipv6:5092/100=50
  test_xdp:PASS:ipv4:131902/100=1319
  test_xdp:PASS:ipv6:77932/100=779
  test_l4lb:PASS:ipv4:38924/100=389
  test_l4lb:PASS:ipv6:57520/100=575

The numbers show we get 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udknight@gmail.com>
---
 Changes v2-v3:
 1:Move BPF_REG_AX to real hardware registers for performance reason.
 3:Using bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
 4:Delete partial codes in 1c2a088a6626, suggested by Daniel Borkmann.
 5:Some bug fixes and comments improvement.

 Changes v1-v2:
 1:Fix bug in emit_ia32_neg64.
 2:Fix bug in emit_ia32_arsh_r64.
 3:Delete filename in top level comment, suggested by Thomas Gleixner.
 4:Delete unnecessary boiler plate text, suggested by Thomas Gleixner.
 5:Rewrite some words in changelog.
 6:CodingSytle improvement and a little more comments.
  
 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile                |    9 +-
 arch/x86/net/bpf_jit_comp32.c        | 2533 ++++++++++++++++++++++++++++++++++
 4 files changed, 2565 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_EXIT_THREAD
 	select HAVE_FENTRY			if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c7ca4 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -291,14 +291,17 @@ static inline void indirect_branch_prediction_barrier(void)
  *    lfence
  *    jmp spec_trap
  *  do_rop:
- *    mov %rax,(%rsp)
+ *    mov %rax,(%rsp) for x86_64
+ *    mov %edx,(%esp) for x86_32
  *    retq
  *
  * Without retpolines configured:
  *
- *    jmp *%rax
+ *    jmp *%rax for x86_64
+ *    jmp *%edx for x86_32
  */
 #ifdef CONFIG_RETPOLINE
+#ifdef CONFIG_X86_64
 # define RETPOLINE_RAX_BPF_JIT_SIZE	17
 # define RETPOLINE_RAX_BPF_JIT()				\
 	EMIT1_off32(0xE8, 7);	 /* callq do_rop */		\
@@ -310,9 +313,28 @@ static inline void indirect_branch_prediction_barrier(void)
 	EMIT4(0x48, 0x89, 0x04, 0x24); /* mov %rax,(%rsp) */	\
 	EMIT1(0xC3);             /* retq */
 #else
+# define RETPOLINE_EDX_BPF_JIT()				\
+do {								\
+	EMIT1_off32(0xE8, 7);	 /* call do_rop */		\
+	/* spec_trap: */					\
+	EMIT2(0xF3, 0x90);       /* pause */			\
+	EMIT3(0x0F, 0xAE, 0xE8); /* lfence */			\
+	EMIT2(0xEB, 0xF9);       /* jmp spec_trap */		\
+	/* do_rop: */						\
+	EMIT3(0x89, 0x14, 0x24); /* mov %edx,(%esp) */		\
+	EMIT1(0xC3);             /* ret */			\
+} while (0)
+#endif
+#else /* !CONFIG_RETPOLINE */
+
+#ifdef CONFIG_X86_64
 # define RETPOLINE_RAX_BPF_JIT_SIZE	2
 # define RETPOLINE_RAX_BPF_JIT()				\
 	EMIT2(0xFF, 0xE0);	 /* jmp *%rax */
+#else
+# define RETPOLINE_EDX_BPF_JIT()				\
+	EMIT2(0xFF, 0xE2) /* jmp *%edx */
+#endif
 #endif
 
 #endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
diff --git a/arch/x86/net/Makefile b/arch/x86/net/Makefile
index fefb4b6..f54c9d4 100644
--- a/arch/x86/net/Makefile
+++ b/arch/x86/net/Makefile
@@ -1,6 +1,11 @@
 #
 # Arch-specific network modules
 #
-OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
 
-obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
+
+ifeq ($(CONFIG_X86_32),y)
+        obj-$(CONFIG_BPF_JIT) += bpf_jit_comp32.o
+else
+        OBJECT_FILES_NON_STANDARD_bpf_jit.o += y
+        obj-$(CONFIG_BPF_JIT) += bpf_jit.o bpf_jit_comp.o
+endif
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
new file mode 100644
index 0000000..25578b6
--- /dev/null
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -0,0 +1,2533 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Just-In-Time compiler for eBPF filters on IA32 (32bit x86)
+ *
+ * Author: Wang YanQing (udknight@gmail.com)
+ * The code based on code and ideas from:
+ * Eric Dumazet (eric.dumazet@gmail.com)
+ * and from:
+ * Shubham Bansal <illusionist.neo@gmail.com>
+ */
+
+#include <linux/netdevice.h>
+#include <linux/filter.h>
+#include <linux/if_vlan.h>
+#include <asm/cacheflush.h>
+#include <asm/set_memory.h>
+#include <asm/nospec-branch.h>
+#include <linux/bpf.h>
+
+/*
+ * eBPF prog stack layout:
+ *
+ *                         high
+ * original ESP =>        +-----+
+ *                        |     | callee saved registers
+ *                        +-----+
+ *                        | ... | eBPF JIT scratch space
+ * BPF_FP,IA32_EBP  =>    +-----+
+ *                        | ... | eBPF prog stack
+ *                        +-----+
+ *                        |RSVD | JIT scratchpad
+ * current ESP =>         +-----+
+ *                        |     |
+ *                        | ... | Function call stack
+ *                        |     |
+ *                        +-----+
+ *                          low
+ *
+ * The callee saved registers:
+ *
+ *                                high
+ * original ESP =>        +------------------+ \
+ *                        |        ebp       | |
+ * current EBP =>         +------------------+ } callee saved registers
+ *                        |    ebx,esi,edi   | |
+ *                        +------------------+ /
+ *                                low
+ */
+
+static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
+{
+	if (len == 1)
+		*ptr = bytes;
+	else if (len == 2)
+		*(u16 *)ptr = bytes;
+	else {
+		*(u32 *)ptr = bytes;
+		barrier();
+	}
+	return ptr + len;
+}
+
+#define EMIT(bytes, len) \
+	do { prog = emit_code(prog, bytes, len); cnt += len; } while (0)
+
+#define EMIT1(b1)		EMIT(b1, 1)
+#define EMIT2(b1, b2)		EMIT((b1) + ((b2) << 8), 2)
+#define EMIT3(b1, b2, b3)	EMIT((b1) + ((b2) << 8) + ((b3) << 16), 3)
+#define EMIT4(b1, b2, b3, b4)   \
+	EMIT((b1) + ((b2) << 8) + ((b3) << 16) + ((b4) << 24), 4)
+
+#define EMIT1_off32(b1, off) \
+	do {EMIT1(b1); EMIT(off, 4); } while (0)
+#define EMIT2_off32(b1, b2, off) \
+	do {EMIT2(b1, b2); EMIT(off, 4); } while (0)
+#define EMIT3_off32(b1, b2, b3, off) \
+	do {EMIT3(b1, b2, b3); EMIT(off, 4); } while (0)
+#define EMIT4_off32(b1, b2, b3, b4, off) \
+	do {EMIT4(b1, b2, b3, b4); EMIT(off, 4); } while (0)
+
+#define jmp_label(label, jmp_insn_len) (label - cnt - jmp_insn_len)
+
+static bool is_imm8(int value)
+{
+	return value <= 127 && value >= -128;
+}
+
+static bool is_simm32(s64 value)
+{
+	return value == (s64) (s32) value;
+}
+
+#define STACK_OFFSET(k)	(k)
+#define TCALL_CNT	(MAX_BPF_JIT_REG + 0)	/* Tail Call Count */
+
+#define IA32_EAX	(0x0)
+#define IA32_EBX	(0x3)
+#define IA32_ECX	(0x1)
+#define IA32_EDX	(0x2)
+#define IA32_ESI	(0x6)
+#define IA32_EDI	(0x7)
+#define IA32_EBP	(0x5)
+#define IA32_ESP	(0x4)
+
+/* list of x86 cond jumps opcodes (. + s8)
+ * Add 0x10 (and an extra 0x0f) to generate far jumps (. + s32)
+ */
+#define IA32_JB  0x72
+#define IA32_JAE 0x73
+#define IA32_JE  0x74
+#define IA32_JNE 0x75
+#define IA32_JBE 0x76
+#define IA32_JA  0x77
+#define IA32_JL  0x7C
+#define IA32_JGE 0x7D
+#define IA32_JLE 0x7E
+#define IA32_JG  0x7F
+
+/*
+ * Map eBPF registers to x86_32 32bit registers or stack scratch space.
+ *
+ * 1. All the registers, R0-R10, are mapped to scratch space on stack.
+ * 2. We need two 64 bit temp registers to do complex operations on eBPF
+ *    registers.
+ * 3. For performance reason, the BPF_REG_AX for blinding constant, is
+ *    mapped to real hardware register pair, IA32_ESI and IA32_EDI.
+ *
+ * As the eBPF registers are all 64 bit registers and x86_32 has only 32 bit
+ * registers, we have to map each eBPF registers with two x86_32 32 bit regs
+ * or scratch memory space and we have to build eBPF 64 bit register from those.
+ *
+ * We use IA32_EAX, IA32_EDX, IA32_ECX, IA32_EBX as temporary registers.
+ */
+static const u8 bpf2ia32[][2] = {
+	/* return value from in-kernel function, and exit value from eBPF */
+	[BPF_REG_0] = {STACK_OFFSET(0), STACK_OFFSET(4)},
+
+	/* arguments from eBPF program to in-kernel function */
+	/* Stored on stack scratch space */
+	[BPF_REG_1] = {STACK_OFFSET(8), STACK_OFFSET(12)},
+	[BPF_REG_2] = {STACK_OFFSET(16), STACK_OFFSET(20)},
+	[BPF_REG_3] = {STACK_OFFSET(24), STACK_OFFSET(28)},
+	[BPF_REG_4] = {STACK_OFFSET(32), STACK_OFFSET(36)},
+	[BPF_REG_5] = {STACK_OFFSET(40), STACK_OFFSET(44)},
+
+	/* callee saved registers that in-kernel function will preserve */
+	/* Stored on stack scratch space */
+	[BPF_REG_6] = {STACK_OFFSET(48), STACK_OFFSET(52)},
+	[BPF_REG_7] = {STACK_OFFSET(56), STACK_OFFSET(60)},
+	[BPF_REG_8] = {STACK_OFFSET(64), STACK_OFFSET(68)},
+	[BPF_REG_9] = {STACK_OFFSET(72), STACK_OFFSET(76)},
+
+	/* Read only Frame Pointer to access Stack */
+	[BPF_REG_FP] = {STACK_OFFSET(80), STACK_OFFSET(84)},
+
+	/* temporary register for blinding constants. */
+	[BPF_REG_AX] = {IA32_ESI, IA32_EDI},
+
+	/* Tail call count. Stored on stack scratch space. */
+	[TCALL_CNT] = {STACK_OFFSET(88), STACK_OFFSET(92)},
+};
+
+#define dst_lo	dst[0]
+#define dst_hi	dst[1]
+#define src_lo	src[0]
+#define src_hi	src[1]
+
+#define STACK_ALIGNMENT	8
+/* Stack space for BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4,
+ * BPF_REG_5, BPF_REG_6, BPF_REG_7, BPF_REG_8, BPF_REG_9,
+ * BPF_REG_FP, BPF_REG_AX and Tail call counts.
+ */
+#define SCRATCH_SIZE 96
+
+/* total stack size used in JITed code */
+#define _STACK_SIZE \
+	(stack_depth + \
+	 + SCRATCH_SIZE + \
+	 + 4 /* extra for skb_copy_bits buffer */)
+
+#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
+
+/* Get the offset of eBPF REGISTERs stored on scratch space. */
+#define STACK_VAR(off) (off)
+
+/* Offset of skb_copy_bits buffer */
+#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)
+
+/* encode 'dst_reg' register into x86_32 opcode 'byte' */
+static u8 add_1reg(u8 byte, u32 dst_reg)
+{
+	return byte + dst_reg;
+}
+
+/* encode 'dst_reg' and 'src_reg' registers into x86_32 opcode 'byte' */
+static u8 add_2reg(u8 byte, u32 dst_reg, u32 src_reg)
+{
+	return byte + dst_reg + (src_reg << 3);
+}
+
+static void jit_fill_hole(void *area, unsigned int size)
+{
+	/* fill whole space with int3 instructions */
+	memset(area, 0xcc, size);
+}
+
+/* Checks whether BPF register is on scratch stack space or not. */
+static inline bool is_on_stack(u8 bpf_reg)
+{
+	static u8 stack_regs[] = {BPF_REG_AX};
+	int i, reg_len = sizeof(stack_regs);
+
+	for (i = 0 ; i < reg_len ; i++) {
+		if (bpf_reg == stack_regs[i])
+			return false;
+	}
+	return true;
+}
+
+static inline void emit_ia32_mov_i(const u8 dst, const u32 val, bool dstk,
+				   u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	if (dstk) {
+		if (val == 0) {
+			/* xor eax,eax */
+			EMIT2(0x33, add_2reg(0xC0, IA32_EAX, IA32_EAX));
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(dst));
+		} else {
+			EMIT3_off32(0xC7, add_1reg(0x40, IA32_EBP),
+				    STACK_VAR(dst), val);
+		}
+	} else {
+		if (val == 0)
+			EMIT2(0x33, add_2reg(0xC0, dst, dst));
+		else
+			EMIT2_off32(0xC7, add_1reg(0xC0, dst),
+				    val);
+	}
+	*pprog = prog;
+}
+
+/* dst = imm (4 bytes)*/
+static inline void emit_ia32_mov_r(const u8 dst, const u8 src, bool dstk,
+				   bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 sreg = sstk ? IA32_EAX : src;
+
+	if (sstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(src));
+	if (dstk)
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, sreg), STACK_VAR(dst));
+	else
+		/* mov dst,sreg */
+		EMIT2(0x89, add_2reg(0xC0, dst, sreg));
+
+	*pprog = prog;
+}
+
+/* dst = src */
+static inline void emit_ia32_mov_r64(const bool is64, const u8 dst[],
+				     const u8 src[], bool dstk,
+				     bool sstk, u8 **pprog)
+{
+	emit_ia32_mov_r(dst_lo, src_lo, dstk, sstk, pprog);
+	if (is64)
+		/* complete 8 byte move */
+		emit_ia32_mov_r(dst_hi, src_hi, dstk, sstk, pprog);
+	else
+		/* zero out high 4 bytes */
+		emit_ia32_mov_i(dst_hi, 0, dstk, pprog);
+}
+
+/* Sign extended move */
+static inline void emit_ia32_mov_i64(const bool is64, const u8 dst[],
+				     const u32 val, bool dstk, u8 **pprog)
+{
+	u32 hi = 0;
+
+	if (is64 && (val & (1<<31)))
+		hi = (u32)~0;
+	emit_ia32_mov_i(dst_lo, val, dstk, pprog);
+	emit_ia32_mov_i(dst_hi, hi, dstk, pprog);
+}
+
+/* ALU operation (32 bit)
+ * dst = dst * src
+ */
+static inline void emit_ia32_mul_r(const u8 dst, const u8 src, bool dstk,
+				   bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 sreg = sstk ? IA32_ECX : src;
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(dst));
+	else
+		/* mov eax,dst */
+		EMIT2(0x8B, add_2reg(0xC0, dst, IA32_EAX));
+
+
+	EMIT2(0xF7, add_1reg(0xE0, sreg));
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst));
+	else
+		/* mov dst,eax */
+		EMIT2(0x89, add_2reg(0xC0, dst, IA32_EAX));
+
+	*pprog = prog;
+}
+
+static inline void emit_ia32_to_le_r64(const u8 dst[], s32 val,
+					 bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk && val != 64) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	switch (val) {
+	case 16:
+		/* emit 'movzwl eax,ax' to zero extend 16-bit
+		 * into 64 bit
+		 */
+		EMIT2(0x0F, 0xB7);
+		EMIT1(add_2reg(0xC0, dreg_lo, dreg_lo));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 32:
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 64:
+		/* nop */
+		break;
+	}
+
+	if (dstk && val != 64) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+static inline void emit_ia32_to_be_r64(const u8 dst[], s32 val,
+				       bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	switch (val) {
+	case 16:
+		/* emit 'ror %ax, 8' to swap lower 2 bytes */
+		EMIT1(0x66);
+		EMIT3(0xC1, add_1reg(0xC8, dreg_lo), 8);
+
+		EMIT2(0x0F, 0xB7);
+		EMIT1(add_2reg(0xC0, dreg_lo, dreg_lo));
+
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 32:
+		/* emit 'bswap eax' to swap lower 4 bytes */
+		EMIT1(0x0F);
+		EMIT1(add_1reg(0xC8, dreg_lo));
+
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+		break;
+	case 64:
+		/* emit 'bswap eax' to swap lower 4 bytes */
+		EMIT1(0x0F);
+		EMIT1(add_1reg(0xC8, dreg_lo));
+
+		/* emit 'bswap edx' to swap lower 4 bytes */
+		EMIT1(0x0F);
+		EMIT1(add_1reg(0xC8, dreg_hi));
+
+		/* mov ecx,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, IA32_ECX, dreg_hi));
+		/* mov dreg_hi,dreg_lo */
+		EMIT2(0x89, add_2reg(0xC0, dreg_hi, dreg_lo));
+		/* mov dreg_lo,ecx */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, IA32_ECX));
+
+		break;
+	}
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (div|mod) src
+ */
+static inline void emit_ia32_div_mod_r(const u8 op, const u8 dst, const u8 src,
+				       bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src));
+	else if (src != IA32_ECX)
+		/* mov ecx,src */
+		EMIT2(0x8B, add_2reg(0xC0, src, IA32_ECX));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst));
+	else
+		/* mov eax,dst */
+		EMIT2(0x8B, add_2reg(0xC0, dst, IA32_EAX));
+
+	/* xor edx,edx */
+	EMIT2(0x31, add_2reg(0xC0, IA32_EDX, IA32_EDX));
+	/* div ecx */
+	EMIT2(0xF7, add_1reg(0xF0, IA32_ECX));
+
+	if (op == BPF_MOD) {
+		if (dstk)
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(dst));
+		else
+			EMIT2(0x89, add_2reg(0xC0, dst, IA32_EDX));
+	} else {
+		if (dstk)
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(dst));
+		else
+			EMIT2(0x89, add_2reg(0xC0, dst, IA32_EAX));
+	}
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (shift) src
+ */
+static inline void emit_ia32_shift_r(const u8 op, const u8 dst, const u8 src,
+				     bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg = dstk ? IA32_EAX : dst;
+	u8 b2;
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(dst));
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src));
+	else if (src != IA32_ECX)
+		/* mov ecx,src */
+		EMIT2(0x8B, add_2reg(0xC0, src, IA32_ECX));
+
+	switch (op) {
+	case BPF_LSH:
+		b2 = 0xE0; break;
+	case BPF_RSH:
+		b2 = 0xE8; break;
+	case BPF_ARSH:
+		b2 = 0xF8; break;
+	default:
+		return;
+	}
+	EMIT2(0xD3, add_1reg(b2, dreg));
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],dreg */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg), STACK_VAR(dst));
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (op) src
+ */
+static inline void emit_ia32_alu_r(const bool is64, const bool hi, const u8 op,
+				   const u8 dst, const u8 src, bool dstk,
+				   bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 sreg = sstk ? IA32_EAX : src;
+	u8 dreg = dstk ? IA32_EDX : dst;
+
+	if (sstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(src));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(dst));
+
+	switch (BPF_OP(op)) {
+	/* dst = dst + src */
+	case BPF_ADD:
+		if (hi && is64)
+			EMIT2(0x11, add_2reg(0xC0, dreg, sreg));
+		else
+			EMIT2(0x01, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst - src */
+	case BPF_SUB:
+		if (hi && is64)
+			EMIT2(0x19, add_2reg(0xC0, dreg, sreg));
+		else
+			EMIT2(0x29, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst | src */
+	case BPF_OR:
+		EMIT2(0x09, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst & src */
+	case BPF_AND:
+		EMIT2(0x21, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst ^ src */
+	case BPF_XOR:
+		EMIT2(0x31, add_2reg(0xC0, dreg, sreg));
+		break;
+	}
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],dreg */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg),
+		      STACK_VAR(dst));
+	*pprog = prog;
+}
+
+/* ALU operation (64 bit) */
+static inline void emit_ia32_alu_r64(const bool is64, const u8 op,
+				     const u8 dst[], const u8 src[],
+				     bool dstk,  bool sstk,
+				     u8 **pprog)
+{
+	u8 *prog = *pprog;
+
+	emit_ia32_alu_r(is64, false, op, dst_lo, src_lo, dstk, sstk, &prog);
+	if (is64)
+		emit_ia32_alu_r(is64, true, op, dst_hi, src_hi, dstk, sstk,
+				&prog);
+	else
+		emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+	*pprog = prog;
+}
+
+/* ALU operation (32 bit)
+ * dst = dst (op) val
+ */
+static inline void emit_ia32_alu_i(const bool is64, const bool hi, const u8 op,
+				   const u8 dst, const s32 val, bool dstk,
+				   u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg = dstk ? IA32_EAX : dst;
+	u8 sreg = IA32_EDX;
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(dst));
+
+	if (!is_imm8(val))
+		/* mov edx,imm32*/
+		EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EDX), val);
+
+	switch (op) {
+	/* dst = dst + val */
+	case BPF_ADD:
+		if (hi && is64) {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xD0, dreg), val);
+			else
+				EMIT2(0x11, add_2reg(0xC0, dreg, sreg));
+		} else {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xC0, dreg), val);
+			else
+				EMIT2(0x01, add_2reg(0xC0, dreg, sreg));
+		}
+		break;
+	/* dst = dst - val */
+	case BPF_SUB:
+		if (hi && is64) {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xD8, dreg), val);
+			else
+				EMIT2(0x19, add_2reg(0xC0, dreg, sreg));
+		} else {
+			if (is_imm8(val))
+				EMIT3(0x83, add_1reg(0xE8, dreg), val);
+			else
+				EMIT2(0x29, add_2reg(0xC0, dreg, sreg));
+		}
+		break;
+	/* dst = dst | val */
+	case BPF_OR:
+		if (is_imm8(val))
+			EMIT3(0x83, add_1reg(0xC8, dreg), val);
+		else
+			EMIT2(0x09, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst & val */
+	case BPF_AND:
+		if (is_imm8(val))
+			EMIT3(0x83, add_1reg(0xE0, dreg), val);
+		else
+			EMIT2(0x21, add_2reg(0xC0, dreg, sreg));
+		break;
+	/* dst = dst ^ val */
+	case BPF_XOR:
+		if (is_imm8(val))
+			EMIT3(0x83, add_1reg(0xF0, dreg), val);
+		else
+			EMIT2(0x31, add_2reg(0xC0, dreg, sreg));
+		break;
+	case BPF_NEG:
+		EMIT2(0xF7, add_1reg(0xD8, dreg));
+		break;
+	}
+
+	if (dstk)
+		/* mov dword ptr [ebp+off],dreg */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg),
+		      STACK_VAR(dst));
+	*pprog = prog;
+}
+
+/* ALU operation (64 bit) */
+static inline void emit_ia32_alu_i64(const bool is64, const u8 op,
+				     const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	u32 hi = 0;
+
+	if (is64 && (val & (1<<31)))
+		hi = (u32)~0;
+
+	emit_ia32_alu_i(is64, false, op, dst_lo, val, dstk, &prog);
+	if (is64)
+		emit_ia32_alu_i(is64, true, op, dst_hi, hi, dstk, &prog);
+	else
+		emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+
+	*pprog = prog;
+}
+
+/* dst = ~dst (64 bit) */
+static inline void emit_ia32_neg64(const u8 dst[], bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	/* xor ecx,ecx */
+	EMIT2(0x31, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+	/* sub dreg_lo,ecx */
+	EMIT2(0x2B, add_2reg(0xC0, dreg_lo, IA32_ECX));
+	/* mov dreg_lo,ecx */
+	EMIT2(0x89, add_2reg(0xC0, dreg_lo, IA32_ECX));
+
+	/* xor ecx,ecx */
+	EMIT2(0x31, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+	/* sbb dreg_hi,ecx */
+	EMIT2(0x19, add_2reg(0xC0, dreg_hi, IA32_ECX));
+	/* mov dreg_hi,ecx */
+	EMIT2(0x89, add_2reg(0xC0, dreg_hi, IA32_ECX));
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* dst = dst << src */
+static inline void emit_ia32_lsh_r64(const u8 dst[], const u8 src[],
+				     bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	static int jmp_label1 = -1;
+	static int jmp_label2 = -1;
+	static int jmp_label3 = -1;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src_lo));
+	else
+		/* mov ecx,src_lo */
+		EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_ECX));
+
+	/* cmp ecx,32 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 32);
+	/* jumps when >= 32 */
+	if (is_imm8(jmp_label(jmp_label1, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label1, 6));
+
+	/* < 32 */
+	/* shl dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xE0, dreg_hi));
+	/* mov ebx,dreg_lo */
+	EMIT2(0x8B, add_2reg(0xC0, dreg_lo, IA32_EBX));
+	/* shl dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE0, dreg_lo));
+
+	/* IA32_ECX = -IA32_ECX + 32 */
+	/* neg ecx */
+	EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+	/* add ecx,32 */
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+	/* shr ebx,cl */
+	EMIT2(0xD3, add_1reg(0xE8, IA32_EBX));
+	/* or dreg_hi,ebx */
+	EMIT2(0x09, add_2reg(0xC0, dreg_hi, IA32_EBX));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 32 */
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+
+	/* cmp ecx,64 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 64);
+	/* jumps when >= 64 */
+	if (is_imm8(jmp_label(jmp_label2, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label2, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label2, 6));
+
+	/* >= 32 && < 64 */
+	/* sub ecx,32 */
+	EMIT3(0x83, add_1reg(0xE8, IA32_ECX), 32);
+	/* shl dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE0, dreg_lo));
+	/* mov dreg_hi,dreg_lo */
+	EMIT2(0x89, add_2reg(0xC0, dreg_hi, dreg_lo));
+
+	/* xor dreg_lo,dreg_lo */
+	EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 64 */
+	if (jmp_label2 == -1)
+		jmp_label2 = cnt;
+	/* xor dreg_lo,dreg_lo */
+	EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+	/* xor dreg_hi,dreg_hi */
+	EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+
+	if (jmp_label3 == -1)
+		jmp_label3 = cnt;
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	/* out: */
+	*pprog = prog;
+}
+
+/* dst = dst >> src (signed)*/
+static inline void emit_ia32_arsh_r64(const u8 dst[], const u8 src[],
+				      bool dstk, bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	static int jmp_label1 = -1;
+	static int jmp_label2 = -1;
+	static int jmp_label3 = -1;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src_lo));
+	else
+		/* mov ecx,src_lo */
+		EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_ECX));
+
+	/* cmp ecx,32 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 32);
+	/* jumps when >= 32 */
+	if (is_imm8(jmp_label(jmp_label1, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label1, 6));
+
+	/* < 32 */
+	/* lshr dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_lo));
+	/* mov ebx,dreg_hi */
+	EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+	/* ashr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xF8, dreg_hi));
+
+	/* IA32_ECX = -IA32_ECX + 32 */
+	/* neg ecx */
+	EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+	/* add ecx,32 */
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+	/* shl ebx,cl */
+	EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+	/* or dreg_lo,ebx */
+	EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 32 */
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+
+	/* cmp ecx,64 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 64);
+	/* jumps when >= 64 */
+	if (is_imm8(jmp_label(jmp_label2, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label2, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label2, 6));
+
+	/* >= 32 && < 64 */
+	/* sub ecx,32 */
+	EMIT3(0x83, add_1reg(0xE8, IA32_ECX), 32);
+	/* ashr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xF8, dreg_hi));
+	/* mov dreg_lo,dreg_hi */
+	EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+
+	/* ashr dreg_hi,imm8 */
+	EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 64 */
+	if (jmp_label2 == -1)
+		jmp_label2 = cnt;
+	/* ashr dreg_hi,imm8 */
+	EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+	/* mov dreg_lo,dreg_hi */
+	EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+
+	if (jmp_label3 == -1)
+		jmp_label3 = cnt;
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	/* out: */
+	*pprog = prog;
+}
+
+/* dst = dst >> src */
+static inline void emit_ia32_rsh_r64(const u8 dst[], const u8 src[], bool dstk,
+				     bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	static int jmp_label1 = -1;
+	static int jmp_label2 = -1;
+	static int jmp_label3 = -1;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	if (sstk)
+		/* mov ecx,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(src_lo));
+	else
+		/* mov ecx,src_lo */
+		EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_ECX));
+
+	/* cmp ecx,32 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 32);
+	/* jumps when >= 32 */
+	if (is_imm8(jmp_label(jmp_label1, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label1, 6));
+
+	/* < 32 */
+	/* lshr dreg_lo,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_lo));
+	/* mov ebx,dreg_hi */
+	EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+	/* shr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_hi));
+
+	/* IA32_ECX = -IA32_ECX + 32 */
+	/* neg ecx */
+	EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+	/* add ecx,32 */
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+	/* shl ebx,cl */
+	EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+	/* or dreg_lo,ebx */
+	EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 32 */
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+	/* cmp ecx,64 */
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), 64);
+	/* jumps when >= 64 */
+	if (is_imm8(jmp_label(jmp_label2, 2)))
+		EMIT2(IA32_JAE, jmp_label(jmp_label2, 2));
+	else
+		EMIT2_off32(0x0F, IA32_JAE + 0x10, jmp_label(jmp_label2, 6));
+
+	/* >= 32 && < 64 */
+	/* sub ecx,32 */
+	EMIT3(0x83, add_1reg(0xE8, IA32_ECX), 32);
+	/* shr dreg_hi,cl */
+	EMIT2(0xD3, add_1reg(0xE8, dreg_hi));
+	/* mov dreg_lo,dreg_hi */
+	EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+	/* xor dreg_hi,dreg_hi */
+	EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+
+	/* goto out; */
+	if (is_imm8(jmp_label(jmp_label3, 2)))
+		EMIT2(0xEB, jmp_label(jmp_label3, 2));
+	else
+		EMIT1_off32(0xE9, jmp_label(jmp_label3, 5));
+
+	/* >= 64 */
+	if (jmp_label2 == -1)
+		jmp_label2 = cnt;
+	/* xor dreg_lo,dreg_lo */
+	EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+	/* xor dreg_hi,dreg_hi */
+	EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+
+	if (jmp_label3 == -1)
+		jmp_label3 = cnt;
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	/* out: */
+	*pprog = prog;
+}
+
+/* dst = dst << val */
+static inline void emit_ia32_lsh_i64(const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	/* Do LSH operation */
+	if (val < 32) {
+		/* shl dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xE0, dreg_hi), val);
+		/* mov ebx,dreg_lo */
+		EMIT2(0x8B, add_2reg(0xC0, dreg_lo, IA32_EBX));
+		/* shl dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE0, dreg_lo), val);
+
+		/* IA32_ECX = 32 - val */
+		/* mov ecx,val */
+		EMIT2(0xB1, val);
+		/* movzx ecx,ecx */
+		EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+		/* neg ecx */
+		EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+		/* add ecx,32 */
+		EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+		/* shr ebx,cl */
+		EMIT2(0xD3, add_1reg(0xE8, IA32_EBX));
+		/* or dreg_hi,ebx */
+		EMIT2(0x09, add_2reg(0xC0, dreg_hi, IA32_EBX));
+	} else if (val >= 32 && val < 64) {
+		u32 value = val - 32;
+
+		/* shl dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE0, dreg_lo), value);
+		/* mov dreg_hi,dreg_lo */
+		EMIT2(0x89, add_2reg(0xC0, dreg_hi, dreg_lo));
+		/* xor dreg_lo,dreg_lo */
+		EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+	} else {
+		/* xor dreg_lo,dreg_lo */
+		EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+	}
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* dst = dst >> val */
+static inline void emit_ia32_rsh_i64(const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+
+	/* Do RSH operation */
+	if (val < 32) {
+		/* shr dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_lo), val);
+		/* mov ebx,dreg_hi */
+		EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+		/* shr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_hi), val);
+
+		/* IA32_ECX = 32 - val */
+		/* mov ecx,val */
+		EMIT2(0xB1, val);
+		/* movzx ecx,ecx */
+		EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+		/* neg ecx */
+		EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+		/* add ecx,32 */
+		EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+		/* shl ebx,cl */
+		EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+		/* or dreg_lo,ebx */
+		EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+	} else if (val >= 32 && val < 64) {
+		u32 value = val - 32;
+
+		/* shr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_hi), value);
+		/* mov dreg_lo,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+	} else {
+		/* xor dreg_lo,dreg_lo */
+		EMIT2(0x33, add_2reg(0xC0, dreg_lo, dreg_lo));
+		/* xor dreg_hi,dreg_hi */
+		EMIT2(0x33, add_2reg(0xC0, dreg_hi, dreg_hi));
+	}
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+/* dst = dst >> val (signed) */
+static inline void emit_ia32_arsh_i64(const u8 dst[], const u32 val,
+				      bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+	u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+
+	if (dstk) {
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+		      STACK_VAR(dst_hi));
+	}
+	/* Do RSH operation */
+	if (val < 32) {
+		/* shr dreg_lo,imm8 */
+		EMIT3(0xC1, add_1reg(0xE8, dreg_lo), val);
+		/* mov ebx,dreg_hi */
+		EMIT2(0x8B, add_2reg(0xC0, dreg_hi, IA32_EBX));
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), val);
+
+		/* IA32_ECX = 32 - val */
+		/* mov ecx,val */
+		EMIT2(0xB1, val);
+		/* movzx ecx,ecx */
+		EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+		/* neg ecx */
+		EMIT2(0xF7, add_1reg(0xD8, IA32_ECX));
+		/* add ecx,32 */
+		EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 32);
+
+		/* shl ebx,cl */
+		EMIT2(0xD3, add_1reg(0xE0, IA32_EBX));
+		/* or dreg_lo,ebx */
+		EMIT2(0x09, add_2reg(0xC0, dreg_lo, IA32_EBX));
+	} else if (val >= 32 && val < 64) {
+		u32 value = val - 32;
+
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), value);
+		/* mov dreg_lo,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+	} else {
+		/* ashr dreg_hi,imm8 */
+		EMIT3(0xC1, add_1reg(0xF8, dreg_hi), 31);
+		/* mov dreg_lo,dreg_hi */
+		EMIT2(0x89, add_2reg(0xC0, dreg_lo, dreg_hi));
+	}
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],dreg_lo */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_lo),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],dreg_hi */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, dreg_hi),
+		      STACK_VAR(dst_hi));
+	}
+	*pprog = prog;
+}
+
+static inline void emit_ia32_mul_r64(const u8 dst[], const u8 src[], bool dstk,
+				     bool sstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_hi));
+	else
+		/* mov eax,dst_hi */
+		EMIT2(0x8B, add_2reg(0xC0, dst_hi, IA32_EAX));
+
+	if (sstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(src_lo));
+	else
+		/* mul src_lo */
+		EMIT2(0xF7, add_1reg(0xE0, src_lo));
+
+	/* mov ecx,eax */
+	EMIT2(0x89, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+	else
+		/* mov eax,dst_lo */
+		EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+	if (sstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(src_hi));
+	else
+		/* mul src_hi */
+		EMIT2(0xF7, add_1reg(0xE0, src_hi));
+
+	/* add eax,eax */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	if (dstk)
+		/* mov eax,dword ptr [ebp+off] */
+		EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+	else
+		/* mov eax,dst_lo */
+		EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+	if (sstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(src_lo));
+	else
+		/* mul src_lo */
+		EMIT2(0xF7, add_1reg(0xE0, src_lo));
+
+	/* add ecx,edx */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EDX));
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],ecx */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(dst_hi));
+	} else {
+		/* mov dst_lo,eax */
+		EMIT2(0x89, add_2reg(0xC0, dst_lo, IA32_EAX));
+		/* mov dst_hi,ecx */
+		EMIT2(0x89, add_2reg(0xC0, dst_hi, IA32_ECX));
+	}
+
+	*pprog = prog;
+}
+
+static inline void emit_ia32_mul_i64(const u8 dst[], const u32 val,
+				     bool dstk, u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	u32 hi;
+
+	hi = val & (1<<31) ? (u32)~0 : 0;
+	/* movl eax,imm32 */
+	EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EAX), val);
+	if (dstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(dst_hi));
+	else
+		/* mul dst_hi */
+		EMIT2(0xF7, add_1reg(0xE0, dst_hi));
+
+	/* mov ecx,eax */
+	EMIT2(0x89, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	/* movl eax,imm32 */
+	EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EAX), hi);
+	if (dstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(dst_lo));
+	else
+		/* mul dst_lo */
+		EMIT2(0xF7, add_1reg(0xE0, dst_lo));
+	/* add ecx,eax */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EAX));
+
+	/* movl eax,imm32 */
+	EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EAX), val);
+	if (dstk)
+		/* mul dword ptr [ebp+off] */
+		EMIT3(0xF7, add_1reg(0x60, IA32_EBP), STACK_VAR(dst_lo));
+	else
+		/* mul dst_lo */
+		EMIT2(0xF7, add_1reg(0xE0, dst_lo));
+
+	/* add ecx,edx */
+	EMIT2(0x01, add_2reg(0xC0, IA32_ECX, IA32_EDX));
+
+	if (dstk) {
+		/* mov dword ptr [ebp+off],eax */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+		      STACK_VAR(dst_lo));
+		/* mov dword ptr [ebp+off],ecx */
+		EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_ECX),
+		      STACK_VAR(dst_hi));
+	} else {
+		/* mov dword ptr [ebp+off],eax */
+		EMIT2(0x89, add_2reg(0xC0, dst_lo, IA32_EAX));
+		/* mov dword ptr [ebp+off],ecx */
+		EMIT2(0x89, add_2reg(0xC0, dst_hi, IA32_ECX));
+	}
+
+	*pprog = prog;
+}
+
+static int bpf_size_to_x86_bytes(int bpf_size)
+{
+	if (bpf_size == BPF_W)
+		return 4;
+	else if (bpf_size == BPF_H)
+		return 2;
+	else if (bpf_size == BPF_B)
+		return 1;
+	else if (bpf_size == BPF_DW)
+		return 4; /* imm32 */
+	else
+		return 0;
+}
+
+struct jit_context {
+	int cleanup_addr; /* epilogue code offset */
+};
+
+/* maximum number of bytes emitted while JITing one eBPF insn */
+#define BPF_MAX_INSN_SIZE	128
+#define BPF_INSN_SAFETY		64
+
+#define PROLOGUE_SIZE 35
+
+/* emit prologue code for BPF program and check it's size.
+ * bpf_tail_call helper will skip it while jumping into another program
+ */
+static void emit_prologue(u8 **pprog, u32 stack_depth)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	const u8 *r1 = bpf2ia32[BPF_REG_1];
+	const u8 fplo = bpf2ia32[BPF_REG_FP][0];
+	const u8 fphi = bpf2ia32[BPF_REG_FP][1];
+	const u8 *tcc = bpf2ia32[TCALL_CNT];
+
+	/* push ebp */
+	EMIT1(0x55);
+	/* mov ebp,esp */
+	EMIT2(0x89, 0xE5);
+	/* push edi */
+	EMIT1(0x57);
+	/* push esi */
+	EMIT1(0x56);
+	/* push ebx */
+	EMIT1(0x53);
+
+	/* sub esp,STACK_SIZE */
+	EMIT2_off32(0x81, 0xEC, STACK_SIZE);
+	/* sub ebp,SCRATCH_SIZE+4+12*/
+	EMIT3(0x83, add_1reg(0xE8, IA32_EBP), SCRATCH_SIZE + 16);
+	/* xor ebx,ebx */
+	EMIT2(0x31, add_2reg(0xC0, IA32_EBX, IA32_EBX));
+
+	/* Set up BPF prog stack base register */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBP), STACK_VAR(fplo));
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(fphi));
+
+	/* Move BPF_CTX (EAX) to BPF_REG_R1 */
+	/* mov dword ptr [ebp+off],eax */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r1[0]));
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(r1[1]));
+
+	/* Initialize Tail Count */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[0]));
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[1]));
+
+	BUILD_BUG_ON(cnt != PROLOGUE_SIZE);
+	*pprog = prog;
+}
+
+/* Emit epilogue code for BPF program */
+static void emit_epilogue(u8 **pprog, u32 stack_depth)
+{
+	u8 *prog = *pprog;
+	const u8 *r0 = bpf2ia32[BPF_REG_0];
+	int cnt = 0;
+
+	/* mov eax,dword ptr [ebp+off]*/
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r0[0]));
+	/* mov edx,dword ptr [ebp+off]*/
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(r0[1]));
+
+	/* add ebp,SCRATCH_SIZE+4+12*/
+	EMIT3(0x83, add_1reg(0xC0, IA32_EBP), SCRATCH_SIZE + 16);
+
+	/* mov ebx,dword ptr [ebp-12]*/
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX), -12);
+	/* mov esi,dword ptr [ebp-8]*/
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ESI), -8);
+	/* mov edi,dword ptr [ebp-4]*/
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDI), -4);
+
+	EMIT1(0xC9); /* leave */
+	EMIT1(0xC3); /* ret */
+	*pprog = prog;
+}
+
+/* generate the following code:
+ * ... bpf_tail_call(void *ctx, struct bpf_array *array, u64 index) ...
+ *   if (index >= array->map.max_entries)
+ *     goto out;
+ *   if (++tail_call_cnt > MAX_TAIL_CALL_CNT)
+ *     goto out;
+ *   prog = array->ptrs[index];
+ *   if (prog == NULL)
+ *     goto out;
+ *   goto *(prog->bpf_func + prologue_size);
+ * out:
+ */
+static void emit_bpf_tail_call(u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+	const u8 *r1 = bpf2ia32[BPF_REG_1];
+	const u8 *r2 = bpf2ia32[BPF_REG_2];
+	const u8 *r3 = bpf2ia32[BPF_REG_3];
+	const u8 *tcc = bpf2ia32[TCALL_CNT];
+	u32 lo, hi;
+	static int jmp_label1 = -1;
+
+	/* if (index >= array->map.max_entries)
+	 *   goto out;
+	 */
+	/* mov eax,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r2[0]));
+	/* mov edx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(r3[0]));
+
+	/* cmp dword ptr [eax+off],edx */
+	EMIT3(0x39, add_2reg(0x40, IA32_EAX, IA32_EDX),
+	      offsetof(struct bpf_array, map.max_entries));
+	/* jbe out */
+	EMIT2(IA32_JBE, jmp_label(jmp_label1, 2));
+
+	/* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
+	 *   goto out;
+	 */
+	lo = (u32)MAX_TAIL_CALL_CNT;
+	hi = (u32)((u64)MAX_TAIL_CALL_CNT >> 32);
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(tcc[0]));
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[1]));
+
+	EMIT3(0x83, add_1reg(0xF8, IA32_EBX), hi);   /* cmp edx, hi */
+	EMIT2(IA32_JNE, 3);
+	EMIT3(0x83, add_1reg(0xF8, IA32_ECX), lo);   /* cmp ecx, lo */
+
+	EMIT2(IA32_JAE, jmp_label(jmp_label1, 2));   /* ja out */
+
+	EMIT3(0x83, add_1reg(0xC0, IA32_ECX), 0x01);   /* add eax, 0x1 */
+	EMIT3(0x83, add_1reg(0xD0, IA32_EBX), 0x00);   /* adc ebx, 0x0 */
+
+	/* mov dword ptr [ebp+off],eax */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(tcc[0]));
+	/* mov dword ptr [ebp+off],edx */
+	EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EBX), STACK_VAR(tcc[1]));
+
+	/* prog = array->ptrs[index]; */
+	/* mov edx, [eax + edx * 4 + offsetof(...)] */
+	EMIT3_off32(0x8B, 0x94, 0x90, offsetof(struct bpf_array, ptrs));
+
+	/* if (prog == NULL)
+	 *   goto out;
+	 */
+	EMIT2(0x85, add_2reg(0xC0, IA32_EDX, IA32_EDX)); /* test edx,edx */
+	EMIT2(IA32_JE, jmp_label(jmp_label1, 2)); /* je out */
+
+	/* goto *(prog->bpf_func + prologue_size); */
+	/* mov edx, dword ptr [edx + 32] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EDX, IA32_EDX),
+	      offsetof(struct bpf_prog, bpf_func));
+	/* add edx, prologue_size */
+	EMIT3(0x83, add_1reg(0xC0, IA32_EDX), PROLOGUE_SIZE);
+
+	/* mov eax,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX), STACK_VAR(r1[0]));
+
+	/* now we're ready to jump into next BPF program
+	 * eax == ctx (1st arg)
+	 * edx == prog->bpf_func + prologue_size
+	 */
+	RETPOLINE_EDX_BPF_JIT();
+
+	if (jmp_label1 == -1)
+		jmp_label1 = cnt;
+
+	/* out: */
+	*pprog = prog;
+}
+
+// push the scratch stack register on top of the stack
+static inline void emit_push_r64(const u8 src[], u8 **pprog)
+{
+	u8 *prog = *pprog;
+	int cnt = 0;
+
+	/* mov ecx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src_hi));
+	/* push ecx */
+	EMIT1(0x51);
+
+	/* mov ecx,dword ptr [ebp+off] */
+	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src_lo));
+	/* push ecx */
+	EMIT1(0x51);
+
+	*pprog = prog;
+}
+
+static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
+		  int oldproglen, struct jit_context *ctx)
+{
+	struct bpf_insn *insn = bpf_prog->insnsi;
+	int insn_cnt = bpf_prog->len;
+	bool seen_exit = false;
+	u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
+	int i, cnt = 0;
+	int proglen = 0;
+	u8 *prog = temp;
+
+	emit_prologue(&prog, bpf_prog->aux->stack_depth);
+
+	for (i = 0; i < insn_cnt; i++, insn++) {
+		const s32 imm32 = insn->imm;
+		const bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
+		const bool dstk = is_on_stack(insn->dst_reg);
+		const bool sstk = is_on_stack(insn->src_reg);
+		const u8 code = insn->code;
+		const u8 *dst = bpf2ia32[insn->dst_reg];
+		const u8 *src = bpf2ia32[insn->src_reg];
+		const u8 *r0 = bpf2ia32[BPF_REG_0];
+		s64 jmp_offset;
+		u8 jmp_cond;
+		int ilen;
+		u8 *func;
+
+		switch (code) {
+		/* ALU operations */
+		/* dst = src */
+		case BPF_ALU | BPF_MOV | BPF_K:
+		case BPF_ALU | BPF_MOV | BPF_X:
+		case BPF_ALU64 | BPF_MOV | BPF_K:
+		case BPF_ALU64 | BPF_MOV | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_mov_r64(is64, dst, src, dstk,
+						  sstk, &prog);
+				break;
+			case BPF_K:
+				/* Sign-extend immediate value to dst reg */
+				emit_ia32_mov_i64(is64, dst, imm32,
+						  dstk, &prog);
+				break;
+			}
+			break;
+		/* dst = dst + src/imm */
+		/* dst = dst - src/imm */
+		/* dst = dst | src/imm */
+		/* dst = dst & src/imm */
+		/* dst = dst ^ src/imm */
+		/* dst = dst * src/imm */
+		/* dst = dst << src */
+		/* dst = dst >> src */
+		case BPF_ALU | BPF_ADD | BPF_K:
+		case BPF_ALU | BPF_ADD | BPF_X:
+		case BPF_ALU | BPF_SUB | BPF_K:
+		case BPF_ALU | BPF_SUB | BPF_X:
+		case BPF_ALU | BPF_OR | BPF_K:
+		case BPF_ALU | BPF_OR | BPF_X:
+		case BPF_ALU | BPF_AND | BPF_K:
+		case BPF_ALU | BPF_AND | BPF_X:
+		case BPF_ALU | BPF_XOR | BPF_K:
+		case BPF_ALU | BPF_XOR | BPF_X:
+		case BPF_ALU64 | BPF_ADD | BPF_K:
+		case BPF_ALU64 | BPF_ADD | BPF_X:
+		case BPF_ALU64 | BPF_SUB | BPF_K:
+		case BPF_ALU64 | BPF_SUB | BPF_X:
+		case BPF_ALU64 | BPF_OR | BPF_K:
+		case BPF_ALU64 | BPF_OR | BPF_X:
+		case BPF_ALU64 | BPF_AND | BPF_K:
+		case BPF_ALU64 | BPF_AND | BPF_X:
+		case BPF_ALU64 | BPF_XOR | BPF_K:
+		case BPF_ALU64 | BPF_XOR | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_alu_r64(is64, BPF_OP(code), dst,
+						  src, dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				emit_ia32_alu_i64(is64, BPF_OP(code), dst,
+						  imm32, dstk, &prog);
+				break;
+			}
+			break;
+		case BPF_ALU | BPF_MUL | BPF_K:
+		case BPF_ALU | BPF_MUL | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_mul_r(dst_lo, src_lo, dstk,
+						sstk, &prog);
+				break;
+			case BPF_K:
+				/* mov ecx,imm32*/
+				EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX),
+					    imm32);
+				emit_ia32_mul_r(dst_lo, IA32_ECX, dstk,
+						false, &prog);
+				break;
+			}
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		case BPF_ALU | BPF_LSH | BPF_X:
+		case BPF_ALU | BPF_RSH | BPF_X:
+		case BPF_ALU | BPF_ARSH | BPF_K:
+		case BPF_ALU | BPF_ARSH | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_shift_r(BPF_OP(code), dst_lo, src_lo,
+						  dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				/* mov ecx,imm32*/
+				EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX),
+					    imm32);
+				emit_ia32_shift_r(BPF_OP(code), dst_lo,
+						  IA32_ECX, dstk, false,
+						  &prog);
+				break;
+			}
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		/* dst = dst / src(imm) */
+		/* dst = dst % src(imm) */
+		case BPF_ALU | BPF_DIV | BPF_K:
+		case BPF_ALU | BPF_DIV | BPF_X:
+		case BPF_ALU | BPF_MOD | BPF_K:
+		case BPF_ALU | BPF_MOD | BPF_X:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_div_mod_r(BPF_OP(code), dst_lo,
+						    src_lo, dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				/* mov ecx,imm32*/
+				EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX),
+					    imm32);
+				emit_ia32_div_mod_r(BPF_OP(code), dst_lo,
+						    IA32_ECX, dstk, false,
+						    &prog);
+				break;
+			}
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		case BPF_ALU64 | BPF_DIV | BPF_K:
+		case BPF_ALU64 | BPF_DIV | BPF_X:
+		case BPF_ALU64 | BPF_MOD | BPF_K:
+		case BPF_ALU64 | BPF_MOD | BPF_X:
+			goto notyet;
+		/* dst = dst >> imm */
+		/* dst = dst << imm */
+		case BPF_ALU | BPF_RSH | BPF_K:
+		case BPF_ALU | BPF_LSH | BPF_K:
+			if (unlikely(imm32 > 31))
+				return -EINVAL;
+			/* mov ecx,imm32*/
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
+			emit_ia32_shift_r(BPF_OP(code), dst_lo, IA32_ECX, dstk,
+					  false, &prog);
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		/* dst = dst << imm */
+		case BPF_ALU64 | BPF_LSH | BPF_K:
+			if (unlikely(imm32 > 63))
+				return -EINVAL;
+			emit_ia32_lsh_i64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = dst >> imm */
+		case BPF_ALU64 | BPF_RSH | BPF_K:
+			if (unlikely(imm32 > 63))
+				return -EINVAL;
+			emit_ia32_rsh_i64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = dst << src */
+		case BPF_ALU64 | BPF_LSH | BPF_X:
+			emit_ia32_lsh_r64(dst, src, dstk, sstk, &prog);
+			break;
+		/* dst = dst >> src */
+		case BPF_ALU64 | BPF_RSH | BPF_X:
+			emit_ia32_rsh_r64(dst, src, dstk, sstk, &prog);
+			break;
+		/* dst = dst >> src (signed) */
+		case BPF_ALU64 | BPF_ARSH | BPF_X:
+			emit_ia32_arsh_r64(dst, src, dstk, sstk, &prog);
+			break;
+		/* dst = dst >> imm (signed) */
+		case BPF_ALU64 | BPF_ARSH | BPF_K:
+			if (unlikely(imm32 > 63))
+				return -EINVAL;
+			emit_ia32_arsh_i64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = ~dst */
+		case BPF_ALU | BPF_NEG:
+			emit_ia32_alu_i(is64, false, BPF_OP(code),
+					dst_lo, 0, dstk, &prog);
+			emit_ia32_mov_i(dst_hi, 0, dstk, &prog);
+			break;
+		/* dst = ~dst (64 bit) */
+		case BPF_ALU64 | BPF_NEG:
+			emit_ia32_neg64(dst, dstk, &prog);
+			break;
+		/* dst = dst * src/imm */
+		case BPF_ALU64 | BPF_MUL | BPF_X:
+		case BPF_ALU64 | BPF_MUL | BPF_K:
+			switch (BPF_SRC(code)) {
+			case BPF_X:
+				emit_ia32_mul_r64(dst, src, dstk, sstk, &prog);
+				break;
+			case BPF_K:
+				emit_ia32_mul_i64(dst, imm32, dstk, &prog);
+				break;
+			}
+			break;
+		/* dst = htole(dst) */
+		case BPF_ALU | BPF_END | BPF_FROM_LE:
+			emit_ia32_to_le_r64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = htobe(dst) */
+		case BPF_ALU | BPF_END | BPF_FROM_BE:
+			emit_ia32_to_be_r64(dst, imm32, dstk, &prog);
+			break;
+		/* dst = imm64 */
+		case BPF_LD | BPF_IMM | BPF_DW: {
+			s32 hi, lo = imm32;
+
+			hi = insn[1].imm;
+			emit_ia32_mov_i(dst_lo, lo, dstk, &prog);
+			emit_ia32_mov_i(dst_hi, hi, dstk, &prog);
+			insn++;
+			i++;
+			break;
+		}
+		/* ST: *(u8*)(dst_reg + off) = imm */
+		case BPF_ST | BPF_MEM | BPF_H:
+		case BPF_ST | BPF_MEM | BPF_B:
+		case BPF_ST | BPF_MEM | BPF_W:
+		case BPF_ST | BPF_MEM | BPF_DW:
+			if (dstk)
+				/* mov eax,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+			else
+				/* mov eax,dst_lo */
+				EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+				EMIT(0xC6, 1); break;
+			case BPF_H:
+				EMIT2(0x66, 0xC7); break;
+			case BPF_W:
+			case BPF_DW:
+				EMIT(0xC7, 1); break;
+			}
+
+			if (is_imm8(insn->off))
+				EMIT2(add_1reg(0x40, IA32_EAX), insn->off);
+			else
+				EMIT1_off32(add_1reg(0x80, IA32_EAX),
+					    insn->off);
+			EMIT(imm32, bpf_size_to_x86_bytes(BPF_SIZE(code)));
+
+			if (BPF_SIZE(code) == BPF_DW) {
+				u32 hi;
+
+				hi = imm32 & (1<<31) ? (u32)~0 : 0;
+				EMIT2_off32(0xC7, add_1reg(0x80, IA32_EAX),
+					    insn->off + 4);
+				EMIT(hi, 4);
+			}
+			break;
+
+		/* STX: *(u8*)(dst_reg + off) = src_reg */
+		case BPF_STX | BPF_MEM | BPF_B:
+		case BPF_STX | BPF_MEM | BPF_H:
+		case BPF_STX | BPF_MEM | BPF_W:
+		case BPF_STX | BPF_MEM | BPF_DW:
+			if (dstk)
+				/* mov eax,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+			else
+				/* mov eax,dst_lo */
+				EMIT2(0x8B, add_2reg(0xC0, dst_lo, IA32_EAX));
+
+			if (sstk)
+				/* mov edx,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(src_lo));
+			else
+				/* mov edx,src_lo */
+				EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_EDX));
+
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+				EMIT(0x88, 1); break;
+			case BPF_H:
+				EMIT2(0x66, 0x89); break;
+			case BPF_W:
+			case BPF_DW:
+				EMIT(0x89, 1); break;
+			}
+
+			if (is_imm8(insn->off))
+				EMIT2(add_2reg(0x40, IA32_EAX, IA32_EDX),
+				      insn->off);
+			else
+				EMIT1_off32(add_2reg(0x80, IA32_EAX, IA32_EDX),
+					    insn->off);
+
+			if (BPF_SIZE(code) == BPF_DW) {
+				if (sstk)
+					/* mov edi,dword ptr [ebp+off] */
+					EMIT3(0x8B, add_2reg(0x40, IA32_EBP,
+							     IA32_EDX),
+					      STACK_VAR(src_hi));
+				else
+					/* mov edi,src_hi */
+					EMIT2(0x8B, add_2reg(0xC0, src_hi,
+							     IA32_EDX));
+				EMIT1(0x89);
+				if (is_imm8(insn->off + 4)) {
+					EMIT2(add_2reg(0x40, IA32_EAX,
+						       IA32_EDX),
+					      insn->off + 4);
+				} else {
+					EMIT1(add_2reg(0x80, IA32_EAX,
+						       IA32_EDX));
+					EMIT(insn->off + 4, 4);
+				}
+			}
+			break;
+
+		/* LDX: dst_reg = *(u8*)(src_reg + off) */
+		case BPF_LDX | BPF_MEM | BPF_B:
+		case BPF_LDX | BPF_MEM | BPF_H:
+		case BPF_LDX | BPF_MEM | BPF_W:
+		case BPF_LDX | BPF_MEM | BPF_DW:
+			if (sstk)
+				/* mov eax,dword ptr [ebp+off] */
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(src_lo));
+			else
+				/* mov eax,dword ptr [ebp+off] */
+				EMIT2(0x8B, add_2reg(0xC0, src_lo, IA32_EAX));
+
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+				EMIT2(0x0F, 0xB6); break;
+			case BPF_H:
+				EMIT2(0x0F, 0xB7); break;
+			case BPF_W:
+			case BPF_DW:
+				EMIT(0x8B, 1); break;
+			}
+
+			if (is_imm8(insn->off))
+				EMIT2(add_2reg(0x40, IA32_EAX, IA32_EDX),
+				      insn->off);
+			else
+				EMIT1_off32(add_2reg(0x80, IA32_EAX, IA32_EDX),
+					    insn->off);
+
+			if (dstk)
+				/* mov dword ptr [ebp+off],edx */
+				EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_lo));
+			else
+				/* mov dst_lo,edx */
+				EMIT2(0x89, add_2reg(0xC0, dst_lo, IA32_EDX));
+			switch (BPF_SIZE(code)) {
+			case BPF_B:
+			case BPF_H:
+			case BPF_W:
+				if (dstk) {
+					EMIT3(0xC7, add_1reg(0x40, IA32_EBP),
+					      STACK_VAR(dst_hi));
+					EMIT(0x0, 4);
+				} else {
+					EMIT3(0xC7, add_1reg(0xC0, dst_hi), 0);
+				}
+				break;
+			case BPF_DW:
+				EMIT2_off32(0x8B,
+					    add_2reg(0x80, IA32_EAX, IA32_EDX),
+					    insn->off + 4);
+				if (dstk)
+					EMIT3(0x89,
+					      add_2reg(0x40, IA32_EBP,
+						       IA32_EDX),
+					      STACK_VAR(dst_hi));
+				else
+					EMIT2(0x89,
+					      add_2reg(0xC0, dst_hi, IA32_EDX));
+				break;
+			default:
+				break;
+			}
+			break;
+		/* call */
+		case BPF_JMP | BPF_CALL:
+		{
+			const u8 *r1 = bpf2ia32[BPF_REG_1];
+			const u8 *r2 = bpf2ia32[BPF_REG_2];
+			const u8 *r3 = bpf2ia32[BPF_REG_3];
+			const u8 *r4 = bpf2ia32[BPF_REG_4];
+			const u8 *r5 = bpf2ia32[BPF_REG_5];
+
+			if (insn->src_reg == BPF_PSEUDO_CALL)
+				goto notyet;
+
+			func = (u8 *) __bpf_call_base + imm32;
+			jmp_offset = func - (image + addrs[i]);
+
+			if (!imm32 || !is_simm32(jmp_offset)) {
+				pr_err("unsupported bpf func %d addr %p image %p\n",
+				       imm32, func, image);
+				return -EINVAL;
+			}
+
+			/* mov eax,dword ptr [ebp+off] */
+			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r1[0]));
+			/* mov edx,dword ptr [ebp+off] */
+			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r1[1]));
+
+			emit_push_r64(r5, &prog);
+			emit_push_r64(r4, &prog);
+			emit_push_r64(r3, &prog);
+			emit_push_r64(r2, &prog);
+
+			EMIT1_off32(0xE8, jmp_offset + 9);
+
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r0[0]));
+			/* mov dword ptr [ebp+off],edx */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r0[1]));
+
+			/* add esp,32 */
+			EMIT3(0x83, add_1reg(0xC0, IA32_ESP), 32);
+			break;
+		}
+		case BPF_JMP | BPF_TAIL_CALL:
+			emit_bpf_tail_call(&prog);
+			break;
+
+		/* cond jump */
+		case BPF_JMP | BPF_JEQ | BPF_X:
+		case BPF_JMP | BPF_JNE | BPF_X:
+		case BPF_JMP | BPF_JGT | BPF_X:
+		case BPF_JMP | BPF_JLT | BPF_X:
+		case BPF_JMP | BPF_JGE | BPF_X:
+		case BPF_JMP | BPF_JLE | BPF_X:
+		case BPF_JMP | BPF_JSGT | BPF_X:
+		case BPF_JMP | BPF_JSLE | BPF_X:
+		case BPF_JMP | BPF_JSLT | BPF_X:
+		case BPF_JMP | BPF_JSGE | BPF_X: {
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = sstk ? IA32_ECX : src_lo;
+			u8 sreg_hi = sstk ? IA32_EBX : src_hi;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+
+			if (sstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+				      STACK_VAR(src_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX),
+				      STACK_VAR(src_hi));
+			}
+
+			/* cmp dreg_hi,sreg_hi */
+			EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
+			EMIT2(IA32_JNE, 2);
+			/* cmp dreg_lo,sreg_lo */
+			EMIT2(0x39, add_2reg(0xC0, dreg_lo, sreg_lo));
+			goto emit_cond_jmp;
+		}
+		case BPF_JMP | BPF_JSET | BPF_X: {
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = sstk ? IA32_ECX : src_lo;
+			u8 sreg_hi = sstk ? IA32_EBX : src_hi;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+
+			if (sstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX),
+				      STACK_VAR(src_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX),
+				      STACK_VAR(src_hi));
+			}
+			/* and dreg_lo,sreg_lo */
+			EMIT2(0x23, add_2reg(0xC0, sreg_lo, dreg_lo));
+			/* and dreg_hi,sreg_hi */
+			EMIT2(0x23, add_2reg(0xC0, sreg_hi, dreg_hi));
+			/* or dreg_lo,dreg_hi */
+			EMIT2(0x09, add_2reg(0xC0, dreg_lo, dreg_hi));
+			goto emit_cond_jmp;
+		}
+		case BPF_JMP | BPF_JSET | BPF_K: {
+			u32 hi;
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = IA32_ECX;
+			u8 sreg_hi = IA32_EBX;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+			hi = imm32 & (1<<31) ? (u32)~0 : 0;
+
+			/* mov ecx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
+			/* mov ebx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX), hi);
+
+			/* and dreg_lo,sreg_lo */
+			EMIT2(0x23, add_2reg(0xC0, sreg_lo, dreg_lo));
+			/* and dreg_hi,sreg_hi */
+			EMIT2(0x23, add_2reg(0xC0, sreg_hi, dreg_hi));
+			/* or dreg_lo,dreg_hi */
+			EMIT2(0x09, add_2reg(0xC0, dreg_lo, dreg_hi));
+			goto emit_cond_jmp;
+		}
+		case BPF_JMP | BPF_JEQ | BPF_K:
+		case BPF_JMP | BPF_JNE | BPF_K:
+		case BPF_JMP | BPF_JGT | BPF_K:
+		case BPF_JMP | BPF_JLT | BPF_K:
+		case BPF_JMP | BPF_JGE | BPF_K:
+		case BPF_JMP | BPF_JLE | BPF_K:
+		case BPF_JMP | BPF_JSGT | BPF_K:
+		case BPF_JMP | BPF_JSLE | BPF_K:
+		case BPF_JMP | BPF_JSLT | BPF_K:
+		case BPF_JMP | BPF_JSGE | BPF_K: {
+			u32 hi;
+			u8 dreg_lo = dstk ? IA32_EAX : dst_lo;
+			u8 dreg_hi = dstk ? IA32_EDX : dst_hi;
+			u8 sreg_lo = IA32_ECX;
+			u8 sreg_hi = IA32_EBX;
+
+			if (dstk) {
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+				      STACK_VAR(dst_lo));
+				EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX),
+				      STACK_VAR(dst_hi));
+			}
+
+			hi = imm32 & (1<<31) ? (u32)~0 : 0;
+			/* mov ecx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_ECX), imm32);
+			/* mov ebx,imm32 */
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX), hi);
+
+			/* cmp dreg_hi,sreg_hi */
+			EMIT2(0x39, add_2reg(0xC0, dreg_hi, sreg_hi));
+			EMIT2(IA32_JNE, 2);
+			/* cmp dreg_lo,sreg_lo */
+			EMIT2(0x39, add_2reg(0xC0, dreg_lo, sreg_lo));
+
+emit_cond_jmp:		/* convert BPF opcode to x86 */
+			switch (BPF_OP(code)) {
+			case BPF_JEQ:
+				jmp_cond = IA32_JE;
+				break;
+			case BPF_JSET:
+			case BPF_JNE:
+				jmp_cond = IA32_JNE;
+				break;
+			case BPF_JGT:
+				/* GT is unsigned '>', JA in x86 */
+				jmp_cond = IA32_JA;
+				break;
+			case BPF_JLT:
+				/* LT is unsigned '<', JB in x86 */
+				jmp_cond = IA32_JB;
+				break;
+			case BPF_JGE:
+				/* GE is unsigned '>=', JAE in x86 */
+				jmp_cond = IA32_JAE;
+				break;
+			case BPF_JLE:
+				/* LE is unsigned '<=', JBE in x86 */
+				jmp_cond = IA32_JBE;
+				break;
+			case BPF_JSGT:
+				/* signed '>', GT in x86 */
+				jmp_cond = IA32_JG;
+				break;
+			case BPF_JSLT:
+				/* signed '<', LT in x86 */
+				jmp_cond = IA32_JL;
+				break;
+			case BPF_JSGE:
+				/* signed '>=', GE in x86 */
+				jmp_cond = IA32_JGE;
+				break;
+			case BPF_JSLE:
+				/* signed '<=', LE in x86 */
+				jmp_cond = IA32_JLE;
+				break;
+			default: /* to silence gcc warning */
+				return -EFAULT;
+			}
+			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (is_imm8(jmp_offset)) {
+				EMIT2(jmp_cond, jmp_offset);
+			} else if (is_simm32(jmp_offset)) {
+				EMIT2_off32(0x0F, jmp_cond + 0x10, jmp_offset);
+			} else {
+				pr_err("cond_jmp gen bug %llx\n", jmp_offset);
+				return -EFAULT;
+			}
+
+			break;
+		}
+		case BPF_JMP | BPF_JA:
+			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (!jmp_offset)
+				/* optimize out nop jumps */
+				break;
+emit_jmp:
+			if (is_imm8(jmp_offset)) {
+				EMIT2(0xEB, jmp_offset);
+			} else if (is_simm32(jmp_offset)) {
+				EMIT1_off32(0xE9, jmp_offset);
+			} else {
+				pr_err("jmp gen bug %llx\n", jmp_offset);
+				return -EFAULT;
+			}
+			break;
+
+		case BPF_LD | BPF_ABS | BPF_W:
+		case BPF_LD | BPF_ABS | BPF_H:
+		case BPF_LD | BPF_ABS | BPF_B:
+		case BPF_LD | BPF_IND | BPF_W:
+		case BPF_LD | BPF_IND | BPF_H:
+		case BPF_LD | BPF_IND | BPF_B:
+		{
+			int size;
+			const u8 *r6 = bpf2ia32[BPF_REG_6];
+
+			/* Setting up first argument */
+			/* mov eax,dword ptr [ebp+off] */
+			EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r6[0]));
+
+			/* Setting up second argument */
+			if (BPF_MODE(code) == BPF_ABS) {
+				/* mov %edx, imm32 */
+				EMIT1_off32(0xBA, imm32);
+			} else {
+				if (sstk)
+					/* mov edx,dword ptr [ebp+off] */
+					EMIT3(0x8B, add_2reg(0x40, IA32_EBP,
+							     IA32_EDX),
+					      STACK_VAR(src_lo));
+				else
+					/* mov edx,src_lo */
+					EMIT2(0x8B, add_2reg(0xC0, src_lo,
+							     IA32_EDX));
+				if (imm32) {
+					if (is_imm8(imm32))
+						/* add %edx, imm8 */
+						EMIT3(0x83, 0xC2, imm32);
+					else
+						/* add %edx, imm32 */
+						EMIT2_off32(0x81, 0xC2, imm32);
+				}
+			}
+
+			/* Setting up third argument */
+			switch (BPF_SIZE(code)) {
+			case BPF_W:
+				size = 4;
+				break;
+			case BPF_H:
+				size = 2;
+				break;
+			case BPF_B:
+				size = 1;
+				break;
+			default:
+				return -EINVAL;
+			}
+			/* mov ecx,val */
+			EMIT2(0xB1, size);
+			/* movzx ecx,ecx */
+			EMIT3(0x0F, 0xB6, add_2reg(0xC0, IA32_ECX, IA32_ECX));
+
+			/* mov ebx,ebp */
+			EMIT2(0x8B, add_2reg(0xC0, IA32_EBP, IA32_EBX));
+			/* add %ebx,imm8 */
+			EMIT3(0x83, add_1reg(0xC0, IA32_EBX), SKB_BUFFER);
+			/* push ebx */
+			EMIT1(0x53);
+
+			/* Setting up function pointer to call */
+			/* mov ebx,imm32*/
+			EMIT2_off32(0xC7, add_1reg(0xC0, IA32_EBX),
+				    (unsigned int)bpf_load_pointer);
+
+			EMIT2(0xFF, add_1reg(0xD0, IA32_EBX));
+			/* add %esp,4 */
+			EMIT3(0x83, add_1reg(0xC0, IA32_ESP), 4);
+			/* xor edx,edx */
+			EMIT2(0x33, add_2reg(0xC0, IA32_EDX, IA32_EDX));
+
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r0[0]));
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r0[1]));
+
+			/* Check if return address is NULL or not.
+			 * if NULL then jump to epilogue
+			 * else continue to load the value from retn address
+			 */
+			EMIT3(0x83, add_1reg(0xF8, IA32_EAX), 0);
+			jmp_offset = ctx->cleanup_addr - addrs[i];
+
+			switch (BPF_SIZE(code)) {
+			case BPF_W:
+				jmp_offset += 10;
+				break;
+			case BPF_H:
+				jmp_offset += 13;
+				break;
+			case BPF_B:
+				jmp_offset += 9;
+				break;
+			}
+
+			EMIT2_off32(0x0F, IA32_JE + 0x10, jmp_offset);
+			/* Load value from the address */
+			switch (BPF_SIZE(code)) {
+			case BPF_W:
+				/* mov eax,[eax] */
+				EMIT2(0x8B, 0x0);
+				/* emit 'bswap eax' */
+				EMIT2(0x0F, add_1reg(0xC8, IA32_EAX));
+				break;
+			case BPF_H:
+				EMIT3(0x0F, 0xB7, 0x0);
+				EMIT1(0x66);
+				EMIT3(0xC1, add_1reg(0xC8, IA32_EAX), 8);
+				break;
+			case BPF_B:
+				EMIT3(0x0F, 0xB6, 0x0);
+				break;
+			}
+
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
+			      STACK_VAR(r0[0]));
+			/* mov dword ptr [ebp+off],eax */
+			EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
+			      STACK_VAR(r0[1]));
+			break;
+		}
+		/* STX XADD: lock *(u32 *)(dst + off) += src */
+		case BPF_STX | BPF_XADD | BPF_W:
+		/* STX XADD: lock *(u64 *)(dst + off) += src */
+		case BPF_STX | BPF_XADD | BPF_DW:
+			goto notyet;
+		case BPF_JMP | BPF_EXIT:
+			if (seen_exit) {
+				jmp_offset = ctx->cleanup_addr - addrs[i];
+				goto emit_jmp;
+			}
+			seen_exit = true;
+			/* update cleanup_addr */
+			ctx->cleanup_addr = proglen;
+			emit_epilogue(&prog, bpf_prog->aux->stack_depth);
+			break;
+notyet:
+			pr_info_once("*** NOT YET: opcode %02x ***\n", code);
+			return -EFAULT;
+		default:
+			/* This error will be seen if new instruction was added
+			 * to interpreter, but not to JIT
+			 * or if there is junk in bpf_prog
+			 */
+			pr_err("bpf_jit: unknown opcode %02x\n", code);
+			return -EINVAL;
+		}
+
+		ilen = prog - temp;
+		if (ilen > BPF_MAX_INSN_SIZE) {
+			pr_err("bpf_jit: fatal insn size error\n");
+			return -EFAULT;
+		}
+
+		if (image) {
+			if (unlikely(proglen + ilen > oldproglen)) {
+				pr_err("bpf_jit: fatal error\n");
+				return -EFAULT;
+			}
+			memcpy(image + proglen, temp, ilen);
+		}
+		proglen += ilen;
+		addrs[i] = proglen;
+		prog = temp;
+	}
+	return proglen;
+}
+
+struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+{
+	struct bpf_binary_header *header = NULL;
+	struct bpf_prog *tmp, *orig_prog = prog;
+	int proglen, oldproglen = 0;
+	struct jit_context ctx = {};
+	bool tmp_blinded = false;
+	u8 *image = NULL;
+	int *addrs;
+	int pass;
+	int i;
+
+	if (!prog->jit_requested)
+		return orig_prog;
+
+	tmp = bpf_jit_blind_constants(prog);
+	/* If blinding was requested and we failed during blinding,
+	 * we must fall back to the interpreter.
+	 */
+	if (IS_ERR(tmp))
+		return orig_prog;
+	if (tmp != prog) {
+		tmp_blinded = true;
+		prog = tmp;
+	}
+
+	addrs = kmalloc(prog->len * sizeof(*addrs), GFP_KERNEL);
+	if (!addrs) {
+		prog = orig_prog;
+		goto out;
+	}
+
+	/* Before first pass, make a rough estimation of addrs[]
+	 * each bpf instruction is translated to less than 64 bytes
+	 */
+	for (proglen = 0, i = 0; i < prog->len; i++) {
+		proglen += 64;
+		addrs[i] = proglen;
+	}
+	ctx.cleanup_addr = proglen;
+
+	/* JITed image shrinks with every pass and the loop iterates
+	 * until the image stops shrinking. Very large bpf programs
+	 * may converge on the last pass. In such case do one more
+	 * pass to emit the final image
+	 */
+	for (pass = 0; pass < 20 || image; pass++) {
+		proglen = do_jit(prog, addrs, image, oldproglen, &ctx);
+		if (proglen <= 0) {
+			image = NULL;
+			if (header)
+				bpf_jit_binary_free(header);
+			prog = orig_prog;
+			goto out_addrs;
+		}
+		if (image) {
+			if (proglen != oldproglen) {
+				pr_err("bpf_jit: proglen=%d != oldproglen=%d\n",
+				       proglen, oldproglen);
+				prog = orig_prog;
+				goto out_addrs;
+			}
+			break;
+		}
+		if (proglen == oldproglen) {
+			header = bpf_jit_binary_alloc(proglen, &image,
+						      1, jit_fill_hole);
+			if (!header) {
+				prog = orig_prog;
+				goto out_addrs;
+			}
+		}
+		oldproglen = proglen;
+		cond_resched();
+	}
+
+	if (bpf_jit_enable > 1)
+		bpf_jit_dump(prog->len, proglen, pass + 1, image);
+
+	if (image) {
+		bpf_jit_binary_lock_ro(header);
+		prog->bpf_func = (void *)image;
+		prog->jited = 1;
+		prog->jited_len = proglen;
+	} else {
+		prog = orig_prog;
+	}
+
+out_addrs:
+	kfree(addrs);
+out:
+	if (tmp_blinded)
+		bpf_jit_prog_release_other(prog, prog == orig_prog ?
+					   tmp : orig_prog);
+	return prog;
+}
-- 
1.8.5.6.2.g3d8a54e.dirty

^ permalink raw reply related

* [PATCH] rds: ib: Fix missing call to rds_ib_dev_put in rds_ib_setup_qp
From: Dag Moxnes @ 2018-04-25 11:22 UTC (permalink / raw)
  To: Santosh Shilimkar, David S. Miller, netdev, linux-rdma, rds-devel,
	linux-kernel

The function rds_ib_setup_qp is calling rds_ib_get_client_data and
should correspondingly call rds_ib_dev_put. This call was lost in
the non-error path with the introduction of error handling done in
commit 3b12f73a5c29 ("rds: ib: add error handle")

Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
---
 net/rds/ib_cm.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index eea1d86..13b38ad 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -547,7 +547,7 @@ static int rds_ib_setup_qp(struct rds_connection *conn)
 	rdsdebug("conn %p pd %p cq %p %p\n", conn, ic->i_pd,
 		 ic->i_send_cq, ic->i_recv_cq);
 
-	return ret;
+	goto out;
 
 sends_out:
 	vfree(ic->i_sends);
@@ -572,6 +572,7 @@ static int rds_ib_setup_qp(struct rds_connection *conn)
 		ic->i_send_cq = NULL;
 rds_ibdev_out:
 	rds_ib_remove_conn(rds_ibdev, conn);
+out:
 	rds_ib_dev_put(rds_ibdev);
 
 	return ret;
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH net 3/3] ARM64: dts: marvell: armada-cp110: Add clocks for the xmdio node
From: Gregory CLEMENT @ 2018-04-25 11:41 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: davem, netdev, linux-kernel, Antoine Tenart, thomas.petazzoni,
	miquel.raynal, nadavh, stefanc, ymarkman, mw, linux,
	linux-arm-kernel
In-Reply-To: <20180425110731.20153-4-maxime.chevallier@bootlin.com>

Hi Maxime,
 
 On mer., avril 25 2018, Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> The Marvell XSMI controller needs 3 clocks to operate correctly :
>  - The MG clock (clk 5)
>  - The MG Core clock (clk 6)
>  - The GOP clock (clk 18)
>
>  This commit adds them, to avoid system hangs when using these
>  interfaces.
>
> Fixes: c7e92def1ef4 ("clk: mvebu: cp110: Fix clock tree representation")
> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>

This patch should not have been sent with the net prefix and should not
be merge through the net subsystem.

I will take care of it to avoid conflict in the devei tree file during
the merge windows.

Thanks,

Gregory

> ---
>  arch/arm64/boot/dts/marvell/armada-cp110.dtsi | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
> index 6c137ac656e9..ed2f1237ea1e 100644
> --- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
> +++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
> @@ -142,6 +142,8 @@
>  			#size-cells = <0>;
>  			compatible = "marvell,xmdio";
>  			reg = <0x12a600 0x10>;
> +			clocks = <&CP110_LABEL(clk) 1 5>,
> +				 <&CP110_LABEL(clk) 1 6>, <&CP110_LABEL(clk) 1 18>;
>  			status = "disabled";
>  		};
>  
> -- 
> 2.11.0
>

-- 
Gregory Clement, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com

^ permalink raw reply

* Re: [PATCH net 2/3] net: mvpp2: Fix clock resource by adding missing mg_core_clk
From: Gregory CLEMENT @ 2018-04-25 11:43 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: davem, netdev, linux-kernel, Antoine Tenart, thomas.petazzoni,
	miquel.raynal, nadavh, stefanc, ymarkman, mw, linux,
	linux-arm-kernel
In-Reply-To: <20180425110731.20153-3-maxime.chevallier@bootlin.com>

Hi Maxime,
 
 On mer., avril 25 2018, Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> Marvell's PPv2.2 IP needs an additional clock named "MG Core clock".
> This is required on Armada 7K and 8K.
>
> This commit adds the required clock, updates the devicetree and its
> documentation accordingly, also fixing a small typo in the
> marvell-mpp2.txt examples.
>
> Fixes: c7e92def1ef4 ("clk: mvebu: cp110: Fix clock tree representation")
> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
> ---
>  .../devicetree/bindings/net/marvell-pp2.txt          |  9 +++++----
>  arch/arm64/boot/dts/marvell/armada-cp110.dtsi        |  5 +++--

Could you remove the dtsi part and submit it as a separate patch. Then I
will take care of it.

Thanks,

Gregory


>  drivers/net/ethernet/marvell/mvpp2.c                 | 20 ++++++++++++++++++--
>  3 files changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/net/marvell-pp2.txt b/Documentation/devicetree/bindings/net/marvell-pp2.txt
> index 1814fa13f6ab..fc019df0d863 100644
> --- a/Documentation/devicetree/bindings/net/marvell-pp2.txt
> +++ b/Documentation/devicetree/bindings/net/marvell-pp2.txt
> @@ -21,9 +21,10 @@ Required properties:
>  	- main controller clock (for both armada-375-pp2 and armada-7k-pp2)
>  	- GOP clock (for both armada-375-pp2 and armada-7k-pp2)
>  	- MG clock (only for armada-7k-pp2)
> +	- MG Core clock (only for armada-7k-pp2)
>  	- AXI clock (only for armada-7k-pp2)
> -- clock-names: names of used clocks, must be "pp_clk", "gop_clk", "mg_clk"
> -  and "axi_clk" (the 2 latter only for armada-7k-pp2).
> +- clock-names: names of used clocks, must be "pp_clk", "gop_clk", "mg_clk",
> +  "mg_core_clk" and "axi_clk" (the 3 latter only for armada-7k-pp2).
>  
>  The ethernet ports are represented by subnodes. At least one port is
>  required.
> @@ -80,8 +81,8 @@ cpm_ethernet: ethernet@0 {
>  	compatible = "marvell,armada-7k-pp22";
>  	reg = <0x0 0x100000>, <0x129000 0xb000>;
>  	clocks = <&cpm_syscon0 1 3>, <&cpm_syscon0 1 9>,
> -		 <&cpm_syscon0 1 5>, <&cpm_syscon0 1 18>;
> -	clock-names = "pp_clk", "gop_clk", "gp_clk", "axi_clk";
> +		 <&cpm_syscon0 1 5>, <&cpm_syscon0 1 6>, <&cpm_syscon0 1 18>;
> +	clock-names = "pp_clk", "gop_clk", "mg_clk", "mg_core_clk", "axi_clk";
>  
>  	eth0: eth0 {
>  		interrupts = <ICU_GRP_NSR 39 IRQ_TYPE_LEVEL_HIGH>,
> diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
> index 48cad7919efa..6c137ac656e9 100644
> --- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
> +++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
> @@ -38,9 +38,10 @@
>  			compatible = "marvell,armada-7k-pp22";
>  			reg = <0x0 0x100000>, <0x129000 0xb000>;
>  			clocks = <&CP110_LABEL(clk) 1 3>, <&CP110_LABEL(clk) 1 9>,
> -				 <&CP110_LABEL(clk) 1 5>, <&CP110_LABEL(clk) 1 18>;
> +				 <&CP110_LABEL(clk) 1 5>, <&CP110_LABEL(clk) 1 6>,
> +				 <&CP110_LABEL(clk) 1 18>;
>  			clock-names = "pp_clk", "gop_clk",
> -				      "mg_clk", "axi_clk";
> +				      "mg_clk", "mg_core_clk", "axi_clk";
>  			marvell,system-controller = <&CP110_LABEL(syscon0)>;
>  			status = "disabled";
>  			dma-coherent;
> diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
> index 0c2f04813d42..6a17b67a0d61 100644
> --- a/drivers/net/ethernet/marvell/mvpp2.c
> +++ b/drivers/net/ethernet/marvell/mvpp2.c
> @@ -942,6 +942,7 @@ struct mvpp2 {
>  	struct clk *pp_clk;
>  	struct clk *gop_clk;
>  	struct clk *mg_clk;
> +	struct clk *mg_core_clk;
>  	struct clk *axi_clk;
>  
>  	/* List of pointers to port structures */
> @@ -8768,18 +8769,28 @@ static int mvpp2_probe(struct platform_device *pdev)
>  			err = clk_prepare_enable(priv->mg_clk);
>  			if (err < 0)
>  				goto err_gop_clk;
> +
> +			priv->mg_core_clk = devm_clk_get(&pdev->dev, "mg_core_clk");
> +			if (IS_ERR(priv->mg_core_clk)) {
> +				err = PTR_ERR(priv->mg_core_clk);
> +				goto err_mg_clk;
> +			}
> +
> +			err = clk_prepare_enable(priv->mg_core_clk);
> +			if (err < 0)
> +				goto err_mg_clk;
>  		}
>  
>  		priv->axi_clk = devm_clk_get(&pdev->dev, "axi_clk");
>  		if (IS_ERR(priv->axi_clk)) {
>  			err = PTR_ERR(priv->axi_clk);
>  			if (err == -EPROBE_DEFER)
> -				goto err_mg_clk;
> +				goto err_mg_core_clk;
>  			priv->axi_clk = NULL;
>  		} else {
>  			err = clk_prepare_enable(priv->axi_clk);
>  			if (err < 0)
> -				goto err_mg_clk;
> +				goto err_mg_core_clk;
>  		}
>  
>  		/* Get system's tclk rate */
> @@ -8851,6 +8862,10 @@ static int mvpp2_probe(struct platform_device *pdev)
>  	}
>  err_axi_clk:
>  	clk_disable_unprepare(priv->axi_clk);
> +
> +err_mg_core_clk:
> +	if (priv->hw_version == MVPP22)
> +		clk_disable_unprepare(priv->mg_core_clk);
>  err_mg_clk:
>  	if (priv->hw_version == MVPP22)
>  		clk_disable_unprepare(priv->mg_clk);
> @@ -8898,6 +8913,7 @@ static int mvpp2_remove(struct platform_device *pdev)
>  		return 0;
>  
>  	clk_disable_unprepare(priv->axi_clk);
> +	clk_disable_unprepare(priv->mg_core_clk);
>  	clk_disable_unprepare(priv->mg_clk);
>  	clk_disable_unprepare(priv->pp_clk);
>  	clk_disable_unprepare(priv->gop_clk);
> -- 
> 2.11.0
>

-- 
Gregory Clement, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com

^ permalink raw reply

* Re: [PATCH net-next] sctp: fix const parameter violation in sctp_make_sack
From: Neil Horman @ 2018-04-25 11:45 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: netdev, linux-sctp, Xin Long, Vlad Yasevich
In-Reply-To: <b9065b3cfb0a2bf3c83008ff53967f473e081766.1524603855.git.marcelo.leitner@gmail.com>

On Tue, Apr 24, 2018 at 06:17:34PM -0300, Marcelo Ricardo Leitner wrote:
> sctp_make_sack() make changes to the asoc and this cast is just
> bypassing the const attribute. As there is no need to have the const
> there, just remove it and fix the violation.
> 
> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
> 
> This one can go to net or net-next, but targetting net-next here just to
> keep it together with the rest (which I'll post as patches get in).
> 
>  include/net/sctp/sm.h    | 2 +-
>  net/sctp/sm_make_chunk.c | 9 ++++-----
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
> index 2d0e782c90551377ad654bcef1224bbdb75ba394..f4b657478a304050851f33d92c71162a4a4a2e50 100644
> --- a/include/net/sctp/sm.h
> +++ b/include/net/sctp/sm.h
> @@ -207,7 +207,7 @@ struct sctp_chunk *sctp_make_datafrag_empty(const struct sctp_association *asoc,
>  					    int len, __u8 flags, gfp_t gfp);
>  struct sctp_chunk *sctp_make_ecne(const struct sctp_association *asoc,
>  				  const __u32 lowest_tsn);
> -struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc);
> +struct sctp_chunk *sctp_make_sack(struct sctp_association *asoc);
>  struct sctp_chunk *sctp_make_shutdown(const struct sctp_association *asoc,
>  				      const struct sctp_chunk *chunk);
>  struct sctp_chunk *sctp_make_shutdown_ack(const struct sctp_association *asoc,
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index 5a4fb1dc8400a0316177ce65be8126857297eb5e..db93eabd6ef500ab20be6e091ee06bd3b60923c9 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -779,10 +779,9 @@ struct sctp_chunk *sctp_make_datafrag_empty(const struct sctp_association *asoc,
>   * association.  This reports on which TSN's we've seen to date,
>   * including duplicates and gaps.
>   */
> -struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
> +struct sctp_chunk *sctp_make_sack(struct sctp_association *asoc)
>  {
>  	struct sctp_tsnmap *map = (struct sctp_tsnmap *)&asoc->peer.tsn_map;
> -	struct sctp_association *aptr = (struct sctp_association *)asoc;
>  	struct sctp_gap_ack_block gabs[SCTP_MAX_GABS];
>  	__u16 num_gabs, num_dup_tsns;
>  	struct sctp_transport *trans;
> @@ -857,7 +856,7 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
> 
>  	/* Add the duplicate TSN information.  */
>  	if (num_dup_tsns) {
> -		aptr->stats.idupchunks += num_dup_tsns;
> +		asoc->stats.idupchunks += num_dup_tsns;
>  		sctp_addto_chunk(retval, sizeof(__u32) * num_dup_tsns,
>  				 sctp_tsnmap_get_dups(map));
>  	}
> @@ -869,11 +868,11 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
>  	 * association so no transport will match after a wrap event like this,
>  	 * Until the next sack
>  	 */
> -	if (++aptr->peer.sack_generation == 0) {
> +	if (++asoc->peer.sack_generation == 0) {
>  		list_for_each_entry(trans, &asoc->peer.transport_addr_list,
>  				    transports)
>  			trans->sack_generation = 0;
> -		aptr->peer.sack_generation = 1;
> +		asoc->peer.sack_generation = 1;
>  	}
>  nodata:
>  	return retval;
> --
> 2.14.3
> 
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com

^ permalink raw reply

* Re: [PATCH net-next] sctp: fix identification of new acks for SFR-CACC
From: Neil Horman @ 2018-04-25 11:47 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: netdev, linux-sctp, Xin Long, Vlad Yasevich
In-Reply-To: <4e837f7eaf67a1a964d1b3418aa3cf759f512535.1524603870.git.marcelo.leitner@gmail.com>

On Tue, Apr 24, 2018 at 06:17:35PM -0300, Marcelo Ricardo Leitner wrote:
> It's currently written as:
> 
> if (!tchunk->tsn_gap_acked) {   [1]
> 	tchunk->tsn_gap_acked = 1;
> 	...
> }
> 
> if (TSN_lte(tsn, sack_ctsn)) {
> 	if (!tchunk->tsn_gap_acked) {
> 		/* SFR-CACC processing */
> 		...
> 	}
> }
> 
> Which causes the SFR-CACC processing on ack reception to never process,
> as tchunk->tsn_gap_acked is always true by then. Block [1] was
> moved to that position by the commit marked below.
> 
> This patch fixes it by doing SFR-CACC processing earlier, before
> tsn_gap_acked is set to true.
> 
> Fixes: 31b02e154940 ("sctp: Failover transmitted list on transport delete")
> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
> 
> Even though this is a -stable candidate, please apply it to net-next
> to avoid conflicts with subsequent patches in my queue. Thanks.
> 
>  net/sctp/outqueue.c | 48 +++++++++++++++++++++++-------------------------
>  1 file changed, 23 insertions(+), 25 deletions(-)
> 
> diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
> index f211b3db6a3543073e113da121bb28518b0af491..dee7cbd5483149024f2f3195db2fe4d473b1a00a 100644
> --- a/net/sctp/outqueue.c
> +++ b/net/sctp/outqueue.c
> @@ -1457,7 +1457,7 @@ static void sctp_check_transmitted(struct sctp_outq *q,
>  			 * the outstanding bytes for this chunk, so only
>  			 * count bytes associated with a transport.
>  			 */
> -			if (transport) {
> +			if (transport && !tchunk->tsn_gap_acked) {
>  				/* If this chunk is being used for RTT
>  				 * measurement, calculate the RTT and update
>  				 * the RTO using this value.
> @@ -1469,14 +1469,34 @@ static void sctp_check_transmitted(struct sctp_outq *q,
>  				 * first instance of the packet or a later
>  				 * instance).
>  				 */
> -				if (!tchunk->tsn_gap_acked &&
> -				    !sctp_chunk_retransmitted(tchunk) &&
> +				if (!sctp_chunk_retransmitted(tchunk) &&
>  				    tchunk->rtt_in_progress) {
>  					tchunk->rtt_in_progress = 0;
>  					rtt = jiffies - tchunk->sent_at;
>  					sctp_transport_update_rto(transport,
>  								  rtt);
>  				}
> +
> +				if (TSN_lte(tsn, sack_ctsn)) {
> +					/*
> +					 * SFR-CACC algorithm:
> +					 * 2) If the SACK contains gap acks
> +					 * and the flag CHANGEOVER_ACTIVE is
> +					 * set the receiver of the SACK MUST
> +					 * take the following action:
> +					 *
> +					 * B) For each TSN t being acked that
> +					 * has not been acked in any SACK so
> +					 * far, set cacc_saw_newack to 1 for
> +					 * the destination that the TSN was
> +					 * sent to.
> +					 */
> +					if (sack->num_gap_ack_blocks &&
> +					    q->asoc->peer.primary_path->cacc.
> +					    changeover_active)
> +						transport->cacc.cacc_saw_newack
> +							= 1;
> +				}
>  			}
> 
>  			/* If the chunk hasn't been marked as ACKED,
> @@ -1508,28 +1528,6 @@ static void sctp_check_transmitted(struct sctp_outq *q,
>  				restart_timer = 1;
>  				forward_progress = true;
> 
> -				if (!tchunk->tsn_gap_acked) {
> -					/*
> -					 * SFR-CACC algorithm:
> -					 * 2) If the SACK contains gap acks
> -					 * and the flag CHANGEOVER_ACTIVE is
> -					 * set the receiver of the SACK MUST
> -					 * take the following action:
> -					 *
> -					 * B) For each TSN t being acked that
> -					 * has not been acked in any SACK so
> -					 * far, set cacc_saw_newack to 1 for
> -					 * the destination that the TSN was
> -					 * sent to.
> -					 */
> -					if (transport &&
> -					    sack->num_gap_ack_blocks &&
> -					    q->asoc->peer.primary_path->cacc.
> -					    changeover_active)
> -						transport->cacc.cacc_saw_newack
> -							= 1;
> -				}
> -
>  				list_add_tail(&tchunk->transmitted_list,
>  					      &q->sacked);
>  			} else {
> --
> 2.14.3
> 
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* Re: [PATCH net-next] sctp: remove the unused sctp_assoc_is_match function
From: Neil Horman @ 2018-04-25 11:49 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, davem
In-Reply-To: <0acb8eaec9a5be9959d39bb5017d49bb31d0d5e5.1524649607.git.lucien.xin@gmail.com>

On Wed, Apr 25, 2018 at 05:46:47PM +0800, Xin Long wrote:
> After Commit 4f0087812648 ("sctp: apply rhashtable api to send/recv
> path"), there's no place using sctp_assoc_is_match, so remove it.
> 
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  include/net/sctp/structs.h |  4 ----
>  net/sctp/associola.c       | 25 -------------------------
>  2 files changed, 29 deletions(-)
> 
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index a0ec462..05594b2 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -2091,10 +2091,6 @@ void sctp_assoc_control_transport(struct sctp_association *asoc,
>  				  enum sctp_transport_cmd command,
>  				  sctp_sn_error_t error);
>  struct sctp_transport *sctp_assoc_lookup_tsn(struct sctp_association *, __u32);
> -struct sctp_transport *sctp_assoc_is_match(struct sctp_association *,
> -					   struct net *,
> -					   const union sctp_addr *,
> -					   const union sctp_addr *);
>  void sctp_assoc_migrate(struct sctp_association *, struct sock *);
>  int sctp_assoc_update(struct sctp_association *old,
>  		      struct sctp_association *new);
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 837806d..a8f3b08 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -988,31 +988,6 @@ struct sctp_transport *sctp_assoc_lookup_tsn(struct sctp_association *asoc,
>  	return match;
>  }
>  
> -/* Is this the association we are looking for? */
> -struct sctp_transport *sctp_assoc_is_match(struct sctp_association *asoc,
> -					   struct net *net,
> -					   const union sctp_addr *laddr,
> -					   const union sctp_addr *paddr)
> -{
> -	struct sctp_transport *transport;
> -
> -	if ((htons(asoc->base.bind_addr.port) == laddr->v4.sin_port) &&
> -	    (htons(asoc->peer.port) == paddr->v4.sin_port) &&
> -	    net_eq(sock_net(asoc->base.sk), net)) {
> -		transport = sctp_assoc_lookup_paddr(asoc, paddr);
> -		if (!transport)
> -			goto out;
> -
> -		if (sctp_bind_addr_match(&asoc->base.bind_addr, laddr,
> -					 sctp_sk(asoc->base.sk)))
> -			goto out;
> -	}
> -	transport = NULL;
> -
> -out:
> -	return transport;
> -}
> -
>  /* Do delayed input processing.  This is scheduled by sctp_rcv(). */
>  static void sctp_assoc_bh_rcv(struct work_struct *work)
>  {
> -- 
> 2.1.0
> 
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* Re: [PATCH net 2/3] net: mvpp2: Fix clock resource by adding missing mg_core_clk
From: Maxime Chevallier @ 2018-04-25 11:52 UTC (permalink / raw)
  To: Gregory CLEMENT
  Cc: davem, netdev, linux-kernel, Antoine Tenart, thomas.petazzoni,
	miquel.raynal, nadavh, stefanc, ymarkman, mw, linux,
	linux-arm-kernel
In-Reply-To: <87in8feckt.fsf@bootlin.com>

Hi Gregory,

On Wed, 25 Apr 2018 13:43:14 +0200
Gregory CLEMENT <gregory.clement@bootlin.com> wrote:

>Hi Maxime,
> 
> On mer., avril 25 2018, Maxime Chevallier
> <maxime.chevallier@bootlin.com> wrote:
>
>> Marvell's PPv2.2 IP needs an additional clock named "MG Core clock".
>> This is required on Armada 7K and 8K.
>>
>> This commit adds the required clock, updates the devicetree and its
>> documentation accordingly, also fixing a small typo in the
>> marvell-mpp2.txt examples.
>>
>> Fixes: c7e92def1ef4 ("clk: mvebu: cp110: Fix clock tree
>> representation") Signed-off-by: Maxime Chevallier
>> <maxime.chevallier@bootlin.com> ---
>>  .../devicetree/bindings/net/marvell-pp2.txt          |  9 +++++----
>>  arch/arm64/boot/dts/marvell/armada-cp110.dtsi        |  5 +++--  
>
>Could you remove the dtsi part and submit it as a separate patch. Then
>I will take care of it.

Ok no problem, I'll split that and re-send it.

Thanks,

Maxime

-- 
Maxime Chevallier, Bootlin (formerly Free Electrons)
Embedded Linux and kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: How to detect libbfd when building tools/bpf?
From: Quentin Monnet @ 2018-04-25 12:01 UTC (permalink / raw)
  To: shhuiw, Daniel Borkmann
  Cc: ast, kstewart, gregkh, tglx, pombredanne, netdev, acme,
	jakub.kicinski
In-Reply-To: <5ade9631.046f370a.24af5.a921SMTPIN_ADDED_BROKEN@mx.google.com>

2018-04-24 10:27 UTC+0800 ~ shhuiw <shhuiw@foxmail.com>
> 
> 
> On 04/23/18 18:05, Daniel Borkmann wrote:
>> On 04/22/2018 09:51 AM, Wang Sheng-Hui wrote:
>>> Sorry to trouble you !
>>>
>>> I run debian and installed binutils-dev beforehand.
>>> Then I copied tools/build/feature/test-libbfd.c to t.c and run:
>>> ------------------------------------------------------------------------
>>> root@lazyfintech:~# cat t.c
>>> #include <bfd.h>
>>>
>>> extern int printf(const char *format, ...);
>>>
>>> int main(void)
>>> {
>>>     char symbol[4096] = "FieldName__9ClassNameFd";
>>>     char *tmp;
>>>
>>>     tmp = bfd_demangle(0, symbol, 0);
>>>
>>>     printf("demangled symbol: {%s}\n", tmp);
- On some distributions (e.g. OpenSUSE), there is no default alias
>>>
>>>     return 0;
>>> }
>>> root@lazyfintech:~# gcc t.c -lbfd
>> Can you run with 'gcc t.c -lbfd -lz -liberty -ldl' since this is
>> what the Makefile for the feature test is doing:
>>
>>    $(OUTPUT)test-libbfd.bin:
>>          $(BUILD) -DPACKAGE='"perf"' -lbfd -lz -liberty -ldl
>>
>> Seems most likely that you're missing some of the remaining libs
>> to have the feature test properly detect it?
> I missed the lib iberty. After the lib installed, the auto-detect can work.
> 
> Thanks, Daniel!

For the record, we (at least Jakub and I) know about this issue with
libbfd detection for bpf tools. In theory the Makefile *should* refuse
to build if libbfd is not detected (since libbfd is required), but
checking the result from the detection was apparently omitted when
detection itself was added to the Makefile. This would be trivial to
patch in the Makefile, but I would like to have the detection itself
fixed before that:

- On some distributions (e.g. OpenSUSE), there is no default alias
libbfd.so to libbfd-<version>.so, which causes the static library
libbfd.a to be picked. But the latter does not include libiberty, which
must be available on the system, and called on the command line
("-liberty") for the feature to be detected (or for bpftool to be compiled).

- Some other distributions (e.g. Debian and derivatives) have this
alias, and for them it is not required to link against any version of
libiberty (it seems to be already included in libbfd.so). On such
systems, installing libiberty is not a requirement to compile the tools.

So fixing the Makefile to make libbfd detection mandatory would force
all users to install libiberty to compile bpftool, when it is actually
not required for many of them. I have proposed, internally, a couple of
patches to fix libbfd detection, but I haven't found something clean
enough so far to propose it on this list. But I intend to come back to
it and to fix it eventually, so that libbfd detection would no longer
appear as "OFF" when all needed components are present on the system.

Cheers,
Quentin

^ permalink raw reply

* Re: [PATCH net 1/3] net: mvpp2: Fix clk error path in mvpp2_probe
From: Gregory CLEMENT @ 2018-04-25 12:05 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: davem, netdev, linux-kernel, Antoine Tenart, thomas.petazzoni,
	miquel.raynal, nadavh, stefanc, ymarkman, mw, linux,
	linux-arm-kernel
In-Reply-To: <20180425110731.20153-2-maxime.chevallier@bootlin.com>

Hi Maxime,
 
 On mer., avril 25 2018, Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> When clk_prepare_enable fails for the axi_clk, the mg_clk isn't properly
> cleaned up. Add another jump label to handle that case, and make sure we
> jump to it in the later error cases.
>
> Fixes: 4792ea04bcd0 ("net: mvpp2: Fix clock resource by adding an optional bus clock")
> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>

Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>

Thanks,

Gregory

> ---
>  drivers/net/ethernet/marvell/mvpp2.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
> index 4202f9b5b966..0c2f04813d42 100644
> --- a/drivers/net/ethernet/marvell/mvpp2.c
> +++ b/drivers/net/ethernet/marvell/mvpp2.c
> @@ -8774,12 +8774,12 @@ static int mvpp2_probe(struct platform_device *pdev)
>  		if (IS_ERR(priv->axi_clk)) {
>  			err = PTR_ERR(priv->axi_clk);
>  			if (err == -EPROBE_DEFER)
> -				goto err_gop_clk;
> +				goto err_mg_clk;
>  			priv->axi_clk = NULL;
>  		} else {
>  			err = clk_prepare_enable(priv->axi_clk);
>  			if (err < 0)
> -				goto err_gop_clk;
> +				goto err_mg_clk;
>  		}
>  
>  		/* Get system's tclk rate */
> @@ -8793,7 +8793,7 @@ static int mvpp2_probe(struct platform_device *pdev)
>  	if (priv->hw_version == MVPP22) {
>  		err = dma_set_mask(&pdev->dev, MVPP2_DESC_DMA_MASK);
>  		if (err)
> -			goto err_mg_clk;
> +			goto err_axi_clk;
>  		/* Sadly, the BM pools all share the same register to
>  		 * store the high 32 bits of their address. So they
>  		 * must all have the same high 32 bits, which forces
> @@ -8801,14 +8801,14 @@ static int mvpp2_probe(struct platform_device *pdev)
>  		 */
>  		err = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
>  		if (err)
> -			goto err_mg_clk;
> +			goto err_axi_clk;
>  	}
>  
>  	/* Initialize network controller */
>  	err = mvpp2_init(pdev, priv);
>  	if (err < 0) {
>  		dev_err(&pdev->dev, "failed to initialize controller\n");
> -		goto err_mg_clk;
> +		goto err_axi_clk;
>  	}
>  
>  	/* Initialize ports */
> @@ -8821,7 +8821,7 @@ static int mvpp2_probe(struct platform_device *pdev)
>  	if (priv->port_count == 0) {
>  		dev_err(&pdev->dev, "no ports enabled\n");
>  		err = -ENODEV;
> -		goto err_mg_clk;
> +		goto err_axi_clk;
>  	}
>  
>  	/* Statistics must be gathered regularly because some of them (like
> @@ -8849,8 +8849,9 @@ static int mvpp2_probe(struct platform_device *pdev)
>  			mvpp2_port_remove(priv->port_list[i]);
>  		i++;
>  	}
> -err_mg_clk:
> +err_axi_clk:
>  	clk_disable_unprepare(priv->axi_clk);
> +err_mg_clk:
>  	if (priv->hw_version == MVPP22)
>  		clk_disable_unprepare(priv->mg_clk);
>  err_gop_clk:
> -- 
> 2.11.0
>

-- 
Gregory Clement, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com

^ permalink raw reply

* Re: [PATCH net 3/3] ARM64: dts: marvell: armada-cp110: Add clocks for the xmdio node
From: Gregory CLEMENT @ 2018-04-25 12:30 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: davem, netdev, linux-kernel, Antoine Tenart, thomas.petazzoni,
	miquel.raynal, nadavh, stefanc, ymarkman, mw, linux,
	linux-arm-kernel
In-Reply-To: <87muxrecnf.fsf@bootlin.com>

Hi agin,
 
 On mer., avril 25 2018, Gregory CLEMENT <gregory.clement@bootlin.com> wrote:

> Hi Maxime,
>  
>  On mer., avril 25 2018, Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:
>
>> The Marvell XSMI controller needs 3 clocks to operate correctly :
>>  - The MG clock (clk 5)
>>  - The MG Core clock (clk 6)
>>  - The GOP clock (clk 18)
>>
>>  This commit adds them, to avoid system hangs when using these
>>  interfaces.
>>
>> Fixes: c7e92def1ef4 ("clk: mvebu: cp110: Fix clock tree representation")
>> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
>
> This patch should not have been sent with the net prefix and should not
> be merge through the net subsystem.
>
> I will take care of it to avoid conflict in the devei tree file during
> the merge windows.


So, applied on mvebu/fixes

Thanks,

Gregory


>
> Thanks,
>
> Gregory
>
>> ---
>>  arch/arm64/boot/dts/marvell/armada-cp110.dtsi | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
>> index 6c137ac656e9..ed2f1237ea1e 100644
>> --- a/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
>> +++ b/arch/arm64/boot/dts/marvell/armada-cp110.dtsi
>> @@ -142,6 +142,8 @@
>>  			#size-cells = <0>;
>>  			compatible = "marvell,xmdio";
>>  			reg = <0x12a600 0x10>;
>> +			clocks = <&CP110_LABEL(clk) 1 5>,
>> +				 <&CP110_LABEL(clk) 1 6>, <&CP110_LABEL(clk) 1 18>;
>>  			status = "disabled";
>>  		};
>>  
>> -- 
>> 2.11.0
>>
>
> -- 
> Gregory Clement, Bootlin (formerly Free Electrons)
> Embedded Linux and Kernel engineering
> http://bootlin.com

-- 
Gregory Clement, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com

^ permalink raw reply

* IT Helpdesk! Treat Very Urgently!!!
From: Paula Peterson @ 2018-04-25 12:30 UTC (permalink / raw)
  To: Paula Peterson
In-Reply-To: <EA828B2E9707E34ABC6E5CBD7AACD681013B42C5C2@SDOEXMBX1-08E.SCCCD.NET>


Dear User,

 Please take note of this important update that our new web-mail has been improved with a new messaging system from Microsoft Exchange\Outlook which also include faster usage on email, shared calendar,web-documents also with the new Anti-spam version.

 Kindly use the link below to Switch to the New Microsoft Exchange/Outlook.

Click on Microsoft Exchange<http://2w0g99hp69vbla66jukqhmdtm.designmysite.pro/>  to Switch immediately.


© IT Technical Support
Copyright 2018, Web-mail Maintenance,
 All Rights Reserved.

^ permalink raw reply

* Re: [PATCH bpf-next 03/15] xsk: add umem fill queue support and mmap
From: Björn Töpel @ 2018-04-25 12:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Karlsson, Magnus, Duyck, Alexander H, Alexander Duyck,
	John Fastabend, Alexei Starovoitov, Jesper Dangaard Brouer,
	Willem de Bruijn, Daniel Borkmann, Netdev, michael.lundkvist,
	Brandeburg, Jesse, Singhai, Anjali, Zhang, Qi Z
In-Reply-To: <20180424021605-mutt-send-email-mst@kernel.org>

2018-04-24 1:16 GMT+02:00 Michael S. Tsirkin <mst@redhat.com>:
> On Mon, Apr 23, 2018 at 03:56:07PM +0200, Björn Töpel wrote:
>> From: Magnus Karlsson <magnus.karlsson@intel.com>
>>
>> Here, we add another setsockopt for registered user memory (umem)
>> called XDP_UMEM_FILL_QUEUE. Using this socket option, the process can
>> ask the kernel to allocate a queue (ring buffer) and also mmap it
>> (XDP_UMEM_PGOFF_FILL_QUEUE) into the process.
>>
>> The queue is used to explicitly pass ownership of umem frames from the
>> user process to the kernel. These frames will in a later patch be
>> filled in with Rx packet data by the kernel.
>>
>> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>> ---
>>  include/uapi/linux/if_xdp.h | 15 +++++++++++
>>  net/xdp/Makefile            |  2 +-
>>  net/xdp/xdp_umem.c          |  5 ++++
>>  net/xdp/xdp_umem.h          |  2 ++
>>  net/xdp/xsk.c               | 62 ++++++++++++++++++++++++++++++++++++++++++++-
>>  net/xdp/xsk_queue.c         | 58 ++++++++++++++++++++++++++++++++++++++++++
>>  net/xdp/xsk_queue.h         | 38 +++++++++++++++++++++++++++
>>  7 files changed, 180 insertions(+), 2 deletions(-)
>>  create mode 100644 net/xdp/xsk_queue.c
>>  create mode 100644 net/xdp/xsk_queue.h
>>
>> diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
>> index 41252135a0fe..975661e1baca 100644
>> --- a/include/uapi/linux/if_xdp.h
>> +++ b/include/uapi/linux/if_xdp.h
>> @@ -23,6 +23,7 @@
>>
>>  /* XDP socket options */
>>  #define XDP_UMEM_REG                 3
>> +#define XDP_UMEM_FILL_RING           4
>>
>>  struct xdp_umem_reg {
>>       __u64 addr; /* Start of packet data area */
>> @@ -31,4 +32,18 @@ struct xdp_umem_reg {
>>       __u32 frame_headroom; /* Frame head room */
>>  };
>>
>> +/* Pgoff for mmaping the rings */
>> +#define XDP_UMEM_PGOFF_FILL_RING     0x100000000
>> +
>> +struct xdp_ring {
>> +     __u32 producer __attribute__((aligned(64)));
>> +     __u32 consumer __attribute__((aligned(64)));
>> +};
>> +
>> +/* Used for the fill and completion queues for buffers */
>> +struct xdp_umem_ring {
>> +     struct xdp_ring ptrs;
>> +     __u32 desc[0] __attribute__((aligned(64)));
>> +};
>> +
>>  #endif /* _LINUX_IF_XDP_H */
>> diff --git a/net/xdp/Makefile b/net/xdp/Makefile
>> index a5d736640a0f..074fb2b2d51c 100644
>> --- a/net/xdp/Makefile
>> +++ b/net/xdp/Makefile
>> @@ -1,2 +1,2 @@
>> -obj-$(CONFIG_XDP_SOCKETS) += xsk.o xdp_umem.o
>> +obj-$(CONFIG_XDP_SOCKETS) += xsk.o xdp_umem.o xsk_queue.o
>>
>> diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
>> index bff058f5a769..6fc233e03f30 100644
>> --- a/net/xdp/xdp_umem.c
>> +++ b/net/xdp/xdp_umem.c
>> @@ -62,6 +62,11 @@ static void xdp_umem_release(struct xdp_umem *umem)
>>       struct mm_struct *mm;
>>       unsigned long diff;
>>
>> +     if (umem->fq) {
>> +             xskq_destroy(umem->fq);
>> +             umem->fq = NULL;
>> +     }
>> +
>>       if (umem->pgs) {
>>               xdp_umem_unpin_pages(umem);
>>
>> diff --git a/net/xdp/xdp_umem.h b/net/xdp/xdp_umem.h
>> index 58714f4f7f25..3086091aebdd 100644
>> --- a/net/xdp/xdp_umem.h
>> +++ b/net/xdp/xdp_umem.h
>> @@ -18,9 +18,11 @@
>>  #include <linux/mm.h>
>>  #include <linux/if_xdp.h>
>>
>> +#include "xsk_queue.h"
>>  #include "xdp_umem_props.h"
>>
>>  struct xdp_umem {
>> +     struct xsk_queue *fq;
>>       struct page **pgs;
>>       struct xdp_umem_props props;
>>       u32 npgs;
>> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
>> index 19fc719cbe0d..bf6a1151df28 100644
>> --- a/net/xdp/xsk.c
>> +++ b/net/xdp/xsk.c
>> @@ -32,6 +32,7 @@
>>  #include <linux/netdevice.h>
>>  #include <net/sock.h>
>>
>> +#include "xsk_queue.h"
>>  #include "xdp_umem.h"
>>
>>  struct xdp_sock {
>> @@ -47,6 +48,21 @@ static struct xdp_sock *xdp_sk(struct sock *sk)
>>       return (struct xdp_sock *)sk;
>>  }
>>
>> +static int xsk_init_queue(u32 entries, struct xsk_queue **queue)
>> +{
>> +     struct xsk_queue *q;
>> +
>> +     if (entries == 0 || *queue || !is_power_of_2(entries))
>> +             return -EINVAL;
>> +
>> +     q = xskq_create(entries);
>> +     if (!q)
>> +             return -ENOMEM;
>> +
>> +     *queue = q;
>> +     return 0;
>> +}
>> +
>>  static int xsk_release(struct socket *sock)
>>  {
>>       struct sock *sk = sock->sk;
>> @@ -109,6 +125,23 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
>>               mutex_unlock(&xs->mutex);
>>               return 0;
>>       }
>> +     case XDP_UMEM_FILL_RING:
>> +     {
>> +             struct xsk_queue **q;
>> +             int entries;
>> +
>> +             if (!xs->umem)
>> +                     return -EINVAL;
>> +
>> +             if (copy_from_user(&entries, optval, sizeof(entries)))
>> +                     return -EFAULT;
>> +
>> +             mutex_lock(&xs->mutex);
>> +             q = &xs->umem->fq;
>> +             err = xsk_init_queue(entries, q);
>> +             mutex_unlock(&xs->mutex);
>> +             return err;
>> +     }
>>       default:
>>               break;
>>       }
>> @@ -116,6 +149,33 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
>>       return -ENOPROTOOPT;
>>  }
>>
>> +static int xsk_mmap(struct file *file, struct socket *sock,
>> +                 struct vm_area_struct *vma)
>> +{
>> +     unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
>> +     unsigned long size = vma->vm_end - vma->vm_start;
>> +     struct xdp_sock *xs = xdp_sk(sock->sk);
>> +     struct xsk_queue *q;
>> +     unsigned long pfn;
>> +     struct page *qpg;
>> +
>> +     if (!xs->umem)
>> +             return -EINVAL;
>> +
>> +     if (offset == XDP_UMEM_PGOFF_FILL_RING)
>> +             q = xs->umem->fq;
>> +     else
>> +             return -EINVAL;
>> +
>> +     qpg = virt_to_head_page(q->ring);
>> +     if (size > (PAGE_SIZE << compound_order(qpg)))
>> +             return -EINVAL;
>> +
>> +     pfn = virt_to_phys(q->ring) >> PAGE_SHIFT;
>> +     return remap_pfn_range(vma, vma->vm_start, pfn,
>> +                            size, vma->vm_page_prot);
>> +}
>> +
>>  static struct proto xsk_proto = {
>>       .name =         "XDP",
>>       .owner =        THIS_MODULE,
>> @@ -139,7 +199,7 @@ static const struct proto_ops xsk_proto_ops = {
>>       .getsockopt =   sock_no_getsockopt,
>>       .sendmsg =      sock_no_sendmsg,
>>       .recvmsg =      sock_no_recvmsg,
>> -     .mmap =         sock_no_mmap,
>> +     .mmap =         xsk_mmap,
>>       .sendpage =     sock_no_sendpage,
>>  };
>>
>> diff --git a/net/xdp/xsk_queue.c b/net/xdp/xsk_queue.c
>> new file mode 100644
>> index 000000000000..23da4f29d3fb
>> --- /dev/null
>> +++ b/net/xdp/xsk_queue.c
>> @@ -0,0 +1,58 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* XDP user-space ring structure
>> + * Copyright(c) 2018 Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +
>> +#include <linux/slab.h>
>> +
>> +#include "xsk_queue.h"
>> +
>> +static u32 xskq_umem_get_ring_size(struct xsk_queue *q)
>> +{
>> +     return sizeof(struct xdp_umem_ring) + q->nentries * sizeof(u32);
>> +}
>> +
>> +struct xsk_queue *xskq_create(u32 nentries)
>> +{
>> +     struct xsk_queue *q;
>> +     gfp_t gfp_flags;
>> +     size_t size;
>> +
>> +     q = kzalloc(sizeof(*q), GFP_KERNEL);
>> +     if (!q)
>> +             return NULL;
>> +
>> +     q->nentries = nentries;
>> +     q->ring_mask = nentries - 1;
>> +
>> +     gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN |
>> +                 __GFP_COMP  | __GFP_NORETRY;
>> +     size = xskq_umem_get_ring_size(q);
>> +
>> +     q->ring = (struct xdp_ring *)__get_free_pages(gfp_flags,
>> +                                                   get_order(size));
>> +     if (!q->ring) {
>> +             kfree(q);
>> +             return NULL;
>> +     }
>> +
>> +     return q;
>> +}
>> +
>> +void xskq_destroy(struct xsk_queue *q)
>> +{
>> +     if (!q)
>> +             return;
>> +
>> +     page_frag_free(q->ring);
>> +     kfree(q);
>> +}
>> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
>> new file mode 100644
>> index 000000000000..7eb556bf73be
>> --- /dev/null
>> +++ b/net/xdp/xsk_queue.h
>> @@ -0,0 +1,38 @@
>> +/* SPDX-License-Identifier: GPL-2.0
>> + * XDP user-space ring structure
>> + * Copyright(c) 2018 Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + */
>> +
>> +#ifndef _LINUX_XSK_QUEUE_H
>> +#define _LINUX_XSK_QUEUE_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/if_xdp.h>
>> +
>> +#include "xdp_umem_props.h"
>> +
>> +struct xsk_queue {
>> +     struct xdp_umem_props umem_props;
>> +     u32 ring_mask;
>> +     u32 nentries;
>> +     u32 prod_head;
>> +     u32 prod_tail;
>> +     u32 cons_head;
>> +     u32 cons_tail;
>> +     struct xdp_ring *ring;
>> +     u64 invalid_descs;
>> +};
>
> Any documentation on how e.g. the locking works here?
>

It's a SPSC queue. On the kernel side we guarantee synchronization via
the NAPI context. As for the user-space application, it's the
responsibility of the application to do the synchronization.

Even though the xsk_queue structure has both cons/prod members, for a
certain queue, only prod_ *or* cons_ will be used. For the kernel it
means that it will consume from fill and tx queues, and produce to
completion and rx queues.

Note that prod_/cons_ are the *cached* local variables, the actual
produce/consumer pointers reside in the kernel/user shared xdp_ring.

I'll try to make it clearer in the documentation!

>
>> +
>> +struct xsk_queue *xskq_create(u32 nentries);
>> +void xskq_destroy(struct xsk_queue *q);
>> +
>> +#endif /* _LINUX_XSK_QUEUE_H */
>> --
>> 2.14.1

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox