All of lore.kernel.org
 help / color / mirror / Atom feed
From: dave.martin@linaro.org (Dave Martin)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2] ARM: net: JIT compiler for packet filters
Date: Mon, 19 Dec 2011 18:18:39 +0000	[thread overview]
Message-ID: <20111219181839.GH2031@linaro.org> (raw)
In-Reply-To: <20111219164513.GA25105@swarm.cs.pub.ro>

On Mon, Dec 19, 2011 at 06:45:13PM +0200, Mircea Gherzan wrote:
> Hi,
> 
> On Mon, Dec 19, 2011 at 12:50:21PM +0000, Dave Martin wrote:
> > On Mon, Dec 19, 2011 at 09:40:30AM +0100, Mircea Gherzan wrote:
> > > Based of Matt Evans's PPC64 implementation.
> > > 
> > > Supports only ARM mode with EABI.
> > > 
> > > Supports both little and big endian. Depends on the support for
> > > unaligned loads on ARMv7. Does not support all the BPF opcodes
> > > that deal with ancillary data. The scratch memory of the filter
> > > lives on the stack.
> > > 
> > > Enabled in the same way as for x86-64 and PPC64:
> > > 
> > > 	echo 1 > /proc/sys/net/core/bpf_jit_enable
> > > 
> > > A value greater than 1 enables opcode output.
> > > 
> > > Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
> > > ---
> > 
> > Interesting patch... I haven't reviewed in detail, but I have a few
> > quick comments.
> > 
> > > 
> > > Changes in v2:
> > >  * enable the compiler ony for ARMv5+ because of the BLX instruction
> > >  * use the same comparison for the ARM version checks
> > >  * use misaligned accesses on ARMv6
> > 
> > You probably want to change the commit message now to reflect this.
> 
> Will do in the next version.
> 
> > 
> > >  * fix the SEEN_MEM
> > >  * fix the mem_words_used()
> > > 
> > >  arch/arm/Kconfig          |    1 +
> > >  arch/arm/Makefile         |    1 +
> > >  arch/arm/net/Makefile     |    3 +
> > >  arch/arm/net/bpf_jit_32.c |  838 +++++++++++++++++++++++++++++++++++++++++++++
> > >  arch/arm/net/bpf_jit_32.h |  174 ++++++++++
> > >  5 files changed, 1017 insertions(+), 0 deletions(-)
> > >  create mode 100644 arch/arm/net/Makefile
> > >  create mode 100644 arch/arm/net/bpf_jit_32.c
> > >  create mode 100644 arch/arm/net/bpf_jit_32.h
> > > 
> > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > > index abba5b8..ea65c41 100644
> > > --- a/arch/arm/Kconfig
> > > +++ b/arch/arm/Kconfig
> > > @@ -30,6 +30,7 @@ config ARM
> > >  	select HAVE_SPARSE_IRQ
> > >  	select GENERIC_IRQ_SHOW
> > >  	select CPU_PM if (SUSPEND || CPU_IDLE)
> > > +	select HAVE_BPF_JIT if (!THUMB2_KERNEL && AEABI)
> > 
> > Have to tried your code with a Thumb-2 kernel?
> 
> Not yet.
> 
> > Quickly skimming though your patch, I don't see an obvious reason why we
> > can't have that working, though I haven't tried it yet.
> > 
> > Note that it's fine to have the JIT generating ARM code, even if the rest
> > if the kernel is Thumb-2.  This would only start to cause problems if we
> > want to do things like set kprobes in the JITted code, or unwind out of
> > the JITted code.
> > 
> > It's just necessary to make sure that calls/returns into/out of the
> > JITted code are handled correctly.  You don't seem to do any scary
> > arithmetic or mov to or from pc or lr, and it doesn't look like you ever
> > call back into the kernel from JITted code, so the implementation is
> > probably safe for ARM/Thumb interworking already (if I've understood
> > correctly).
> 
> The JITed code calls back to the kernel for the load helpers. So setting
> bit 0 is required.

When you take the address of a link-time external function symbol,
bit[0] in the address will automatically be set appropriately by the
linker to indicate the target instruction set -- you already use BX/BLX
to jump to such symbols, so you should switch correctly when calling
_to_ the kernel.

Returns should also work, except for old-style "mov pc,lr" returns made
in Thumb code (from ARM code, this magically works for >= v7).  Such returns
only happen in hand-written assembler: for C code, the compiler always
generates proper AEABI-compliant return sequences.

So, for calling load_func[], jit_get_skb_b etc. (which are C functions),
there should be no problem.

I think the only code which you call from the JIT output but which does
not return compliantly is __aeabi_uidiv() in arch/arm/lib/lib1funcs.S.


I have a quick hacked-up patch (below) which attempts to fix this;
I'd be interested if this works for you  -- but finalising your ARM-only
version of the patch should still be the priority.

If this fix does work, I'll turn it into a proper patch, as we can maybe
use it more widely.

[...]

> > > +		case BPF_S_ALU_DIV_X:
> > > +			ctx->seen |= SEEN_X;
> > > +			emit(ARM_CMP_I(r_X, 0), ctx);
> > > +			emit_err_ret(ARM_COND_EQ, ctx);
> > > +			emit(ARM_MOV_R(ARM_R1, r_X), ctx);
> > > +div:
> > > +			ctx->seen |= SEEN_CALL;
> > > +
> > > +			emit(ARM_MOV_R(ARM_R0, r_A), ctx);
> > > +			emit_mov_i(r_scratch, (u32)__aeabi_uidiv, ctx);
> > > +			emit(ARM_BLX_R(r_scratch), ctx);
> > > +			emit(ARM_MOV_R(r_A, ARM_R0), ctx);
> > > +			break;
> > 
> > I don't know how much division is used by the packet filter JIT.  If
> > it gets used a significant amount, you might want to support hardware
> > divide for CPUs that have it:
> 
> Division rarely appears in "normal" BPF filters: it must be an explicit
> part of the human-readable filter expression (the BPF compiler does not
> generate division opcodes in other cases, AFAICT). Nonetheless, support
> for hardware division would spare a bit of stack space for filters like
> "len / 100 == 1".
> 
> > Cortex-A15 and later processors may have hardware integer divide
> > support.  You can check for its availability at runtime using by testing
> > the HWCAP_IDIVA (for ARM) or HWCAP_IDIVT (for Thumb) bits in elf_hwcap
> > (see arch/arm/include/asm/hwcap.h).
> 
> I will include this in the next version of the patch.

Ok, cool

Cheers
---Dave

WARNING: multiple messages have this Message-ID (diff)
From: Dave Martin <dave.martin@linaro.org>
To: Mircea Gherzan <mgherzan@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org
Subject: Re: [PATCH v2] ARM: net: JIT compiler for packet filters
Date: Mon, 19 Dec 2011 18:18:39 +0000	[thread overview]
Message-ID: <20111219181839.GH2031@linaro.org> (raw)
In-Reply-To: <20111219164513.GA25105@swarm.cs.pub.ro>

On Mon, Dec 19, 2011 at 06:45:13PM +0200, Mircea Gherzan wrote:
> Hi,
> 
> On Mon, Dec 19, 2011 at 12:50:21PM +0000, Dave Martin wrote:
> > On Mon, Dec 19, 2011 at 09:40:30AM +0100, Mircea Gherzan wrote:
> > > Based of Matt Evans's PPC64 implementation.
> > > 
> > > Supports only ARM mode with EABI.
> > > 
> > > Supports both little and big endian. Depends on the support for
> > > unaligned loads on ARMv7. Does not support all the BPF opcodes
> > > that deal with ancillary data. The scratch memory of the filter
> > > lives on the stack.
> > > 
> > > Enabled in the same way as for x86-64 and PPC64:
> > > 
> > > 	echo 1 > /proc/sys/net/core/bpf_jit_enable
> > > 
> > > A value greater than 1 enables opcode output.
> > > 
> > > Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
> > > ---
> > 
> > Interesting patch... I haven't reviewed in detail, but I have a few
> > quick comments.
> > 
> > > 
> > > Changes in v2:
> > >  * enable the compiler ony for ARMv5+ because of the BLX instruction
> > >  * use the same comparison for the ARM version checks
> > >  * use misaligned accesses on ARMv6
> > 
> > You probably want to change the commit message now to reflect this.
> 
> Will do in the next version.
> 
> > 
> > >  * fix the SEEN_MEM
> > >  * fix the mem_words_used()
> > > 
> > >  arch/arm/Kconfig          |    1 +
> > >  arch/arm/Makefile         |    1 +
> > >  arch/arm/net/Makefile     |    3 +
> > >  arch/arm/net/bpf_jit_32.c |  838 +++++++++++++++++++++++++++++++++++++++++++++
> > >  arch/arm/net/bpf_jit_32.h |  174 ++++++++++
> > >  5 files changed, 1017 insertions(+), 0 deletions(-)
> > >  create mode 100644 arch/arm/net/Makefile
> > >  create mode 100644 arch/arm/net/bpf_jit_32.c
> > >  create mode 100644 arch/arm/net/bpf_jit_32.h
> > > 
> > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > > index abba5b8..ea65c41 100644
> > > --- a/arch/arm/Kconfig
> > > +++ b/arch/arm/Kconfig
> > > @@ -30,6 +30,7 @@ config ARM
> > >  	select HAVE_SPARSE_IRQ
> > >  	select GENERIC_IRQ_SHOW
> > >  	select CPU_PM if (SUSPEND || CPU_IDLE)
> > > +	select HAVE_BPF_JIT if (!THUMB2_KERNEL && AEABI)
> > 
> > Have to tried your code with a Thumb-2 kernel?
> 
> Not yet.
> 
> > Quickly skimming though your patch, I don't see an obvious reason why we
> > can't have that working, though I haven't tried it yet.
> > 
> > Note that it's fine to have the JIT generating ARM code, even if the rest
> > if the kernel is Thumb-2.  This would only start to cause problems if we
> > want to do things like set kprobes in the JITted code, or unwind out of
> > the JITted code.
> > 
> > It's just necessary to make sure that calls/returns into/out of the
> > JITted code are handled correctly.  You don't seem to do any scary
> > arithmetic or mov to or from pc or lr, and it doesn't look like you ever
> > call back into the kernel from JITted code, so the implementation is
> > probably safe for ARM/Thumb interworking already (if I've understood
> > correctly).
> 
> The JITed code calls back to the kernel for the load helpers. So setting
> bit 0 is required.

When you take the address of a link-time external function symbol,
bit[0] in the address will automatically be set appropriately by the
linker to indicate the target instruction set -- you already use BX/BLX
to jump to such symbols, so you should switch correctly when calling
_to_ the kernel.

Returns should also work, except for old-style "mov pc,lr" returns made
in Thumb code (from ARM code, this magically works for >= v7).  Such returns
only happen in hand-written assembler: for C code, the compiler always
generates proper AEABI-compliant return sequences.

So, for calling load_func[], jit_get_skb_b etc. (which are C functions),
there should be no problem.

I think the only code which you call from the JIT output but which does
not return compliantly is __aeabi_uidiv() in arch/arm/lib/lib1funcs.S.


I have a quick hacked-up patch (below) which attempts to fix this;
I'd be interested if this works for you  -- but finalising your ARM-only
version of the patch should still be the priority.

If this fix does work, I'll turn it into a proper patch, as we can maybe
use it more widely.

[...]

> > > +		case BPF_S_ALU_DIV_X:
> > > +			ctx->seen |= SEEN_X;
> > > +			emit(ARM_CMP_I(r_X, 0), ctx);
> > > +			emit_err_ret(ARM_COND_EQ, ctx);
> > > +			emit(ARM_MOV_R(ARM_R1, r_X), ctx);
> > > +div:
> > > +			ctx->seen |= SEEN_CALL;
> > > +
> > > +			emit(ARM_MOV_R(ARM_R0, r_A), ctx);
> > > +			emit_mov_i(r_scratch, (u32)__aeabi_uidiv, ctx);
> > > +			emit(ARM_BLX_R(r_scratch), ctx);
> > > +			emit(ARM_MOV_R(r_A, ARM_R0), ctx);
> > > +			break;
> > 
> > I don't know how much division is used by the packet filter JIT.  If
> > it gets used a significant amount, you might want to support hardware
> > divide for CPUs that have it:
> 
> Division rarely appears in "normal" BPF filters: it must be an explicit
> part of the human-readable filter expression (the BPF compiler does not
> generate division opcodes in other cases, AFAICT). Nonetheless, support
> for hardware division would spare a bit of stack space for filters like
> "len / 100 == 1".
> 
> > Cortex-A15 and later processors may have hardware integer divide
> > support.  You can check for its availability at runtime using by testing
> > the HWCAP_IDIVA (for ARM) or HWCAP_IDIVT (for Thumb) bits in elf_hwcap
> > (see arch/arm/include/asm/hwcap.h).
> 
> I will include this in the next version of the patch.

Ok, cool

Cheers
---Dave

  reply	other threads:[~2011-12-19 18:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-19  8:40 [PATCH v2] ARM: net: JIT compiler for packet filters Mircea Gherzan
2011-12-19  8:40 ` Mircea Gherzan
2011-12-19 12:50 ` Dave Martin
2011-12-19 12:50   ` Dave Martin
2011-12-19 15:19   ` Uwe Kleine-König
2011-12-19 15:19     ` Uwe Kleine-König
2011-12-19 15:24     ` Dave Martin
2011-12-19 15:24       ` Dave Martin
2011-12-19 15:33       ` Uwe Kleine-König
2011-12-19 15:33         ` Uwe Kleine-König
2011-12-19 15:52         ` Dave Martin
2011-12-19 15:52           ` Dave Martin
2011-12-19 17:14           ` Mircea Gherzan
2011-12-19 17:14             ` Mircea Gherzan
2011-12-19 17:39           ` Catalin Marinas
2011-12-19 17:39             ` Catalin Marinas
2011-12-19 16:45   ` Mircea Gherzan
2011-12-19 16:45     ` Mircea Gherzan
2011-12-19 18:18     ` Dave Martin [this message]
2011-12-19 18:18       ` Dave Martin
2011-12-19 18:29       ` Dave Martin
2011-12-19 18:29         ` Dave Martin
2011-12-21 14:59         ` Mircea Gherzan
2011-12-21 14:59           ` Mircea Gherzan
2011-12-22 17:00           ` Dave Martin
2011-12-22 17:00             ` Dave Martin
2011-12-22 17:34             ` David Laight
2011-12-22 17:34               ` David Laight
2011-12-23  2:26               ` Nicolas Pitre
2011-12-23  2:26                 ` Nicolas Pitre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111219181839.GH2031@linaro.org \
    --to=dave.martin@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.