* Re: [PATCH net-next 08/18] net: dsa: mv88e6xxx: move generic VTU GetNext
From: Vivien Didelot @ 2017-04-28 15:07 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, linux-kernel, kernel, David S. Miller, Florian Fainelli
In-Reply-To: <20170427185912.GK17364@lunn.ch>
Hi Andrew,
Andrew Lunn <andrew@lunn.ch> writes:
>> + /* Write the VID to iterate from only once */
>> + if (!entry->valid) {
>> + err = mv88e6xxx_g1_vtu_vid_write(chip, entry);
>> + if (err)
>> + return err;
>> + }
>
> Please could you add a bigger comment. It is not clear why you write
> it, when it is invalid. That just seems wrong, and needs a good
> comment to explain why it is correct, more than what you currently
> have as a comment.
This trick could indeed benefit from a better explanation. The reason for it
is that I used the same comment as the ATU GetNext implementation, i.e.:
/* Write the MAC address to iterate from only once */
if (entry->state == GLOBAL_ATU_DATA_STATE_UNUSED) {
err = mv88e6xxx_g1_atu_mac_write(chip, entry);
if (err)
return err;
}
I suggest that I send a follow-up patch to improve the comments of both
GetNext implementations (ATU and VTU) at the same time.
Thanks,
Vivien
^ permalink raw reply
* Re: arm64: next-20170428 hangs on boot
From: Yury Norov @ 2017-04-28 15:09 UTC (permalink / raw)
To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, netdev, davem
In-Reply-To: <20170428145233.GB5292@leverpostej>
On Fri, Apr 28, 2017 at 03:52:34PM +0100, Mark Rutland wrote:
> On Fri, Apr 28, 2017 at 04:24:29PM +0300, Yury Norov wrote:
> > Hi all,
>
> Hi,
>
> [adding Dave Miller, netdev, lkml]
thanks
> > On QEMU the next-20170428 hangs on boot for me due to kernel panic in
> > rtnetlink_init():
> >
> > void __init rtnetlink_init(void)
> > {
> > if (register_pernet_subsys(&rtnetlink_net_ops))
> > panic("rtnetlink_init: cannot initialize rtnetlink\n");
> >
> > ...
> > }
>
> I see the same thing with a next-20170428 arm64 defconfig, on a Juno R1
> system:
>
> [ 0.531949] Kernel panic - not syncing: rtnetlink_init: cannot initialize rtnetlink
> [ 0.531949]
> [ 0.541271] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc8-next-20170428-00002-g6ee3799 #10
> [ 0.550307] Hardware name: ARM Juno development board (r1) (DT)
> [ 0.556332] Call trace:
> [ 0.558833] [<ffff000008088538>] dump_backtrace+0x0/0x238
> [ 0.564332] [<ffff000008088834>] show_stack+0x14/0x20
> [ 0.569477] [<ffff00000839dd54>] dump_stack+0x9c/0xc0
> [ 0.574622] [<ffff000008175344>] panic+0x11c/0x28c
> [ 0.579505] [<ffff000008d80034>] rtnetlink_init+0x2c/0x1d0
> [ 0.585092] [<ffff000008d8047c>] netlink_proto_init+0x14c/0x17c
> [ 0.591119] [<ffff000008083150>] do_one_initcall+0x38/0x120
> [ 0.596796] [<ffff000008d30d00>] kernel_init_freeable+0x1a0/0x240
> [ 0.603003] [<ffff00000892a790>] kernel_init+0x10/0x100
> [ 0.608324] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> [ 0.613736] SMP: stopping secondary CPUs
> [ 0.617738] ---[ end Kernel panic - not syncing: rtnetlink_init: cannot initialize rtnetlink
>
> If this isn't a known issue, it would be worth trying to bisect this.
The exact function that fails is:
include/linux/rhashtable.h
static inline void *__rhashtable_insert_fast(
struct rhashtable *ht, const void *key, struct rhash_head *obj,
const struct rhashtable_params params, bool rhlist)
{
...
data = ERR_PTR(-E2BIG);
if (unlikely(rht_grow_above_max(ht, tbl)))
goto out;
...
out:
spin_unlock_bh(lock);
rcu_read_unlock();
return data;
}
And the backtrace:
#0 __rhashtable_insert_fast (rhlist=<optimized out>, params=..., obj=<optimized out>,
key=<optimized out>, ht=<optimized out>) at ./include/linux/rhashtable.h:803
#1 rhashtable_lookup_insert_key (params=..., obj=<optimized out>, key=<optimized out>,
ht=<optimized out>) at ./include/linux/rhashtable.h:980
#2 __netlink_insert (sk=<optimized out>, table=<optimized out>) at net/netlink/af_netlink.c:484
#3 netlink_insert (sk=0xffff80003da85000, portid=0) at net/netlink/af_netlink.c:548
#4 0xffff00000876c5a0 in __netlink_kernel_create (net=<optimized out>, unit=0, module=0x0,
cfg=0xffff80003d84fc60) at net/netlink/af_netlink.c:1996
#5 0xffff000008756704 in netlink_kernel_create (cfg=<optimized out>, unit=<optimized out>,
net=<optimized out>) at ./include/linux/netlink.h:62
#6 rtnetlink_net_init (net=0xffff000008c7c100 <init_net>) at net/core/rtnetlink.c:4175
#7 0xffff000008737a2c in ops_init (ops=0xffff000008c7e268 <rtnetlink_net_ops>,
net=0xffff000008c7c100 <init_net>) at net/core/net_namespace.c:117
#8 0xffff000008738704 in __register_pernet_operations (ops=<optimized out>,
list=<optimized out>) at net/core/net_namespace.c:818
#9 register_pernet_operations (list=<optimized out>, ops=0xffff000008c7e268
<rtnetlink_net_ops>) at net/core/net_namespace.c:892
#10 0xffff0000087387fc in register_pernet_subsys (ops=0xffff000008c7e268
<rtnetlink_net_ops>) at net/core/net_namespace.c:934
#11 0xffff000008b5b9b8 in rtnetlink_init () at net/core/rtnetlink.c:4195
#12 0xffff000008b5be08 in netlink_proto_init () at net/netlink/af_netlink.c:2730
#13 0xffff000008083158 in do_one_initcall (fn=0xffff000008b5bcc4 <netlink_proto_init>) at init/main.c:795
#14 0xffff000008b20d04 in do_initcall_level (level=<optimized out>) at init/main.c:861
#15 do_initcalls () at init/main.c:869
#16 do_basic_setup () at init/main.c:887
Yury
^ permalink raw reply
* Re: [PATCH] [4.11 regression] cpsw/netcp: refine cpts dependency
From: Tony Lindgren @ 2017-04-28 15:15 UTC (permalink / raw)
To: Arnd Bergmann
Cc: David S . Miller, Grygorii Strashko, Nicolas Pitre, WingMan Kwok,
netdev, linux-kernel, linux-omap
In-Reply-To: <20170428150358.1537030-1-arnd@arndb.de>
* Arnd Bergmann <arnd@arndb.de> [170428 08:06]:
> Tony Lindgren reports a kernel oops that resulted from my compile-time
> fix on the default config. This shows two problems:
>
> a) configurations that did not already enable PTP_1588_CLOCK will
> now miss the cpts driver
>
> b) when cpts support is disabled, the driver crashes. This is a
> preexisting problem that we did not notice before my patch.
>
> While the second problem is still being investigated, this modifies
> the dependencies again, getting us back to the original state, with
> another 'select NET_PTP_CLASSIFY' added in to avoid the original
> link error we got, and the 'depends on POSIX_TIMERS' to hide
> the CPTS support when turning it on would be useless.
>
> Cc: stable@vger.kernel.org # 4.11 needs this
> Fixes: 07fef3623407 ("cpsw/netcp: cpts depends on posix_timers")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Adding the Cc: stable in case this doesn't make it into 4.11
> any more.
Thanks, this fixes the oops I'm seeing:
Tested-by: Tony Lindgren <tony@atomide.com>
> drivers/net/ethernet/ti/Kconfig | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
> index 9e631952b86f..48a541eb0af2 100644
> --- a/drivers/net/ethernet/ti/Kconfig
> +++ b/drivers/net/ethernet/ti/Kconfig
> @@ -76,7 +76,7 @@ config TI_CPSW
> config TI_CPTS
> bool "TI Common Platform Time Sync (CPTS) Support"
> depends on TI_CPSW || TI_KEYSTONE_NETCP
> - depends on PTP_1588_CLOCK
> + depends on POSIX_TIMERS
> ---help---
> This driver supports the Common Platform Time Sync unit of
> the CPSW Ethernet Switch and Keystone 2 1g/10g Switch Subsystem.
> @@ -87,6 +87,8 @@ config TI_CPTS_MOD
> tristate
> depends on TI_CPTS
> default y if TI_CPSW=y || TI_KEYSTONE_NETCP=y
> + select NET_PTP_CLASSIFY
> + imply PTP_1588_CLOCK
> default m
>
> config TI_KEYSTONE_NETCP
> --
> 2.9.0
>
^ permalink raw reply
* Re: [PATCH net v2] net: dev: Fix possible memleaks when fail to register_netdevice
From: David Ahern @ 2017-04-28 15:22 UTC (permalink / raw)
To: gfree.wind, jiri, davem, kuznet, jmorris, yoshfuji, kaber,
steffen.klassert, herbert, netdev
Cc: Gao Feng
In-Reply-To: <1493349568-58635-1-git-send-email-gfree.wind@foxmail.com>
On 4/27/17 9:19 PM, gfree.wind@foxmail.com wrote:
> drivers/net/dummy.c | 14 +++++++++++---
> drivers/net/ifb.c | 33 +++++++++++++++++++++++----------
> drivers/net/loopback.c | 15 ++++++++++++++-
> drivers/net/team/team.c | 15 ++++++++++++---
> drivers/net/veth.c | 15 ++++++++++++++-
> net/8021q/vlan_dev.c | 17 +++++++++++++----
> net/ipv4/ip_tunnel.c | 11 ++++++++++-
> net/ipv6/ip6_gre.c | 18 ++++++++++++++----
> net/ipv6/ip6_tunnel.c | 11 ++++++++++-
> net/ipv6/ip6_vti.c | 11 ++++++++++-
> net/ipv6/sit.c | 17 +++++++++++++----
> 11 files changed, 144 insertions(+), 33 deletions(-)
Seems like the changes to each file are completely independent, so they
should be in separate patches making each easier to review.
^ permalink raw reply
* RE: [PATCH net v2] net: dev: Fix possible memleaks when fail to register_netdevice
From: Gao Feng @ 2017-04-28 15:34 UTC (permalink / raw)
To: 'David Ahern', gfree.wind, jiri, davem, kuznet, jmorris,
yoshfuji, kaber, steffen.klassert, herbert, netdev
In-Reply-To: <7d305254-03c5-0e6a-fac9-e9f4c758bf77@gmail.com>
> From: David Ahern [mailto:dsahern@gmail.com]
> Sent: Friday, April 28, 2017 11:23 PM
> On 4/27/17 9:19 PM, gfree.wind@foxmail.com wrote:
> > drivers/net/dummy.c | 14 +++++++++++---
> > drivers/net/ifb.c | 33 +++++++++++++++++++++++----------
> > drivers/net/loopback.c | 15 ++++++++++++++-
> > drivers/net/team/team.c | 15 ++++++++++++---
> > drivers/net/veth.c | 15 ++++++++++++++-
> > net/8021q/vlan_dev.c | 17 +++++++++++++----
> > net/ipv4/ip_tunnel.c | 11 ++++++++++-
> > net/ipv6/ip6_gre.c | 18 ++++++++++++++----
> > net/ipv6/ip6_tunnel.c | 11 ++++++++++-
> > net/ipv6/ip6_vti.c | 11 ++++++++++-
> > net/ipv6/sit.c | 17 +++++++++++++----
> > 11 files changed, 144 insertions(+), 33 deletions(-)
>
> Seems like the changes to each file are completely independent, so they should
> be in separate patches making each easier to review.
Thanks, I will split them into multiple patches.
Best Regards
Feng
^ permalink raw reply
* Re: arm64: next-20170428 hangs on boot
From: Florian Fainelli @ 2017-04-28 15:40 UTC (permalink / raw)
To: Yury Norov, Mark Rutland; +Cc: netdev, linux-kernel, linux-arm-kernel, davem
In-Reply-To: <20170428150931.iyoewmc5qs55aw3t@yury-N73SV>
On 04/28/2017 08:09 AM, Yury Norov wrote:
> On Fri, Apr 28, 2017 at 03:52:34PM +0100, Mark Rutland wrote:
>> On Fri, Apr 28, 2017 at 04:24:29PM +0300, Yury Norov wrote:
>>> Hi all,
>>
>> Hi,
>>
>> [adding Dave Miller, netdev, lkml]
>
> thanks
>
>>> On QEMU the next-20170428 hangs on boot for me due to kernel panic in
>>> rtnetlink_init():
>>>
>>> void __init rtnetlink_init(void)
>>> {
>>> if (register_pernet_subsys(&rtnetlink_net_ops))
>>> panic("rtnetlink_init: cannot initialize rtnetlink\n");
>>>
>>> ...
>>> }
>>
>> I see the same thing with a next-20170428 arm64 defconfig, on a Juno R1
>> system:
>>
>> [ 0.531949] Kernel panic - not syncing: rtnetlink_init: cannot initialize rtnetlink
>> [ 0.531949]
>> [ 0.541271] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc8-next-20170428-00002-g6ee3799 #10
>> [ 0.550307] Hardware name: ARM Juno development board (r1) (DT)
>> [ 0.556332] Call trace:
>> [ 0.558833] [<ffff000008088538>] dump_backtrace+0x0/0x238
>> [ 0.564332] [<ffff000008088834>] show_stack+0x14/0x20
>> [ 0.569477] [<ffff00000839dd54>] dump_stack+0x9c/0xc0
>> [ 0.574622] [<ffff000008175344>] panic+0x11c/0x28c
>> [ 0.579505] [<ffff000008d80034>] rtnetlink_init+0x2c/0x1d0
>> [ 0.585092] [<ffff000008d8047c>] netlink_proto_init+0x14c/0x17c
>> [ 0.591119] [<ffff000008083150>] do_one_initcall+0x38/0x120
>> [ 0.596796] [<ffff000008d30d00>] kernel_init_freeable+0x1a0/0x240
>> [ 0.603003] [<ffff00000892a790>] kernel_init+0x10/0x100
>> [ 0.608324] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
>> [ 0.613736] SMP: stopping secondary CPUs
>> [ 0.617738] ---[ end Kernel panic - not syncing: rtnetlink_init: cannot initialize rtnetlink
>>
>> If this isn't a known issue, it would be worth trying to bisect this.
It's fixed already by this commit in net-next:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=2d2ab658d2debcb4c0e29c9e6f18e5683f3077bf
>
> The exact function that fails is:
> include/linux/rhashtable.h
> static inline void *__rhashtable_insert_fast(
> struct rhashtable *ht, const void *key, struct rhash_head *obj,
> const struct rhashtable_params params, bool rhlist)
> {
> ...
>
> data = ERR_PTR(-E2BIG);
> if (unlikely(rht_grow_above_max(ht, tbl)))
> goto out;
> ...
>
> out:
> spin_unlock_bh(lock);
> rcu_read_unlock();
>
> return data;
> }
>
> And the backtrace:
> #0 __rhashtable_insert_fast (rhlist=<optimized out>, params=..., obj=<optimized out>,
> key=<optimized out>, ht=<optimized out>) at ./include/linux/rhashtable.h:803
> #1 rhashtable_lookup_insert_key (params=..., obj=<optimized out>, key=<optimized out>,
> ht=<optimized out>) at ./include/linux/rhashtable.h:980
> #2 __netlink_insert (sk=<optimized out>, table=<optimized out>) at net/netlink/af_netlink.c:484
> #3 netlink_insert (sk=0xffff80003da85000, portid=0) at net/netlink/af_netlink.c:548
> #4 0xffff00000876c5a0 in __netlink_kernel_create (net=<optimized out>, unit=0, module=0x0,
> cfg=0xffff80003d84fc60) at net/netlink/af_netlink.c:1996
> #5 0xffff000008756704 in netlink_kernel_create (cfg=<optimized out>, unit=<optimized out>,
> net=<optimized out>) at ./include/linux/netlink.h:62
> #6 rtnetlink_net_init (net=0xffff000008c7c100 <init_net>) at net/core/rtnetlink.c:4175
> #7 0xffff000008737a2c in ops_init (ops=0xffff000008c7e268 <rtnetlink_net_ops>,
> net=0xffff000008c7c100 <init_net>) at net/core/net_namespace.c:117
> #8 0xffff000008738704 in __register_pernet_operations (ops=<optimized out>,
> list=<optimized out>) at net/core/net_namespace.c:818
> #9 register_pernet_operations (list=<optimized out>, ops=0xffff000008c7e268
> <rtnetlink_net_ops>) at net/core/net_namespace.c:892
> #10 0xffff0000087387fc in register_pernet_subsys (ops=0xffff000008c7e268
> <rtnetlink_net_ops>) at net/core/net_namespace.c:934
> #11 0xffff000008b5b9b8 in rtnetlink_init () at net/core/rtnetlink.c:4195
> #12 0xffff000008b5be08 in netlink_proto_init () at net/netlink/af_netlink.c:2730
> #13 0xffff000008083158 in do_one_initcall (fn=0xffff000008b5bcc4 <netlink_proto_init>) at init/main.c:795
> #14 0xffff000008b20d04 in do_initcall_level (level=<optimized out>) at init/main.c:861
> #15 do_initcalls () at init/main.c:869
> #16 do_basic_setup () at init/main.c:887
>
> Yury
>
>
--
Florian
^ permalink raw reply
* Re: [PATCH net-next] rhashtable: Do not lower max_elems when max_size is zero
From: Florian Fainelli @ 2017-04-28 15:42 UTC (permalink / raw)
To: David Miller, herbert; +Cc: netdev, fw, tgraf
In-Reply-To: <20170428.101449.1468521013732143831.davem@davemloft.net>
On 04/28/2017 07:14 AM, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Fri, 28 Apr 2017 14:10:48 +0800
>
>> The commit 6d684e54690c ("rhashtable: Cap total number of entries
>> to 2^31") breaks rhashtable users that do not set max_size. This
>> is because when max_size is zero max_elems is also incorrectly set
>> to zero instead of 2^31.
>>
>> This patch fixes it by only lowering max_elems when max_size is not
>> zero.
>>
>> Fixes: 6d684e54690c ("rhashtable: Cap total number of entries to 2^31")
>> Reported-by: Florian Fainelli <f.fainelli@gmail.com>
>> Reported-by: kernel test robot <fengguang.wu@intel.com>
>> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Thanks Herbert
--
Florian
^ permalink raw reply
* Re: [PATCH v1 net-next 5/6] net: allow simultaneous SW and HW transmit timestamping
From: Willem de Bruijn @ 2017-04-28 15:50 UTC (permalink / raw)
To: Miroslav Lichvar
Cc: Network Development, Richard Cochran, Willem de Bruijn,
Soheil Hassas Yeganeh, Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <20170428085422.GD3401@localhost>
On Fri, Apr 28, 2017 at 4:54 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Wed, Apr 26, 2017 at 08:00:02PM -0400, Willem de Bruijn wrote:
>> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> > index 81ef53f..42bff22 100644
>> > --- a/include/linux/skbuff.h
>> > +++ b/include/linux/skbuff.h
>> > @@ -3300,8 +3300,7 @@ void skb_tstamp_tx(struct sk_buff *orig_skb,
>> >
>> > static inline void sw_tx_timestamp(struct sk_buff *skb)
>> > {
>> > - if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
>> > - !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
>> > + if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP)
>> > skb_tstamp_tx(skb, NULL);
>> > }
>
>> > +++ b/net/core/skbuff.c
>> > @@ -3874,6 +3874,10 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
>> > if (!sk)
>> > return;
>> >
>> > + if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
>> > + skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
>> > + return;
>> > +
>>
>> This check should only happen for software transmit timestamps, so simpler to
>> revise the check in sw_tx_timestamp above to
>>
>> if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
>> - !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
>> + (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
>> + (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)
>
> I'm not sure if this can work. sk_buff.h would need to include sock.h
> in order to get the definition of struct sock. Any suggestions?
A more elegant solution would be to not set SKBTX_IN_PROGRESS
at all if SOF_TIMESTAMPING_OPT_TX_SWHW is set on the socket.
But the patch to do so is not elegant, having to update callsites in many
device drivers.
Otherwise you may indeed have to call skb_tstamp_tx for every packet
that has SKBTX_SW_TSTAMP set, as you do. We can at least move
the skb->sk != NULL check into skb_tx_timestamp in skbuff.h.
By the way, if changing this code, I think that it's time to get rid of
sw_tx_timestamp. It is only called from skb_tx_timestamp. Let's
just move the condition in there.
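To make the intended condition concrete, here is a rough userspace sketch of
the decision discussed above: generate a software transmit timestamp when
SKBTX_SW_TSTAMP is set and either no hardware timestamp is in progress or the
socket asked for both. The flag values are illustrative stand-ins rather than
the kernel definitions, and want_sw_tx_tstamp() is a hypothetical helper name,
not an existing function.

```c
#include <stdbool.h>

/* Illustrative stand-in values; the real definitions live in kernel headers. */
#define SKBTX_SW_TSTAMP              (1u << 1)
#define SKBTX_IN_PROGRESS            (1u << 2)
#define SOF_TIMESTAMPING_OPT_TX_SWHW (1u << 14)

/*
 * Hypothetical helper: returns true when a software TX timestamp should be
 * generated for a packet.
 *   tx_flags   - per-packet flags (stand-in for skb_shinfo(skb)->tx_flags)
 *   has_sk     - whether the packet has an owning socket (skb->sk != NULL)
 *   sk_tsflags - the socket's timestamping flags, ignored when !has_sk
 */
static inline bool want_sw_tx_tstamp(unsigned int tx_flags, bool has_sk,
				     unsigned int sk_tsflags)
{
	/* Software timestamping was not requested for this packet. */
	if (!(tx_flags & SKBTX_SW_TSTAMP))
		return false;

	/* No hardware timestamp pending: always report the software one. */
	if (!(tx_flags & SKBTX_IN_PROGRESS))
		return true;

	/* Hardware timestamp pending: report both only on explicit opt-in. */
	return has_sk && (sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW);
}
```

Keeping the socket check last means the common path, with no hardware
timestamp in progress, never has to look at the socket at all.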
^ permalink raw reply
* Re: [PATCH v2 binutils] Add BPF support to binutils...
From: Aaron Conole @ 2017-04-28 15:57 UTC (permalink / raw)
To: David Miller; +Cc: ast, daniel, netdev, xdp-newbies
In-Reply-To: <20170427.170934.891283291562287326.davem@davemloft.net>
Hi David,
David Miller <davem@davemloft.net> writes:
> Here is what I have after today's work. I think I sorted out the
> endianness issues.
>
> gas can be controlled explicitly using "-EB" and "-EL" options. The
> default is whatever endianness the host has. The elf names for the
> two variants are "elf64-bpfbe" and "elf64-bpfle".
>
> I fleshed out all the rest of the assembler parsing for instructions
> and added many entries to the gas testsuite.
>
> They are all explicitly in little endian, although I should add big
> endian versions too of course.
>
> If someone is looking for a way to help, could you please verify the
> testsuite output to make sure the opcode and fields are correctly
> set in the testsuite. Just look in:
>
> gas/testsuite/gas/bpf/
>
> and there are two files for every test. One is the "foo.s" file which
> gets built using gas into an object file "foo.o". Then there is a
> dump file named "foo.d" which specifies optionally how to run gas and
> with what options, and then what to dump with (usually "objdump -dr")
> then there is text which the testsuite compares with the dump of
> the resulting "foo.o" file.
>
> The testsuite is driven by bpf.exp which has pretty straightforward
> syntax.
>
> Anyways, enjoy. I'll keep cracking on this tomorrow.
I'll get an arm board up and running to do some testing there. As a
teaser:
Test run by aconole on Fri Apr 28 11:52:18 2017
Target is bpf-linux-elf
Host is x86_64-pc-linux-gnu
=== gas tests ===
...
Running /home/aconole/git/binutils-gdb/gas/testsuite/gas/bfin/error.exp ...
Running /home/aconole/git/binutils-gdb/gas/testsuite/gas/bpf/bpf.exp ...
Running /home/aconole/git/binutils-gdb/gas/testsuite/gas/cfi/cfi.exp ...
...
And also:
11:56:08 aconole {master} ~/git/binutils-gdb$ ./binutils/objdump -dr move.o
move.o: file format elf64-bpfle
Disassembly of section .text:
0000000000000000 <.text>:
0: bf 12 00 00 00 00 00 00 mov r1, r2
8: b7 10 00 00 ef 00 00 00 mov r1, 239
10: bc 12 00 00 00 00 00 00 mov32 r1, r2
18: b4 10 00 00 ef 00 00 00 mov32 r1, 239
20: bf 36 00 00 00 00 00 00 mov r3, r6
28: bf 63 00 00 00 00 00 00 mov r6, r3
30: bf 89 00 00 00 00 00 00 mov r8, r9
38: bf a1 00 00 00 00 00 00 mov r10, r1
40: bf 73 00 00 00 00 00 00 mov r7, r3
48: b7 50 00 00 02 00 00 00 mov r5, 2
This is on an Intel i7 (so not terribly exotic).
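For anyone checking the dump by hand, a small sketch that splits one 8-byte
instruction into its raw fields may help. It assumes the usual BPF layout of
one opcode byte, one byte holding two 4-bit register numbers, a 16-bit offset
and a 32-bit immediate, all little endian here. The struct and function names
are made up for illustration. It deliberately reports the two register nibbles
as "high" and "low" without naming them: in the dump above the destination
shows up in the high nibble (0x12 for "mov r1, r2"), whereas the kernel's
struct bpf_insn, as far as I know, ends up with dst_reg in the low nibble on
little-endian builds, so the nibble order may be worth double-checking.

```c
#include <stdint.h>

/* Hypothetical holder for the raw fields of one 8-byte BPF instruction. */
struct bpf_raw_fields {
	uint8_t opcode;
	uint8_t regs_hi;	/* high nibble of the register byte */
	uint8_t regs_lo;	/* low nibble of the register byte */
	int16_t off;
	int32_t imm;
};

/* Split one little-endian encoded instruction into its raw fields. */
static inline struct bpf_raw_fields bpf_split_le(const uint8_t b[8])
{
	struct bpf_raw_fields f;

	f.opcode  = b[0];
	f.regs_hi = (uint8_t)(b[1] >> 4);
	f.regs_lo = (uint8_t)(b[1] & 0x0f);
	/* Assemble the multi-byte fields from least to most significant byte. */
	f.off = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
	f.imm = (int32_t)((uint32_t)b[4] |
			  ((uint32_t)b[5] << 8) |
			  ((uint32_t)b[6] << 16) |
			  ((uint32_t)b[7] << 24));
	return f;
}
```

Feeding it the second line of the dump (b7 10 00 00 ef 00 00 00) gives opcode
0xb7, register nibbles 1 and 0, offset 0 and immediate 239.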
-Aaron
^ permalink raw reply
* Re: [PATCH v2 binutils] Add BPF support to binutils...
From: David Miller @ 2017-04-28 16:04 UTC (permalink / raw)
To: aconole; +Cc: ast, daniel, netdev, xdp-newbies
In-Reply-To: <f7td1bwtthr.fsf@redhat.com>
From: Aaron Conole <aconole@bytheb.org>
Date: Fri, 28 Apr 2017 11:57:36 -0400
> I'll get an arm board up and running to do some testing there. As a
> teaser:
Great.
I started working on some more relocation stuff, so more of the
generic gas tests pass.
For example, stuff like this now works properly:
[davem@dhcp-10-15-49-210 build-bpf]$ cat gas/y.s
.data
.globl foo
foo: .xword bar
[davem@dhcp-10-15-49-210 build-bpf]$ gas/as-new -o gas/y.o gas/y.s
[davem@dhcp-10-15-49-210 build-bpf]$ binutils/objdump -r gas/y.o
gas/y.o: file format elf64-bpfle
RELOCATION RECORDS FOR [.data]:
OFFSET TYPE VALUE
0000000000000000 R_BPF_DATA_64 bar
[davem@dhcp-10-15-49-210 build-bpf]$
It turned out that I needed to separate the R_BPF_* relocations into
data vs. insn ones.
Another idea I am thinking about pursuing is adding BPF simulator
support under sim/ so that people can use gdb to step through BPF
programs.
I hope we can make it work in a way that we can even step through
XDP programs and feed them simple test packets, stuff like that.
Anyways, quick relative live patch against v2 from my tree for the
reloc stuff:
diff --git a/bfd/elf64-bpf.c b/bfd/elf64-bpf.c
index 9944bb4..1be285d 100644
--- a/bfd/elf64-bpf.c
+++ b/bfd/elf64-bpf.c
@@ -1,8 +1,89 @@
#include "sysdep.h"
#include "bfd.h"
+#include "bfdlink.h"
#include "libbfd.h"
+#include "libiberty.h"
#include "elf-bfd.h"
+#include "elf/bpf.h"
#include "opcode/bpf.h"
+#include "objalloc.h"
+#include "elf64-bpf.h"
+
+/* In case we're on a 32-bit machine, construct a 64-bit "-1" value. */
+#define MINUS_ONE (~ (bfd_vma) 0)
+
+static reloc_howto_type _bfd_bpf_elf_howto_table[] =
+{
+ HOWTO(R_BPF_NONE, 0,3, 0,FALSE,0,complain_overflow_dont, bfd_elf_generic_reloc, "R_BPF_NONE", FALSE,0,0x00000000,TRUE),
+
+ /* XXX these are wrong XXX */
+ HOWTO(R_BPF_INSN_16, 0,1,16,FALSE,0,complain_overflow_bitfield,bfd_elf_generic_reloc, "R_BPF_INSN_16", FALSE,0,0x0000ffff,TRUE),
+ HOWTO(R_BPF_INSN_32, 0,2,32,FALSE,0,complain_overflow_bitfield,bfd_elf_generic_reloc, "R_BPF_INSN_32", FALSE,0,0xffffffff,TRUE),
+ HOWTO(R_BPF_INSN_64, 0,4,64,FALSE,0,complain_overflow_bitfield,bfd_elf_generic_reloc, "R_BPF_INSN_64", FALSE,0,MINUS_ONE,TRUE),
+ HOWTO(R_BPF_WDISP16, 0,1,16,TRUE, 0,complain_overflow_signed, bfd_elf_generic_reloc, "R_BPF_WDISP16", FALSE,0,0x0000ffff,TRUE),
+
+ HOWTO(R_BPF_DATA_8, 0,0, 8,FALSE,0,complain_overflow_bitfield,bfd_elf_generic_reloc, "R_BPF_DATA_8", FALSE,0,0x000000ff,TRUE),
+ HOWTO(R_BPF_DATA_16, 0,1,16,FALSE,0,complain_overflow_bitfield,bfd_elf_generic_reloc, "R_BPF_DATA_16", FALSE,0,0x0000ffff,TRUE),
+ HOWTO(R_BPF_DATA_32, 0,2,32,FALSE,0,complain_overflow_bitfield,bfd_elf_generic_reloc, "R_BPF_DATA_32", FALSE,0,0xffffffff,TRUE),
+ HOWTO(R_BPF_DATA_64, 0,4,64,FALSE,0,complain_overflow_bitfield,bfd_elf_generic_reloc, "R_BPF_DATA_64", FALSE,0,MINUS_ONE,TRUE),
+};
+
+reloc_howto_type *
+_bfd_bpf_elf_reloc_type_lookup (bfd *abfd ATTRIBUTE_UNUSED,
+ bfd_reloc_code_real_type code)
+{
+ switch (code)
+ {
+ case BFD_RELOC_NONE:
+ return &_bfd_bpf_elf_howto_table[R_BPF_NONE];
+
+ case BFD_RELOC_BPF_WDISP16:
+ return &_bfd_bpf_elf_howto_table[R_BPF_WDISP16];
+
+ case BFD_RELOC_BPF_16:
+ return &_bfd_bpf_elf_howto_table[R_BPF_INSN_16];
+
+ case BFD_RELOC_BPF_32:
+ return &_bfd_bpf_elf_howto_table[R_BPF_INSN_32];
+
+ case BFD_RELOC_BPF_64:
+ return &_bfd_bpf_elf_howto_table[R_BPF_INSN_64];
+
+ case BFD_RELOC_8:
+ return &_bfd_bpf_elf_howto_table[R_BPF_DATA_8];
+
+ case BFD_RELOC_16:
+ return &_bfd_bpf_elf_howto_table[R_BPF_DATA_16];
+
+ case BFD_RELOC_32:
+ return &_bfd_bpf_elf_howto_table[R_BPF_DATA_32];
+
+ case BFD_RELOC_64:
+ return &_bfd_bpf_elf_howto_table[R_BPF_DATA_64];
+
+ default:
+ break;
+ }
+ bfd_set_error (bfd_error_bad_value);
+ return NULL;
+}
+
+reloc_howto_type *
+_bfd_bpf_elf_reloc_name_lookup (bfd *abfd ATTRIBUTE_UNUSED,
+ const char *r_name)
+{
+ unsigned int i;
+
+ for (i = 0;
+ i < (sizeof (_bfd_bpf_elf_howto_table)
+ / sizeof (_bfd_bpf_elf_howto_table[0]));
+ i++)
+ if (_bfd_bpf_elf_howto_table[i].name != NULL
+ && strcasecmp (_bfd_bpf_elf_howto_table[i].name, r_name) == 0)
+ return &_bfd_bpf_elf_howto_table[i];
+
+ return NULL;
+}
static void
check_for_relocs (bfd * abfd, asection * o, void * failed)
@@ -34,6 +115,30 @@ elf64_generic_link_add_symbols (bfd *abfd, struct bfd_link_info *info)
return bfd_elf_link_add_symbols (abfd, info);
}
+static reloc_howto_type *
+elf_bpf_rtype_to_howto (unsigned int r_type)
+{
+ if (r_type >= (unsigned int) R_BPF_max)
+ {
+ _bfd_error_handler (_("invalid relocation type %d"), (int) r_type);
+ r_type = R_BPF_NONE;
+ }
+ return &_bfd_bpf_elf_howto_table[r_type];
+}
+
+/* Given a bpf ELF reloc type, fill in an arelent structure. */
+
+static void
+elf_bpf_info_to_howto (bfd *abfd ATTRIBUTE_UNUSED, arelent *cache_ptr,
+ Elf_Internal_Rela *dst)
+{
+ unsigned r_type;
+
+ r_type = ELF64_R_TYPE (dst->r_info);
+ cache_ptr->howto = elf_bpf_rtype_to_howto (r_type);
+ BFD_ASSERT (r_type == cache_ptr->howto->type);
+}
+
#define TARGET_LITTLE_SYM bpf_elf64_le_vec
#define TARGET_LITTLE_NAME "elf64-bpfle"
#define TARGET_BIG_SYM bpf_elf64_be_vec
@@ -42,8 +147,10 @@ elf64_generic_link_add_symbols (bfd *abfd, struct bfd_link_info *info)
#define ELF_MAXPAGESIZE 0x100000
#define ELF_MACHINE_CODE EM_BPF
-#define bfd_elf64_bfd_reloc_type_lookup bfd_default_reloc_type_lookup
-#define bfd_elf64_bfd_reloc_name_lookup _bfd_norelocs_bfd_reloc_name_lookup
+#define elf_info_to_howto elf_bpf_info_to_howto
+
+#define bfd_elf64_bfd_reloc_type_lookup _bfd_bpf_elf_reloc_type_lookup
+#define bfd_elf64_bfd_reloc_name_lookup _bfd_bpf_elf_reloc_name_lookup
#define bfd_elf64_bfd_link_add_symbols elf64_generic_link_add_symbols
#include "elf64-target.h"
diff --git a/gas/config/tc-bpf.c b/gas/config/tc-bpf.c
index f5fb308..0ba2afa 100644
--- a/gas/config/tc-bpf.c
+++ b/gas/config/tc-bpf.c
@@ -546,12 +546,57 @@ md_apply_fix (fixS *fixP, valueT *valP ATTRIBUTE_UNUSED, segT segment ATTRIBUTE_
}
}
+ if (fixP->fx_addsy == NULL)
+ fixP->fx_done = 1;
}
arelent *
tc_gen_reloc (asection *section ATTRIBUTE_UNUSED, fixS *fixp ATTRIBUTE_UNUSED)
{
- return NULL;
+ bfd_reloc_code_real_type code;
+ arelent *reloc;
+
+ reloc = XNEW (arelent);
+ reloc->sym_ptr_ptr = XNEW (asymbol *);
+ *reloc->sym_ptr_ptr = symbol_get_bfdsym (fixp->fx_addsy);
+ reloc->address = fixp->fx_frag->fr_address + fixp->fx_where;
+
+ switch (fixp->fx_r_type)
+ {
+ case BFD_RELOC_BPF_WDISP16:
+ case BFD_RELOC_BPF_16:
+ case BFD_RELOC_BPF_32:
+ case BFD_RELOC_BPF_64:
+ case BFD_RELOC_8:
+ case BFD_RELOC_16:
+ case BFD_RELOC_32:
+ case BFD_RELOC_64:
+ code = fixp->fx_r_type;
+ break;
+ default:
+ abort ();
+ return NULL;
+ }
+
+ reloc->howto = bfd_reloc_type_lookup (stdoutput, code);
+ if (reloc->howto == 0)
+ {
+ as_bad_where (fixp->fx_file, fixp->fx_line,
+ _("internal error: can't export reloc type %d (`%s')"),
+ fixp->fx_r_type, bfd_get_reloc_code_name (code));
+ xfree (reloc);
+ return NULL;
+ }
+ if (code != BFD_RELOC_BPF_WDISP16)
+ reloc->addend = fixp->fx_addnumber;
+ else if (symbol_section_p (fixp->fx_addsy))
+ reloc->addend = (section->vma
+ + fixp->fx_addnumber
+ + md_pcrel_from (fixp));
+ else
+ reloc->addend = fixp->fx_offset;
+
+ return reloc;
}
symbolS *
diff --git a/include/elf/bpf.h b/include/elf/bpf.h
index 3a84d9a..5019b11 100644
--- a/include/elf/bpf.h
+++ b/include/elf/bpf.h
@@ -26,10 +26,14 @@
/* Relocation types. */
START_RELOC_NUMBERS (elf_bpf_reloc_type)
RELOC_NUMBER (R_BPF_NONE, 0)
- RELOC_NUMBER (R_BPF_16, 1)
- RELOC_NUMBER (R_BPF_32, 2)
- RELOC_NUMBER (R_BPF_64, 3)
+ RELOC_NUMBER (R_BPF_INSN_16, 1)
+ RELOC_NUMBER (R_BPF_INSN_32, 2)
+ RELOC_NUMBER (R_BPF_INSN_64, 3)
RELOC_NUMBER (R_BPF_WDISP16, 4)
+ RELOC_NUMBER (R_BPF_DATA_8, 5)
+ RELOC_NUMBER (R_BPF_DATA_16, 6)
+ RELOC_NUMBER (R_BPF_DATA_32, 7)
+ RELOC_NUMBER (R_BPF_DATA_64, 8)
END_RELOC_NUMBERS (R_BPF_max)
#endif /* _ELF_BPF_H */
^ permalink raw reply related
* Re: arm64: next-20170428 hangs on boot
From: David Miller @ 2017-04-28 16:05 UTC (permalink / raw)
To: mark.rutland; +Cc: ynorov, linux-arm-kernel, linux-kernel, netdev
In-Reply-To: <20170428145233.GB5292@leverpostej>
From: Mark Rutland <mark.rutland@arm.com>
Date: Fri, 28 Apr 2017 15:52:34 +0100
> On Fri, Apr 28, 2017 at 04:24:29PM +0300, Yury Norov wrote:
>> Hi all,
>
> Hi,
>
> [adding Dave Miller, netdev, lkml]
>
>> On QEMU the next-20170428 hangs on boot for me due to kernel panic in
>> rtnetlink_init():
>>
>> void __init rtnetlink_init(void)
>> {
>> if (register_pernet_subsys(&rtnetlink_net_ops))
>> panic("rtnetlink_init: cannot initialize rtnetlink\n");
>>
>> ...
>> }
>
> I see the same thing with a next-20170428 arm64 defconfig, on a Juno R1
> system:
As stated, should be fixed by:
From 2d2ab658d2debcb4c0e29c9e6f18e5683f3077bf Mon Sep 17 00:00:00 2001
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 28 Apr 2017 14:10:48 +0800
Subject: [PATCH] rhashtable: Do not lower max_elems when max_size is zero
The commit 6d684e54690c ("rhashtable: Cap total number of entries
to 2^31") breaks rhashtable users that do not set max_size. This
is because when max_size is zero max_elems is also incorrectly set
to zero instead of 2^31.
This patch fixes it by only lowering max_elems when max_size is not
zero.
Fixes: 6d684e54690c ("rhashtable: Cap total number of entries to 2^31")
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
lib/rhashtable.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 751630b..3895486 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -958,13 +958,14 @@ int rhashtable_init(struct rhashtable *ht,
if (params->min_size)
ht->p.min_size = roundup_pow_of_two(params->min_size);
- if (params->max_size)
- ht->p.max_size = rounddown_pow_of_two(params->max_size);
-
/* Cap total entries at 2^31 to avoid nelems overflow. */
ht->max_elems = 1u << 31;
- if (ht->p.max_size < ht->max_elems / 2)
- ht->max_elems = ht->p.max_size * 2;
+
+ if (params->max_size) {
+ ht->p.max_size = rounddown_pow_of_two(params->max_size);
+ if (ht->p.max_size < ht->max_elems / 2)
+ ht->max_elems = ht->p.max_size * 2;
+ }
ht->p.min_size = max(ht->p.min_size, HASH_MIN_SIZE);
--
2.4.11
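For those following along, the effect of the fix can be modelled in a few
lines of userspace C. This is only a simplified sketch: it skips the
rounddown_pow_of_two() step (so pass a power of two or zero), the function
names are invented for illustration, and grow_above_max() assumes the insert
path refuses new entries once the element count reaches max_elems.

```c
/* Model of the max_elems computation before the fix. */
static inline unsigned int max_elems_before(unsigned int max_size)
{
	unsigned int max_elems = 1u << 31;

	/* max_size == 0 means "unset", but 0 < 2^30 still holds here. */
	if (max_size < max_elems / 2)
		max_elems = max_size * 2;
	return max_elems;
}

/* Model of the max_elems computation after the fix. */
static inline unsigned int max_elems_after(unsigned int max_size)
{
	unsigned int max_elems = 1u << 31;

	/* Only lower the cap when the user actually set max_size. */
	if (max_size) {
		if (max_size < max_elems / 2)
			max_elems = max_size * 2;
	}
	return max_elems;
}

/* Assumed shape of the limit check done on insertion. */
static inline int grow_above_max(unsigned int nelems, unsigned int max_elems)
{
	return nelems >= max_elems;
}
```

With max_size left at zero, the old computation yields max_elems == 0, so the
check already trips for the very first insertion into an empty table, which
matches the -E2BIG seen in the netlink backtrace. The new computation keeps
the 2^31 cap in that case.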
^ permalink raw reply related
* Re: [PATCH] stmmac: Add support for SIMATIC IOT2000 platform
From: David Miller @ 2017-04-28 16:09 UTC (permalink / raw)
To: jan.kiszka
Cc: peppe.cavallaro, alexandre.torgue, netdev, linux-kernel,
sascha.weisenberger
In-Reply-To: <8c536123-6189-e0b6-1977-dc7a521718dd@siemens.com>
From: Jan Kiszka <jan.kiszka@siemens.com>
Date: Mon, 24 Apr 2017 21:27:15 +0200
> The IOT2000 is an industrial controller platform, derived from the Intel
> Galileo Gen2 board. The variant IOT2020 comes with one LAN port, the
> IOT2040 has two of them. They can be told apart based on the board asset
> tag in the DMI table.
>
> Based on patch by Sascha Weisenberger.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Sascha Weisenberger <sascha.weisenberger@siemens.com>
It looks like there is still some discussion going on about precise
detection and disambiguation of these devices.
Once things have settled down please resubmit the final patch.
Thank you.
* Re: arm64: next-20170428 hangs on boot
From: Yury Norov @ 2017-04-28 16:10 UTC (permalink / raw)
To: Florian Fainelli
Cc: Mark Rutland, netdev, linux-kernel, linux-arm-kernel, davem
In-Reply-To: <e65b3ec3-790d-da96-9e1f-3d344f65517b@gmail.com>
On Fri, Apr 28, 2017 at 08:40:54AM -0700, Florian Fainelli wrote:
> On 04/28/2017 08:09 AM, Yury Norov wrote:
> > On Fri, Apr 28, 2017 at 03:52:34PM +0100, Mark Rutland wrote:
> >> On Fri, Apr 28, 2017 at 04:24:29PM +0300, Yury Norov wrote:
> >>> Hi all,
> >>
> >> Hi,
> >>
> >> [adding Dave Miller, netdev, lkml]
> >
> > thanks
> >
> >>> On QEMU the next-20170428 hangs on boot for me due to kernel panic in
> >>> rtnetlink_init():
> >>>
> >>> void __init rtnetlink_init(void)
> >>> {
> >>> if (register_pernet_subsys(&rtnetlink_net_ops))
> >>> panic("rtnetlink_init: cannot initialize rtnetlink\n");
> >>>
> >>> ...
> >>> }
> >>
> >> I see the same thing with a next-20170428 arm64 defconfig, on a Juno R1
> >> system:
> >>
> >> [ 0.531949] Kernel panic - not syncing: rtnetlink_init: cannot initialize rtnetlink
> >> [ 0.531949]
> >> [ 0.541271] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc8-next-20170428-00002-g6ee3799 #10
> >> [ 0.550307] Hardware name: ARM Juno development board (r1) (DT)
> >> [ 0.556332] Call trace:
> >> [ 0.558833] [<ffff000008088538>] dump_backtrace+0x0/0x238
> >> [ 0.564332] [<ffff000008088834>] show_stack+0x14/0x20
> >> [ 0.569477] [<ffff00000839dd54>] dump_stack+0x9c/0xc0
> >> [ 0.574622] [<ffff000008175344>] panic+0x11c/0x28c
> >> [ 0.579505] [<ffff000008d80034>] rtnetlink_init+0x2c/0x1d0
> >> [ 0.585092] [<ffff000008d8047c>] netlink_proto_init+0x14c/0x17c
> >> [ 0.591119] [<ffff000008083150>] do_one_initcall+0x38/0x120
> >> [ 0.596796] [<ffff000008d30d00>] kernel_init_freeable+0x1a0/0x240
> >> [ 0.603003] [<ffff00000892a790>] kernel_init+0x10/0x100
> >> [ 0.608324] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
> >> [ 0.613736] SMP: stopping secondary CPUs
> >> [ 0.617738] ---[ end Kernel panic - not syncing: rtnetlink_init: cannot initialize rtnetlink
> >>
> >> If this isn't a known issue, it would be worth trying to bisect this.
>
> It's fixed already by this commit in net-next:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=2d2ab658d2debcb4c0e29c9e6f18e5683f3077bf
Works for me, thank you.
* [PATCH 0/7] crypto: aesni: provide generic gcm(aes)
From: Sabrina Dubroca @ 2017-04-28 16:11 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
The current aesni AES-GCM implementation only offers support for
rfc4106(gcm(aes)). This makes some things a little bit simpler
(handling of associated data and authentication tag), but it means
that non-IPsec users of gcm(aes) have to rely on
gcm_base(ctr-aes-aesni,ghash-clmulni), which is much slower.
This patchset adds handling of all valid authentication tag lengths
and of any associated data length to the assembly code, and exposes a
generic gcm(aes) AEAD algorithm to the crypto API.
With these patches, performance of MACsec on a single core increases
by 40% (from 4.5Gbps to around 6.3Gbps).
Sabrina Dubroca (7):
crypto: aesni: make non-AVX AES-GCM work with any aadlen
crypto: aesni: make non-AVX AES-GCM work with all valid auth_tag_len
crypto: aesni: make AVX AES-GCM work with any aadlen
crypto: aesni: make AVX AES-GCM work with all valid auth_tag_len
crypto: aesni: make AVX2 AES-GCM work with any aadlen
crypto: aesni: make AVX2 AES-GCM work with all valid auth_tag_len
crypto: aesni: add generic gcm(aes)
arch/x86/crypto/aesni-intel_asm.S | 231 +++++++++++++++++++------
arch/x86/crypto/aesni-intel_avx-x86_64.S | 283 ++++++++++++++++++++++---------
arch/x86/crypto/aesni-intel_glue.c | 208 +++++++++++++++++------
3 files changed, 539 insertions(+), 183 deletions(-)
--
2.12.2
* [PATCH 2/7] crypto: aesni: make non-AVX AES-GCM work with all valid auth_tag_len
From: Sabrina Dubroca @ 2017-04-28 16:11 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
In-Reply-To: <cover.1493395785.git.sd@queasysnail.net>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
arch/x86/crypto/aesni-intel_asm.S | 62 ++++++++++++++++++++++++++++++---------
1 file changed, 48 insertions(+), 14 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 605726aaf0a2..16627fec80b2 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -1549,18 +1549,35 @@ ENTRY(aesni_gcm_dec)
mov arg10, %r11 # %r11 = auth_tag_len
cmp $16, %r11
je _T_16_decrypt
- cmp $12, %r11
- je _T_12_decrypt
+ cmp $8, %r11
+ jl _T_4_decrypt
_T_8_decrypt:
MOVQ_R64_XMM %xmm0, %rax
mov %rax, (%r10)
- jmp _return_T_done_decrypt
-_T_12_decrypt:
- MOVQ_R64_XMM %xmm0, %rax
- mov %rax, (%r10)
+ add $8, %r10
+ sub $8, %r11
psrldq $8, %xmm0
+ cmp $0, %r11
+ je _return_T_done_decrypt
+_T_4_decrypt:
+ movd %xmm0, %eax
+ mov %eax, (%r10)
+ add $4, %r10
+ sub $4, %r11
+ psrldq $4, %xmm0
+ cmp $0, %r11
+ je _return_T_done_decrypt
+_T_123_decrypt:
movd %xmm0, %eax
- mov %eax, 8(%r10)
+ cmp $2, %r11
+ jl _T_1_decrypt
+ mov %ax, (%r10)
+ cmp $2, %r11
+ je _return_T_done_decrypt
+ add $2, %r10
+ sar $16, %eax
+_T_1_decrypt:
+ mov %al, (%r10)
jmp _return_T_done_decrypt
_T_16_decrypt:
movdqu %xmm0, (%r10)
@@ -1813,18 +1830,35 @@ ENTRY(aesni_gcm_enc)
mov arg10, %r11 # %r11 = auth_tag_len
cmp $16, %r11
je _T_16_encrypt
- cmp $12, %r11
- je _T_12_encrypt
+ cmp $8, %r11
+ jl _T_4_encrypt
_T_8_encrypt:
MOVQ_R64_XMM %xmm0, %rax
mov %rax, (%r10)
- jmp _return_T_done_encrypt
-_T_12_encrypt:
- MOVQ_R64_XMM %xmm0, %rax
- mov %rax, (%r10)
+ add $8, %r10
+ sub $8, %r11
psrldq $8, %xmm0
+ cmp $0, %r11
+ je _return_T_done_encrypt
+_T_4_encrypt:
+ movd %xmm0, %eax
+ mov %eax, (%r10)
+ add $4, %r10
+ sub $4, %r11
+ psrldq $4, %xmm0
+ cmp $0, %r11
+ je _return_T_done_encrypt
+_T_123_encrypt:
movd %xmm0, %eax
- mov %eax, 8(%r10)
+ cmp $2, %r11
+ jl _T_1_encrypt
+ mov %ax, (%r10)
+ cmp $2, %r11
+ je _return_T_done_encrypt
+ add $2, %r10
+ sar $16, %eax
+_T_1_encrypt:
+ mov %al, (%r10)
jmp _return_T_done_encrypt
_T_16_encrypt:
movdqu %xmm0, (%r10)
--
2.12.2
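
The tag-store cascade in this patch (8 bytes, then 4, then 2, then 1) can be sketched in plain C. This is only a model of the control flow, assuming the valid lengths are 4, 8 and 12 to 16; store_tag and tag_len_valid are illustrative names, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag lengths the cascade is meant to handle (an assumption of this
 * sketch, matching the usual gcm(aes) authsize values). */
static int tag_len_valid(size_t len)
{
	return len == 4 || len == 8 || (len >= 12 && len <= 16);
}

/* Copy the first len bytes of a 16-byte tag, mirroring the labels in
 * the assembly: _T_16, _T_8, _T_4, _T_123 and _T_1. */
static void store_tag(uint8_t *dst, const uint8_t tag[16], size_t len)
{
	const uint8_t *src = tag;

	assert(tag_len_valid(len));

	if (len == 16) {		/* _T_16: one 16-byte store */
		memcpy(dst, src, 16);
		return;
	}
	if (len >= 8) {			/* _T_8: store the low 8 bytes */
		memcpy(dst, src, 8);
		dst += 8;
		src += 8;
		len -= 8;
		if (len == 0)
			return;
	}
	memcpy(dst, src, 4);		/* _T_4: store the next 4 bytes */
	dst += 4;
	src += 4;
	len -= 4;
	if (len == 0)
		return;
	if (len >= 2) {			/* _T_123: 1 to 3 bytes remain */
		memcpy(dst, src, 2);
		if (len == 2)
			return;
		dst += 2;
		src += 2;
	}
	*dst = *src;			/* _T_1: final single byte */
}
```

Note that, like the assembly, the _T_4 step always writes 4 bytes, so a length below 4 would overrun the destination; the sketch rejects such lengths with an assert rather than guessing what the caller intended.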
* [PATCH 4/7] crypto: aesni: make AVX AES-GCM work with all valid auth_tag_len
From: Sabrina Dubroca @ 2017-04-28 16:11 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
In-Reply-To: <cover.1493395785.git.sd@queasysnail.net>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
arch/x86/crypto/aesni-intel_avx-x86_64.S | 31 ++++++++++++++++++++++++-------
1 file changed, 24 insertions(+), 7 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index a73117c84904..ee6283120f83 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -1481,19 +1481,36 @@ VARIABLE_OFFSET = 16*8
cmp $16, %r11
je _T_16\@
- cmp $12, %r11
- je _T_12\@
+ cmp $8, %r11
+ jl _T_4\@
_T_8\@:
vmovq %xmm9, %rax
mov %rax, (%r10)
- jmp _return_T_done\@
-_T_12\@:
- vmovq %xmm9, %rax
- mov %rax, (%r10)
+ add $8, %r10
+ sub $8, %r11
vpsrldq $8, %xmm9, %xmm9
+ cmp $0, %r11
+ je _return_T_done\@
+_T_4\@:
vmovd %xmm9, %eax
- mov %eax, 8(%r10)
+ mov %eax, (%r10)
+ add $4, %r10
+ sub $4, %r11
+ vpsrldq $4, %xmm9, %xmm9
+ cmp $0, %r11
+ je _return_T_done\@
+_T_123\@:
+ vmovd %xmm9, %eax
+ cmp $2, %r11
+ jl _T_1\@
+ mov %ax, (%r10)
+ cmp $2, %r11
+ je _return_T_done\@
+ add $2, %r10
+ sar $16, %eax
+_T_1\@:
+ mov %al, (%r10)
jmp _return_T_done\@
_T_16\@:
--
2.12.2
* [PATCH 5/7] crypto: aesni: make AVX2 AES-GCM work with any aadlen
From: Sabrina Dubroca @ 2017-04-28 16:12 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
In-Reply-To: <cover.1493395785.git.sd@queasysnail.net>
This is the first step to make the aesni AES-GCM implementation
generic. The current code was written for rfc4106, so it handles only
some specific sizes of associated data.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
arch/x86/crypto/aesni-intel_avx-x86_64.S | 85 ++++++++++++++++++++++----------
1 file changed, 58 insertions(+), 27 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index ee6283120f83..7230808a7cef 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -1702,41 +1702,73 @@ ENDPROC(aesni_gcm_dec_avx_gen2)
.macro INITIAL_BLOCKS_AVX2 num_initial_blocks T1 T2 T3 T4 T5 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T6 T_key ENC_DEC VER
i = (8-\num_initial_blocks)
+ j = 0
setreg
- mov arg6, %r10 # r10 = AAD
- mov arg7, %r12 # r12 = aadLen
-
-
- mov %r12, %r11
-
- vpxor reg_i, reg_i, reg_i
-_get_AAD_loop\@:
- vmovd (%r10), \T1
- vpslldq $12, \T1, \T1
- vpsrldq $4, reg_i, reg_i
- vpxor \T1, reg_i, reg_i
+ mov arg6, %r10 # r10 = AAD
+ mov arg7, %r12 # r12 = aadLen
- add $4, %r10
- sub $4, %r12
- jg _get_AAD_loop\@
+ mov %r12, %r11
- cmp $16, %r11
- je _get_AAD_loop2_done\@
- mov $16, %r12
+ vpxor reg_j, reg_j, reg_j
+ vpxor reg_i, reg_i, reg_i
-_get_AAD_loop2\@:
- vpsrldq $4, reg_i, reg_i
- sub $4, %r12
- cmp %r11, %r12
- jg _get_AAD_loop2\@
+ cmp $16, %r11
+ jl _get_AAD_rest8\@
+_get_AAD_blocks\@:
+ vmovdqu (%r10), reg_i
+ vpshufb SHUF_MASK(%rip), reg_i, reg_i
+ vpxor reg_i, reg_j, reg_j
+ GHASH_MUL_AVX2 reg_j, \T2, \T1, \T3, \T4, \T5, \T6
+ add $16, %r10
+ sub $16, %r12
+ sub $16, %r11
+ cmp $16, %r11
+ jge _get_AAD_blocks\@
+ vmovdqu reg_j, reg_i
+ cmp $0, %r11
+ je _get_AAD_done\@
-_get_AAD_loop2_done\@:
+ vpxor reg_i, reg_i, reg_i
- #byte-reflect the AAD data
- vpshufb SHUF_MASK(%rip), reg_i, reg_i
+ /* read the last <16B of AAD. since we have at least 4B of
+ data right after the AAD (the ICV, and maybe some CT), we can
+ read 4B/8B blocks safely, and then get rid of the extra stuff */
+_get_AAD_rest8\@:
+ cmp $4, %r11
+ jle _get_AAD_rest4\@
+ movq (%r10), \T1
+ add $8, %r10
+ sub $8, %r11
+ vpslldq $8, \T1, \T1
+ vpsrldq $8, reg_i, reg_i
+ vpxor \T1, reg_i, reg_i
+ jmp _get_AAD_rest8\@
+_get_AAD_rest4\@:
+ cmp $0, %r11
+ jle _get_AAD_rest0\@
+ mov (%r10), %eax
+ movq %rax, \T1
+ add $4, %r10
+ sub $4, %r11
+ vpslldq $12, \T1, \T1
+ vpsrldq $4, reg_i, reg_i
+ vpxor \T1, reg_i, reg_i
+_get_AAD_rest0\@:
+ /* finalize: shift out the extra bytes we read, and align
+ left. since pslldq can only shift by an immediate, we use
+ vpshufb and an array of shuffle masks */
+ movq %r12, %r11
+ salq $4, %r11
+ movdqu aad_shift_arr(%r11), \T1
+ vpshufb \T1, reg_i, reg_i
+_get_AAD_rest_final\@:
+ vpshufb SHUF_MASK(%rip), reg_i, reg_i
+ vpxor reg_j, reg_i, reg_i
+ GHASH_MUL_AVX2 reg_i, \T2, \T1, \T3, \T4, \T5, \T6
+_get_AAD_done\@:
# initialize the data pointer offset as zero
xor %r11, %r11
@@ -1811,7 +1843,6 @@ ENDPROC(aesni_gcm_dec_avx_gen2)
i = (8-\num_initial_blocks)
j = (9-\num_initial_blocks)
setreg
- GHASH_MUL_AVX2 reg_i, \T2, \T1, \T3, \T4, \T5, \T6
.rep \num_initial_blocks
vpxor reg_i, reg_j, reg_j
--
2.12.2
* [PATCH 7/7] crypto: aesni: add generic gcm(aes)
From: Sabrina Dubroca @ 2017-04-28 16:12 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
In-Reply-To: <cover.1493395785.git.sd@queasysnail.net>
Now that the asm side of things can support all the valid lengths of ICV
and all lengths of associated data, provide the glue code to expose a
generic gcm(aes) crypto algorithm.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
arch/x86/crypto/aesni-intel_glue.c | 208 ++++++++++++++++++++++++++++---------
1 file changed, 158 insertions(+), 50 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 93de8ea51548..4a55cdcdc008 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -61,6 +61,11 @@ struct aesni_rfc4106_gcm_ctx {
u8 nonce[4];
};
+struct generic_gcmaes_ctx {
+ u8 hash_subkey[16] AESNI_ALIGN_ATTR;
+ struct crypto_aes_ctx aes_key_expanded AESNI_ALIGN_ATTR;
+};
+
struct aesni_xts_ctx {
u8 raw_tweak_ctx[sizeof(struct crypto_aes_ctx)] AESNI_ALIGN_ATTR;
u8 raw_crypt_ctx[sizeof(struct crypto_aes_ctx)] AESNI_ALIGN_ATTR;
@@ -102,13 +107,11 @@ asmlinkage void aesni_xts_crypt8(struct crypto_aes_ctx *ctx, u8 *out,
* u8 *out, Ciphertext output. Encrypt in-place is allowed.
* const u8 *in, Plaintext input
* unsigned long plaintext_len, Length of data in bytes for encryption.
- * u8 *iv, Pre-counter block j0: 4 byte salt (from Security Association)
- * concatenated with 8 byte Initialisation Vector (from IPSec ESP
- * Payload) concatenated with 0x00000001. 16-byte aligned pointer.
+ * u8 *iv, Pre-counter block j0: 12 byte IV concatenated with 0x00000001.
+ * 16-byte aligned pointer.
* u8 *hash_subkey, the Hash sub key input. Data starts on a 16-byte boundary.
* const u8 *aad, Additional Authentication Data (AAD)
- * unsigned long aad_len, Length of AAD in bytes. With RFC4106 this
- * is going to be 8 or 12 bytes
+ * unsigned long aad_len, Length of AAD in bytes.
* u8 *auth_tag, Authenticated Tag output.
* unsigned long auth_tag_len), Authenticated Tag Length in bytes.
* Valid values are 16 (most likely), 12 or 8.
@@ -123,9 +126,8 @@ asmlinkage void aesni_gcm_enc(void *ctx, u8 *out,
* u8 *out, Plaintext output. Decrypt in-place is allowed.
* const u8 *in, Ciphertext input
* unsigned long ciphertext_len, Length of data in bytes for decryption.
- * u8 *iv, Pre-counter block j0: 4 byte salt (from Security Association)
- * concatenated with 8 byte Initialisation Vector (from IPSec ESP
- * Payload) concatenated with 0x00000001. 16-byte aligned pointer.
+ * u8 *iv, Pre-counter block j0: 12 byte IV concatenated with 0x00000001.
+ * 16-byte aligned pointer.
* u8 *hash_subkey, the Hash sub key input. Data starts on a 16-byte boundary.
* const u8 *aad, Additional Authentication Data (AAD)
* unsigned long aad_len, Length of AAD in bytes. With RFC4106 this is going
@@ -275,6 +277,16 @@ aesni_rfc4106_gcm_ctx *aesni_rfc4106_gcm_ctx_get(struct crypto_aead *tfm)
align = 1;
return PTR_ALIGN(crypto_aead_ctx(tfm), align);
}
+
+static inline struct
+generic_gcmaes_ctx *generic_gcmaes_ctx_get(struct crypto_aead *tfm)
+{
+ unsigned long align = AESNI_ALIGN;
+
+ if (align <= crypto_tfm_ctx_alignment())
+ align = 1;
+ return PTR_ALIGN(crypto_aead_ctx(tfm), align);
+}
#endif
static inline struct crypto_aes_ctx *aes_ctx(void *raw_ctx)
@@ -712,32 +724,34 @@ static int rfc4106_set_authsize(struct crypto_aead *parent,
return crypto_aead_setauthsize(&cryptd_tfm->base, authsize);
}
-static int helper_rfc4106_encrypt(struct aead_request *req)
+static int generic_gcmaes_set_authsize(struct crypto_aead *tfm,
+ unsigned int authsize)
+{
+ switch (authsize) {
+ case 4:
+ case 8:
+ case 12:
+ case 13:
+ case 14:
+ case 15:
+ case 16:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int gcmaes_encrypt(struct aead_request *req, unsigned int assoclen,
+ u8 *hash_subkey, u8 *iv, void *aes_ctx)
{
u8 one_entry_in_sg = 0;
u8 *src, *dst, *assoc;
- __be32 counter = cpu_to_be32(1);
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
- struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
- void *aes_ctx = &(ctx->aes_key_expanded);
unsigned long auth_tag_len = crypto_aead_authsize(tfm);
- u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
struct scatter_walk src_sg_walk;
struct scatter_walk dst_sg_walk = {};
- unsigned int i;
-
- /* Assuming we are supporting rfc4106 64-bit extended */
- /* sequence numbers We need to have the AAD length equal */
- /* to 16 or 20 bytes */
- if (unlikely(req->assoclen != 16 && req->assoclen != 20))
- return -EINVAL;
-
- /* IV below built */
- for (i = 0; i < 4; i++)
- *(iv+i) = ctx->nonce[i];
- for (i = 0; i < 8; i++)
- *(iv+4+i) = req->iv[i];
- *((__be32 *)(iv+12)) = counter;
if (sg_is_last(req->src) &&
(!PageHighMem(sg_page(req->src)) ||
@@ -768,7 +782,7 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
kernel_fpu_begin();
aesni_gcm_enc_tfm(aes_ctx, dst, src, req->cryptlen, iv,
- ctx->hash_subkey, assoc, req->assoclen - 8,
+ hash_subkey, assoc, assoclen,
dst + req->cryptlen, auth_tag_len);
kernel_fpu_end();
@@ -791,37 +805,20 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
return 0;
}
-static int helper_rfc4106_decrypt(struct aead_request *req)
+static int gcmaes_decrypt(struct aead_request *req, unsigned int assoclen,
+ u8 *hash_subkey, u8 *iv, void *aes_ctx)
{
u8 one_entry_in_sg = 0;
u8 *src, *dst, *assoc;
unsigned long tempCipherLen = 0;
- __be32 counter = cpu_to_be32(1);
- int retval = 0;
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
- struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
- void *aes_ctx = &(ctx->aes_key_expanded);
unsigned long auth_tag_len = crypto_aead_authsize(tfm);
- u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
u8 authTag[16];
struct scatter_walk src_sg_walk;
struct scatter_walk dst_sg_walk = {};
- unsigned int i;
-
- if (unlikely(req->assoclen != 16 && req->assoclen != 20))
- return -EINVAL;
-
- /* Assuming we are supporting rfc4106 64-bit extended */
- /* sequence numbers We need to have the AAD length */
- /* equal to 16 or 20 bytes */
+ int retval = 0;
tempCipherLen = (unsigned long)(req->cryptlen - auth_tag_len);
- /* IV below built */
- for (i = 0; i < 4; i++)
- *(iv+i) = ctx->nonce[i];
- for (i = 0; i < 8; i++)
- *(iv+4+i) = req->iv[i];
- *((__be32 *)(iv+12)) = counter;
if (sg_is_last(req->src) &&
(!PageHighMem(sg_page(req->src)) ||
@@ -838,7 +835,6 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
scatterwalk_start(&dst_sg_walk, req->dst);
dst = scatterwalk_map(&dst_sg_walk) + req->assoclen;
}
-
} else {
/* Allocate memory for src, dst, assoc */
assoc = kmalloc(req->cryptlen + req->assoclen, GFP_ATOMIC);
@@ -850,9 +846,10 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
dst = src;
}
+
kernel_fpu_begin();
aesni_gcm_dec_tfm(aes_ctx, dst, src, tempCipherLen, iv,
- ctx->hash_subkey, assoc, req->assoclen - 8,
+ hash_subkey, assoc, assoclen,
authTag, auth_tag_len);
kernel_fpu_end();
@@ -875,6 +872,60 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
kfree(assoc);
}
return retval;
+
+}
+
+static int helper_rfc4106_encrypt(struct aead_request *req)
+{
+ struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+ struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
+ void *aes_ctx = &(ctx->aes_key_expanded);
+ u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
+ unsigned int i;
+ __be32 counter = cpu_to_be32(1);
+
+ /* Assuming we are supporting rfc4106 64-bit extended */
+ /* sequence numbers We need to have the AAD length equal */
+ /* to 16 or 20 bytes */
+ if (unlikely(req->assoclen != 16 && req->assoclen != 20))
+ return -EINVAL;
+
+ /* IV below built */
+ for (i = 0; i < 4; i++)
+ *(iv+i) = ctx->nonce[i];
+ for (i = 0; i < 8; i++)
+ *(iv+4+i) = req->iv[i];
+ *((__be32 *)(iv+12)) = counter;
+
+ return gcmaes_encrypt(req, req->assoclen - 8, ctx->hash_subkey, iv,
+ aes_ctx);
+}
+
+static int helper_rfc4106_decrypt(struct aead_request *req)
+{
+ __be32 counter = cpu_to_be32(1);
+ struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+ struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
+ void *aes_ctx = &(ctx->aes_key_expanded);
+ u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
+ unsigned int i;
+
+ if (unlikely(req->assoclen != 16 && req->assoclen != 20))
+ return -EINVAL;
+
+ /* Assuming we are supporting rfc4106 64-bit extended */
+ /* sequence numbers We need to have the AAD length */
+ /* equal to 16 or 20 bytes */
+
+ /* IV below built */
+ for (i = 0; i < 4; i++)
+ *(iv+i) = ctx->nonce[i];
+ for (i = 0; i < 8; i++)
+ *(iv+4+i) = req->iv[i];
+ *((__be32 *)(iv+12)) = counter;
+
+ return gcmaes_decrypt(req, req->assoclen - 8, ctx->hash_subkey, iv,
+ aes_ctx);
}
static int rfc4106_encrypt(struct aead_request *req)
@@ -1035,6 +1086,46 @@ struct {
};
#ifdef CONFIG_X86_64
+static int generic_gcmaes_set_key(struct crypto_aead *aead, const u8 *key,
+ unsigned int key_len)
+{
+ struct generic_gcmaes_ctx *ctx = generic_gcmaes_ctx_get(aead);
+
+ return aes_set_key_common(crypto_aead_tfm(aead),
+ &ctx->aes_key_expanded, key, key_len) ?:
+ rfc4106_set_hash_subkey(ctx->hash_subkey, key, key_len);
+}
+
+static int generic_gcmaes_encrypt(struct aead_request *req)
+{
+ struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+ struct generic_gcmaes_ctx *ctx = generic_gcmaes_ctx_get(tfm);
+ void *aes_ctx = &(ctx->aes_key_expanded);
+ u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
+ __be32 counter = cpu_to_be32(1);
+
+ memcpy(iv, req->iv, 12);
+ *((__be32 *)(iv+12)) = counter;
+
+ return gcmaes_encrypt(req, req->assoclen, ctx->hash_subkey, iv,
+ aes_ctx);
+}
+
+static int generic_gcmaes_decrypt(struct aead_request *req)
+{
+ __be32 counter = cpu_to_be32(1);
+ struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+ struct generic_gcmaes_ctx *ctx = generic_gcmaes_ctx_get(tfm);
+ void *aes_ctx = &(ctx->aes_key_expanded);
+ u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
+
+ memcpy(iv, req->iv, 12);
+ *((__be32 *)(iv+12)) = counter;
+
+ return gcmaes_decrypt(req, req->assoclen, ctx->hash_subkey, iv,
+ aes_ctx);
+}
+
static struct aead_alg aesni_aead_algs[] = { {
.setkey = common_rfc4106_set_key,
.setauthsize = common_rfc4106_set_authsize,
@@ -1069,6 +1160,23 @@ static struct aead_alg aesni_aead_algs[] = { {
.cra_ctxsize = sizeof(struct cryptd_aead *),
.cra_module = THIS_MODULE,
},
+}, {
+ .setkey = generic_gcmaes_set_key,
+ .setauthsize = generic_gcmaes_set_authsize,
+ .encrypt = generic_gcmaes_encrypt,
+ .decrypt = generic_gcmaes_decrypt,
+ .ivsize = 12,
+ .maxauthsize = 16,
+ .base = {
+ .cra_name = "gcm(aes)",
+ .cra_driver_name = "generic-gcm-aesni",
+ .cra_priority = 400,
+ .cra_flags = CRYPTO_ALG_ASYNC,
+ .cra_blocksize = 1,
+ .cra_ctxsize = sizeof(struct generic_gcmaes_ctx),
+ .cra_alignmask = AESNI_ALIGN - 1,
+ .cra_module = THIS_MODULE,
+ },
} };
#else
static struct aead_alg aesni_aead_algs[0];
--
2.12.2
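
To make the difference between the two entry points concrete, here is a small userspace sketch of how the pre-counter block j0 is assembled in each case, plus the authsize check. The function names (gcm_build_j0, rfc4106_build_j0, gcm_authsize_ok) are hypothetical; only the byte layout is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define GCM_IV_LEN 12

/* Same set of tag lengths as the switch in the patch. */
static int gcm_authsize_ok(unsigned int authsize)
{
	switch (authsize) {
	case 4:
	case 8:
	case 12:
	case 13:
	case 14:
	case 15:
	case 16:
		return 0;
	default:
		return -EINVAL;
	}
}

/* Generic gcm(aes): j0 = 12-byte IV || 0x00000001 (big endian). */
static void gcm_build_j0(uint8_t j0[16], const uint8_t iv[GCM_IV_LEN])
{
	assert(j0 && iv);
	memcpy(j0, iv, GCM_IV_LEN);
	j0[12] = 0;
	j0[13] = 0;
	j0[14] = 0;
	j0[15] = 1;
}

/* rfc4106: j0 = 4-byte salt || 8-byte per-packet IV || 0x00000001. */
static void rfc4106_build_j0(uint8_t j0[16], const uint8_t salt[4],
			     const uint8_t iv[8])
{
	assert(j0 && salt && iv);
	memcpy(j0, salt, 4);
	memcpy(j0 + 4, iv, 8);
	j0[12] = 0;
	j0[13] = 0;
	j0[14] = 0;
	j0[15] = 1;
}
```

Writing the counter byte by byte avoids the cast to __be32 used in the kernel code, which keeps the sketch independent of host endianness and alignment.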
* [PATCH 1/7] crypto: aesni: make non-AVX AES-GCM work with any aadlen
From: Sabrina Dubroca @ 2017-04-28 16:11 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
In-Reply-To: <cover.1493395785.git.sd@queasysnail.net>
This is the first step to make the aesni AES-GCM implementation
generic. The current code was written for rfc4106, so it handles only
some specific sizes of associated data.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
arch/x86/crypto/aesni-intel_asm.S | 169 +++++++++++++++++++++++++++++---------
1 file changed, 132 insertions(+), 37 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 3c465184ff8a..605726aaf0a2 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -89,6 +89,29 @@ SHIFT_MASK: .octa 0x0f0e0d0c0b0a09080706050403020100
ALL_F: .octa 0xffffffffffffffffffffffffffffffff
.octa 0x00000000000000000000000000000000
+.section .rodata
+.align 16
+.type aad_shift_arr, @object
+.size aad_shift_arr, 272
+aad_shift_arr:
+ .octa 0xffffffffffffffffffffffffffffffff
+ .octa 0xffffffffffffffffffffffffffffff0C
+ .octa 0xffffffffffffffffffffffffffff0D0C
+ .octa 0xffffffffffffffffffffffffff0E0D0C
+ .octa 0xffffffffffffffffffffffff0F0E0D0C
+ .octa 0xffffffffffffffffffffff0C0B0A0908
+ .octa 0xffffffffffffffffffff0D0C0B0A0908
+ .octa 0xffffffffffffffffff0E0D0C0B0A0908
+ .octa 0xffffffffffffffff0F0E0D0C0B0A0908
+ .octa 0xffffffffffffff0C0B0A090807060504
+ .octa 0xffffffffffff0D0C0B0A090807060504
+ .octa 0xffffffffff0E0D0C0B0A090807060504
+ .octa 0xffffffff0F0E0D0C0B0A090807060504
+ .octa 0xffffff0C0B0A09080706050403020100
+ .octa 0xffff0D0C0B0A09080706050403020100
+ .octa 0xff0E0D0C0B0A09080706050403020100
+ .octa 0x0F0E0D0C0B0A09080706050403020100
+
.text
@@ -252,32 +275,66 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
mov arg8, %r12 # %r12 = aadLen
mov %r12, %r11
pxor %xmm\i, %xmm\i
+ pxor \XMM2, \XMM2
-_get_AAD_loop\num_initial_blocks\operation:
- movd (%r10), \TMP1
- pslldq $12, \TMP1
- psrldq $4, %xmm\i
+ cmp $16, %r11
+ jl _get_AAD_rest8\num_initial_blocks\operation
+_get_AAD_blocks\num_initial_blocks\operation:
+ movdqu (%r10), %xmm\i
+ PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
+ pxor %xmm\i, \XMM2
+ GHASH_MUL \XMM2, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
+ add $16, %r10
+ sub $16, %r12
+ sub $16, %r11
+ cmp $16, %r11
+ jge _get_AAD_blocks\num_initial_blocks\operation
+
+ movdqu \XMM2, %xmm\i
+ cmp $0, %r11
+ je _get_AAD_done\num_initial_blocks\operation
+
+ pxor %xmm\i,%xmm\i
+
+ /* read the last <16B of AAD. since we have at least 4B of
+ data right after the AAD (the ICV, and maybe some CT), we can
+ read 4B/8B blocks safely, and then get rid of the extra stuff */
+_get_AAD_rest8\num_initial_blocks\operation:
+ cmp $4, %r11
+ jle _get_AAD_rest4\num_initial_blocks\operation
+ movq (%r10), \TMP1
+ add $8, %r10
+ sub $8, %r11
+ pslldq $8, \TMP1
+ psrldq $8, %xmm\i
pxor \TMP1, %xmm\i
+ jmp _get_AAD_rest8\num_initial_blocks\operation
+_get_AAD_rest4\num_initial_blocks\operation:
+ cmp $0, %r11
+ jle _get_AAD_rest0\num_initial_blocks\operation
+ mov (%r10), %eax
+ movq %rax, \TMP1
add $4, %r10
- sub $4, %r12
- jne _get_AAD_loop\num_initial_blocks\operation
-
- cmp $16, %r11
- je _get_AAD_loop2_done\num_initial_blocks\operation
-
- mov $16, %r12
-_get_AAD_loop2\num_initial_blocks\operation:
+ sub $4, %r11
+ pslldq $12, \TMP1
psrldq $4, %xmm\i
- sub $4, %r12
- cmp %r11, %r12
- jne _get_AAD_loop2\num_initial_blocks\operation
-
-_get_AAD_loop2_done\num_initial_blocks\operation:
+ pxor \TMP1, %xmm\i
+_get_AAD_rest0\num_initial_blocks\operation:
+ /* finalize: shift out the extra bytes we read, and align
+ left. since pslldq can only shift by an immediate, we use
+ pshufb and an array of shuffle masks */
+ movq %r12, %r11
+ salq $4, %r11
+ movdqu aad_shift_arr(%r11), \TMP1
+ PSHUFB_XMM \TMP1, %xmm\i
+_get_AAD_rest_final\num_initial_blocks\operation:
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
+ pxor \XMM2, %xmm\i
+ GHASH_MUL %xmm\i, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
+_get_AAD_done\num_initial_blocks\operation:
xor %r11, %r11 # initialise the data pointer offset as zero
-
- # start AES for num_initial_blocks blocks
+ # start AES for num_initial_blocks blocks
mov %arg5, %rax # %rax = *Y0
movdqu (%rax), \XMM0 # XMM0 = Y0
@@ -322,7 +379,7 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
# prepare plaintext/ciphertext for GHASH computation
.endr
.endif
- GHASH_MUL %xmm\i, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
+
# apply GHASH on num_initial_blocks blocks
.if \i == 5
@@ -477,28 +534,66 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
mov arg8, %r12 # %r12 = aadLen
mov %r12, %r11
pxor %xmm\i, %xmm\i
-_get_AAD_loop\num_initial_blocks\operation:
- movd (%r10), \TMP1
- pslldq $12, \TMP1
- psrldq $4, %xmm\i
+ pxor \XMM2, \XMM2
+
+ cmp $16, %r11
+ jl _get_AAD_rest8\num_initial_blocks\operation
+_get_AAD_blocks\num_initial_blocks\operation:
+ movdqu (%r10), %xmm\i
+ PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
+ pxor %xmm\i, \XMM2
+ GHASH_MUL \XMM2, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
+ add $16, %r10
+ sub $16, %r12
+ sub $16, %r11
+ cmp $16, %r11
+ jge _get_AAD_blocks\num_initial_blocks\operation
+
+ movdqu \XMM2, %xmm\i
+ cmp $0, %r11
+ je _get_AAD_done\num_initial_blocks\operation
+
+ pxor %xmm\i,%xmm\i
+
+ /* read the last <16B of AAD. since we have at least 4B of
+ data right after the AAD (the ICV, and maybe some PT), we can
+ read 4B/8B blocks safely, and then get rid of the extra stuff */
+_get_AAD_rest8\num_initial_blocks\operation:
+ cmp $4, %r11
+ jle _get_AAD_rest4\num_initial_blocks\operation
+ movq (%r10), \TMP1
+ add $8, %r10
+ sub $8, %r11
+ pslldq $8, \TMP1
+ psrldq $8, %xmm\i
pxor \TMP1, %xmm\i
+ jmp _get_AAD_rest8\num_initial_blocks\operation
+_get_AAD_rest4\num_initial_blocks\operation:
+ cmp $0, %r11
+ jle _get_AAD_rest0\num_initial_blocks\operation
+ mov (%r10), %eax
+ movq %rax, \TMP1
add $4, %r10
- sub $4, %r12
- jne _get_AAD_loop\num_initial_blocks\operation
- cmp $16, %r11
- je _get_AAD_loop2_done\num_initial_blocks\operation
- mov $16, %r12
-_get_AAD_loop2\num_initial_blocks\operation:
+ sub $4, %r11
+ pslldq $12, \TMP1
psrldq $4, %xmm\i
- sub $4, %r12
- cmp %r11, %r12
- jne _get_AAD_loop2\num_initial_blocks\operation
-_get_AAD_loop2_done\num_initial_blocks\operation:
+ pxor \TMP1, %xmm\i
+_get_AAD_rest0\num_initial_blocks\operation:
+ /* finalize: shift out the extra bytes we read, and align
+ left. since pslldq can only shift by an immediate, we use
+ pshufb and an array of shuffle masks */
+ movq %r12, %r11
+ salq $4, %r11
+ movdqu aad_shift_arr(%r11), \TMP1
+ PSHUFB_XMM \TMP1, %xmm\i
+_get_AAD_rest_final\num_initial_blocks\operation:
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
+ pxor \XMM2, %xmm\i
+ GHASH_MUL %xmm\i, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
+_get_AAD_done\num_initial_blocks\operation:
xor %r11, %r11 # initialise the data pointer offset as zero
-
- # start AES for num_initial_blocks blocks
+ # start AES for num_initial_blocks blocks
mov %arg5, %rax # %rax = *Y0
movdqu (%rax), \XMM0 # XMM0 = Y0
@@ -543,7 +638,7 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
# prepare plaintext/ciphertext for GHASH computation
.endr
.endif
- GHASH_MUL %xmm\i, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
+
# apply GHASH on num_initial_blocks blocks
.if \i == 5
--
2.12.2
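
The tail handling and the aad_shift_arr lookup in this patch can be modelled in userspace C. The sketch below is an interpretation of the assembly, not a translation checked against hardware: pshufb128, tail_bytes_loaded, make_shift_mask and load_aad_tail are illustrative names, and the mask is computed instead of being read from a table. As in the patch, the caller is assumed to have readable bytes after the AAD, since the 8-byte and 4-byte loads may read past its end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte shuffle with PSHUFB semantics: a mask byte with the top bit set
 * yields zero, otherwise its low 4 bits select a source byte. */
static void pshufb128(uint8_t dst[16], const uint8_t src[16],
		      const uint8_t mask[16])
{
	uint8_t tmp[16];
	int i;

	for (i = 0; i < 16; i++)
		tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
	memcpy(dst, tmp, 16);
}

/* Bytes actually loaded for a tail of rem bytes: 8-byte reads while
 * more than 4 bytes are left, then one 4-byte read if any remain. */
static size_t tail_bytes_loaded(size_t rem)
{
	long left = (long)rem;
	size_t loaded = 0;

	while (left > 4) {
		loaded += 8;
		left -= 8;
	}
	if (left > 0)
		loaded += 4;
	return loaded;
}

/* Recompute entry "rem" of the shuffle-mask table: the loaded bytes sit
 * at the top of the register, so the mask moves the first rem of them
 * down to byte 0 and zeroes everything else. */
static void make_shift_mask(uint8_t mask[16], size_t rem)
{
	size_t first, i;

	assert(rem <= 16);
	first = 16 - tail_bytes_loaded(rem);
	for (i = 0; i < 16; i++)
		mask[i] = (i < rem) ? (uint8_t)(first + i) : 0xff;
}

/* Load a tail of rem (< 16) AAD bytes into a zero-padded block.
 * May read up to 7 bytes beyond aad + rem. */
static void load_aad_tail(uint8_t out[16], const uint8_t *aad, size_t rem)
{
	uint8_t reg[16] = { 0 }, mask[16];
	long left = (long)rem;

	assert(rem < 16);
	while (left > 4) {
		memmove(reg, reg + 8, 8);	/* psrldq $8 */
		memcpy(reg + 8, aad, 8);	/* new bytes enter at the top */
		aad += 8;
		left -= 8;
	}
	if (left > 0) {
		memmove(reg, reg + 4, 12);	/* psrldq $4 */
		memcpy(reg + 12, aad, 4);
	}
	make_shift_mask(mask, rem);
	pshufb128(out, reg, mask);
}
```

For a 5-byte tail, for example, one 8-byte load places the data in bytes 8 to 15, and the computed mask (08 09 0A 0B 0C followed by 0xff) matches the sixth entry of aad_shift_arr when that .octa value is read in little-endian byte order.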
* [PATCH 3/7] crypto: aesni: make AVX AES-GCM work with any aadlen
From: Sabrina Dubroca @ 2017-04-28 16:11 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
In-Reply-To: <cover.1493395785.git.sd@queasysnail.net>
This is the first step to make the aesni AES-GCM implementation
generic. The current code was written for rfc4106, so it handles
only some specific sizes of associated data.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
arch/x86/crypto/aesni-intel_avx-x86_64.S | 122 ++++++++++++++++++++++---------
1 file changed, 88 insertions(+), 34 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index d664382c6e56..a73117c84904 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -155,6 +155,30 @@ SHIFT_MASK: .octa 0x0f0e0d0c0b0a09080706050403020100
ALL_F: .octa 0xffffffffffffffffffffffffffffffff
.octa 0x00000000000000000000000000000000
+.section .rodata
+.align 16
+.type aad_shift_arr, @object
+.size aad_shift_arr, 272
+aad_shift_arr:
+ .octa 0xffffffffffffffffffffffffffffffff
+ .octa 0xffffffffffffffffffffffffffffff0C
+ .octa 0xffffffffffffffffffffffffffff0D0C
+ .octa 0xffffffffffffffffffffffffff0E0D0C
+ .octa 0xffffffffffffffffffffffff0F0E0D0C
+ .octa 0xffffffffffffffffffffff0C0B0A0908
+ .octa 0xffffffffffffffffffff0D0C0B0A0908
+ .octa 0xffffffffffffffffff0E0D0C0B0A0908
+ .octa 0xffffffffffffffff0F0E0D0C0B0A0908
+ .octa 0xffffffffffffff0C0B0A090807060504
+ .octa 0xffffffffffff0D0C0B0A090807060504
+ .octa 0xffffffffff0E0D0C0B0A090807060504
+ .octa 0xffffffff0F0E0D0C0B0A090807060504
+ .octa 0xffffff0C0B0A09080706050403020100
+ .octa 0xffff0D0C0B0A09080706050403020100
+ .octa 0xff0E0D0C0B0A09080706050403020100
+ .octa 0x0F0E0D0C0B0A09080706050403020100
+
+
.text
@@ -372,41 +396,72 @@ VARIABLE_OFFSET = 16*8
.macro INITIAL_BLOCKS_AVX num_initial_blocks T1 T2 T3 T4 T5 CTR XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 T6 T_key ENC_DEC
i = (8-\num_initial_blocks)
+ j = 0
setreg
- mov arg6, %r10 # r10 = AAD
- mov arg7, %r12 # r12 = aadLen
-
-
- mov %r12, %r11
-
- vpxor reg_i, reg_i, reg_i
-_get_AAD_loop\@:
- vmovd (%r10), \T1
- vpslldq $12, \T1, \T1
- vpsrldq $4, reg_i, reg_i
- vpxor \T1, reg_i, reg_i
-
- add $4, %r10
- sub $4, %r12
- jg _get_AAD_loop\@
-
-
- cmp $16, %r11
- je _get_AAD_loop2_done\@
- mov $16, %r12
-
-_get_AAD_loop2\@:
- vpsrldq $4, reg_i, reg_i
- sub $4, %r12
- cmp %r11, %r12
- jg _get_AAD_loop2\@
-
-_get_AAD_loop2_done\@:
-
- #byte-reflect the AAD data
- vpshufb SHUF_MASK(%rip), reg_i, reg_i
-
+ mov arg6, %r10 # r10 = AAD
+ mov arg7, %r12 # r12 = aadLen
+
+
+ mov %r12, %r11
+
+ vpxor reg_j, reg_j, reg_j
+ vpxor reg_i, reg_i, reg_i
+ cmp $16, %r11
+ jl _get_AAD_rest8\@
+_get_AAD_blocks\@:
+ vmovdqu (%r10), reg_i
+ vpshufb SHUF_MASK(%rip), reg_i, reg_i
+ vpxor reg_i, reg_j, reg_j
+ GHASH_MUL_AVX reg_j, \T2, \T1, \T3, \T4, \T5, \T6
+ add $16, %r10
+ sub $16, %r12
+ sub $16, %r11
+ cmp $16, %r11
+ jge _get_AAD_blocks\@
+ vmovdqu reg_j, reg_i
+ cmp $0, %r11
+ je _get_AAD_done\@
+
+ vpxor reg_i, reg_i, reg_i
+
+ /* read the last <16B of AAD. since we have at least 4B of
+ data right after the AAD (the ICV, and maybe some CT), we can
+ read 4B/8B blocks safely, and then get rid of the extra stuff */
+_get_AAD_rest8\@:
+ cmp $4, %r11
+ jle _get_AAD_rest4\@
+ movq (%r10), \T1
+ add $8, %r10
+ sub $8, %r11
+ vpslldq $8, \T1, \T1
+ vpsrldq $8, reg_i, reg_i
+ vpxor \T1, reg_i, reg_i
+ jmp _get_AAD_rest8\@
+_get_AAD_rest4\@:
+ cmp $0, %r11
+ jle _get_AAD_rest0\@
+ mov (%r10), %eax
+ movq %rax, \T1
+ add $4, %r10
+ sub $4, %r11
+ vpslldq $12, \T1, \T1
+ vpsrldq $4, reg_i, reg_i
+ vpxor \T1, reg_i, reg_i
+_get_AAD_rest0\@:
+ /* finalize: shift out the extra bytes we read, and align
+ left. since pslldq can only shift by an immediate, we use
+ vpshufb and an array of shuffle masks */
+ movq %r12, %r11
+ salq $4, %r11
+ movdqu aad_shift_arr(%r11), \T1
+ vpshufb \T1, reg_i, reg_i
+_get_AAD_rest_final\@:
+ vpshufb SHUF_MASK(%rip), reg_i, reg_i
+ vpxor reg_j, reg_i, reg_i
+ GHASH_MUL_AVX reg_i, \T2, \T1, \T3, \T4, \T5, \T6
+
+_get_AAD_done\@:
# initialize the data pointer offset as zero
xor %r11, %r11
@@ -480,7 +535,6 @@ VARIABLE_OFFSET = 16*8
i = (8-\num_initial_blocks)
j = (9-\num_initial_blocks)
setreg
- GHASH_MUL_AVX reg_i, \T2, \T1, \T3, \T4, \T5, \T6
.rep \num_initial_blocks
vpxor reg_i, reg_j, reg_j
--
2.12.2
^ permalink raw reply related
* [PATCH 6/7] crypto: aesni: make AVX2 AES-GCM work with all valid auth_tag_len
From: Sabrina Dubroca @ 2017-04-28 16:12 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Hannes Frederic Sowa, Herbert Xu,
David S. Miller, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, linux-crypto, linux-kernel
In-Reply-To: <cover.1493395785.git.sd@queasysnail.net>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
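As a rough C model of the control flow added below (the _T_16 / _T_8 / _T_4 / _T_123 / _T_1 labels), the tag can be written out in 8-, 4-, 2- and 1-byte pieces until auth_tag_len bytes have been stored. This is a sketch only: write_auth_tag is a hypothetical name, and unlike the assembly it also copes with lengths below 4, which are not valid GCM tag sizes anyway.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Store the first tag_len bytes of a 16-byte tag at dst, never writing
 * past dst[tag_len - 1]. The pieces mirror the labels in the assembly.
 */
static void write_auth_tag(uint8_t *dst, const uint8_t tag[16], size_t tag_len)
{
	size_t off = 0;

	assert(tag_len <= 16);

	if (tag_len == 16) {			/* _T_16: one full store */
		memcpy(dst, tag, 16);
		return;
	}
	if (tag_len - off >= 8) {		/* _T_8 */
		memcpy(dst + off, tag + off, 8);
		off += 8;
	}
	if (tag_len - off >= 4) {		/* _T_4 */
		memcpy(dst + off, tag + off, 4);
		off += 4;
	}
	if (tag_len - off >= 2) {		/* _T_123, 2-byte store */
		memcpy(dst + off, tag + off, 2);
		off += 2;
	}
	if (tag_len - off >= 1)			/* _T_1 */
		dst[off] = tag[off];
}
```

For a 13-byte tag this does an 8-byte store, a 4-byte store and a final 1-byte store, which is the path the new _T_8, _T_4 and _T_1 labels take.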
arch/x86/crypto/aesni-intel_avx-x86_64.S | 31 ++++++++++++++++++++++++-------
1 file changed, 24 insertions(+), 7 deletions(-)
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 7230808a7cef..faecb1518bf8 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -2804,19 +2804,36 @@ ENDPROC(aesni_gcm_dec_avx_gen2)
cmp $16, %r11
je _T_16\@
- cmp $12, %r11
- je _T_12\@
+ cmp $8, %r11
+ jl _T_4\@
_T_8\@:
vmovq %xmm9, %rax
mov %rax, (%r10)
- jmp _return_T_done\@
-_T_12\@:
- vmovq %xmm9, %rax
- mov %rax, (%r10)
+ add $8, %r10
+ sub $8, %r11
vpsrldq $8, %xmm9, %xmm9
+ cmp $0, %r11
+ je _return_T_done\@
+_T_4\@:
vmovd %xmm9, %eax
- mov %eax, 8(%r10)
+ mov %eax, (%r10)
+ add $4, %r10
+ sub $4, %r11
+ vpsrldq $4, %xmm9, %xmm9
+ cmp $0, %r11
+ je _return_T_done\@
+_T_123\@:
+ vmovd %xmm9, %eax
+ cmp $2, %r11
+ jl _T_1\@
+ mov %ax, (%r10)
+ cmp $2, %r11
+ je _return_T_done\@
+ add $2, %r10
+ sar $16, %eax
+_T_1\@:
+ mov %al, (%r10)
jmp _return_T_done\@
_T_16\@:
--
2.12.2
^ permalink raw reply related
* [patch net-next] net: sched: add helpers to handle extended actions
From: Jiri Pirko @ 2017-04-28 16:13 UTC (permalink / raw)
To: netdev; +Cc: davem, jhs, xiyou.wangcong, mlxsw
From: Jiri Pirko <jiri@mellanox.com>
Jump is currently the only action that carries a value in its opcode.
This is going to change soon, so introduce helpers to work with such
opcodes and convert TC_ACT_JUMP to use them.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
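A small userspace sketch of how the combined opcode is meant to be composed and tested. The macros are copied from the diff below. Everything else (make_jump, is_jump_old, is_jump_new, ext_value and EXAMPLE_ACT_OTHER) is hypothetical and exists only for illustration. In particular, local opcode 3 is not defined by this patch, and the kernel code extracts the jump count with TCA_ACT_MAX_PRIO_MASK rather than the full value mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Macros as introduced by the patch. */
#define __TC_ACT_EXT_SHIFT 28
#define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT)
#define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1)
#define TC_ACT_EXT_CMP(combined, opcode) \
	(((combined) & (~TC_ACT_EXT_VAL_MASK)) == opcode)

#define TC_ACT_JUMP __TC_ACT_EXT(1)

/* Hypothetical second extended action, NOT part of the patch. */
#define EXAMPLE_ACT_OTHER __TC_ACT_EXT(3)

/* Combine the jump opcode (highest nibble) with a value (low 28 bits). */
static int make_jump(int count)
{
	assert(count >= 0 && count <= TC_ACT_EXT_VAL_MASK);
	return TC_ACT_JUMP | count;
}

/* Old test: true for any opcode that happens to have bit 28 set. */
static bool is_jump_old(int ret)
{
	return (ret & TC_ACT_JUMP) != 0;
}

/* New test: compares the whole highest nibble against the opcode. */
static bool is_jump_new(int ret)
{
	return TC_ACT_EXT_CMP(ret, TC_ACT_JUMP);
}

/* Value part of a combined opcode. */
static int ext_value(int ret)
{
	return ret & TC_ACT_EXT_VAL_MASK;
}
```

The point of the conversion shows up once a second extended action exists: EXAMPLE_ACT_OTHER | 5 passes the old bit test but not the new comparison, because local opcodes 1 and 3 share bit 28.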
include/uapi/linux/pkt_cls.h | 15 ++++++++++++++-
net/sched/act_api.c | 2 +-
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index f1129e3..d613be3 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -37,7 +37,20 @@ enum {
#define TC_ACT_QUEUED 5
#define TC_ACT_REPEAT 6
#define TC_ACT_REDIRECT 7
-#define TC_ACT_JUMP 0x10000000
+
+/* There is a special kind of actions called "extended actions",
+ * which need a value parameter. These have a local opcode located in
+ * the highest nibble, starting from 1. The rest of the bits
+ * are used to carry the value. These two parts together make
+ * a combined opcode.
+ */
+#define __TC_ACT_EXT_SHIFT 28
+#define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT)
+#define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1)
+#define TC_ACT_EXT_CMP(combined, opcode) \
+ (((combined) & (~TC_ACT_EXT_VAL_MASK)) == opcode)
+
+#define TC_ACT_JUMP __TC_ACT_EXT(1)
/* Action type identifiers*/
enum {
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 7f2cd70..a90e8f3 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -453,7 +453,7 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
if (ret == TC_ACT_REPEAT)
goto repeat; /* we need a ttl - JHS */
- if (ret & TC_ACT_JUMP) {
+ if (TC_ACT_EXT_CMP(ret, TC_ACT_JUMP)) {
jmp_prgcnt = ret & TCA_ACT_MAX_PRIO_MASK;
if (!jmp_prgcnt || (jmp_prgcnt > nr_actions)) {
/* faulty opcode, stop pipeline */
--
2.7.4
^ permalink raw reply related
* Re: llvm-objdump...
From: David Miller @ 2017-04-28 16:17 UTC (permalink / raw)
To: ast; +Cc: daniel, netdev
In-Reply-To: <612f0df6-711c-b6b1-4bd7-596a7f329737@fb.com>
From: Alexei Starovoitov <ast@fb.com>
Date: Tue, 25 Apr 2017 20:48:52 -0700
> On 4/25/17 10:13 AM, David Miller wrote:
>>
>> I think there are some endianness issues ;-)
>>
>> davem@patience:~/src/GIT/net-next/tools/testing/selftests/bpf$
>> llvm-objdump -S x.o
>
> nice host name ;)
>
>> x.o: file format ELF64-BPF
>>
>> Disassembly of section test1:
>> process:
>> 0: b7 00 00 00 00 00 00 02 r0 = 33554432
>> 1: 61 21 00 50 00 00 00 00 r1 = *(u32 *)(r2 + 20480)
>>
>> That first instruction should be "r0 = 2"
>
> hmm. I haven't tested it on big endian.
> When last time s390 folks tested samples/bpf with llvm we didn't even
> have automatic -march=bpf in llvm, so they used -march=bpfeb.
> There was no llvm-objdump support either.
>
> llvm side does this:
> static Triple::ArchType parseBPFArch(StringRef ArchName) {
> if (ArchName.equals("bpf")) {
> if (sys::IsLittleEndianHost)
> return Triple::bpfel;
> else
> return Triple::bpfeb;
> } else if (ArchName.equals("bpf_be") || ArchName.equals("bpfeb")) {
> return Triple::bpfeb;
> } else if (ArchName.equals("bpf_le") || ArchName.equals("bpfel")) {
> return Triple::bpfel;
>
> It works for clang and for llvm.
> I thought llvm-objdump should infer triple from elf file
> and do the 'right thing'... hmm
>
> could you please test it with -g and see whether dwarf is still
> correct in .o ?
> llvm-objdump -S should print original C code next to asm.
> Hope bpf dwarf is not broken on big-endian...
Even if I give it -triple=bpfeb it emits immediates incorrectly.
The bug is certainly in the insn field fetcher of the disassembler.
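To make the symptom concrete, here is a small sketch that decodes the imm and off fields of the two instructions from the dump above in both byte orders. It relies only on the standard 8-byte BPF instruction layout (opcode, register byte, 16-bit off, 32-bit imm). The helper names are made up for this example and have nothing to do with the LLVM sources.

```c
#include <assert.h>
#include <stdint.h>

/* The two instructions exactly as printed by llvm-objdump above. */
static const uint8_t insn0[8] = { 0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02 };
static const uint8_t insn1[8] = { 0x61, 0x21, 0x00, 0x50, 0x00, 0x00, 0x00, 0x00 };

static uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* off lives in bytes 2-3, imm in bytes 4-7 of each instruction. */
static uint16_t insn_off(const uint8_t *insn, int big_endian)
{
	assert(insn != 0);
	return big_endian ? load_be16(insn + 2) : load_le16(insn + 2);
}

static uint32_t insn_imm(const uint8_t *insn, int big_endian)
{
	assert(insn != 0);
	return big_endian ? load_be32(insn + 4) : load_le32(insn + 4);
}
```

Reading insn0's imm as little-endian gives 33554432 (0x02000000), which is what the disassembler printed, while the big-endian read gives the expected 2. Likewise the off field of insn1 is 20480 (0x5000) as little-endian and 80 (0x50) as big-endian.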
^ permalink raw reply
* Re: [PATCH v6 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: Sabrina Dubroca @ 2017-04-28 16:18 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: netdev, linux-kernel, David.Laight, kernel-hardening, davem
In-Reply-To: <20170425184734.26563-1-Jason@zx2c4.com>
2017-04-25, 20:47:30 +0200, Jason A. Donenfeld wrote:
> This is a defense-in-depth measure in response to bugs like
> 4d6fa57b4dab ("macsec: avoid heap overflow in skb_to_sgvec"). While
> we're at it, we also limit the amount of recursion this function is
> allowed to do. Not actually providing a bounded base case is a future
> diaster that we can easily avoid here.
>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
> Changes v5->v6:
> * Use unlikely() for the rare overflow conditions.
> * Also bound recursion, since this is a potential disaster we can avert.
>
> net/core/skbuff.c | 31 ++++++++++++++++++++++++-------
> 1 file changed, 24 insertions(+), 7 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index f86bf69cfb8d..24fb53f8534e 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3489,16 +3489,22 @@ void __init skb_init(void)
> * @len: Length of buffer space to be mapped
> *
> * Fill the specified scatter-gather list with mappings/pointers into a
> - * region of the buffer space attached to a socket buffer.
> + * region of the buffer space attached to a socket buffer. Returns either
> + * the number of scatterlist items used, or -EMSGSIZE if the contents
> + * could not fit.
> */
One small thing here: since you're touching this comment, could you
move it next to skb_to_sgvec, since that's the function it's supposed
to document?
Thanks!
> static int
> -__skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
> +__skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len,
> + unsigned int recursion_level)
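For context, the two checks discussed in the quoted changelog (fail with -EMSGSIZE instead of overflowing, and bound the recursion) follow a common pattern that can be sketched outside the kernel. Everything here is hypothetical: struct example_buf, count_segs and EXAMPLE_MAX_DEPTH are invented for illustration and are not the skb code or its actual limit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical recursion limit, chosen only for this example. */
#define EXAMPLE_MAX_DEPTH 4

/* Hypothetical nested buffer: some segments of its own plus an optional child. */
struct example_buf {
	int nsegs;
	const struct example_buf *child;
};

/*
 * Count the list entries needed for buf and its children.
 * Returns the number of entries used, or -EMSGSIZE if they would not fit
 * in capacity or the nesting is deeper than EXAMPLE_MAX_DEPTH.
 */
static int count_segs(const struct example_buf *buf, int capacity,
		      unsigned int depth)
{
	int used, ret;

	assert(buf != NULL);

	if (depth >= EXAMPLE_MAX_DEPTH)
		return -EMSGSIZE;	/* bounded base case */
	if (buf->nsegs > capacity)
		return -EMSGSIZE;	/* would overflow the caller's list */

	used = buf->nsegs;
	if (buf->child) {
		ret = count_segs(buf->child, capacity - used, depth + 1);
		if (ret < 0)
			return ret;	/* propagate the error upwards */
		used += ret;
	}

	return used;
}
```

Callers then have to check for a negative return before using the count, which is the part of the contract the updated kernel-doc comment spells out.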
--
Sabrina
^ permalink raw reply
* Re: [PATCH net v1 0/3] tipc: fix hanging socket connections
From: David Miller @ 2017-04-28 16:20 UTC (permalink / raw)
To: parthasarathy.bhuvaragan; +Cc: netdev, tipc-discussion
In-Reply-To: <1493193902-18332-1-git-send-email-parthasarathy.bhuvaragan@ericsson.com>
From: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Date: Wed, 26 Apr 2017 10:04:59 +0200
> This patch series contains fixes for the socket layer to
> prevent hanging / stale connections.
Series applied, thank you.
^ permalink raw reply