Linux Security Modules development


* Re: [PATCH v18 05/23] net: Prepare UDS for security module stacking
From: Casey Schaufler @ 2020-07-09 19:24 UTC
  To: John Johansen, Stephen Smalley
  Cc: Casey Schaufler, James Morris, LSM List, SElinux list, Kees Cook,
	Tetsuo Handa, Paul Moore, Stephen Smalley, netdev,
	Casey Schaufler
In-Reply-To: <8a5a243f-e991-ad55-0503-654cc2587133@canonical.com>

On 7/9/2020 9:28 AM, John Johansen wrote:
> On 7/9/20 9:11 AM, Stephen Smalley wrote:
>> On Wed, Jul 8, 2020 at 8:23 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>> Change the data used in UDS SO_PEERSEC processing from a
>>> secid to a more general struct lsmblob. Update the
>>> security_socket_getpeersec_dgram() interface to use the
>>> lsmblob. There is a small amount of scaffolding code
>>> that will come out when the security_secid_to_secctx()
>>> code is brought in line with the lsmblob.
>>>
>>> The secid field of the unix_skb_parms structure has been
>>> replaced with a pointer to an lsmblob structure, and the
>>> lsmblob is allocated as needed. This is similar to how the
>>> list of passed files is managed. While an lsmblob structure
>>> will fit in the available space today, there is no guarantee
>>> that the addition of other data to the unix_skb_parms or
>>> support for additional security modules wouldn't exceed what
>>> is available.
>>>
>>> Reviewed-by: Kees Cook <keescook@chromium.org>
>>> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
>>> Cc: netdev@vger.kernel.org
>>> ---
>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>> index 3385a7a0b231..d246aefcf4da 100644
>>> --- a/net/unix/af_unix.c
>>> +++ b/net/unix/af_unix.c
>>> @@ -138,17 +138,23 @@ static struct hlist_head *unix_sockets_unbound(void *addr)
>>>  #ifdef CONFIG_SECURITY_NETWORK
>>>  static void unix_get_secdata(struct scm_cookie *scm, struct sk_buff *skb)
>>>  {
>>> -       UNIXCB(skb).secid = scm->secid;
>>> +       UNIXCB(skb).lsmdata = kmemdup(&scm->lsmblob, sizeof(scm->lsmblob),
>>> +                                     GFP_KERNEL);
>>>  }
>>>
>>>  static inline void unix_set_secdata(struct scm_cookie *scm, struct sk_buff *skb)
>>>  {
>>> -       scm->secid = UNIXCB(skb).secid;
>>> +       if (likely(UNIXCB(skb).lsmdata))
>>> +               scm->lsmblob = *(UNIXCB(skb).lsmdata);
>>> +       else
>>> +               lsmblob_init(&scm->lsmblob, 0);
>>>  }
>>>
>>>  static inline bool unix_secdata_eq(struct scm_cookie *scm, struct sk_buff *skb)
>>>  {
>>> -       return (scm->secid == UNIXCB(skb).secid);
>>> +       if (likely(UNIXCB(skb).lsmdata))
>>> +               return lsmblob_equal(&scm->lsmblob, UNIXCB(skb).lsmdata);
>>> +       return false;
>>>  }
>> I don't think that this provides sensible behavior to userspace.  On a
>> transient memory allocation failure, instead of returning an error to
>> the sender and letting them handle it, this will just proceed with
>> sending the message without its associated security information, and
>> potentially split messages on arbitrary boundaries because it cannot
>> tell whether the sender had the same security information.  I think
>> you instead need to change unix_get_secdata() to return an error on
>> allocation failure and propagate that up to the sender.

Can't say that I think that would go over especially well.
You're right about that being a better, or at least more correct,
change.

>>   Not a fan of
>> this change in general both due to extra overhead on this code path
>> and potential for breakage on allocation failures.  I know it was
>> motivated by Paul's observation that we won't be able to fit many more
>> secids into the cb but not sure we have to go there prematurely,

Paul wasn't completely against the original approach. His objection
was that using a struct lsmblob, which is already close to the maximum
size it can be and which can grow over time, might be a hard sell.

>> especially absent its usage by upstream AA (no unix_stream_connect
>> hook implementation upstream).  Also not sure how the whole bpf local
> I'm not sure how premature it is. I am running late for 5.9 but would
> like to land AppArmor unix mediation in 5.10.

Which means that scaffolding around the UNIXCB.secid wouldn't
suffice for very long.

>
>> storage approach to supporting security modules (or at least bpf lsm)
>> might reduce need for expanding these structures?

I think the allocation failure case would still be an issue,
and it could be much more complicated to deal with using the
local storage model. The fundamental problem comes back to fitting
more than 32 bits of information into 32 bits without having to
perform an operation that might fail.

At this point I'm inclined to revert to the original implementation
and see if it doesn't turn out to be acceptable after all. I remain
open to better ideas.
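
For reference, the trade-off behind the "original implementation" can be
sketched in userspace C. This is only a model under stated assumptions:
the 48-byte figure is the size of the sk_buff cb[] scratch area, while
the field layout of unix_skb_parms_model, the LSMBLOB_ENTRIES value and
the helper names are simplified stand-ins invented for illustration, not
the kernel definitions. Embedding the blob by value means no allocation
can fail on the send path, at the cost of a hard size ceiling that is
checked at build time.

```c
#include <stddef.h>
#include <stdint.h>

#define CB_SIZE 48		/* size of the sk_buff cb[] scratch area */
#define LSMBLOB_ENTRIES 2	/* assumption: one secid slot per stacked module */

struct lsmblob {
	uint32_t secid[LSMBLOB_ENTRIES];
};

/* Simplified, hypothetical model of unix_skb_parms with the blob embedded. */
struct unix_skb_parms_model {
	void *pid;
	uint32_t uid;
	uint32_t gid;
	void *fp;
	struct lsmblob lsmblob;	/* takes the place of the single u32 secid */
	uint32_t consumed;
};

/* Build fails as soon as the embedded blob no longer fits in the cb. */
_Static_assert(sizeof(struct unix_skb_parms_model) <= CB_SIZE,
	       "unix_skb_parms_model no longer fits in the cb");

/* Copy by value: nothing here can fail, so no error path is needed. */
static void model_set_secdata(struct unix_skb_parms_model *cb,
			      const struct lsmblob *blob)
{
	cb->lsmblob = *blob;
}

/* Bytes still free in the cb for future fields or more secid slots. */
static size_t model_cb_headroom(void)
{
	return CB_SIZE - sizeof(struct unix_skb_parms_model);
}
```

Raising LSMBLOB_ENTRIES in this model eventually trips the static
assertion, which is the growth concern attributed to Paul above.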




* Re: [PATCH v19 08/12] landlock: Add syscall implementation
From: Christian Brauner @ 2020-07-09 17:47 UTC
  To: Arnd Bergmann
  Cc: Mickaël Salaün, linux-kernel@vger.kernel.org, Al Viro,
	Andy Lutomirski, Anton Ivanov, Casey Schaufler, James Morris,
	Jann Horn, Jeff Dike, Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Richard Weinberger, Serge E . Hallyn,
	Shuah Khan, Vincent Dagonneau, Kernel Hardening, Linux API,
	linux-arch, open list:DOCUMENTATION, Linux FS-devel Mailing List,
	open list:KERNEL SELFTEST FRAMEWORK, LSM List,
	the arch/x86 maintainers
In-Reply-To: <CAK8P3a34X1qfDhn8u3nR+aQA_g+V2i35L0oTnvhNAs83YJPB_w@mail.gmail.com>

On Thu, Jul 09, 2020 at 07:26:18PM +0200, Arnd Bergmann wrote:
> On Wed, Jul 8, 2020 at 7:50 PM Mickaël Salaün <mic@digikod.net> wrote:
> > On 08/07/2020 15:49, Arnd Bergmann wrote:
> > > On Wed, Jul 8, 2020 at 3:04 PM Mickaël Salaün <mic@digikod.net> wrote:
> > >> On 08/07/2020 10:57, Arnd Bergmann wrote:
> > >>> On Tue, Jul 7, 2020 at 8:10 PM Mickaël Salaün <mic@digikod.net> wrote:
> > >>>
> > >>> It looks like all you need here today is a single argument bit, plus
> > >>> possibly some room for extensibility. I would suggest removing all
> > >>> the extra bits and using a syscall like
> > >>>
> > >>> SYSCALL_DEFINE1(landlock_create_ruleset, u32, flags);
> > >>>
> > >>> I don't really see how this needs any variable-length arguments,
> > >>> it really doesn't do much.
> > >>
> > >> We need the attr_ptr/attr_size pattern because the number of ruleset
> > >> properties will increase (e.g. network access mask).
> > >
> > > But how many bits do you think you will *actually* need in total that
> > > this needs to be a two-dimensional set of flags? At the moment you
> > > only have a single bit that you interpret.
> >
> > I think there is a misunderstanding. For this syscall I wasn't talking
> > about the "options" field but about the "handled_access_fs" field which
> > has 14 bits dedicated to control access to the file system:
> > https://landlock.io/linux-doc/landlock-v19/security/landlock/user.html#filesystem-flags
> 
> Ok, got it. I didn't read far enough there.
> 
> > The idea is to add other handled_access_* fields for other kernel object
> > types (e.g. network, process, etc.).
> >
> > The "options" field is fine as a raw __u32 syscall argument.
> 
> I'd still like to avoid having it variable-length and structured though.
> How about having a __u32 "options" flag, plus an indirect argument
> with 32 fixed-length (all 32 bit or all 64 bit) flag words, each of which
> corresponds to one of the option bits?
> 
> It's still fairly complex that way, but not as much as the version
> you have right now that can be extended in multiple dimensions.
> 
> This could possibly also help avoid the need for the get_features

What is this fresh hell again, please?

> syscall: If user space just passes the bitmap of all the access flags
> it wants to use in a fixed-size structure, the kernel can update the
> bits to mask out the ones it does not understand and write back
> that bitmap as the result of create_ruleset().
> 
> > >>> To be on the safe side, you might split up the flags into either the
> > >>> upper/lower 16 bits or two u32 arguments, to allow both compatible
> > >>> (ignored by older kernels if flag is set) and incompatible (return error
> > >>> when an unknown flag is set) bits.
> > >>
> > >> This may be a good idea in general, but in the case of Landlock, because
> > >> this kind of (discretionary) sandboxing should be a best-effort security
> > >> feature, we should avoid incompatible behavior. In practice, every
> > >> unknown bit returns an error because userland can probe for available
> > >> bits thanks to the get_features command. This kind of (in)compatibility
> > >> can then be handled by userland.
> > >
> > > If there are not going to be incompatible extensions, then just ignore
> > > all unknown bits and never return an error but get rid of the user
> > > space probing that just complicates the interface.
> >
> > There were multiple discussions about ABI compatibility, especially
> > inspired by open(2) vs. openat2(2), and ignoring flags seems to be a bad
> > idea. In the "sandboxer" example, we first probe the supported features
> > and then mask unknown bits (i.e. access rights) at run time in userland.
> > This strategy is quite straightforward, backward compatible and
> > future-proof.
> 
> For behavior changing flags, I agree they should be seen as
> incompatible flags (i.e. return an error if an unknown bit is set).
> 
> However, for the flags you pass in in an allowlist, treating them
> as compatible (i.e. ignore any unknown flags, allowing everything
> you are not forbidding already) seems completely reasonable
> to me. Do you foresee user space doing anything other than masking
> out the bits that the kernel doesn't know about? If not, then doing
> it in the kernel should always be simpler.
> 
> > >> I suggest this syscall signature:
> > >> SYSCALL_DEFINE3(landlock_create_ruleset, __u32, options, const struct
> > >> landlock_attr_ruleset __user *, ruleset_ptr, size_t, ruleset_size);
> > >
> > > The other problem here is that indirect variable-size structured arguments
> > > are a pain to instrument with things like strace or seccomp, so you
> > > should first try to use a fixed argument list, and fall back to a fixed
> > > structure if that fails.
> >
> > I agree that it is not perfect with the current tools, but these kinds
> > of extensible structs are becoming common and well defined (e.g. openat2).
> > Moreover there is some work going on for seccomp to support "extensible
> > argument" syscalls: https://lwn.net/Articles/822256/
> 
> openat2() is already more complex than we'd ideally want. I think we
> should try hard to make new syscalls simpler than that, following the
> rule that any interface should be as simple as possible, but no simpler.

Extensible structs are targeted at system calls that are known to
grow a lot of features, that have prior versions which have already
accumulated quite a lot of features, or that by their nature need to be
more complex.
openat2() is not really complex per se (at least not yet; it will likely
grow quite a bit in the future). The kernel has had infrastructure for
this since clone3(), later generalized with openat2(), and is
well-equipped with a consistent API to deal with such syscalls, so I
don't see how this is really an issue in the first place. Yes, syscalls
should be kept as simple as possible, but we don't need to lock
ourselves into a "structs as arguments are inherently bad" mindset.
That would also cause us to end up with crappy syscalls that are awkward
for userspace to use.
(Second-level pointers are a whole different issue, of course.)

(Arnd, you should also note that we're giving a talk at kernel summit
about new syscall conventions and I'm syncing with Florian who'll be
talking about the userspace side and requirements of this.)

Christian
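
The extensible-struct convention referred to above can be illustrated
with a small userspace model. This is a sketch under assumptions, not
kernel code: model_copy_struct() only imitates, as I understand it, the
size-negotiation behaviour of the kernel's copy_struct_from_user()
helper, and the two ruleset_attr_* layouts (including the
handled_access_net field) are hypothetical names made up for the
example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute layouts; the field names are illustrative only. */
struct ruleset_attr_v1 {
	uint64_t handled_access_fs;
};

struct ruleset_attr_v2 {
	uint64_t handled_access_fs;
	uint64_t handled_access_net;	/* field added by a later "kernel" */
};

/*
 * Copy a caller-supplied struct of usize bytes into a ksize-byte struct.
 * Returns 0 on success, or -E2BIG when the caller set a field that this
 * side does not know about.
 */
static int model_copy_struct(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	size_t size = ksize < usize ? ksize : usize;
	const unsigned char *tail = (const unsigned char *)src + size;
	size_t i;

	/* Caller's struct is newer: every unknown trailing byte must be zero. */
	for (i = 0; usize > ksize && i < usize - ksize; i++)
		if (tail[i] != 0)
			return -E2BIG;

	/* Caller's struct is older: fields it does not supply read as zero. */
	memset(dst, 0, ksize);
	memcpy(dst, src, size);
	return 0;
}
```

In this model an old caller passing ruleset_attr_v1 to a side that
expects ruleset_attr_v2 gets handled_access_net == 0, and a new caller
only sees -E2BIG from an old side if it actually sets the newer field.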

> 
> > >>>> +static int syscall_add_rule_path_beneath(const void __user *const attr_ptr,
> > >>>> +               const size_t attr_size)
> > >>>> +{
> > >>>> +       struct landlock_attr_path_beneath attr_path_beneath;
> > >>>> +       struct path path;
> > >>>> +       struct landlock_ruleset *ruleset;
> > >>>> +       int err;
> > >>>
> > >>> Similarly, it looks like this wants to be
> > >>>
> > >>> SYSCALL_DEFINE3(landlock_add_rule_path_beneath, int, ruleset, int,
> > >>> path, __u32, flags)
> > >>>
> > >>> I don't see any need to extend this in a way that wouldn't already
> > >>> be served better by adding another system call. You might argue
> > >>> that 'flags' and 'allowed_access' could be separate, with the latter
> > >>> being an indirect in/out argument here, like
> > >>>
> > >>> SYSCALL_DEFINE4(landlock_add_rule_path_beneath, int, ruleset, int, path,
> > >>>                            __u64 *, allowed_access, __u32, flags)
> > >>
> > >> To avoid adding a new syscall for each new rule type (e.g. path_beneath,
> > >> path_range, net_ipv4_range, etc.), I think it would be better to keep
> > >> the attr_ptr/attr_size pattern and to explicitly set a dedicated option
> > >> flag to specify the attr type.
> > >>
> > >> This would look like this:
> > >> SYSCALL_DEFINE4(landlock_add_rule, __u32, options, int, ruleset, const
> > >> void __user *, rule_ptr, size_t, rule_size);
> > >>
> > >> The rule_ptr could then point to multiple types like struct
> > >> landlock_attr_path_beneath (without the current ruleset_fd field).
> > >
> > > This again introduces variable-sized structured data. How many different
> > > kinds of rule types do you think there will be (most likely, and maybe an
> > > upper bound)?
> >
> > I don't know how many rule types will come, but right now I think it may
> > be less than 10.
> 
> Ok,
> 
> > > Could (some of) these be generalized to use the same data structure?
> >
> > I don't think so, file path and network addresses are an example of very
> > different types.
> 
> Clearly the target object is something different, but maybe there is
> enough commonality to still make them fit into a more regular form.
> 
> For the file system case, you have an identifier for an object
> (the file descriptor) and the '__u64 allowed_access'. I would
> expect that the 'allowed_access' concept is generic enough that
> you can make it a direct argument (32 bit register arg, or pointer
> to a __u64). Do you expect others to need something besides
> an object identifier and a permission bitmask? Maybe it could
> be something like
> 
>  SYSCALL_DEFINE5(landlock_add_rule, int, ruleset, __u32, options,
>                        const void __user *, object,
>                        const __u64 __user *, allowed_access,
>                        __u32, flags);
> 
> with a fixed-length 'object' identifier type (file descriptor,
> sockaddr_storage, ...) for each option.
> 
>     Arnd


* Re: [PATCH v19 08/12] landlock: Add syscall implementation
From: Arnd Bergmann @ 2020-07-09 17:26 UTC
  To: Mickaël Salaün
  Cc: linux-kernel@vger.kernel.org, Al Viro, Andy Lutomirski,
	Anton Ivanov, Casey Schaufler, James Morris, Jann Horn, Jeff Dike,
	Jonathan Corbet, Kees Cook, Michael Kerrisk,
	Mickaël Salaün, Richard Weinberger, Serge E . Hallyn,
	Shuah Khan, Vincent Dagonneau, Kernel Hardening, Linux API,
	linux-arch, open list:DOCUMENTATION, Linux FS-devel Mailing List,
	open list:KERNEL SELFTEST FRAMEWORK, LSM List,
	the arch/x86 maintainers
In-Reply-To: <ec79f6ad-1c11-d69f-724b-622baa28f19f@digikod.net>

On Wed, Jul 8, 2020 at 7:50 PM Mickaël Salaün <mic@digikod.net> wrote:
> On 08/07/2020 15:49, Arnd Bergmann wrote:
> > On Wed, Jul 8, 2020 at 3:04 PM Mickaël Salaün <mic@digikod.net> wrote:
> >> On 08/07/2020 10:57, Arnd Bergmann wrote:
> >>> On Tue, Jul 7, 2020 at 8:10 PM Mickaël Salaün <mic@digikod.net> wrote:
> >>>
> >>> It looks like all you need here today is a single argument bit, plus
> >>> possibly some room for extensibility. I would suggest removing all
> >>> the extra bits and using a syscall like
> >>>
> >>> SYSCALL_DEFINE1(landlock_create_ruleset, u32, flags);
> >>>
> >>> I don't really see how this needs any variable-length arguments,
> >>> it really doesn't do much.
> >>
> >> We need the attr_ptr/attr_size pattern because the number of ruleset
> >> properties will increase (e.g. network access mask).
> >
> > But how many bits do you think you will *actually* need in total that
> > this needs to be a two-dimensional set of flags? At the moment you
> > only have a single bit that you interpret.
>
> I think there is a misunderstanding. For this syscall I wasn't talking
> about the "options" field but about the "handled_access_fs" field which
> has 14 bits dedicated to control access to the file system:
> https://landlock.io/linux-doc/landlock-v19/security/landlock/user.html#filesystem-flags

Ok, got it. I didn't read far enough there.

> The idea is to add other handled_access_* fields for other kernel object
> types (e.g. network, process, etc.).
>
> The "options" field is fine as a raw __u32 syscall argument.

I'd still like to avoid having it variable-length and structured though.
How about having a __u32 "options" flag, plus an indirect argument
with 32 fixed-length (all 32 bit or all 64 bit) flag words, each of which
corresponds to one of the option bits?

It's still fairly complex that way, but not as much as the version
you have right now that can be extended in multiple dimensions.

This could possibly also help avoid the need for the get_features
syscall: If user space just passes the bitmap of all the access flags
it wants to use in a fixed-size structure, the kernel can update the
bits to mask out the ones it does not understand and write back
that bitmap as the result of create_ruleset().

> >>> To be on the safe side, you might split up the flags into either the
> >>> upper/lower 16 bits or two u32 arguments, to allow both compatible
> >>> (ignored by older kernels if flag is set) and incompatible (return error
> >>> when an unknown flag is set) bits.
> >>
> >> This may be a good idea in general, but in the case of Landlock, because
> >> this kind of (discretionary) sandboxing should be a best-effort security
> >> feature, we should avoid incompatible behavior. In practice, every
> >> unknown bit returns an error because userland can probe for available
> >> bits thanks to the get_features command. This kind of (in)compatibility
> >> can then be handled by userland.
> >
> > If there are not going to be incompatible extensions, then just ignore
> > all unknown bits and never return an error but get rid of the user
> > space probing that just complicates the interface.
>
> There were multiple discussions about ABI compatibility, especially
> inspired by open(2) vs. openat2(2), and ignoring flags seems to be a bad
> idea. In the "sandboxer" example, we first probe the supported features
> and then mask unknown bits (i.e. access rights) at run time in userland.
> This strategy is quite straightforward, backward compatible and
> future-proof.

For behavior changing flags, I agree they should be seen as
incompatible flags (i.e. return an error if an unknown bit is set).

However, for the flags you pass in in an allowlist, treating them
as compatible (i.e. ignore any unknown flags, allowing everything
you are not forbidding already) seems completely reasonable
to me. Do you foresee user space doing anything other than masking
out the bits that the kernel doesn't know about? If not, then doing
it in the kernel should always be simpler.

> >> I suggest this syscall signature:
> >> SYSCALL_DEFINE3(landlock_create_ruleset, __u32, options, const struct
> >> landlock_attr_ruleset __user *, ruleset_ptr, size_t, ruleset_size);
> >
> > The other problem here is that indirect variable-size structured arguments
> > are a pain to instrument with things like strace or seccomp, so you
> > should first try to use a fixed argument list, and fall back to a fixed
> > structure if that fails.
>
> I agree that it is not perfect with the current tools, but these kinds
> of extensible structs are becoming common and well defined (e.g. openat2).
> Moreover there is some work going on for seccomp to support "extensible
> argument" syscalls: https://lwn.net/Articles/822256/

openat2() is already more complex than we'd ideally want. I think we
should try hard to make new syscalls simpler than that, following the
rule that any interface should be as simple as possible, but no simpler.

> >>>> +static int syscall_add_rule_path_beneath(const void __user *const attr_ptr,
> >>>> +               const size_t attr_size)
> >>>> +{
> >>>> +       struct landlock_attr_path_beneath attr_path_beneath;
> >>>> +       struct path path;
> >>>> +       struct landlock_ruleset *ruleset;
> >>>> +       int err;
> >>>
> >>> Similarly, it looks like this wants to be
> >>>
> >>> SYSCALL_DEFINE3(landlock_add_rule_path_beneath, int, ruleset, int,
> >>> path, __u32, flags)
> >>>
> >>> I don't see any need to extend this in a way that wouldn't already
> >>> be served better by adding another system call. You might argue
> >>> that 'flags' and 'allowed_access' could be separate, with the latter
> >>> being an indirect in/out argument here, like
> >>>
> >>> SYSCALL_DEFINE4(landlock_add_rule_path_beneath, int, ruleset, int, path,
> >>>                            __u64 *, allowed_access, __u32, flags)
> >>
> >> To avoid adding a new syscall for each new rule type (e.g. path_beneath,
> >> path_range, net_ipv4_range, etc.), I think it would be better to keep
> >> the attr_ptr/attr_size pattern and to explicitly set a dedicated option
> >> flag to specify the attr type.
> >>
> >> This would look like this:
> >> SYSCALL_DEFINE4(landlock_add_rule, __u32, options, int, ruleset, const
> >> void __user *, rule_ptr, size_t, rule_size);
> >>
> >> The rule_ptr could then point to multiple types like struct
> >> landlock_attr_path_beneath (without the current ruleset_fd field).
> >
> > This again introduces variable-sized structured data. How many different
> > kinds of rule types do you think there will be (most likely, and maybe an
> > upper bound)?
>
> I don't know how many rule types will come, but right now I think it may
> be less than 10.

Ok,

> > Could (some of) these be generalized to use the same data structure?
>
> I don't think so, file path and network addresses are an example of very
> different types.

Clearly the target object is something different, but maybe there is
enough commonality to still make them fit into a more regular form.

For the file system case, you have an identifier for an object
(the file descriptor) and the '__u64 allowed_access'. I would
expect that the 'allowed_access' concept is generic enough that
you can make it a direct argument (32 bit register arg, or pointer
to a __u64). Do you expect others to need something besides
an object identifier and a permission bitmask? Maybe it could
be something like

 SYSCALL_DEFINE5(landlock_add_rule, int, ruleset, __u32, options,
                       const void __user *, object,
                       const __u64 __user *, allowed_access,
                       __u32, flags);

with a fixed-length 'object' identifier type (file descriptor,
sockaddr_storage, ...) for each option.

    Arnd
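
The mask-and-write-back idea proposed above can be sketched as a
userspace model. Everything named here is an assumption made for
illustration: model_create_ruleset() is not the proposed syscall, the
single known option bit is hypothetical, and only the figure of 14
file-system access bits comes from the thread. Unknown option bits are
treated as incompatible (error), while unknown access bits are treated
as compatible (silently masked and reported back to the caller).

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_ACCESS_FS_BITS	14	/* per the v19 handled_access_fs field */
#define MODEL_KNOWN_ACCESS_FS	((UINT64_C(1) << MODEL_ACCESS_FS_BITS) - 1)
#define MODEL_KNOWN_OPTIONS	UINT32_C(0x1)	/* hypothetical single option bit */

/*
 * Returns 0 on success and rewrites *handled_access_fs so the caller can
 * see which access bits were understood. A real syscall would presumably
 * return a ruleset file descriptor instead of 0.
 */
static int model_create_ruleset(uint32_t options, uint64_t *handled_access_fs)
{
	/* Behaviour-changing flags: an unknown bit is an error. */
	if (options & ~MODEL_KNOWN_OPTIONS)
		return -EINVAL;

	/* Allowlist bits: drop what is not understood and write it back. */
	*handled_access_fs &= MODEL_KNOWN_ACCESS_FS;
	return 0;
}
```

With this shape the caller needs no separate probing step: it passes
every access bit it knows about and inspects the written-back mask to
learn which ones the running kernel will actually enforce.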


* Re: [PATCH v18 05/23] net: Prepare UDS for security module stacking
From: John Johansen @ 2020-07-09 16:28 UTC
  To: Stephen Smalley, Casey Schaufler
  Cc: Casey Schaufler, James Morris, LSM List, SElinux list, Kees Cook,
	Tetsuo Handa, Paul Moore, Stephen Smalley, netdev
In-Reply-To: <CAEjxPJ4EefLKKvMo=8ZWeA4gVioH=WQ=52rnMuW5TnyExmJsRg@mail.gmail.com>

On 7/9/20 9:11 AM, Stephen Smalley wrote:
> On Wed, Jul 8, 2020 at 8:23 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>
>> Change the data used in UDS SO_PEERSEC processing from a
>> secid to a more general struct lsmblob. Update the
>> security_socket_getpeersec_dgram() interface to use the
>> lsmblob. There is a small amount of scaffolding code
>> that will come out when the security_secid_to_secctx()
>> code is brought in line with the lsmblob.
>>
>> The secid field of the unix_skb_parms structure has been
>> replaced with a pointer to an lsmblob structure, and the
>> lsmblob is allocated as needed. This is similar to how the
>> list of passed files is managed. While an lsmblob structure
>> will fit in the available space today, there is no guarantee
>> that the addition of other data to the unix_skb_parms or
>> support for additional security modules wouldn't exceed what
>> is available.
>>
>> Reviewed-by: Kees Cook <keescook@chromium.org>
>> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
>> Cc: netdev@vger.kernel.org
>> ---
> 
>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> index 3385a7a0b231..d246aefcf4da 100644
>> --- a/net/unix/af_unix.c
>> +++ b/net/unix/af_unix.c
>> @@ -138,17 +138,23 @@ static struct hlist_head *unix_sockets_unbound(void *addr)
>>  #ifdef CONFIG_SECURITY_NETWORK
>>  static void unix_get_secdata(struct scm_cookie *scm, struct sk_buff *skb)
>>  {
>> -       UNIXCB(skb).secid = scm->secid;
>> +       UNIXCB(skb).lsmdata = kmemdup(&scm->lsmblob, sizeof(scm->lsmblob),
>> +                                     GFP_KERNEL);
>>  }
>>
>>  static inline void unix_set_secdata(struct scm_cookie *scm, struct sk_buff *skb)
>>  {
>> -       scm->secid = UNIXCB(skb).secid;
>> +       if (likely(UNIXCB(skb).lsmdata))
>> +               scm->lsmblob = *(UNIXCB(skb).lsmdata);
>> +       else
>> +               lsmblob_init(&scm->lsmblob, 0);
>>  }
>>
>>  static inline bool unix_secdata_eq(struct scm_cookie *scm, struct sk_buff *skb)
>>  {
>> -       return (scm->secid == UNIXCB(skb).secid);
>> +       if (likely(UNIXCB(skb).lsmdata))
>> +               return lsmblob_equal(&scm->lsmblob, UNIXCB(skb).lsmdata);
>> +       return false;
>>  }
> 
> I don't think that this provides sensible behavior to userspace.  On a
> transient memory allocation failure, instead of returning an error to
> the sender and letting them handle it, this will just proceed with
> sending the message without its associated security information, and
> potentially split messages on arbitrary boundaries because it cannot
> tell whether the sender had the same security information.  I think
> you instead need to change unix_get_secdata() to return an error on
> allocation failure and propagate that up to the sender.  Not a fan of
> this change in general both due to extra overhead on this code path
> and potential for breakage on allocation failures.  I know it was
> motivated by Paul's observation that we won't be able to fit many more
> secids into the cb but not sure we have to go there prematurely,
> especially absent its usage by upstream AA (no unix_stream_connect
> hook implementation upstream).  Also not sure how the whole bpf local

I'm not sure how premature it is. I am running late for 5.9 but would
like to land AppArmor unix mediation in 5.10.

> storage approach to supporting security modules (or at least bpf lsm)
> might reduce need for expanding these structures?
> 



* Re: [PATCH v18 05/23] net: Prepare UDS for security module stacking
From: Stephen Smalley @ 2020-07-09 16:11 UTC
  To: Casey Schaufler
  Cc: Casey Schaufler, James Morris, LSM List, SElinux list, Kees Cook,
	John Johansen, Tetsuo Handa, Paul Moore, Stephen Smalley, netdev
In-Reply-To: <20200709001234.9719-6-casey@schaufler-ca.com>

On Wed, Jul 8, 2020 at 8:23 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>
> Change the data used in UDS SO_PEERSEC processing from a
> secid to a more general struct lsmblob. Update the
> security_socket_getpeersec_dgram() interface to use the
> lsmblob. There is a small amount of scaffolding code
> that will come out when the security_secid_to_secctx()
> code is brought in line with the lsmblob.
>
> The secid field of the unix_skb_parms structure has been
> replaced with a pointer to an lsmblob structure, and the
> lsmblob is allocated as needed. This is similar to how the
> list of passed files is managed. While an lsmblob structure
> will fit in the available space today, there is no guarantee
> that the addition of other data to the unix_skb_parms or
> support for additional security modules wouldn't exceed what
> is available.
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
> Cc: netdev@vger.kernel.org
> ---

> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 3385a7a0b231..d246aefcf4da 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -138,17 +138,23 @@ static struct hlist_head *unix_sockets_unbound(void *addr)
>  #ifdef CONFIG_SECURITY_NETWORK
>  static void unix_get_secdata(struct scm_cookie *scm, struct sk_buff *skb)
>  {
> -       UNIXCB(skb).secid = scm->secid;
> +       UNIXCB(skb).lsmdata = kmemdup(&scm->lsmblob, sizeof(scm->lsmblob),
> +                                     GFP_KERNEL);
>  }
>
>  static inline void unix_set_secdata(struct scm_cookie *scm, struct sk_buff *skb)
>  {
> -       scm->secid = UNIXCB(skb).secid;
> +       if (likely(UNIXCB(skb).lsmdata))
> +               scm->lsmblob = *(UNIXCB(skb).lsmdata);
> +       else
> +               lsmblob_init(&scm->lsmblob, 0);
>  }
>
>  static inline bool unix_secdata_eq(struct scm_cookie *scm, struct sk_buff *skb)
>  {
> -       return (scm->secid == UNIXCB(skb).secid);
> +       if (likely(UNIXCB(skb).lsmdata))
> +               return lsmblob_equal(&scm->lsmblob, UNIXCB(skb).lsmdata);
> +       return false;
>  }

I don't think that this provides sensible behavior to userspace.  On a
transient memory allocation failure, instead of returning an error to
the sender and letting them handle it, this will just proceed with
sending the message without its associated security information, and
potentially split messages on arbitrary boundaries because it cannot
tell whether the sender had the same security information.  I think
you instead need to change unix_get_secdata() to return an error on
allocation failure and propagate that up to the sender.  Not a fan of
this change in general both due to extra overhead on this code path
and potential for breakage on allocation failures.  I know it was
motivated by Paul's observation that we won't be able to fit many more
secids into the cb but not sure we have to go there prematurely,
especially absent its usage by upstream AA (no unix_stream_connect
hook implementation upstream).  Also not sure how the whole bpf local
storage approach to supporting security modules (or at least bpf lsm)
might reduce need for expanding these structures?
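
The change requested above (fail the send instead of silently dropping
the security data) can be sketched as a userspace model. All names
prefixed with model_ are invented for illustration, the struct layouts
are simplified stand-ins, and malloc() takes the place of kmemdup();
this is not the kernel code, only the shape of the error propagation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LSMBLOB_ENTRIES 2	/* assumption: one secid slot per stacked module */

struct lsmblob { unsigned int secid[LSMBLOB_ENTRIES]; };
struct scm_cookie { struct lsmblob lsmblob; };
struct model_skb { struct lsmblob *lsmdata; };	/* stands in for UNIXCB(skb) */

static bool model_fail_next_alloc;	/* knob to simulate a transient failure */

/* Stand-in for kmemdup(): duplicate len bytes, or return NULL on failure. */
static void *model_memdup(const void *src, size_t len)
{
	void *p;

	if (model_fail_next_alloc) {
		model_fail_next_alloc = false;
		return NULL;
	}
	p = malloc(len);
	if (p)
		memcpy(p, src, len);
	return p;
}

/* Returns 0 on success, -ENOMEM if the security data could not be attached. */
static int model_unix_get_secdata(const struct scm_cookie *scm,
				  struct model_skb *skb)
{
	skb->lsmdata = model_memdup(&scm->lsmblob, sizeof(scm->lsmblob));
	return skb->lsmdata ? 0 : -ENOMEM;
}

/* Hypothetical send path: give up before queueing if the label is missing. */
static int model_sendmsg(const struct scm_cookie *scm, struct model_skb *skb)
{
	int err = model_unix_get_secdata(scm, skb);

	if (err)
		return err;	/* the sender sees the error; nothing is queued */
	/* ... the message would be queued here ... */
	return 0;
}
```

Because the error surfaces at send time, the receive side in this model
never has to guess at a missing blob when deciding whether two messages
carry the same security information.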


* [PATCH bpf-next v4 4/4] bpf: Add selftests for local_storage
From: KP Singh @ 2020-07-09 10:12 UTC
  To: linux-kernel, bpf, linux-security-module
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <20200709101239.3829793-1-kpsingh@chromium.org>

From: KP Singh <kpsingh@google.com>

inode_local_storage:

* Hook to the file_open and inode_unlink LSM hooks.
* Create and unlink a temporary file.
* Store some information in the inode's bpf_local_storage during
  file_open.
* Verify that this information exists when the file is unlinked.

sk_local_storage:

* Hook to the socket_post_create and socket_bind LSM hooks.
* Open and bind a socket and set the sk_storage in the
  socket_post_create hook using the start_server helper.
* Verify if the information is set in the socket_bind hook.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: KP Singh <kpsingh@google.com>
---
 .../bpf/prog_tests/test_local_storage.c       |  60 ++++++++
 .../selftests/bpf/progs/local_storage.c       | 136 ++++++++++++++++++
 2 files changed, 196 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_local_storage.c
 create mode 100644 tools/testing/selftests/bpf/progs/local_storage.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_local_storage.c b/tools/testing/selftests/bpf/prog_tests/test_local_storage.c
new file mode 100644
index 000000000000..d4ba89195c43
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_local_storage.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (C) 2020 Google LLC.
+ */
+
+#include <test_progs.h>
+#include <linux/limits.h>
+
+#include "local_storage.skel.h"
+#include "network_helpers.h"
+
+int create_and_unlink_file(void)
+{
+	char fname[PATH_MAX] = "/tmp/fileXXXXXX";
+	int fd;
+
+	fd = mkstemp(fname);
+	if (fd < 0)
+		return fd;
+
+	close(fd);
+	unlink(fname);
+	return 0;
+}
+
+void test_test_local_storage(void)
+{
+	struct local_storage *skel = NULL;
+	int err, duration = 0, serv_sk = -1;
+
+	skel = local_storage__open_and_load();
+	if (CHECK(!skel, "skel_load", "lsm skeleton failed\n"))
+		goto close_prog;
+
+	err = local_storage__attach(skel);
+	if (CHECK(err, "attach", "lsm attach failed: %d\n", err))
+		goto close_prog;
+
+	skel->bss->monitored_pid = getpid();
+
+	err = create_and_unlink_file();
+	if (CHECK(err < 0, "exec_cmd", "err %d errno %d\n", err, errno))
+		goto close_prog;
+
+	CHECK(!skel->bss->inode_storage_result, "inode_storage_result",
+	      "inode_local_storage not set");
+
+	serv_sk = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0);
+	if (CHECK(serv_sk < 0, "start_server", "failed to start server\n"))
+		goto close_prog;
+
+	CHECK(!skel->bss->sk_storage_result, "sk_storage_result",
+	      "sk_local_storage not set");
+
+	close(serv_sk);
+
+close_prog:
+	local_storage__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/local_storage.c b/tools/testing/selftests/bpf/progs/local_storage.c
new file mode 100644
index 000000000000..cb608b7b90f0
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/local_storage.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2020 Google LLC.
+ */
+
+#include <errno.h>
+#include <linux/bpf.h>
+#include <stdbool.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#define DUMMY_STORAGE_VALUE 0xdeadbeef
+
+int monitored_pid = 0;
+bool inode_storage_result = false;
+bool sk_storage_result = false;
+
+struct dummy_storage {
+	__u32 value;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_INODE_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, struct dummy_storage);
+} inode_storage_map SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_SK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_CLONE);
+	__type(key, int);
+	__type(value, struct dummy_storage);
+} sk_storage_map SEC(".maps");
+
+/* TODO Use vmlinux.h once BTF pruning for embedded types is fixed.
+ */
+struct sock {} __attribute__((preserve_access_index));
+struct sockaddr {} __attribute__((preserve_access_index));
+struct socket {
+	struct sock *sk;
+} __attribute__((preserve_access_index));
+
+struct inode {} __attribute__((preserve_access_index));
+struct dentry {
+	struct inode *d_inode;
+} __attribute__((preserve_access_index));
+struct file {
+	struct inode *f_inode;
+} __attribute__((preserve_access_index));
+
+
+SEC("lsm/inode_unlink")
+int BPF_PROG(unlink_hook, struct inode *dir, struct dentry *victim)
+{
+	__u32 pid = bpf_get_current_pid_tgid() >> 32;
+	struct dummy_storage *storage;
+
+	if (pid != monitored_pid)
+		return 0;
+
+	storage = bpf_inode_storage_get(&inode_storage_map, victim->d_inode, 0,
+				     BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!storage)
+		return 0;
+
+	if (storage->value == DUMMY_STORAGE_VALUE)
+		inode_storage_result = true;
+
+	return 0;
+}
+
+SEC("lsm/socket_bind")
+int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address,
+	     int addrlen)
+{
+	__u32 pid = bpf_get_current_pid_tgid() >> 32;
+	struct dummy_storage *storage;
+
+	if (pid != monitored_pid)
+		return 0;
+
+	storage = bpf_sk_storage_get(&sk_storage_map, sock->sk, 0,
+				     BPF_SK_STORAGE_GET_F_CREATE);
+	if (!storage)
+		return 0;
+
+	if (storage->value == DUMMY_STORAGE_VALUE)
+		sk_storage_result = true;
+
+	return 0;
+}
+
+SEC("lsm/socket_post_create")
+int BPF_PROG(socket_post_create, struct socket *sock, int family, int type,
+	     int protocol, int kern)
+{
+	__u32 pid = bpf_get_current_pid_tgid() >> 32;
+	struct dummy_storage *storage;
+
+	if (pid != monitored_pid)
+		return 0;
+
+	storage = bpf_sk_storage_get(&sk_storage_map, sock->sk, 0,
+				     BPF_SK_STORAGE_GET_F_CREATE);
+	if (!storage)
+		return 0;
+
+	storage->value = DUMMY_STORAGE_VALUE;
+
+	return 0;
+}
+
+SEC("lsm/file_open")
+int BPF_PROG(test_int_hook, struct file *file)
+{
+	__u32 pid = bpf_get_current_pid_tgid() >> 32;
+	struct dummy_storage *storage;
+
+	if (pid != monitored_pid)
+		return 0;
+
+	if (!file->f_inode)
+		return 0;
+
+	storage = bpf_inode_storage_get(&inode_storage_map, file->f_inode, 0,
+				     BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!storage)
+		return 0;
+
+	storage->value = DUMMY_STORAGE_VALUE;
+	return 0;
+}
-- 
2.27.0.389.gc38d7665816-goog


^ permalink raw reply related

* [PATCH bpf-next v4 1/4] bpf: Generalize bpf_sk_storage
From: KP Singh @ 2020-07-09 10:12 UTC (permalink / raw)
  To: linux-kernel, bpf, linux-security-module
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <20200709101239.3829793-1-kpsingh@chromium.org>

From: KP Singh <kpsingh@google.com>

Refactor the functionality in bpf_sk_storage.c so that the concept of
storage linked to kernel objects can be extended to other objects like
inode, task_struct etc.

bpf_sk_storage is updated to be bpf_local_storage with a union that
contains a pointer to the owner object. The type of the
bpf_local_storage can be determined using the newly added
bpf_local_storage_type enum.

Each new local storage will still be a separate map and provide its own
set of helpers. This allows for future object-specific extensions while
still sharing a lot of the underlying implementation.
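
As a rough userspace illustration of the layout described above (not the
kernel code): the storage records which kind of object owns it, keeps
the owner pointer in a union, and holds a list of elements that is
searched with the map pointer as the key. All names here (fake_sock,
fake_inode, fake_map, owner_type, elem, local_storage, storage_link,
storage_lookup) are hypothetical, and locking, RCU and the lookup cache
are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects and for a bpf map. */
struct fake_sock  { int fd; };
struct fake_inode { unsigned long ino; };
struct fake_map   { size_t value_size; };

/* Tag saying which member of the owner union is valid. */
enum owner_type { OWNER_SK, OWNER_INODE };

/* One element per (owner, map) pair; the map pointer is the lookup key. */
struct elem {
	const struct fake_map *map;
	struct elem *next;
	int value;
};

/* Per-object storage: a tagged owner pointer plus the list of elems. */
struct local_storage {
	enum owner_type type;
	union {
		struct fake_sock *sk;
		struct fake_inode *inode;
	} owner;
	struct elem *list;
};

/* Attach an element belonging to "map" to the owner's storage. */
static void storage_link(struct local_storage *ls, struct elem *e,
			 const struct fake_map *map)
{
	e->map = map;
	e->next = ls->list;
	ls->list = e;
}

/* Walk the owner's list and return the element created for "map". */
static struct elem *storage_lookup(const struct local_storage *ls,
				   const struct fake_map *map)
{
	struct elem *e;

	for (e = ls->list; e; e = e->next)
		if (e->map == map)
			return e;
	return NULL;
}
```

Because the lookup only compares map pointers, the same list walk works
for any owner type; only the code that finds the storage from the owner
has to be object specific.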

Signed-off-by: KP Singh <kpsingh@google.com>
---
 include/linux/bpf.h               |  14 +
 include/linux/bpf_local_storage.h | 175 +++++++
 include/net/sock.h                |   4 +-
 include/uapi/linux/bpf.h          |  12 +-
 kernel/bpf/Makefile               |   1 +
 kernel/bpf/bpf_local_storage.c    | 517 +++++++++++++++++++
 net/core/bpf_sk_storage.c         | 803 ++++++------------------------
 tools/include/uapi/linux/bpf.h    |  12 +-
 8 files changed, 879 insertions(+), 659 deletions(-)
 create mode 100644 include/linux/bpf_local_storage.h
 create mode 100644 kernel/bpf/bpf_local_storage.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0cd7f6884c5c..95ab7031cd8e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -33,6 +33,9 @@ struct btf;
 struct btf_type;
 struct exception_table_entry;
 struct seq_operations;
+struct bpf_local_storage;
+struct bpf_local_storage_map;
+struct bpf_local_storage_elem;
 
 extern struct idr btf_idr;
 extern spinlock_t btf_idr_lock;
@@ -93,6 +96,17 @@ struct bpf_map_ops {
 	__poll_t (*map_poll)(struct bpf_map *map, struct file *filp,
 			     struct poll_table_struct *pts);
 
+	/* Functions called by bpf_local_storage maps */
+	void (*map_local_storage_unlink)(struct bpf_local_storage *local_storage,
+					 bool uncharge_omem);
+	struct bpf_local_storage_elem *(*map_selem_alloc)(
+		struct bpf_local_storage_map *smap, void *owner, void *value,
+		bool charge_omem);
+	struct bpf_local_storage_data *(*map_local_storage_update)(
+		void  *owner, struct bpf_map *map, void *value, u64 flags);
+	int (*map_local_storage_alloc)(void *owner,
+				       struct bpf_local_storage_map *smap,
+				       struct bpf_local_storage_elem *elem);
 	/* BTF name and id of struct allocated by map_alloc */
 	const char * const map_btf_name;
 	int *map_btf_id;
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
new file mode 100644
index 000000000000..605b81f2f806
--- /dev/null
+++ b/include/linux/bpf_local_storage.h
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2019 Facebook
+ * Copyright 2020 Google LLC.
+ */
+
+#ifndef _BPF_LOCAL_STORAGE_H
+#define _BPF_LOCAL_STORAGE_H
+
+#include <linux/bpf.h>
+#include <linux/rculist.h>
+#include <linux/list.h>
+#include <linux/hash.h>
+#include <linux/types.h>
+#include <uapi/linux/btf.h>
+
+#define LOCAL_STORAGE_CREATE_FLAG_MASK					\
+	(BPF_F_NO_PREALLOC | BPF_F_CLONE)
+
+struct bucket {
+	struct hlist_head list;
+	raw_spinlock_t lock;
+};
+
+/* The map is not the primary owner of a bpf_local_storage_elem.
+ * Instead, the container object (e.g. sk->sk_bpf_storage) is.
+ *
+ * The map (bpf_local_storage_map) is for two purposes
+ * 1. Define the size of the "local storage".  It is
+ *    the map's value_size.
+ *
+ * 2. Maintain a list to keep track of all elems such
+ *    that they can be cleaned up during the map destruction.
+ *
+ * When a bpf local storage is being looked up for a
+ * particular object,  the "bpf_map" pointer is actually used
+ * as the "key" to search in the list of elem in
+ * the respective bpf_local_storage owned by the object.
+ *
+ * e.g. sk->sk_bpf_storage is the mini-map with the "bpf_map" pointer
+ * as the searching key.
+ */
+struct bpf_local_storage_map {
+	struct bpf_map map;
+	/* Lookup elem does not require accessing the map.
+	 *
+	 * Updating/Deleting requires a bucket lock to
+	 * link/unlink the elem from the map.  Having
+	 * multiple buckets to reduce contention.
+	 */
+	struct bucket *buckets;
+	u32 bucket_log;
+	u16 elem_size;
+	u16 cache_idx;
+};
+
+struct bpf_local_storage_data {
+	/* smap is used as the searching key when looking up
+	 * from the object's bpf_local_storage.
+	 *
+	 * Put it in the same cacheline as the data to minimize
+	 * the number of cachelines access during the cache hit case.
+	 */
+	struct bpf_local_storage_map __rcu *smap;
+	u8 data[] __aligned(8);
+};
+
+/* Linked to bpf_local_storage and bpf_local_storage_map */
+struct bpf_local_storage_elem {
+	struct hlist_node map_node;	/* Linked to bpf_local_storage_map */
+	struct hlist_node snode;	/* Linked to bpf_local_storage */
+	struct bpf_local_storage __rcu *local_storage;
+	struct rcu_head rcu;
+	/* 8 bytes hole */
+	/* The data is stored in another cacheline to minimize
+	 * the number of cachelines accessed during a cache hit.
+	 */
+	struct bpf_local_storage_data sdata ____cacheline_aligned;
+};
+
+#define SELEM(_SDATA) \
+	container_of((_SDATA), struct bpf_local_storage_elem, sdata)
+#define SDATA(_SELEM) (&(_SELEM)->sdata)
+#define BPF_STORAGE_CACHE_SIZE	16
+
+u16 bpf_ls_cache_idx_get(spinlock_t *cache_idx_lock,
+			 u64 *cache_idx_usage_counts);
+
+void bpf_ls_cache_idx_free(spinlock_t *cache_idx_lock,
+			   u64 *cache_idx_usage_counts, u16 idx);
+
+#define DEFINE_BPF_STORAGE_CACHE(type)					\
+static DEFINE_SPINLOCK(cache_idx_lock_##type);				\
+static u64 cache_idx_usage_counts_##type[BPF_STORAGE_CACHE_SIZE];	\
+static u16 cache_idx_get_##type(void)					\
+{									\
+	return bpf_ls_cache_idx_get(&cache_idx_lock_##type,		\
+				    cache_idx_usage_counts_##type);	\
+}									\
+static void cache_idx_free_##type(u16 idx)				\
+{									\
+	bpf_ls_cache_idx_free(&cache_idx_lock_##type,			\
+			      cache_idx_usage_counts_##type,		\
+			      idx);					\
+}
+
+/* U16_MAX is much more than enough for sk local storage
+ * considering a tcp_sock is ~2k.
+ */
+#define BPF_LOCAL_STORAGE_MAX_VALUE_SIZE				\
+	min_t(u32,							\
+	      (KMALLOC_MAX_SIZE - MAX_BPF_STACK -			\
+	       sizeof(struct bpf_local_storage_elem)),			\
+	      (U16_MAX - sizeof(struct bpf_local_storage_elem)))
+
+struct bpf_local_storage {
+	struct bpf_local_storage_data __rcu *cache[BPF_STORAGE_CACHE_SIZE];
+	struct hlist_head list;		/* List of bpf_local_storage_elem */
+	/* The object that owns the above "list" of
+	 * bpf_local_storage_elem
+	 */
+	union {
+		struct sock *sk;
+	};
+	struct rcu_head rcu;
+	raw_spinlock_t lock;	/* Protect adding/removing from the "list" */
+};
+
+/* Helper functions for bpf_local_storage */
+int bpf_local_storage_map_alloc_check(union bpf_attr *attr);
+
+struct bpf_local_storage_map *bpf_local_storage_map_alloc(union bpf_attr *attr);
+
+struct bpf_local_storage_data *
+bpf_local_storage_lookup(struct bpf_local_storage *local_storage,
+			 struct bpf_local_storage_map *smap,
+			 bool cacheit_lockit);
+
+void bpf_local_storage_map_free(struct bpf_local_storage_map *smap);
+
+int bpf_local_storage_map_check_btf(const struct bpf_map *map,
+				    const struct btf *btf,
+				    const struct btf_type *key_type,
+				    const struct btf_type *value_type);
+
+void bpf_selem_link(struct bpf_local_storage *local_storage,
+		    struct bpf_local_storage_elem *selem);
+
+bool bpf_selem_unlink(struct bpf_local_storage *local_storage,
+		      struct bpf_local_storage_elem *selem, bool uncharge_omem);
+
+void bpf_selem_unlink_map_elem(struct bpf_local_storage_elem *selem);
+
+void bpf_selem_link_map(struct bpf_local_storage_map *smap,
+			struct bpf_local_storage_elem *selem);
+
+void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem);
+
+struct bpf_local_storage_elem *
+bpf_selem_alloc(struct bpf_local_storage_map *smap, void *value);
+
+struct bpf_local_storage *
+bpf_local_storage_alloc(struct bpf_local_storage_map *smap);
+
+int bpf_local_storage_publish(struct bpf_local_storage_elem *first_selem,
+			      struct bpf_local_storage **addr,
+			      struct bpf_local_storage *curr);
+
+int bpf_local_storage_check_update_flags(struct bpf_map *map, u64 map_flags);
+
+struct bpf_local_storage_data *
+bpf_local_storage_update(void *owner, struct bpf_map *map,
+			 struct bpf_local_storage *local_storage, void *value,
+			 u64 map_flags);
+#endif /* _BPF_LOCAL_STORAGE_H */
diff --git a/include/net/sock.h b/include/net/sock.h
index 84c813dd0152..5eada8a5eb21 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -245,7 +245,7 @@ struct sock_common {
 	/* public: */
 };
 
-struct bpf_sk_storage;
+struct bpf_local_storage;
 
 /**
   *	struct sock - network layer representation of sockets
@@ -516,7 +516,7 @@ struct sock {
 	void                    (*sk_destruct)(struct sock *sk);
 	struct sock_reuseport __rcu	*sk_reuseport_cb;
 #ifdef CONFIG_BPF_SYSCALL
-	struct bpf_sk_storage __rcu	*sk_bpf_storage;
+	struct bpf_local_storage __rcu	*sk_bpf_storage;
 #endif
 	struct rcu_head		sk_rcu;
 };
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 548a749aebb3..1f3e831c4813 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2802,10 +2802,10 @@ union bpf_attr {
  *		"type". The bpf-local-storage "type" (i.e. the *map*) is
  *		searched against all bpf-local-storages residing at *sk*.
  *
- *		An optional *flags* (**BPF_SK_STORAGE_GET_F_CREATE**) can be
+ *		An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
  *		used such that a new bpf-local-storage will be
  *		created if one does not exist.  *value* can be used
- *		together with **BPF_SK_STORAGE_GET_F_CREATE** to specify
+ *		together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
  *		the initial value of a bpf-local-storage.  If *value* is
  *		**NULL**, the new bpf-local-storage will be zero initialized.
  *	Return
@@ -3572,9 +3572,13 @@ enum {
 	BPF_F_SYSCTL_BASE_NAME		= (1ULL << 0),
 };
 
-/* BPF_FUNC_sk_storage_get flags */
+/* BPF_FUNC_<local>_storage_get flags */
 enum {
-	BPF_SK_STORAGE_GET_F_CREATE	= (1ULL << 0),
+	BPF_LOCAL_STORAGE_GET_F_CREATE	= (1ULL << 0),
+	/* BPF_SK_STORAGE_GET_F_CREATE is only kept for backward compatibility
+	 * and BPF_LOCAL_STORAGE_GET_F_CREATE must be used instead.
+	 */
+	BPF_SK_STORAGE_GET_F_CREATE  = BPF_LOCAL_STORAGE_GET_F_CREATE,
 };
 
 /* BPF_FUNC_read_branch_records flags. */
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 1131a921e1a6..0acb8f8a6042 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_BPF_JIT) += dispatcher.o
 ifeq ($(CONFIG_NET),y)
 obj-$(CONFIG_BPF_SYSCALL) += devmap.o
 obj-$(CONFIG_BPF_SYSCALL) += cpumap.o
+obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o
 obj-$(CONFIG_BPF_SYSCALL) += offload.o
 obj-$(CONFIG_BPF_SYSCALL) += net_namespace.o
 endif
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
new file mode 100644
index 000000000000..c818eb6f8261
--- /dev/null
+++ b/kernel/bpf/bpf_local_storage.c
@@ -0,0 +1,517 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019 Facebook
+ * Copyright 2020 Google LLC.
+ */
+
+#include <linux/rculist.h>
+#include <linux/list.h>
+#include <linux/hash.h>
+#include <linux/types.h>
+#include <linux/spinlock.h>
+#include <linux/bpf.h>
+#include <linux/bpf_local_storage.h>
+#include <net/sock.h>
+#include <uapi/linux/sock_diag.h>
+#include <uapi/linux/btf.h>
+
+#define SELEM(_SDATA)                                                          \
+	container_of((_SDATA), struct bpf_local_storage_elem, sdata)
+#define SDATA(_SELEM) (&(_SELEM)->sdata)
+
+static struct bucket *select_bucket(struct bpf_local_storage_map *smap,
+				    struct bpf_local_storage_elem *selem)
+{
+	return &smap->buckets[hash_ptr(selem, smap->bucket_log)];
+}
+
+static bool selem_linked_to_node(const struct bpf_local_storage_elem *selem)
+{
+	return !hlist_unhashed(&selem->snode);
+}
+
+static bool selem_linked_to_map(const struct bpf_local_storage_elem *selem)
+{
+	return !hlist_unhashed(&selem->map_node);
+}
+
+struct bpf_local_storage_elem *
+bpf_selem_alloc(struct bpf_local_storage_map *smap, void *value)
+{
+	struct bpf_local_storage_elem *selem;
+
+	selem = kzalloc(smap->elem_size, GFP_ATOMIC | __GFP_NOWARN);
+	if (selem) {
+		if (value)
+			memcpy(SDATA(selem)->data, value, smap->map.value_size);
+		return selem;
+	}
+
+	return NULL;
+}
+
+/* local_storage->lock must be held and selem->local_storage == local_storage.
+ * The caller must ensure selem->smap is still valid to be
+ * dereferenced for its smap->elem_size and smap->cache_idx.
+ *
+ * uncharge_omem is only relevant for BPF_MAP_TYPE_SK_STORAGE.
+ */
+bool bpf_selem_unlink(struct bpf_local_storage *local_storage,
+		      struct bpf_local_storage_elem *selem, bool uncharge_omem)
+{
+	struct bpf_local_storage_map *smap;
+	bool free_local_storage;
+
+	smap = rcu_dereference(SDATA(selem)->smap);
+	free_local_storage = hlist_is_singular_node(&selem->snode,
+						    &local_storage->list);
+
+	/* local_storage is not freed now.  local_storage->lock is
+	 * still held and raw_spin_unlock_bh(&local_storage->lock)
+	 * will be done by the caller.
+	 * Although the unlock will be done under
+	 * rcu_read_lock(), it is more intuitive to
+	 * read if kfree_rcu(local_storage, rcu) is done
+	 * after the raw_spin_unlock_bh(&local_storage->lock).
+	 *
+	 * Hence, a "bool free_local_storage" is returned
+	 * to the caller which then calls the kfree_rcu()
+	 * after unlock.
+	 */
+	if (free_local_storage)
+		smap->map.ops->map_local_storage_unlink(local_storage,
+							uncharge_omem);
+
+	hlist_del_init_rcu(&selem->snode);
+	if (rcu_access_pointer(local_storage->cache[smap->cache_idx]) ==
+	    SDATA(selem))
+		RCU_INIT_POINTER(local_storage->cache[smap->cache_idx], NULL);
+
+	kfree_rcu(selem, rcu);
+
+	return free_local_storage;
+}
+
+void bpf_selem_link(struct bpf_local_storage *local_storage,
+		    struct bpf_local_storage_elem *selem)
+{
+	RCU_INIT_POINTER(selem->local_storage, local_storage);
+	hlist_add_head(&selem->snode, &local_storage->list);
+}
+
+void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
+{
+	struct bpf_local_storage_map *smap;
+	struct bucket *b;
+
+	if (unlikely(!selem_linked_to_map(selem)))
+		/* selem has already been unlinked from smap */
+		return;
+
+	smap = rcu_dereference(SDATA(selem)->smap);
+	b = select_bucket(smap, selem);
+	raw_spin_lock_bh(&b->lock);
+	if (likely(selem_linked_to_map(selem)))
+		hlist_del_init_rcu(&selem->map_node);
+	raw_spin_unlock_bh(&b->lock);
+}
+
+void bpf_selem_link_map(struct bpf_local_storage_map *smap,
+			struct bpf_local_storage_elem *selem)
+{
+	struct bucket *b = select_bucket(smap, selem);
+
+	raw_spin_lock_bh(&b->lock);
+	RCU_INIT_POINTER(SDATA(selem)->smap, smap);
+	hlist_add_head_rcu(&selem->map_node, &b->list);
+	raw_spin_unlock_bh(&b->lock);
+}
+
+void bpf_selem_unlink_map_elem(struct bpf_local_storage_elem *selem)
+{
+	struct bpf_local_storage *local_storage;
+	bool free_local_storage = false;
+
+	/* Always unlink from map before unlinking from local_storage
+	 * because selem will be freed after it is successfully unlinked
+	 * from the local_storage.
+	 */
+	bpf_selem_unlink_map(selem);
+
+	if (unlikely(!selem_linked_to_node(selem)))
+		/* selem has already been unlinked from its owner */
+		return;
+
+	local_storage = rcu_dereference(selem->local_storage);
+	raw_spin_lock_bh(&local_storage->lock);
+	if (likely(selem_linked_to_node(selem)))
+		free_local_storage =
+			bpf_selem_unlink(local_storage, selem, true);
+	raw_spin_unlock_bh(&local_storage->lock);
+
+	if (free_local_storage)
+		kfree_rcu(local_storage, rcu);
+}
+
+struct bpf_local_storage_data *
+bpf_local_storage_lookup(struct bpf_local_storage *local_storage,
+			 struct bpf_local_storage_map *smap,
+			 bool cacheit_lockit)
+{
+	struct bpf_local_storage_data *sdata;
+	struct bpf_local_storage_elem *selem;
+
+	/* Fast path (cache hit) */
+	sdata = rcu_dereference(local_storage->cache[smap->cache_idx]);
+	if (sdata && rcu_access_pointer(sdata->smap) == smap)
+		return sdata;
+
+	/* Slow path (cache miss) */
+	hlist_for_each_entry_rcu(selem, &local_storage->list, snode)
+		if (rcu_access_pointer(SDATA(selem)->smap) == smap)
+			break;
+
+	if (!selem)
+		return NULL;
+
+	sdata = SDATA(selem);
+	if (cacheit_lockit) {
+		/* spinlock is needed to avoid racing with the
+		 * parallel delete.  Otherwise, publishing an already
+		 * deleted sdata to the cache will become a use-after-free
+		 * problem in the next bpf_local_storage_lookup().
+		 */
+		raw_spin_lock_bh(&local_storage->lock);
+		if (selem_linked_to_node(selem))
+			rcu_assign_pointer(
+				local_storage->cache[smap->cache_idx], sdata);
+		raw_spin_unlock_bh(&local_storage->lock);
+	}
+
+	return sdata;
+}
+
+static int check_flags(const struct bpf_local_storage_data *old_sdata,
+		       u64 map_flags)
+{
+	if (old_sdata && (map_flags & ~BPF_F_LOCK) == BPF_NOEXIST)
+		/* elem already exists */
+		return -EEXIST;
+
+	if (!old_sdata && (map_flags & ~BPF_F_LOCK) == BPF_EXIST)
+		/* elem doesn't exist, cannot update it */
+		return -ENOENT;
+
+	return 0;
+}
+
+struct bpf_local_storage *
+bpf_local_storage_alloc(struct bpf_local_storage_map *smap)
+{
+	struct bpf_local_storage *storage;
+
+	storage = kzalloc(sizeof(*storage), GFP_ATOMIC | __GFP_NOWARN);
+	if (!storage)
+		return NULL;
+
+	INIT_HLIST_HEAD(&storage->list);
+	raw_spin_lock_init(&storage->lock);
+	return storage;
+}
+
+/* Publish local_storage to the address.  This is used because we are already
+ * in a region where we cannot grab a lock on the object owning the storage
+ * (e.g. sk->sk_lock). Hence, an atomic op is used.
+ *
+ * From now on, the addr pointer is protected
+ * by the local_storage->lock.  Hence, upon freeing,
+ * the local_storage->lock must be held before unlinking the storage from the
+ * owner.
+ */
+int bpf_local_storage_publish(struct bpf_local_storage_elem *first_selem,
+			      struct bpf_local_storage **addr,
+			      struct bpf_local_storage *curr)
+{
+	struct bpf_local_storage *prev;
+
+	prev = cmpxchg(addr, NULL, curr);
+	if (unlikely(prev)) {
+		/* Note that even if first_selem was linked to smap's
+		 * bucket->list, first_selem can be freed immediately
+		 * (instead of kfree_rcu) because
+		 * bpf_local_storage_map_free() does a
+		 * synchronize_rcu() before walking the bucket->list.
+		 * Hence, no one is accessing selem from the
+		 * bucket->list under rcu_read_lock().
+		 */
+		bpf_selem_unlink_map(first_selem);
+		return -EAGAIN;
+	}
+
+	return 0;
+}
+
+int bpf_local_storage_check_update_flags(struct bpf_map *map, u64 map_flags)
+{
+	/* BPF_EXIST and BPF_NOEXIST cannot be both set */
+	if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST) ||
+	    /* BPF_F_LOCK can only be used in a value with spin_lock */
+	    unlikely((map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
+		return -EINVAL;
+
+	return 0;
+}
+
+struct bpf_local_storage_data *
+bpf_local_storage_update(void *owner, struct bpf_map *map,
+			 struct bpf_local_storage *local_storage, void *value,
+			 u64 map_flags)
+{
+	struct bpf_local_storage_data *old_sdata = NULL;
+	struct bpf_local_storage_elem *selem;
+	struct bpf_local_storage_map *smap;
+	int err;
+
+	smap = (struct bpf_local_storage_map *)map;
+
+	if ((map_flags & BPF_F_LOCK) && !(map_flags & BPF_NOEXIST)) {
+		/* Hoping to find an old_sdata to do inline update
+		 * such that it can avoid taking the local_storage->lock
+		 * and changing the lists.
+		 */
+		old_sdata = bpf_local_storage_lookup(local_storage, smap, false);
+		err = check_flags(old_sdata, map_flags);
+		if (err)
+			return ERR_PTR(err);
+
+		if (old_sdata && selem_linked_to_node(SELEM(old_sdata))) {
+			copy_map_value_locked(map, old_sdata->data,
+					      value, false);
+			return old_sdata;
+		}
+	}
+
+	raw_spin_lock_bh(&local_storage->lock);
+
+	/* Recheck local_storage->list under local_storage->lock */
+	if (unlikely(hlist_empty(&local_storage->list))) {
+		/* A parallel del is happening and local_storage is going
+		 * away.  It has just been checked before, so very
+		 * unlikely.  Return instead of retry to keep things
+		 * simple.
+		 */
+		err = -EAGAIN;
+		goto unlock_err;
+	}
+
+	old_sdata = bpf_local_storage_lookup(local_storage, smap, false);
+	err = check_flags(old_sdata, map_flags);
+	if (err)
+		goto unlock_err;
+
+	if (old_sdata && (map_flags & BPF_F_LOCK)) {
+		copy_map_value_locked(map, old_sdata->data, value, false);
+		selem = SELEM(old_sdata);
+		goto unlock;
+	}
+
+	/* local_storage->lock is held.  Hence, we are sure
+	 * we can unlink and uncharge the old_sdata successfully
+	 * later.  Hence, instead of charging the new selem now
+	 * and then uncharging the old selem later (which may cause
+	 * a potential but unnecessary charge failure),  avoid taking
+	 * a charge at all here (the "!old_sdata" check) and the
+	 * old_sdata will not be uncharged later during bpf_selem_unlink().
+	 */
+	selem = map->ops->map_selem_alloc(smap, owner, value, !old_sdata);
+	if (!selem) {
+		err = -ENOMEM;
+		goto unlock_err;
+	}
+
+	/* First, link the new selem to the map */
+	bpf_selem_link_map(smap, selem);
+
+	/* Second, link (and publish) the new selem to local_storage */
+	bpf_selem_link(local_storage, selem);
+
+	/* Third, remove old selem, SELEM(old_sdata) */
+	if (old_sdata) {
+		bpf_selem_unlink_map(SELEM(old_sdata));
+		bpf_selem_unlink(local_storage, SELEM(old_sdata), false);
+	}
+
+unlock:
+	raw_spin_unlock_bh(&local_storage->lock);
+	return SDATA(selem);
+
+unlock_err:
+	raw_spin_unlock_bh(&local_storage->lock);
+	return ERR_PTR(err);
+}
+
+u16 bpf_ls_cache_idx_get(spinlock_t *cache_idx_lock,
+			 u64 *cache_idx_usage_counts)
+{
+	u64 min_usage = U64_MAX;
+	u16 i, res = 0;
+
+	spin_lock(cache_idx_lock);
+
+	for (i = 0; i < BPF_STORAGE_CACHE_SIZE; i++) {
+		if (cache_idx_usage_counts[i] < min_usage) {
+			min_usage = cache_idx_usage_counts[i];
+			res = i;
+
+			/* Found a free cache_idx */
+			if (!min_usage)
+				break;
+		}
+	}
+	cache_idx_usage_counts[res]++;
+
+	spin_unlock(cache_idx_lock);
+
+	return res;
+}
+
+void bpf_ls_cache_idx_free(spinlock_t *cache_idx_lock,
+			   u64 *cache_idx_usage_counts, u16 idx)
+{
+	spin_lock(cache_idx_lock);
+	cache_idx_usage_counts[idx]--;
+	spin_unlock(cache_idx_lock);
+}
+
+void bpf_local_storage_map_free(struct bpf_local_storage_map *smap)
+{
+	struct bpf_local_storage_elem *selem;
+	struct bucket *b;
+	unsigned int i;
+
+	/* Note that this map might be concurrently cloned from
+	 * bpf_sk_storage_clone. Wait for any existing bpf_sk_storage_clone
+	 * RCU read section to finish before proceeding. New RCU
+	 * read sections should be prevented via bpf_map_inc_not_zero.
+	 */
+	synchronize_rcu();
+
+	/* bpf prog and the userspace can no longer access this map
+	 * now.  No new selem (of this map) can be added
+	 * to the bpf_local_storage or to the map bucket's list.
+	 *
+	 * The elem of this map can be cleaned up here
+	 * or by bpf_local_storage_free() during the destruction of the
+	 * owner object, e.g. __sk_destruct().
+	 */
+	for (i = 0; i < (1U << smap->bucket_log); i++) {
+		b = &smap->buckets[i];
+
+		rcu_read_lock();
+		/* No one is adding to b->list now */
+		while ((selem = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(&b->list)),
+						 struct bpf_local_storage_elem,
+						 map_node))) {
+			bpf_selem_unlink_map_elem(selem);
+			cond_resched_rcu();
+		}
+		rcu_read_unlock();
+	}
+
+	/* bpf_local_storage_free() may still need to access the map.
+	 * e.g. bpf_local_storage_free() has unlinked selem from the map
+	 * which then made the above while((selem = ...)) loop
+	 * exit immediately.
+	 *
+	 * However, the bpf_local_storage_free() still needs to access
+	 * the smap->elem_size to do the uncharging in
+	 * bpf_selem_unlink().
+	 *
+	 * Hence, wait another rcu grace period for the
+	 * bpf_local_storage_free() to finish.
+	 */
+	synchronize_rcu();
+
+	kvfree(smap->buckets);
+	kfree(smap);
+}
+
+int bpf_local_storage_map_alloc_check(union bpf_attr *attr)
+{
+	if (attr->map_flags & ~LOCAL_STORAGE_CREATE_FLAG_MASK ||
+	    !(attr->map_flags & BPF_F_NO_PREALLOC) ||
+	    attr->max_entries ||
+	    attr->key_size != sizeof(int) || !attr->value_size ||
+	    /* Enforce BTF for userspace sk dumping */
+	    !attr->btf_key_type_id || !attr->btf_value_type_id)
+		return -EINVAL;
+
+	if (!bpf_capable())
+		return -EPERM;
+
+	if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE)
+		return -E2BIG;
+
+	return 0;
+}
+
+struct bpf_local_storage_map *bpf_local_storage_map_alloc(union bpf_attr *attr)
+{
+	struct bpf_local_storage_map *smap;
+	unsigned int i;
+	u32 nbuckets;
+	u64 cost;
+	int ret;
+
+	smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN);
+	if (!smap)
+		return ERR_PTR(-ENOMEM);
+	bpf_map_init_from_attr(&smap->map, attr);
+
+	nbuckets = roundup_pow_of_two(num_possible_cpus());
+	/* Use at least 2 buckets, select_bucket() is undefined behavior with 1 bucket */
+	nbuckets = max_t(u32, 2, nbuckets);
+	smap->bucket_log = ilog2(nbuckets);
+	cost = sizeof(*smap->buckets) * nbuckets + sizeof(*smap);
+
+	ret = bpf_map_charge_init(&smap->map.memory, cost);
+	if (ret < 0) {
+		kfree(smap);
+		return ERR_PTR(ret);
+	}
+
+	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
+				 GFP_USER | __GFP_NOWARN);
+	if (!smap->buckets) {
+		bpf_map_charge_finish(&smap->map.memory);
+		kfree(smap);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	for (i = 0; i < nbuckets; i++) {
+		INIT_HLIST_HEAD(&smap->buckets[i].list);
+		raw_spin_lock_init(&smap->buckets[i].lock);
+	}
+
+	smap->elem_size =
+		sizeof(struct bpf_local_storage_elem) + attr->value_size;
+
+	return smap;
+}
+
+int bpf_local_storage_map_check_btf(const struct bpf_map *map,
+				    const struct btf *btf,
+				    const struct btf_type *key_type,
+				    const struct btf_type *value_type)
+{
+	u32 int_data;
+
+	if (BTF_INFO_KIND(key_type->info) != BTF_KIND_INT)
+		return -EINVAL;
+
+	int_data = *(u32 *)(key_type + 1);
+	if (BTF_INT_BITS(int_data) != 32 || BTF_INT_OFFSET(int_data))
+		return -EINVAL;
+
+	return 0;
+}
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 6f921c4ddc2c..a2b00a09d843 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -1,103 +1,20 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook  */
+#include "linux/bpf.h"
+#include "asm-generic/bug.h"
+#include "linux/err.h"
 #include <linux/rculist.h>
 #include <linux/list.h>
 #include <linux/hash.h>
 #include <linux/types.h>
 #include <linux/spinlock.h>
 #include <linux/bpf.h>
+#include <linux/bpf_local_storage.h>
 #include <net/bpf_sk_storage.h>
 #include <net/sock.h>
 #include <uapi/linux/sock_diag.h>
 #include <uapi/linux/btf.h>
 
-#define SK_STORAGE_CREATE_FLAG_MASK					\
-	(BPF_F_NO_PREALLOC | BPF_F_CLONE)
-
-struct bucket {
-	struct hlist_head list;
-	raw_spinlock_t lock;
-};
-
-/* Thp map is not the primary owner of a bpf_sk_storage_elem.
- * Instead, the sk->sk_bpf_storage is.
- *
- * The map (bpf_sk_storage_map) is for two purposes
- * 1. Define the size of the "sk local storage".  It is
- *    the map's value_size.
- *
- * 2. Maintain a list to keep track of all elems such
- *    that they can be cleaned up during the map destruction.
- *
- * When a bpf local storage is being looked up for a
- * particular sk,  the "bpf_map" pointer is actually used
- * as the "key" to search in the list of elem in
- * sk->sk_bpf_storage.
- *
- * Hence, consider sk->sk_bpf_storage is the mini-map
- * with the "bpf_map" pointer as the searching key.
- */
-struct bpf_sk_storage_map {
-	struct bpf_map map;
-	/* Lookup elem does not require accessing the map.
-	 *
-	 * Updating/Deleting requires a bucket lock to
-	 * link/unlink the elem from the map.  Having
-	 * multiple buckets to improve contention.
-	 */
-	struct bucket *buckets;
-	u32 bucket_log;
-	u16 elem_size;
-	u16 cache_idx;
-};
-
-struct bpf_sk_storage_data {
-	/* smap is used as the searching key when looking up
-	 * from sk->sk_bpf_storage.
-	 *
-	 * Put it in the same cacheline as the data to minimize
-	 * the number of cachelines access during the cache hit case.
-	 */
-	struct bpf_sk_storage_map __rcu *smap;
-	u8 data[] __aligned(8);
-};
-
-/* Linked to bpf_sk_storage and bpf_sk_storage_map */
-struct bpf_sk_storage_elem {
-	struct hlist_node map_node;	/* Linked to bpf_sk_storage_map */
-	struct hlist_node snode;	/* Linked to bpf_sk_storage */
-	struct bpf_sk_storage __rcu *sk_storage;
-	struct rcu_head rcu;
-	/* 8 bytes hole */
-	/* The data is stored in aother cacheline to minimize
-	 * the number of cachelines access during a cache hit.
-	 */
-	struct bpf_sk_storage_data sdata ____cacheline_aligned;
-};
-
-#define SELEM(_SDATA) container_of((_SDATA), struct bpf_sk_storage_elem, sdata)
-#define SDATA(_SELEM) (&(_SELEM)->sdata)
-#define BPF_SK_STORAGE_CACHE_SIZE	16
-
-static DEFINE_SPINLOCK(cache_idx_lock);
-static u64 cache_idx_usage_counts[BPF_SK_STORAGE_CACHE_SIZE];
-
-struct bpf_sk_storage {
-	struct bpf_sk_storage_data __rcu *cache[BPF_SK_STORAGE_CACHE_SIZE];
-	struct hlist_head list;	/* List of bpf_sk_storage_elem */
-	struct sock *sk;	/* The sk that owns the the above "list" of
-				 * bpf_sk_storage_elem.
-				 */
-	struct rcu_head rcu;
-	raw_spinlock_t lock;	/* Protect adding/removing from the "list" */
-};
-
-static struct bucket *select_bucket(struct bpf_sk_storage_map *smap,
-				    struct bpf_sk_storage_elem *selem)
-{
-	return &smap->buckets[hash_ptr(selem, smap->bucket_log)];
-}
-
 static int omem_charge(struct sock *sk, unsigned int size)
 {
 	/* same check as in sock_kmalloc() */
@@ -110,31 +27,19 @@ static int omem_charge(struct sock *sk, unsigned int size)
 	return -ENOMEM;
 }
 
-static bool selem_linked_to_sk(const struct bpf_sk_storage_elem *selem)
-{
-	return !hlist_unhashed(&selem->snode);
-}
-
-static bool selem_linked_to_map(const struct bpf_sk_storage_elem *selem)
+static struct bpf_local_storage_elem *
+sk_selem_alloc(struct bpf_local_storage_map *smap, void *owner, void *value,
+	       bool charge_omem)
 {
-	return !hlist_unhashed(&selem->map_node);
-}
-
-static struct bpf_sk_storage_elem *selem_alloc(struct bpf_sk_storage_map *smap,
-					       struct sock *sk, void *value,
-					       bool charge_omem)
-{
-	struct bpf_sk_storage_elem *selem;
+	struct bpf_local_storage_elem *selem;
+	struct sock *sk = owner;
 
 	if (charge_omem && omem_charge(sk, smap->elem_size))
 		return NULL;
 
-	selem = kzalloc(smap->elem_size, GFP_ATOMIC | __GFP_NOWARN);
-	if (selem) {
-		if (value)
-			memcpy(SDATA(selem)->data, value, smap->map.value_size);
+	selem = bpf_selem_alloc(smap, value);
+	if (selem)
 		return selem;
-	}
 
 	if (charge_omem)
 		atomic_sub(smap->elem_size, &sk->sk_omem_alloc);
@@ -142,242 +47,53 @@ static struct bpf_sk_storage_elem *selem_alloc(struct bpf_sk_storage_map *smap,
 	return NULL;
 }
 
-/* sk_storage->lock must be held and selem->sk_storage == sk_storage.
- * The caller must ensure selem->smap is still valid to be
- * dereferenced for its smap->elem_size and smap->cache_idx.
- */
-static bool __selem_unlink_sk(struct bpf_sk_storage *sk_storage,
-			      struct bpf_sk_storage_elem *selem,
+static void unlink_sk_storage(struct bpf_local_storage *local_storage,
 			      bool uncharge_omem)
 {
-	struct bpf_sk_storage_map *smap;
-	bool free_sk_storage;
-	struct sock *sk;
-
-	smap = rcu_dereference(SDATA(selem)->smap);
-	sk = sk_storage->sk;
+	struct sock *sk = local_storage->sk;
 
-	/* All uncharging on sk->sk_omem_alloc must be done first.
-	 * sk may be freed once the last selem is unlinked from sk_storage.
-	 */
 	if (uncharge_omem)
-		atomic_sub(smap->elem_size, &sk->sk_omem_alloc);
-
-	free_sk_storage = hlist_is_singular_node(&selem->snode,
-						 &sk_storage->list);
-	if (free_sk_storage) {
-		atomic_sub(sizeof(struct bpf_sk_storage), &sk->sk_omem_alloc);
-		sk_storage->sk = NULL;
-		/* After this RCU_INIT, sk may be freed and cannot be used */
-		RCU_INIT_POINTER(sk->sk_bpf_storage, NULL);
-
-		/* sk_storage is not freed now.  sk_storage->lock is
-		 * still held and raw_spin_unlock_bh(&sk_storage->lock)
-		 * will be done by the caller.
-		 *
-		 * Although the unlock will be done under
-		 * rcu_read_lock(),  it is more intutivie to
-		 * read if kfree_rcu(sk_storage, rcu) is done
-		 * after the raw_spin_unlock_bh(&sk_storage->lock).
-		 *
-		 * Hence, a "bool free_sk_storage" is returned
-		 * to the caller which then calls the kfree_rcu()
-		 * after unlock.
-		 */
-	}
-	hlist_del_init_rcu(&selem->snode);
-	if (rcu_access_pointer(sk_storage->cache[smap->cache_idx]) ==
-	    SDATA(selem))
-		RCU_INIT_POINTER(sk_storage->cache[smap->cache_idx], NULL);
-
-	kfree_rcu(selem, rcu);
-
-	return free_sk_storage;
-}
-
-static void selem_unlink_sk(struct bpf_sk_storage_elem *selem)
-{
-	struct bpf_sk_storage *sk_storage;
-	bool free_sk_storage = false;
-
-	if (unlikely(!selem_linked_to_sk(selem)))
-		/* selem has already been unlinked from sk */
-		return;
-
-	sk_storage = rcu_dereference(selem->sk_storage);
-	raw_spin_lock_bh(&sk_storage->lock);
-	if (likely(selem_linked_to_sk(selem)))
-		free_sk_storage = __selem_unlink_sk(sk_storage, selem, true);
-	raw_spin_unlock_bh(&sk_storage->lock);
-
-	if (free_sk_storage)
-		kfree_rcu(sk_storage, rcu);
-}
+		atomic_sub(sizeof(struct bpf_local_storage),
+			   &sk->sk_omem_alloc);
 
-static void __selem_link_sk(struct bpf_sk_storage *sk_storage,
-			    struct bpf_sk_storage_elem *selem)
-{
-	RCU_INIT_POINTER(selem->sk_storage, sk_storage);
-	hlist_add_head(&selem->snode, &sk_storage->list);
+	/* After this RCU_INIT, sk may be freed and cannot be used */
+	RCU_INIT_POINTER(sk->sk_bpf_storage, NULL);
+	local_storage->sk = NULL;
 }
 
-static void selem_unlink_map(struct bpf_sk_storage_elem *selem)
+static int sk_storage_alloc(void *owner,
+			    struct bpf_local_storage_map *smap,
+			    struct bpf_local_storage_elem *first_selem)
 {
-	struct bpf_sk_storage_map *smap;
-	struct bucket *b;
-
-	if (unlikely(!selem_linked_to_map(selem)))
-		/* selem has already be unlinked from smap */
-		return;
-
-	smap = rcu_dereference(SDATA(selem)->smap);
-	b = select_bucket(smap, selem);
-	raw_spin_lock_bh(&b->lock);
-	if (likely(selem_linked_to_map(selem)))
-		hlist_del_init_rcu(&selem->map_node);
-	raw_spin_unlock_bh(&b->lock);
-}
-
-static void selem_link_map(struct bpf_sk_storage_map *smap,
-			   struct bpf_sk_storage_elem *selem)
-{
-	struct bucket *b = select_bucket(smap, selem);
-
-	raw_spin_lock_bh(&b->lock);
-	RCU_INIT_POINTER(SDATA(selem)->smap, smap);
-	hlist_add_head_rcu(&selem->map_node, &b->list);
-	raw_spin_unlock_bh(&b->lock);
-}
-
-static void selem_unlink(struct bpf_sk_storage_elem *selem)
-{
-	/* Always unlink from map before unlinking from sk_storage
-	 * because selem will be freed after successfully unlinked from
-	 * the sk_storage.
-	 */
-	selem_unlink_map(selem);
-	selem_unlink_sk(selem);
-}
-
-static struct bpf_sk_storage_data *
-__sk_storage_lookup(struct bpf_sk_storage *sk_storage,
-		    struct bpf_sk_storage_map *smap,
-		    bool cacheit_lockit)
-{
-	struct bpf_sk_storage_data *sdata;
-	struct bpf_sk_storage_elem *selem;
-
-	/* Fast path (cache hit) */
-	sdata = rcu_dereference(sk_storage->cache[smap->cache_idx]);
-	if (sdata && rcu_access_pointer(sdata->smap) == smap)
-		return sdata;
-
-	/* Slow path (cache miss) */
-	hlist_for_each_entry_rcu(selem, &sk_storage->list, snode)
-		if (rcu_access_pointer(SDATA(selem)->smap) == smap)
-			break;
-
-	if (!selem)
-		return NULL;
-
-	sdata = SDATA(selem);
-	if (cacheit_lockit) {
-		/* spinlock is needed to avoid racing with the
-		 * parallel delete.  Otherwise, publishing an already
-		 * deleted sdata to the cache will become a use-after-free
-		 * problem in the next __sk_storage_lookup().
-		 */
-		raw_spin_lock_bh(&sk_storage->lock);
-		if (selem_linked_to_sk(selem))
-			rcu_assign_pointer(sk_storage->cache[smap->cache_idx],
-					   sdata);
-		raw_spin_unlock_bh(&sk_storage->lock);
-	}
-
-	return sdata;
-}
-
-static struct bpf_sk_storage_data *
-sk_storage_lookup(struct sock *sk, struct bpf_map *map, bool cacheit_lockit)
-{
-	struct bpf_sk_storage *sk_storage;
-	struct bpf_sk_storage_map *smap;
-
-	sk_storage = rcu_dereference(sk->sk_bpf_storage);
-	if (!sk_storage)
-		return NULL;
-
-	smap = (struct bpf_sk_storage_map *)map;
-	return __sk_storage_lookup(sk_storage, smap, cacheit_lockit);
-}
-
-static int check_flags(const struct bpf_sk_storage_data *old_sdata,
-		       u64 map_flags)
-{
-	if (old_sdata && (map_flags & ~BPF_F_LOCK) == BPF_NOEXIST)
-		/* elem already exists */
-		return -EEXIST;
-
-	if (!old_sdata && (map_flags & ~BPF_F_LOCK) == BPF_EXIST)
-		/* elem doesn't exist, cannot update it */
-		return -ENOENT;
-
-	return 0;
-}
-
-static int sk_storage_alloc(struct sock *sk,
-			    struct bpf_sk_storage_map *smap,
-			    struct bpf_sk_storage_elem *first_selem)
-{
-	struct bpf_sk_storage *prev_sk_storage, *sk_storage;
+	struct bpf_local_storage *curr;
+	struct sock *sk = owner;
 	int err;
 
-	err = omem_charge(sk, sizeof(*sk_storage));
+	err = omem_charge(sk, sizeof(*curr));
 	if (err)
 		return err;
 
-	sk_storage = kzalloc(sizeof(*sk_storage), GFP_ATOMIC | __GFP_NOWARN);
-	if (!sk_storage) {
+	curr = bpf_local_storage_alloc(smap);
+	if (!curr) {
 		err = -ENOMEM;
 		goto uncharge;
 	}
-	INIT_HLIST_HEAD(&sk_storage->list);
-	raw_spin_lock_init(&sk_storage->lock);
-	sk_storage->sk = sk;
-
-	__selem_link_sk(sk_storage, first_selem);
-	selem_link_map(smap, first_selem);
-	/* Publish sk_storage to sk.  sk->sk_lock cannot be acquired.
-	 * Hence, atomic ops is used to set sk->sk_bpf_storage
-	 * from NULL to the newly allocated sk_storage ptr.
-	 *
-	 * From now on, the sk->sk_bpf_storage pointer is protected
-	 * by the sk_storage->lock.  Hence,  when freeing
-	 * the sk->sk_bpf_storage, the sk_storage->lock must
-	 * be held before setting sk->sk_bpf_storage to NULL.
-	 */
-	prev_sk_storage = cmpxchg((struct bpf_sk_storage **)&sk->sk_bpf_storage,
-				  NULL, sk_storage);
-	if (unlikely(prev_sk_storage)) {
-		selem_unlink_map(first_selem);
-		err = -EAGAIN;
-		goto uncharge;
 
-		/* Note that even first_selem was linked to smap's
-		 * bucket->list, first_selem can be freed immediately
-		 * (instead of kfree_rcu) because
-		 * bpf_sk_storage_map_free() does a
-		 * synchronize_rcu() before walking the bucket->list.
-		 * Hence, no one is accessing selem from the
-		 * bucket->list under rcu_read_lock().
-		 */
-	}
+	curr->sk = sk;
+
+	bpf_selem_link(curr, first_selem);
+	bpf_selem_link_map(smap, first_selem);
+
+	err = bpf_local_storage_publish(first_selem,
+		(struct bpf_local_storage **)&sk->sk_bpf_storage, curr);
+	if (err)
+		goto uncharge;
 
 	return 0;
 
 uncharge:
-	kfree(sk_storage);
-	atomic_sub(sizeof(*sk_storage), &sk->sk_omem_alloc);
+	kfree(curr);
+	atomic_sub(sizeof(*curr), &sk->sk_omem_alloc);
 	return err;
 }
 
@@ -386,36 +102,31 @@ static int sk_storage_alloc(struct sock *sk,
  * Otherwise, it will become a leak (and other memory issues
  * during map destruction).
  */
-static struct bpf_sk_storage_data *sk_storage_update(struct sock *sk,
-						     struct bpf_map *map,
-						     void *value,
-						     u64 map_flags)
+static struct bpf_local_storage_data *
+sk_storage_update(void *owner, struct bpf_map *map, void *value, u64 map_flags)
 {
-	struct bpf_sk_storage_data *old_sdata = NULL;
-	struct bpf_sk_storage_elem *selem;
-	struct bpf_sk_storage *sk_storage;
-	struct bpf_sk_storage_map *smap;
+	struct bpf_local_storage_data *old_sdata = NULL;
+	struct bpf_local_storage_elem *selem;
+	struct bpf_local_storage *local_storage;
+	struct bpf_local_storage_map *smap;
+	struct sock *sk;
 	int err;
 
-	/* BPF_EXIST and BPF_NOEXIST cannot be both set */
-	if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST) ||
-	    /* BPF_F_LOCK can only be used in a value with spin_lock */
-	    unlikely((map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
-		return ERR_PTR(-EINVAL);
+	err = bpf_local_storage_check_update_flags(map, map_flags);
+	if (err)
+		return ERR_PTR(err);
 
-	smap = (struct bpf_sk_storage_map *)map;
-	sk_storage = rcu_dereference(sk->sk_bpf_storage);
-	if (!sk_storage || hlist_empty(&sk_storage->list)) {
-		/* Very first elem for this sk */
-		err = check_flags(NULL, map_flags);
-		if (err)
-			return ERR_PTR(err);
+	sk = owner;
+	local_storage = rcu_dereference(sk->sk_bpf_storage);
+	smap = (struct bpf_local_storage_map *)map;
 
-		selem = selem_alloc(smap, sk, value, true);
+	if (!local_storage || hlist_empty(&local_storage->list)) {
+		/* Very first elem */
+		selem = map->ops->map_selem_alloc(smap, owner, value, !old_sdata);
 		if (!selem)
 			return ERR_PTR(-ENOMEM);
 
-		err = sk_storage_alloc(sk, smap, selem);
+		err = map->ops->map_local_storage_alloc(owner, smap, selem);
 		if (err) {
 			kfree(selem);
 			atomic_sub(smap->elem_size, &sk->sk_omem_alloc);
@@ -425,130 +136,42 @@ static struct bpf_sk_storage_data *sk_storage_update(struct sock *sk,
 		return SDATA(selem);
 	}
 
-	if ((map_flags & BPF_F_LOCK) && !(map_flags & BPF_NOEXIST)) {
-		/* Hoping to find an old_sdata to do inline update
-		 * such that it can avoid taking the sk_storage->lock
-		 * and changing the lists.
-		 */
-		old_sdata = __sk_storage_lookup(sk_storage, smap, false);
-		err = check_flags(old_sdata, map_flags);
-		if (err)
-			return ERR_PTR(err);
-		if (old_sdata && selem_linked_to_sk(SELEM(old_sdata))) {
-			copy_map_value_locked(map, old_sdata->data,
-					      value, false);
-			return old_sdata;
-		}
-	}
-
-	raw_spin_lock_bh(&sk_storage->lock);
-
-	/* Recheck sk_storage->list under sk_storage->lock */
-	if (unlikely(hlist_empty(&sk_storage->list))) {
-		/* A parallel del is happening and sk_storage is going
-		 * away.  It has just been checked before, so very
-		 * unlikely.  Return instead of retry to keep things
-		 * simple.
-		 */
-		err = -EAGAIN;
-		goto unlock_err;
-	}
-
-	old_sdata = __sk_storage_lookup(sk_storage, smap, false);
-	err = check_flags(old_sdata, map_flags);
-	if (err)
-		goto unlock_err;
-
-	if (old_sdata && (map_flags & BPF_F_LOCK)) {
-		copy_map_value_locked(map, old_sdata->data, value, false);
-		selem = SELEM(old_sdata);
-		goto unlock;
-	}
-
-	/* sk_storage->lock is held.  Hence, we are sure
-	 * we can unlink and uncharge the old_sdata successfully
-	 * later.  Hence, instead of charging the new selem now
-	 * and then uncharge the old selem later (which may cause
-	 * a potential but unnecessary charge failure),  avoid taking
-	 * a charge at all here (the "!old_sdata" check) and the
-	 * old_sdata will not be uncharged later during __selem_unlink_sk().
-	 */
-	selem = selem_alloc(smap, sk, value, !old_sdata);
-	if (!selem) {
-		err = -ENOMEM;
-		goto unlock_err;
-	}
-
-	/* First, link the new selem to the map */
-	selem_link_map(smap, selem);
-
-	/* Second, link (and publish) the new selem to sk_storage */
-	__selem_link_sk(sk_storage, selem);
+	return bpf_local_storage_update(owner, map, local_storage, value,
+					map_flags);
+}
 
-	/* Third, remove old selem, SELEM(old_sdata) */
-	if (old_sdata) {
-		selem_unlink_map(SELEM(old_sdata));
-		__selem_unlink_sk(sk_storage, SELEM(old_sdata), false);
-	}
+static struct bpf_local_storage_data *
+sk_storage_lookup(struct sock *sk, struct bpf_map *map, bool cacheit_lockit)
+{
+	struct bpf_local_storage *sk_storage;
+	struct bpf_local_storage_map *smap;
 
-unlock:
-	raw_spin_unlock_bh(&sk_storage->lock);
-	return SDATA(selem);
+	sk_storage = rcu_dereference(sk->sk_bpf_storage);
+	if (!sk_storage)
+		return NULL;
 
-unlock_err:
-	raw_spin_unlock_bh(&sk_storage->lock);
-	return ERR_PTR(err);
+	smap = (struct bpf_local_storage_map *)map;
+	return bpf_local_storage_lookup(sk_storage, smap, cacheit_lockit);
 }
 
 static int sk_storage_delete(struct sock *sk, struct bpf_map *map)
 {
-	struct bpf_sk_storage_data *sdata;
+	struct bpf_local_storage_data *sdata;
 
 	sdata = sk_storage_lookup(sk, map, false);
 	if (!sdata)
 		return -ENOENT;
 
-	selem_unlink(SELEM(sdata));
+	bpf_selem_unlink_map_elem(SELEM(sdata));
 
 	return 0;
 }
 
-static u16 cache_idx_get(void)
-{
-	u64 min_usage = U64_MAX;
-	u16 i, res = 0;
-
-	spin_lock(&cache_idx_lock);
-
-	for (i = 0; i < BPF_SK_STORAGE_CACHE_SIZE; i++) {
-		if (cache_idx_usage_counts[i] < min_usage) {
-			min_usage = cache_idx_usage_counts[i];
-			res = i;
-
-			/* Found a free cache_idx */
-			if (!min_usage)
-				break;
-		}
-	}
-	cache_idx_usage_counts[res]++;
-
-	spin_unlock(&cache_idx_lock);
-
-	return res;
-}
-
-static void cache_idx_free(u16 idx)
-{
-	spin_lock(&cache_idx_lock);
-	cache_idx_usage_counts[idx]--;
-	spin_unlock(&cache_idx_lock);
-}
-
 /* Called by __sk_destruct() & bpf_sk_storage_clone() */
 void bpf_sk_storage_free(struct sock *sk)
 {
-	struct bpf_sk_storage_elem *selem;
-	struct bpf_sk_storage *sk_storage;
+	struct bpf_local_storage_elem *selem;
+	struct bpf_local_storage *sk_storage;
 	bool free_sk_storage = false;
 	struct hlist_node *n;
 
@@ -562,9 +185,9 @@ void bpf_sk_storage_free(struct sock *sk)
 	/* Netiher the bpf_prog nor the bpf-map's syscall
 	 * could be modifying the sk_storage->list now.
 	 * Thus, no elem can be added-to or deleted-from the
-	 * sk_storage->list by the bpf_prog or by the bpf-map's syscall.
+	 * sk_storage->list, by either the bpf_prog or the bpf-map's syscall.
 	 *
-	 * It is racing with bpf_sk_storage_map_free() alone
+	 * It is racing with bpf_local_storage_map_free() alone
 	 * when unlinking elem from the sk_storage->list and
 	 * the map's bucket->list.
 	 */
@@ -573,8 +196,8 @@ void bpf_sk_storage_free(struct sock *sk)
 		/* Always unlink from map before unlinking from
 		 * sk_storage.
 		 */
-		selem_unlink_map(selem);
-		free_sk_storage = __selem_unlink_sk(sk_storage, selem, true);
+		bpf_selem_unlink_map(selem);
+		free_sk_storage = bpf_selem_unlink(sk_storage, selem, true);
 	}
 	raw_spin_unlock_bh(&sk_storage->lock);
 	rcu_read_unlock();
@@ -583,163 +206,11 @@ void bpf_sk_storage_free(struct sock *sk)
 		kfree_rcu(sk_storage, rcu);
 }
 
-static void bpf_sk_storage_map_free(struct bpf_map *map)
+static void *bpf_sk_storage_lookup_elem(struct bpf_map *map, void *key)
 {
-	struct bpf_sk_storage_elem *selem;
-	struct bpf_sk_storage_map *smap;
-	struct bucket *b;
-	unsigned int i;
-
-	smap = (struct bpf_sk_storage_map *)map;
-
-	cache_idx_free(smap->cache_idx);
-
-	/* Note that this map might be concurrently cloned from
-	 * bpf_sk_storage_clone. Wait for any existing bpf_sk_storage_clone
-	 * RCU read section to finish before proceeding. New RCU
-	 * read sections should be prevented via bpf_map_inc_not_zero.
-	 */
-	synchronize_rcu();
-
-	/* bpf prog and the userspace can no longer access this map
-	 * now.  No new selem (of this map) can be added
-	 * to the sk->sk_bpf_storage or to the map bucket's list.
-	 *
-	 * The elem of this map can be cleaned up here
-	 * or
-	 * by bpf_sk_storage_free() during __sk_destruct().
-	 */
-	for (i = 0; i < (1U << smap->bucket_log); i++) {
-		b = &smap->buckets[i];
-
-		rcu_read_lock();
-		/* No one is adding to b->list now */
-		while ((selem = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(&b->list)),
-						 struct bpf_sk_storage_elem,
-						 map_node))) {
-			selem_unlink(selem);
-			cond_resched_rcu();
-		}
-		rcu_read_unlock();
-	}
-
-	/* bpf_sk_storage_free() may still need to access the map.
-	 * e.g. bpf_sk_storage_free() has unlinked selem from the map
-	 * which then made the above while((selem = ...)) loop
-	 * exited immediately.
-	 *
-	 * However, the bpf_sk_storage_free() still needs to access
-	 * the smap->elem_size to do the uncharging in
-	 * __selem_unlink_sk().
-	 *
-	 * Hence, wait another rcu grace period for the
-	 * bpf_sk_storage_free() to finish.
-	 */
-	synchronize_rcu();
-
-	kvfree(smap->buckets);
-	kfree(map);
-}
-
-/* U16_MAX is much more than enough for sk local storage
- * considering a tcp_sock is ~2k.
- */
-#define MAX_VALUE_SIZE							\
-	min_t(u32,							\
-	      (KMALLOC_MAX_SIZE - MAX_BPF_STACK - sizeof(struct bpf_sk_storage_elem)), \
-	      (U16_MAX - sizeof(struct bpf_sk_storage_elem)))
-
-static int bpf_sk_storage_map_alloc_check(union bpf_attr *attr)
-{
-	if (attr->map_flags & ~SK_STORAGE_CREATE_FLAG_MASK ||
-	    !(attr->map_flags & BPF_F_NO_PREALLOC) ||
-	    attr->max_entries ||
-	    attr->key_size != sizeof(int) || !attr->value_size ||
-	    /* Enforce BTF for userspace sk dumping */
-	    !attr->btf_key_type_id || !attr->btf_value_type_id)
-		return -EINVAL;
-
-	if (!bpf_capable())
-		return -EPERM;
-
-	if (attr->value_size > MAX_VALUE_SIZE)
-		return -E2BIG;
-
-	return 0;
-}
-
-static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
-{
-	struct bpf_sk_storage_map *smap;
-	unsigned int i;
-	u32 nbuckets;
-	u64 cost;
-	int ret;
-
-	smap = kzalloc(sizeof(*smap), GFP_USER | __GFP_NOWARN);
-	if (!smap)
-		return ERR_PTR(-ENOMEM);
-	bpf_map_init_from_attr(&smap->map, attr);
-
-	nbuckets = roundup_pow_of_two(num_possible_cpus());
-	/* Use at least 2 buckets, select_bucket() is undefined behavior with 1 bucket */
-	nbuckets = max_t(u32, 2, nbuckets);
-	smap->bucket_log = ilog2(nbuckets);
-	cost = sizeof(*smap->buckets) * nbuckets + sizeof(*smap);
-
-	ret = bpf_map_charge_init(&smap->map.memory, cost);
-	if (ret < 0) {
-		kfree(smap);
-		return ERR_PTR(ret);
-	}
-
-	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
-				 GFP_USER | __GFP_NOWARN);
-	if (!smap->buckets) {
-		bpf_map_charge_finish(&smap->map.memory);
-		kfree(smap);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	for (i = 0; i < nbuckets; i++) {
-		INIT_HLIST_HEAD(&smap->buckets[i].list);
-		raw_spin_lock_init(&smap->buckets[i].lock);
-	}
-
-	smap->elem_size = sizeof(struct bpf_sk_storage_elem) + attr->value_size;
-	smap->cache_idx = cache_idx_get();
-
-	return &smap->map;
-}
-
-static int notsupp_get_next_key(struct bpf_map *map, void *key,
-				void *next_key)
-{
-	return -ENOTSUPP;
-}
-
-static int bpf_sk_storage_map_check_btf(const struct bpf_map *map,
-					const struct btf *btf,
-					const struct btf_type *key_type,
-					const struct btf_type *value_type)
-{
-	u32 int_data;
-
-	if (BTF_INFO_KIND(key_type->info) != BTF_KIND_INT)
-		return -EINVAL;
-
-	int_data = *(u32 *)(key_type + 1);
-	if (BTF_INT_BITS(int_data) != 32 || BTF_INT_OFFSET(int_data))
-		return -EINVAL;
-
-	return 0;
-}
-
-static void *bpf_fd_sk_storage_lookup_elem(struct bpf_map *map, void *key)
-{
-	struct bpf_sk_storage_data *sdata;
+	struct bpf_local_storage_data *sdata;
 	struct socket *sock;
-	int fd, err;
+	int fd, err = -EINVAL;
 
 	fd = *(int *)key;
 	sock = sockfd_lookup(fd, &err);
@@ -752,17 +223,18 @@ static void *bpf_fd_sk_storage_lookup_elem(struct bpf_map *map, void *key)
 	return ERR_PTR(err);
 }
 
-static int bpf_fd_sk_storage_update_elem(struct bpf_map *map, void *key,
-					 void *value, u64 map_flags)
+static int bpf_sk_storage_update_elem(struct bpf_map *map, void *key,
+				      void *value, u64 map_flags)
 {
-	struct bpf_sk_storage_data *sdata;
+	struct bpf_local_storage_data *sdata;
 	struct socket *sock;
 	int fd, err;
 
 	fd = *(int *)key;
 	sock = sockfd_lookup(fd, &err);
 	if (sock) {
-		sdata = sk_storage_update(sock->sk, map, value, map_flags);
+		sdata = map->ops->map_local_storage_update(sock->sk, map, value,
+							   map_flags);
 		sockfd_put(sock);
 		return PTR_ERR_OR_ZERO(sdata);
 	}
@@ -770,7 +242,7 @@ static int bpf_fd_sk_storage_update_elem(struct bpf_map *map, void *key,
 	return err;
 }
 
-static int bpf_fd_sk_storage_delete_elem(struct bpf_map *map, void *key)
+static int bpf_sk_storage_delete_elem(struct bpf_map *map, void *key)
 {
 	struct socket *sock;
 	int fd, err;
@@ -780,20 +252,19 @@ static int bpf_fd_sk_storage_delete_elem(struct bpf_map *map, void *key)
 	if (sock) {
 		err = sk_storage_delete(sock->sk, map);
 		sockfd_put(sock);
-		return err;
 	}
 
 	return err;
 }
 
-static struct bpf_sk_storage_elem *
+static struct bpf_local_storage_elem *
 bpf_sk_storage_clone_elem(struct sock *newsk,
-			  struct bpf_sk_storage_map *smap,
-			  struct bpf_sk_storage_elem *selem)
+			  struct bpf_local_storage_map *smap,
+			  struct bpf_local_storage_elem *selem)
 {
-	struct bpf_sk_storage_elem *copy_selem;
+	struct bpf_local_storage_elem *copy_selem;
 
-	copy_selem = selem_alloc(smap, newsk, NULL, true);
+	copy_selem = sk_selem_alloc(smap, newsk, NULL, true);
 	if (!copy_selem)
 		return NULL;
 
@@ -809,9 +280,9 @@ bpf_sk_storage_clone_elem(struct sock *newsk,
 
 int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
 {
-	struct bpf_sk_storage *new_sk_storage = NULL;
-	struct bpf_sk_storage *sk_storage;
-	struct bpf_sk_storage_elem *selem;
+	struct bpf_local_storage *new_sk_storage = NULL;
+	struct bpf_local_storage *sk_storage;
+	struct bpf_local_storage_elem *selem;
 	int ret = 0;
 
 	RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
@@ -823,8 +294,8 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
 		goto out;
 
 	hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
-		struct bpf_sk_storage_elem *copy_selem;
-		struct bpf_sk_storage_map *smap;
+		struct bpf_local_storage_elem *copy_selem;
+		struct bpf_local_storage_map *smap;
 		struct bpf_map *map;
 
 		smap = rcu_dereference(SDATA(selem)->smap);
@@ -832,7 +303,7 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
 			continue;
 
 		/* Note that for lockless listeners adding new element
-		 * here can race with cleanup in bpf_sk_storage_map_free.
+		 * here can race with cleanup in bpf_local_storage_map_free.
 		 * Try to grab map refcnt to make sure that it's still
 		 * alive and prevent concurrent removal.
 		 */
@@ -848,8 +319,8 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
 		}
 
 		if (new_sk_storage) {
-			selem_link_map(smap, copy_selem);
-			__selem_link_sk(new_sk_storage, copy_selem);
+			bpf_selem_link_map(smap, copy_selem);
+			bpf_selem_link(new_sk_storage, copy_selem);
 		} else {
 			ret = sk_storage_alloc(newsk, smap, copy_selem);
 			if (ret) {
@@ -860,7 +331,8 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
 				goto out;
 			}
 
-			new_sk_storage = rcu_dereference(copy_selem->sk_storage);
+			new_sk_storage =
+				rcu_dereference(copy_selem->local_storage);
 		}
 		bpf_map_put(map);
 	}
@@ -869,7 +341,7 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
 	rcu_read_unlock();
 
 	/* In case of an error, don't free anything explicitly here, the
-	 * caller is responsible to call bpf_sk_storage_free.
+	 * caller is responsible for calling bpf_sk_storage_free().
 	 */
 
 	return ret;
@@ -878,7 +350,7 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
 BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
 	   void *, value, u64, flags)
 {
-	struct bpf_sk_storage_data *sdata;
+	struct bpf_local_storage_data *sdata;
 
 	if (flags > BPF_SK_STORAGE_GET_F_CREATE)
 		return (unsigned long)NULL;
@@ -887,7 +359,7 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
 	if (sdata)
 		return (unsigned long)sdata->data;
 
-	if (flags == BPF_SK_STORAGE_GET_F_CREATE &&
+	if (flags == BPF_LOCAL_STORAGE_GET_F_CREATE &&
 	    /* Cannot add new elem to a going away sk.
 	     * Otherwise, the new elem may become a leak
 	     * (and also other memory issues during map
@@ -919,18 +391,51 @@ BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)
 	return -ENOENT;
 }
 
+static int notsupp_get_next_key(struct bpf_map *map, void *key,
+				void *next_key)
+{
+	return -ENOTSUPP;
+}
+
+DEFINE_BPF_STORAGE_CACHE(sk);
+
+static struct bpf_map *sk_storage_map_alloc(union bpf_attr *attr)
+{
+	struct bpf_local_storage_map *smap;
+
+	smap = bpf_local_storage_map_alloc(attr);
+	if (IS_ERR(smap))
+		return ERR_CAST(smap);
+
+	smap->cache_idx = cache_idx_get_sk();
+	return &smap->map;
+}
+
+static void sk_storage_map_free(struct bpf_map *map)
+{
+	struct bpf_local_storage_map *smap;
+
+	smap = (struct bpf_local_storage_map *)map;
+	cache_idx_free_sk(smap->cache_idx);
+	bpf_local_storage_map_free(smap);
+}
+
 static int sk_storage_map_btf_id;
 const struct bpf_map_ops sk_storage_map_ops = {
-	.map_alloc_check = bpf_sk_storage_map_alloc_check,
-	.map_alloc = bpf_sk_storage_map_alloc,
-	.map_free = bpf_sk_storage_map_free,
+	.map_alloc_check = bpf_local_storage_map_alloc_check,
+	.map_alloc = sk_storage_map_alloc,
+	.map_free = sk_storage_map_free,
 	.map_get_next_key = notsupp_get_next_key,
-	.map_lookup_elem = bpf_fd_sk_storage_lookup_elem,
-	.map_update_elem = bpf_fd_sk_storage_update_elem,
-	.map_delete_elem = bpf_fd_sk_storage_delete_elem,
-	.map_check_btf = bpf_sk_storage_map_check_btf,
-	.map_btf_name = "bpf_sk_storage_map",
+	.map_lookup_elem = bpf_sk_storage_lookup_elem,
+	.map_update_elem = bpf_sk_storage_update_elem,
+	.map_delete_elem = bpf_sk_storage_delete_elem,
+	.map_check_btf = bpf_local_storage_map_check_btf,
+	.map_btf_name = "bpf_local_storage_map",
 	.map_btf_id = &sk_storage_map_btf_id,
+	.map_local_storage_alloc = sk_storage_alloc,
+	.map_selem_alloc = sk_selem_alloc,
+	.map_local_storage_update = sk_storage_update,
+	.map_local_storage_unlink = unlink_sk_storage,
 };
 
 const struct bpf_func_proto bpf_sk_storage_get_proto = {
@@ -1011,7 +516,7 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
 	u32 nr_maps = 0;
 	int rem, err;
 
-	/* bpf_sk_storage_map is currently limited to CAP_SYS_ADMIN as
+	/* bpf_local_storage_map is currently limited to CAP_SYS_ADMIN as
 	 * the map_alloc_check() side also does.
 	 */
 	if (!bpf_capable())
@@ -1061,13 +566,13 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
 }
 EXPORT_SYMBOL_GPL(bpf_sk_storage_diag_alloc);
 
-static int diag_get(struct bpf_sk_storage_data *sdata, struct sk_buff *skb)
+static int diag_get(struct bpf_local_storage_data *sdata, struct sk_buff *skb)
 {
 	struct nlattr *nla_stg, *nla_value;
-	struct bpf_sk_storage_map *smap;
+	struct bpf_local_storage_map *smap;
 
 	/* It cannot exceed max nlattr's payload */
-	BUILD_BUG_ON(U16_MAX - NLA_HDRLEN < MAX_VALUE_SIZE);
+	BUILD_BUG_ON(U16_MAX - NLA_HDRLEN < BPF_LOCAL_STORAGE_MAX_VALUE_SIZE);
 
 	nla_stg = nla_nest_start(skb, SK_DIAG_BPF_STORAGE);
 	if (!nla_stg)
@@ -1103,9 +608,9 @@ static int bpf_sk_storage_diag_put_all(struct sock *sk, struct sk_buff *skb,
 {
 	/* stg_array_type (e.g. INET_DIAG_BPF_SK_STORAGES) */
 	unsigned int diag_size = nla_total_size(0);
-	struct bpf_sk_storage *sk_storage;
-	struct bpf_sk_storage_elem *selem;
-	struct bpf_sk_storage_map *smap;
+	struct bpf_local_storage *sk_storage;
+	struct bpf_local_storage_elem *selem;
+	struct bpf_local_storage_map *smap;
 	struct nlattr *nla_stgs;
 	unsigned int saved_len;
 	int err = 0;
@@ -1158,8 +663,8 @@ int bpf_sk_storage_diag_put(struct bpf_sk_storage_diag *diag,
 {
 	/* stg_array_type (e.g. INET_DIAG_BPF_SK_STORAGES) */
 	unsigned int diag_size = nla_total_size(0);
-	struct bpf_sk_storage *sk_storage;
-	struct bpf_sk_storage_data *sdata;
+	struct bpf_local_storage *sk_storage;
+	struct bpf_local_storage_data *sdata;
 	struct nlattr *nla_stgs;
 	unsigned int saved_len;
 	int err = 0;
@@ -1186,8 +691,8 @@ int bpf_sk_storage_diag_put(struct bpf_sk_storage_diag *diag,
 
 	saved_len = skb->len;
 	for (i = 0; i < diag->nr_maps; i++) {
-		sdata = __sk_storage_lookup(sk_storage,
-				(struct bpf_sk_storage_map *)diag->maps[i],
+		sdata = bpf_local_storage_lookup(sk_storage,
+				(struct bpf_local_storage_map *)diag->maps[i],
 				false);
 
 		if (!sdata)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 548a749aebb3..1f3e831c4813 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2802,10 +2802,10 @@ union bpf_attr {
  *		"type". The bpf-local-storage "type" (i.e. the *map*) is
  *		searched against all bpf-local-storages residing at *sk*.
  *
- *		An optional *flags* (**BPF_SK_STORAGE_GET_F_CREATE**) can be
+ *		An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
  *		used such that a new bpf-local-storage will be
  *		created if one does not exist.  *value* can be used
- *		together with **BPF_SK_STORAGE_GET_F_CREATE** to specify
+ *		together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
  *		the initial value of a bpf-local-storage.  If *value* is
  *		**NULL**, the new bpf-local-storage will be zero initialized.
  *	Return
@@ -3572,9 +3572,13 @@ enum {
 	BPF_F_SYSCTL_BASE_NAME		= (1ULL << 0),
 };
 
-/* BPF_FUNC_sk_storage_get flags */
+/* BPF_FUNC_<local>_storage_get flags */
 enum {
-	BPF_SK_STORAGE_GET_F_CREATE	= (1ULL << 0),
+	BPF_LOCAL_STORAGE_GET_F_CREATE	= (1ULL << 0),
+	/* BPF_SK_STORAGE_GET_F_CREATE is only kept for backward compatibility
+	 * and BPF_LOCAL_STORAGE_GET_F_CREATE must be used instead.
+	 */
+	BPF_SK_STORAGE_GET_F_CREATE  = BPF_LOCAL_STORAGE_GET_F_CREATE,
 };
 
 /* BPF_FUNC_read_branch_records flags. */
-- 
2.27.0.389.gc38d7665816-goog



* [PATCH bpf-next v4 3/4] bpf: Allow local storage to be used from LSM programs
From: KP Singh @ 2020-07-09 10:12 UTC (permalink / raw)
  To: linux-kernel, bpf, linux-security-module
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <20200709101239.3829793-1-kpsingh@chromium.org>

From: KP Singh <kpsingh@google.com>

Add support for using bpf_{sk, inode}_storage_{get, delete} in LSM
programs. These helpers are not (currently) exposed to tracing programs
because their usage is tied to the life-cycle of the owning object: they
should only be used where the owning object cannot be freed, e.g. when
the owning object is passed as an argument to the LSM hook. They are
therefore safer to use in LSM hooks than in tracing. Usage of local
storage in tracing programs will probably follow a per-function
whitelist approach.

The UAPI helper signature for bpf_sk_storage expects a bpf_sock, which
leads to a compilation warning for LSM programs, so it is also updated
to accept a void * pointer instead.
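
The helper lookup described above can be pictured with a small userspace
model. This is only a sketch of the dispatch pattern, not kernel code:
enum func_id, struct func_proto and every *_proto object below are
hypothetical stand-ins for enum bpf_func_id, struct bpf_func_proto and the
real prototypes, and FUNC_PROBE_READ stands in for any helper that the
tracing lookup already provides.

```c
#include <stddef.h>

/* Hypothetical helper ids; the kernel uses enum bpf_func_id. */
enum func_id {
	FUNC_INODE_STORAGE_GET,
	FUNC_INODE_STORAGE_DELETE,
	FUNC_SK_STORAGE_GET,
	FUNC_SK_STORAGE_DELETE,
	FUNC_PROBE_READ,	/* stands in for any tracing helper */
	FUNC_UNKNOWN,
};

/* Stand-in for struct bpf_func_proto: only a name is modelled. */
struct func_proto {
	const char *name;
};

static const struct func_proto inode_get_proto = { "inode_storage_get" };
static const struct func_proto inode_delete_proto = { "inode_storage_delete" };
static const struct func_proto sk_get_btf_proto = { "sk_storage_get (BTF)" };
static const struct func_proto sk_delete_btf_proto = { "sk_storage_delete (BTF)" };
static const struct func_proto probe_read_proto = { "probe_read" };

/* Models the tracing lookup: it knows nothing about local storage. */
static const struct func_proto *tracing_func_proto(enum func_id id)
{
	return id == FUNC_PROBE_READ ? &probe_read_proto : NULL;
}

/* Models the LSM lookup: storage helpers first, then fall back to
 * whatever the tracing lookup offers.
 */
static const struct func_proto *lsm_func_proto(enum func_id id)
{
	switch (id) {
	case FUNC_INODE_STORAGE_GET:
		return &inode_get_proto;
	case FUNC_INODE_STORAGE_DELETE:
		return &inode_delete_proto;
	case FUNC_SK_STORAGE_GET:
		return &sk_get_btf_proto;
	case FUNC_SK_STORAGE_DELETE:
		return &sk_delete_btf_proto;
	default:
		return tracing_func_proto(id);
	}
}
```

In this model lsm_func_proto(FUNC_SK_STORAGE_GET) resolves to the BTF
variant, while tracing_func_proto(FUNC_SK_STORAGE_GET) returns NULL, which
mirrors the intent that the storage helpers are reachable from LSM
programs only.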

Signed-off-by: KP Singh <kpsingh@google.com>
---
 include/net/bpf_sk_storage.h   |  2 ++
 include/uapi/linux/bpf.h       |  4 ++--
 kernel/bpf/bpf_lsm.c           | 21 ++++++++++++++++++++-
 net/core/bpf_sk_storage.c      | 22 ++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  4 ++--
 5 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/include/net/bpf_sk_storage.h b/include/net/bpf_sk_storage.h
index 5036c94c0503..d231da1b57ca 100644
--- a/include/net/bpf_sk_storage.h
+++ b/include/net/bpf_sk_storage.h
@@ -9,6 +9,8 @@ void bpf_sk_storage_free(struct sock *sk);
 
 extern const struct bpf_func_proto bpf_sk_storage_get_proto;
 extern const struct bpf_func_proto bpf_sk_storage_delete_proto;
+extern const struct bpf_func_proto sk_storage_get_btf_proto;
+extern const struct bpf_func_proto sk_storage_delete_btf_proto;
 
 struct bpf_sk_storage_diag;
 struct sk_buff;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 42fc442f4586..3d2859ccc7ae 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2787,7 +2787,7 @@ union bpf_attr {
  *
  *		**-ERANGE** if resulting value was out of range.
  *
- * void *bpf_sk_storage_get(struct bpf_map *map, struct bpf_sock *sk, void *value, u64 flags)
+ * void *bpf_sk_storage_get(struct bpf_map *map, void *sk, void *value, u64 flags)
  *	Description
  *		Get a bpf-local-storage from a *sk*.
  *
@@ -2815,7 +2815,7 @@ union bpf_attr {
  *		**NULL** if not found or there was an error in adding
  *		a new bpf-local-storage.
  *
- * long bpf_sk_storage_delete(struct bpf_map *map, struct bpf_sock *sk)
+ * long bpf_sk_storage_delete(struct bpf_map *map, void *sk)
  *	Description
  *		Delete a bpf-local-storage from a *sk*.
  *	Return
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index fb278144e9fd..9cd1428c7199 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -11,6 +11,8 @@
 #include <linux/bpf_lsm.h>
 #include <linux/kallsyms.h>
 #include <linux/bpf_verifier.h>
+#include <net/bpf_sk_storage.h>
+#include <linux/bpf_local_storage.h>
 
 /* For every LSM hook that allows attachment of BPF programs, declare a nop
  * function where a BPF program can be attached.
@@ -45,10 +47,27 @@ int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 	return 0;
 }
 
+static const struct bpf_func_proto *
+bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+	switch (func_id) {
+	case BPF_FUNC_inode_storage_get:
+		return &bpf_inode_storage_get_proto;
+	case BPF_FUNC_inode_storage_delete:
+		return &bpf_inode_storage_delete_proto;
+	case BPF_FUNC_sk_storage_get:
+		return &sk_storage_get_btf_proto;
+	case BPF_FUNC_sk_storage_delete:
+		return &sk_storage_delete_btf_proto;
+	default:
+		return tracing_prog_func_proto(func_id, prog);
+	}
+}
+
 const struct bpf_prog_ops lsm_prog_ops = {
 };
 
 const struct bpf_verifier_ops lsm_verifier_ops = {
-	.get_func_proto = tracing_prog_func_proto,
+	.get_func_proto = bpf_lsm_func_proto,
 	.is_valid_access = btf_ctx_access,
 };
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index a2b00a09d843..ed0a07e0bb67 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -456,6 +456,28 @@ const struct bpf_func_proto bpf_sk_storage_delete_proto = {
 	.arg2_type	= ARG_PTR_TO_SOCKET,
 };
 
+static int sk_storage_get_btf_ids[4];
+const struct bpf_func_proto sk_storage_get_btf_proto = {
+	.func		= bpf_sk_storage_get,
+	.gpl_only	= false,
+	.ret_type	= RET_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg1_type	= ARG_CONST_MAP_PTR,
+	.arg2_type	= ARG_PTR_TO_BTF_ID,
+	.arg3_type	= ARG_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg4_type	= ARG_ANYTHING,
+	.btf_id		= sk_storage_get_btf_ids,
+};
+
+static int sk_storage_delete_btf_ids[2];
+const struct bpf_func_proto sk_storage_delete_btf_proto = {
+	.func		= bpf_sk_storage_delete,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_CONST_MAP_PTR,
+	.arg2_type	= ARG_PTR_TO_BTF_ID,
+	.btf_id		= sk_storage_delete_btf_ids,
+};
+
 struct bpf_sk_storage_diag {
 	u32 nr_maps;
 	struct bpf_map *maps[];
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 42fc442f4586..3d2859ccc7ae 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2787,7 +2787,7 @@ union bpf_attr {
  *
  *		**-ERANGE** if resulting value was out of range.
  *
- * void *bpf_sk_storage_get(struct bpf_map *map, struct bpf_sock *sk, void *value, u64 flags)
+ * void *bpf_sk_storage_get(struct bpf_map *map, void *sk, void *value, u64 flags)
  *	Description
  *		Get a bpf-local-storage from a *sk*.
  *
@@ -2815,7 +2815,7 @@ union bpf_attr {
  *		**NULL** if not found or there was an error in adding
  *		a new bpf-local-storage.
  *
- * long bpf_sk_storage_delete(struct bpf_map *map, struct bpf_sock *sk)
+ * long bpf_sk_storage_delete(struct bpf_map *map, void *sk)
  *	Description
  *		Delete a bpf-local-storage from a *sk*.
  *	Return
-- 
2.27.0.389.gc38d7665816-goog



* [PATCH bpf-next v4 2/4] bpf: Implement bpf_local_storage for inodes
From: KP Singh @ 2020-07-09 10:12 UTC (permalink / raw)
  To: linux-kernel, bpf, linux-security-module
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Paul Turner, Jann Horn, Florent Revest
In-Reply-To: <20200709101239.3829793-1-kpsingh@chromium.org>

From: KP Singh <kpsingh@google.com>

Similar to bpf_local_storage for sockets, add local storage for inodes.
The life-cycle of the storage is tied to the life-cycle of the inode,
i.e. the storage is destroyed along with the owning inode.
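
To make the intended semantics concrete, here is a rough userspace model
of owner-local storage. It is a sketch under simplifying assumptions (no
RCU, no locking, no per-map cache), and every model_* name as well as
GET_F_CREATE is hypothetical: the owner keeps a list of elements, the map
only acts as the "type" that selects an element, and freeing the owner
frees everything attached to it.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define GET_F_CREATE 1ULL	/* models BPF_LOCAL_STORAGE_GET_F_CREATE */

/* The map is only the storage "type"; values live at the owner. */
struct model_map {
	size_t value_size;
};

struct model_elem {
	const struct model_map *map;
	struct model_elem *next;
	unsigned char data[];	/* value_size bytes */
};

/* Stands in for the inode that owns the storage. */
struct model_owner {
	struct model_elem *storage;
};

/* Look up the element for @map; optionally create it on a miss. */
static void *model_storage_get(struct model_owner *owner,
			       const struct model_map *map,
			       const void *value, unsigned long long flags)
{
	struct model_elem *e;

	if (flags & ~GET_F_CREATE)
		return NULL;

	for (e = owner->storage; e; e = e->next)
		if (e->map == map)
			return e->data;

	if (!(flags & GET_F_CREATE))
		return NULL;

	/* calloc() gives the zero-initialized default when @value is NULL */
	e = calloc(1, sizeof(*e) + map->value_size);
	if (!e)
		return NULL;
	if (value)
		memcpy(e->data, value, map->value_size);
	e->map = map;
	e->next = owner->storage;
	owner->storage = e;
	return e->data;
}

/* Remove the element for @map; -ENOENT if there is none. */
static int model_storage_delete(struct model_owner *owner,
				const struct model_map *map)
{
	struct model_elem **pp;

	for (pp = &owner->storage; *pp; pp = &(*pp)->next) {
		if ((*pp)->map == map) {
			struct model_elem *victim = *pp;

			*pp = victim->next;
			free(victim);
			return 0;
		}
	}
	return -ENOENT;
}

/* Destroying the owner destroys all storage attached to it. */
static void model_owner_free(struct model_owner *owner)
{
	while (owner->storage) {
		struct model_elem *e = owner->storage;

		owner->storage = e->next;
		free(e);
	}
}
```

A lookup without GET_F_CREATE on a fresh owner returns NULL, a lookup with
it returns a pointer to the (copied or zeroed) value, and a second delete
for the same map returns -ENOENT, matching the return values documented
for the new helpers.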

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
inode security blob. Security blobs are now stackable, so this pointer
can co-exist with the blobs of other LSMs.
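
The blob arithmetic can be sketched in userspace as follows. This is a
model only, and all model_* names are hypothetical. It assumes, as I
understand the stacking infrastructure, that each LSM first stores the
size it needs in lbs_inode and that the framework then rewrites that field
into the LSM's offset inside the shared i_security allocation, which is
why bpf_inode() can simply add lbs_inode to i_security.

```c
#include <stddef.h>

/* Stand-in for struct lsm_blob_sizes; only the inode field is modelled. */
struct model_blob_sizes {
	int lbs_inode;	/* requested size on input, offset afterwards */
};

/* Stand-in for struct inode. */
struct model_inode {
	void *i_security;	/* one allocation shared by all LSMs */
};

/* Convert a requested size into an offset and grow the running total. */
static void model_set_blob_size(int *need, int *total)
{
	int offset;

	if (*need <= 0)
		return;

	offset = *total;
	*total += *need;
	*need = offset;
}

/* Models bpf_inode(): locate this LSM's slice of the shared blob. */
static void *model_lsm_inode_blob(const struct model_inode *inode,
				  const struct model_blob_sizes *sizes)
{
	if (!inode->i_security)
		return NULL;

	return (char *)inode->i_security + sizes->lbs_inode;
}
```

With one LSM asking for 24 bytes and a second asking for sizeof(void *),
the first ends up at offset 0 and the second at offset 24; neither needs
to know about the other, which is what lets the storage pointer co-exist
with other LSMs' inode data.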

Signed-off-by: KP Singh <kpsingh@google.com>
---
 include/linux/bpf_local_storage.h             |  15 +
 include/linux/bpf_lsm.h                       |  21 ++
 include/linux/bpf_types.h                     |   3 +
 include/uapi/linux/bpf.h                      |  38 ++
 kernel/bpf/Makefile                           |   1 +
 kernel/bpf/bpf_inode_storage.c                | 333 ++++++++++++++++++
 kernel/bpf/syscall.c                          |   3 +-
 kernel/bpf/verifier.c                         |  10 +
 security/bpf/hooks.c                          |   7 +
 .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
 tools/bpf/bpftool/bash-completion/bpftool     |   3 +-
 tools/bpf/bpftool/map.c                       |   3 +-
 tools/include/uapi/linux/bpf.h                |  38 ++
 tools/lib/bpf/libbpf_probes.c                 |   5 +-
 14 files changed, 476 insertions(+), 6 deletions(-)
 create mode 100644 kernel/bpf/bpf_inode_storage.c

diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 605b81f2f806..63cc7b93ae80 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -113,6 +113,8 @@ static void cache_idx_free_##type(u16 idx)				\
 	       sizeof(struct bpf_local_storage_elem)),			\
 	      (U16_MAX - sizeof(struct bpf_local_storage_elem)))
 
+struct inode;
+
 struct bpf_local_storage {
 	struct bpf_local_storage_data __rcu *cache[BPF_STORAGE_CACHE_SIZE];
 	struct hlist_head list;		/* List of bpf_local_storage_elem */
@@ -121,6 +123,9 @@ struct bpf_local_storage {
 	 */
 	union {
 		struct sock *sk;
+#ifdef CONFIG_BPF_LSM
+		struct inode *inode;
+#endif
 	};
 	struct rcu_head rcu;
 	raw_spinlock_t lock;	/* Protect adding/removing from the "list" */
@@ -172,4 +177,14 @@ struct bpf_local_storage_data *
 bpf_local_storage_update(void *owner, struct bpf_map *map,
 			 struct bpf_local_storage *local_storage, void *value,
 			 u64 map_flags);
+
+#ifdef CONFIG_BPF_LSM
+extern const struct bpf_func_proto bpf_inode_storage_get_proto;
+extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
+void bpf_inode_storage_free(struct inode *inode);
+#else
+static inline void bpf_inode_storage_free(struct inode *inode)
+{
+}
+#endif
 #endif /* _BPF_LOCAL_STORAGE_H */
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index af74712af585..d0683ada1e49 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -17,9 +17,24 @@
 #include <linux/lsm_hook_defs.h>
 #undef LSM_HOOK
 
+struct bpf_storage_blob {
+	struct bpf_local_storage __rcu *storage;
+};
+
+extern struct lsm_blob_sizes bpf_lsm_blob_sizes;
+
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog);
 
+static inline struct bpf_storage_blob *bpf_inode(
+	const struct inode *inode)
+{
+	if (unlikely(!inode->i_security))
+		return NULL;
+
+	return inode->i_security + bpf_lsm_blob_sizes.lbs_inode;
+}
+
 #else /* !CONFIG_BPF_LSM */
 
 static inline int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
@@ -28,6 +43,12 @@ static inline int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 	return -EOPNOTSUPP;
 }
 
+static inline struct bpf_storage_blob *bpf_inode(
+	const struct inode *inode)
+{
+	return NULL;
+}
+
 #endif /* CONFIG_BPF_LSM */
 
 #endif /* _LINUX_BPF_LSM_H */
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index a18ae82a298a..2950576de4ae 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -105,6 +105,9 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_SK_STORAGE, sk_storage_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKMAP, sock_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKHASH, sock_hash_ops)
 #endif
+#ifdef CONFIG_BPF_LSM
+BPF_MAP_TYPE(BPF_MAP_TYPE_INODE_STORAGE, inode_storage_map_ops)
+#endif
 BPF_MAP_TYPE(BPF_MAP_TYPE_CPUMAP, cpu_map_ops)
 #if defined(CONFIG_XDP_SOCKETS)
 BPF_MAP_TYPE(BPF_MAP_TYPE_XSKMAP, xsk_map_ops)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1f3e831c4813..42fc442f4586 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -148,6 +148,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_DEVMAP_HASH,
 	BPF_MAP_TYPE_STRUCT_OPS,
 	BPF_MAP_TYPE_RINGBUF,
+	BPF_MAP_TYPE_INODE_STORAGE,
 };
 
 /* Note that tracing related programs such as
@@ -3319,6 +3320,41 @@ union bpf_attr {
  *		A non-negative value equal to or less than *size* on success,
  *		or a negative error in case of failure.
  *
+ * void *bpf_inode_storage_get(struct bpf_map *map, void *inode, void *value, u64 flags)
+ *	Description
+ *		Get a bpf_local_storage from an *inode*.
+ *
+ *		Logically, it could be thought of as getting the value from
+ *		a *map* with *inode* as the **key**.  From this
+ *		perspective,  the usage is not much different from
+ *		**bpf_map_lookup_elem**\ (*map*, **&**\ *inode*) except this
+ *		helper enforces the key must be an inode and the map must also
+ *		be a **BPF_MAP_TYPE_INODE_STORAGE**.
+ *
+ *		Underneath, the value is stored locally at *inode* instead of
+ *		the *map*.  The *map* is used as the bpf-local-storage
+ *		"type". The bpf-local-storage "type" (i.e. the *map*) is
+ *		searched against all bpf_local_storage residing at *inode*.
+ *
+ *		An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
+ *		used such that a new bpf_local_storage will be
+ *		created if one does not exist.  *value* can be used
+ *		together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
+ *		the initial value of a bpf_local_storage.  If *value* is
+ *		**NULL**, the new bpf_local_storage will be zero initialized.
+ *	Return
+ *		A bpf_local_storage pointer is returned on success.
+ *
+ *		**NULL** if not found or there was an error in adding
+ *		a new bpf_local_storage.
+ *
+ * int bpf_inode_storage_delete(struct bpf_map *map, void *inode)
+ *	Description
+ *		Delete a bpf_local_storage from an *inode*.
+ *	Return
+ *		0 on success.
+ *
+ *		**-ENOENT** if the bpf_local_storage cannot be found.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3463,6 +3499,8 @@ union bpf_attr {
 	FN(skc_to_tcp_request_sock),	\
 	FN(skc_to_udp6_sock),		\
 	FN(get_task_stack),		\
+	FN(inode_storage_get),		\
+	FN(inode_storage_delete),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 0acb8f8a6042..0ea9fd15977c 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -5,6 +5,7 @@ CFLAGS_core.o += $(call cc-disable-warning, override-init)
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o task_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o
 obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
+obj-${CONFIG_BPF_LSM}	  += bpf_inode_storage.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o
 obj-$(CONFIG_BPF_JIT) += trampoline.o
 obj-$(CONFIG_BPF_SYSCALL) += btf.o
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
new file mode 100644
index 000000000000..d63db7a24168
--- /dev/null
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -0,0 +1,333 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019 Facebook
+ * Copyright 2020 Google LLC.
+ */
+
+#include <linux/rculist.h>
+#include <linux/list.h>
+#include <linux/hash.h>
+#include <linux/types.h>
+#include <linux/spinlock.h>
+#include <linux/bpf.h>
+#include <linux/bpf_local_storage.h>
+#include <net/sock.h>
+#include <uapi/linux/sock_diag.h>
+#include <uapi/linux/btf.h>
+#include <linux/bpf_lsm.h>
+
+static struct bpf_local_storage_elem *
+inode_selem_alloc(struct bpf_local_storage_map *smap, void *owner,
+		  void *value, bool charge_omem)
+{
+	return bpf_selem_alloc(smap, value);
+}
+
+static void unlink_inode_storage(struct bpf_local_storage *local_storage,
+				 bool uncharge_omem)
+{
+	struct bpf_storage_blob *bsb;
+	struct inode *inode;
+
+	inode = local_storage->inode;
+	bsb = bpf_inode(inode);
+	if (!bsb)
+		return;
+	RCU_INIT_POINTER(bsb->storage, NULL);
+	local_storage->inode = NULL;
+}
+
+static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode,
+							   struct bpf_map *map,
+							   bool cacheit_lockit)
+{
+	struct bpf_local_storage *inode_storage;
+	struct bpf_local_storage_map *smap;
+	struct bpf_storage_blob *bsb;
+
+	bsb = bpf_inode(inode);
+	if (!bsb)
+		return NULL;	/* callers only test for NULL before dereferencing */
+
+	inode_storage = rcu_dereference(bsb->storage);
+	if (!inode_storage)
+		return NULL;
+
+	smap = (struct bpf_local_storage_map *)map;
+	return bpf_local_storage_lookup(inode_storage, smap, cacheit_lockit);
+}
+
+static int inode_storage_alloc(void *owner, struct bpf_local_storage_map *smap,
+			       struct bpf_local_storage_elem *first_selem)
+{
+	struct bpf_local_storage *curr;
+	struct bpf_storage_blob *bsb;
+	struct inode *inode = owner;
+	int err;
+
+	bsb = bpf_inode(inode);
+	if (!bsb)
+		return -EINVAL;
+
+	curr = bpf_local_storage_alloc(smap);
+	if (!curr)
+		return -ENOMEM;
+
+	curr->inode = inode;
+
+	bpf_selem_link(curr, first_selem);
+	bpf_selem_link_map(smap, first_selem);
+
+	err = bpf_local_storage_publish(first_selem,
+		(struct bpf_local_storage **)&bsb->storage, curr);
+	if (err) {
+		kfree(curr);
+		return err;
+	}
+
+	return 0;
+}
+
+static struct bpf_local_storage_data *inode_storage_update(void *owner,
+							   struct bpf_map *map,
+							   void *value,
+							   u64 map_flags)
+{
+	struct bpf_local_storage_data *old_sdata = NULL;
+	struct bpf_local_storage_elem *selem;
+	struct bpf_local_storage *local_storage;
+	struct bpf_local_storage_map *smap;
+	struct bpf_storage_blob *bsb;
+	struct inode *inode;
+	int err;
+
+	err = bpf_local_storage_check_update_flags(map, map_flags);
+	if (err)
+		return ERR_PTR(err);
+
+	inode = owner;
+	bsb = bpf_inode(inode);
+	local_storage = rcu_dereference(bsb->storage);
+	smap = (struct bpf_local_storage_map *)map;
+
+	if (!local_storage || hlist_empty(&local_storage->list)) {
+		/* Very first elem */
+		selem = map->ops->map_selem_alloc(smap, owner, value, !old_sdata);
+		if (!selem)
+			return ERR_PTR(-ENOMEM);
+
+		err = map->ops->map_local_storage_alloc(owner, smap, selem);
+		if (err) {
+			kfree(selem);
+			return ERR_PTR(err);
+		}
+
+		return SDATA(selem);
+	}
+
+	return bpf_local_storage_update(owner, map, local_storage, value,
+					map_flags);
+}
+
+
+void bpf_inode_storage_free(struct inode *inode)
+{
+	struct bpf_local_storage_elem *selem;
+	struct bpf_local_storage *local_storage;
+	bool free_inode_storage = false;
+	struct bpf_storage_blob *bsb;
+	struct hlist_node *n;
+
+	bsb = bpf_inode(inode);
+	if (!bsb)
+		return;
+
+	rcu_read_lock();
+
+	local_storage = rcu_dereference(bsb->storage);
+	if (!local_storage) {
+		rcu_read_unlock();
+		return;
+	}
+
+	/* Neither the bpf_prog nor the bpf-map's syscall
+	 * could be modifying the local_storage->list now.
+	 * Thus, no elem can be added-to or deleted-from the
+	 * local_storage->list by the bpf_prog or by the bpf-map's syscall.
+	 *
+	 * It is racing with bpf_local_storage_map_free() alone
+	 * when unlinking elem from the local_storage->list and
+	 * the map's bucket->list.
+	 */
+	raw_spin_lock_bh(&local_storage->lock);
+	hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
+		/* Always unlink from map before unlinking from
+		 * local_storage.
+		 */
+		bpf_selem_unlink_map(selem);
+		free_inode_storage =
+			bpf_selem_unlink(local_storage, selem, false);
+	}
+	raw_spin_unlock_bh(&local_storage->lock);
+	rcu_read_unlock();
+
+	/* free_inode_storage should always be true as long as
+	 * local_storage->list was non-empty.
+	 */
+	if (free_inode_storage)
+		kfree_rcu(local_storage, rcu);
+}
+
+
+static void *bpf_inode_storage_lookup_elem(struct bpf_map *map, void *key)
+{
+	struct bpf_local_storage_data *sdata;
+	struct inode *inode;
+	int err = -EINVAL;
+
+	if (key) {
+		inode = *(struct inode **)(key);
+		sdata = inode_storage_lookup(inode, map, true);
+		return sdata ? sdata->data : NULL;
+	}
+
+	return ERR_PTR(err);
+}
+
+static int bpf_inode_storage_update_elem(struct bpf_map *map, void *key,
+					 void *value, u64 map_flags)
+{
+	struct bpf_local_storage_data *sdata;
+	struct inode *inode;
+	int err = -EINVAL;
+
+	if (key) {
+		inode = *(struct inode **)(key);
+		sdata = map->ops->map_local_storage_update(inode, map, value,
+							   map_flags);
+		return PTR_ERR_OR_ZERO(sdata);
+	}
+	return err;
+}
+
+static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
+{
+	struct bpf_local_storage_data *sdata;
+
+	sdata = inode_storage_lookup(inode, map, false);
+	if (!sdata)
+		return -ENOENT;
+
+	bpf_selem_unlink_map_elem(SELEM(sdata));
+
+	return 0;
+}
+
+static int bpf_inode_storage_delete_elem(struct bpf_map *map, void *key)
+{
+	struct inode *inode;
+	int err = -EINVAL;
+
+	if (key) {
+		inode = *(struct inode **)(key);
+		err = inode_storage_delete(inode, map);
+	}
+
+	return err;
+}
+
+BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
+	   void *, value, u64, flags)
+{
+	struct bpf_local_storage_data *sdata;
+
+	if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE))
+		return (unsigned long)NULL;
+
+	sdata = inode_storage_lookup(inode, map, true);
+	if (sdata)
+		return (unsigned long)sdata->data;
+
+	if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) {
+		sdata = inode_storage_update(inode, map, value, BPF_NOEXIST);
+		return IS_ERR(sdata) ?
+			(unsigned long)NULL : (unsigned long)sdata->data;
+	}
+
+	return (unsigned long)NULL;
+}
+
+BPF_CALL_2(bpf_inode_storage_delete,
+	   struct bpf_map *, map, struct inode *, inode)
+{
+	return inode_storage_delete(inode, map);
+}
+
+static int notsupp_get_next_key(struct bpf_map *map, void *key,
+				void *next_key)
+{
+	return -ENOTSUPP;
+}
+
+DEFINE_BPF_STORAGE_CACHE(inode);
+
+static struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
+{
+	struct bpf_local_storage_map *smap;
+
+	smap = bpf_local_storage_map_alloc(attr);
+	if (IS_ERR(smap))
+		return ERR_CAST(smap);
+
+	smap->cache_idx = cache_idx_get_inode();
+	return &smap->map;
+}
+
+static void inode_storage_map_free(struct bpf_map *map)
+{
+	struct bpf_local_storage_map *smap;
+
+	smap = (struct bpf_local_storage_map *)map;
+	cache_idx_free_inode(smap->cache_idx);
+	bpf_local_storage_map_free(smap);
+}
+
+static int inode_storage_map_btf_id;
+const struct bpf_map_ops inode_storage_map_ops = {
+	.map_alloc_check = bpf_local_storage_map_alloc_check,
+	.map_alloc = inode_storage_map_alloc,
+	.map_free = inode_storage_map_free,
+	.map_get_next_key = notsupp_get_next_key,
+	.map_lookup_elem = bpf_inode_storage_lookup_elem,
+	.map_update_elem = bpf_inode_storage_update_elem,
+	.map_delete_elem = bpf_inode_storage_delete_elem,
+	.map_check_btf = bpf_local_storage_map_check_btf,
+	.map_btf_name = "bpf_local_storage_map",
+	.map_btf_id = &inode_storage_map_btf_id,
+	.map_local_storage_alloc = inode_storage_alloc,
+	.map_selem_alloc = inode_selem_alloc,
+	.map_local_storage_update = inode_storage_update,
+	.map_local_storage_unlink = unlink_inode_storage,
+};
+
+static int bpf_inode_storage_get_btf_ids[4];
+const struct bpf_func_proto bpf_inode_storage_get_proto = {
+	.func		= bpf_inode_storage_get,
+	.gpl_only	= false,
+	.ret_type	= RET_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg1_type	= ARG_CONST_MAP_PTR,
+	.arg2_type	= ARG_PTR_TO_BTF_ID,
+	.arg3_type	= ARG_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg4_type	= ARG_ANYTHING,
+	.btf_id		= bpf_inode_storage_get_btf_ids,
+};
+
+static int bpf_inode_storage_delete_btf_ids[2];
+const struct bpf_func_proto bpf_inode_storage_delete_proto = {
+	.func		= bpf_inode_storage_delete,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_CONST_MAP_PTR,
+	.arg2_type	= ARG_PTR_TO_BTF_ID,
+	.btf_id		= bpf_inode_storage_delete_btf_ids,
+};
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 156f51ffada2..3fc7586408f7 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -768,7 +768,8 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 		if (map->map_type != BPF_MAP_TYPE_HASH &&
 		    map->map_type != BPF_MAP_TYPE_ARRAY &&
 		    map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
-		    map->map_type != BPF_MAP_TYPE_SK_STORAGE)
+		    map->map_type != BPF_MAP_TYPE_SK_STORAGE &&
+		    map->map_type != BPF_MAP_TYPE_INODE_STORAGE)
 			return -ENOTSUPP;
 		if (map->spin_lock_off + sizeof(struct bpf_spin_lock) >
 		    map->value_size) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b608185e1ffd..9080a62b583a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4164,6 +4164,11 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 		    func_id != BPF_FUNC_sk_storage_delete)
 			goto error;
 		break;
+	case BPF_MAP_TYPE_INODE_STORAGE:
+		if (func_id != BPF_FUNC_inode_storage_get &&
+		    func_id != BPF_FUNC_inode_storage_delete)
+			goto error;
+		break;
 	default:
 		break;
 	}
@@ -4237,6 +4242,11 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 		if (map->map_type != BPF_MAP_TYPE_SK_STORAGE)
 			goto error;
 		break;
+	case BPF_FUNC_inode_storage_get:
+	case BPF_FUNC_inode_storage_delete:
+		if (map->map_type != BPF_MAP_TYPE_INODE_STORAGE)
+			goto error;
+		break;
 	default:
 		break;
 	}
diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c
index 32d32d485451..35f9b19259e5 100644
--- a/security/bpf/hooks.c
+++ b/security/bpf/hooks.c
@@ -3,6 +3,7 @@
 /*
  * Copyright (C) 2020 Google LLC.
  */
+#include <linux/bpf_local_storage.h>
 #include <linux/lsm_hooks.h>
 #include <linux/bpf_lsm.h>
 
@@ -11,6 +12,7 @@ static struct security_hook_list bpf_lsm_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(NAME, bpf_lsm_##NAME),
 	#include <linux/lsm_hook_defs.h>
 	#undef LSM_HOOK
+	LSM_HOOK_INIT(inode_free_security, bpf_inode_storage_free),
 };
 
 static int __init bpf_lsm_init(void)
@@ -20,7 +22,12 @@ static int __init bpf_lsm_init(void)
 	return 0;
 }
 
+struct lsm_blob_sizes bpf_lsm_blob_sizes __lsm_ro_after_init = {
+	.lbs_inode = sizeof(struct bpf_storage_blob),
+};
+
 DEFINE_LSM(bpf) = {
 	.name = "bpf",
 	.init = bpf_lsm_init,
+	.blobs = &bpf_lsm_blob_sizes
 };
diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
index 41e2a74252d0..083db6c2fc67 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
@@ -49,7 +49,7 @@ MAP COMMANDS
 |		| **lru_percpu_hash** | **lpm_trie** | **array_of_maps** | **hash_of_maps**
 |		| **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
 |		| **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage**
-|		| **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** }
+|		| **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage** }
 
 DESCRIPTION
 ===========
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 25b25aca1112..34cadc081a78 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -688,7 +688,8 @@ _bpftool()
                                 lru_percpu_hash lpm_trie array_of_maps \
                                 hash_of_maps devmap devmap_hash sockmap cpumap \
                                 xskmap sockhash cgroup_storage reuseport_sockarray \
-                                percpu_cgroup_storage queue stack' -- \
+                                percpu_cgroup_storage queue stack sk_storage \
+                                struct_ops inode_storage' -- \
                                                    "$cur" ) )
                             return 0
                             ;;
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 3a27d31a1856..bc0071228f88 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -50,6 +50,7 @@ const char * const map_type_name[] = {
 	[BPF_MAP_TYPE_SK_STORAGE]		= "sk_storage",
 	[BPF_MAP_TYPE_STRUCT_OPS]		= "struct_ops",
 	[BPF_MAP_TYPE_RINGBUF]			= "ringbuf",
+	[BPF_MAP_TYPE_INODE_STORAGE]		= "inode_storage",
 };
 
 const size_t map_type_name_size = ARRAY_SIZE(map_type_name);
@@ -1442,7 +1443,7 @@ static int do_help(int argc, char **argv)
 		"                 lru_percpu_hash | lpm_trie | array_of_maps | hash_of_maps |\n"
 		"                 devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n"
 		"                 cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n"
-		"                 queue | stack | sk_storage | struct_ops | ringbuf }\n"
+		"                 queue | stack | sk_storage | struct_ops | ringbuf | inode_storage }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, argv[-2]);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 1f3e831c4813..42fc442f4586 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -148,6 +148,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_DEVMAP_HASH,
 	BPF_MAP_TYPE_STRUCT_OPS,
 	BPF_MAP_TYPE_RINGBUF,
+	BPF_MAP_TYPE_INODE_STORAGE,
 };
 
 /* Note that tracing related programs such as
@@ -3319,6 +3320,41 @@ union bpf_attr {
  *		A non-negative value equal to or less than *size* on success,
  *		or a negative error in case of failure.
  *
+ * void *bpf_inode_storage_get(struct bpf_map *map, void *inode, void *value, u64 flags)
+ *	Description
+ *		Get a bpf_local_storage from an *inode*.
+ *
+ *		Logically, it could be thought of as getting the value from
+ *		a *map* with *inode* as the **key**.  From this
+ *		perspective,  the usage is not much different from
+ *		**bpf_map_lookup_elem**\ (*map*, **&**\ *inode*) except this
+ *		helper enforces the key must be an inode and the map must also
+ *		be a **BPF_MAP_TYPE_INODE_STORAGE**.
+ *
+ *		Underneath, the value is stored locally at *inode* instead of
+ *		the *map*.  The *map* is used as the bpf-local-storage
+ *		"type". The bpf-local-storage "type" (i.e. the *map*) is
+ *		searched against all bpf_local_storage residing at *inode*.
+ *
+ *		An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
+ *		used such that a new bpf_local_storage will be
+ *		created if one does not exist.  *value* can be used
+ *		together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
+ *		the initial value of a bpf_local_storage.  If *value* is
+ *		**NULL**, the new bpf_local_storage will be zero initialized.
+ *	Return
+ *		A bpf_local_storage pointer is returned on success.
+ *
+ *		**NULL** if not found or there was an error in adding
+ *		a new bpf_local_storage.
+ *
+ * int bpf_inode_storage_delete(struct bpf_map *map, void *inode)
+ *	Description
+ *		Delete a bpf_local_storage from an *inode*.
+ *	Return
+ *		0 on success.
+ *
+ *		**-ENOENT** if the bpf_local_storage cannot be found.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3463,6 +3499,8 @@ union bpf_attr {
 	FN(skc_to_tcp_request_sock),	\
 	FN(skc_to_udp6_sock),		\
 	FN(get_task_stack),		\
+	FN(inode_storage_get),		\
+	FN(inode_storage_delete),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index 10cd8d1891f5..b859558ce290 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -170,7 +170,7 @@ int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
 	return btf_fd;
 }
 
-static int load_sk_storage_btf(void)
+static int load_local_storage_btf(void)
 {
 	const char strs[] = "\0bpf_spin_lock\0val\0cnt\0l";
 	/* struct bpf_spin_lock {
@@ -229,12 +229,13 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 		key_size	= 0;
 		break;
 	case BPF_MAP_TYPE_SK_STORAGE:
+	case BPF_MAP_TYPE_INODE_STORAGE:
 		btf_key_type_id = 1;
 		btf_value_type_id = 3;
 		value_size = 8;
 		max_entries = 0;
 		map_flags = BPF_F_NO_PREALLOC;
-		btf_fd = load_sk_storage_btf();
+		btf_fd = load_local_storage_btf();
 		if (btf_fd < 0)
 			return false;
 		break;
-- 
2.27.0.389.gc38d7665816-goog



* [PATCH bpf-next v4 0/4] Generalizing bpf_local_storage
From: KP Singh @ 2020-07-09 10:12 UTC (permalink / raw)
  To: linux-kernel, bpf, linux-security-module
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	Paul Turner, Jann Horn, Florent Revest

From: KP Singh <kpsingh@google.com>

# v3 -> v4

- Fixed a missing include to bpf_sk_storage.h in bpf_sk_storage.c
- Fixed some functions that were not marked as static which led to
  W=1 compilation warnings.

# v2 -> v3

* Restructured the code as per Martin's suggestions:
  - Common functionality in bpf_local_storage.c
  - bpf_sk_storage functionality remains in net/bpf_sk_storage.
  - bpf_inode_storage is kept separate as it is enabled only with
    CONFIG_BPF_LSM.
* A separate cache for inode and sk storage with macros to define it.
* Use the ops style approach as suggested by Martin instead of the
  enum + switch style.
* Added the inode map to bpftool bash completion and docs.
* Rebase and indentation fixes.
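
For readers unfamiliar with the "ops style" mentioned above, here is a
minimal userspace sketch of the idea, not the code from this series. All
names (sock_model, inode_model, storage_ops, storage_attach,
storage_lookup) are hypothetical. The common code never switches on an
owner-type enum; each owner type supplies a callback that says where its
storage pointer lives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical owner types; stand-ins for struct sock and struct inode. */
struct sock_model  { void *sk_storage; };
struct inode_model { void *blob_storage; };

/* Per-owner-type operations: where does this owner keep its storage? */
struct storage_ops {
	void **(*storage_slot)(void *owner);
};

static void **sk_slot(void *owner)
{
	return &((struct sock_model *)owner)->sk_storage;
}

static void **inode_slot(void *owner)
{
	return &((struct inode_model *)owner)->blob_storage;
}

const struct storage_ops sk_ops    = { .storage_slot = sk_slot };
const struct storage_ops inode_ops = { .storage_slot = inode_slot };

/* Common code: no enum, no switch, only a call through the ops table. */
void storage_attach(const struct storage_ops *ops, void *owner, void *storage)
{
	assert(ops && ops->storage_slot && owner);
	*ops->storage_slot(owner) = storage;
}

void *storage_lookup(const struct storage_ops *ops, void *owner)
{
	assert(ops && ops->storage_slot && owner);
	return *ops->storage_slot(owner);
}
```

Adding a third owner type in this sketch means adding one callback and one
ops table; storage_attach() and storage_lookup() stay untouched, which is
the main attraction over the enum + switch style.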

# v1 -> v2

* Use the security blob pointer instead of dedicated member in
  struct inode.
* Better code re-use as suggested by Alexei.
* Dropped the inode count arithmetic as pointed out by Alexei.
* Minor bug fixes and rebase.

bpf_sk_storage can already be used by some BPF program types to annotate
socket objects. These annotations are managed with the life-cycle of the
object (i.e. freed when the object is freed) which makes BPF programs
much simpler and less prone to errors and leaks.
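
As a rough illustration of "managed with the life-cycle of the object",
the following is a userspace model only, assuming the get/delete semantics
documented for the helpers in this series (create on request, optional
initial value, zero-initialized otherwise, -ENOENT on a missing delete).
Every name here (model_map, model_inode, model_storage_get, ...) is
hypothetical and none of it is kernel code; locking and RCU are ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag, modelled on BPF_LOCAL_STORAGE_GET_F_CREATE. */
#define MODEL_GET_F_CREATE 1ULL

struct model_map { size_t value_size; };

struct model_elem {
	struct model_elem *next;
	const struct model_map *map;	/* the "type" that is searched for */
	unsigned char data[];		/* value_size bytes, pointer-aligned */
};

/* The values hang off the owning object, not off the map. */
struct model_inode { struct model_elem *storage; };

void *model_storage_get(struct model_inode *inode, const struct model_map *map,
			const void *value, unsigned long long flags)
{
	struct model_elem *e;

	assert(inode && map);
	for (e = inode->storage; e; e = e->next)
		if (e->map == map)
			return e->data;

	if (!(flags & MODEL_GET_F_CREATE))
		return NULL;

	/* calloc() gives the zero-initialized default when value is NULL. */
	e = calloc(1, sizeof(*e) + map->value_size);
	if (!e)
		return NULL;
	e->map = map;
	if (value)
		memcpy(e->data, value, map->value_size);
	e->next = inode->storage;
	inode->storage = e;
	return e->data;
}

int model_storage_delete(struct model_inode *inode, const struct model_map *map)
{
	struct model_elem **pp;

	assert(inode && map);
	for (pp = &inode->storage; *pp; pp = &(*pp)->next) {
		if ((*pp)->map == map) {
			struct model_elem *e = *pp;

			*pp = e->next;
			free(e);
			return 0;
		}
	}
	return -ENOENT;
}

/* Called when the owner goes away: every annotation is freed with it. */
void model_inode_free(struct model_inode *inode)
{
	assert(inode);
	while (inode->storage) {
		struct model_elem *e = inode->storage;

		inode->storage = e->next;
		free(e);
	}
}
```

The point of the model is the last function: the program that created an
annotation never has to remember to free it, because teardown of the owner
walks and frees whatever is still attached.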

This patch series:

* Generalizes the bpf_sk_storage infrastructure to allow easy
  implementation of local storage for other objects
* Implements local storage for inodes
* Makes both bpf_{sk, inode}_storage available to LSM programs.

Local storage is safe to use in LSM programs as the attachment sites are
limited and the owning object won't be freed. This is not the case for
tracing, however. Usage in tracing is expected to follow a white-list
based approach similar to the d_path helper
(https://lore.kernel.org/bpf/20200506132946.2164578-1-jolsa@kernel.org).

Access to local storage would allow LSM programs to implement stateful
detections like detecting the unlink of a running executable from the
examples shared as a part of the KRSI series
https://lore.kernel.org/bpf/20200329004356.27286-1-kpsingh@chromium.org/
and
https://github.com/sinkap/linux-krsi/blob/patch/v1/examples/samples/bpf/lsm_detect_exec_unlink.c


KP Singh (4):
  bpf: Generalize bpf_sk_storage
  bpf: Implement bpf_local_storage for inodes
  bpf: Allow local storage to be used from LSM programs
  bpf: Add selftests for local_storage

 include/linux/bpf.h                           |  14 +
 include/linux/bpf_local_storage.h             | 190 ++++
 include/linux/bpf_lsm.h                       |  21 +
 include/linux/bpf_types.h                     |   3 +
 include/net/bpf_sk_storage.h                  |   2 +
 include/net/sock.h                            |   4 +-
 include/uapi/linux/bpf.h                      |  54 +-
 kernel/bpf/Makefile                           |   2 +
 kernel/bpf/bpf_inode_storage.c                | 333 +++++++
 kernel/bpf/bpf_local_storage.c                | 517 +++++++++++
 kernel/bpf/bpf_lsm.c                          |  21 +-
 kernel/bpf/syscall.c                          |   3 +-
 kernel/bpf/verifier.c                         |  10 +
 net/core/bpf_sk_storage.c                     | 825 ++++--------------
 security/bpf/hooks.c                          |   7 +
 .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
 tools/bpf/bpftool/bash-completion/bpftool     |   3 +-
 tools/bpf/bpftool/map.c                       |   3 +-
 tools/include/uapi/linux/bpf.h                |  54 +-
 tools/lib/bpf/libbpf_probes.c                 |   5 +-
 .../bpf/prog_tests/test_local_storage.c       |  60 ++
 .../selftests/bpf/progs/local_storage.c       | 136 +++
 22 files changed, 1599 insertions(+), 670 deletions(-)
 create mode 100644 include/linux/bpf_local_storage.h
 create mode 100644 kernel/bpf/bpf_inode_storage.c
 create mode 100644 kernel/bpf/bpf_local_storage.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_local_storage.c
 create mode 100644 tools/testing/selftests/bpf/progs/local_storage.c

-- 
2.27.0.389.gc38d7665816-goog



* Re: [PATCH bpf-next v3 2/4] bpf: Implement bpf_local_storage for inodes
From: KP Singh @ 2020-07-09  9:44 UTC (permalink / raw)
  To: kernel test robot
  Cc: KP Singh, linux-kernel, bpf, linux-security-module, kbuild-all,
	Alexei Starovoitov, Daniel Borkmann, Paul Turner, Jann Horn,
	Florent Revest
In-Reply-To: <202007091250.vqzrSanp%lkp@intel.com>

On 09-Jul 12:37, kernel test robot wrote:
> Hi KP,
> 
> I love your patch! Perhaps something to improve:
> 
> [auto build test WARNING on bpf-next/master]
> 
> url:    https://github.com/0day-ci/linux/commits/KP-Singh/Generalizing-bpf_local_storage/20200709-085810
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> config: mips-allyesconfig (attached as .config)
> compiler: mips-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=mips 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
> >> kernel/bpf/bpf_inode_storage.c:274:17: warning: no previous prototype for 'inode_storage_map_alloc' [-Wmissing-prototypes]
>      274 | struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
>          |                 ^~~~~~~~~~~~~~~~~~~~~~~
> >> kernel/bpf/bpf_inode_storage.c:286:6: warning: no previous prototype for 'inode_storage_map_free' [-Wmissing-prototypes]
>      286 | void inode_storage_map_free(struct bpf_map *map)
>          |      ^~~~~~~~~~~~~~~~~~~~~~

Thanks! Should have been static. Fixed these. Will send a v4 with
these fixes.
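
For context, a small standalone sketch of the two usual ways to satisfy
-Wmissing-prototypes (names are hypothetical, this is not the patch code):
either the function is file-local and gets marked static, or it is meant
to be called from other files and a prototype from a shared header is
visible before the definition.

```c
#include <assert.h>

/* Stand-in for a shared header: with this prototype in scope, the
 * external definition of model_clone() below does not warn.
 */
int model_clone(int value);

/* File-local helper: static gives it internal linkage, so the compiler
 * does not expect a prototype for it.
 */
static int model_map_alloc(int idx)
{
	assert(idx >= 0);
	return idx * 2;
}

int model_clone(int value)
{
	return model_map_alloc(value) + 1;
}
```

Building this with "gcc -Wmissing-prototypes -c" should stay quiet;
removing either the static keyword or the prototype line is expected to
bring back a "no previous prototype" warning like the ones quoted here.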

- KP 

> 
> vim +/inode_storage_map_alloc +274 kernel/bpf/bpf_inode_storage.c
> 
>    273	
>  > 274	struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
>    275	{
>    276		struct bpf_local_storage_map *smap;
>    277	
>    278		smap = bpf_local_storage_map_alloc(attr);
>    279		if (IS_ERR(smap))
>    280			return ERR_CAST(smap);
>    281	
>    282		smap->cache_idx = cache_idx_get_inode();
>    283		return &smap->map;
>    284	}
>    285	
>  > 286	void inode_storage_map_free(struct bpf_map *map)
>    287	{
>    288		struct bpf_local_storage_map *smap;
>    289	
>    290		smap = (struct bpf_local_storage_map *)map;
>    291		cache_idx_free_inode(smap->cache_idx);
>    292		bpf_local_storage_map_free(smap);
>    293	}
>    294	
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org




* Re: [PATCH bpf-next v3 1/4] bpf: Generalize bpf_sk_storage
From: KP Singh @ 2020-07-09  9:43 UTC (permalink / raw)
  To: kernel test robot
  Cc: KP Singh, linux-kernel, bpf, linux-security-module, kbuild-all,
	Alexei Starovoitov, Daniel Borkmann, Paul Turner, Jann Horn,
	Florent Revest
In-Reply-To: <202007091053.Se7i8FMj%lkp@intel.com>

On 09-Jul 10:49, kernel test robot wrote:
> Hi KP,
> 
> I love your patch! Perhaps something to improve:
> 
> [auto build test WARNING on bpf-next/master]
> 
> url:    https://github.com/0day-ci/linux/commits/KP-Singh/Generalizing-bpf_local_storage/20200709-085810
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> config: alpha-allyesconfig (attached as .config)
> compiler: alpha-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=alpha 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
> >> net/core/bpf_sk_storage.c:170:6: warning: no previous prototype for 'bpf_sk_storage_free' [-Wmissing-prototypes]
>      170 | void bpf_sk_storage_free(struct sock *sk)

Thanks! Fixed with a missing include to bpf_sk_storage.h in
bpf_sk_storage.c

>          |      ^~~~~~~~~~~~~~~~~~~
> >> net/core/bpf_sk_storage.c:280:5: warning: no previous prototype for 'bpf_sk_storage_clone' [-Wmissing-prototypes]
>      280 | int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
>          |     ^~~~~~~~~~~~~~~~~~~~
> >> net/core/bpf_sk_storage.c:401:17: warning: no previous prototype for 'sk_storage_map_alloc' [-Wmissing-prototypes]
>      401 | struct bpf_map *sk_storage_map_alloc(union bpf_attr *attr)
>          |                 ^~~~~~~~~~~~~~~~~~~~

should have been static. Fixed.

> >> net/core/bpf_sk_storage.c:413:6: warning: no previous prototype for 'sk_storage_map_free' [-Wmissing-prototypes]
>      413 | void sk_storage_map_free(struct bpf_map *map)
>          |      ^~~~~~~~~~~~~~~~~~~

Ditto. Fixed.


- KP

> >> net/core/bpf_sk_storage.c:483:6: warning: no previous prototype for 'bpf_sk_storage_diag_free' [-Wmissing-prototypes]
>      483 | void bpf_sk_storage_diag_free(struct bpf_sk_storage_diag *diag)
>          |      ^~~~~~~~~~~~~~~~~~~~~~~~
> >> net/core/bpf_sk_storage.c:511:1: warning: no previous prototype for 'bpf_sk_storage_diag_alloc' [-Wmissing-prototypes]
>      511 | bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
>          | ^~~~~~~~~~~~~~~~~~~~~~~~~
> >> net/core/bpf_sk_storage.c:658:5: warning: no previous prototype for 'bpf_sk_storage_diag_put' [-Wmissing-prototypes]
>      658 | int bpf_sk_storage_diag_put(struct bpf_sk_storage_diag *diag,
>          |     ^~~~~~~~~~~~~~~~~~~~~~~
> 
> vim +/bpf_sk_storage_free +170 net/core/bpf_sk_storage.c
> 
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  168  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  169  /* Called by __sk_destruct() & bpf_sk_storage_clone() */
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26 @170  void bpf_sk_storage_free(struct sock *sk)
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  171  {
> 9af362a775d83f KP Singh           2020-07-09  172  	struct bpf_local_storage_elem *selem;
> 9af362a775d83f KP Singh           2020-07-09  173  	struct bpf_local_storage *sk_storage;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  174  	bool free_sk_storage = false;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  175  	struct hlist_node *n;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  176  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  177  	rcu_read_lock();
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  178  	sk_storage = rcu_dereference(sk->sk_bpf_storage);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  179  	if (!sk_storage) {
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  180  		rcu_read_unlock();
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  181  		return;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  182  	}
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  183  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  184  	/* Netiher the bpf_prog nor the bpf-map's syscall
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  185  	 * could be modifying the sk_storage->list now.
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  186  	 * Thus, no elem can be added-to or deleted-from the
> 9af362a775d83f KP Singh           2020-07-09  187  	 * local_storage->list by the bpf_prog or by the bpf-map's syscall.
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  188  	 *
> 9af362a775d83f KP Singh           2020-07-09  189  	 * It is racing with bpf_local_storage_map_free() alone
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  190  	 * when unlinking elem from the sk_storage->list and
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  191  	 * the map's bucket->list.
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  192  	 */
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  193  	raw_spin_lock_bh(&sk_storage->lock);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  194  	hlist_for_each_entry_safe(selem, n, &sk_storage->list, snode) {
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  195  		/* Always unlink from map before unlinking from
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  196  		 * sk_storage.
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  197  		 */
> 9af362a775d83f KP Singh           2020-07-09  198  		bpf_selem_unlink_map(selem);
> 9af362a775d83f KP Singh           2020-07-09  199  		free_sk_storage = bpf_selem_unlink(sk_storage, selem, true);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  200  	}
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  201  	raw_spin_unlock_bh(&sk_storage->lock);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  202  	rcu_read_unlock();
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  203  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  204  	if (free_sk_storage)
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  205  		kfree_rcu(sk_storage, rcu);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  206  }
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  207  
> 9af362a775d83f KP Singh           2020-07-09  208  static void *bpf_sk_storage_lookup_elem(struct bpf_map *map, void *key)
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  209  {
> 9af362a775d83f KP Singh           2020-07-09  210  	struct bpf_local_storage_data *sdata;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  211  	struct socket *sock;
> 9af362a775d83f KP Singh           2020-07-09  212  	int fd, err = -EINVAL;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  213  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  214  	fd = *(int *)key;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  215  	sock = sockfd_lookup(fd, &err);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  216  	if (sock) {
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  217  		sdata = sk_storage_lookup(sock->sk, map, true);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  218  		sockfd_put(sock);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  219  		return sdata ? sdata->data : NULL;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  220  	}
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  221  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  222  	return ERR_PTR(err);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  223  }
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  224  
> 9af362a775d83f KP Singh           2020-07-09  225  static int bpf_sk_storage_update_elem(struct bpf_map *map, void *key,
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  226  				      void *value, u64 map_flags)
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  227  {
> 9af362a775d83f KP Singh           2020-07-09  228  	struct bpf_local_storage_data *sdata;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  229  	struct socket *sock;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  230  	int fd, err;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  231  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  232  	fd = *(int *)key;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  233  	sock = sockfd_lookup(fd, &err);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  234  	if (sock) {
> 9af362a775d83f KP Singh           2020-07-09  235  		sdata = map->ops->map_local_storage_update(sock->sk, map, value,
> 9af362a775d83f KP Singh           2020-07-09  236  							   map_flags);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  237  		sockfd_put(sock);
> 71f150f4c2af5f YueHaibing         2019-04-29  238  		return PTR_ERR_OR_ZERO(sdata);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  239  	}
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  240  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  241  	return err;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  242  }
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  243  
> 9af362a775d83f KP Singh           2020-07-09  244  static int bpf_sk_storage_delete_elem(struct bpf_map *map, void *key)
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  245  {
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  246  	struct socket *sock;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  247  	int fd, err;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  248  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  249  	fd = *(int *)key;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  250  	sock = sockfd_lookup(fd, &err);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  251  	if (sock) {
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  252  		err = sk_storage_delete(sock->sk, map);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  253  		sockfd_put(sock);
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  254  	}
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  255  
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  256  	return err;
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  257  }
> 6ac99e8f23d4b1 Martin KaFai Lau   2019-04-26  258  
> 9af362a775d83f KP Singh           2020-07-09  259  static struct bpf_local_storage_elem *
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  260  bpf_sk_storage_clone_elem(struct sock *newsk,
> 9af362a775d83f KP Singh           2020-07-09  261  			  struct bpf_local_storage_map *smap,
> 9af362a775d83f KP Singh           2020-07-09  262  			  struct bpf_local_storage_elem *selem)
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  263  {
> 9af362a775d83f KP Singh           2020-07-09  264  	struct bpf_local_storage_elem *copy_selem;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  265  
> 9af362a775d83f KP Singh           2020-07-09  266  	copy_selem = sk_selem_alloc(smap, newsk, NULL, true);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  267  	if (!copy_selem)
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  268  		return NULL;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  269  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  270  	if (map_value_has_spin_lock(&smap->map))
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  271  		copy_map_value_locked(&smap->map, SDATA(copy_selem)->data,
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  272  				      SDATA(selem)->data, true);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  273  	else
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  274  		copy_map_value(&smap->map, SDATA(copy_selem)->data,
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  275  			       SDATA(selem)->data);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  276  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  277  	return copy_selem;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  278  }
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  279  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14 @280  int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  281  {
> 9af362a775d83f KP Singh           2020-07-09  282  	struct bpf_local_storage *new_sk_storage = NULL;
> 9af362a775d83f KP Singh           2020-07-09  283  	struct bpf_local_storage *sk_storage;
> 9af362a775d83f KP Singh           2020-07-09  284  	struct bpf_local_storage_elem *selem;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  285  	int ret = 0;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  286  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  287  	RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  288  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  289  	rcu_read_lock();
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  290  	sk_storage = rcu_dereference(sk->sk_bpf_storage);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  291  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  292  	if (!sk_storage || hlist_empty(&sk_storage->list))
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  293  		goto out;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  294  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  295  	hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
> 9af362a775d83f KP Singh           2020-07-09  296  		struct bpf_local_storage_elem *copy_selem;
> 9af362a775d83f KP Singh           2020-07-09  297  		struct bpf_local_storage_map *smap;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  298  		struct bpf_map *map;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  299  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  300  		smap = rcu_dereference(SDATA(selem)->smap);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  301  		if (!(smap->map.map_flags & BPF_F_CLONE))
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  302  			continue;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  303  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  304  		/* Note that for lockless listeners adding new element
> 9af362a775d83f KP Singh           2020-07-09  305  		 * here can race with cleanup in bpf_local_storage_map_free.
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  306  		 * Try to grab map refcnt to make sure that it's still
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  307  		 * alive and prevent concurrent removal.
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  308  		 */
> 1e0bd5a091e5d9 Andrii Nakryiko    2019-11-17  309  		map = bpf_map_inc_not_zero(&smap->map);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  310  		if (IS_ERR(map))
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  311  			continue;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  312  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  313  		copy_selem = bpf_sk_storage_clone_elem(newsk, smap, selem);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  314  		if (!copy_selem) {
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  315  			ret = -ENOMEM;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  316  			bpf_map_put(map);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  317  			goto out;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  318  		}
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  319  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  320  		if (new_sk_storage) {
> 9af362a775d83f KP Singh           2020-07-09  321  			bpf_selem_link_map(smap, copy_selem);
> 9af362a775d83f KP Singh           2020-07-09  322  			bpf_selem_link(new_sk_storage, copy_selem);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  323  		} else {
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  324  			ret = sk_storage_alloc(newsk, smap, copy_selem);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  325  			if (ret) {
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  326  				kfree(copy_selem);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  327  				atomic_sub(smap->elem_size,
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  328  					   &newsk->sk_omem_alloc);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  329  				bpf_map_put(map);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  330  				goto out;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  331  			}
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  332  
> 9af362a775d83f KP Singh           2020-07-09  333  			new_sk_storage =
> 9af362a775d83f KP Singh           2020-07-09  334  				rcu_dereference(copy_selem->local_storage);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  335  		}
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  336  		bpf_map_put(map);
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  337  	}
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  338  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  339  out:
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  340  	rcu_read_unlock();
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  341  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  342  	/* In case of an error, don't free anything explicitly here, the
> 9af362a775d83f KP Singh           2020-07-09  343  	 * caller is responsible to call bpf_local_storage_free.
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  344  	 */
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  345  
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  346  	return ret;
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  347  }
> 8f51dfc73bf181 Stanislav Fomichev 2019-08-14  348  
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org




* [PATCH v5 0/8] crypto: introduce OSCCA certificate and SM2 asymmetric algorithm
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang

Hello all,

This new module implements the OSCCA certificate and the SM2 public key
algorithm. It was published by the State Encryption Management Bureau, China.
List of specifications for the OSCCA certificate and SM2 elliptic curve
public key cryptography:

* GM/T 0003.1-2012
* GM/T 0003.2-2012
* GM/T 0003.3-2012
* GM/T 0003.4-2012
* GM/T 0003.5-2012
* GM/T 0015-2012
* GM/T 0009-2012 

IETF: https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
oscca: http://www.oscca.gov.cn/sca/xxgk/2010-12/17/content_1002386.shtml
scctc: http://www.gmbz.org.cn/main/bzlb.html

These patches add the object identifiers (OIDs) defined by OSCCA. The
X.509 certificate code supports parsing and verification of
sm2-with-sm3 certificates.

The sm2 implementation is based on libgcrypt's mpi implementation. It makes
some additions to the kernel's existing mpi library and adds an ec
implementation to better support elliptic-curve algorithms.

sm2 is well supported in both the openssl and gnupg projects, and sm3 and
sm4 of the OSCCA algorithm family have already been implemented in the
kernel.

This group of patches newly introduces sm2. To implement sm2 more
completely, I expanded the mpi library and introduced its ec
implementation as the basic algorithm. Compared to the kernel's
crypto/ecc.c, the implementation in mpi/ec.c is more complete and
elegant; sm2 is implemented on top of it.

---
v5 changes:
  1. fix compilation failure when SM2 is configured as a module.
  2. simplify the mpi and ec code, remove unused functions reported by test robot.

v4 changes:
  1. Pass data directly when calculating sm2 certificate digest
  2. rebase on mainline.

v3 changes:
  1. integrity: asymmetric digsig supports the sm2-with-sm3 algorithm.
  2. remove unused sm2_set_priv_key().
  3. rebase on mainline.

v2 changes:
  1. simplify the sm2 algorithm and only retain the verify function.
  2. extract the sm2 certificate code into a separate file.

Tianjia Zhang (8):
  crypto: sm3 - export crypto_sm3_final function
  lib/mpi: Extend the MPI library
  lib/mpi: Introduce ec implementation to MPI library
  crypto: sm2 - introduce OSCCA SM2 asymmetric cipher algorithm
  crypto: testmgr - support test with different ciphertext per
    encryption
  X.509: support OSCCA certificate parse
  X.509: support OSCCA sm2-with-sm3 certificate verification
  integrity: Asymmetric digsig supports SM2-with-SM3 algorithm

 crypto/Kconfig                            |   17 +
 crypto/Makefile                           |    8 +
 crypto/asymmetric_keys/Makefile           |    1 +
 crypto/asymmetric_keys/public_key.c       |    6 +
 crypto/asymmetric_keys/public_key_sm2.c   |   61 +
 crypto/asymmetric_keys/x509_cert_parser.c |   14 +-
 crypto/asymmetric_keys/x509_public_key.c  |    3 +
 crypto/sm2.c                              |  473 +++++++
 crypto/sm2signature.asn1                  |    4 +
 crypto/sm3_generic.c                      |    7 +-
 crypto/testmgr.c                          |    7 +-
 include/crypto/public_key.h               |   15 +
 include/crypto/sm2.h                      |   25 +
 include/crypto/sm3.h                      |    2 +
 include/linux/mpi.h                       |  193 +++
 include/linux/oid_registry.h              |    6 +
 lib/mpi/Makefile                          |    6 +
 lib/mpi/ec.c                              | 1509 +++++++++++++++++++++
 lib/mpi/mpi-add.c                         |  207 +++
 lib/mpi/mpi-bit.c                         |  251 ++++
 lib/mpi/mpi-cmp.c                         |   46 +-
 lib/mpi/mpi-div.c                         |  238 ++++
 lib/mpi/mpi-internal.h                    |   53 +
 lib/mpi/mpi-inv.c                         |  143 ++
 lib/mpi/mpi-mod.c                         |  155 +++
 lib/mpi/mpi-mul.c                         |   94 ++
 lib/mpi/mpicoder.c                        |  336 +++++
 lib/mpi/mpih-div.c                        |  294 ++++
 lib/mpi/mpih-mul.c                        |   25 +
 lib/mpi/mpiutil.c                         |  204 +++
 security/integrity/digsig_asymmetric.c    |   14 +-
 31 files changed, 4399 insertions(+), 18 deletions(-)
 create mode 100644 crypto/asymmetric_keys/public_key_sm2.c
 create mode 100644 crypto/sm2.c
 create mode 100644 crypto/sm2signature.asn1
 create mode 100644 include/crypto/sm2.h
 create mode 100644 lib/mpi/ec.c
 create mode 100644 lib/mpi/mpi-add.c
 create mode 100644 lib/mpi/mpi-div.c
 create mode 100644 lib/mpi/mpi-inv.c
 create mode 100644 lib/mpi/mpi-mod.c
 create mode 100644 lib/mpi/mpi-mul.c

-- 
2.17.1


* [PATCH v5 1/8] crypto: sm3 - export crypto_sm3_final function
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

Both crypto_sm3_update and crypto_sm3_finup are already exported.
Export crypto_sm3_final as well, so that callers do not have to
use crypto_sm3_finup(desc, NULL, 0, dgst) to calculate
the hash in some cases.
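
To illustrate why the two calls are interchangeable, here is a toy
userspace sketch of the update/final/finup pattern. It uses a simple
FNV-1a style state instead of SM3, and all toy_* names are hypothetical;
it only mirrors the structure of the functions touched by this patch,
where finup is an update followed by final.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy hash state standing in for the shash descriptor. This is NOT SM3. */
struct toy_desc { uint32_t h; };

void toy_init(struct toy_desc *d)
{
	d->h = 2166136261u;		/* FNV-1a 32-bit offset basis */
}

int toy_update(struct toy_desc *d, const uint8_t *data, size_t len)
{
	assert(data || len == 0);	/* NULL is fine only for empty input */
	for (size_t i = 0; i < len; i++) {
		d->h ^= data[i];
		d->h *= 16777619u;	/* FNV-1a 32-bit prime */
	}
	return 0;
}

int toy_final(struct toy_desc *d, uint32_t *out)
{
	*out = d->h;
	return 0;
}

/* finup = update with the trailing data, then final. */
int toy_finup(struct toy_desc *d, const uint8_t *data, size_t len,
	      uint32_t *out)
{
	toy_update(d, data, len);
	return toy_final(d, out);
}
```

With this shape, toy_finup(d, NULL, 0, out) degenerates into toy_final(d,
out), which is the workaround an exported final function makes
unnecessary.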

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 crypto/sm3_generic.c | 7 ++++---
 include/crypto/sm3.h | 2 ++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/crypto/sm3_generic.c b/crypto/sm3_generic.c
index 3468975215ca..193c4584bd00 100644
--- a/crypto/sm3_generic.c
+++ b/crypto/sm3_generic.c
@@ -149,17 +149,18 @@ int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
 }
 EXPORT_SYMBOL(crypto_sm3_update);
 
-static int sm3_final(struct shash_desc *desc, u8 *out)
+int crypto_sm3_final(struct shash_desc *desc, u8 *out)
 {
 	sm3_base_do_finalize(desc, sm3_generic_block_fn);
 	return sm3_base_finish(desc, out);
 }
+EXPORT_SYMBOL(crypto_sm3_final);
 
 int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
 			unsigned int len, u8 *hash)
 {
 	sm3_base_do_update(desc, data, len, sm3_generic_block_fn);
-	return sm3_final(desc, hash);
+	return crypto_sm3_final(desc, hash);
 }
 EXPORT_SYMBOL(crypto_sm3_finup);
 
@@ -167,7 +168,7 @@ static struct shash_alg sm3_alg = {
 	.digestsize	=	SM3_DIGEST_SIZE,
 	.init		=	sm3_base_init,
 	.update		=	crypto_sm3_update,
-	.final		=	sm3_final,
+	.final		=	crypto_sm3_final,
 	.finup		=	crypto_sm3_finup,
 	.descsize	=	sizeof(struct sm3_state),
 	.base		=	{
diff --git a/include/crypto/sm3.h b/include/crypto/sm3.h
index 1438942dc773..42ea21289ba9 100644
--- a/include/crypto/sm3.h
+++ b/include/crypto/sm3.h
@@ -35,6 +35,8 @@ struct shash_desc;
 extern int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
 			      unsigned int len);
 
+extern int crypto_sm3_final(struct shash_desc *desc, u8 *out);
+
 extern int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
 			     unsigned int len, u8 *hash);
 #endif
-- 
2.17.1



* [PATCH v5 6/8] X.509: support OSCCA certificate parse
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

The digital certificate format based on the SM2 crypto algorithm is
specified in GM/T 0015-2012, published by the State Encryption
Management Bureau, China.

This patch adds the object identifiers (OIDs) defined by OSCCA and
teaches the X.509 parser to handle sm2-with-sm3 certificates. Such a
certificate uses the standard elliptic curve public key, and the sm2
algorithm signs the hash generated by sm3.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 crypto/asymmetric_keys/x509_cert_parser.c | 14 +++++++++++++-
 include/linux/oid_registry.h              |  6 ++++++
 2 files changed, 19 insertions(+), 1 deletion(-)
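
For readers unfamiliar with how a dotted OID such as 1.2.156.10197.1.501
(OID_sm2_with_sm3 in this patch) appears inside a certificate, the
following is a minimal userspace sketch of the standard DER content-octet
encoding: the first two arcs are folded into one value, and every value is
written in base 128 with a continuation bit. oid_encode() is a hypothetical
helper written for this note; it is not the kernel's OID registry code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode OID arcs as DER content octets (without the 0x06 tag and the
 * length byte).  Returns the number of bytes written, or 0 on error.
 * Hypothetical helper for illustration only.
 */
static size_t oid_encode(const uint32_t *arcs, size_t narcs,
			 uint8_t *out, size_t outlen)
{
	size_t pos = 0, i;

	if (narcs < 2 || arcs[0] > 2 || (arcs[0] < 2 && arcs[1] > 39))
		return 0;
	if (arcs[1] > UINT32_MAX - 80)
		return 0;	/* first value would overflow 32 bits */

	for (i = 1; i < narcs; i++) {
		/* The first two arcs share one value: 40 * arc0 + arc1. */
		uint32_t v = (i == 1) ? arcs[0] * 40 + arcs[1] : arcs[i];
		uint8_t tmp[5];	/* 32 bits need at most 5 base-128 digits */
		int n = 0;

		/* Collect base-128 digits, least significant first. */
		do {
			tmp[n++] = v & 0x7f;
			v >>= 7;
		} while (v);

		if (pos + (size_t)n > outlen)
			return 0;

		/* Emit most significant digit first; all but the last
		 * digit carry the continuation bit 0x80.
		 */
		while (n--)
			out[pos++] = tmp[n] | (n ? 0x80 : 0x00);
	}
	return pos;
}
```

Called with the arcs {1, 2, 156, 10197, 1, 501}, this yields the eight
bytes 2A 81 1C CF 55 01 83 75.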

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
index 26ec20ef4899..6a8aee22bfd4 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -234,6 +234,10 @@ int x509_note_pkey_algo(void *context, size_t hdrlen,
 	case OID_gost2012Signature512:
 		ctx->cert->sig->hash_algo = "streebog512";
 		goto ecrdsa;
+
+	case OID_sm2_with_sm3:
+		ctx->cert->sig->hash_algo = "sm3";
+		goto sm2;
 	}
 
 rsa_pkcs1:
@@ -246,6 +250,11 @@ int x509_note_pkey_algo(void *context, size_t hdrlen,
 	ctx->cert->sig->encoding = "raw";
 	ctx->algo_oid = ctx->last_oid;
 	return 0;
+sm2:
+	ctx->cert->sig->pkey_algo = "sm2";
+	ctx->cert->sig->encoding = "raw";
+	ctx->algo_oid = ctx->last_oid;
+	return 0;
 }
 
 /*
@@ -266,7 +275,8 @@ int x509_note_signature(void *context, size_t hdrlen,
 	}
 
 	if (strcmp(ctx->cert->sig->pkey_algo, "rsa") == 0 ||
-	    strcmp(ctx->cert->sig->pkey_algo, "ecrdsa") == 0) {
+	    strcmp(ctx->cert->sig->pkey_algo, "ecrdsa") == 0 ||
+	    strcmp(ctx->cert->sig->pkey_algo, "sm2") == 0) {
 		/* Discard the BIT STRING metadata */
 		if (vlen < 1 || *(const u8 *)value != 0)
 			return -EBADMSG;
@@ -456,6 +466,8 @@ int x509_extract_key_data(void *context, size_t hdrlen,
 	else if (ctx->last_oid == OID_gost2012PKey256 ||
 		 ctx->last_oid == OID_gost2012PKey512)
 		ctx->cert->pub->pkey_algo = "ecrdsa";
+	else if (ctx->last_oid == OID_id_ecPublicKey)
+		ctx->cert->pub->pkey_algo = "sm2";
 	else
 		return -ENOPKG;
 
diff --git a/include/linux/oid_registry.h b/include/linux/oid_registry.h
index 657d6bf2c064..48fe3133ff39 100644
--- a/include/linux/oid_registry.h
+++ b/include/linux/oid_registry.h
@@ -107,6 +107,12 @@ enum OID {
 	OID_gostTC26Sign512B,		/* 1.2.643.7.1.2.1.2.2 */
 	OID_gostTC26Sign512C,		/* 1.2.643.7.1.2.1.2.3 */
 
+	/* OSCCA */
+	OID_sm2,			/* 1.2.156.10197.1.301 */
+	OID_sm3,			/* 1.2.156.10197.1.401 */
+	OID_sm2_with_sm3,		/* 1.2.156.10197.1.501 */
+	OID_sm3WithRSAEncryption,	/* 1.2.156.10197.1.504 */
+
 	OID__NR
 };
 
-- 
2.17.1



* [PATCH v5 2/8] lib/mpi: Extend the MPI library
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

Extend the mpi library with functions ported from libgcrypt. The
libgcrypt-based ECC implementation for mpi requires these functions,
and other algorithms, such as SM2, will be built on top of mpi ecc.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 include/linux/mpi.h    |  88 +++++++++++
 lib/mpi/Makefile       |   5 +
 lib/mpi/mpi-add.c      | 207 +++++++++++++++++++++++++
 lib/mpi/mpi-bit.c      | 251 ++++++++++++++++++++++++++++++
 lib/mpi/mpi-cmp.c      |  46 ++++--
 lib/mpi/mpi-div.c      | 238 +++++++++++++++++++++++++++++
 lib/mpi/mpi-internal.h |  53 +++++++
 lib/mpi/mpi-inv.c      | 143 ++++++++++++++++++
 lib/mpi/mpi-mod.c      | 155 +++++++++++++++++++
 lib/mpi/mpi-mul.c      |  94 ++++++++++++
 lib/mpi/mpicoder.c     | 336 +++++++++++++++++++++++++++++++++++++++++
 lib/mpi/mpih-div.c     | 294 ++++++++++++++++++++++++++++++++++++
 lib/mpi/mpih-mul.c     |  25 +++
 lib/mpi/mpiutil.c      | 204 +++++++++++++++++++++++++
 14 files changed, 2129 insertions(+), 10 deletions(-)
 create mode 100644 lib/mpi/mpi-add.c
 create mode 100644 lib/mpi/mpi-div.c
 create mode 100644 lib/mpi/mpi-inv.c
 create mode 100644 lib/mpi/mpi-mod.c
 create mode 100644 lib/mpi/mpi-mul.c
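
As background for the functions added below, here is a small userspace
sketch of the limb representation the library works on. It assumes 64-bit
limbs stored least significant first; limb_t, LIMB_BITS, limbs_for_bits()
and limbs_add_n() are hypothetical names for this note and are not part of
lib/mpi. limbs_for_bits() uses the same rounding as mpi_new() in this
patch, and the carry returned by limbs_add_n() is why mpi_add() reserves
usize + 1 limbs for the result.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;	/* hypothetical stand-in for mpi_limb_t */
#define LIMB_BITS 64u		/* assumed limb width for this sketch */

/* Number of limbs needed to hold nbits bits (rounds up). */
static unsigned int limbs_for_bits(unsigned int nbits)
{
	return (nbits + LIMB_BITS - 1) / LIMB_BITS;
}

/*
 * w = u + v over n limbs, least significant limb first.
 * Returns the carry out of the top limb (0 or 1).  w may alias u or v.
 */
static limb_t limbs_add_n(limb_t *w, const limb_t *u, const limb_t *v,
			  size_t n)
{
	limb_t carry = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		limb_t s = u[i] + carry;	/* may wrap to 0 */
		limb_t c1 = s < carry;		/* wrapped on carry-in? */

		w[i] = s + v[i];
		carry = c1 | (w[i] < s);	/* wrapped on v[i]? */
	}
	return carry;
}
```

Adding {UINT64_MAX, 0} and {1, 0} gives {0, 1} with no carry out, while
adding the single limbs UINT64_MAX and 1 gives 0 with a carry of 1, which
the caller has to store in an extra limb.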

diff --git a/include/linux/mpi.h b/include/linux/mpi.h
index 7bd6d8af0004..2dddf4c6e011 100644
--- a/include/linux/mpi.h
+++ b/include/linux/mpi.h
@@ -40,21 +40,79 @@ struct gcry_mpi {
 typedef struct gcry_mpi *MPI;
 
 #define mpi_get_nlimbs(a)     ((a)->nlimbs)
+#define mpi_has_sign(a)       ((a)->sign)
 
 /*-- mpiutil.c --*/
 MPI mpi_alloc(unsigned nlimbs);
+void mpi_clear(MPI a);
 void mpi_free(MPI a);
 int mpi_resize(MPI a, unsigned nlimbs);
 
+static inline MPI mpi_new(unsigned int nbits)
+{
+	return mpi_alloc((nbits + BITS_PER_MPI_LIMB - 1) / BITS_PER_MPI_LIMB);
+}
+
+MPI mpi_copy(MPI a);
+MPI mpi_alloc_like(MPI a);
+void mpi_snatch(MPI w, MPI u);
+MPI mpi_set(MPI w, MPI u);
+MPI mpi_set_ui(MPI w, unsigned long u);
+MPI mpi_alloc_set_ui(unsigned long u);
+void mpi_swap_cond(MPI a, MPI b, unsigned long swap);
+
+/* Constants used to return constant MPIs.  See mpi_init if you
+ * want to add more constants.
+ */
+#define MPI_NUMBER_OF_CONSTANTS 6
+enum gcry_mpi_constants {
+	MPI_C_ZERO,
+	MPI_C_ONE,
+	MPI_C_TWO,
+	MPI_C_THREE,
+	MPI_C_FOUR,
+	MPI_C_EIGHT
+};
+
+MPI mpi_const(enum gcry_mpi_constants no);
+
 /*-- mpicoder.c --*/
+
+/* Different formats of external big integer representation. */
+enum gcry_mpi_format {
+	GCRYMPI_FMT_NONE = 0,
+	GCRYMPI_FMT_STD = 1,    /* Twos complement stored without length. */
+	GCRYMPI_FMT_PGP = 2,    /* As used by OpenPGP (unsigned only). */
+	GCRYMPI_FMT_SSH = 3,    /* As used by SSH (like STD but with length). */
+	GCRYMPI_FMT_HEX = 4,    /* Hex format. */
+	GCRYMPI_FMT_USG = 5,    /* Like STD but unsigned. */
+	GCRYMPI_FMT_OPAQUE = 8  /* Opaque format (some functions only). */
+};
+
 MPI mpi_read_raw_data(const void *xbuffer, size_t nbytes);
 MPI mpi_read_from_buffer(const void *buffer, unsigned *ret_nread);
+int mpi_fromstr(MPI val, const char *str);
+MPI mpi_scanval(const char *string);
 MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int len);
 void *mpi_get_buffer(MPI a, unsigned *nbytes, int *sign);
 int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
 		    int *sign);
 int mpi_write_to_sgl(MPI a, struct scatterlist *sg, unsigned nbytes,
 		     int *sign);
+int mpi_print(enum gcry_mpi_format format, unsigned char *buffer,
+			size_t buflen, size_t *nwritten, MPI a);
+
+/*-- mpi-mod.c --*/
+void mpi_mod(MPI rem, MPI dividend, MPI divisor);
+
+/* Context used with Barrett reduction.  */
+struct barrett_ctx_s;
+typedef struct barrett_ctx_s *mpi_barrett_t;
+
+mpi_barrett_t mpi_barrett_init(MPI m, int copy);
+void mpi_barrett_free(mpi_barrett_t ctx);
+void mpi_mod_barrett(MPI r, MPI x, mpi_barrett_t ctx);
+void mpi_mul_barrett(MPI w, MPI u, MPI v, mpi_barrett_t ctx);
 
 /*-- mpi-pow.c --*/
 int mpi_powm(MPI res, MPI base, MPI exp, MPI mod);
@@ -62,10 +120,40 @@ int mpi_powm(MPI res, MPI base, MPI exp, MPI mod);
 /*-- mpi-cmp.c --*/
 int mpi_cmp_ui(MPI u, ulong v);
 int mpi_cmp(MPI u, MPI v);
+int mpi_cmpabs(MPI u, MPI v);
 
 /*-- mpi-bit.c --*/
 void mpi_normalize(MPI a);
 unsigned mpi_get_nbits(MPI a);
+int mpi_test_bit(MPI a, unsigned int n);
+void mpi_set_bit(MPI a, unsigned int n);
+void mpi_set_highbit(MPI a, unsigned int n);
+void mpi_clear_highbit(MPI a, unsigned int n);
+void mpi_clear_bit(MPI a, unsigned int n);
+void mpi_rshift_limbs(MPI a, unsigned int count);
+void mpi_rshift(MPI x, MPI a, unsigned int n);
+void mpi_lshift_limbs(MPI a, unsigned int count);
+void mpi_lshift(MPI x, MPI a, unsigned int n);
+
+/*-- mpi-add.c --*/
+void mpi_add_ui(MPI w, MPI u, unsigned long v);
+void mpi_add(MPI w, MPI u, MPI v);
+void mpi_sub_ui(MPI w, MPI u, unsigned long v);
+void mpi_sub(MPI w, MPI u, MPI v);
+void mpi_addm(MPI w, MPI u, MPI v, MPI m);
+void mpi_subm(MPI w, MPI u, MPI v, MPI m);
+
+/*-- mpi-mul.c --*/
+void mpi_mul(MPI w, MPI u, MPI v);
+void mpi_mulm(MPI w, MPI u, MPI v, MPI m);
+
+/*-- mpi-div.c --*/
+void mpi_tdiv_r(MPI rem, MPI num, MPI den);
+void mpi_fdiv_r(MPI rem, MPI dividend, MPI divisor);
+void mpi_fdiv_q(MPI quot, MPI dividend, MPI divisor);
+
+/*-- mpi-inv.c --*/
+int mpi_invm(MPI x, MPI a, MPI n);
 
 /* inline functions */
 
diff --git a/lib/mpi/Makefile b/lib/mpi/Makefile
index d5874a7f5ff9..5f40f93ff3d9 100644
--- a/lib/mpi/Makefile
+++ b/lib/mpi/Makefile
@@ -14,8 +14,13 @@ mpi-y = \
 	generic_mpih-sub1.o		\
 	generic_mpih-add1.o		\
 	mpicoder.o			\
+	mpi-add.o			\
 	mpi-bit.o			\
 	mpi-cmp.o			\
+	mpi-div.o			\
+	mpi-inv.o			\
+	mpi-mod.o			\
+	mpi-mul.o			\
 	mpih-cmp.o			\
 	mpih-div.o			\
 	mpih-mul.o			\
diff --git a/lib/mpi/mpi-add.c b/lib/mpi/mpi-add.c
new file mode 100644
index 000000000000..9afad7832737
--- /dev/null
+++ b/lib/mpi/mpi-add.c
@@ -0,0 +1,207 @@
+/* mpi-add.c  -  MPI functions
+ * Copyright (C) 1994, 1996, 1998, 2001, 2002,
+ *               2003 Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Note: This code is heavily based on the GNU MP Library.
+ *	 Actually it's the same code with only minor changes in the
+ *	 way the data is stored; this is to support the abstraction
+ *	 of an optional secure memory allocation which may be used
+ *	 to avoid revealing of sensitive data due to paging etc.
+ */
+
+#include "mpi-internal.h"
+
+/****************
+ * Add the unsigned integer V to the mpi-integer U and store the
+ * result in W. U and V may be the same.
+ */
+void mpi_add_ui(MPI w, MPI u, unsigned long v)
+{
+	mpi_ptr_t wp, up;
+	mpi_size_t usize, wsize;
+	int usign, wsign;
+
+	usize = u->nlimbs;
+	usign = u->sign;
+	wsign = 0;
+
+	/* If not space for W (and possible carry), increase space.  */
+	wsize = usize + 1;
+	if (w->alloced < wsize)
+		mpi_resize(w, wsize);
+
+	/* These must be after realloc (U may be the same as W).  */
+	up = u->d;
+	wp = w->d;
+
+	if (!usize) {  /* simple */
+		wp[0] = v;
+		wsize = v ? 1:0;
+	} else if (!usign) {  /* mpi is not negative */
+		mpi_limb_t cy;
+		cy = mpihelp_add_1(wp, up, usize, v);
+		wp[usize] = cy;
+		wsize = usize + cy;
+	} else {
+		/* The signs are different.  Need exact comparison to determine
+		 * which operand to subtract from which.
+		 */
+		if (usize == 1 && up[0] < v) {
+			wp[0] = v - up[0];
+			wsize = 1;
+		} else {
+			mpihelp_sub_1(wp, up, usize, v);
+			/* Size can decrease with at most one limb. */
+			wsize = usize - (wp[usize-1] == 0);
+			wsign = 1;
+		}
+	}
+
+	w->nlimbs = wsize;
+	w->sign   = wsign;
+}
+
+
+void mpi_add(MPI w, MPI u, MPI v)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t usize, vsize, wsize;
+	int usign, vsign, wsign;
+
+	if (u->nlimbs < v->nlimbs) { /* Swap U and V. */
+		usize = v->nlimbs;
+		usign = v->sign;
+		vsize = u->nlimbs;
+		vsign = u->sign;
+		wsize = usize + 1;
+		RESIZE_IF_NEEDED(w, wsize);
+		/* These must be after realloc (u or v may be the same as w).  */
+		up = v->d;
+		vp = u->d;
+	} else {
+		usize = u->nlimbs;
+		usign = u->sign;
+		vsize = v->nlimbs;
+		vsign = v->sign;
+		wsize = usize + 1;
+		RESIZE_IF_NEEDED(w, wsize);
+		/* These must be after realloc (u or v may be the same as w).  */
+		up = u->d;
+		vp = v->d;
+	}
+	wp = w->d;
+	wsign = 0;
+
+	if (!vsize) {  /* simple */
+		MPN_COPY(wp, up, usize);
+		wsize = usize;
+		wsign = usign;
+	} else if (usign != vsign) { /* different sign */
+		/* This test is right since USIZE >= VSIZE */
+		if (usize != vsize) {
+			mpihelp_sub(wp, up, usize, vp, vsize);
+			wsize = usize;
+			MPN_NORMALIZE(wp, wsize);
+			wsign = usign;
+		} else if (mpihelp_cmp(up, vp, usize) < 0) {
+			mpihelp_sub_n(wp, vp, up, usize);
+			wsize = usize;
+			MPN_NORMALIZE(wp, wsize);
+			if (!usign)
+				wsign = 1;
+		} else {
+			mpihelp_sub_n(wp, up, vp, usize);
+			wsize = usize;
+			MPN_NORMALIZE(wp, wsize);
+			if (usign)
+				wsign = 1;
+		}
+	} else { /* U and V have same sign. Add them. */
+		mpi_limb_t cy = mpihelp_add(wp, up, usize, vp, vsize);
+		wp[usize] = cy;
+		wsize = usize + cy;
+		if (usign)
+			wsign = 1;
+	}
+
+	w->nlimbs = wsize;
+	w->sign = wsign;
+}
+EXPORT_SYMBOL_GPL(mpi_add);
+
+
+/****************
+ * Subtract the unsigned integer V from the mpi-integer U and store the
+ * result in W.
+ */
+void mpi_sub_ui(MPI w, MPI u, unsigned long v)
+{
+	mpi_ptr_t wp, up;
+	mpi_size_t usize, wsize;
+	int usign, wsign;
+
+	usize = u->nlimbs;
+	usign = u->sign;
+	wsign = 0;
+
+	/* If not space for W (and possible carry), increase space.  */
+	wsize = usize + 1;
+	if (w->alloced < wsize)
+		mpi_resize(w, wsize);
+
+	/* These must be after realloc (U may be the same as W).  */
+	up = u->d;
+	wp = w->d;
+
+	if (!usize) {  /* simple */
+		wp[0] = v;
+		wsize = v ? 1:0;
+		wsign = 1;
+	} else if (usign) {	/* mpi and v are negative */
+		mpi_limb_t cy;
+		cy = mpihelp_add_1(wp, up, usize, v);
+		wp[usize] = cy;
+		wsize = usize + cy;
+	} else {
+		/* The signs are different.  Need exact comparison to determine
+		 * which operand to subtract from which.
+		 */
+		if (usize == 1 && up[0] < v) {
+			wp[0] = v - up[0];
+			wsize = 1;
+			wsign = 1;
+		} else {
+			mpihelp_sub_1(wp, up, usize, v);
+			/* Size can decrease with at most one limb. */
+			wsize = usize - (wp[usize-1] == 0);
+		}
+	}
+
+	w->nlimbs = wsize;
+	w->sign   = wsign;
+}
+
+void mpi_sub(MPI w, MPI u, MPI v)
+{
+	MPI vv = mpi_copy(v);
+	vv->sign = !vv->sign;
+	mpi_add(w, u, vv);
+	mpi_free(vv);
+}
+
+
+void mpi_addm(MPI w, MPI u, MPI v, MPI m)
+{
+	mpi_add(w, u, v);
+	mpi_mod(w, w, m);
+}
+EXPORT_SYMBOL_GPL(mpi_addm);
+
+void mpi_subm(MPI w, MPI u, MPI v, MPI m)
+{
+	mpi_sub(w, u, v);
+	mpi_mod(w, w, m);
+}
+EXPORT_SYMBOL_GPL(mpi_subm);
diff --git a/lib/mpi/mpi-bit.c b/lib/mpi/mpi-bit.c
index 503537e08436..a5119a2bcdd4 100644
--- a/lib/mpi/mpi-bit.c
+++ b/lib/mpi/mpi-bit.c
@@ -32,6 +32,7 @@ void mpi_normalize(MPI a)
 	for (; a->nlimbs && !a->d[a->nlimbs - 1]; a->nlimbs--)
 		;
 }
+EXPORT_SYMBOL_GPL(mpi_normalize);
 
 /****************
  * Return the number of bits in A.
@@ -54,3 +55,253 @@ unsigned mpi_get_nbits(MPI a)
 	return n;
 }
 EXPORT_SYMBOL_GPL(mpi_get_nbits);
+
+/****************
+ * Test whether bit N is set.
+ */
+int mpi_test_bit(MPI a, unsigned int n)
+{
+	unsigned int limbno, bitno;
+	mpi_limb_t limb;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs)
+		return 0; /* too far left: this is a 0 */
+	limb = a->d[limbno];
+	return (limb & (A_LIMB_1 << bitno)) ? 1 : 0;
+}
+EXPORT_SYMBOL_GPL(mpi_test_bit);
+
+/****************
+ * Set bit N of A.
+ */
+void mpi_set_bit(MPI a, unsigned int n)
+{
+	unsigned int i, limbno, bitno;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs) {
+		for (i = a->nlimbs; i < a->alloced; i++)
+			a->d[i] = 0;
+		mpi_resize(a, limbno+1);
+		a->nlimbs = limbno+1;
+	}
+	a->d[limbno] |= (A_LIMB_1<<bitno);
+}
+
+/****************
+ * Set bit N of A and clear all bits above
+ */
+void mpi_set_highbit(MPI a, unsigned int n)
+{
+	unsigned int i, limbno, bitno;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs) {
+		for (i = a->nlimbs; i < a->alloced; i++)
+			a->d[i] = 0;
+		mpi_resize(a, limbno+1);
+		a->nlimbs = limbno+1;
+	}
+	a->d[limbno] |= (A_LIMB_1<<bitno);
+	for (bitno++; bitno < BITS_PER_MPI_LIMB; bitno++)
+		a->d[limbno] &= ~(A_LIMB_1 << bitno);
+	a->nlimbs = limbno+1;
+}
+EXPORT_SYMBOL_GPL(mpi_set_highbit);
+
+/****************
+ * clear bit N of A and all bits above
+ */
+void mpi_clear_highbit(MPI a, unsigned int n)
+{
+	unsigned int limbno, bitno;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs)
+		return; /* not allocated, therefore no need to clear bits :-) */
+
+	for ( ; bitno < BITS_PER_MPI_LIMB; bitno++)
+		a->d[limbno] &= ~(A_LIMB_1 << bitno);
+	a->nlimbs = limbno+1;
+}
+
+/****************
+ * Clear bit N of A.
+ */
+void mpi_clear_bit(MPI a, unsigned int n)
+{
+	unsigned int limbno, bitno;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs)
+		return; /* Don't need to clear this bit, it's far too left.  */
+	a->d[limbno] &= ~(A_LIMB_1 << bitno);
+}
+EXPORT_SYMBOL_GPL(mpi_clear_bit);
+
+
+/****************
+ * Shift A by COUNT limbs to the right
+ * This is used only within the MPI library
+ */
+void mpi_rshift_limbs(MPI a, unsigned int count)
+{
+	mpi_ptr_t ap = a->d;
+	mpi_size_t n = a->nlimbs;
+	unsigned int i;
+
+	if (count >= n) {
+		a->nlimbs = 0;
+		return;
+	}
+
+	for (i = 0; i < n - count; i++)
+		ap[i] = ap[i+count];
+	ap[i] = 0;
+	a->nlimbs -= count;
+}
+
+/*
+ * Shift A by N bits to the right.
+ */
+void mpi_rshift(MPI x, MPI a, unsigned int n)
+{
+	mpi_size_t xsize;
+	unsigned int i;
+	unsigned int nlimbs = (n/BITS_PER_MPI_LIMB);
+	unsigned int nbits = (n%BITS_PER_MPI_LIMB);
+
+	if (x == a) {
+		/* In-place operation.  */
+		if (nlimbs >= x->nlimbs) {
+			x->nlimbs = 0;
+			return;
+		}
+
+		if (nlimbs) {
+			for (i = 0; i < x->nlimbs - nlimbs; i++)
+				x->d[i] = x->d[i+nlimbs];
+			x->d[i] = 0;
+			x->nlimbs -= nlimbs;
+		}
+		if (x->nlimbs && nbits)
+			mpihelp_rshift(x->d, x->d, x->nlimbs, nbits);
+	} else if (nlimbs) {
+		/* Copy and shift by more or equal bits than in a limb. */
+		xsize = a->nlimbs;
+		x->sign = a->sign;
+		RESIZE_IF_NEEDED(x, xsize);
+		x->nlimbs = xsize;
+		for (i = 0; i < a->nlimbs; i++)
+			x->d[i] = a->d[i];
+		x->nlimbs = i;
+
+		if (nlimbs >= x->nlimbs) {
+			x->nlimbs = 0;
+			return;
+		}
+
+		if (nlimbs) {
+			for (i = 0; i < x->nlimbs - nlimbs; i++)
+				x->d[i] = x->d[i+nlimbs];
+			x->d[i] = 0;
+			x->nlimbs -= nlimbs;
+		}
+
+		if (x->nlimbs && nbits)
+			mpihelp_rshift(x->d, x->d, x->nlimbs, nbits);
+	} else {
+		/* Copy and shift by less than bits in a limb.  */
+		xsize = a->nlimbs;
+		x->sign = a->sign;
+		RESIZE_IF_NEEDED(x, xsize);
+		x->nlimbs = xsize;
+
+		if (xsize) {
+			if (nbits)
+				mpihelp_rshift(x->d, a->d, x->nlimbs, nbits);
+			else {
+				/* The rshift helper function is not specified for
+				 * NBITS==0, thus we do a plain copy here.
+				 */
+				for (i = 0; i < x->nlimbs; i++)
+					x->d[i] = a->d[i];
+			}
+		}
+	}
+	MPN_NORMALIZE(x->d, x->nlimbs);
+}
+
+/****************
+ * Shift A by COUNT limbs to the left
+ * This is used only within the MPI library
+ */
+void mpi_lshift_limbs(MPI a, unsigned int count)
+{
+	mpi_ptr_t ap;
+	int n = a->nlimbs;
+	int i;
+
+	if (!count || !n)
+		return;
+
+	RESIZE_IF_NEEDED(a, n+count);
+
+	ap = a->d;
+	for (i = n-1; i >= 0; i--)
+		ap[i+count] = ap[i];
+	for (i = 0; i < count; i++)
+		ap[i] = 0;
+	a->nlimbs += count;
+}
+
+/*
+ * Shift A by N bits to the left.
+ */
+void mpi_lshift(MPI x, MPI a, unsigned int n)
+{
+	unsigned int nlimbs = (n/BITS_PER_MPI_LIMB);
+	unsigned int nbits = (n%BITS_PER_MPI_LIMB);
+
+	if (x == a && !n)
+		return;  /* In-place shift with an amount of zero.  */
+
+	if (x != a) {
+		/* Copy A to X.  */
+		unsigned int alimbs = a->nlimbs;
+		int asign = a->sign;
+		mpi_ptr_t xp, ap;
+
+		RESIZE_IF_NEEDED(x, alimbs+nlimbs+1);
+		xp = x->d;
+		ap = a->d;
+		MPN_COPY(xp, ap, alimbs);
+		x->nlimbs = alimbs;
+		x->flags = a->flags;
+		x->sign = asign;
+	}
+
+	if (nlimbs && !nbits) {
+		/* Shift a full number of limbs.  */
+		mpi_lshift_limbs(x, nlimbs);
+	} else if (n) {
+		/* We use a very dumb approach: Shift left by the number of
+		 * limbs plus one and then fix it up by an rshift.
+		 */
+		mpi_lshift_limbs(x, nlimbs+1);
+		mpi_rshift(x, x, BITS_PER_MPI_LIMB - nbits);
+	}
+
+	MPN_NORMALIZE(x->d, x->nlimbs);
+}
diff --git a/lib/mpi/mpi-cmp.c b/lib/mpi/mpi-cmp.c
index d25e9e96c310..c4cfa3ff0581 100644
--- a/lib/mpi/mpi-cmp.c
+++ b/lib/mpi/mpi-cmp.c
@@ -41,28 +41,54 @@ int mpi_cmp_ui(MPI u, unsigned long v)
 }
 EXPORT_SYMBOL_GPL(mpi_cmp_ui);
 
-int mpi_cmp(MPI u, MPI v)
+static int do_mpi_cmp(MPI u, MPI v, int absmode)
 {
-	mpi_size_t usize, vsize;
+	mpi_size_t usize;
+	mpi_size_t vsize;
+	int usign;
+	int vsign;
 	int cmp;
 
 	mpi_normalize(u);
 	mpi_normalize(v);
+
 	usize = u->nlimbs;
 	vsize = v->nlimbs;
-	if (!u->sign && v->sign)
+	usign = absmode ? 0 : u->sign;
+	vsign = absmode ? 0 : v->sign;
+
+	/* Compare sign bits.  */
+
+	if (!usign && vsign)
 		return 1;
-	if (u->sign && !v->sign)
+	if (usign && !vsign)
 		return -1;
-	if (usize != vsize && !u->sign && !v->sign)
+
+	/* U and V are either both positive or both negative.  */
+
+	if (usize != vsize && !usign && !vsign)
 		return usize - vsize;
-	if (usize != vsize && u->sign && v->sign)
-		return vsize - usize;
+	if (usize != vsize && usign && vsign)
+		return vsize - usize;
 	if (!usize)
 		return 0;
 	cmp = mpihelp_cmp(u->d, v->d, usize);
-	if (u->sign)
-		return -cmp;
-	return cmp;
+	if (!cmp)
+		return 0;
+	if ((cmp < 0?1:0) == (usign?1:0))
+		return 1;
+
+	return -1;
+}
+
+int mpi_cmp(MPI u, MPI v)
+{
+	return do_mpi_cmp(u, v, 0);
 }
 EXPORT_SYMBOL_GPL(mpi_cmp);
+
+int mpi_cmpabs(MPI u, MPI v)
+{
+	return do_mpi_cmp(u, v, 1);
+}
+EXPORT_SYMBOL_GPL(mpi_cmpabs);
diff --git a/lib/mpi/mpi-div.c b/lib/mpi/mpi-div.c
new file mode 100644
index 000000000000..21332dab97d4
--- /dev/null
+++ b/lib/mpi/mpi-div.c
@@ -0,0 +1,238 @@
+/* mpi-div.c  -  MPI functions
+ * Copyright (C) 1994, 1996, 1998, 2001, 2002,
+ *               2003 Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Note: This code is heavily based on the GNU MP Library.
+ *	 Actually it's the same code with only minor changes in the
+ *	 way the data is stored; this is to support the abstraction
+ *	 of an optional secure memory allocation which may be used
+ *	 to avoid revealing of sensitive data due to paging etc.
+ */
+
+#include "mpi-internal.h"
+#include "longlong.h"
+
+void mpi_tdiv_qr(MPI quot, MPI rem, MPI num, MPI den);
+void mpi_fdiv_qr(MPI quot, MPI rem, MPI dividend, MPI divisor);
+
+void mpi_fdiv_r(MPI rem, MPI dividend, MPI divisor)
+{
+	int divisor_sign = divisor->sign;
+	MPI temp_divisor = NULL;
+
+	/* We need the original value of the divisor after the remainder has been
+	 * preliminarily calculated.  We have to copy it to temporary space if it's
+	 * the same variable as REM.
+	 */
+	if (rem == divisor) {
+		temp_divisor = mpi_copy(divisor);
+		divisor = temp_divisor;
+	}
+
+	mpi_tdiv_r(rem, dividend, divisor);
+
+	if (((divisor_sign?1:0) ^ (dividend->sign?1:0)) && rem->nlimbs)
+		mpi_add(rem, rem, divisor);
+
+	if (temp_divisor)
+		mpi_free(temp_divisor);
+}
+
+void mpi_fdiv_q(MPI quot, MPI dividend, MPI divisor)
+{
+	MPI tmp = mpi_alloc(mpi_get_nlimbs(quot));
+	mpi_fdiv_qr(quot, tmp, dividend, divisor);
+	mpi_free(tmp);
+}
+
+void mpi_fdiv_qr(MPI quot, MPI rem, MPI dividend, MPI divisor)
+{
+	int divisor_sign = divisor->sign;
+	MPI temp_divisor = NULL;
+
+	if (quot == divisor || rem == divisor) {
+		temp_divisor = mpi_copy(divisor);
+		divisor = temp_divisor;
+	}
+
+	mpi_tdiv_qr(quot, rem, dividend, divisor);
+
+	if ((divisor_sign ^ dividend->sign) && rem->nlimbs) {
+		mpi_sub_ui(quot, quot, 1);
+		mpi_add(rem, rem, divisor);
+	}
+
+	if (temp_divisor)
+		mpi_free(temp_divisor);
+}
+
+/* If den == quot, den needs temporary storage.
+ * If den == rem, den needs temporary storage.
+ * If num == quot, num needs temporary storage.
+ * If den has temporary storage, it can be normalized while being copied,
+ *   i.e no extra storage should be allocated.
+ */
+
+void mpi_tdiv_r(MPI rem, MPI num, MPI den)
+{
+	mpi_tdiv_qr(NULL, rem, num, den);
+}
+
+void mpi_tdiv_qr(MPI quot, MPI rem, MPI num, MPI den)
+{
+	mpi_ptr_t np, dp;
+	mpi_ptr_t qp, rp;
+	mpi_size_t nsize = num->nlimbs;
+	mpi_size_t dsize = den->nlimbs;
+	mpi_size_t qsize, rsize;
+	mpi_size_t sign_remainder = num->sign;
+	mpi_size_t sign_quotient = num->sign ^ den->sign;
+	unsigned int normalization_steps;
+	mpi_limb_t q_limb;
+	mpi_ptr_t marker[5];
+	unsigned int marker_nlimbs[5];
+	int markidx = 0;
+
+	/* Ensure space is enough for quotient and remainder.
+	 * We need space for an extra limb in the remainder, because it's
+	 * up-shifted (normalized) below.
+	 */
+	rsize = nsize + 1;
+	mpi_resize(rem, rsize);
+
+	qsize = rsize - dsize;	  /* qsize cannot be bigger than this.	*/
+	if (qsize <= 0) {
+		if (num != rem) {
+			rem->nlimbs = num->nlimbs;
+			rem->sign = num->sign;
+			MPN_COPY(rem->d, num->d, nsize);
+		}
+		if (quot) {
+			/* This needs to follow the assignment to rem, in case the
+			 * numerator and quotient are the same.
+			 */
+			quot->nlimbs = 0;
+			quot->sign = 0;
+		}
+		return;
+	}
+
+	if (quot)
+		mpi_resize(quot, qsize);
+
+	/* Read pointers here, when reallocation is finished.  */
+	np = num->d;
+	dp = den->d;
+	rp = rem->d;
+
+	/* Optimize division by a single-limb divisor.  */
+	if (dsize == 1) {
+		mpi_limb_t rlimb;
+		if (quot) {
+			qp = quot->d;
+			rlimb = mpihelp_divmod_1(qp, np, nsize, dp[0]);
+			qsize -= qp[qsize - 1] == 0;
+			quot->nlimbs = qsize;
+			quot->sign = sign_quotient;
+		} else
+			rlimb = mpihelp_mod_1(np, nsize, dp[0]);
+		rp[0] = rlimb;
+		rsize = rlimb != 0?1:0;
+		rem->nlimbs = rsize;
+		rem->sign = sign_remainder;
+		return;
+	}
+
+
+	if (quot) {
+		qp = quot->d;
+		/* Make sure QP and NP point to different objects.  Otherwise the
+		 * numerator would be gradually overwritten by the quotient limbs.
+		 */
+		if (qp == np) { /* Copy NP object to temporary space.  */
+			marker_nlimbs[markidx] = nsize;
+			np = marker[markidx++] = mpi_alloc_limb_space(nsize);
+			MPN_COPY(np, qp, nsize);
+		}
+	} else /* Put quotient at top of remainder. */
+		qp = rp + dsize;
+
+	normalization_steps = count_leading_zeros(dp[dsize - 1]);
+
+	/* Normalize the denominator, i.e. make its most significant bit set by
+	 * shifting it NORMALIZATION_STEPS bits to the left.  Also shift the
+	 * numerator the same number of steps (to keep the quotient the same!).
+	 */
+	if (normalization_steps) {
+		mpi_ptr_t tp;
+		mpi_limb_t nlimb;
+
+		/* Shift up the denominator setting the most significant bit of
+		 * the most significant word.  Use temporary storage not to clobber
+		 * the original contents of the denominator.
+		 */
+		marker_nlimbs[markidx] = dsize;
+		tp = marker[markidx++] = mpi_alloc_limb_space(dsize);
+		mpihelp_lshift(tp, dp, dsize, normalization_steps);
+		dp = tp;
+
+		/* Shift up the numerator, possibly introducing a new most
+		 * significant word.  Move the shifted numerator in the remainder
+		 * meanwhile.
+		 */
+		nlimb = mpihelp_lshift(rp, np, nsize, normalization_steps);
+		if (nlimb) {
+			rp[nsize] = nlimb;
+			rsize = nsize + 1;
+		} else
+			rsize = nsize;
+	} else {
+		/* The denominator is already normalized, as required.	Copy it to
+		 * temporary space if it overlaps with the quotient or remainder.
+		 */
+		if (dp == rp || (quot && (dp == qp))) {
+			mpi_ptr_t tp;
+
+			marker_nlimbs[markidx] = dsize;
+			tp = marker[markidx++] = mpi_alloc_limb_space(dsize);
+			MPN_COPY(tp, dp, dsize);
+			dp = tp;
+		}
+
+		/* Move the numerator to the remainder.  */
+		if (rp != np)
+			MPN_COPY(rp, np, nsize);
+
+		rsize = nsize;
+	}
+
+	q_limb = mpihelp_divrem(qp, 0, rp, rsize, dp, dsize);
+
+	if (quot) {
+		qsize = rsize - dsize;
+		if (q_limb) {
+			qp[qsize] = q_limb;
+			qsize += 1;
+		}
+
+		quot->nlimbs = qsize;
+		quot->sign = sign_quotient;
+	}
+
+	rsize = dsize;
+	MPN_NORMALIZE(rp, rsize);
+
+	if (normalization_steps && rsize) {
+		mpihelp_rshift(rp, rp, rsize, normalization_steps);
+		rsize -= rp[rsize - 1] == 0?1:0;
+	}
+
+	rem->nlimbs = rsize;
+	rem->sign	= sign_remainder;
+	while (markidx) {
+		markidx--;
+		mpi_free_limb_space(marker[markidx]);
+	}
+}
diff --git a/lib/mpi/mpi-internal.h b/lib/mpi/mpi-internal.h
index 91df5f0b70f2..d29c4537c3a3 100644
--- a/lib/mpi/mpi-internal.h
+++ b/lib/mpi/mpi-internal.h
@@ -52,6 +52,12 @@
 typedef mpi_limb_t *mpi_ptr_t;	/* pointer to a limb */
 typedef int mpi_size_t;		/* (must be a signed type) */
 
+#define RESIZE_IF_NEEDED(a, b)			\
+	do {					\
+		if ((a)->alloced < (b))		\
+			mpi_resize((a), (b));	\
+	} while (0)
+
 /* Copy N limbs from S to D.  */
 #define MPN_COPY(d, s, n) \
 	do {					\
@@ -60,6 +66,14 @@ typedef int mpi_size_t;		/* (must be a signed type) */
 			(d)[_i] = (s)[_i];	\
 	} while (0)
 
+#define MPN_COPY_INCR(d, s, n)		\
+	do {					\
+		mpi_size_t _i;			\
+		for (_i = 0; _i < (n); _i++)	\
+			(d)[_i] = (s)[_i];	\
+	} while (0)
+
+
 #define MPN_COPY_DECR(d, s, n) \
 	do {					\
 		mpi_size_t _i;			\
@@ -92,6 +106,38 @@ typedef int mpi_size_t;		/* (must be a signed type) */
 			mul_n(prodp, up, vp, size, tspace);	\
 	} while (0);
 
+/* Divide the two-limb number in (NH,,NL) by D, with DI being the largest
+ * limb not larger than (2**(2*BITS_PER_MP_LIMB))/D - (2**BITS_PER_MP_LIMB).
+ * If this would yield overflow, DI should be the largest possible number
+ * (i.e., only ones).  For correct operation, the most significant bit of D
+ * has to be set.  Put the quotient in Q and the remainder in R.
+ */
+#define UDIV_QRNND_PREINV(q, r, nh, nl, d, di)				\
+	do {								\
+		mpi_limb_t _ql;						\
+		mpi_limb_t _q, _r;					\
+		mpi_limb_t _xh, _xl;					\
+		umul_ppmm(_q, _ql, (nh), (di));				\
+		_q += (nh);	/* DI is 2**BITS_PER_MPI_LIMB too small */ \
+		umul_ppmm(_xh, _xl, _q, (d));				\
+		sub_ddmmss(_xh, _r, (nh), (nl), _xh, _xl);		\
+		if (_xh) {						\
+			sub_ddmmss(_xh, _r, _xh, _r, 0, (d));		\
+			_q++;						\
+			if (_xh) {					\
+				sub_ddmmss(_xh, _r, _xh, _r, 0, (d));	\
+				_q++;					\
+			}						\
+		}							\
+		if (_r >= (d)) {					\
+			_r -= (d);					\
+			_q++;						\
+		}							\
+		(r) = _r;						\
+		(q) = _q;						\
+	} while (0)
+
+
 /*-- mpiutil.c --*/
 mpi_ptr_t mpi_alloc_limb_space(unsigned nlimbs);
 void mpi_free_limb_space(mpi_ptr_t a);
@@ -135,6 +181,8 @@ int mpihelp_mul(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t usize,
 void mpih_sqr_n_basecase(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t size);
 void mpih_sqr_n(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t size,
 		mpi_ptr_t tspace);
+void mpihelp_mul_n(mpi_ptr_t prodp,
+		mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size);
 
 int mpihelp_mul_karatsuba_case(mpi_ptr_t prodp,
 			       mpi_ptr_t up, mpi_size_t usize,
@@ -146,9 +194,14 @@ mpi_limb_t mpihelp_mul_1(mpi_ptr_t res_ptr, mpi_ptr_t s1_ptr,
 			 mpi_size_t s1_size, mpi_limb_t s2_limb);
 
 /*-- mpih-div.c --*/
+mpi_limb_t mpihelp_mod_1(mpi_ptr_t dividend_ptr, mpi_size_t dividend_size,
+			 mpi_limb_t divisor_limb);
 mpi_limb_t mpihelp_divrem(mpi_ptr_t qp, mpi_size_t qextra_limbs,
 			  mpi_ptr_t np, mpi_size_t nsize,
 			  mpi_ptr_t dp, mpi_size_t dsize);
+mpi_limb_t mpihelp_divmod_1(mpi_ptr_t quot_ptr,
+			    mpi_ptr_t dividend_ptr, mpi_size_t dividend_size,
+			    mpi_limb_t divisor_limb);
 
 /*-- generic_mpih-[lr]shift.c --*/
 mpi_limb_t mpihelp_lshift(mpi_ptr_t wp, mpi_ptr_t up, mpi_size_t usize,
diff --git a/lib/mpi/mpi-inv.c b/lib/mpi/mpi-inv.c
new file mode 100644
index 000000000000..61e37d18f793
--- /dev/null
+++ b/lib/mpi/mpi-inv.c
@@ -0,0 +1,143 @@
+/* mpi-inv.c  -  MPI functions
+ *	Copyright (C) 1998, 2001, 2002, 2003 Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "mpi-internal.h"
+
+/****************
+ * Calculate the multiplicative inverse X of A mod N
+ * That is: Find the solution x for
+ *		1 = (a*x) mod n
+ */
+int mpi_invm(MPI x, MPI a, MPI n)
+{
+	/* Extended Euclid's algorithm (See TAOCP Vol II, 4.5.2, Alg X)
+	 * modified according to Michael Penk's solution for Exercise 35
+	 * with further enhancement
+	 */
+	MPI u, v, u1, u2 = NULL, u3, v1, v2 = NULL, v3, t1, t2 = NULL, t3;
+	unsigned int k;
+	int sign;
+	int odd;
+
+	if (!mpi_cmp_ui(a, 0))
+		return 0; /* Inverse does not exist.  */
+	if (!mpi_cmp_ui(n, 1))
+		return 0; /* Inverse does not exist.  */
+
+	u = mpi_copy(a);
+	v = mpi_copy(n);
+
+	for (k = 0; !mpi_test_bit(u, 0) && !mpi_test_bit(v, 0); k++) {
+		mpi_rshift(u, u, 1);
+		mpi_rshift(v, v, 1);
+	}
+	odd = mpi_test_bit(v, 0);
+
+	u1 = mpi_alloc_set_ui(1);
+	if (!odd)
+		u2 = mpi_alloc_set_ui(0);
+	u3 = mpi_copy(u);
+	v1 = mpi_copy(v);
+	if (!odd) {
+		v2 = mpi_alloc(mpi_get_nlimbs(u));
+		mpi_sub(v2, u1, u); /* U is used as const 1 */
+	}
+	v3 = mpi_copy(v);
+	if (mpi_test_bit(u, 0)) { /* u is odd */
+		t1 = mpi_alloc_set_ui(0);
+		if (!odd) {
+			t2 = mpi_alloc_set_ui(1);
+			t2->sign = 1;
+		}
+		t3 = mpi_copy(v);
+		t3->sign = !t3->sign;
+		goto Y4;
+	} else {
+		t1 = mpi_alloc_set_ui(1);
+		if (!odd)
+			t2 = mpi_alloc_set_ui(0);
+		t3 = mpi_copy(u);
+	}
+
+	do {
+		do {
+			if (!odd) {
+				if (mpi_test_bit(t1, 0) || mpi_test_bit(t2, 0)) {
+					/* one is odd */
+					mpi_add(t1, t1, v);
+					mpi_sub(t2, t2, u);
+				}
+				mpi_rshift(t1, t1, 1);
+				mpi_rshift(t2, t2, 1);
+				mpi_rshift(t3, t3, 1);
+			} else {
+				if (mpi_test_bit(t1, 0))
+					mpi_add(t1, t1, v);
+				mpi_rshift(t1, t1, 1);
+				mpi_rshift(t3, t3, 1);
+			}
+Y4:
+			;
+		} while (!mpi_test_bit(t3, 0)); /* while t3 is even */
+
+		if (!t3->sign) {
+			mpi_set(u1, t1);
+			if (!odd)
+				mpi_set(u2, t2);
+			mpi_set(u3, t3);
+		} else {
+			mpi_sub(v1, v, t1);
+			sign = u->sign; u->sign = !u->sign;
+			if (!odd)
+				mpi_sub(v2, u, t2);
+			u->sign = sign;
+			sign = t3->sign; t3->sign = !t3->sign;
+			mpi_set(v3, t3);
+			t3->sign = sign;
+		}
+		mpi_sub(t1, u1, v1);
+		if (!odd)
+			mpi_sub(t2, u2, v2);
+		mpi_sub(t3, u3, v3);
+		if (t1->sign) {
+			mpi_add(t1, t1, v);
+			if (!odd)
+				mpi_sub(t2, t2, u);
+		}
+	} while (mpi_cmp_ui(t3, 0)); /* while t3 != 0 */
+	/* mpi_lshift( u3, k ); */
+	mpi_set(x, u1);
+
+	mpi_free(u1);
+	mpi_free(v1);
+	mpi_free(t1);
+	if (!odd) {
+		mpi_free(u2);
+		mpi_free(v2);
+		mpi_free(t2);
+	}
+	mpi_free(u3);
+	mpi_free(v3);
+	mpi_free(t3);
+
+	mpi_free(u);
+	mpi_free(v);
+	return 1;
+}
+EXPORT_SYMBOL_GPL(mpi_invm);
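For readers following mpi_invm() above, here is a minimal sketch of the
plain extended Euclidean algorithm on machine words. It is not the binary
variant used in the patch, and modinv_u64() is a hypothetical helper that
does not exist in lib/mpi. It assumes n < 2^62 so that every intermediate
value fits in int64_t, and it reuses the 1 (success) / 0 (no inverse)
return convention.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of lib/mpi: find x with (a * x) % n == 1.
 * Returns 1 and stores x on success, 0 if no inverse exists.
 * Assumption: n < 2^62, so all intermediates fit in int64_t.
 */
static int modinv_u64(uint64_t *x, uint64_t a, uint64_t n)
{
	int64_t r0, r1, t0 = 0, t1 = 1;

	if (n < 2)
		return 0;	/* no inverse modulo 0 or 1 */

	r0 = (int64_t)n;
	r1 = (int64_t)(a % n);

	/* Invariant: t0 * a == r0 (mod n) and t1 * a == r1 (mod n). */
	while (r1 != 0) {
		int64_t q = r0 / r1;
		int64_t tmp;

		tmp = r0 - q * r1;
		r0 = r1;
		r1 = tmp;

		tmp = t0 - q * t1;
		t0 = t1;
		t1 = tmp;
	}

	if (r0 != 1)
		return 0;	/* gcd(a, n) != 1, so no inverse exists */

	if (t0 < 0)
		t0 += (int64_t)n;	/* normalize into [0, n) */
	*x = (uint64_t)t0;
	return 1;
}
```

Unlike this sketch, the patch works on shifts and subtractions only, which
avoids multi-precision division inside the loop.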
diff --git a/lib/mpi/mpi-mod.c b/lib/mpi/mpi-mod.c
new file mode 100644
index 000000000000..47bc59edd4ff
--- /dev/null
+++ b/lib/mpi/mpi-mod.c
@@ -0,0 +1,155 @@
+/* mpi-mod.c -  Modular reduction
+ * Copyright (C) 1998, 1999, 2001, 2002, 2003,
+ *               2007  Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ */
+
+
+#include "mpi-internal.h"
+#include "longlong.h"
+
+/* Context used with Barrett reduction.  */
+struct barrett_ctx_s {
+	MPI m;   /* The modulus - may not be modified. */
+	int m_copied;   /* If true, M needs to be released.  */
+	int k;
+	MPI y;
+	MPI r1;  /* Helper MPI. */
+	MPI r2;  /* Helper MPI. */
+	MPI r3;  /* Helper MPI allocated on demand. */
+};
+
+
+
+void mpi_mod(MPI rem, MPI dividend, MPI divisor)
+{
+	mpi_fdiv_r(rem, dividend, divisor);
+}
+
+/* This function returns a new context for Barrett based operations on
+ * the modulus M.  This context needs to be released using
+ * mpi_barrett_free.  If COPY is true, M is copied into the context
+ * and the caller may change M afterwards.  If COPY is false, M must
+ * not be changed until mpi_barrett_free has been called.
+ */
+mpi_barrett_t mpi_barrett_init(MPI m, int copy)
+{
+	mpi_barrett_t ctx;
+	MPI tmp;
+
+	mpi_normalize(m);
+	ctx = kcalloc(1, sizeof(*ctx), GFP_KERNEL);
+
+	if (copy) {
+		ctx->m = mpi_copy(m);
+		ctx->m_copied = 1;
+	} else
+		ctx->m = m;
+
+	ctx->k = mpi_get_nlimbs(m);
+	tmp = mpi_alloc(ctx->k + 1);
+
+	/* Barrett precalculation: y = floor(b^(2k) / m). */
+	mpi_set_ui(tmp, 1);
+	mpi_lshift_limbs(tmp, 2 * ctx->k);
+	mpi_fdiv_q(tmp, tmp, m);
+
+	ctx->y  = tmp;
+	ctx->r1 = mpi_alloc(2 * ctx->k + 1);
+	ctx->r2 = mpi_alloc(2 * ctx->k + 1);
+
+	return ctx;
+}
+
+void mpi_barrett_free(mpi_barrett_t ctx)
+{
+	if (ctx) {
+		mpi_free(ctx->y);
+		mpi_free(ctx->r1);
+		mpi_free(ctx->r2);
+		if (ctx->r3)
+			mpi_free(ctx->r3);
+		if (ctx->m_copied)
+			mpi_free(ctx->m);
+		kfree(ctx);
+	}
+}
+
+
+/* R = X mod M
+ *
+ * Using Barrett reduction.  Before using this function
+ * mpi_barrett_init must have been called to do the
+ * precalculations.  CTX is the context created by this precalculation
+ * and also conveys M.  If the Barrett reduction could not be done, a
+ * straightforward reduction method is used.
+ *
+ * We assume that these conditions are met:
+ * Input:  x =(x_2k-1 ...x_0)_b
+ *     m =(m_k-1 ....m_0)_b	  with m_k-1 != 0
+ * Output: r = x mod m
+ */
+void mpi_mod_barrett(MPI r, MPI x, mpi_barrett_t ctx)
+{
+	MPI m = ctx->m;
+	int k = ctx->k;
+	MPI y = ctx->y;
+	MPI r1 = ctx->r1;
+	MPI r2 = ctx->r2;
+	int sign;
+
+	mpi_normalize(x);
+	if (mpi_get_nlimbs(x) > 2*k) {
+		mpi_mod(r, x, m);
+		return;
+	}
+
+	sign = x->sign;
+	x->sign = 0;
+
+	/* 1. q1 = floor( x / b^k-1)
+	 *    q2 = q1 * y
+	 *    q3 = floor( q2 / b^k+1 )
+	 * Actually, we don't need qx, we can work directly on r2
+	 */
+	mpi_set(r2, x);
+	mpi_rshift_limbs(r2, k-1);
+	mpi_mul(r2, r2, y);
+	mpi_rshift_limbs(r2, k+1);
+
+	/* 2. r1 = x mod b^k+1
+	 *	r2 = q3 * m mod b^k+1
+	 *	r  = r1 - r2
+	 * 3. if r < 0 then  r = r + b^k+1
+	 */
+	mpi_set(r1, x);
+	if (r1->nlimbs > k+1) /* Quick modulo operation.  */
+		r1->nlimbs = k+1;
+	mpi_mul(r2, r2, m);
+	if (r2->nlimbs > k+1) /* Quick modulo operation. */
+		r2->nlimbs = k+1;
+	mpi_sub(r, r1, r2);
+
+	if (mpi_has_sign(r)) {
+		if (!ctx->r3) {
+			ctx->r3 = mpi_alloc(k + 2);
+			mpi_set_ui(ctx->r3, 1);
+			mpi_lshift_limbs(ctx->r3, k + 1);
+		}
+		mpi_add(r, r, ctx->r3);
+	}
+
+	/* 4. while r >= m do r = r - m */
+	while (mpi_cmp(r, m) >= 0)
+		mpi_sub(r, r, m);
+
+	x->sign = sign;
+}
+
+
+void mpi_mul_barrett(MPI w, MPI u, MPI v, mpi_barrett_t ctx)
+{
+	mpi_mul(w, u, v);
+	mpi_mod_barrett(w, w, ctx);
+}
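The steps commented in mpi_mod_barrett() can be followed on a toy case
with a single 32-bit limb, i.e. b = 2^32 and k = 1. The sketch below is
only an illustration under those assumptions: struct barrett32 and both
functions are hypothetical names, the modulus must be at least 2, and
unsigned __int128 is a GCC/Clang extension rather than standard C.

```c
#include <stdint.h>

/* Hypothetical one-limb context: b = 2^32, k = 1. */
struct barrett32 {
	uint32_t m;	/* modulus, assumed >= 2 */
	uint64_t y;	/* y = floor(b^(2k) / m) = floor(2^64 / m) */
};

static void barrett32_init(struct barrett32 *ctx, uint32_t m)
{
	ctx->m = m;
	/* For m >= 2 the quotient is at most 2^63 and fits in 64 bits. */
	ctx->y = (uint64_t)((((unsigned __int128)1) << 64) / m);
}

static uint32_t barrett32_mod(const struct barrett32 *ctx, uint64_t x)
{
	/*
	 * Step 1: q1 = floor(x / b^(k-1)) = x, q2 = q1 * y,
	 *         q3 = floor(q2 / b^(k+1)) = q2 >> 64.
	 * q3 never exceeds the true quotient floor(x / m).
	 */
	uint64_t q3 = (uint64_t)(((unsigned __int128)x * ctx->y) >> 64);

	/* Steps 2 and 3: r = x - q3 * m, computed modulo b^(k+1) = 2^64. */
	uint64_t r = x - q3 * ctx->m;

	/* Step 4: a few final subtractions bring r below m. */
	while (r >= ctx->m)
		r -= ctx->m;

	return (uint32_t)r;
}
```

Here uint64_t arithmetic plays the role of the "quick modulo" truncation
to k+1 limbs, so the explicit r < 0 correction of the patch is not needed.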
diff --git a/lib/mpi/mpi-mul.c b/lib/mpi/mpi-mul.c
new file mode 100644
index 000000000000..587e6335cc12
--- /dev/null
+++ b/lib/mpi/mpi-mul.c
@@ -0,0 +1,94 @@
+/* mpi-mul.c  -  MPI functions
+ * Copyright (C) 1994, 1996, 1998, 2001, 2002,
+ *               2003 Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Note: This code is heavily based on the GNU MP Library.
+ *	 Actually it's the same code with only minor changes in the
+ *	 way the data is stored; this is to support the abstraction
+ *	 of an optional secure memory allocation which may be used
+ *	 to avoid revealing of sensitive data due to paging etc.
+ */
+
+#include "mpi-internal.h"
+
+void mpi_mul(MPI w, MPI u, MPI v)
+{
+	mpi_size_t usize, vsize, wsize;
+	mpi_ptr_t up, vp, wp;
+	mpi_limb_t cy;
+	int usign, vsign, sign_product;
+	int assign_wp = 0;
+	mpi_ptr_t tmp_limb = NULL;
+	unsigned int tmp_limb_nlimbs = 0;
+
+	if (u->nlimbs < v->nlimbs) {
+		/* Swap U and V. */
+		usize = v->nlimbs;
+		usign = v->sign;
+		up    = v->d;
+		vsize = u->nlimbs;
+		vsign = u->sign;
+		vp    = u->d;
+	} else {
+		usize = u->nlimbs;
+		usign = u->sign;
+		up    = u->d;
+		vsize = v->nlimbs;
+		vsign = v->sign;
+		vp    = v->d;
+	}
+	sign_product = usign ^ vsign;
+	wp = w->d;
+
+	/* Ensure W has space enough to store the result.  */
+	wsize = usize + vsize;
+	if (w->alloced < wsize) {
+		if (wp == up || wp == vp) {
+			wp = mpi_alloc_limb_space(wsize);
+			assign_wp = 1;
+		} else {
+			mpi_resize(w, wsize);
+			wp = w->d;
+		}
+	} else { /* Make U and V not overlap with W.	*/
+		if (wp == up) {
+			/* W and U are identical.  Allocate temporary space for U. */
+			tmp_limb_nlimbs = usize;
+			up = tmp_limb = mpi_alloc_limb_space(usize);
+			/* Is V identical too?  Keep it identical with U.  */
+			if (wp == vp)
+				vp = up;
+			/* Copy to the temporary space.  */
+			MPN_COPY(up, wp, usize);
+		} else if (wp == vp) {
+			/* W and V are identical.  Allocate temporary space for V. */
+			tmp_limb_nlimbs = vsize;
+			vp = tmp_limb = mpi_alloc_limb_space(vsize);
+			/* Copy to the temporary space.  */
+			MPN_COPY(vp, wp, vsize);
+		}
+	}
+
+	if (!vsize)
+		wsize = 0;
+	else {
+		mpihelp_mul(wp, up, usize, vp, vsize, &cy);
+		wsize -= cy ? 0:1;
+	}
+
+	if (assign_wp)
+		mpi_assign_limb_space(w, wp, wsize);
+	w->nlimbs = wsize;
+	w->sign = sign_product;
+	if (tmp_limb)
+		mpi_free_limb_space(tmp_limb);
+}
+
+void mpi_mulm(MPI w, MPI u, MPI v, MPI m)
+{
+	mpi_mul(w, u, v);
+	mpi_tdiv_r(w, w, m);
+}
+EXPORT_SYMBOL_GPL(mpi_mulm);
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index eead4b339466..7ea225b2204f 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -25,6 +25,7 @@
 #include <linux/string.h>
 #include "mpi-internal.h"
 
+#define MAX_EXTERN_SCAN_BYTES (16*1024*1024)
 #define MAX_EXTERN_MPI_BITS 16384
 
 /**
@@ -109,6 +110,112 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned *ret_nread)
 }
 EXPORT_SYMBOL_GPL(mpi_read_from_buffer);
 
+/****************
+ * Fill the mpi VAL from the hex string in STR.
+ */
+int mpi_fromstr(MPI val, const char *str)
+{
+	int sign = 0;
+	int prepend_zero = 0;
+	int i, j, c, c1, c2;
+	unsigned int nbits, nbytes, nlimbs;
+	mpi_limb_t a;
+
+	if (*str == '-') {
+		sign = 1;
+		str++;
+	}
+
+	/* Skip optional hex prefix.  */
+	if (*str == '0' && str[1] == 'x')
+		str += 2;
+
+	nbits = strlen(str);
+	if (nbits > MAX_EXTERN_SCAN_BYTES) {
+		mpi_clear(val);
+		return -EINVAL;
+	}
+	nbits *= 4;
+	if ((nbits % 8))
+		prepend_zero = 1;
+
+	nbytes = (nbits+7) / 8;
+	nlimbs = (nbytes+BYTES_PER_MPI_LIMB-1) / BYTES_PER_MPI_LIMB;
+
+	if (val->alloced < nlimbs)
+		mpi_resize(val, nlimbs);
+
+	i = BYTES_PER_MPI_LIMB - (nbytes % BYTES_PER_MPI_LIMB);
+	i %= BYTES_PER_MPI_LIMB;
+	j = val->nlimbs = nlimbs;
+	val->sign = sign;
+	for (; j > 0; j--) {
+		a = 0;
+		for (; i < BYTES_PER_MPI_LIMB; i++) {
+			if (prepend_zero) {
+				c1 = '0';
+				prepend_zero = 0;
+			} else
+				c1 = *str++;
+
+			if (!c1) {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			c2 = *str++;
+			if (!c2) {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			if (c1 >= '0' && c1 <= '9')
+				c = c1 - '0';
+			else if (c1 >= 'a' && c1 <= 'f')
+				c = c1 - 'a' + 10;
+			else if (c1 >= 'A' && c1 <= 'F')
+				c = c1 - 'A' + 10;
+			else {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			c <<= 4;
+			if (c2 >= '0' && c2 <= '9')
+				c |= c2 - '0';
+			else if (c2 >= 'a' && c2 <= 'f')
+				c |= c2 - 'a' + 10;
+			else if (c2 >= 'A' && c2 <= 'F')
+				c |= c2 - 'A' + 10;
+			else {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			a <<= 8;
+			a |= c;
+		}
+		i = 0;
+		val->d[j-1] = a;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mpi_fromstr);
+
+MPI mpi_scanval(const char *string)
+{
+	MPI a;
+
+	a = mpi_alloc(0);
+	if (!a)
+		return NULL;
+
+	if (mpi_fromstr(a, string)) {
+		mpi_free(a);
+		return NULL;
+	}
+	mpi_normalize(a);
+	return a;
+}
+EXPORT_SYMBOL_GPL(mpi_scanval);
+
 static int count_lzeros(MPI a)
 {
 	mpi_limb_t alimb;
@@ -413,3 +520,232 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
 	return val;
 }
 EXPORT_SYMBOL_GPL(mpi_read_raw_from_sgl);
+
+/* Perform a two's complement operation on buffer P of size N bytes.  */
+static void twocompl(unsigned char *p, unsigned int n)
+{
+	int i;
+
+	for (i = n-1; i >= 0 && !p[i]; i--)
+		;
+	if (i >= 0) {
+		if ((p[i] & 0x01))
+			p[i] = (((p[i] ^ 0xfe) | 0x01) & 0xff);
+		else if ((p[i] & 0x02))
+			p[i] = (((p[i] ^ 0xfc) | 0x02) & 0xfe);
+		else if ((p[i] & 0x04))
+			p[i] = (((p[i] ^ 0xf8) | 0x04) & 0xfc);
+		else if ((p[i] & 0x08))
+			p[i] = (((p[i] ^ 0xf0) | 0x08) & 0xf8);
+		else if ((p[i] & 0x10))
+			p[i] = (((p[i] ^ 0xe0) | 0x10) & 0xf0);
+		else if ((p[i] & 0x20))
+			p[i] = (((p[i] ^ 0xc0) | 0x20) & 0xe0);
+		else if ((p[i] & 0x40))
+			p[i] = (((p[i] ^ 0x80) | 0x40) & 0xc0);
+		else
+			p[i] = 0x80;
+
+		for (i--; i >= 0; i--)
+			p[i] ^= 0xff;
+	}
+}
+
+int mpi_print(enum gcry_mpi_format format, unsigned char *buffer,
+			size_t buflen, size_t *nwritten, MPI a)
+{
+	unsigned int nbits = mpi_get_nbits(a);
+	size_t len;
+	size_t dummy_nwritten;
+	int negative;
+
+	if (!nwritten)
+		nwritten = &dummy_nwritten;
+
+	/* Libgcrypt does not always take care to clear the sign if the
+	 * value is 0.  For printing this is a bit of a surprise, in
+	 * particular because some of the formats don't support negative
+	 * numbers but should still be able to print a zero.  Thus we need
+	 * this extra test for a negative number.
+	 */
+	if (a->sign && mpi_cmp_ui(a, 0))
+		negative = 1;
+	else
+		negative = 0;
+
+	len = buflen;
+	*nwritten = 0;
+	if (format == GCRYMPI_FMT_STD) {
+		unsigned char *tmp;
+		int extra = 0;
+		unsigned int n;
+
+		tmp = mpi_get_buffer(a, &n, NULL);
+		if (!tmp)
+			return -EINVAL;
+
+		if (negative) {
+			twocompl(tmp, n);
+			if (!(*tmp & 0x80)) {
+				/* Need to extend the sign.  */
+				n++;
+				extra = 2;
+			}
+		} else if (n && (*tmp & 0x80)) {
+			/* Positive but the high bit of the returned buffer is set.
+			 * Thus we need to print an extra leading 0x00 so that the
+			 * output is interpreted as a positive number.
+			 */
+			n++;
+			extra = 1;
+		}
+
+		if (buffer && n > len) {
+			/* The provided buffer is too short. */
+			kfree(tmp);
+			return -E2BIG;
+		}
+		if (buffer) {
+			unsigned char *s = buffer;
+
+			if (extra == 1)
+				*s++ = 0;
+			else if (extra)
+				*s++ = 0xff;
+			memcpy(s, tmp, n-!!extra);
+		}
+		kfree(tmp);
+		*nwritten = n;
+		return 0;
+	} else if (format == GCRYMPI_FMT_USG) {
+		unsigned int n = (nbits + 7)/8;
+
+		/* Note:  We ignore the sign for this format.  */
+		/* FIXME: for performance reasons we should put this into
+		 * mpi_aprint because we can then use the buffer directly.
+		 */
+
+		if (buffer && n > len)
+			return -E2BIG;
+		if (buffer) {
+			unsigned char *tmp;
+
+			tmp = mpi_get_buffer(a, &n, NULL);
+			if (!tmp)
+				return -EINVAL;
+			memcpy(buffer, tmp, n);
+			kfree(tmp);
+		}
+		*nwritten = n;
+		return 0;
+	} else if (format == GCRYMPI_FMT_PGP) {
+		unsigned int n = (nbits + 7)/8;
+
+		/* The PGP format can only handle unsigned integers.  */
+		if (negative)
+			return -EINVAL;
+
+		if (buffer && n+2 > len)
+			return -E2BIG;
+
+		if (buffer) {
+			unsigned char *tmp;
+			unsigned char *s = buffer;
+
+			s[0] = nbits >> 8;
+			s[1] = nbits;
+
+			tmp = mpi_get_buffer(a, &n, NULL);
+			if (!tmp)
+				return -EINVAL;
+			memcpy(s+2, tmp, n);
+			kfree(tmp);
+		}
+		*nwritten = n+2;
+		return 0;
+	} else if (format == GCRYMPI_FMT_SSH) {
+		unsigned char *tmp;
+		int extra = 0;
+		unsigned int n;
+
+		tmp = mpi_get_buffer(a, &n, NULL);
+		if (!tmp)
+			return -EINVAL;
+
+		if (negative) {
+			twocompl(tmp, n);
+			if (!(*tmp & 0x80)) {
+				/* Need to extend the sign.  */
+				n++;
+				extra = 2;
+			}
+		} else if (n && (*tmp & 0x80)) {
+			n++;
+			extra = 1;
+		}
+
+		if (buffer && n+4 > len) {
+			kfree(tmp);
+			return -E2BIG;
+		}
+
+		if (buffer) {
+			unsigned char *s = buffer;
+
+			*s++ = n >> 24;
+			*s++ = n >> 16;
+			*s++ = n >> 8;
+			*s++ = n;
+			if (extra == 1)
+				*s++ = 0;
+			else if (extra)
+				*s++ = 0xff;
+			memcpy(s, tmp, n-!!extra);
+		}
+		kfree(tmp);
+		*nwritten = 4+n;
+		return 0;
+	} else if (format == GCRYMPI_FMT_HEX) {
+		unsigned char *tmp;
+		int i;
+		int extra = 0;
+		unsigned int n = 0;
+
+		tmp = mpi_get_buffer(a, &n, NULL);
+		if (!tmp)
+			return -EINVAL;
+		if (!n || (*tmp & 0x80))
+			extra = 2;
+
+		if (buffer && 2*n + extra + negative + 1 > len) {
+			kfree(tmp);
+			return -E2BIG;
+		}
+		if (buffer) {
+			unsigned char *s = buffer;
+
+			if (negative)
+				*s++ = '-';
+			if (extra) {
+				*s++ = '0';
+				*s++ = '0';
+			}
+
+			for (i = 0; i < n; i++) {
+				unsigned int c = tmp[i];
+
+				*s++ = (c >> 4) < 10 ? '0'+(c>>4) : 'A'+(c>>4)-10;
+				c &= 15;
+				*s++ = c < 10 ? '0'+c : 'A'+c-10;
+			}
+			*s++ = 0;
+			*nwritten = s - buffer;
+		} else {
+			*nwritten = 2*n + extra + negative + 1;
+		}
+		kfree(tmp);
+		return 0;
+	} else
+		return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(mpi_print);
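The twocompl() helper above negates a big-endian byte buffer by keeping
the lowest set bit and inverting everything above it. As a cross-check,
the textbook "invert all bits, then add one" form should give the same
bytes. The sketch below shows that form; twocompl_ref() is a hypothetical
name used only for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical reference helper: negate the big-endian integer stored in
 * p[0..n-1] in place by inverting every byte and adding one, propagating
 * the carry from the least significant byte (p[n-1]) upwards.
 */
static void twocompl_ref(unsigned char *p, size_t n)
{
	unsigned int carry = 1;
	size_t i;

	for (i = n; i > 0; i--) {
		unsigned int v = (unsigned char)~p[i - 1] + carry;

		p[i - 1] = v & 0xff;
		carry = v >> 8;
	}
}
```

For example, the two bytes 0x12 0x34 become 0xed 0xcc in both variants,
and an all-zero buffer stays all zero.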
diff --git a/lib/mpi/mpih-div.c b/lib/mpi/mpih-div.c
index 913a519eb005..182a656a1ba0 100644
--- a/lib/mpi/mpih-div.c
+++ b/lib/mpi/mpih-div.c
@@ -24,6 +24,150 @@
 #define UDIV_TIME UMUL_TIME
 #endif
 
+
+mpi_limb_t
+mpihelp_mod_1(mpi_ptr_t dividend_ptr, mpi_size_t dividend_size,
+			mpi_limb_t divisor_limb)
+{
+	mpi_size_t i;
+	mpi_limb_t n1, n0, r;
+	mpi_limb_t dummy;
+
+	/* Botch: Should this be handled at all?  Rely on callers?	*/
+	if (!dividend_size)
+		return 0;
+
+	/* If multiplication is much faster than division, and the
+	 * dividend is large, pre-invert the divisor, and use
+	 * only multiplications in the inner loop.
+	 *
+	 * This test should be read:
+	 *	 Does it ever help to use udiv_qrnnd_preinv?
+	 *	   && Does what we save compensate for the inversion overhead?
+	 */
+	if (UDIV_TIME > (2 * UMUL_TIME + 6)
+			&& (UDIV_TIME - (2 * UMUL_TIME + 6)) * dividend_size > UDIV_TIME) {
+		int normalization_steps;
+
+		normalization_steps = count_leading_zeros(divisor_limb);
+		if (normalization_steps) {
+			mpi_limb_t divisor_limb_inverted;
+
+			divisor_limb <<= normalization_steps;
+
+			/* Compute (2**2N - 2**N * DIVISOR_LIMB) / DIVISOR_LIMB.  The
+			 * result is a (N+1)-bit approximation to 1/DIVISOR_LIMB, with the
+			 * most significant bit (with weight 2**N) implicit.
+			 *
+			 * Special case for DIVISOR_LIMB == 100...000.
+			 */
+			if (!(divisor_limb << 1))
+				divisor_limb_inverted = ~(mpi_limb_t)0;
+			else
+				udiv_qrnnd(divisor_limb_inverted, dummy,
+						-divisor_limb, 0, divisor_limb);
+
+			n1 = dividend_ptr[dividend_size - 1];
+			r = n1 >> (BITS_PER_MPI_LIMB - normalization_steps);
+
+			/* Possible optimization:
+			 * if (r == 0
+			 * && divisor_limb > ((n1 << normalization_steps)
+			 *		       | (dividend_ptr[dividend_size - 2] >> ...)))
+			 * ...one division less...
+			 */
+			for (i = dividend_size - 2; i >= 0; i--) {
+				n0 = dividend_ptr[i];
+				UDIV_QRNND_PREINV(dummy, r, r,
+						((n1 << normalization_steps)
+						 | (n0 >> (BITS_PER_MPI_LIMB - normalization_steps))),
+						divisor_limb, divisor_limb_inverted);
+				n1 = n0;
+			}
+			UDIV_QRNND_PREINV(dummy, r, r,
+					n1 << normalization_steps,
+					divisor_limb, divisor_limb_inverted);
+			return r >> normalization_steps;
+		} else {
+			mpi_limb_t divisor_limb_inverted;
+
+			/* Compute (2**2N - 2**N * DIVISOR_LIMB) / DIVISOR_LIMB.  The
+			 * result is a (N+1)-bit approximation to 1/DIVISOR_LIMB, with the
+			 * most significant bit (with weight 2**N) implicit.
+			 *
+			 * Special case for DIVISOR_LIMB == 100...000.
+			 */
+			if (!(divisor_limb << 1))
+				divisor_limb_inverted = ~(mpi_limb_t)0;
+			else
+				udiv_qrnnd(divisor_limb_inverted, dummy,
+						-divisor_limb, 0, divisor_limb);
+
+			i = dividend_size - 1;
+			r = dividend_ptr[i];
+
+			if (r >= divisor_limb)
+				r = 0;
+			else
+				i--;
+
+			for ( ; i >= 0; i--) {
+				n0 = dividend_ptr[i];
+				UDIV_QRNND_PREINV(dummy, r, r,
+						n0, divisor_limb, divisor_limb_inverted);
+			}
+			return r;
+		}
+	} else {
+		if (UDIV_NEEDS_NORMALIZATION) {
+			int normalization_steps;
+
+			normalization_steps = count_leading_zeros(divisor_limb);
+			if (normalization_steps) {
+				divisor_limb <<= normalization_steps;
+
+				n1 = dividend_ptr[dividend_size - 1];
+				r = n1 >> (BITS_PER_MPI_LIMB - normalization_steps);
+
+				/* Possible optimization:
+				 * if (r == 0
+				 * && divisor_limb > ((n1 << normalization_steps)
+				 *		   | (dividend_ptr[dividend_size - 2] >> ...)))
+				 * ...one division less...
+				 */
+				for (i = dividend_size - 2; i >= 0; i--) {
+					n0 = dividend_ptr[i];
+					udiv_qrnnd(dummy, r, r,
+						((n1 << normalization_steps)
+						 | (n0 >> (BITS_PER_MPI_LIMB - normalization_steps))),
+						divisor_limb);
+					n1 = n0;
+				}
+				udiv_qrnnd(dummy, r, r,
+						n1 << normalization_steps,
+						divisor_limb);
+				return r >> normalization_steps;
+			}
+		}
+		/* No normalization needed, either because udiv_qrnnd doesn't require
+		 * it, or because DIVISOR_LIMB is already normalized.
+		 */
+		i = dividend_size - 1;
+		r = dividend_ptr[i];
+
+		if (r >= divisor_limb)
+			r = 0;
+		else
+			i--;
+
+		for (; i >= 0; i--) {
+			n0 = dividend_ptr[i];
+			udiv_qrnnd(dummy, r, r, n0, divisor_limb);
+		}
+		return r;
+	}
+}
+
 /* Divide num (NP/NSIZE) by den (DP/DSIZE) and write
  * the NSIZE-DSIZE least significant quotient limbs at QP
  * and the DSIZE long remainder at NP.	If QEXTRA_LIMBS is
@@ -221,3 +365,153 @@ mpihelp_divrem(mpi_ptr_t qp, mpi_size_t qextra_limbs,
 
 	return most_significant_q_limb;
 }
+
+/****************
+ * Divide (DIVIDEND_PTR,,DIVIDEND_SIZE) by DIVISOR_LIMB.
+ * Write DIVIDEND_SIZE limbs of quotient at QUOT_PTR.
+ * Return the single-limb remainder.
+ * There are no constraints on the value of the divisor.
+ *
+ * QUOT_PTR and DIVIDEND_PTR might point to the same limb.
+ */
+
+mpi_limb_t
+mpihelp_divmod_1(mpi_ptr_t quot_ptr,
+		mpi_ptr_t dividend_ptr, mpi_size_t dividend_size,
+		mpi_limb_t divisor_limb)
+{
+	mpi_size_t i;
+	mpi_limb_t n1, n0, r;
+	mpi_limb_t dummy;
+
+	if (!dividend_size)
+		return 0;
+
+	/* If multiplication is much faster than division, and the
+	 * dividend is large, pre-invert the divisor, and use
+	 * only multiplications in the inner loop.
+	 *
+	 * This test should be read:
+	 * Does it ever help to use udiv_qrnnd_preinv?
+	 * && Does what we save compensate for the inversion overhead?
+	 */
+	if (UDIV_TIME > (2 * UMUL_TIME + 6)
+			&& (UDIV_TIME - (2 * UMUL_TIME + 6)) * dividend_size > UDIV_TIME) {
+		int normalization_steps;
+
+		normalization_steps = count_leading_zeros(divisor_limb);
+		if (normalization_steps) {
+			mpi_limb_t divisor_limb_inverted;
+
+			divisor_limb <<= normalization_steps;
+
+			/* Compute (2**2N - 2**N * DIVISOR_LIMB) / DIVISOR_LIMB.  The
+			 * result is a (N+1)-bit approximation to 1/DIVISOR_LIMB, with the
+			 * most significant bit (with weight 2**N) implicit.
+			 */
+			/* Special case for DIVISOR_LIMB == 100...000.  */
+			if (!(divisor_limb << 1))
+				divisor_limb_inverted = ~(mpi_limb_t)0;
+			else
+				udiv_qrnnd(divisor_limb_inverted, dummy,
+						-divisor_limb, 0, divisor_limb);
+
+			n1 = dividend_ptr[dividend_size - 1];
+			r = n1 >> (BITS_PER_MPI_LIMB - normalization_steps);
+
+			/* Possible optimization:
+			 * if (r == 0
+			 * && divisor_limb > ((n1 << normalization_steps)
+			 *		       | (dividend_ptr[dividend_size - 2] >> ...)))
+			 * ...one division less...
+			 */
+			for (i = dividend_size - 2; i >= 0; i--) {
+				n0 = dividend_ptr[i];
+				UDIV_QRNND_PREINV(quot_ptr[i + 1], r, r,
+						((n1 << normalization_steps)
+						 | (n0 >> (BITS_PER_MPI_LIMB - normalization_steps))),
+						divisor_limb, divisor_limb_inverted);
+				n1 = n0;
+			}
+			UDIV_QRNND_PREINV(quot_ptr[0], r, r,
+					n1 << normalization_steps,
+					divisor_limb, divisor_limb_inverted);
+			return r >> normalization_steps;
+		} else {
+			mpi_limb_t divisor_limb_inverted;
+
+			/* Compute (2**2N - 2**N * DIVISOR_LIMB) / DIVISOR_LIMB.  The
+			 * result is a (N+1)-bit approximation to 1/DIVISOR_LIMB, with the
+			 * most significant bit (with weight 2**N) implicit.
+			 */
+			/* Special case for DIVISOR_LIMB == 100...000.  */
+			if (!(divisor_limb << 1))
+				divisor_limb_inverted = ~(mpi_limb_t) 0;
+			else
+				udiv_qrnnd(divisor_limb_inverted, dummy,
+						-divisor_limb, 0, divisor_limb);
+
+			i = dividend_size - 1;
+			r = dividend_ptr[i];
+
+			if (r >= divisor_limb)
+				r = 0;
+			else
+				quot_ptr[i--] = 0;
+
+			for ( ; i >= 0; i--) {
+				n0 = dividend_ptr[i];
+				UDIV_QRNND_PREINV(quot_ptr[i], r, r,
+						n0, divisor_limb, divisor_limb_inverted);
+			}
+			return r;
+		}
+	} else {
+		if (UDIV_NEEDS_NORMALIZATION) {
+			int normalization_steps;
+
+			normalization_steps = count_leading_zeros(divisor_limb);
+			if (normalization_steps) {
+				divisor_limb <<= normalization_steps;
+
+				n1 = dividend_ptr[dividend_size - 1];
+				r = n1 >> (BITS_PER_MPI_LIMB - normalization_steps);
+
+				/* Possible optimization:
+				 * if (r == 0
+				 * && divisor_limb > ((n1 << normalization_steps)
+				 *		   | (dividend_ptr[dividend_size - 2] >> ...)))
+				 * ...one division less...
+				 */
+				for (i = dividend_size - 2; i >= 0; i--) {
+					n0 = dividend_ptr[i];
+					udiv_qrnnd(quot_ptr[i + 1], r, r,
+						((n1 << normalization_steps)
+						 | (n0 >> (BITS_PER_MPI_LIMB - normalization_steps))),
+						divisor_limb);
+					n1 = n0;
+				}
+				udiv_qrnnd(quot_ptr[0], r, r,
+						n1 << normalization_steps,
+						divisor_limb);
+				return r >> normalization_steps;
+			}
+		}
+		/* No normalization needed, either because udiv_qrnnd doesn't require
+		 * it, or because DIVISOR_LIMB is already normalized.
+		 */
+		i = dividend_size - 1;
+		r = dividend_ptr[i];
+
+		if (r >= divisor_limb)
+			r = 0;
+		else
+			quot_ptr[i--] = 0;
+
+		for (; i >= 0; i--) {
+			n0 = dividend_ptr[i];
+			udiv_qrnnd(quot_ptr[i], r, r, n0, divisor_limb);
+		}
+		return r;
+	}
+}
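Stripped of normalization and pre-inversion, mpihelp_divmod_1() is
schoolbook long division by a single limb. The sketch below shows only
that core loop, assuming 32-bit limbs stored least significant first and
a 64-bit type for the two-limb intermediate. divmod_1_ref() is a
hypothetical name and the divisor is assumed to be non-zero.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: divide the n-limb number num[] by d.
 * Writes n quotient limbs to q (q may alias num) and returns the
 * single-limb remainder.  Assumption: d != 0.
 */
static uint32_t divmod_1_ref(uint32_t *q, const uint32_t *num, size_t n,
			     uint32_t d)
{
	uint64_t r = 0;
	size_t i;

	/* Walk from the most significant limb down to the least. */
	for (i = n; i > 0; i--) {
		/* Two-limb dividend (r, num[i-1]); r < d keeps the quotient in one limb. */
		uint64_t cur = (r << 32) | num[i - 1];

		q[i - 1] = (uint32_t)(cur / d);
		r = cur % d;
	}
	return (uint32_t)r;
}
```

Each iteration corresponds to one udiv_qrnnd() call in the patch, with the
running remainder r as the high limb of the next dividend.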
diff --git a/lib/mpi/mpih-mul.c b/lib/mpi/mpih-mul.c
index a93647564054..e5f1c84e3c48 100644
--- a/lib/mpi/mpih-mul.c
+++ b/lib/mpi/mpih-mul.c
@@ -317,6 +317,31 @@ mpih_sqr_n(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t size, mpi_ptr_t tspace)
 	}
 }
 
+
+void mpihelp_mul_n(mpi_ptr_t prodp,
+		mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size)
+{
+	if (up == vp) {
+		if (size < KARATSUBA_THRESHOLD)
+			mpih_sqr_n_basecase(prodp, up, size);
+		else {
+			mpi_ptr_t tspace;
+			tspace = mpi_alloc_limb_space(2 * size);
+			mpih_sqr_n(prodp, up, size, tspace);
+			mpi_free_limb_space(tspace);
+		}
+	} else {
+		if (size < KARATSUBA_THRESHOLD)
+			mul_n_basecase(prodp, up, vp, size);
+		else {
+			mpi_ptr_t tspace;
+			tspace = mpi_alloc_limb_space(2 * size);
+			mul_n(prodp, up, vp, size, tspace);
+			mpi_free_limb_space(tspace);
+		}
+	}
+}
+
 int
 mpihelp_mul_karatsuba_case(mpi_ptr_t prodp,
 			   mpi_ptr_t up, mpi_size_t usize,
diff --git a/lib/mpi/mpiutil.c b/lib/mpi/mpiutil.c
index 20ed0f766787..e4046f74f336 100644
--- a/lib/mpi/mpiutil.c
+++ b/lib/mpi/mpiutil.c
@@ -20,6 +20,63 @@
 
 #include "mpi-internal.h"
 
+/* Constants allocated right away at startup.  */
+static MPI constants[MPI_NUMBER_OF_CONSTANTS];
+
+/* Initialize the MPI subsystem.  This is called early, which allows
+ * some initialization to be done without taking care of threading issues.
+ */
+static int __init mpi_init(void)
+{
+	int idx;
+	unsigned long value;
+
+	for (idx = 0; idx < MPI_NUMBER_OF_CONSTANTS; idx++) {
+		switch (idx) {
+		case MPI_C_ZERO:
+			value = 0;
+			break;
+		case MPI_C_ONE:
+			value = 1;
+			break;
+		case MPI_C_TWO:
+			value = 2;
+			break;
+		case MPI_C_THREE:
+			value = 3;
+			break;
+		case MPI_C_FOUR:
+			value = 4;
+			break;
+		case MPI_C_EIGHT:
+			value = 8;
+			break;
+		default:
+			pr_err("MPI: invalid mpi_const selector %d\n", idx);
+			return -EFAULT;
+		}
+		constants[idx] = mpi_alloc_set_ui(value);
+		constants[idx]->flags = (16|32);
+	}
+
+	return 0;
+}
+postcore_initcall(mpi_init);
+
+/* Return a constant MPI described by NO which is one of the
+ * MPI_C_xxx macros.  There is no need to copy this returned value; it
+ * may be used directly.
+ */
+MPI mpi_const(enum gcry_mpi_constants no)
+{
+	if ((int)no < 0 || no >= MPI_NUMBER_OF_CONSTANTS)
+		pr_err("MPI: invalid mpi_const selector %d\n", no);
+	if (!constants[no])
+		pr_err("MPI: MPI subsystem not initialized\n");
+	return constants[no];
+}
+EXPORT_SYMBOL_GPL(mpi_const);
+
 /****************
  * Note:  It was a bad idea to use the number of limbs to allocate
  *	  because on a alpha the limbs are large but we normally need
@@ -106,6 +163,15 @@ int mpi_resize(MPI a, unsigned nlimbs)
 	return 0;
 }
 
+void mpi_clear(MPI a)
+{
+	if (!a)
+		return;
+	a->nlimbs = 0;
+	a->flags = 0;
+}
+EXPORT_SYMBOL_GPL(mpi_clear);
+
 void mpi_free(MPI a)
 {
 	if (!a)
@@ -122,5 +188,143 @@ void mpi_free(MPI a)
 }
 EXPORT_SYMBOL_GPL(mpi_free);
 
+/****************
+ * Note: This copy function should not interpret the MPI
+ *	 but copy it transparently.
+ */
+MPI mpi_copy(MPI a)
+{
+	int i;
+	MPI b;
+
+	if (a) {
+		b = mpi_alloc(a->nlimbs);
+		b->nlimbs = a->nlimbs;
+		b->sign = a->sign;
+		b->flags = a->flags;
+		b->flags &= ~(16|32); /* Reset the immutable and constant flags. */
+		for (i = 0; i < b->nlimbs; i++)
+			b->d[i] = a->d[i];
+	} else
+		b = NULL;
+	return b;
+}
+
+/****************
+ * This function allocates an MPI which is optimized to hold
+ * a value as large as the one given in the argument and allocates it
+ * with the same flags as A.
+ */
+MPI mpi_alloc_like(MPI a)
+{
+	MPI b;
+
+	if (a) {
+		b = mpi_alloc(a->nlimbs);
+		b->nlimbs = 0;
+		b->sign = 0;
+		b->flags = a->flags;
+	} else
+		b = NULL;
+
+	return b;
+}
+
+
+/* Set U into W and release U.  If W is NULL only U will be released. */
+void mpi_snatch(MPI w, MPI u)
+{
+	if (w) {
+		mpi_assign_limb_space(w, u->d, u->alloced);
+		w->nlimbs = u->nlimbs;
+		w->sign   = u->sign;
+		w->flags  = u->flags;
+		u->alloced = 0;
+		u->nlimbs = 0;
+		u->d = NULL;
+	}
+	mpi_free(u);
+}
+
+
+MPI mpi_set(MPI w, MPI u)
+{
+	mpi_ptr_t wp, up;
+	mpi_size_t usize = u->nlimbs;
+	int usign = u->sign;
+
+	if (!w)
+		w = mpi_alloc(mpi_get_nlimbs(u));
+	RESIZE_IF_NEEDED(w, usize);
+	wp = w->d;
+	up = u->d;
+	MPN_COPY(wp, up, usize);
+	w->nlimbs = usize;
+	w->flags = u->flags;
+	w->flags &= ~(16|32); /* Reset the immutable and constant flags.  */
+	w->sign = usign;
+	return w;
+}
+EXPORT_SYMBOL_GPL(mpi_set);
+
+MPI mpi_set_ui(MPI w, unsigned long u)
+{
+	if (!w)
+		w = mpi_alloc(1);
+	/* FIXME: If U is 0 we have no need to resize and thus possibly
+	 * allocate the limbs.
+	 */
+	RESIZE_IF_NEEDED(w, 1);
+	w->d[0] = u;
+	w->nlimbs = u ? 1 : 0;
+	w->sign = 0;
+	w->flags = 0;
+	return w;
+}
+EXPORT_SYMBOL_GPL(mpi_set_ui);
+
+MPI mpi_alloc_set_ui(unsigned long u)
+{
+	MPI w = mpi_alloc(1);
+	w->d[0] = u;
+	w->nlimbs = u ? 1 : 0;
+	w->sign = 0;
+	return w;
+}
+
+/****************
+ * Swap the value of A and B, when SWAP is 1.
+ * Leave the value when SWAP is 0.
+ * This implementation should be constant-time regardless of SWAP.
+ */
+void mpi_swap_cond(MPI a, MPI b, unsigned long swap)
+{
+	mpi_size_t i;
+	mpi_size_t nlimbs;
+	mpi_limb_t mask = ((mpi_limb_t)0) - swap;
+	mpi_limb_t x;
+
+	if (a->alloced > b->alloced)
+		nlimbs = b->alloced;
+	else
+		nlimbs = a->alloced;
+	if (a->nlimbs > nlimbs || b->nlimbs > nlimbs)
+		return;
+
+	for (i = 0; i < nlimbs; i++) {
+		x = mask & (a->d[i] ^ b->d[i]);
+		a->d[i] = a->d[i] ^ x;
+		b->d[i] = b->d[i] ^ x;
+	}
+
+	x = mask & (a->nlimbs ^ b->nlimbs);
+	a->nlimbs = a->nlimbs ^ x;
+	b->nlimbs = b->nlimbs ^ x;
+
+	x = mask & (a->sign ^ b->sign);
+	a->sign = a->sign ^ x;
+	b->sign = b->sign ^ x;
+}
+
 MODULE_DESCRIPTION("Multiprecision maths library");
 MODULE_LICENSE("GPL");
-- 
2.17.1


* [PATCH v5 3/8] lib/mpi: Introduce ec implementation to MPI library
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

Import the elliptic curve (EC) implementation from libgcrypt as the
basic elliptic curve algorithm. It integrates cleanly with the
existing MPI implementation.
Other algorithms, such as SM2, will be developed on top of the MPI
ECC code.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
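Note: point_swap_cond() and mpih_set_cond() in ec.c, like
mpi_swap_cond(), rely on a branch-free mask idiom. A minimal standalone
sketch of that idiom follows, assuming 64-bit limbs. The names limb_t
and limbs_swap_cond() are hypothetical, chosen only for this
illustration, and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mpi_limb_t (assumed to be 64 bits here). */
typedef uint64_t limb_t;

/*
 * Swap the n limbs of a and b when swap is 1; leave both untouched when
 * swap is 0.  The same instructions run in either case, so the choice is
 * not visible through branching.  swap must be exactly 0 or 1.
 */
static void limbs_swap_cond(limb_t *a, limb_t *b, size_t n, limb_t swap)
{
	limb_t mask = (limb_t)0 - swap;	/* all ones if swap == 1, else 0 */
	size_t i;

	for (i = 0; i < n; i++) {
		limb_t x = mask & (a[i] ^ b[i]);

		a[i] ^= x;
		b[i] ^= x;
	}
}
```

With swap == 0 the mask is zero and every XOR is a no-op; with
swap == 1 the mask is all ones and x is the full difference of the two
limbs, so XORing it into both sides exchanges them.
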
 include/linux/mpi.h |  105 +++
 lib/mpi/Makefile    |    1 +
 lib/mpi/ec.c        | 1509 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 1615 insertions(+)
 create mode 100644 lib/mpi/ec.c
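
Note: the 2^255 - 19 routines depend on 2^255 being congruent to 19
modulo p. ec_mulm_25519() appears to build 19 * hi as
(hi << 4) + 3 * hi and add it to the low part of the product. The same
folding step, scaled down to a toy modulus 2^31 - 19 so that it fits in
a uint64_t, can be sketched as below. The names fold_reduce, K, C and P
are assumptions made for this illustration, and the toy modulus is not
used anywhere in the patch.

```c
#include <stdint.h>

#define K 31			/* toy exponent; the patch works with 255 */
#define C 19u			/* 2^K is congruent to C modulo P */

static const uint64_t P = ((uint64_t)1 << K) - C;

/* Reduce x modulo P for any x < 2^62, without a division. */
static uint64_t fold_reduce(uint64_t x)
{
	const uint64_t mask = ((uint64_t)1 << K) - 1;

	/* x = hi * 2^K + lo and 2^K == C (mod P), so x == lo + C * hi. */
	x = (x & mask) + C * (x >> K);	/* now x < 20 * 2^31 */
	x = (x & mask) + C * (x >> K);	/* now x < 2^31 + 19 * 19 */
	while (x >= P)			/* final correction, at most once */
		x -= P;
	return x;
}
```

Unlike this sketch, the routines in the patch perform the final
correction with a masked add instead of a data-dependent loop.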

diff --git a/include/linux/mpi.h b/include/linux/mpi.h
index 2dddf4c6e011..20a31d5c29d2 100644
--- a/include/linux/mpi.h
+++ b/include/linux/mpi.h
@@ -155,6 +155,111 @@ void mpi_fdiv_q(MPI quot, MPI dividend, MPI divisor);
 /*-- mpi-inv.c --*/
 int mpi_invm(MPI x, MPI a, MPI n);
 
+/*-- ec.c --*/
+
+/* Object to represent a point in projective coordinates */
+struct gcry_mpi_point {
+	MPI x;
+	MPI y;
+	MPI z;
+};
+
+typedef struct gcry_mpi_point *MPI_POINT;
+
+/* Models describing an elliptic curve */
+enum gcry_mpi_ec_models {
+	/* The Short Weierstrass equation is
+	 *      y^2 = x^3 + ax + b
+	 */
+	MPI_EC_WEIERSTRASS = 0,
+	/* The Montgomery equation is
+	 *      by^2 = x^3 + ax^2 + x
+	 */
+	MPI_EC_MONTGOMERY,
+	/* The Twisted Edwards equation is
+	 *      ax^2 + y^2 = 1 + bx^2y^2
+	 * Note that we use 'b' instead of the commonly used 'd'.
+	 */
+	MPI_EC_EDWARDS
+};
+
+/* Dialects used with elliptic curves */
+enum ecc_dialects {
+	ECC_DIALECT_STANDARD = 0,
+	ECC_DIALECT_ED25519,
+	ECC_DIALECT_SAFECURVE
+};
+
+/* This context is used with all our EC functions. */
+struct mpi_ec_ctx {
+	enum gcry_mpi_ec_models model; /* The model describing this curve. */
+	enum ecc_dialects dialect;     /* The ECC dialect used with the curve. */
+	int flags;                     /* Public key flags (not always used). */
+	unsigned int nbits;            /* Number of bits.  */
+
+	/* Domain parameters.  Note that they may not all be set and if set
+	 * the MPIs may be flagged as constant.
+	 */
+	MPI p;         /* Prime specifying the field GF(p).  */
+	MPI a;         /* First coefficient of the Weierstrass equation.  */
+	MPI b;         /* Second coefficient of the Weierstrass equation.  */
+	MPI_POINT G;   /* Base point (generator).  */
+	MPI n;         /* Order of G.  */
+	unsigned int h;       /* Cofactor.  */
+
+	/* The actual key.  May not be set.  */
+	MPI_POINT Q;   /* Public key.   */
+	MPI d;         /* Private key.  */
+
+	const char *name;      /* Name of the curve.  */
+
+	/* This structure is private to mpi/ec.c! */
+	struct {
+		struct {
+			unsigned int a_is_pminus3:1;
+			unsigned int two_inv_p:1;
+		} valid; /* Flags to help setting the helper vars below.  */
+
+		int a_is_pminus3;  /* True if A = P - 3. */
+
+		MPI two_inv_p;
+
+		mpi_barrett_t p_barrett;
+
+		/* Scratch variables.  */
+		MPI scratch[11];
+
+		/* Helper for fast reduction.  */
+		/*   int nist_nbits; /\* If this is a NIST curve, the # of bits. *\/ */
+		/*   MPI s[10]; */
+		/*   MPI c; */
+	} t;
+
+	/* Curve specific computation routines for the field.  */
+	void (*addm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*subm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ec);
+	void (*mulm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*pow2)(MPI w, const MPI b, struct mpi_ec_ctx *ctx);
+	void (*mul2)(MPI w, MPI u, struct mpi_ec_ctx *ctx);
+};
+
+void mpi_ec_init(struct mpi_ec_ctx *ctx, enum gcry_mpi_ec_models model,
+			enum ecc_dialects dialect,
+			int flags, MPI p, MPI a, MPI b);
+void mpi_ec_deinit(struct mpi_ec_ctx *ctx);
+MPI_POINT mpi_point_new(unsigned int nbits);
+void mpi_point_release(MPI_POINT p);
+void mpi_point_init(MPI_POINT p);
+void mpi_point_free_parts(MPI_POINT p);
+int mpi_ec_get_affine(MPI x, MPI y, MPI_POINT point, struct mpi_ec_ctx *ctx);
+void mpi_ec_add_points(MPI_POINT result,
+			MPI_POINT p1, MPI_POINT p2,
+			struct mpi_ec_ctx *ctx);
+void mpi_ec_mul_point(MPI_POINT result,
+			MPI scalar, MPI_POINT point,
+			struct mpi_ec_ctx *ctx);
+int mpi_ec_curve_point(MPI_POINT point, struct mpi_ec_ctx *ctx);
+
 /* inline functions */
 
 /**
diff --git a/lib/mpi/Makefile b/lib/mpi/Makefile
index 5f40f93ff3d9..0d07e3d2e0f4 100644
--- a/lib/mpi/Makefile
+++ b/lib/mpi/Makefile
@@ -13,6 +13,7 @@ mpi-y = \
 	generic_mpih-rshift.o		\
 	generic_mpih-sub1.o		\
 	generic_mpih-add1.o		\
+	ec.o				\
 	mpicoder.o			\
 	mpi-add.o			\
 	mpi-bit.o			\
diff --git a/lib/mpi/ec.c b/lib/mpi/ec.c
new file mode 100644
index 000000000000..c21470122dfc
--- /dev/null
+++ b/lib/mpi/ec.c
@@ -0,0 +1,1509 @@
+/* ec.c -  Elliptic Curve functions
+ * Copyright (C) 2007 Free Software Foundation, Inc.
+ * Copyright (C) 2013 g10 Code GmbH
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "mpi-internal.h"
+#include "longlong.h"
+
+#define point_init(a)  mpi_point_init((a))
+#define point_free(a)  mpi_point_free_parts((a))
+
+#define log_error(fmt, ...) pr_err(fmt, ##__VA_ARGS__)
+#define log_fatal(fmt, ...) pr_err(fmt, ##__VA_ARGS__)
+
+#define DIM(v) (sizeof(v)/sizeof((v)[0]))
+
+
+/* Create a new point object.  NBITS gives the size in bits of one
+ * coordinate; it is only used to pre-allocate some resources and
+ * might also be passed as 0 to use a default value.
+ */
+MPI_POINT mpi_point_new(unsigned int nbits)
+{
+	MPI_POINT p;
+
+	(void)nbits;  /* Currently not used.  */
+
+	p = kmalloc(sizeof(*p), GFP_KERNEL);
+	if (p)
+		mpi_point_init(p);
+	return p;
+}
+EXPORT_SYMBOL_GPL(mpi_point_new);
+
+/* Release the point object P.  P may be NULL. */
+void mpi_point_release(MPI_POINT p)
+{
+	if (p) {
+		mpi_point_free_parts(p);
+		kfree(p);
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_point_release);
+
+/* Initialize the fields of a point object.  mpi_point_free_parts
+ * may be used to release the fields.
+ */
+void mpi_point_init(MPI_POINT p)
+{
+	p->x = mpi_new(0);
+	p->y = mpi_new(0);
+	p->z = mpi_new(0);
+}
+EXPORT_SYMBOL_GPL(mpi_point_init);
+
+/* Release the parts of a point object. */
+void mpi_point_free_parts(MPI_POINT p)
+{
+	mpi_free(p->x); p->x = NULL;
+	mpi_free(p->y); p->y = NULL;
+	mpi_free(p->z); p->z = NULL;
+}
+EXPORT_SYMBOL_GPL(mpi_point_free_parts);
+
+/* Set the value from S into D.  */
+static void point_set(MPI_POINT d, MPI_POINT s)
+{
+	mpi_set(d->x, s->x);
+	mpi_set(d->y, s->y);
+	mpi_set(d->z, s->z);
+}
+
+static void point_resize(MPI_POINT p, struct mpi_ec_ctx *ctx)
+{
+	size_t nlimbs = ctx->p->nlimbs;
+
+	mpi_resize(p->x, nlimbs);
+	p->x->nlimbs = nlimbs;
+	mpi_resize(p->z, nlimbs);
+	p->z->nlimbs = nlimbs;
+
+	if (ctx->model != MPI_EC_MONTGOMERY) {
+		mpi_resize(p->y, nlimbs);
+		p->y->nlimbs = nlimbs;
+	}
+}
+
+static void point_swap_cond(MPI_POINT d, MPI_POINT s, unsigned long swap,
+		struct mpi_ec_ctx *ctx)
+{
+	mpi_swap_cond(d->x, s->x, swap);
+	if (ctx->model != MPI_EC_MONTGOMERY)
+		mpi_swap_cond(d->y, s->y, swap);
+	mpi_swap_cond(d->z, s->z, swap);
+}
+
+
+/* W = W mod P.  */
+static void ec_mod(MPI w, struct mpi_ec_ctx *ec)
+{
+	if (ec->t.p_barrett)
+		mpi_mod_barrett(w, w, ec->t.p_barrett);
+	else
+		mpi_mod(w, w, ec->p);
+}
+
+static void ec_addm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_add(w, u, v);
+	ec_mod(w, ctx);
+}
+
+static void ec_subm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ec)
+{
+	mpi_sub(w, u, v);
+	while (w->sign)
+		mpi_add(w, w, ec->p);
+	/*ec_mod(w, ec);*/
+}
+
+static void ec_mulm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_mul(w, u, v);
+	ec_mod(w, ctx);
+}
+
+/* W = 2 * U mod P.  */
+static void ec_mul2(MPI w, MPI u, struct mpi_ec_ctx *ctx)
+{
+	mpi_lshift(w, u, 1);
+	ec_mod(w, ctx);
+}
+
+static void ec_powm(MPI w, const MPI b, const MPI e,
+		struct mpi_ec_ctx *ctx)
+{
+	mpi_powm(w, b, e, ctx->p);
+	/* mpi_abs(w); */
+}
+
+/* Shortcut for
+ * ec_powm(B, B, mpi_const(MPI_C_TWO), ctx);
+ * for easier optimization.
+ */
+static void ec_pow2(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	/* Using mpi_mul is slightly faster (at least on amd64).  */
+	/* mpi_powm(w, b, mpi_const(MPI_C_TWO), ctx->p); */
+	ec_mulm(w, b, b, ctx);
+}
+
+/* Shortcut for
+ * ec_powm(B, B, mpi_const(MPI_C_THREE), ctx);
+ * for easier optimization.
+ */
+static void ec_pow3(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	mpi_powm(w, b, mpi_const(MPI_C_THREE), ctx->p);
+}
+
+static void ec_invm(MPI x, MPI a, struct mpi_ec_ctx *ctx)
+{
+	if (!mpi_invm(x, a, ctx->p))
+		log_error("ec_invm: inverse does not exist:\n");
+}
+
+static void mpih_set_cond(mpi_ptr_t wp, mpi_ptr_t up,
+		mpi_size_t usize, unsigned long set)
+{
+	mpi_size_t i;
+	mpi_limb_t mask = ((mpi_limb_t)0) - set;
+	mpi_limb_t x;
+
+	for (i = 0; i < usize; i++) {
+		x = mask & (wp[i] ^ up[i]);
+		wp[i] = wp[i] ^ x;
+	}
+}
+
+/* Routines for 2^255 - 19.  */
+
+#define LIMB_SIZE_25519 ((256+BITS_PER_MPI_LIMB-1)/BITS_PER_MPI_LIMB)
+
+static void ec_addm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_25519;
+	mpi_limb_t n[LIMB_SIZE_25519];
+	mpi_limb_t borrow;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("addm_25519: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	mpihelp_add_n(wp, up, vp, wsize);
+	borrow = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+}
+
+static void ec_subm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_25519;
+	mpi_limb_t n[LIMB_SIZE_25519];
+	mpi_limb_t borrow;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("subm_25519: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	borrow = mpihelp_sub_n(wp, up, vp, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+}
+
+static void ec_mulm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_25519;
+	mpi_limb_t n[LIMB_SIZE_25519*2];
+	mpi_limb_t m[LIMB_SIZE_25519+1];
+	mpi_limb_t cy;
+	int msb;
+
+	(void)ctx;
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("mulm_25519: different sizes\n");
+
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	mpihelp_mul_n(n, up, vp, wsize);
+	memcpy(wp, n, wsize * BYTES_PER_MPI_LIMB);
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+
+	memcpy(m, n+LIMB_SIZE_25519-1, (wsize+1) * BYTES_PER_MPI_LIMB);
+	mpihelp_rshift(m, m, LIMB_SIZE_25519+1, (255 % BITS_PER_MPI_LIMB));
+
+	memcpy(n, m, wsize * BYTES_PER_MPI_LIMB);
+	cy = mpihelp_lshift(m, m, LIMB_SIZE_25519, 4);
+	m[LIMB_SIZE_25519] = cy;
+	cy = mpihelp_add_n(m, m, n, wsize);
+	m[LIMB_SIZE_25519] += cy;
+	cy = mpihelp_add_n(m, m, n, wsize);
+	m[LIMB_SIZE_25519] += cy;
+	cy = mpihelp_add_n(m, m, n, wsize);
+	m[LIMB_SIZE_25519] += cy;
+
+	cy = mpihelp_add_n(wp, wp, m, wsize);
+	m[LIMB_SIZE_25519] += cy;
+
+	memset(m, 0, wsize * BYTES_PER_MPI_LIMB);
+	msb = (wp[LIMB_SIZE_25519-1] >> (255 % BITS_PER_MPI_LIMB));
+	m[0] = (m[LIMB_SIZE_25519] * 2 + msb) * 19;
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+	mpihelp_add_n(wp, wp, m, wsize);
+
+	m[0] = 0;
+	cy = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
+	mpih_set_cond(m, ctx->p->d, wsize, (cy != 0UL));
+	mpihelp_add_n(wp, wp, m, wsize);
+}
+
+static void ec_mul2_25519(MPI w, MPI u, struct mpi_ec_ctx *ctx)
+{
+	ec_addm_25519(w, u, u, ctx);
+}
+
+static void ec_pow2_25519(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	ec_mulm_25519(w, b, b, ctx);
+}
+
+/* Routines for 2^448 - 2^224 - 1.  */
+
+#define LIMB_SIZE_448 ((448+BITS_PER_MPI_LIMB-1)/BITS_PER_MPI_LIMB)
+#define LIMB_SIZE_HALF_448 ((LIMB_SIZE_448+1)/2)
+
+static void ec_addm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_448;
+	mpi_limb_t n[LIMB_SIZE_448];
+	mpi_limb_t cy;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("addm_448: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	cy = mpihelp_add_n(wp, up, vp, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (cy != 0UL));
+	mpihelp_sub_n(wp, wp, n, wsize);
+}
+
+static void ec_subm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_448;
+	mpi_limb_t n[LIMB_SIZE_448];
+	mpi_limb_t borrow;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("subm_448: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	borrow = mpihelp_sub_n(wp, up, vp, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+}
+
+static void ec_mulm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_448;
+	mpi_limb_t n[LIMB_SIZE_448*2];
+	mpi_limb_t a2[LIMB_SIZE_HALF_448];
+	mpi_limb_t a3[LIMB_SIZE_HALF_448];
+	mpi_limb_t b0[LIMB_SIZE_HALF_448];
+	mpi_limb_t b1[LIMB_SIZE_HALF_448];
+	mpi_limb_t cy;
+	int i;
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	mpi_limb_t b1_rest, a3_rest;
+#endif
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("mulm_448: different sizes\n");
+
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	mpihelp_mul_n(n, up, vp, wsize);
+
+	for (i = 0; i < (wsize + 1) / 2; i++) {
+		b0[i] = n[i];
+		b1[i] = n[i+wsize/2];
+		a2[i] = n[i+wsize];
+		a3[i] = n[i+wsize+wsize/2];
+	}
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	b0[LIMB_SIZE_HALF_448-1] &= ((mpi_limb_t)1UL << 32)-1;
+	a2[LIMB_SIZE_HALF_448-1] &= ((mpi_limb_t)1UL << 32)-1;
+
+	b1_rest = 0;
+	a3_rest = 0;
+
+	for (i = (wsize + 1) / 2 - 1; i >= 0; i--) {
+		mpi_limb_t b1v, a3v;
+		b1v = b1[i];
+		a3v = a3[i];
+		b1[i] = (b1_rest << 32) | (b1v >> 32);
+		a3[i] = (a3_rest << 32) | (a3v >> 32);
+		b1_rest = b1v & (((mpi_limb_t)1UL << 32)-1);
+		a3_rest = a3v & (((mpi_limb_t)1UL << 32)-1);
+	}
+#endif
+
+	cy = mpihelp_add_n(b0, b0, a2, LIMB_SIZE_HALF_448);
+	cy += mpihelp_add_n(b0, b0, a3, LIMB_SIZE_HALF_448);
+	for (i = 0; i < (wsize + 1) / 2; i++)
+		wp[i] = b0[i];
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	wp[LIMB_SIZE_HALF_448-1] &= (((mpi_limb_t)1UL << 32)-1);
+#endif
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	cy = b0[LIMB_SIZE_HALF_448-1] >> 32;
+#endif
+
+	cy = mpihelp_add_1(b1, b1, LIMB_SIZE_HALF_448, cy);
+	cy += mpihelp_add_n(b1, b1, a2, LIMB_SIZE_HALF_448);
+	cy += mpihelp_add_n(b1, b1, a3, LIMB_SIZE_HALF_448);
+	cy += mpihelp_add_n(b1, b1, a3, LIMB_SIZE_HALF_448);
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	b1_rest = 0;
+	for (i = (wsize + 1) / 2 - 1; i >= 0; i--) {
+		mpi_limb_t b1v = b1[i];
+		b1[i] = (b1_rest << 32) | (b1v >> 32);
+		b1_rest = b1v & (((mpi_limb_t)1UL << 32)-1);
+	}
+	wp[LIMB_SIZE_HALF_448-1] |= (b1_rest << 32);
+#endif
+	for (i = 0; i < wsize / 2; i++)
+		wp[i+(wsize + 1) / 2] = b1[i];
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	cy = b1[LIMB_SIZE_HALF_448-1];
+#endif
+
+	memset(n, 0, wsize * BYTES_PER_MPI_LIMB);
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	n[LIMB_SIZE_HALF_448-1] = cy << 32;
+#else
+	n[LIMB_SIZE_HALF_448] = cy;
+#endif
+	n[0] = cy;
+	mpihelp_add_n(wp, wp, n, wsize);
+
+	memset(n, 0, wsize * BYTES_PER_MPI_LIMB);
+	cy = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (cy != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+}
+
+static void ec_mul2_448(MPI w, MPI u, struct mpi_ec_ctx *ctx)
+{
+	ec_addm_448(w, u, u, ctx);
+}
+
+static void ec_pow2_448(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	ec_mulm_448(w, b, b, ctx);
+}
+
+struct field_table {
+	const char *p;
+
+	/* computation routines for the field.  */
+	void (*addm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*subm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*mulm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*mul2)(MPI w, MPI u, struct mpi_ec_ctx *ctx);
+	void (*pow2)(MPI w, const MPI b, struct mpi_ec_ctx *ctx);
+};
+
+static const struct field_table field_table[] = {
+	{
+		"0x7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED",
+		ec_addm_25519,
+		ec_subm_25519,
+		ec_mulm_25519,
+		ec_mul2_25519,
+		ec_pow2_25519
+	},
+	{
+		"0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE"
+		"FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
+		ec_addm_448,
+		ec_subm_448,
+		ec_mulm_448,
+		ec_mul2_448,
+		ec_pow2_448
+	},
+	{ NULL, NULL, NULL, NULL, NULL, NULL },
+};
+
+/* Force recomputation of all helper variables.  */
+static void mpi_ec_get_reset(struct mpi_ec_ctx *ec)
+{
+	ec->t.valid.a_is_pminus3 = 0;
+	ec->t.valid.two_inv_p = 0;
+}
+
+/* Accessor for helper variable.  */
+static int ec_get_a_is_pminus3(struct mpi_ec_ctx *ec)
+{
+	MPI tmp;
+
+	if (!ec->t.valid.a_is_pminus3) {
+		ec->t.valid.a_is_pminus3 = 1;
+		tmp = mpi_alloc_like(ec->p);
+		mpi_sub_ui(tmp, ec->p, 3);
+		ec->t.a_is_pminus3 = !mpi_cmp(ec->a, tmp);
+		mpi_free(tmp);
+	}
+
+	return ec->t.a_is_pminus3;
+}
+
+/* Accessor for helper variable.  */
+static MPI ec_get_two_inv_p(struct mpi_ec_ctx *ec)
+{
+	if (!ec->t.valid.two_inv_p) {
+		ec->t.valid.two_inv_p = 1;
+		if (!ec->t.two_inv_p)
+			ec->t.two_inv_p = mpi_alloc(0);
+		ec_invm(ec->t.two_inv_p, mpi_const(MPI_C_TWO), ec);
+	}
+	return ec->t.two_inv_p;
+}
+
+static const char *const curve25519_bad_points[] = {
+	"0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffed",
+	"0x0000000000000000000000000000000000000000000000000000000000000000",
+	"0x0000000000000000000000000000000000000000000000000000000000000001",
+	"0x00b8495f16056286fdb1329ceb8d09da6ac49ff1fae35616aeb8413b7c7aebe0",
+	"0x57119fd0dd4e22d8868e1c58c45c44045bef839c55b1d0b1248c50a3bc959c5f",
+	"0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffec",
+	"0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffee",
+	NULL
+};
+
+static const char *const curve448_bad_points[] = {
+	"0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffe"
+	"ffffffffffffffffffffffffffffffffffffffffffffffffffffffff",
+	"0x00000000000000000000000000000000000000000000000000000000"
+	"00000000000000000000000000000000000000000000000000000000",
+	"0x00000000000000000000000000000000000000000000000000000000"
+	"00000000000000000000000000000000000000000000000000000001",
+	"0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffe"
+	"fffffffffffffffffffffffffffffffffffffffffffffffffffffffe",
+	"0xffffffffffffffffffffffffffffffffffffffffffffffffffffffff"
+	"00000000000000000000000000000000000000000000000000000000",
+	NULL
+};
+
+static const char *const *bad_points_table[] = {
+	curve25519_bad_points,
+	curve448_bad_points,
+};
+
+static void mpi_ec_coefficient_normalize(MPI a, MPI p)
+{
+	if (a->sign) {
+		mpi_resize(a, p->nlimbs);
+		mpihelp_sub_n(a->d, p->d, a->d, p->nlimbs);
+		a->nlimbs = p->nlimbs;
+		a->sign = 0;
+	}
+}
+
+/* This function initializes a context for an elliptic curve based on
+ * the field GF(p).  P is the prime specifying this field, A is the first
+ * coefficient and B the second.  CTX is expected to be zeroized.
+ */
+void mpi_ec_init(struct mpi_ec_ctx *ctx, enum gcry_mpi_ec_models model,
+			enum ecc_dialects dialect,
+			int flags, MPI p, MPI a, MPI b)
+{
+	int i;
+	static int use_barrett = -1 /* TODO: 1 or -1 */;
+
+	mpi_ec_coefficient_normalize(a, p);
+	mpi_ec_coefficient_normalize(b, p);
+
+	/* Fixme: Do we want to check some constraints? e.g.  a < p  */
+
+	ctx->model = model;
+	ctx->dialect = dialect;
+	ctx->flags = flags;
+	if (dialect == ECC_DIALECT_ED25519)
+		ctx->nbits = 256;
+	else
+		ctx->nbits = mpi_get_nbits(p);
+	ctx->p = mpi_copy(p);
+	ctx->a = mpi_copy(a);
+	ctx->b = mpi_copy(b);
+
+	ctx->t.p_barrett = use_barrett > 0 ? mpi_barrett_init(ctx->p, 0) : NULL;
+
+	mpi_ec_get_reset(ctx);
+
+	if (model == MPI_EC_MONTGOMERY) {
+		for (i = 0; i < DIM(bad_points_table); i++) {
+			MPI p_candidate = mpi_scanval(bad_points_table[i][0]);
+			int match_p = !mpi_cmp(ctx->p, p_candidate);
+			int j;
+
+			mpi_free(p_candidate);
+			if (!match_p)
+				continue;
+
+			for (j = 0; j < DIM(ctx->t.scratch) && bad_points_table[i][j]; j++)
+				ctx->t.scratch[j] = mpi_scanval(bad_points_table[i][j]);
+		}
+	} else {
+		/* Allocate scratch variables.  */
+		for (i = 0; i < DIM(ctx->t.scratch); i++)
+			ctx->t.scratch[i] = mpi_alloc_like(ctx->p);
+	}
+
+	ctx->addm = ec_addm;
+	ctx->subm = ec_subm;
+	ctx->mulm = ec_mulm;
+	ctx->mul2 = ec_mul2;
+	ctx->pow2 = ec_pow2;
+
+	for (i = 0; field_table[i].p; i++) {
+		MPI f_p;
+
+		f_p = mpi_scanval(field_table[i].p);
+		if (!f_p)
+			break;
+
+		if (!mpi_cmp(p, f_p)) {
+			ctx->addm = field_table[i].addm;
+			ctx->subm = field_table[i].subm;
+			ctx->mulm = field_table[i].mulm;
+			ctx->mul2 = field_table[i].mul2;
+			ctx->pow2 = field_table[i].pow2;
+			mpi_free(f_p);
+
+			mpi_resize(ctx->a, ctx->p->nlimbs);
+			ctx->a->nlimbs = ctx->p->nlimbs;
+
+			mpi_resize(ctx->b, ctx->p->nlimbs);
+			ctx->b->nlimbs = ctx->p->nlimbs;
+
+			for (i = 0; i < DIM(ctx->t.scratch) && ctx->t.scratch[i]; i++)
+				ctx->t.scratch[i]->nlimbs = ctx->p->nlimbs;
+
+			break;
+		}
+
+		mpi_free(f_p);
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_ec_init);
+
+void mpi_ec_deinit(struct mpi_ec_ctx *ctx)
+{
+	int i;
+
+	mpi_barrett_free(ctx->t.p_barrett);
+
+	/* Domain parameter.  */
+	mpi_free(ctx->p);
+	mpi_free(ctx->a);
+	mpi_free(ctx->b);
+	mpi_point_release(ctx->G);
+	mpi_free(ctx->n);
+
+	/* The key.  */
+	mpi_point_release(ctx->Q);
+	mpi_free(ctx->d);
+
+	/* Private data of ec.c.  */
+	mpi_free(ctx->t.two_inv_p);
+
+	for (i = 0; i < DIM(ctx->t.scratch); i++)
+		mpi_free(ctx->t.scratch[i]);
+}
+EXPORT_SYMBOL_GPL(mpi_ec_deinit);
+
+/* Compute the affine coordinates from the projective coordinates in
+ * POINT.  Set them into X and Y.  If one coordinate is not required,
+ * X or Y may be passed as NULL.  CTX is the usual context. Returns: 0
+ * on success or !0 if POINT is at infinity.
+ */
+int mpi_ec_get_affine(MPI x, MPI y, MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	if (!mpi_cmp_ui(point->z, 0))
+		return -1;
+
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS: /* Using Jacobian coordinates.  */
+		{
+			MPI z1, z2, z3;
+
+			z1 = mpi_new(0);
+			z2 = mpi_new(0);
+			ec_invm(z1, point->z, ctx);  /* z1 = z^(-1) mod p  */
+			ec_mulm(z2, z1, z1, ctx);    /* z2 = z^(-2) mod p  */
+
+			if (x)
+				ec_mulm(x, point->x, z2, ctx);
+
+			if (y) {
+				z3 = mpi_new(0);
+				ec_mulm(z3, z2, z1, ctx);      /* z3 = z^(-3) mod p */
+				ec_mulm(y, point->y, z3, ctx);
+				mpi_free(z3);
+			}
+
+			mpi_free(z2);
+			mpi_free(z1);
+		}
+		return 0;
+
+	case MPI_EC_MONTGOMERY:
+		{
+			if (x)
+				mpi_set(x, point->x);
+
+			if (y) {
+				log_fatal("%s: Getting Y-coordinate on %s is not supported\n",
+						"mpi_ec_get_affine", "Montgomery");
+				return -1;
+			}
+		}
+		return 0;
+
+	case MPI_EC_EDWARDS:
+		{
+			MPI z;
+
+			z = mpi_new(0);
+			ec_invm(z, point->z, ctx);
+
+			mpi_resize(z, ctx->p->nlimbs);
+			z->nlimbs = ctx->p->nlimbs;
+
+			if (x) {
+				mpi_resize(x, ctx->p->nlimbs);
+				x->nlimbs = ctx->p->nlimbs;
+				ctx->mulm(x, point->x, z, ctx);
+			}
+			if (y) {
+				mpi_resize(y, ctx->p->nlimbs);
+				y->nlimbs = ctx->p->nlimbs;
+				ctx->mulm(y, point->y, z, ctx);
+			}
+
+			mpi_free(z);
+		}
+		return 0;
+
+	default:
+		return -1;
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_ec_get_affine);
+
+/*  RESULT = 2 * POINT  (Weierstrass version). */
+static void dup_point_weierstrass(MPI_POINT result,
+		MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+#define x3 (result->x)
+#define y3 (result->y)
+#define z3 (result->z)
+#define t1 (ctx->t.scratch[0])
+#define t2 (ctx->t.scratch[1])
+#define t3 (ctx->t.scratch[2])
+#define l1 (ctx->t.scratch[3])
+#define l2 (ctx->t.scratch[4])
+#define l3 (ctx->t.scratch[5])
+
+	if (!mpi_cmp_ui(point->y, 0) || !mpi_cmp_ui(point->z, 0)) {
+		/* P_y == 0 || P_z == 0 => [1:1:0] */
+		mpi_set_ui(x3, 1);
+		mpi_set_ui(y3, 1);
+		mpi_set_ui(z3, 0);
+	} else {
+		if (ec_get_a_is_pminus3(ctx)) {
+			/* Use the faster case.  */
+			/* L1 = 3(X - Z^2)(X + Z^2) */
+			/*                          T1: used for Z^2. */
+			/*                          T2: used for the right term. */
+			ec_pow2(t1, point->z, ctx);
+			ec_subm(l1, point->x, t1, ctx);
+			ec_mulm(l1, l1, mpi_const(MPI_C_THREE), ctx);
+			ec_addm(t2, point->x, t1, ctx);
+			ec_mulm(l1, l1, t2, ctx);
+		} else {
+			/* Standard case. */
+			/* L1 = 3X^2 + aZ^4 */
+			/*                          T1: used for aZ^4. */
+			ec_pow2(l1, point->x, ctx);
+			ec_mulm(l1, l1, mpi_const(MPI_C_THREE), ctx);
+			ec_powm(t1, point->z, mpi_const(MPI_C_FOUR), ctx);
+			ec_mulm(t1, t1, ctx->a, ctx);
+			ec_addm(l1, l1, t1, ctx);
+		}
+		/* Z3 = 2YZ */
+		ec_mulm(z3, point->y, point->z, ctx);
+		ec_mul2(z3, z3, ctx);
+
+		/* L2 = 4XY^2 */
+		/*                              T2: used for Y2; required later. */
+		ec_pow2(t2, point->y, ctx);
+		ec_mulm(l2, t2, point->x, ctx);
+		ec_mulm(l2, l2, mpi_const(MPI_C_FOUR), ctx);
+
+		/* X3 = L1^2 - 2L2 */
+		/*                              T1: used for 2L2. */
+		ec_pow2(x3, l1, ctx);
+		ec_mul2(t1, l2, ctx);
+		ec_subm(x3, x3, t1, ctx);
+
+		/* L3 = 8Y^4 */
+		/*                              T2: taken from above. */
+		ec_pow2(t2, t2, ctx);
+		ec_mulm(l3, t2, mpi_const(MPI_C_EIGHT), ctx);
+
+		/* Y3 = L1(L2 - X3) - L3 */
+		ec_subm(y3, l2, x3, ctx);
+		ec_mulm(y3, y3, l1, ctx);
+		ec_subm(y3, y3, l3, ctx);
+	}
+
+#undef x3
+#undef y3
+#undef z3
+#undef t1
+#undef t2
+#undef t3
+#undef l1
+#undef l2
+#undef l3
+}
+
+/*  RESULT = 2 * POINT  (Montgomery version). */
+static void dup_point_montgomery(MPI_POINT result,
+				MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	(void)result;
+	(void)point;
+	(void)ctx;
+	log_fatal("%s: %s not yet supported\n",
+			"mpi_ec_dup_point", "Montgomery");
+}
+
+/*  RESULT = 2 * POINT  (Twisted Edwards version). */
+static void dup_point_edwards(MPI_POINT result,
+		MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+#define X1 (point->x)
+#define Y1 (point->y)
+#define Z1 (point->z)
+#define X3 (result->x)
+#define Y3 (result->y)
+#define Z3 (result->z)
+#define B (ctx->t.scratch[0])
+#define C (ctx->t.scratch[1])
+#define D (ctx->t.scratch[2])
+#define E (ctx->t.scratch[3])
+#define F (ctx->t.scratch[4])
+#define H (ctx->t.scratch[5])
+#define J (ctx->t.scratch[6])
+
+	/* Compute: (X_3 : Y_3 : Z_3) = 2( X_1 : Y_1 : Z_1 ) */
+
+	/* B = (X_1 + Y_1)^2  */
+	ctx->addm(B, X1, Y1, ctx);
+	ctx->pow2(B, B, ctx);
+
+	/* C = X_1^2 */
+	/* D = Y_1^2 */
+	ctx->pow2(C, X1, ctx);
+	ctx->pow2(D, Y1, ctx);
+
+	/* E = aC */
+	if (ctx->dialect == ECC_DIALECT_ED25519)
+		ctx->subm(E, ctx->p, C, ctx);
+	else
+		ctx->mulm(E, ctx->a, C, ctx);
+
+	/* F = E + D */
+	ctx->addm(F, E, D, ctx);
+
+	/* H = Z_1^2 */
+	ctx->pow2(H, Z1, ctx);
+
+	/* J = F - 2H */
+	ctx->mul2(J, H, ctx);
+	ctx->subm(J, F, J, ctx);
+
+	/* X_3 = (B - C - D) · J */
+	ctx->subm(X3, B, C, ctx);
+	ctx->subm(X3, X3, D, ctx);
+	ctx->mulm(X3, X3, J, ctx);
+
+	/* Y_3 = F · (E - D) */
+	ctx->subm(Y3, E, D, ctx);
+	ctx->mulm(Y3, Y3, F, ctx);
+
+	/* Z_3 = F · J */
+	ctx->mulm(Z3, F, J, ctx);
+
+#undef X1
+#undef Y1
+#undef Z1
+#undef X3
+#undef Y3
+#undef Z3
+#undef B
+#undef C
+#undef D
+#undef E
+#undef F
+#undef H
+#undef J
+}
+
+/*  RESULT = 2 * POINT  */
+static void
+mpi_ec_dup_point(MPI_POINT result, MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS:
+		dup_point_weierstrass(result, point, ctx);
+		break;
+	case MPI_EC_MONTGOMERY:
+		dup_point_montgomery(result, point, ctx);
+		break;
+	case MPI_EC_EDWARDS:
+		dup_point_edwards(result, point, ctx);
+		break;
+	}
+}
+
+/* RESULT = P1 + P2  (Weierstrass version).*/
+static void add_points_weierstrass(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+#define x1 (p1->x)
+#define y1 (p1->y)
+#define z1 (p1->z)
+#define x2 (p2->x)
+#define y2 (p2->y)
+#define z2 (p2->z)
+#define x3 (result->x)
+#define y3 (result->y)
+#define z3 (result->z)
+#define l1 (ctx->t.scratch[0])
+#define l2 (ctx->t.scratch[1])
+#define l3 (ctx->t.scratch[2])
+#define l4 (ctx->t.scratch[3])
+#define l5 (ctx->t.scratch[4])
+#define l6 (ctx->t.scratch[5])
+#define l7 (ctx->t.scratch[6])
+#define l8 (ctx->t.scratch[7])
+#define l9 (ctx->t.scratch[8])
+#define t1 (ctx->t.scratch[9])
+#define t2 (ctx->t.scratch[10])
+
+	if ((!mpi_cmp(x1, x2)) && (!mpi_cmp(y1, y2)) && (!mpi_cmp(z1, z2))) {
+		/* Same point; need to call the duplicate function.  */
+		mpi_ec_dup_point(result, p1, ctx);
+	} else if (!mpi_cmp_ui(z1, 0)) {
+		/* P1 is at infinity.  */
+		mpi_set(x3, p2->x);
+		mpi_set(y3, p2->y);
+		mpi_set(z3, p2->z);
+	} else if (!mpi_cmp_ui(z2, 0)) {
+		/* P2 is at infinity.  */
+		mpi_set(x3, p1->x);
+		mpi_set(y3, p1->y);
+		mpi_set(z3, p1->z);
+	} else {
+		int z1_is_one = !mpi_cmp_ui(z1, 1);
+		int z2_is_one = !mpi_cmp_ui(z2, 1);
+
+		/* l1 = x1 z2^2  */
+		/* l2 = x2 z1^2  */
+		if (z2_is_one)
+			mpi_set(l1, x1);
+		else {
+			ec_pow2(l1, z2, ctx);
+			ec_mulm(l1, l1, x1, ctx);
+		}
+		if (z1_is_one)
+			mpi_set(l2, x2);
+		else {
+			ec_pow2(l2, z1, ctx);
+			ec_mulm(l2, l2, x2, ctx);
+		}
+		/* l3 = l1 - l2 */
+		ec_subm(l3, l1, l2, ctx);
+		/* l4 = y1 z2^3  */
+		ec_powm(l4, z2, mpi_const(MPI_C_THREE), ctx);
+		ec_mulm(l4, l4, y1, ctx);
+		/* l5 = y2 z1^3  */
+		ec_powm(l5, z1, mpi_const(MPI_C_THREE), ctx);
+		ec_mulm(l5, l5, y2, ctx);
+		/* l6 = l4 - l5  */
+		ec_subm(l6, l4, l5, ctx);
+
+		if (!mpi_cmp_ui(l3, 0)) {
+			if (!mpi_cmp_ui(l6, 0)) {
+				/* P1 and P2 are the same - use duplicate function. */
+				mpi_ec_dup_point(result, p1, ctx);
+			} else {
+				/* P1 is the inverse of P2.  */
+				mpi_set_ui(x3, 1);
+				mpi_set_ui(y3, 1);
+				mpi_set_ui(z3, 0);
+			}
+		} else {
+			/* l7 = l1 + l2  */
+			ec_addm(l7, l1, l2, ctx);
+			/* l8 = l4 + l5  */
+			ec_addm(l8, l4, l5, ctx);
+			/* z3 = z1 z2 l3  */
+			ec_mulm(z3, z1, z2, ctx);
+			ec_mulm(z3, z3, l3, ctx);
+			/* x3 = l6^2 - l7 l3^2  */
+			ec_pow2(t1, l6, ctx);
+			ec_pow2(t2, l3, ctx);
+			ec_mulm(t2, t2, l7, ctx);
+			ec_subm(x3, t1, t2, ctx);
+			/* l9 = l7 l3^2 - 2 x3  */
+			ec_mul2(t1, x3, ctx);
+			ec_subm(l9, t2, t1, ctx);
+			/* y3 = (l9 l6 - l8 l3^3)/2  */
+			ec_mulm(l9, l9, l6, ctx);
+			ec_powm(t1, l3, mpi_const(MPI_C_THREE), ctx); /* fixme: Use saved value*/
+			ec_mulm(t1, t1, l8, ctx);
+			ec_subm(y3, l9, t1, ctx);
+			ec_mulm(y3, y3, ec_get_two_inv_p(ctx), ctx);
+		}
+	}
+
+#undef x1
+#undef y1
+#undef z1
+#undef x2
+#undef y2
+#undef z2
+#undef x3
+#undef y3
+#undef z3
+#undef l1
+#undef l2
+#undef l3
+#undef l4
+#undef l5
+#undef l6
+#undef l7
+#undef l8
+#undef l9
+#undef t1
+#undef t2
+}
+
+/* RESULT = P1 + P2  (Montgomery version).*/
+static void add_points_montgomery(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+	(void)result;
+	(void)p1;
+	(void)p2;
+	(void)ctx;
+	log_fatal("%s: %s not yet supported\n",
+			"mpi_ec_add_points", "Montgomery");
+}
+
+/* RESULT = P1 + P2  (Twisted Edwards version).*/
+static void add_points_edwards(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+#define X1 (p1->x)
+#define Y1 (p1->y)
+#define Z1 (p1->z)
+#define X2 (p2->x)
+#define Y2 (p2->y)
+#define Z2 (p2->z)
+#define X3 (result->x)
+#define Y3 (result->y)
+#define Z3 (result->z)
+#define A (ctx->t.scratch[0])
+#define B (ctx->t.scratch[1])
+#define C (ctx->t.scratch[2])
+#define D (ctx->t.scratch[3])
+#define E (ctx->t.scratch[4])
+#define F (ctx->t.scratch[5])
+#define G (ctx->t.scratch[6])
+#define tmp (ctx->t.scratch[7])
+
+	point_resize(result, ctx);
+
+	/* Compute: (X_3 : Y_3 : Z_3) = (X_1 : Y_1 : Z_1) + (X_2 : Y_2 : Z_2) */
+
+	/* A = Z1 · Z2 */
+	ctx->mulm(A, Z1, Z2, ctx);
+
+	/* B = A^2 */
+	ctx->pow2(B, A, ctx);
+
+	/* C = X1 · X2 */
+	ctx->mulm(C, X1, X2, ctx);
+
+	/* D = Y1 · Y2 */
+	ctx->mulm(D, Y1, Y2, ctx);
+
+	/* E = d · C · D */
+	ctx->mulm(E, ctx->b, C, ctx);
+	ctx->mulm(E, E, D, ctx);
+
+	/* F = B - E */
+	ctx->subm(F, B, E, ctx);
+
+	/* G = B + E */
+	ctx->addm(G, B, E, ctx);
+
+	/* X_3 = A · F · ((X_1 + Y_1) · (X_2 + Y_2) - C - D) */
+	ctx->addm(tmp, X1, Y1, ctx);
+	ctx->addm(X3, X2, Y2, ctx);
+	ctx->mulm(X3, X3, tmp, ctx);
+	ctx->subm(X3, X3, C, ctx);
+	ctx->subm(X3, X3, D, ctx);
+	ctx->mulm(X3, X3, F, ctx);
+	ctx->mulm(X3, X3, A, ctx);
+
+	/* Y_3 = A · G · (D - aC) */
+	if (ctx->dialect == ECC_DIALECT_ED25519) {
+		ctx->addm(Y3, D, C, ctx);
+	} else {
+		ctx->mulm(Y3, ctx->a, C, ctx);
+		ctx->subm(Y3, D, Y3, ctx);
+	}
+	ctx->mulm(Y3, Y3, G, ctx);
+	ctx->mulm(Y3, Y3, A, ctx);
+
+	/* Z_3 = F · G */
+	ctx->mulm(Z3, F, G, ctx);
+
+
+#undef X1
+#undef Y1
+#undef Z1
+#undef X2
+#undef Y2
+#undef Z2
+#undef X3
+#undef Y3
+#undef Z3
+#undef A
+#undef B
+#undef C
+#undef D
+#undef E
+#undef F
+#undef G
+#undef tmp
+}
+
+/* Compute a step of Montgomery Ladder (only use X and Z in the point).
+ * Inputs:  P1, P2, and x-coordinate of DIF = P1 - P2.
+ * Outputs: PRD = 2 * P1 and  SUM = P1 + P2.
+ */
+static void montgomery_ladder(MPI_POINT prd, MPI_POINT sum,
+		MPI_POINT p1, MPI_POINT p2, MPI dif_x,
+		struct mpi_ec_ctx *ctx)
+{
+	ctx->addm(sum->x, p2->x, p2->z, ctx);
+	ctx->subm(p2->z, p2->x, p2->z, ctx);
+	ctx->addm(prd->x, p1->x, p1->z, ctx);
+	ctx->subm(p1->z, p1->x, p1->z, ctx);
+	ctx->mulm(p2->x, p1->z, sum->x, ctx);
+	ctx->mulm(p2->z, prd->x, p2->z, ctx);
+	ctx->pow2(p1->x, prd->x, ctx);
+	ctx->pow2(p1->z, p1->z, ctx);
+	ctx->addm(sum->x, p2->x, p2->z, ctx);
+	ctx->subm(p2->z, p2->x, p2->z, ctx);
+	ctx->mulm(prd->x, p1->x, p1->z, ctx);
+	ctx->subm(p1->z, p1->x, p1->z, ctx);
+	ctx->pow2(sum->x, sum->x, ctx);
+	ctx->pow2(sum->z, p2->z, ctx);
+	ctx->mulm(prd->z, p1->z, ctx->a, ctx); /* CTX->A: (a-2)/4 */
+	ctx->mulm(sum->z, sum->z, dif_x, ctx);
+	ctx->addm(prd->z, p1->x, prd->z, ctx);
+	ctx->mulm(prd->z, prd->z, p1->z, ctx);
+}
+
+/* RESULT = P1 + P2 */
+void mpi_ec_add_points(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS:
+		add_points_weierstrass(result, p1, p2, ctx);
+		break;
+	case MPI_EC_MONTGOMERY:
+		add_points_montgomery(result, p1, p2, ctx);
+		break;
+	case MPI_EC_EDWARDS:
+		add_points_edwards(result, p1, p2, ctx);
+		break;
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_ec_add_points);
+
+/* Scalar point multiplication - the main function for ECC.  It takes
+ * an integer SCALAR and a POINT as well as the usual context CTX.
+ * RESULT will be set to the resulting point.
+ */
+void mpi_ec_mul_point(MPI_POINT result,
+			MPI scalar, MPI_POINT point,
+			struct mpi_ec_ctx *ctx)
+{
+	MPI x1, y1, z1, k, h, yy;
+	unsigned int i, loops;
+	struct gcry_mpi_point p1, p2, p1inv;
+
+	if (ctx->model == MPI_EC_EDWARDS) {
+		/* Simple left to right binary method.  Algorithm 3.27 from
+		 * {author={Hankerson, Darrel and Menezes, Alfred J. and Vanstone, Scott},
+		 *  title = {Guide to Elliptic Curve Cryptography},
+		 *  year = {2003}, isbn = {038795273X},
+		 *  url = {http://www.cacr.math.uwaterloo.ca/ecc/},
+		 *  publisher = {Springer-Verlag New York, Inc.}}
+		 */
+		unsigned int nbits;
+		int j;
+
+		if (mpi_cmp(scalar, ctx->p) >= 0)
+			nbits = mpi_get_nbits(scalar);
+		else
+			nbits = mpi_get_nbits(ctx->p);
+
+		mpi_set_ui(result->x, 0);
+		mpi_set_ui(result->y, 1);
+		mpi_set_ui(result->z, 1);
+		point_resize(point, ctx);
+
+		point_resize(result, ctx);
+		point_resize(point, ctx);
+
+		for (j = nbits-1; j >= 0; j--) {
+			mpi_ec_dup_point(result, result, ctx);
+			if (mpi_test_bit(scalar, j))
+				mpi_ec_add_points(result, result, point, ctx);
+		}
+		return;
+	} else if (ctx->model == MPI_EC_MONTGOMERY) {
+		unsigned int nbits;
+		int j;
+		struct gcry_mpi_point p1_, p2_;
+		MPI_POINT q1, q2, prd, sum;
+		unsigned long sw;
+		mpi_size_t rsize;
+		int scalar_copied = 0;
+
+		/* Compute scalar point multiplication with Montgomery Ladder.
+		 * Note that we don't use Y-coordinate in the points at all.
+		 * RESULT->Y will be filled by zero.
+		 */
+
+		nbits = mpi_get_nbits(scalar);
+		point_init(&p1);
+		point_init(&p2);
+		point_init(&p1_);
+		point_init(&p2_);
+		mpi_set_ui(p1.x, 1);
+		mpi_free(p2.x);
+		p2.x = mpi_copy(point->x);
+		mpi_set_ui(p2.z, 1);
+
+		point_resize(&p1, ctx);
+		point_resize(&p2, ctx);
+		point_resize(&p1_, ctx);
+		point_resize(&p2_, ctx);
+
+		mpi_resize(point->x, ctx->p->nlimbs);
+		point->x->nlimbs = ctx->p->nlimbs;
+
+		q1 = &p1;
+		q2 = &p2;
+		prd = &p1_;
+		sum = &p2_;
+
+		for (j = nbits-1; j >= 0; j--) {
+			MPI_POINT t;
+
+			sw = mpi_test_bit(scalar, j);
+			point_swap_cond(q1, q2, sw, ctx);
+			montgomery_ladder(prd, sum, q1, q2, point->x, ctx);
+			point_swap_cond(prd, sum, sw, ctx);
+			t = q1;  q1 = prd;  prd = t;
+			t = q2;  q2 = sum;  sum = t;
+		}
+
+		mpi_clear(result->y);
+		sw = (nbits & 1);
+		point_swap_cond(&p1, &p1_, sw, ctx);
+
+		rsize = p1.z->nlimbs;
+		MPN_NORMALIZE(p1.z->d, rsize);
+		if (rsize == 0) {
+			mpi_set_ui(result->x, 1);
+			mpi_set_ui(result->z, 0);
+		} else {
+			z1 = mpi_new(0);
+			ec_invm(z1, p1.z, ctx);
+			ec_mulm(result->x, p1.x, z1, ctx);
+			mpi_set_ui(result->z, 1);
+			mpi_free(z1);
+		}
+
+		point_free(&p1);
+		point_free(&p2);
+		point_free(&p1_);
+		point_free(&p2_);
+		if (scalar_copied)
+			mpi_free(scalar);
+		return;
+	}
+
+	x1 = mpi_alloc_like(ctx->p);
+	y1 = mpi_alloc_like(ctx->p);
+	h  = mpi_alloc_like(ctx->p);
+	k  = mpi_copy(scalar);
+	yy = mpi_copy(point->y);
+
+	if (mpi_has_sign(k)) {
+		k->sign = 0;
+		ec_invm(yy, yy, ctx);
+	}
+
+	if (!mpi_cmp_ui(point->z, 1)) {
+		mpi_set(x1, point->x);
+		mpi_set(y1, yy);
+	} else {
+		MPI z2, z3;
+
+		z2 = mpi_alloc_like(ctx->p);
+		z3 = mpi_alloc_like(ctx->p);
+		ec_mulm(z2, point->z, point->z, ctx);
+		ec_mulm(z3, point->z, z2, ctx);
+		ec_invm(z2, z2, ctx);
+		ec_mulm(x1, point->x, z2, ctx);
+		ec_invm(z3, z3, ctx);
+		ec_mulm(y1, yy, z3, ctx);
+		mpi_free(z2);
+		mpi_free(z3);
+	}
+	z1 = mpi_copy(mpi_const(MPI_C_ONE));
+
+	mpi_mul(h, k, mpi_const(MPI_C_THREE)); /* h = 3k */
+	loops = mpi_get_nbits(h);
+	if (loops < 2) {
+		/* If SCALAR is zero, the above mpi_mul sets H to zero and thus
+		 * LOOPs will be zero.  To avoid an underflow of I in the main
+		 * loop we set LOOP to 2 and the result to (0,0,0).
+		 */
+		loops = 2;
+		mpi_clear(result->x);
+		mpi_clear(result->y);
+		mpi_clear(result->z);
+	} else {
+		mpi_set(result->x, point->x);
+		mpi_set(result->y, yy);
+		mpi_set(result->z, point->z);
+	}
+	mpi_free(yy); yy = NULL;
+
+	p1.x = x1; x1 = NULL;
+	p1.y = y1; y1 = NULL;
+	p1.z = z1; z1 = NULL;
+	point_init(&p2);
+	point_init(&p1inv);
+
+	/* Invert point: y = p - y mod p  */
+	point_set(&p1inv, &p1);
+	ec_subm(p1inv.y, ctx->p, p1inv.y, ctx);
+
+	for (i = loops-2; i > 0; i--) {
+		mpi_ec_dup_point(result, result, ctx);
+		if (mpi_test_bit(h, i) == 1 && mpi_test_bit(k, i) == 0) {
+			point_set(&p2, result);
+			mpi_ec_add_points(result, &p2, &p1, ctx);
+		}
+		if (mpi_test_bit(h, i) == 0 && mpi_test_bit(k, i) == 1) {
+			point_set(&p2, result);
+			mpi_ec_add_points(result, &p2, &p1inv, ctx);
+		}
+	}
+
+	point_free(&p1);
+	point_free(&p2);
+	point_free(&p1inv);
+	mpi_free(h);
+	mpi_free(k);
+}
+EXPORT_SYMBOL_GPL(mpi_ec_mul_point);
+
+/* Return true if POINT is on the curve described by CTX.  */
+int mpi_ec_curve_point(MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	int res = 0;
+	MPI x, y, w;
+
+	x = mpi_new(0);
+	y = mpi_new(0);
+	w = mpi_new(0);
+
+	/* Check that the point is in range.  This needs to be done here and
+	 * not after conversion to affine coordinates.
+	 */
+	if (mpi_cmpabs(point->x, ctx->p) >= 0)
+		goto leave;
+	if (mpi_cmpabs(point->y, ctx->p) >= 0)
+		goto leave;
+	if (mpi_cmpabs(point->z, ctx->p) >= 0)
+		goto leave;
+
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS:
+		{
+			MPI xxx;
+
+			if (mpi_ec_get_affine(x, y, point, ctx))
+				goto leave;
+
+			xxx = mpi_new(0);
+
+			/* y^2 == x^3 + a·x + b */
+			ec_pow2(y, y, ctx);
+
+			ec_pow3(xxx, x, ctx);
+			ec_mulm(w, ctx->a, x, ctx);
+			ec_addm(w, w, ctx->b, ctx);
+			ec_addm(w, w, xxx, ctx);
+
+			if (!mpi_cmp(y, w))
+				res = 1;
+
+			mpi_free(xxx);
+		}
+		break;
+
+	case MPI_EC_MONTGOMERY:
+		{
+#define xx y
+			/* With Montgomery curve, only X-coordinate is valid. */
+			if (mpi_ec_get_affine(x, NULL, point, ctx))
+				goto leave;
+
+			/* The equation is: b * y^2 == x^3 + a · x^2 + x */
+			/* We check if right hand is quadratic residue or not by
+			 * Euler's criterion.
+			 */
+			/* CTX->A has (a-2)/4 and CTX->B has b^-1 */
+			ec_mulm(w, ctx->a, mpi_const(MPI_C_FOUR), ctx);
+			ec_addm(w, w, mpi_const(MPI_C_TWO), ctx);
+			ec_mulm(w, w, x, ctx);
+			ec_pow2(xx, x, ctx);
+			ec_addm(w, w, xx, ctx);
+			ec_addm(w, w, mpi_const(MPI_C_ONE), ctx);
+			ec_mulm(w, w, x, ctx);
+			ec_mulm(w, w, ctx->b, ctx);
+#undef xx
+			/* Compute Euler's criterion: w^(p-1)/2 */
+#define p_minus1 y
+			ec_subm(p_minus1, ctx->p, mpi_const(MPI_C_ONE), ctx);
+			mpi_rshift(p_minus1, p_minus1, 1);
+			ec_powm(w, w, p_minus1, ctx);
+
+			res = !mpi_cmp_ui(w, 1);
+#undef p_minus1
+		}
+		break;
+
+	case MPI_EC_EDWARDS:
+		{
+			if (mpi_ec_get_affine(x, y, point, ctx))
+				goto leave;
+
+			mpi_resize(w, ctx->p->nlimbs);
+			w->nlimbs = ctx->p->nlimbs;
+
+			/* a · x^2 + y^2 - 1 - b · x^2 · y^2 == 0 */
+			ctx->pow2(x, x, ctx);
+			ctx->pow2(y, y, ctx);
+			if (ctx->dialect == ECC_DIALECT_ED25519)
+				ctx->subm(w, ctx->p, x, ctx);
+			else
+				ctx->mulm(w, ctx->a, x, ctx);
+			ctx->addm(w, w, y, ctx);
+			ctx->mulm(x, x, y, ctx);
+			ctx->mulm(x, x, ctx->b, ctx);
+			ctx->subm(w, w, x, ctx);
+			if (!mpi_cmp_ui(w, 1))
+				res = 1;
+		}
+		break;
+	}
+
+leave:
+	mpi_free(w);
+	mpi_free(x);
+	mpi_free(y);
+
+	return res;
+}
+EXPORT_SYMBOL_GPL(mpi_ec_curve_point);
-- 
2.17.1



* [PATCH v5 4/8] crypto: sm2 - introduce OSCCA SM2 asymmetric cipher algorithm
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

This new module implements the SM2 public key algorithm. It was
published by the State Encryption Management Bureau, China.
List of specifications for SM2 elliptic curve public key cryptography:

* GM/T 0003.1-2012
* GM/T 0003.2-2012
* GM/T 0003.3-2012
* GM/T 0003.4-2012
* GM/T 0003.5-2012

IETF: https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
oscca: http://www.oscca.gov.cn/sca/xxgk/2010-12/17/content_1002386.shtml
scctc: http://www.gmbz.org.cn/main/bzlb.html
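
As a rough illustration of the verification this module performs (range
check on r and s, t = (r + s) mod n, (x1, y1) = sG + tP, accept when
(e + x1) mod n == r), here is a minimal userspace sketch. It assumes a
textbook toy curve, y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1)
of order 19, instead of the real sm2p256v1 parameters, and plain ints
instead of MPIs. All names (toy_point, toy_add, toy_mul, toy_sign,
toy_verify) are hypothetical; toy_sign exists only to produce a test
signature and is not something the patch provides.

```c
#include <assert.h>

/* Toy curve y^2 = x^3 + 2x + 2 over F_17; base point (5, 1) has order 19. */
enum { TOY_P = 17, TOY_A = 2, TOY_N = 19 };

struct toy_point {
	int x, y;
	int inf;	/* 1 for the point at infinity */
};

static int mod(int v, int m)
{
	v %= m;
	return v < 0 ? v + m : v;
}

/* Modular inverse for prime m via Fermat: v^(m-2) mod m. */
static int inv(int v, int m)
{
	int r = 1, e = m - 2;

	v = mod(v, m);
	assert(v != 0);
	while (e) {
		if (e & 1)
			r = r * v % m;
		v = v * v % m;
		e >>= 1;
	}
	return r;
}

static struct toy_point toy_add(struct toy_point a, struct toy_point b)
{
	struct toy_point r = { 0, 0, 1 };
	int l;

	if (a.inf)
		return b;
	if (b.inf)
		return a;
	if (a.x == b.x && mod(a.y + b.y, TOY_P) == 0)
		return r;	/* a == -b */
	if (a.x == b.x)		/* a == b: tangent slope */
		l = mod((3 * a.x * a.x + TOY_A) * inv(2 * a.y, TOY_P), TOY_P);
	else			/* chord slope */
		l = mod((b.y - a.y) * inv(b.x - a.x, TOY_P), TOY_P);
	r.x = mod(l * l - a.x - b.x, TOY_P);
	r.y = mod(l * (a.x - r.x) - a.y, TOY_P);
	r.inf = 0;
	return r;
}

/* Left-to-right double-and-add; k must be non-negative. */
static struct toy_point toy_mul(int k, struct toy_point p)
{
	struct toy_point r = { 0, 0, 1 };
	int i;

	for (i = 30; i >= 0; i--) {
		r = toy_add(r, r);
		if ((k >> i) & 1)
			r = toy_add(r, p);
	}
	return r;
}

/* Test helper only (signing is not part of the patch):
 * r = (e + x(kG)) mod n, s = (1 + d)^-1 * (k - r*d) mod n.
 */
static void toy_sign(int e, int d, int k, struct toy_point g, int *r, int *s)
{
	*r = mod(e + toy_mul(k, g).x, TOY_N);
	*s = mod(inv(1 + d, TOY_N) * mod(k - *r * d, TOY_N), TOY_N);
}

/* Returns 1 when (r, s) verifies for digest e under public key q. */
static int toy_verify(int e, int r, int s, struct toy_point g,
		      struct toy_point q)
{
	struct toy_point sum;
	int t;

	if (r < 1 || r > TOY_N - 1 || s < 1 || s > TOY_N - 1)
		return 0;		/* r, s must be in [1, n-1] */
	t = mod(r + s, TOY_N);
	if (t == 0)
		return 0;
	sum = toy_add(toy_mul(s, g), toy_mul(t, q));	/* sG + tP */
	if (sum.inf)
		return 0;
	return mod(e + sum.x, TOY_N) == r;	/* R == r */
}
```

With private key d = 7 (so P = 7G), nonce k = 3 and digest e = 5, the
helper yields a signature that toy_verify accepts, while the same
signature is rejected for a different digest. The sketch uses affine
coordinates for brevity; the patch itself works on projective points
through the mpi_ec_* helpers.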

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 crypto/Kconfig           |  17 ++
 crypto/Makefile          |   8 +
 crypto/sm2.c             | 473 +++++++++++++++++++++++++++++++++++++++
 crypto/sm2signature.asn1 |   4 +
 include/crypto/sm2.h     |  25 +++
 5 files changed, 527 insertions(+)
 create mode 100644 crypto/sm2.c
 create mode 100644 crypto/sm2signature.asn1
 create mode 100644 include/crypto/sm2.h

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 091c0a0bbf26..de330d658758 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -260,6 +260,23 @@ config CRYPTO_ECRDSA
 	  standard algorithms (called GOST algorithms). Only signature verification
 	  is implemented.
 
+config CRYPTO_SM2
+	tristate "SM2 algorithm"
+	select CRYPTO_SM3
+	select CRYPTO_AKCIPHER
+	select CRYPTO_MANAGER
+	select MPILIB
+	select ASN1
+	help
+	  Generic implementation of the SM2 public key algorithm, as
+	  specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012. It was
+	  published by the State Encryption Management Bureau, China.
+
+	  References:
+	  https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
+	  http://www.oscca.gov.cn/sca/xxgk/2010-12/17/content_1002386.shtml
+	  http://www.gmbz.org.cn/main/bzlb.html
+
 config CRYPTO_CURVE25519
 	tristate "Curve25519 algorithm"
 	select CRYPTO_KPP
diff --git a/crypto/Makefile b/crypto/Makefile
index 4ca12b6044f7..b279483fba50 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -42,6 +42,14 @@ rsa_generic-y += rsa_helper.o
 rsa_generic-y += rsa-pkcs1pad.o
 obj-$(CONFIG_CRYPTO_RSA) += rsa_generic.o
 
+$(obj)/sm2signature.asn1.o: $(obj)/sm2signature.asn1.c $(obj)/sm2signature.asn1.h
+$(obj)/sm2.o: $(obj)/sm2signature.asn1.h
+
+sm2_generic-y += sm2signature.asn1.o
+sm2_generic-y += sm2.o
+
+obj-$(CONFIG_CRYPTO_SM2) += sm2_generic.o
+
 crypto_acompress-y := acompress.o
 crypto_acompress-y += scompress.o
 obj-$(CONFIG_CRYPTO_ACOMP2) += crypto_acompress.o
diff --git a/crypto/sm2.c b/crypto/sm2.c
new file mode 100644
index 000000000000..86da175bcda6
--- /dev/null
+++ b/crypto/sm2.c
@@ -0,0 +1,473 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * SM2 asymmetric public-key algorithm
+ * as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and
+ * described at https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
+ *
+ * Copyright (c) 2020, Alibaba Group.
+ * Authors: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+#include <linux/module.h>
+#include <linux/mpi.h>
+#include <crypto/internal/akcipher.h>
+#include <crypto/akcipher.h>
+#include <crypto/hash.h>
+#include <crypto/sm3_base.h>
+#include <crypto/rng.h>
+#include <crypto/sm2.h>
+#include "sm2signature.asn1.h"
+
+#define MPI_NBYTES(m)   ((mpi_get_nbits(m) + 7) / 8)
+
+struct ecc_domain_parms {
+	const char *desc;           /* Description of the curve.  */
+	unsigned int nbits;         /* Number of bits.  */
+	unsigned int fips:1; /* True if this is a FIPS140-2 approved curve */
+
+	/* The model describing this curve.  This is mainly used to select
+	 * the group equation.
+	 */
+	enum gcry_mpi_ec_models model;
+
+	/* The actual ECC dialect used.  This is used for curve specific
+	 * optimizations and to select encodings etc.
+	 */
+	enum ecc_dialects dialect;
+
+	const char *p;              /* The prime defining the field.  */
+	const char *a, *b;          /* The coefficients.  For Twisted Edwards
+				     * Curves b is used for d.  For Montgomery
+				     * Curves (a,b) has ((A-2)/4,B^-1).
+				     */
+	const char *n;              /* The order of the base point.  */
+	const char *g_x, *g_y;      /* Base point.  */
+	unsigned int h;             /* Cofactor.  */
+};
+
+static const struct ecc_domain_parms sm2_ecp = {
+	.desc = "sm2p256v1",
+	.nbits = 256,
+	.fips = 0,
+	.model = MPI_EC_WEIERSTRASS,
+	.dialect = ECC_DIALECT_STANDARD,
+	.p   = "0xfffffffeffffffffffffffffffffffffffffffff00000000ffffffffffffffff",
+	.a   = "0xfffffffeffffffffffffffffffffffffffffffff00000000fffffffffffffffc",
+	.b   = "0x28e9fa9e9d9f5e344d5a9e4bcf6509a7f39789f515ab8f92ddbcbd414d940e93",
+	.n   = "0xfffffffeffffffffffffffffffffffff7203df6b21c6052b53bbf40939d54123",
+	.g_x = "0x32c4ae2c1f1981195f9904466a39c9948fe30bbff2660be1715a4589334c74c7",
+	.g_y = "0xbc3736a2f4f6779c59bdcee36b692153d0a9877cc62a474002df32e52139f0a0",
+	.h = 1
+};
+
+static int sm2_ec_ctx_init(struct mpi_ec_ctx *ec)
+{
+	const struct ecc_domain_parms *ecp = &sm2_ecp;
+	MPI p, a, b;
+	MPI x, y;
+	int rc = -EINVAL;
+
+	p = mpi_scanval(ecp->p);
+	a = mpi_scanval(ecp->a);
+	b = mpi_scanval(ecp->b);
+	if (!p || !a || !b)
+		goto free_p;
+
+	x = mpi_scanval(ecp->g_x);
+	y = mpi_scanval(ecp->g_y);
+	if (!x || !y)
+		goto free;
+
+	/* mpi_ec_setup_elliptic_curve */
+	ec->G = mpi_point_new(0);
+	if (!ec->G)
+		goto free;
+
+	mpi_set(ec->G->x, x);
+	mpi_set(ec->G->y, y);
+	mpi_set_ui(ec->G->z, 1);
+
+	ec->n = mpi_scanval(ecp->n);
+	if (!ec->n) {
+		mpi_point_release(ec->G);
+		goto free;
+	}
+
+	ec->h = ecp->h;
+	ec->name = ecp->desc;
+	mpi_ec_init(ec, ecp->model, ecp->dialect, 0, p, a, b);
+
+	rc = 0;
+
+free:
+	mpi_free(x);
+	mpi_free(y);
+free_p:
+	mpi_free(p);
+	mpi_free(a);
+	mpi_free(b);
+
+	return rc;
+}
+
+static void sm2_ec_ctx_deinit(struct mpi_ec_ctx *ec)
+{
+	mpi_free(ec->n);
+	mpi_point_release(ec->G);
+
+	mpi_ec_deinit(ec);
+
+	memset(ec, 0, sizeof(*ec));
+}
+
+static int sm2_ec_ctx_reset(struct mpi_ec_ctx *ec)
+{
+	sm2_ec_ctx_deinit(ec);
+	return sm2_ec_ctx_init(ec);
+}
+
+/* RESULT must have been initialized and is set on success to the
+ * point given by VALUE.
+ */
+static int sm2_ecc_os2ec(MPI_POINT result, MPI value)
+{
+	int rc;
+	size_t n;
+	const unsigned char *buf;
+	unsigned char *buf_memory;
+	MPI x, y;
+
+	n = (mpi_get_nbits(value)+7)/8;
+	buf_memory = kmalloc(n, GFP_KERNEL);
+	rc = mpi_print(GCRYMPI_FMT_USG, buf_memory, n, &n, value);
+	if (rc) {
+		kfree(buf_memory);
+		return rc;
+	}
+	buf = buf_memory;
+
+	if (n < 1) {
+		kfree(buf_memory);
+		return -EINVAL;
+	}
+	if (*buf != 4) {
+		kfree(buf_memory);
+		return -EINVAL; /* No support for point compression.  */
+	}
+	if (((n-1)%2)) {
+		kfree(buf_memory);
+		return -EINVAL;
+	}
+	n = (n-1)/2;
+	x = mpi_read_raw_data(buf + 1, n);
+	if (!x) {
+		kfree(buf_memory);
+		return -ENOMEM;
+	}
+	y = mpi_read_raw_data(buf + 1 + n, n);
+	kfree(buf_memory);
+	if (!y) {
+		mpi_free(x);
+		return -ENOMEM;
+	}
+
+	mpi_normalize(x);
+	mpi_normalize(y);
+
+	mpi_set(result->x, x);
+	mpi_set(result->y, y);
+	mpi_set_ui(result->z, 1);
+
+	mpi_free(x);
+	mpi_free(y);
+
+	return 0;
+}
+
+struct sm2_signature_ctx {
+	MPI sig_r;
+	MPI sig_s;
+};
+
+int sm2_get_signature_r(void *context, size_t hdrlen, unsigned char tag,
+				const void *value, size_t vlen)
+{
+	struct sm2_signature_ctx *sig = context;
+
+	if (!value || !vlen)
+		return -EINVAL;
+
+	sig->sig_r = mpi_read_raw_data(value, vlen);
+	if (!sig->sig_r)
+		return -ENOMEM;
+
+	return 0;
+}
+
+int sm2_get_signature_s(void *context, size_t hdrlen, unsigned char tag,
+				const void *value, size_t vlen)
+{
+	struct sm2_signature_ctx *sig = context;
+
+	if (!value || !vlen)
+		return -EINVAL;
+
+	sig->sig_s = mpi_read_raw_data(value, vlen);
+	if (!sig->sig_s)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int sm2_z_digest_update(struct shash_desc *desc,
+			MPI m, unsigned int pbytes)
+{
+	static const unsigned char zero[32];
+	unsigned char *in;
+	unsigned int inlen;
+
+	in = mpi_get_buffer(m, &inlen, NULL);
+	if (!in)
+		return -EINVAL;
+
+	if (inlen < pbytes) {
+		/* padding with zero */
+		crypto_sm3_update(desc, zero, pbytes - inlen);
+		crypto_sm3_update(desc, in, inlen);
+	} else if (inlen > pbytes) {
+		/* skip the starting zero */
+		crypto_sm3_update(desc, in + inlen - pbytes, pbytes);
+	} else {
+		crypto_sm3_update(desc, in, inlen);
+	}
+
+	kfree(in);
+	return 0;
+}
+
+static int sm2_z_digest_update_point(struct shash_desc *desc,
+			MPI_POINT point, struct mpi_ec_ctx *ec, unsigned int pbytes)
+{
+	MPI x, y;
+	int ret = -EINVAL;
+
+	x = mpi_new(0);
+	y = mpi_new(0);
+
+	if (!mpi_ec_get_affine(x, y, point, ec) &&
+		!sm2_z_digest_update(desc, x, pbytes) &&
+		!sm2_z_digest_update(desc, y, pbytes))
+		ret = 0;
+
+	mpi_free(x);
+	mpi_free(y);
+	return ret;
+}
+
+int sm2_compute_z_digest(struct crypto_akcipher *tfm,
+			const unsigned char *id, size_t id_len,
+			unsigned char dgst[SM3_DIGEST_SIZE])
+{
+	struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm);
+	uint16_t bits_len;
+	unsigned char entl[2];
+	SHASH_DESC_ON_STACK(desc, NULL);
+	unsigned int pbytes;
+
+	if (id_len > (USHRT_MAX / 8) || !ec->Q)
+		return -EINVAL;
+
+	bits_len = (uint16_t)(id_len * 8);
+	entl[0] = bits_len >> 8;
+	entl[1] = bits_len & 0xff;
+
+	pbytes = MPI_NBYTES(ec->p);
+
+	/* ZA = H256(ENTLA | IDA | a | b | xG | yG | xA | yA) */
+	sm3_base_init(desc);
+	crypto_sm3_update(desc, entl, 2);
+	crypto_sm3_update(desc, id, id_len);
+
+	if (sm2_z_digest_update(desc, ec->a, pbytes) ||
+		sm2_z_digest_update(desc, ec->b, pbytes) ||
+		sm2_z_digest_update_point(desc, ec->G, ec, pbytes) ||
+		sm2_z_digest_update_point(desc, ec->Q, ec, pbytes))
+		return -EINVAL;
+
+	crypto_sm3_finup(desc, NULL, 0, dgst);
+	return 0;
+}
+EXPORT_SYMBOL(sm2_compute_z_digest);
+
+static int _sm2_verify(struct mpi_ec_ctx *ec, MPI hash, MPI sig_r, MPI sig_s)
+{
+	int rc = -EINVAL;
+	struct gcry_mpi_point sG, tP;
+	MPI t = NULL;
+	MPI x1 = NULL, y1 = NULL;
+
+	mpi_point_init(&sG);
+	mpi_point_init(&tP);
+	x1 = mpi_new(0);
+	y1 = mpi_new(0);
+	t = mpi_new(0);
+
+	/* r, s in [1, n-1] */
+	if (mpi_cmp_ui(sig_r, 1) < 0 || mpi_cmp(sig_r, ec->n) >= 0 ||
+		mpi_cmp_ui(sig_s, 1) < 0 || mpi_cmp(sig_s, ec->n) >= 0) {
+		goto leave;
+	}
+
+	/* t = (r + s) % n, t == 0 */
+	mpi_addm(t, sig_r, sig_s, ec->n);
+	if (mpi_cmp_ui(t, 0) == 0)
+		goto leave;
+
+	/* sG + tP = (x1, y1) */
+	rc = -EBADMSG;
+	mpi_ec_mul_point(&sG, sig_s, ec->G, ec);
+	mpi_ec_mul_point(&tP, t, ec->Q, ec);
+	mpi_ec_add_points(&sG, &sG, &tP, ec);
+	if (mpi_ec_get_affine(x1, y1, &sG, ec))
+		goto leave;
+
+	/* R = (e + x1) % n */
+	mpi_addm(t, hash, x1, ec->n);
+
+	/* check R == r */
+	rc = -EKEYREJECTED;
+	if (mpi_cmp(t, sig_r))
+		goto leave;
+
+	rc = 0;
+
+leave:
+	mpi_point_free_parts(&sG);
+	mpi_point_free_parts(&tP);
+	mpi_free(x1);
+	mpi_free(y1);
+	mpi_free(t);
+
+	return rc;
+}
+
+static int sm2_verify(struct akcipher_request *req)
+{
+	struct crypto_akcipher *tfm = crypto_akcipher_reqtfm(req);
+	struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm);
+	unsigned char *buffer;
+	struct sm2_signature_ctx sig;
+	MPI hash;
+	int ret;
+
+	if (unlikely(!ec->Q))
+		return -EINVAL;
+
+	buffer = kmalloc(req->src_len + req->dst_len, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
+	sg_pcopy_to_buffer(req->src,
+			sg_nents_for_len(req->src, req->src_len + req->dst_len),
+			buffer, req->src_len + req->dst_len, 0);
+
+	sig.sig_r = NULL;
+	sig.sig_s = NULL;
+	ret = asn1_ber_decoder(&sm2signature_decoder, &sig, buffer, req->src_len);
+	if (ret)
+		goto error;
+
+	ret = -ENOMEM;
+	hash = mpi_read_raw_data(buffer + req->src_len, req->dst_len);
+	if (!hash)
+		goto error;
+
+	ret = _sm2_verify(ec, hash, sig.sig_r, sig.sig_s);
+
+	mpi_free(hash);
+error:
+	mpi_free(sig.sig_r);
+	mpi_free(sig.sig_s);
+	kfree(buffer);
+	return ret;
+}
+
+static int sm2_set_pub_key(struct crypto_akcipher *tfm, const void *key,
+						unsigned int keylen)
+{
+	struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm);
+	MPI a;
+	int rc;
+
+	rc = sm2_ec_ctx_reset(ec);
+	if (rc)
+		return rc;
+
+	ec->Q = mpi_point_new(0);
+	if (!ec->Q)
+		return -ENOMEM;
+
+	/* include the uncompressed flag '0x04' */
+	rc = -ENOMEM;
+	a = mpi_read_raw_data(key, keylen);
+	if (!a)
+		goto error;
+
+	mpi_normalize(a);
+	rc = sm2_ecc_os2ec(ec->Q, a);
+	mpi_free(a);
+	if (rc)
+		goto error;
+
+	return 0;
+
+error:
+	mpi_point_release(ec->Q);
+	ec->Q = NULL;
+	return rc;
+}
+
+static unsigned int sm2_max_size(struct crypto_akcipher *tfm)
+{
+	/* Unlimited max size */
+	return PAGE_SIZE;
+}
+
+static void sm2_exit_tfm(struct crypto_akcipher *tfm)
+{
+	struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm);
+
+	mpi_ec_deinit(ec);
+}
+
+static struct akcipher_alg sm2 = {
+	.verify = sm2_verify,
+	.set_pub_key = sm2_set_pub_key,
+	.max_size = sm2_max_size,
+	.exit = sm2_exit_tfm,
+	.base = {
+		.cra_name = "sm2",
+		.cra_driver_name = "sm2-generic",
+		.cra_priority = 100,
+		.cra_module = THIS_MODULE,
+		.cra_ctxsize = sizeof(struct mpi_ec_ctx),
+	},
+};
+
+static int sm2_init(void)
+{
+	return crypto_register_akcipher(&sm2);
+}
+
+static void sm2_exit(void)
+{
+	crypto_unregister_akcipher(&sm2);
+}
+
+subsys_initcall(sm2_init);
+module_exit(sm2_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
+MODULE_DESCRIPTION("SM2 generic algorithm");
+MODULE_ALIAS_CRYPTO("sm2-generic");
diff --git a/crypto/sm2signature.asn1 b/crypto/sm2signature.asn1
new file mode 100644
index 000000000000..ab8c0b754d21
--- /dev/null
+++ b/crypto/sm2signature.asn1
@@ -0,0 +1,4 @@
+Sm2Signature ::= SEQUENCE {
+	sig_r	INTEGER ({ sm2_get_signature_r }),
+	sig_s	INTEGER ({ sm2_get_signature_s })
+}
diff --git a/include/crypto/sm2.h b/include/crypto/sm2.h
new file mode 100644
index 000000000000..af452556dcd4
--- /dev/null
+++ b/include/crypto/sm2.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * sm2.h - SM2 asymmetric public-key algorithm
+ * as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and
+ * described at https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
+ *
+ * Copyright (c) 2020, Alibaba Group.
+ * Written by Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+#ifndef _CRYPTO_SM2_H
+#define _CRYPTO_SM2_H
+
+#include <crypto/sm3.h>
+#include <crypto/akcipher.h>
+
+/* The default user id as specified in GM/T 0009-2012 */
+#define SM2_DEFAULT_USERID "1234567812345678"
+#define SM2_DEFAULT_USERID_LEN 16
+
+extern int sm2_compute_z_digest(struct crypto_akcipher *tfm,
+			const unsigned char *id, size_t id_len,
+			unsigned char dgst[SM3_DIGEST_SIZE]);
+
+#endif /* _CRYPTO_SM2_H */
-- 
2.17.1



* [PATCH v5 7/8] X.509: support OSCCA sm2-with-sm3 certificate verification
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

The digital certificate format based on the SM2 crypto algorithm is
specified in GM/T 0015-2012. It was published by the State Encryption
Management Bureau, China.

The method of generating Other User Information is defined as
ZA=H256(ENTLA || IDA || a || b || xG || yG || xA || yA); it is also
specified in https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02.
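
To make the input layout of ZA more concrete, the following userspace
sketch shows the two encoding steps that the SM2 patch in this series
applies before hashing: ENTLA is the bit length of the ID as a two-byte
big-endian value, and each of a, b, xG, yG, xA, yA is fed as a
fixed-width big-endian integer (left-padded with zeros, or with leading
bytes dropped). The helper names encode_entl and put_fixed are
hypothetical and only mirror what sm2_compute_z_digest() and
sm2_z_digest_update() appear to do; this is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode ENTLA: bit length of the ID, two bytes, big-endian.
 * Returns 0 on success, -1 if the bit length does not fit in 16 bits.
 */
static int encode_entl(size_t id_len, unsigned char entl[2])
{
	unsigned int bits;

	if (id_len > 0xffff / 8)
		return -1;
	bits = (unsigned int)(id_len * 8);
	entl[0] = bits >> 8;
	entl[1] = bits & 0xff;
	return 0;
}

/* Write the big-endian integer in[0..inlen) as exactly `width` bytes:
 * left-pad with zeros when it is shorter, keep only the low-order
 * `width` bytes when it is longer.
 */
static void put_fixed(unsigned char *out, size_t width,
		      const unsigned char *in, size_t inlen)
{
	assert(out != NULL && in != NULL);
	if (inlen < width) {
		memset(out, 0, width - inlen);
		memcpy(out + width - inlen, in, inlen);
	} else {
		memcpy(out, in + inlen - width, width);
	}
}
```

For the 16-byte default user ID "1234567812345678", encode_entl gives
the bytes 0x00 0x80, since 16 * 8 = 128 bits.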

Add sm2-with-sm3 certificate verification to the X.509 code.
Verification needs ZA in addition to the tbs data, and ZA depends
on the elliptic curve parameters and the public key data. The tbs
data is therefore made accessible through sig, so that ZA can be
calculated and the digest over ZA and the tbs data computed before
the signature is verified. The calculation of ZA is specified in
GM/T 0009-2012 and GM/T 0003.2-2012.
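
The digest step described above amounts to hashing ZA followed by the
tbs data in a single hash context. The sketch below illustrates only
that ordering; it assumes a stand-in 64-bit FNV-1a hash in place of
SM3, and the names toy_hash, toy_init, toy_update and cert_digest are
hypothetical, not part of the kernel shash API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in incremental hash (64-bit FNV-1a), NOT SM3. */
struct toy_hash {
	uint64_t state;
};

static void toy_init(struct toy_hash *h)
{
	h->state = 0xcbf29ce484222325ULL;	/* FNV offset basis */
}

static void toy_update(struct toy_hash *h, const unsigned char *p, size_t n)
{
	size_t i;

	assert(h != NULL);
	for (i = 0; i < n; i++) {
		h->state ^= p[i];
		h->state *= 0x100000001b3ULL;	/* FNV prime */
	}
}

/* digest = H(ZA || tbs): feed ZA first, then the tbs data. */
static uint64_t cert_digest(const unsigned char *za, size_t za_len,
			    const unsigned char *tbs, size_t tbs_len)
{
	struct toy_hash h;

	toy_init(&h);
	toy_update(&h, za, za_len);
	toy_update(&h, tbs, tbs_len);
	return h.state;
}
```

Feeding the two buffers one after the other gives the same result as
hashing their concatenation in one call, which is why the tbs data does
not need to be copied next to ZA.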

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 crypto/asymmetric_keys/Makefile          |  1 +
 crypto/asymmetric_keys/public_key.c      |  6 +++
 crypto/asymmetric_keys/public_key_sm2.c  | 61 ++++++++++++++++++++++++
 crypto/asymmetric_keys/x509_public_key.c |  3 ++
 include/crypto/public_key.h              | 15 ++++++
 5 files changed, 86 insertions(+)
 create mode 100644 crypto/asymmetric_keys/public_key_sm2.c

diff --git a/crypto/asymmetric_keys/Makefile b/crypto/asymmetric_keys/Makefile
index 28b91adba2ae..1a99ea5acb6b 100644
--- a/crypto/asymmetric_keys/Makefile
+++ b/crypto/asymmetric_keys/Makefile
@@ -11,6 +11,7 @@ asymmetric_keys-y := \
 	signature.o
 
 obj-$(CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE) += public_key.o
+obj-$(CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE) += public_key_sm2.o
 obj-$(CONFIG_ASYMMETRIC_TPM_KEY_SUBTYPE) += asym_tpm.o
 
 #
diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
index d7f43d4ea925..6b7a6286d5fd 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -298,6 +298,12 @@ int public_key_verify_signature(const struct public_key *pkey,
 	if (ret)
 		goto error_free_key;
 
+	if (strcmp(sig->pkey_algo, "sm2") == 0 && sig->data_size) {
+		ret = cert_sig_digest_update(sig, tfm);
+		if (ret)
+			goto error_free_key;
+	}
+
 	sg_init_table(src_sg, 2);
 	sg_set_buf(&src_sg[0], sig->s, sig->s_size);
 	sg_set_buf(&src_sg[1], sig->digest, sig->digest_size);
diff --git a/crypto/asymmetric_keys/public_key_sm2.c b/crypto/asymmetric_keys/public_key_sm2.c
new file mode 100644
index 000000000000..7325cf21dbb4
--- /dev/null
+++ b/crypto/asymmetric_keys/public_key_sm2.c
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * asymmetric public-key algorithm for SM2-with-SM3 certificate
+ * as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and
+ * described at https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
+ *
+ * Copyright (c) 2020, Alibaba Group.
+ * Authors: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+#include <crypto/sm3_base.h>
+#include <crypto/sm2.h>
+#include <crypto/public_key.h>
+
+#if IS_REACHABLE(CONFIG_CRYPTO_SM2)
+
+int cert_sig_digest_update(const struct public_key_signature *sig,
+				struct crypto_akcipher *tfm_pkey)
+{
+	struct crypto_shash *tfm;
+	struct shash_desc *desc;
+	size_t desc_size;
+	unsigned char dgst[SM3_DIGEST_SIZE];
+	int ret;
+
+	BUG_ON(!sig->data);
+
+	ret = sm2_compute_z_digest(tfm_pkey, SM2_DEFAULT_USERID,
+					SM2_DEFAULT_USERID_LEN, dgst);
+	if (ret)
+		return ret;
+
+	tfm = crypto_alloc_shash(sig->hash_algo, 0, 0);
+	if (IS_ERR(tfm))
+		return PTR_ERR(tfm);
+
+	desc_size = crypto_shash_descsize(tfm) + sizeof(*desc);
+	desc = kzalloc(desc_size, GFP_KERNEL);
+	if (!desc)
+		goto error_free_tfm;
+
+	desc->tfm = tfm;
+
+	ret = crypto_shash_init(desc);
+	if (ret < 0)
+		goto error_free_desc;
+
+	ret = crypto_shash_update(desc, dgst, SM3_DIGEST_SIZE);
+	if (ret < 0)
+		goto error_free_desc;
+
+	ret = crypto_shash_finup(desc, sig->data, sig->data_size, sig->digest);
+
+error_free_desc:
+	kfree(desc);
+error_free_tfm:
+	crypto_free_shash(tfm);
+	return ret;
+}
+
+#endif /* IS_REACHABLE(CONFIG_CRYPTO_SM2) */
diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
index d964cc82b69c..ae450eb8be14 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -30,6 +30,9 @@ int x509_get_sig_params(struct x509_certificate *cert)
 
 	pr_devel("==>%s()\n", __func__);
 
+	sig->data = cert->tbs;
+	sig->data_size = cert->tbs_size;
+
 	if (!cert->pub->pkey_algo)
 		cert->unsupported_key = true;
 
diff --git a/include/crypto/public_key.h b/include/crypto/public_key.h
index 0588ef3bc6ff..f952f0378c23 100644
--- a/include/crypto/public_key.h
+++ b/include/crypto/public_key.h
@@ -12,6 +12,7 @@
 
 #include <linux/keyctl.h>
 #include <linux/oid_registry.h>
+#include <crypto/akcipher.h>
 
 /*
  * Cryptographic data for the public-key subtype of the asymmetric key type.
@@ -44,6 +45,8 @@ struct public_key_signature {
 	const char *pkey_algo;
 	const char *hash_algo;
 	const char *encoding;
+	const void *data;
+	unsigned int data_size;
 };
 
 extern void public_key_signature_free(struct public_key_signature *sig);
@@ -81,4 +84,16 @@ extern int verify_signature(const struct key *,
 int public_key_verify_signature(const struct public_key *pkey,
 				const struct public_key_signature *sig);
 
+#if IS_REACHABLE(CONFIG_CRYPTO_SM2)
+int cert_sig_digest_update(const struct public_key_signature *sig,
+				struct crypto_akcipher *tfm_pkey);
+#else
+static inline
+int cert_sig_digest_update(const struct public_key_signature *sig,
+				struct crypto_akcipher *tfm_pkey)
+{
+	return -ENOTSUPP;
+}
+#endif
+
 #endif /* _LINUX_PUBLIC_KEY_H */
-- 
2.17.1


* [PATCH v5 8/8] integrity: Asymmetric digsig supports SM2-with-SM3 algorithm
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

Add support for the SM2-with-SM3 algorithm combination to asymmetric
digsig so that IMA can also verify SM2 signature data.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 security/integrity/digsig_asymmetric.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/security/integrity/digsig_asymmetric.c b/security/integrity/digsig_asymmetric.c
index 4e0d6778277e..9350fcfb9bf2 100644
--- a/security/integrity/digsig_asymmetric.c
+++ b/security/integrity/digsig_asymmetric.c
@@ -99,14 +99,22 @@ int asymmetric_verify(struct key *keyring, const char *sig,
 	memset(&pks, 0, sizeof(pks));
 
 	pks.hash_algo = hash_algo_name[hdr->hash_algo];
-	if (hdr->hash_algo == HASH_ALGO_STREEBOG_256 ||
-	    hdr->hash_algo == HASH_ALGO_STREEBOG_512) {
+	switch (hdr->hash_algo) {
+	case HASH_ALGO_STREEBOG_256:
+	case HASH_ALGO_STREEBOG_512:
 		/* EC-RDSA and Streebog should go together. */
 		pks.pkey_algo = "ecrdsa";
 		pks.encoding = "raw";
-	} else {
+		break;
+	case HASH_ALGO_SM3_256:
+		/* SM2 and SM3 should go together. */
+		pks.pkey_algo = "sm2";
+		pks.encoding = "raw";
+		break;
+	default:
 		pks.pkey_algo = "rsa";
 		pks.encoding = "pkcs1";
+		break;
 	}
 	pks.digest = (u8 *)data;
 	pks.digest_size = datalen;
-- 
2.17.1


* [PATCH v5 5/8] crypto: testmgr - support test with different ciphertext per encryption
From: Tianjia Zhang @ 2020-07-09  8:40 UTC (permalink / raw)
  To: herbert, davem, dhowells, mcoquelin.stm32, alexandre.torgue,
	jmorris, serge, nramas, tusharsu, zohar, vt, gilad, pvanleeuwen,
	zhang.jia
  Cc: linux-crypto, linux-kernel, keyrings, linux-stm32,
	linux-arm-kernel, linux-security-module, linux-integrity,
	tianjia.zhang
In-Reply-To: <20200709084015.21886-1-tianjia.zhang@linux.alibaba.com>

Some asymmetric algorithms, such as SM2, produce a different ciphertext
on each encryption. Let testmgr support testing such algorithms.

When c and c_size are left empty in struct akcipher_testvec, skip the
comparison of the ciphertext and instead compare the decrypted
plaintext with m.
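
The approach can be illustrated outside the kernel. The following is a
minimal userspace sketch, not the testmgr code: the toy_* names, the
XOR "cipher" and the counter used as a nonce are all hypothetical
stand-ins chosen only so that the same plaintext encrypts differently
on every call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 64

/* Hypothetical test vector: c == NULL means the ciphertext is not reproducible. */
struct toy_testvec {
	const unsigned char *m;	/* plaintext */
	size_t m_size;
	const unsigned char *c;	/* expected ciphertext, or NULL */
	size_t c_size;
};

/* Toy randomized cipher: a fresh nonce byte is prepended and XORed in. */
static size_t toy_encrypt(const unsigned char *m, size_t len, unsigned char *out)
{
	static unsigned char nonce = 0x5a;
	size_t i;

	assert(len < TOY_MAX);
	nonce += 0x3d;		/* stand-in for a random per-call value */
	out[0] = nonce;
	for (i = 0; i < len; i++)
		out[i + 1] = m[i] ^ nonce;
	return len + 1;
}

static size_t toy_decrypt(const unsigned char *c, size_t len, unsigned char *out)
{
	size_t i;

	assert(len >= 1 && len <= TOY_MAX);
	for (i = 1; i < len; i++)
		out[i - 1] = c[i] ^ c[0];
	return len - 1;
}

/* Returns 0 on success, -1 on mismatch. */
static int toy_test_one(const struct toy_testvec *vec)
{
	unsigned char enc[TOY_MAX], dec[TOY_MAX];
	const unsigned char *c = vec->c;
	size_t c_size = vec->c_size;
	size_t enc_len, dec_len;

	enc_len = toy_encrypt(vec->m, vec->m_size, enc);

	if (c) {
		/* Reproducible ciphertext: compare against the vector. */
		if (enc_len != c_size || memcmp(enc, c, enc_len))
			return -1;
	} else {
		/* No reference ciphertext: decrypt what was just produced. */
		c = enc;
		c_size = enc_len;
	}

	dec_len = toy_decrypt(c, c_size, dec);
	if (dec_len != vec->m_size || memcmp(dec, vec->m, dec_len))
		return -1;
	return 0;
}
```

A vector with c set to NULL passes as long as the round trip recovers
m, while a vector that supplies a ciphertext is still checked byte for
byte.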

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 crypto/testmgr.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 6863f911fcee..0dc94461c437 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -4025,7 +4025,7 @@ static int test_akcipher_one(struct crypto_akcipher *tfm,
 		pr_err("alg: akcipher: %s test failed. err %d\n", op, err);
 		goto free_all;
 	}
-	if (!vecs->siggen_sigver_test) {
+	if (!vecs->siggen_sigver_test && c) {
 		if (req->dst_len != c_size) {
 			pr_err("alg: akcipher: %s test failed. Invalid output len\n",
 			       op);
@@ -4056,6 +4056,11 @@ static int test_akcipher_one(struct crypto_akcipher *tfm,
 		goto free_all;
 	}
 
+	if (!vecs->siggen_sigver_test && !c) {
+		c = outbuf_enc;
+		c_size = req->dst_len;
+	}
+
 	op = vecs->siggen_sigver_test ? "sign" : "decrypt";
 	if (WARN_ON(c_size > PAGE_SIZE))
 		goto free_all;
-- 
2.17.1


* [PATCH v3 03/12] ima: Free the entire rule if it fails to parse
From: Tyler Hicks @ 2020-07-09  6:19 UTC (permalink / raw)
  To: Mimi Zohar, Dmitry Kasatkin
  Cc: James Morris, Serge E . Hallyn, Lakshmi Ramasubramanian,
	Prakhar Srivastava, linux-kernel, linux-integrity,
	linux-security-module
In-Reply-To: <20200709061911.954326-1-tyhicks@linux.microsoft.com>

Use ima_free_rule() to fix memory leaks of allocated ima_rule_entry
members, such as .fsname and .keyrings, when an error is encountered
during rule parsing.

Set the args_p pointer to NULL after freeing it in the error path of
ima_lsm_rule_init() so that it isn't freed twice.

This fixes a memory leak seen when loading a rule that contains an
additional piece of allocated memory, such as an fsname, followed by an
invalid conditional:

 # echo "measure fsname=tmpfs bad=cond" > /sys/kernel/security/ima/policy
 -bash: echo: write error: Invalid argument
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
 unreferenced object 0xffff98e7e4ece6c0 (size 8):
   comm "bash", pid 672, jiffies 4294791843 (age 21.855s)
   hex dump (first 8 bytes):
     74 6d 70 66 73 00 6b a5                          tmpfs.k.
   backtrace:
     [<00000000abab7413>] kstrdup+0x2e/0x60
     [<00000000f11ede32>] ima_parse_add_rule+0x7d4/0x1020
     [<00000000f883dd7a>] ima_write_policy+0xab/0x1d0
     [<00000000b17cf753>] vfs_write+0xde/0x1d0
     [<00000000b8ddfdea>] ksys_write+0x68/0xe0
     [<00000000b8e21e87>] do_syscall_64+0x56/0xa0
     [<0000000089ea7b98>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
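
The two halves of the fix (free every owned member on the error path,
and clear a pointer once it has been freed) can be sketched in
userspace C. This is only an illustration under assumptions: struct
toy_rule and the toy_* helpers are hypothetical simplifications, not
the IMA structures or functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a parsed policy rule. */
struct toy_rule {
	char *fsname;
	char *args_p;
};

/* Portable string duplicate; returns NULL on allocation failure. */
static char *toy_strdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Free every owned member, then the rule itself. free(NULL) is a no-op. */
static void toy_free_rule(struct toy_rule *rule)
{
	if (!rule)
		return;
	free(rule->fsname);
	free(rule->args_p);
	free(rule);
}

/* Sub-parser that always rejects its argument, to exercise the error path. */
static int toy_lsm_rule_init(struct toy_rule *rule, const char *arg)
{
	assert(rule && arg);
	rule->args_p = toy_strdup(arg);
	if (!rule->args_p)
		return -1;

	free(rule->args_p);
	rule->args_p = NULL;	/* so toy_free_rule() does not free it again */
	return -1;
}

/* Returns NULL when parsing fails; nothing is leaked on that path. */
static struct toy_rule *toy_parse_rule(const char *fsname, const char *lsm_arg)
{
	struct toy_rule *rule = calloc(1, sizeof(*rule));

	if (!rule)
		return NULL;

	rule->fsname = toy_strdup(fsname);
	if (!rule->fsname || toy_lsm_rule_init(rule, lsm_arg)) {
		toy_free_rule(rule);	/* free(rule) alone would leak fsname */
		return NULL;
	}
	return rule;
}
```

Without the NULL assignment in toy_lsm_rule_init(), switching the
caller from free() to toy_free_rule() would turn the leak into a
double free.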

Fixes: f1b08bbcbdaf ("ima: define a new policy condition based on the filesystem name")
Fixes: 2b60c0ecedf8 ("IMA: Read keyrings= option from the IMA policy")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
---

* v3
  - No change
* v2
  - No change

 security/integrity/ima/ima_policy.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index bf00b966e87f..e458cd47c099 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -913,6 +913,7 @@ static int ima_lsm_rule_init(struct ima_rule_entry *entry,
 
 		if (ima_rules == &ima_default_rules) {
 			kfree(entry->lsm[lsm_rule].args_p);
+			entry->lsm[lsm_rule].args_p = NULL;
 			result = -EINVAL;
 		} else
 			result = 0;
@@ -1404,7 +1405,7 @@ ssize_t ima_parse_add_rule(char *rule)
 
 	result = ima_parse_rule(p, entry);
 	if (result) {
-		kfree(entry);
+		ima_free_rule(entry);
 		integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL,
 				    NULL, op, "invalid-policy", result,
 				    audit_info);
-- 
2.25.1


* [PATCH v3 04/12] ima: Fail rule parsing when buffer hook functions have an invalid action
From: Tyler Hicks @ 2020-07-09  6:19 UTC (permalink / raw)
  To: Mimi Zohar, Dmitry Kasatkin
  Cc: James Morris, Serge E . Hallyn, Lakshmi Ramasubramanian,
	Prakhar Srivastava, linux-kernel, linux-integrity,
	linux-security-module
In-Reply-To: <20200709061911.954326-1-tyhicks@linux.microsoft.com>

Buffer based hook functions, such as KEXEC_CMDLINE and KEY_CHECK, can
only measure. The process_buffer_measurement() function quietly ignores
all actions except measure so make this behavior clear at the time of
policy load.

The parsing of the keyrings conditional had a check to ensure that it
was only specified with measure actions but the check should be on the
hook function and not the keyrings conditional since
"appraise func=KEY_CHECK" is not a valid rule.
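
The action check amounts to a bitmask test keyed on the hook function.
A minimal userspace sketch follows; the TOY_* constants and their
values are assumptions made for the example and are not taken from the
IMA headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative action bits (assumed values, one bit per action). */
enum toy_action {
	TOY_UNKNOWN      = 0x00,
	TOY_MEASURE      = 0x01,
	TOY_DONT_MEASURE = 0x02,
	TOY_APPRAISE     = 0x04,
	TOY_AUDIT        = 0x40
};

/* Illustrative subset of hook functions. */
enum toy_func {
	TOY_NONE,
	TOY_FILE_CHECK,
	TOY_KEXEC_CMDLINE,
	TOY_KEY_CHECK,
	TOY_FUNC_MAX
};

static bool toy_validate_rule(unsigned int action, enum toy_func func)
{
	assert(func <= TOY_FUNC_MAX);

	/* Every rule needs an action. */
	if (action == TOY_UNKNOWN)
		return false;

	switch (func) {
	case TOY_NONE:
	case TOY_FILE_CHECK:
		/* File based hooks accept any action in this sketch. */
		return true;
	case TOY_KEXEC_CMDLINE:
	case TOY_KEY_CHECK:
		/* Buffer based hooks: reject any bit outside the measure pair. */
		return !(action & ~(TOY_MEASURE | TOY_DONT_MEASURE));
	default:
		/* Unknown hook function. */
		return false;
	}
}
```

With this shape, "appraise func=KEY_CHECK" is rejected when the rule is
loaded instead of being silently ignored when the hook runs.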

Fixes: b0935123a183 ("IMA: Define a new hook to measure the kexec boot command line arguments")
Fixes: 5808611cccb2 ("IMA: Add KEY_CHECK func to measure keys")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
---

* v3
  - Add comments to ima_validate_rule() to separate/explain the types of
    validation checks (section for action checks, section for hook
    function checks, soon to be a section for combination of options
    checks, etc.)
  - Removed the "if (entry->flags & IMA_FUNC)" conditional around the
    switch statement in ima_validate_rule() which reduced the overall indentation
    by a tab. This could be removed because entry->func is NONE when the
    IMA_FUNC flag is not set. We'll explicitly enforce and then leverage
    that property in a later patch when we start validating all hook
    functions in ima_validate_rule().
  - Add comment explicitly stating that all hook functions except
    KEXEC_CMDLINE and KEY_CHECK are still being validated in
    ima_parse_rule().
* v2
  - No change

 security/integrity/ima/ima_policy.c | 40 +++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index e458cd47c099..40c28f1a6a5a 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -973,6 +973,43 @@ static void check_template_modsig(const struct ima_template_desc *template)
 #undef MSG
 }
 
+static bool ima_validate_rule(struct ima_rule_entry *entry)
+{
+	/* Ensure that the action is set */
+	if (entry->action == UNKNOWN)
+		return false;
+
+	/*
+	 * Ensure that the hook function is compatible with the other
+	 * components of the rule
+	 */
+	switch (entry->func) {
+	case NONE:
+	case FILE_CHECK:
+	case MMAP_CHECK:
+	case BPRM_CHECK:
+	case CREDS_CHECK:
+	case POST_SETATTR:
+	case MODULE_CHECK:
+	case FIRMWARE_CHECK:
+	case KEXEC_KERNEL_CHECK:
+	case KEXEC_INITRAMFS_CHECK:
+	case POLICY_CHECK:
+		/* Validation of these hook functions is in ima_parse_rule() */
+		break;
+	case KEXEC_CMDLINE:
+	case KEY_CHECK:
+		if (entry->action & ~(MEASURE | DONT_MEASURE))
+			return false;
+
+		break;
+	default:
+		return false;
+	}
+
+	return true;
+}
+
 static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
 {
 	struct audit_buffer *ab;
@@ -1150,7 +1187,6 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
 			keyrings_len = strlen(args[0].from) + 1;
 
 			if ((entry->keyrings) ||
-			    (entry->action != MEASURE) ||
 			    (entry->func != KEY_CHECK) ||
 			    (keyrings_len < 2)) {
 				result = -EINVAL;
@@ -1356,7 +1392,7 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
 			break;
 		}
 	}
-	if (!result && (entry->action == UNKNOWN))
+	if (!result && !ima_validate_rule(entry))
 		result = -EINVAL;
 	else if (entry->action == APPRAISE)
 		temp_ima_appraise |= ima_appraise_flag(entry->func);
-- 
2.25.1


* [PATCH v3 05/12] ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
From: Tyler Hicks @ 2020-07-09  6:19 UTC (permalink / raw)
  To: Mimi Zohar, Dmitry Kasatkin
  Cc: James Morris, Serge E . Hallyn, Lakshmi Ramasubramanian,
	Prakhar Srivastava, linux-kernel, linux-integrity,
	linux-security-module
In-Reply-To: <20200709061911.954326-1-tyhicks@linux.microsoft.com>

The KEXEC_CMDLINE hook function only supports the pcr conditional. Make
this clear at policy load so that IMA policy authors don't assume that
other conditionals are supported.

Since KEXEC_CMDLINE's inception, ima_match_rules() has always returned
true on any loaded KEXEC_CMDLINE rule without any consideration for
other conditionals present in the rule. Make it clear that pcr is the
only supported KEXEC_CMDLINE conditional by returning an error during
policy load.

An example of why this is a problem can be explained with the following
rule:

 dont_measure func=KEXEC_CMDLINE obj_type=foo_t

An IMA policy author would have assumed that rule is valid because the
parser accepted it but the result was that measurements for all
KEXEC_CMDLINE operations would be disabled.
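
The load-time check can be modelled as an allow-list of conditional
flags plus a scan for LSM conditionals. The sketch below is userspace
C under assumptions: the TOY_* flag values, TOY_MAX_LSM_RULES and
struct toy_entry are hypothetical and only mirror the shape of the
check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative conditional flags (assumed values). */
#define TOY_FUNC	0x1u
#define TOY_PCR		0x2u
#define TOY_UID		0x4u

#define TOY_MAX_LSM_RULES 6

/* Hypothetical, simplified rule entry. */
struct toy_entry {
	unsigned int flags;
	const void *lsm_args[TOY_MAX_LSM_RULES];	/* non-NULL if set */
};

/* True if any LSM based conditional (e.g. obj_type=) was parsed. */
static bool toy_contains_lsm_cond(const struct toy_entry *e)
{
	size_t i;

	for (i = 0; i < TOY_MAX_LSM_RULES; i++)
		if (e->lsm_args[i])
			return true;

	return false;
}

/* A KEXEC_CMDLINE rule may carry only the func and pcr conditionals. */
static bool toy_kexec_cmdline_ok(const struct toy_entry *e)
{
	assert(e);

	if (e->flags & ~(TOY_FUNC | TOY_PCR))
		return false;

	return !toy_contains_lsm_cond(e);
}
```

An entry modelling "func=KEXEC_CMDLINE obj_type=foo_t" has an LSM
argument set and is rejected, which matches the example rule above.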

Fixes: b0935123a183 ("IMA: Define a new hook to measure the kexec boot command line arguments")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---

* v3
  - Adjust for the indentation change introduced in patch #4
  - Added Lakshmi's Reviewed-by
* v2
  - Added Mimi's Reviewed-by

 security/integrity/ima/ima_policy.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 40c28f1a6a5a..1c64bd6f1728 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -343,6 +343,17 @@ static int ima_lsm_update_rule(struct ima_rule_entry *entry)
 	return 0;
 }
 
+static bool ima_rule_contains_lsm_cond(struct ima_rule_entry *entry)
+{
+	int i;
+
+	for (i = 0; i < MAX_LSM_RULES; i++)
+		if (entry->lsm[i].args_p)
+			return true;
+
+	return false;
+}
+
 /*
  * The LSM policy can be reloaded, leaving the IMA LSM based rules referring
  * to the old, stale LSM policy.  Update the IMA LSM based rules to reflect
@@ -998,6 +1009,16 @@ static bool ima_validate_rule(struct ima_rule_entry *entry)
 		/* Validation of these hook functions is in ima_parse_rule() */
 		break;
 	case KEXEC_CMDLINE:
+		if (entry->action & ~(MEASURE | DONT_MEASURE))
+			return false;
+
+		if (entry->flags & ~(IMA_FUNC | IMA_PCR))
+			return false;
+
+		if (ima_rule_contains_lsm_cond(entry))
+			return false;
+
+		break;
 	case KEY_CHECK:
 		if (entry->action & ~(MEASURE | DONT_MEASURE))
 			return false;
-- 
2.25.1


* [PATCH v3 06/12] ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
From: Tyler Hicks @ 2020-07-09  6:19 UTC (permalink / raw)
  To: Mimi Zohar, Dmitry Kasatkin
  Cc: James Morris, Serge E . Hallyn, Lakshmi Ramasubramanian,
	Prakhar Srivastava, linux-kernel, linux-integrity,
	linux-security-module
In-Reply-To: <20200709061911.954326-1-tyhicks@linux.microsoft.com>

The KEY_CHECK function only supports the uid, pcr, and keyrings
conditionals. Make this clear at policy load so that IMA policy authors
don't assume that other conditionals are supported.

Fixes: 5808611cccb2 ("IMA: Add KEY_CHECK func to measure keys")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
---

* v3
  - Added Lakshmi's Reviewed-by
  - Adjust for the indentation change introduced in patch #4
* v2
  - No change

 security/integrity/ima/ima_policy.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 1c64bd6f1728..81da02071d41 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -1023,6 +1023,13 @@ static bool ima_validate_rule(struct ima_rule_entry *entry)
 		if (entry->action & ~(MEASURE | DONT_MEASURE))
 			return false;
 
+		if (entry->flags & ~(IMA_FUNC | IMA_UID | IMA_PCR |
+				     IMA_KEYRINGS))
+			return false;
+
+		if (ima_rule_contains_lsm_cond(entry))
+			return false;
+
 		break;
 	default:
 		return false;
-- 
2.25.1

