Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH] drivers: net: ethernet: 3com: fix return value
From: Thomas Preisner @ 2016-12-23 23:00 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: linux-kernel, milan.stephan+linux, thomas.preisner+linux, dave

In a few cases the err-variable is not set to a negative error code if a
function call fails and thus 0 is returned instead.
It may be better to set err to the proper negative error code before
returning.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188841

Reported-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Thomas Preisner <thomas.preisner+linux@fau.de>
Signed-off-by: Milan Stephan <milan.stephan+linux@fau.de>
---
 drivers/net/ethernet/3com/typhoon.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index a0cacbe..9a3ab58 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -2404,6 +2404,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	if(!is_valid_ether_addr(dev->dev_addr)) {
 		err_msg = "Could not obtain valid ethernet address, aborting";
+		err = -EIO;
 		goto error_out_reset;
 	}
 
@@ -2413,6 +2414,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS);
 	if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) {
 		err_msg = "Could not get Sleep Image version";
+		err = -EIO;
 		goto error_out_reset;
 	}
 
@@ -2455,6 +2457,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	if(register_netdev(dev) < 0) {
 		err_msg = "unable to register netdev";
+		err = -EIO;
 		goto error_out_reset;
 	}
 
-- 
2.7.4

^ permalink raw reply related

* Re: George's crazy full state idea (Re: HalfSipHash Acceptable Usage)
From: George Spelvin @ 2016-12-23 23:39 UTC (permalink / raw)
  To: daniel, hannes, linux
  Cc: ak, davem, David.Laight, ebiggers3, eric.dumazet, Jason,
	kernel-hardening, linux-crypto, linux-kernel, luto, netdev, tom,
	tytso, vegard.nossum
In-Reply-To: <1482526135.2554.5.camel@stressinduktion.org>

Hannes Frederic Sowa wrote:
> In general this looks good, but bitbuffer needs to be protected from
> concurrent access, thus needing at least some atomic instruction and
> disabling of interrupts for the locking if done outside of
> get_random_long. Thus I liked your previous approach more where you
> simply embed this in the already necessary get_random_long and aliased
> get_random_long as get_random_bits(BITS_PER_LONG) accordingly, IMHO.

It's meant to be part of the same approach, and I didn't include locking
because that's a requirement for *any* solution, and isn't specific
to the part I was trying to illustrate.

(As for re-using the name "get_random_long", that was just so
I didn't have to explain it.  Call it "get_random_long_internal"
if you like.)

Possible locking implementations include:
1) Use the same locking as applies to get_random_long_internal(), or
2) Make bitbuffer a per-CPU variable (note that we currently have 128
   bits of per-CPU state in get_random_int_hash[]), and this is all a
   fast-path to bypass heavier locking in get_random_long_internal().

>> But, I just realized I've been overlooking something glaringly obvious...
>> there's no reason you can't compute multple blocks in advance.
>
> In the current code on the cost of per cpu allocations thus memory.

Yes, but on 4-core machines it's still not much, and 4096-core
behemoths have RAM to burn.

> In the extract_crng case, couldn't we simply use 8 bytes of the 64 byte
> return block to feed it directly back into the state chacha? So we pass
> on 56 bytes into the pcpu buffer, and consume 8 bytes for the next
> state. This would make the window max shorter than the anti
> backtracking protection right now from 300s to 14 get_random_int calls.
> Not sure if this is worth it.

We just finished discussing why 8 bytes isn't enough.  If you only
feed back 8 bytes, an attacker who can do 2^64 computation can find it
(by guessing and computing forward to verify the guess) and recover the
previous state.  You need to feed back at least as much output as your
security targete.  For /dev/urandom's ChaCha20, that's 256 bits.

>> For example, suppose we gave each CPU a small pool to minimize locking.
>> When one runs out and calls the global refill, it could refill *all*
>> of the CPU pools.  (You don't even need locking; there may be a race to
>> determine *which* random numbers the reader sees, but they're equally
>> good either way.)

> Yes, but still disabled interrupts, otherwise the same state could be
> used twice on the same CPU. Uff, I think we have this problem in
> prandom_u32.

There are some locking gotchas, but it is doable lock-free.

Basically, it's a seqlock.  The writer increments it once (to an odd
number) before starting to overwrite the buffer, and a second time (to
an even number) after.  "Before" and "after" mean smp_wmb().

The reader can use this to figure out how much of the data in the buffer
is safely fresh.  The full sequence of checks is a bit intricate,
but straightforward.

I didn't discuss the locking because I'm confident it's solvable,
not because I wasn't aware it has to be solved.

As for prandom_u32(), what's the problem?  Are you worried that
get_cpu_var disables preemption but not interrupts, and so an
ISR might return the same value as process-level code?

First of all, that's not a problem because prandom_u32() doesn't
have security guarantees.  Occasionally returning a dupicate number
is okay.

Second, if you do care, that could be trivially fixed by throwing
a barrier() in the middle of the code.  (Patch appended; S-o-B
if anyone wants it.)

diff --git a/lib/random32.c b/lib/random32.c
index c750216d..6bee4a36 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -55,16 +55,29 @@ static DEFINE_PER_CPU(struct rnd_state, net_rand_state);
  *
  *	This is used for pseudo-randomness with no outside seeding.
  *	For more random results, use prandom_u32().
+ *
+ *	The barrier() is to allow prandom_u32() to be called from interupt
+ *	context without locking.  An interrupt will run to completion,
+ *	updating all four state variables.  The barrier() ensures that
+ *	the interrupted code will compute a different result.  Either it
+ *	will have written s1 and s2 (so the interrupt will start with
+ *	the updated values), or it will use the values of s3 and s4
+ *	updated by the interrupt.
+ *
+ *	(The same logic applies recursively to nested interrupts, trap
+ *	handlers, and NMIs.)
  */
 u32 prandom_u32_state(struct rnd_state *state)
 {
+	register u32 x;
 #define TAUSWORTHE(s, a, b, c, d) ((s & c) << d) ^ (((s << a) ^ s) >> b)
-	state->s1 = TAUSWORTHE(state->s1,  6, 13, 4294967294U, 18U);
-	state->s2 = TAUSWORTHE(state->s2,  2, 27, 4294967288U,  2U);
-	state->s3 = TAUSWORTHE(state->s3, 13, 21, 4294967280U,  7U);
-	state->s4 = TAUSWORTHE(state->s4,  3, 12, 4294967168U, 13U);
+	x  = state->s1 = TAUSWORTHE(state->s1,  6, 13,   ~1u, 18);
+	x += state->s2 = TAUSWORTHE(state->s2,  2, 27,   ~7u,  2);
+	barrier();
+	x ^= state->s3 = TAUSWORTHE(state->s3, 13, 21,  ~15u,  7);
+	x += state->s4 = TAUSWORTHE(state->s4,  3, 12, ~127u, 13);

-	return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4);
+	return x;
 }
 EXPORT_SYMBOL(prandom_u32_state);

^ permalink raw reply related

* Your Urgent Response Is Highly Anticipated{}
From: Mr. Khan Moosa @ 2016-12-23 23:42 UTC (permalink / raw)


Salam,

Please kindly pardon me for any inconvenience this letter may cost you
because I know it may come to you as a surprise as we have no previous
correspondence I'm Khan Moosa, From Ouagadougou Burkina Faso. I need
your assistance to validate your name in our Bank System to enable the
Bank transfer the sum of $7.5 Million Unclaimed fund into your
nominated bank accounts for onward investment (Hotel industries and
Estate Building management) or any profitable business in your country
and I will give you 40%, for your assistance.

My questions are?

1. Can you handle this project?

2. Can I give you this trust?

If Yes Then Send Me Your Information’s; to commence this transaction,
I require you to immediately indicate your interest by a return mail
for more details.

REPLY ME IMMEDIATELY

Yours faithfully,
Khan Moosa

^ permalink raw reply

* Re: George's crazy full state idea (Re: HalfSipHash Acceptable Usage)
From: Hannes Frederic Sowa @ 2016-12-24  0:12 UTC (permalink / raw)
  To: George Spelvin, daniel
  Cc: ak, davem, David.Laight, ebiggers3, eric.dumazet, Jason,
	kernel-hardening, linux-crypto, linux-kernel, luto, netdev, tom,
	tytso, vegard.nossum
In-Reply-To: <20161223233904.11739.qmail@ns.sciencehorizons.net>

Hi,

On 24.12.2016 00:39, George Spelvin wrote:
> Hannes Frederic Sowa wrote:
>> In general this looks good, but bitbuffer needs to be protected from
>> concurrent access, thus needing at least some atomic instruction and
>> disabling of interrupts for the locking if done outside of
>> get_random_long. Thus I liked your previous approach more where you
>> simply embed this in the already necessary get_random_long and aliased
>> get_random_long as get_random_bits(BITS_PER_LONG) accordingly, IMHO.
> 
> It's meant to be part of the same approach, and I didn't include locking
> because that's a requirement for *any* solution, and isn't specific
> to the part I was trying to illustrate.
> 
> (As for re-using the name "get_random_long", that was just so
> I didn't have to explain it.  Call it "get_random_long_internal"
> if you like.)
> 
> Possible locking implementations include:
> 1) Use the same locking as applies to get_random_long_internal(), or
> 2) Make bitbuffer a per-CPU variable (note that we currently have 128
>    bits of per-CPU state in get_random_int_hash[]), and this is all a
>    fast-path to bypass heavier locking in get_random_long_internal().

I understood that this is definitely a solvable problem.

>>> But, I just realized I've been overlooking something glaringly obvious...
>>> there's no reason you can't compute multple blocks in advance.
>>
>> In the current code on the cost of per cpu allocations thus memory.
> 
> Yes, but on 4-core machines it's still not much, and 4096-core
> behemoths have RAM to burn.
> 
>> In the extract_crng case, couldn't we simply use 8 bytes of the 64 byte
>> return block to feed it directly back into the state chacha? So we pass
>> on 56 bytes into the pcpu buffer, and consume 8 bytes for the next
>> state. This would make the window max shorter than the anti
>> backtracking protection right now from 300s to 14 get_random_int calls.
>> Not sure if this is worth it.
> 
> We just finished discussing why 8 bytes isn't enough.  If you only
> feed back 8 bytes, an attacker who can do 2^64 computation can find it
> (by guessing and computing forward to verify the guess) and recover the
> previous state.  You need to feed back at least as much output as your
> security targete.  For /dev/urandom's ChaCha20, that's 256 bits.

I followed the discussion but it appeared to me that this has the
additional constraint of time until the next reseeding event happenes,
which is 300s (under the assumption that calls to get_random_int happen
regularly, which I expect right now). After that the existing reseeding
mechansim will ensure enough backtracking protection. The number of
bytes can easily be increased here, given that reseeding was shown to be
quite fast already and we produce enough output. But I am not sure if
this is a bit overengineered in the end?

>>> For example, suppose we gave each CPU a small pool to minimize locking.
>>> When one runs out and calls the global refill, it could refill *all*
>>> of the CPU pools.  (You don't even need locking; there may be a race to
>>> determine *which* random numbers the reader sees, but they're equally
>>> good either way.)
> 
>> Yes, but still disabled interrupts, otherwise the same state could be
>> used twice on the same CPU. Uff, I think we have this problem in
>> prandom_u32.
> 
> There are some locking gotchas, but it is doable lock-free.
> 
> Basically, it's a seqlock.  The writer increments it once (to an odd
> number) before starting to overwrite the buffer, and a second time (to
> an even number) after.  "Before" and "after" mean smp_wmb().
> 
> The reader can use this to figure out how much of the data in the buffer
> is safely fresh.  The full sequence of checks is a bit intricate,
> but straightforward.
> 
> I didn't discuss the locking because I'm confident it's solvable,
> not because I wasn't aware it has to be solved.

Also agreed. Given your solution below to prandom_u32, I do think it
might also work without the seqlock now.

> As for prandom_u32(), what's the problem?  Are you worried that
> get_cpu_var disables preemption but not interrupts, and so an
> ISR might return the same value as process-level code?

Yes, exactly those were my thoughts.

> First of all, that's not a problem because prandom_u32() doesn't
> have security guarantees.  Occasionally returning a dupicate number
> is okay.
>
> Second, if you do care, that could be trivially fixed by throwing
> a barrier() in the middle of the code.  (Patch appended; S-o-B
> if anyone wants it.)

I wouldn't have added a disable irqs, but given that I really like your
proposal, I would take it in my todo branch and submit it when net-next
window opens.

> diff --git a/lib/random32.c b/lib/random32.c
> index c750216d..6bee4a36 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> [...]

Thanks,
Hannes

^ permalink raw reply

* RE: [PATCH] scsi: bfa: Increase requested firmware version to 3.2.5.1
From: Mody, Rasesh @ 2016-12-24  0:15 UTC (permalink / raw)
  To: Benjamin Poirier, Martin K. Petersen
  Cc: Tim Ehlers, Rasesh Mody, Anil Gurumurthy, Sudarsana Kalluru,
	James E.J. Bottomley, linux-scsi@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20161223220128.1226-1-bpoirier@suse.com>

> From: Benjamin Poirier [mailto:benjamin.poirier@gmail.com] On Behalf Of
> Benjamin Poirier
> Sent: Friday, December 23, 2016 2:01 PM
> 
> bna & bfa firmware version 3.2.5.1 was submitted to linux-firmware on Tue
> Feb 17 19:10:20 2015 -0500 in 0ab54ff1dc ("linux-firmware: Add QLogic BR
> Series Adapter Firmware").
> 
> bna was updated to use the newer firmware on Thu, 19 Feb 2015 16:02:32
> -0500 in 3f307c3d70 ("bna: Update the Driver and Firmware Version")
> 
> bfa was not updated. I presume this was an oversight but it broke support
> for bfa+bna cards such as the following
> 	04:00.0 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
> 		1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
> 	04:00.1 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
> 		1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
> 	04:00.2 Ethernet controller [0200]: Brocade Communications
> Systems,
> 		Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
> 	04:00.3 Ethernet controller [0200]: Brocade Communications
> Systems,
> 		Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
> 
> Currently, if the bfa module is loaded first, bna fails to probe the respective
> devices with [  215.026787] bna: QLogic BR-series 10G Ethernet driver -
> version: 3.2.25.1 [  215.043707] bna 0000:04:00.2: bar0 mapped to
> ffffc90001fc0000, len 262144 [  215.060656] bna 0000:04:00.2: initialization
> failed err=1 [  215.073893] bna 0000:04:00.3: bar0 mapped to
> ffffc90002040000, len 262144 [  215.090644] bna 0000:04:00.3: initialization
> failed err=1
> 
> Whereas if bna is loaded first, bfa fails with [  249.592109] QLogic BR-series
> BFA FC/FCOE SCSI driver - version: 3.2.25.0 [  249.610738] bfa 0000:04:00.0:
> Running firmware version is incompatible with the driver version [
> 249.833513] bfa 0000:04:00.0: bfa init failed [  249.833919] scsi host6: QLogic
> BR-series FC/FCOE Adapter, hwpath: 0000:04:00.0 driver: 3.2.25.0 [
> 249.841446] bfa 0000:04:00.1: Running firmware version is incompatible with
> the driver version [  250.045449] bfa 0000:04:00.1: bfa init failed [  250.045962]
> scsi host7: QLogic BR-series FC/FCOE Adapter, hwpath: 0000:04:00.1 driver:
> 3.2.25.0
> 
> Increase bfa's requested firmware version.
> I only tested that all of the devices probe without error.
> 
> Cc: Rasesh Mody <rasesh.mody@qlogic.com>
> Reported-by: Tim Ehlers <tehlers@gwdg.de>
> Signed-off-by: Benjamin Poirier <bpoirier@suse.com>

Acked-by: Rasesh Mody <rasesh.mody@cavium.com> 

Thanks! We also need to bump up the BFA driver version to 3.2.25.1.

> ---
>  drivers/scsi/bfa/bfad.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/bfa/bfad.c b/drivers/scsi/bfa/bfad.c index
> 9d253cb..e70410b 100644
> --- a/drivers/scsi/bfa/bfad.c
> +++ b/drivers/scsi/bfa/bfad.c
> @@ -64,9 +64,9 @@ int		max_rport_logins =
> BFA_FCS_MAX_RPORT_LOGINS;
>  u32	bfi_image_cb_size, bfi_image_ct_size, bfi_image_ct2_size;
>  u32	*bfi_image_cb, *bfi_image_ct, *bfi_image_ct2;
> 
> -#define BFAD_FW_FILE_CB		"cbfw-3.2.3.0.bin"
> -#define BFAD_FW_FILE_CT		"ctfw-3.2.3.0.bin"
> -#define BFAD_FW_FILE_CT2	"ct2fw-3.2.3.0.bin"
> +#define BFAD_FW_FILE_CB		"cbfw-3.2.5.1.bin"
> +#define BFAD_FW_FILE_CT		"ctfw-3.2.5.1.bin"
> +#define BFAD_FW_FILE_CT2	"ct2fw-3.2.5.1.bin"
> 
>  static u32 *bfad_load_fwimg(struct pci_dev *pdev);  static void
> bfad_free_fwimg(void);
> --
> 2.10.2

^ permalink raw reply

* Re: [PATCH] drivers: net: ethernet: 3com: fix return value
From: David Dillow @ 2016-12-24  1:05 UTC (permalink / raw)
  To: Thomas Preisner; +Cc: netdev, linux-kernel, linux-kernel, milan.stephan+linux
In-Reply-To: <1482534015-7828-1-git-send-email-thomas.preisner+linux@fau.de>

On Sat, 2016-12-24 at 00:00 +0100, Thomas Preisner wrote:
> diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
> index a0cacbe..9a3ab58 100644
> --- a/drivers/net/ethernet/3com/typhoon.c
> +++ b/drivers/net/ethernet/3com/typhoon.c
> @@ -2404,6 +2404,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  
>  	if(!is_valid_ether_addr(dev->dev_addr)) {
>  		err_msg = "Could not obtain valid ethernet address, aborting";
> +		err = -EIO;
>  		goto error_out_reset;

The change above is fine, but the other two should use the return value
from the failing function call.


> @@ -2413,6 +2414,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS);
>  	if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) {
>  		err_msg = "Could not get Sleep Image version";
> +		err = -EIO;
>  		goto error_out_reset;
>  	}
>  
> @@ -2455,6 +2457,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  
>  	if(register_netdev(dev) < 0) {
>  		err_msg = "unable to register netdev";
> +		err = -EIO;
>  		goto error_out_reset;
>  	}
>  

^ permalink raw reply

* Re: George's crazy full state idea (Re: HalfSipHash Acceptable Usage)
From: George Spelvin @ 2016-12-24  1:17 UTC (permalink / raw)
  To: daniel, hannes, linux
  Cc: ak, davem, David.Laight, ebiggers3, eric.dumazet, Jason,
	kernel-hardening, linux-crypto, linux-kernel, luto, netdev, tom,
	tytso, vegard.nossum
In-Reply-To: <254f8cd7-d045-172d-8692-6052d9da287e@stressinduktion.org>

Hannes Frederic Sowa wrote:
> On 24.12.2016 00:39, George Spelvin wrote:
>> We just finished discussing why 8 bytes isn't enough.  If you only
>> feed back 8 bytes, an attacker who can do 2^64 computation can find it
>> (by guessing and computing forward to verify the guess) and recover the
>> previous state.  You need to feed back at least as much output as your
>> security targete.  For /dev/urandom's ChaCha20, that's 256 bits.

> I followed the discussion but it appeared to me that this has the
> additional constraint of time until the next reseeding event happenes,
> which is 300s (under the assumption that calls to get_random_int happen
> regularly, which I expect right now). After that the existing reseeding
> mechansim will ensure enough backtracking protection. The number of
> bytes can easily be increased here, given that reseeding was shown to be
> quite fast already and we produce enough output. But I am not sure if
> this is a bit overengineered in the end?

I'm not following your description of how the time-based and call-based
mechanisms interact, but for any mix-back, you should either do enough
or none at all.  (Also called "catastrophic reseeding".)

For example, two mix-backs of 64 bits gives you 65 bit security, not 128.
(Because each mixback can be guessed and verified separately.)

> Also agreed. Given your solution below to prandom_u32, I do think it
> might also work without the seqlock now.

It's not technically a seqlock; in particular the reader doesn't
spin.  But the write side, and general logic is so similar it's
a good mental model.

Basically, assume a 64-byte buffer.  The reader has gone through
32 bytes of it, and has 32 left, and when he reads another 8 bytes,
has to distinguish three cases:

1) No update; we read the old bytes and there are now 32 - 24 bytes left.
2) Update completed while we weren't looking.  There are now new
   bytes in the buffer, and we took 8 leaving 64 - 8 = 56.
3) Update in progress at the time of the read.  We don't know if we
   are seeing old bytes or new bytes, so we have to assume the worst
   and not proceeed unless 32 >= 8, but assume at the end there are
   64 - 8 = 56 new bytes left.

> I wouldn't have added a disable irqs, but given that I really like your
> proposal, I would take it in my todo branch and submit it when net-next
> window opens.

If you want that, I have a pile of patches to prandom I really
should push upstream.  Shall I refresh them and send them to you?


commit 4cf1b3d9f4fbccc29ffc2fe4ca4ff52ea77253f1
Author: George Spelvin <linux@horizon.com>
Date:   Mon Aug 31 00:05:00 2015 -0400

    net: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    The net/802 code was already efficient, but prandom_u32_max() is simpler.
    
    In net/batman-adv/bat_iv_ogm.c, batadv_iv_ogm_fwd_send_time() got changed
    from picking a random number of milliseconds and converting to jiffies to
    picking a random number of jiffies, since the number of milliseconds (and
    thus the conversion to jiffies) is a compile-time constant.  The equivalent
    code in batadv_iv_ogm_emit_send_time was not changed, because the number
    of milliseconds is variable.
    
    In net/ipv6/ip6_flowlabel.c, ip6_flowlabel had htonl(prandom_u32()),
    which is silly.  Just cast to __be32 without permuting the bits.
    
    net/sched/sch_netem.c got adjusted to only need one call to prandom_u32
    instead of 2.  (Assuming skb_headlen can't exceed 512 MiB, which is
    hopefully safe for some time yet.)
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 9c8fb80e1fd2be42c35cab1af27187d600fd85e3
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 15:20:47 2014 -0400

    mm/swapfile.c: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 2743eb01e5c5958fd88ae78d19c5fea772d4b117
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 15:19:53 2014 -0400

    lib: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 6a5e91bf395060a3351bfe5efc40ac20ffba2c1b
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 15:18:50 2014 -0400

    fs/xfs: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range".
    
    Also changed the last argument of xfs_error_test() from "unsigned long"
    to "unsigned", since the code never did support values > 2^32, and
    the largest value ever passed is 100.
    
    The code could be improved even further by passing in 2^32/rf rather
    than rf, but I'll leave that to some XFS developers.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 6f6d485d9179ca6ec4e30caa06ade0e0c6931810
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 15:00:17 2014 -0400

    fs/ubifs: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range".
    
    In fs/ubifs/debug.c, the chance() function got rewritten entirely
    to take advantage of the fact its arguments are always constant.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 0b6bf2c874bbbcfa74f6be0413c772b3ac134d12
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 14:52:17 2014 -0400

    fs/extX: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range".
    
    In ext4/ialloc.c:436, there's one more chance to do that, but
    the modulo is required to keep the deterministic option consistent,
    so I left it alone.
    
    Switching to "parent_group = ((u64)grp * ngroups) >> 32;" would be more
    efficient, but would change user-visible behavior.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit e79e0e8b491bc976c0b4e1b2070eccdf55b34f43
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 14:47:15 2014 -0400

    fs/ceph: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit fc628326d8cf4abe364ea01259bc392c0bbaf5a0
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 14:46:29 2014 -0400

    drivers/scsi/fcoe/fcoe_ctlr.c: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 4810d67dd2edf08e7801ef47550d46b7397fe2dc
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 14:45:55 2014 -0400

    drivers/ntb/ntb_hw.c: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit f4a806abbc0785e8f0363e02fac613246eed9e34
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 14:45:27 2014 -0400

    drivers/net: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    In some cases (like ethernet/broadcom/cnic.c and drivers/net/hamradio),
    the range is a compile-time constant power of 2, so the code doesn't
    get any better, but I'm trying to do a full sweep.
    
    In drivers/net/wireless/brcm80211/brcmfmac/p2p.c, a timeout is selected from
    100, 200 or 300 ms.  It would be easy enough to make the granularity finer,
    but I assume the existing code is that way for a reason.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 603e0c311fac09d633ff6c0fd9242108f1c52ead
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 13:55:09 2014 -0400

    drivers/mtd: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range".
    
    And rewrote the 1-in-N code in drivers/mtd/ubi/debug.h to avoid
    even more arithmetic.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit e0657cc865e8e02768906935b8e8bf63af58aa46
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 13:52:13 2014 -0400

    drivers/mmc/core: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 017ee6841ec8d416093fc1f18bdd3df0dfc6aacc
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 13:51:33 2014 -0400

    drivers/lguest/page_tables.c: Use prandom_u32_max()
    
    This doesn't actually change the code because the array size is
    a power of 2 (it's a 4-element array).  But I'm on a roll.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 87556165f46eb42c756bcb94e062c3fbd272a4bf
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 13:49:41 2014 -0400

    drivers/infiniband: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 1eafe1d429f442218810e8c604d4e7c466414cf3
Author: George Spelvin <linux@horizon.com>
Date:   Sun Aug 30 23:42:41 2015 -0400

    block/blk-mq-tag.c: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit f03ee59a63098d244d5b8932fc68c9fc3e2bb222
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 13:46:52 2014 -0400

    arch/x86: Use prandom_u32_max()
    
    It's slightly more efficient than "prandom_u32() % range"
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit feafd3a3fb09924ea633d677a7ab8a25a817f39d
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 13:44:49 2014 -0400

    lib/random32.c: Remove redundant U suffixes on integers
    
    Get rid of a few of the extraneous U suffixes on ordinary integers.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit f14328d248e59c862478633479528c9cb8554d7a
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 12:40:19 2014 -0400

    lib/random32.c: Randomize timeout to the millisecond, not the second.
    
    If you're going to bother randomizing it, do it right.
    And use prandom_u32_max(), which is designed for the job, rather
    than "% 40".
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 143342006adfff718afedf58f639b72337d7c816
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 12:51:26 2014 -0400

    lib/random32.c: Make prandom_u32_max efficient for powers of 2
    
    The multiply-and-shift is efficient in the general case, but slower
    than a simple bitwise AND if the range is a power of 2.  Make the function
    handle this case so callers don't have to worry about micro-optimizing it.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 99864db5cb023e6b09d71cde4997a5f6cafb5cca
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 12:02:17 2014 -0400

    lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it
    
    The functions exist for a reason; the manual byte-at-a-time code
    is unnecessarily slow (and bloated).
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 4ecd45f6eadd4d171782dc6b451ed1040e47d419
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 11:55:59 2014 -0400

    lib/random32.c: Debloat non-speed-critical code
    
    Unrolling code in rarely-used code paths is just silly.  There are
    two places that static calls to prandom_u32_state() can be removed:
    
    1) prandom_warmup() calls prandom_u32_state 10 times.
    2) prandom_state_selftest() can avoid one call and simplify the
       loop logic by repeating an assignment to a local variable
       (which probably adds zero code anyway).
    
    Signed-off-by: George Spelvin <linux@horizon.com>

commit 8765cff45da1d96e4310d50dd49231790c49b612
Author: George Spelvin <linux@horizon.com>
Date:   Sat May 24 11:52:34 2014 -0400

    lib/random32.c: Mark self-test data as __initconst
    
    So it can be thrown away along with the code that uses it.
    
    Signed-off-by: George Spelvin <linux@horizon.com>

^ permalink raw reply

* [RFC PATCH 4.10 0/6] Switch BPF's digest to SHA256
From: Andy Lutomirski @ 2016-12-24  2:22 UTC (permalink / raw)
  To: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List
  Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Andy Lutomirski

Since there are plenty of uses for the new-in-4.10 BPF digest feature
that would be problematic if malicious users could produce collisions,
the BPF digest should be collision-resistant.  SHA-1 is no longer
considered collision-resistant, so switch it to SHA-256.

The actual switchover is trivial.  Most of this series consists of
cleanups to the SHA256 code to make it usable as a standalone library
(since BPF should not depend on crypto).

The cleaned up library is much more user-friendly than the SHA-1 code,
so this also significantly tidies up the BPF digest code.

This is intended for 4.10.  If this series misses 4.10 and nothing
takes its place, then we'll have an unpleasant ABI stability
situation.

Andy Lutomirski (6):
  crypto/sha256: Refactor the API so it can be used without shash
  crypto/sha256: Make the sha256 library functions selectable
  bpf: Use SHA256 instead of SHA1 for bpf digests
  bpf: Avoid copying the entire BPF program when hashing it
  bpf: Rename fdinfo's prog_digest to prog_sha256
  net: Rename TCA*BPF_DIGEST to ..._SHA256

 arch/arm/crypto/sha2-ce-glue.c      | 10 +++---
 arch/arm/crypto/sha256_glue.c       | 23 +++++++++-----
 arch/arm/crypto/sha256_neon_glue.c  | 34 ++++++++++----------
 arch/arm64/crypto/sha2-ce-glue.c    | 13 ++++----
 arch/arm64/crypto/sha256-glue.c     | 59 +++++++++++++++++++---------------
 arch/x86/crypto/sha256_ssse3_glue.c | 46 ++++++++++++++++-----------
 arch/x86/purgatory/purgatory.c      |  2 +-
 arch/x86/purgatory/sha256.c         | 25 ++-------------
 arch/x86/purgatory/sha256.h         | 22 -------------
 crypto/Kconfig                      |  8 +++++
 crypto/Makefile                     |  2 +-
 crypto/sha256_generic.c             | 54 +++++++++++++++++++++++--------
 include/crypto/sha.h                | 33 ++++++++++++++++---
 include/crypto/sha256_base.h        | 40 +++++++----------------
 include/linux/filter.h              | 11 ++-----
 include/uapi/linux/pkt_cls.h        |  2 +-
 include/uapi/linux/tc_act/tc_bpf.h  |  2 +-
 init/Kconfig                        |  1 +
 kernel/bpf/core.c                   | 63 +++++++++----------------------------
 kernel/bpf/syscall.c                |  2 +-
 net/sched/act_bpf.c                 |  2 +-
 net/sched/cls_bpf.c                 |  2 +-
 22 files changed, 225 insertions(+), 231 deletions(-)
 delete mode 100644 arch/x86/purgatory/sha256.h

-- 
2.9.3

^ permalink raw reply

* [RFC PATCH 4.10 1/6] crypto/sha256: Refactor the API so it can be used without shash
From: Andy Lutomirski @ 2016-12-24  2:22 UTC (permalink / raw)
  To: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List
  Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Andy Lutomirski, Ard Biesheuvel, Herbert Xu
In-Reply-To: <cover.1482545792.git.luto@kernel.org>

There are some pieecs of kernel code that want to compute SHA256
directly without going through the crypto core.  Adjust the exported
API to decouple it from the crypto core.

I suspect this will very slightly speed up the SHA256 shash operations
as well by reducing the amount of indirection involved.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/arm/crypto/sha2-ce-glue.c      | 10 ++++---
 arch/arm/crypto/sha256_glue.c       | 23 ++++++++++-----
 arch/arm/crypto/sha256_neon_glue.c  | 34 +++++++++++----------
 arch/arm64/crypto/sha2-ce-glue.c    | 13 ++++----
 arch/arm64/crypto/sha256-glue.c     | 59 +++++++++++++++++++++----------------
 arch/x86/crypto/sha256_ssse3_glue.c | 46 +++++++++++++++++------------
 arch/x86/purgatory/purgatory.c      |  2 +-
 arch/x86/purgatory/sha256.c         | 25 ++--------------
 arch/x86/purgatory/sha256.h         | 22 --------------
 crypto/sha256_generic.c             | 50 +++++++++++++++++++++++--------
 include/crypto/sha.h                | 29 ++++++++++++++----
 include/crypto/sha256_base.h        | 40 ++++++++-----------------
 12 files changed, 184 insertions(+), 169 deletions(-)
 delete mode 100644 arch/x86/purgatory/sha256.h

diff --git a/arch/arm/crypto/sha2-ce-glue.c b/arch/arm/crypto/sha2-ce-glue.c
index 0755b2d657f3..8832c2f85591 100644
--- a/arch/arm/crypto/sha2-ce-glue.c
+++ b/arch/arm/crypto/sha2-ce-glue.c
@@ -38,7 +38,7 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
 		return crypto_sha256_arm_update(desc, data, len);
 
 	kernel_neon_begin();
-	sha256_base_do_update(desc, data, len,
+	sha256_base_do_update(sctx, data, len,
 			      (sha256_block_fn *)sha2_ce_transform);
 	kernel_neon_end();
 
@@ -48,17 +48,19 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
 static int sha2_ce_finup(struct shash_desc *desc, const u8 *data,
 			 unsigned int len, u8 *out)
 {
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
 	if (!may_use_simd())
 		return crypto_sha256_arm_finup(desc, data, len, out);
 
 	kernel_neon_begin();
 	if (len)
-		sha256_base_do_update(desc, data, len,
+		sha256_base_do_update(sctx, data, len,
 				      (sha256_block_fn *)sha2_ce_transform);
-	sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
+	sha256_base_do_finalize(sctx, (sha256_block_fn *)sha2_ce_transform);
 	kernel_neon_end();
 
-	return sha256_base_finish(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 
 static int sha2_ce_final(struct shash_desc *desc, u8 *out)
diff --git a/arch/arm/crypto/sha256_glue.c b/arch/arm/crypto/sha256_glue.c
index a84e869ef900..405a29a9a9d3 100644
--- a/arch/arm/crypto/sha256_glue.c
+++ b/arch/arm/crypto/sha256_glue.c
@@ -36,27 +36,34 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
 int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
 			     unsigned int len)
 {
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
 	/* make sure casting to sha256_block_fn() is safe */
 	BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
 
-	return sha256_base_do_update(desc, data, len,
+	sha256_base_do_update(sctx, data, len,
 				(sha256_block_fn *)sha256_block_data_order);
+	return 0;
 }
 EXPORT_SYMBOL(crypto_sha256_arm_update);
 
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int sha256_arm_final(struct shash_desc *desc, u8 *out)
 {
-	sha256_base_do_finalize(desc,
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
+	sha256_base_do_finalize(sctx,
 				(sha256_block_fn *)sha256_block_data_order);
-	return sha256_base_finish(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 
 int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
 			    unsigned int len, u8 *out)
 {
-	sha256_base_do_update(desc, data, len,
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
+	sha256_base_do_update(sctx, data, len,
 			      (sha256_block_fn *)sha256_block_data_order);
-	return sha256_final(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 EXPORT_SYMBOL(crypto_sha256_arm_finup);
 
@@ -64,7 +71,7 @@ static struct shash_alg algs[] = { {
 	.digestsize	=	SHA256_DIGEST_SIZE,
 	.init		=	sha256_base_init,
 	.update		=	crypto_sha256_arm_update,
-	.final		=	sha256_final,
+	.final		=	sha256_arm_final,
 	.finup		=	crypto_sha256_arm_finup,
 	.descsize	=	sizeof(struct sha256_state),
 	.base		=	{
@@ -79,7 +86,7 @@ static struct shash_alg algs[] = { {
 	.digestsize	=	SHA224_DIGEST_SIZE,
 	.init		=	sha224_base_init,
 	.update		=	crypto_sha256_arm_update,
-	.final		=	sha256_final,
+	.final		=	sha256_arm_final,
 	.finup		=	crypto_sha256_arm_finup,
 	.descsize	=	sizeof(struct sha256_state),
 	.base		=	{
diff --git a/arch/arm/crypto/sha256_neon_glue.c b/arch/arm/crypto/sha256_neon_glue.c
index 39ccd658817e..40c85d1d4c1e 100644
--- a/arch/arm/crypto/sha256_neon_glue.c
+++ b/arch/arm/crypto/sha256_neon_glue.c
@@ -29,8 +29,8 @@
 asmlinkage void sha256_block_data_order_neon(u32 *digest, const void *data,
 					     unsigned int num_blks);
 
-static int sha256_update(struct shash_desc *desc, const u8 *data,
-			 unsigned int len)
+static int sha256_neon_update(struct shash_desc *desc, const u8 *data,
+			      unsigned int len)
 {
 	struct sha256_state *sctx = shash_desc_ctx(desc);
 
@@ -39,41 +39,43 @@ static int sha256_update(struct shash_desc *desc, const u8 *data,
 		return crypto_sha256_arm_update(desc, data, len);
 
 	kernel_neon_begin();
-	sha256_base_do_update(desc, data, len,
+	sha256_base_do_update(sctx, data, len,
 			(sha256_block_fn *)sha256_block_data_order_neon);
 	kernel_neon_end();
 
 	return 0;
 }
 
-static int sha256_finup(struct shash_desc *desc, const u8 *data,
-			unsigned int len, u8 *out)
+static int sha256_neon_finup(struct shash_desc *desc, const u8 *data,
+			     unsigned int len, u8 *out)
 {
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
 	if (!may_use_simd())
 		return crypto_sha256_arm_finup(desc, data, len, out);
 
 	kernel_neon_begin();
 	if (len)
-		sha256_base_do_update(desc, data, len,
+		sha256_base_do_update(sctx, data, len,
 			(sha256_block_fn *)sha256_block_data_order_neon);
-	sha256_base_do_finalize(desc,
+	sha256_base_do_finalize(sctx,
 			(sha256_block_fn *)sha256_block_data_order_neon);
 	kernel_neon_end();
 
-	return sha256_base_finish(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int sha256_neon_final(struct shash_desc *desc, u8 *out)
 {
-	return sha256_finup(desc, NULL, 0, out);
+	return sha256_neon_finup(desc, NULL, 0, out);
 }
 
 struct shash_alg sha256_neon_algs[] = { {
 	.digestsize	=	SHA256_DIGEST_SIZE,
 	.init		=	sha256_base_init,
-	.update		=	sha256_update,
-	.final		=	sha256_final,
-	.finup		=	sha256_finup,
+	.update		=	sha256_neon_update,
+	.final		=	sha256_neon_final,
+	.finup		=	sha256_neon_finup,
 	.descsize	=	sizeof(struct sha256_state),
 	.base		=	{
 		.cra_name	=	"sha256",
@@ -86,9 +88,9 @@ struct shash_alg sha256_neon_algs[] = { {
 }, {
 	.digestsize	=	SHA224_DIGEST_SIZE,
 	.init		=	sha224_base_init,
-	.update		=	sha256_update,
-	.final		=	sha256_final,
-	.finup		=	sha256_finup,
+	.update		=	sha256_neon_update,
+	.final		=	sha256_neon_final,
+	.finup		=	sha256_neon_finup,
 	.descsize	=	sizeof(struct sha256_state),
 	.base		=	{
 		.cra_name	=	"sha224",
diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
index 7cd587564a41..e38dd301abce 100644
--- a/arch/arm64/crypto/sha2-ce-glue.c
+++ b/arch/arm64/crypto/sha2-ce-glue.c
@@ -39,7 +39,7 @@ static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
 
 	sctx->finalize = 0;
 	kernel_neon_begin_partial(28);
-	sha256_base_do_update(desc, data, len,
+	sha256_base_do_update(&sctx->sst, data, len,
 			      (sha256_block_fn *)sha2_ce_transform);
 	kernel_neon_end();
 
@@ -64,13 +64,13 @@ static int sha256_ce_finup(struct shash_desc *desc, const u8 *data,
 	sctx->finalize = finalize;
 
 	kernel_neon_begin_partial(28);
-	sha256_base_do_update(desc, data, len,
+	sha256_base_do_update(&sctx->sst, data, len,
 			      (sha256_block_fn *)sha2_ce_transform);
 	if (!finalize)
-		sha256_base_do_finalize(desc,
+		sha256_base_do_finalize(&sctx->sst,
 					(sha256_block_fn *)sha2_ce_transform);
 	kernel_neon_end();
-	return sha256_base_finish(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 
 static int sha256_ce_final(struct shash_desc *desc, u8 *out)
@@ -79,9 +79,10 @@ static int sha256_ce_final(struct shash_desc *desc, u8 *out)
 
 	sctx->finalize = 0;
 	kernel_neon_begin_partial(28);
-	sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
+	sha256_base_do_finalize(&sctx->sst,
+				(sha256_block_fn *)sha2_ce_transform);
 	kernel_neon_end();
-	return sha256_base_finish(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 
 static struct shash_alg algs[] = { {
diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
index a2226f841960..132a1ef89a71 100644
--- a/arch/arm64/crypto/sha256-glue.c
+++ b/arch/arm64/crypto/sha256-glue.c
@@ -33,36 +33,39 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
 asmlinkage void sha256_block_neon(u32 *digest, const void *data,
 				  unsigned int num_blks);
 
-static int sha256_update(struct shash_desc *desc, const u8 *data,
-			 unsigned int len)
+static int sha256_update_arm64(struct shash_desc *desc, const u8 *data,
+			       unsigned int len)
 {
-	return sha256_base_do_update(desc, data, len,
-				(sha256_block_fn *)sha256_block_data_order);
+	sha256_base_do_update(shash_desc_ctx(desc), data, len,
+			      (sha256_block_fn *)sha256_block_data_order);
+	return 0;
 }
 
-static int sha256_finup(struct shash_desc *desc, const u8 *data,
-			unsigned int len, u8 *out)
+static int sha256_finup_arm64(struct shash_desc *desc, const u8 *data,
+			      unsigned int len, u8 *out)
 {
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
 	if (len)
-		sha256_base_do_update(desc, data, len,
+		sha256_base_do_update(sctx, data, len,
 				(sha256_block_fn *)sha256_block_data_order);
-	sha256_base_do_finalize(desc,
+	sha256_base_do_finalize(sctx,
 				(sha256_block_fn *)sha256_block_data_order);
 
-	return sha256_base_finish(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int sha256_final_arm64(struct shash_desc *desc, u8 *out)
 {
-	return sha256_finup(desc, NULL, 0, out);
+	return sha256_finup_arm64(desc, NULL, 0, out);
 }
 
 static struct shash_alg algs[] = { {
 	.digestsize		= SHA256_DIGEST_SIZE,
 	.init			= sha256_base_init,
-	.update			= sha256_update,
-	.final			= sha256_final,
-	.finup			= sha256_finup,
+	.update			= sha256_update_arm64,
+	.final			= sha256_final_arm64,
+	.finup			= sha256_finup_arm64,
 	.descsize		= sizeof(struct sha256_state),
 	.base.cra_name		= "sha256",
 	.base.cra_driver_name	= "sha256-arm64",
@@ -73,9 +76,9 @@ static struct shash_alg algs[] = { {
 }, {
 	.digestsize		= SHA224_DIGEST_SIZE,
 	.init			= sha224_base_init,
-	.update			= sha256_update,
-	.final			= sha256_final,
-	.finup			= sha256_finup,
+	.update			= sha256_update_arm64,
+	.final			= sha256_final_arm64,
+	.finup			= sha256_finup_arm64,
 	.descsize		= sizeof(struct sha256_state),
 	.base.cra_name		= "sha224",
 	.base.cra_driver_name	= "sha224-arm64",
@@ -88,18 +91,22 @@ static struct shash_alg algs[] = { {
 static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
 			      unsigned int len)
 {
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
 	/*
 	 * Stacking and unstacking a substantial slice of the NEON register
 	 * file may significantly affect performance for small updates when
 	 * executing in interrupt context, so fall back to the scalar code
 	 * in that case.
 	 */
-	if (!may_use_simd())
-		return sha256_base_do_update(desc, data, len,
+	if (!may_use_simd()) {
+		sha256_base_do_update(sctx, data, len,
 				(sha256_block_fn *)sha256_block_data_order);
+		return 0;
+	}
 
 	kernel_neon_begin();
-	sha256_base_do_update(desc, data, len,
+	sha256_base_do_update(sctx, data, len,
 				(sha256_block_fn *)sha256_block_neon);
 	kernel_neon_end();
 
@@ -109,22 +116,24 @@ static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
 static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
 			     unsigned int len, u8 *out)
 {
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
 	if (!may_use_simd()) {
 		if (len)
-			sha256_base_do_update(desc, data, len,
+			sha256_base_do_update(sctx, data, len,
 				(sha256_block_fn *)sha256_block_data_order);
-		sha256_base_do_finalize(desc,
+		sha256_base_do_finalize(sctx,
 				(sha256_block_fn *)sha256_block_data_order);
 	} else {
 		kernel_neon_begin();
 		if (len)
-			sha256_base_do_update(desc, data, len,
+			sha256_base_do_update(sctx, data, len,
 				(sha256_block_fn *)sha256_block_neon);
-		sha256_base_do_finalize(desc,
+		sha256_base_do_finalize(sctx,
 				(sha256_block_fn *)sha256_block_neon);
 		kernel_neon_end();
 	}
-	return sha256_base_finish(desc, out);
+	return crypto_sha2_final(desc, out);
 }
 
 static int sha256_final_neon(struct shash_desc *desc, u8 *out)
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 9e79baf03a4b..e722fbaf0558 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -44,52 +44,60 @@ asmlinkage void sha256_transform_ssse3(u32 *digest, const char *data,
 				       u64 rounds);
 typedef void (sha256_transform_fn)(u32 *digest, const char *data, u64 rounds);
 
-static int sha256_update(struct shash_desc *desc, const u8 *data,
-			 unsigned int len, sha256_transform_fn *sha256_xform)
+static int sha256_fpu_update(struct shash_desc *desc, const u8 *data,
+			       unsigned int len,
+			       sha256_transform_fn *sha256_xform)
 {
 	struct sha256_state *sctx = shash_desc_ctx(desc);
 
 	if (!irq_fpu_usable() ||
-	    (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
-		return crypto_sha256_update(desc, data, len);
+	    (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE) {
+		sha256_update(sctx, data, len);
+		return 0;
+	}
 
 	/* make sure casting to sha256_block_fn() is safe */
 	BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
 
 	kernel_fpu_begin();
-	sha256_base_do_update(desc, data, len,
+	sha256_base_do_update(sctx, data, len,
 			      (sha256_block_fn *)sha256_xform);
 	kernel_fpu_end();
 
 	return 0;
 }
 
-static int sha256_finup(struct shash_desc *desc, const u8 *data,
+static int sha256_fpu_finup(struct shash_desc *desc, const u8 *data,
 	      unsigned int len, u8 *out, sha256_transform_fn *sha256_xform)
 {
-	if (!irq_fpu_usable())
-		return crypto_sha256_finup(desc, data, len, out);
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
+	if (!irq_fpu_usable()) {
+		sha256_finup(sctx, data, len, out);
+		return 0;
+	}
 
 	kernel_fpu_begin();
 	if (len)
-		sha256_base_do_update(desc, data, len,
+		sha256_base_do_update(sctx, data, len,
 				      (sha256_block_fn *)sha256_xform);
-	sha256_base_do_finalize(desc, (sha256_block_fn *)sha256_xform);
+	sha256_base_do_finalize(sctx, (sha256_block_fn *)sha256_xform);
 	kernel_fpu_end();
 
-	return sha256_base_finish(desc, out);
+	crypto_sha2_final(desc, out);
+	return 0;
 }
 
 static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data,
 			 unsigned int len)
 {
-	return sha256_update(desc, data, len, sha256_transform_ssse3);
+	return sha256_fpu_update(desc, data, len, sha256_transform_ssse3);
 }
 
 static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data,
 	      unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_transform_ssse3);
+	return sha256_fpu_finup(desc, data, len, out, sha256_transform_ssse3);
 }
 
 /* Add padding and return the message digest. */
@@ -152,13 +160,13 @@ asmlinkage void sha256_transform_avx(u32 *digest, const char *data,
 static int sha256_avx_update(struct shash_desc *desc, const u8 *data,
 			 unsigned int len)
 {
-	return sha256_update(desc, data, len, sha256_transform_avx);
+	return sha256_fpu_update(desc, data, len, sha256_transform_avx);
 }
 
 static int sha256_avx_finup(struct shash_desc *desc, const u8 *data,
 		      unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_transform_avx);
+	return sha256_fpu_finup(desc, data, len, out, sha256_transform_avx);
 }
 
 static int sha256_avx_final(struct shash_desc *desc, u8 *out)
@@ -236,13 +244,13 @@ asmlinkage void sha256_transform_rorx(u32 *digest, const char *data,
 static int sha256_avx2_update(struct shash_desc *desc, const u8 *data,
 			 unsigned int len)
 {
-	return sha256_update(desc, data, len, sha256_transform_rorx);
+	return sha256_fpu_update(desc, data, len, sha256_transform_rorx);
 }
 
 static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data,
 		      unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_transform_rorx);
+	return sha256_fpu_finup(desc, data, len, out, sha256_transform_rorx);
 }
 
 static int sha256_avx2_final(struct shash_desc *desc, u8 *out)
@@ -318,13 +326,13 @@ asmlinkage void sha256_ni_transform(u32 *digest, const char *data,
 static int sha256_ni_update(struct shash_desc *desc, const u8 *data,
 			 unsigned int len)
 {
-	return sha256_update(desc, data, len, sha256_ni_transform);
+	return sha256_fpu_update(desc, data, len, sha256_ni_transform);
 }
 
 static int sha256_ni_finup(struct shash_desc *desc, const u8 *data,
 		      unsigned int len, u8 *out)
 {
-	return sha256_finup(desc, data, len, out, sha256_ni_transform);
+	return sha256_fpu_finup(desc, data, len, out, sha256_ni_transform);
 }
 
 static int sha256_ni_final(struct shash_desc *desc, u8 *out)
diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
index 25e068ba3382..ed6e80b844cf 100644
--- a/arch/x86/purgatory/purgatory.c
+++ b/arch/x86/purgatory/purgatory.c
@@ -10,7 +10,7 @@
  * Version 2.  See the file COPYING for more details.
  */
 
-#include "sha256.h"
+#include <crypto/sha.h>
 #include "../boot/string.h"
 
 struct sha_region {
diff --git a/arch/x86/purgatory/sha256.c b/arch/x86/purgatory/sha256.c
index 548ca675a14a..724925d5da61 100644
--- a/arch/x86/purgatory/sha256.c
+++ b/arch/x86/purgatory/sha256.c
@@ -17,7 +17,7 @@
 
 #include <linux/bitops.h>
 #include <asm/byteorder.h>
-#include "sha256.h"
+#include <crypto/sha.h>
 #include "../boot/string.h"
 
 static inline u32 Ch(u32 x, u32 y, u32 z)
@@ -208,22 +208,7 @@ static void sha256_transform(u32 *state, const u8 *input)
 	memset(W, 0, 64 * sizeof(u32));
 }
 
-int sha256_init(struct sha256_state *sctx)
-{
-	sctx->state[0] = SHA256_H0;
-	sctx->state[1] = SHA256_H1;
-	sctx->state[2] = SHA256_H2;
-	sctx->state[3] = SHA256_H3;
-	sctx->state[4] = SHA256_H4;
-	sctx->state[5] = SHA256_H5;
-	sctx->state[6] = SHA256_H6;
-	sctx->state[7] = SHA256_H7;
-	sctx->count = 0;
-
-	return 0;
-}
-
-int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
+void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
 {
 	unsigned int partial, done;
 	const u8 *src;
@@ -249,11 +234,9 @@ int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
 		partial = 0;
 	}
 	memcpy(sctx->buf + partial, src, len - done);
-
-	return 0;
 }
 
-int sha256_final(struct sha256_state *sctx, u8 *out)
+void sha256_final(struct sha256_state *sctx, u8 *out)
 {
 	__be32 *dst = (__be32 *)out;
 	__be64 bits;
@@ -278,6 +261,4 @@ int sha256_final(struct sha256_state *sctx, u8 *out)
 
 	/* Zeroize sensitive information. */
 	memset(sctx, 0, sizeof(*sctx));
-
-	return 0;
 }
diff --git a/arch/x86/purgatory/sha256.h b/arch/x86/purgatory/sha256.h
deleted file mode 100644
index bd15a4127735..000000000000
--- a/arch/x86/purgatory/sha256.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/*
- *  Copyright (C) 2014 Red Hat Inc.
- *
- *  Author: Vivek Goyal <vgoyal@redhat.com>
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2.  See the file COPYING for more details.
- */
-
-#ifndef SHA256_H
-#define SHA256_H
-
-
-#include <linux/types.h>
-#include <crypto/sha.h>
-
-extern int sha256_init(struct sha256_state *sctx);
-extern int sha256_update(struct sha256_state *sctx, const u8 *input,
-				unsigned int length);
-extern int sha256_final(struct sha256_state *sctx, u8 *hash);
-
-#endif /* SHA256_H */
diff --git a/crypto/sha256_generic.c b/crypto/sha256_generic.c
index 8f9c47e1a96e..f2747893402c 100644
--- a/crypto/sha256_generic.c
+++ b/crypto/sha256_generic.c
@@ -231,6 +231,13 @@ static void sha256_transform(u32 *state, const u8 *input)
 	memzero_explicit(W, 64 * sizeof(u32));
 }
 
+int sha256_base_init(struct shash_desc *desc)
+{
+	sha256_init(shash_desc_ctx(desc));
+	return 0;
+}
+EXPORT_SYMBOL(sha256_base_init);
+
 static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
 				    int blocks)
 {
@@ -240,32 +247,49 @@ static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
 	}
 }
 
-int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
+void sha256_update(struct sha256_state *sctx, const u8 *data,
 			  unsigned int len)
 {
-	return sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
+	sha256_base_do_update(sctx, data, len, sha256_generic_block_fn);
+}
+EXPORT_SYMBOL(sha256_update);
+
+void sha256_final(struct sha256_state *sctx, u8 *out)
+{
+	sha256_base_do_finalize(sctx, sha256_generic_block_fn);
+	sha256_base_finish(sctx, out);
 }
-EXPORT_SYMBOL(crypto_sha256_update);
+EXPORT_SYMBOL(sha256_final);
 
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
+				unsigned int len)
 {
-	sha256_base_do_finalize(desc, sha256_generic_block_fn);
-	return sha256_base_finish(desc, out);
+	sha256_update(shash_desc_ctx(desc), data, len);
+	return 0;
+}
+
+int crypto_sha2_final(struct shash_desc *desc, u8 *out)
+{
+	struct sha256_state *sctx = shash_desc_ctx(desc);
+
+	sha256_base_do_finalize(sctx, sha256_generic_block_fn);
+	sha2_base_finish(sctx, crypto_shash_digestsize(desc->tfm), out);
+	return 0;
 }
+EXPORT_SYMBOL(crypto_sha2_final);
 
-int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
-			unsigned int len, u8 *hash)
+static int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
+			       unsigned int len, u8 *hash)
 {
-	sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
-	return sha256_final(desc, hash);
+	sha256_finup(shash_desc_ctx(desc), data, len, hash);
+	return 0;
 }
-EXPORT_SYMBOL(crypto_sha256_finup);
 
 static struct shash_alg sha256_algs[2] = { {
 	.digestsize	=	SHA256_DIGEST_SIZE,
 	.init		=	sha256_base_init,
 	.update		=	crypto_sha256_update,
-	.final		=	sha256_final,
+	.final		=	crypto_sha2_final,
 	.finup		=	crypto_sha256_finup,
 	.descsize	=	sizeof(struct sha256_state),
 	.base		=	{
@@ -279,7 +303,7 @@ static struct shash_alg sha256_algs[2] = { {
 	.digestsize	=	SHA224_DIGEST_SIZE,
 	.init		=	sha224_base_init,
 	.update		=	crypto_sha256_update,
-	.final		=	sha256_final,
+	.final		=	crypto_sha2_final,
 	.finup		=	crypto_sha256_finup,
 	.descsize	=	sizeof(struct sha256_state),
 	.base		=	{
diff --git a/include/crypto/sha.h b/include/crypto/sha.h
index c94d3eb1cefd..2b6978471605 100644
--- a/include/crypto/sha.h
+++ b/include/crypto/sha.h
@@ -96,11 +96,30 @@ extern int crypto_sha1_update(struct shash_desc *desc, const u8 *data,
 extern int crypto_sha1_finup(struct shash_desc *desc, const u8 *data,
 			     unsigned int len, u8 *hash);
 
-extern int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
-			      unsigned int len);
-
-extern int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
-			       unsigned int len, u8 *hash);
+static inline void sha256_init(struct sha256_state *sctx)
+{
+	sctx->state[0] = SHA256_H0;
+	sctx->state[1] = SHA256_H1;
+	sctx->state[2] = SHA256_H2;
+	sctx->state[3] = SHA256_H3;
+	sctx->state[4] = SHA256_H4;
+	sctx->state[5] = SHA256_H5;
+	sctx->state[6] = SHA256_H6;
+	sctx->state[7] = SHA256_H7;
+	sctx->count = 0;
+}
+
+extern void sha256_update(struct sha256_state *sctx, const u8 *data,
+			  unsigned int len);
+
+extern void sha256_final(struct sha256_state *sctx, u8 *out);
+
+static inline void sha256_finup(struct sha256_state *sctx, const u8 *data,
+				unsigned int len, u8 *hash)
+{
+	sha256_update(sctx, data, len);
+	sha256_final(sctx, hash);
+}
 
 extern int crypto_sha512_update(struct shash_desc *desc, const u8 *data,
 			      unsigned int len);
diff --git a/include/crypto/sha256_base.h b/include/crypto/sha256_base.h
index d1f2195bb7de..f65d9a516b36 100644
--- a/include/crypto/sha256_base.h
+++ b/include/crypto/sha256_base.h
@@ -35,29 +35,13 @@ static inline int sha224_base_init(struct shash_desc *desc)
 	return 0;
 }
 
-static inline int sha256_base_init(struct shash_desc *desc)
-{
-	struct sha256_state *sctx = shash_desc_ctx(desc);
-
-	sctx->state[0] = SHA256_H0;
-	sctx->state[1] = SHA256_H1;
-	sctx->state[2] = SHA256_H2;
-	sctx->state[3] = SHA256_H3;
-	sctx->state[4] = SHA256_H4;
-	sctx->state[5] = SHA256_H5;
-	sctx->state[6] = SHA256_H6;
-	sctx->state[7] = SHA256_H7;
-	sctx->count = 0;
-
-	return 0;
-}
+extern int sha256_base_init(struct shash_desc *desc);
 
-static inline int sha256_base_do_update(struct shash_desc *desc,
+static inline void sha256_base_do_update(struct sha256_state *sctx,
 					const u8 *data,
 					unsigned int len,
 					sha256_block_fn *block_fn)
 {
-	struct sha256_state *sctx = shash_desc_ctx(desc);
 	unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
 
 	sctx->count += len;
@@ -86,15 +70,12 @@ static inline int sha256_base_do_update(struct shash_desc *desc,
 	}
 	if (len)
 		memcpy(sctx->buf + partial, data, len);
-
-	return 0;
 }
 
-static inline int sha256_base_do_finalize(struct shash_desc *desc,
+static inline void sha256_base_do_finalize(struct sha256_state *sctx,
 					  sha256_block_fn *block_fn)
 {
 	const int bit_offset = SHA256_BLOCK_SIZE - sizeof(__be64);
-	struct sha256_state *sctx = shash_desc_ctx(desc);
 	__be64 *bits = (__be64 *)(sctx->buf + bit_offset);
 	unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
 
@@ -109,14 +90,11 @@ static inline int sha256_base_do_finalize(struct shash_desc *desc,
 	memset(sctx->buf + partial, 0x0, bit_offset - partial);
 	*bits = cpu_to_be64(sctx->count << 3);
 	block_fn(sctx, sctx->buf, 1);
-
-	return 0;
 }
 
-static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
+static inline void sha2_base_finish(struct sha256_state *sctx,
+				    unsigned int digest_size, u8 *out)
 {
-	unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
-	struct sha256_state *sctx = shash_desc_ctx(desc);
 	__be32 *digest = (__be32 *)out;
 	int i;
 
@@ -124,5 +102,11 @@ static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
 		put_unaligned_be32(sctx->state[i], digest++);
 
 	*sctx = (struct sha256_state){};
-	return 0;
 }
+
+static inline void sha256_base_finish(struct sha256_state *sctx, u8 *out)
+{
+	sha2_base_finish(sctx, SHA256_DIGEST_SIZE, out);
+}
+
+extern int crypto_sha2_final(struct shash_desc *desc, u8 *out);
-- 
2.9.3

^ permalink raw reply related

* [RFC PATCH 4.10 2/6] crypto/sha256: Make the sha256 library functions selectable
From: Andy Lutomirski @ 2016-12-24  2:22 UTC (permalink / raw)
  To: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List
  Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Andy Lutomirski
In-Reply-To: <cover.1482545792.git.luto@kernel.org>

This will let other kernel code call into sha256_init(), etc. without
pulling in the core crypto code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 crypto/Kconfig          | 8 ++++++++
 crypto/Makefile         | 2 +-
 crypto/sha256_generic.c | 4 ++++
 include/crypto/sha.h    | 4 ++++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 160f08e721cc..85a2b3440c2b 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -10,6 +10,13 @@ config XOR_BLOCKS
 source "crypto/async_tx/Kconfig"
 
 #
+# Cryptographic algorithms that are usable without the Crypto API.
+# None of these should have visible config options.
+#
+config CRYPTO_SHA256_LIB
+	bool
+
+#
 # Cryptographic API Configuration
 #
 menuconfig CRYPTO
@@ -763,6 +770,7 @@ config CRYPTO_SHA512_MB
 
 config CRYPTO_SHA256
 	tristate "SHA224 and SHA256 digest algorithm"
+	select CRYPTO_SHA256_LIB
 	select CRYPTO_HASH
 	help
 	  SHA256 secure hash standard (DFIPS 180-2).
diff --git a/crypto/Makefile b/crypto/Makefile
index b8f0e3eb0791..d147d4c911f5 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -71,7 +71,7 @@ obj-$(CONFIG_CRYPTO_RMD160) += rmd160.o
 obj-$(CONFIG_CRYPTO_RMD256) += rmd256.o
 obj-$(CONFIG_CRYPTO_RMD320) += rmd320.o
 obj-$(CONFIG_CRYPTO_SHA1) += sha1_generic.o
-obj-$(CONFIG_CRYPTO_SHA256) += sha256_generic.o
+obj-$(CONFIG_CRYPTO_SHA256_LIB) += sha256_generic.o
 obj-$(CONFIG_CRYPTO_SHA512) += sha512_generic.o
 obj-$(CONFIG_CRYPTO_SHA3) += sha3_generic.o
 obj-$(CONFIG_CRYPTO_WP512) += wp512.o
diff --git a/crypto/sha256_generic.c b/crypto/sha256_generic.c
index f2747893402c..9df71ac66dc4 100644
--- a/crypto/sha256_generic.c
+++ b/crypto/sha256_generic.c
@@ -261,6 +261,8 @@ void sha256_final(struct sha256_state *sctx, u8 *out)
 }
 EXPORT_SYMBOL(sha256_final);
 
+#ifdef CONFIG_CRYPTO_HASH
+
 static int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
 				unsigned int len)
 {
@@ -328,6 +330,8 @@ static void __exit sha256_generic_mod_fini(void)
 module_init(sha256_generic_mod_init);
 module_exit(sha256_generic_mod_fini);
 
+#endif /* CONFIG_CRYPTO_HASH */
+
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("SHA-224 and SHA-256 Secure Hash Algorithm");
 
diff --git a/include/crypto/sha.h b/include/crypto/sha.h
index 2b6978471605..381ba7fa5e3f 100644
--- a/include/crypto/sha.h
+++ b/include/crypto/sha.h
@@ -96,6 +96,8 @@ extern int crypto_sha1_update(struct shash_desc *desc, const u8 *data,
 extern int crypto_sha1_finup(struct shash_desc *desc, const u8 *data,
 			     unsigned int len, u8 *hash);
 
+#ifdef CONFIG_CRYPTO_SHA256_LIB
+
 static inline void sha256_init(struct sha256_state *sctx)
 {
 	sctx->state[0] = SHA256_H0;
@@ -121,6 +123,8 @@ static inline void sha256_finup(struct sha256_state *sctx, const u8 *data,
 	sha256_final(sctx, hash);
 }
 
+#endif /* CONFIG_CRYPTO_SHA256_LIB */
+
 extern int crypto_sha512_update(struct shash_desc *desc, const u8 *data,
 			      unsigned int len);
 
-- 
2.9.3

^ permalink raw reply related

* [RFC PATCH 4.10 3/6] bpf: Use SHA256 instead of SHA1 for bpf digests
From: Andy Lutomirski @ 2016-12-24  2:22 UTC (permalink / raw)
  To: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List
  Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Andy Lutomirski, Alexei Starovoitov
In-Reply-To: <cover.1482545792.git.luto@kernel.org>

BPF digests are intended to be used to avoid reloading programs that
are already loaded.  For use cases (CRIU?) where untrusted programs
are involved, intentional hash collisions could cause the wrong BPF
program to execute.  Additionally, if BPF digests are ever used
in-kernel to skip verification, a hash collision could give privilege
escalation directly.

SHA1 is no longer considered adequately collision-resistant (see, for
example, all the major browsers dropping support for SHA1
certificates).  Use SHA256 instead.

I moved the digest field to keep all of the bpf program metadata in
the same cache line.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 include/linux/filter.h | 11 +++--------
 init/Kconfig           |  1 +
 kernel/bpf/core.c      | 41 +++++++----------------------------------
 3 files changed, 11 insertions(+), 42 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 702314253797..23df2574e30c 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -14,7 +14,8 @@
 #include <linux/workqueue.h>
 #include <linux/sched.h>
 #include <linux/capability.h>
-#include <linux/cryptohash.h>
+
+#include <crypto/sha.h>
 
 #include <net/sch_generic.h>
 
@@ -408,11 +409,11 @@ struct bpf_prog {
 	kmemcheck_bitfield_end(meta);
 	enum bpf_prog_type	type;		/* Type of BPF program */
 	u32			len;		/* Number of filter blocks */
-	u32			digest[SHA_DIGEST_WORDS]; /* Program digest */
 	struct bpf_prog_aux	*aux;		/* Auxiliary fields */
 	struct sock_fprog_kern	*orig_prog;	/* Original BPF program */
 	unsigned int		(*bpf_func)(const void *ctx,
 					    const struct bpf_insn *insn);
+	u8			digest[SHA256_DIGEST_SIZE]; /* Program digest */
 	/* Instructions for interpreter */
 	union {
 		struct sock_filter	insns[0];
@@ -519,12 +520,6 @@ static inline u32 bpf_prog_insn_size(const struct bpf_prog *prog)
 	return prog->len * sizeof(struct bpf_insn);
 }
 
-static inline u32 bpf_prog_digest_scratch_size(const struct bpf_prog *prog)
-{
-	return round_up(bpf_prog_insn_size(prog) +
-			sizeof(__be64) + 1, SHA_MESSAGE_BYTES);
-}
-
 static inline unsigned int bpf_prog_size(unsigned int proglen)
 {
 	return max(sizeof(struct bpf_prog),
diff --git a/init/Kconfig b/init/Kconfig
index 223b734abccd..5a4e2d99cc38 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1634,6 +1634,7 @@ config BPF_SYSCALL
 	bool "Enable bpf() system call"
 	select ANON_INODES
 	select BPF
+	select CRYPTO_SHA256_LIB
 	default n
 	help
 	  Enable the bpf() system call that allows to manipulate eBPF
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1eb4f1303756..911993863799 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -148,22 +148,18 @@ void __bpf_prog_free(struct bpf_prog *fp)
 
 int bpf_prog_calc_digest(struct bpf_prog *fp)
 {
-	const u32 bits_offset = SHA_MESSAGE_BYTES - sizeof(__be64);
-	u32 raw_size = bpf_prog_digest_scratch_size(fp);
-	u32 ws[SHA_WORKSPACE_WORDS];
-	u32 i, bsize, psize, blocks;
+	struct sha256_state sha;
+	u32 i, psize;
 	struct bpf_insn *dst;
 	bool was_ld_map;
-	u8 *raw, *todo;
-	__be32 *result;
-	__be64 *bits;
+	u8 *raw;
 
-	raw = vmalloc(raw_size);
+	psize = bpf_prog_insn_size(fp);
+	raw = vmalloc(psize);
 	if (!raw)
 		return -ENOMEM;
 
-	sha_init(fp->digest);
-	memset(ws, 0, sizeof(ws));
+	sha256_init(&sha);
 
 	/* We need to take out the map fd for the digest calculation
 	 * since they are unstable from user space side.
@@ -188,30 +184,7 @@ int bpf_prog_calc_digest(struct bpf_prog *fp)
 		}
 	}
 
-	psize = bpf_prog_insn_size(fp);
-	memset(&raw[psize], 0, raw_size - psize);
-	raw[psize++] = 0x80;
-
-	bsize  = round_up(psize, SHA_MESSAGE_BYTES);
-	blocks = bsize / SHA_MESSAGE_BYTES;
-	todo   = raw;
-	if (bsize - psize >= sizeof(__be64)) {
-		bits = (__be64 *)(todo + bsize - sizeof(__be64));
-	} else {
-		bits = (__be64 *)(todo + bsize + bits_offset);
-		blocks++;
-	}
-	*bits = cpu_to_be64((psize - 1) << 3);
-
-	while (blocks--) {
-		sha_transform(fp->digest, todo, ws);
-		todo += SHA_MESSAGE_BYTES;
-	}
-
-	result = (__force __be32 *)fp->digest;
-	for (i = 0; i < SHA_DIGEST_WORDS; i++)
-		result[i] = cpu_to_be32(fp->digest[i]);
-
+	sha256_finup(&sha, raw, psize, fp->digest);
 	vfree(raw);
 	return 0;
 }
-- 
2.9.3

^ permalink raw reply related

* [RFC PATCH 4.10 4/6] bpf: Avoid copying the entire BPF program when hashing it
From: Andy Lutomirski @ 2016-12-24  2:22 UTC (permalink / raw)
  To: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List
  Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Andy Lutomirski, Alexei Starovoitov
In-Reply-To: <cover.1482545792.git.luto@kernel.org>

The sha256 helpers can consume a message incrementally, so there's no need
to allocate a buffer to store the whole blob to be hashed.

This may be a slight slowdown for very long messages because gcc can't
inline the sha256_update() calls.  For reasonably-sized programs,
however, this should be a considerable speedup as vmalloc() is quite
slow.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 kernel/bpf/core.c | 34 ++++++++++++++--------------------
 1 file changed, 14 insertions(+), 20 deletions(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 911993863799..1c2931f505af 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -149,43 +149,37 @@ void __bpf_prog_free(struct bpf_prog *fp)
 int bpf_prog_calc_digest(struct bpf_prog *fp)
 {
 	struct sha256_state sha;
-	u32 i, psize;
-	struct bpf_insn *dst;
+	u32 i;
 	bool was_ld_map;
-	u8 *raw;
-
-	psize = bpf_prog_insn_size(fp);
-	raw = vmalloc(psize);
-	if (!raw)
-		return -ENOMEM;
 
 	sha256_init(&sha);
 
 	/* We need to take out the map fd for the digest calculation
 	 * since they are unstable from user space side.
 	 */
-	dst = (void *)raw;
 	for (i = 0, was_ld_map = false; i < fp->len; i++) {
-		dst[i] = fp->insnsi[i];
+		struct bpf_insn insn = fp->insnsi[i];
+
 		if (!was_ld_map &&
-		    dst[i].code == (BPF_LD | BPF_IMM | BPF_DW) &&
-		    dst[i].src_reg == BPF_PSEUDO_MAP_FD) {
+		    insn.code == (BPF_LD | BPF_IMM | BPF_DW) &&
+		    insn.src_reg == BPF_PSEUDO_MAP_FD) {
 			was_ld_map = true;
-			dst[i].imm = 0;
+			insn.imm = 0;
 		} else if (was_ld_map &&
-			   dst[i].code == 0 &&
-			   dst[i].dst_reg == 0 &&
-			   dst[i].src_reg == 0 &&
-			   dst[i].off == 0) {
+			   insn.code == 0 &&
+			   insn.dst_reg == 0 &&
+			   insn.src_reg == 0 &&
+			   insn.off == 0) {
 			was_ld_map = false;
-			dst[i].imm = 0;
+			insn.imm = 0;
 		} else {
 			was_ld_map = false;
 		}
+
+		sha256_update(&sha, (const u8 *)&insn, sizeof(insn));
 	}
 
-	sha256_finup(&sha, raw, psize, fp->digest);
-	vfree(raw);
+	sha256_final(&sha, fp->digest);
 	return 0;
 }
 
-- 
2.9.3

^ permalink raw reply related

* [RFC PATCH 4.10 5/6] bpf: Rename fdinfo's prog_digest to prog_sha256
From: Andy Lutomirski @ 2016-12-24  2:22 UTC (permalink / raw)
  To: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List
  Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Andy Lutomirski, Alexei Starovoitov
In-Reply-To: <cover.1482545792.git.luto@kernel.org>

This makes it easier to add another digest algorithm down the road
if needed.  It also serves to force any programs that might have
been written against a kernel that had 'prog_digest' to be updated.

This shouldn't violate any stable API policies, as no released
kernel has ever had 'prog_digest'.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 kernel/bpf/syscall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e89acea22ecf..956370b80296 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -694,7 +694,7 @@ static void bpf_prog_show_fdinfo(struct seq_file *m, struct file *filp)
 	seq_printf(m,
 		   "prog_type:\t%u\n"
 		   "prog_jited:\t%u\n"
-		   "prog_digest:\t%s\n"
+		   "prog_sha256:\t%s\n"
 		   "memlock:\t%llu\n",
 		   prog->type,
 		   prog->jited,
-- 
2.9.3

^ permalink raw reply related

* [RFC PATCH 4.10 6/6] net: Rename TCA*BPF_DIGEST to ..._SHA256
From: Andy Lutomirski @ 2016-12-24  2:22 UTC (permalink / raw)
  To: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List
  Cc: Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Andy Lutomirski, Alexei Starovoitov
In-Reply-To: <cover.1482545792.git.luto@kernel.org>

This makes it easier to add another digest algorithm down the road if
needed.  It also serves to force any programs that might have been
written against a kernel that had the old field name to notice the
change and make any necessary changes.

This shouldn't violate any stable API policies, as no released kernel
has ever had TCA*BPF_DIGEST.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 include/uapi/linux/pkt_cls.h       | 2 +-
 include/uapi/linux/tc_act/tc_bpf.h | 2 +-
 net/sched/act_bpf.c                | 2 +-
 net/sched/cls_bpf.c                | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index cb4bcdc58543..ac6b300c1550 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -397,7 +397,7 @@ enum {
 	TCA_BPF_NAME,
 	TCA_BPF_FLAGS,
 	TCA_BPF_FLAGS_GEN,
-	TCA_BPF_DIGEST,
+	TCA_BPF_SHA256,
 	__TCA_BPF_MAX,
 };
 
diff --git a/include/uapi/linux/tc_act/tc_bpf.h b/include/uapi/linux/tc_act/tc_bpf.h
index a6b88a6f7f71..eae18a7430eb 100644
--- a/include/uapi/linux/tc_act/tc_bpf.h
+++ b/include/uapi/linux/tc_act/tc_bpf.h
@@ -27,7 +27,7 @@ enum {
 	TCA_ACT_BPF_FD,
 	TCA_ACT_BPF_NAME,
 	TCA_ACT_BPF_PAD,
-	TCA_ACT_BPF_DIGEST,
+	TCA_ACT_BPF_SHA256,
 	__TCA_ACT_BPF_MAX,
 };
 #define TCA_ACT_BPF_MAX (__TCA_ACT_BPF_MAX - 1)
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 1c60317f0121..3868a66d0b24 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -123,7 +123,7 @@ static int tcf_bpf_dump_ebpf_info(const struct tcf_bpf *prog,
 	    nla_put_string(skb, TCA_ACT_BPF_NAME, prog->bpf_name))
 		return -EMSGSIZE;
 
-	nla = nla_reserve(skb, TCA_ACT_BPF_DIGEST,
+	nla = nla_reserve(skb, TCA_ACT_BPF_SHA256,
 			  sizeof(prog->filter->digest));
 	if (nla == NULL)
 		return -EMSGSIZE;
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index adc776048d1a..6fc110321621 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -555,7 +555,7 @@ static int cls_bpf_dump_ebpf_info(const struct cls_bpf_prog *prog,
 	    nla_put_string(skb, TCA_BPF_NAME, prog->bpf_name))
 		return -EMSGSIZE;
 
-	nla = nla_reserve(skb, TCA_BPF_DIGEST, sizeof(prog->filter->digest));
+	nla = nla_reserve(skb, TCA_BPF_SHA256, sizeof(prog->filter->digest));
 	if (nla == NULL)
 		return -EMSGSIZE;
 
-- 
2.9.3

^ permalink raw reply related

* Re: [RFC PATCH 4.10 1/6] crypto/sha256: Refactor the API so it can be used without shash
From: Andy Lutomirski @ 2016-12-24  2:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List,
	Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Ard Biesheuvel, Herbert Xu
In-Reply-To: <942b91f25a63b22ec4946378a1fffe78d655cf18.1482545792.git.luto@kernel.org>

On Fri, Dec 23, 2016 at 6:22 PM, Andy Lutomirski <luto@kernel.org> wrote:
> There are some pieecs of kernel code that want to compute SHA256
> directly without going through the crypto core.  Adjust the exported
> API to decouple it from the crypto core.
>
> I suspect this will very slightly speed up the SHA256 shash operations
> as well by reducing the amount of indirection involved.
>

I should also mention: there's a nice potential cleanup that's
possible on top of this.  Currently, most of the accelerated SHA256
implementations just swap out the block function.  Another approach to
enabling this would be to restructure sha256_update along the lines
of:

sha256_block_fn_t fn = arch_sha256_block_fn(len);
sha256_base_do_update(sctx, data, len, arch_sha256_block_fn(len));

The idea being that arch code can decide whether to use an accelerated
block function based on context (x86, for example, can't always use
xmm regs) and length (on x86, using the accelerated versions for short
digests is very slow due to the state save/restore that happens) and
then the core code can just use it.

This would allow a lot of the boilerplate that this patch was forced
to modify to be deleted outright.

--Andy

^ permalink raw reply

* [PATCH net] net: korina: Fix NAPI versus resources freeing
From: Florian Fainelli @ 2016-12-24  3:56 UTC (permalink / raw)
  To: netdev; +Cc: davem, alex, phil, Florian Fainelli

Commit beb0babfb77e ("korina: disable napi on close and restart")
introduced calls to napi_disable() that were missing before,
unfortunately this leaves a small window during which NAPI has a chance
to run, yet we just freed resources since korina_free_ring() has been
called:

Fix this by disabling NAPI first then freeing resource, and make sure
that we also cancel the restart taks before doing the resource freeing.

Fixes: beb0babfb77e ("korina: disable napi on close and restart")
Reported-by: Alexandros C. Couloumbis <alex@ozo.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/korina.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index cbeea915f026..8037426ec50f 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -900,10 +900,10 @@ static void korina_restart_task(struct work_struct *work)
 				DMA_STAT_DONE | DMA_STAT_HALT | DMA_STAT_ERR,
 				&lp->rx_dma_regs->dmasm);
 
-	korina_free_ring(dev);
-
 	napi_disable(&lp->napi);
 
+	korina_free_ring(dev);
+
 	if (korina_init(dev) < 0) {
 		printk(KERN_ERR "%s: cannot restart device\n", dev->name);
 		return;
@@ -1064,12 +1064,12 @@ static int korina_close(struct net_device *dev)
 	tmp = tmp | DMA_STAT_DONE | DMA_STAT_HALT | DMA_STAT_ERR;
 	writel(tmp, &lp->rx_dma_regs->dmasm);
 
-	korina_free_ring(dev);
-
 	napi_disable(&lp->napi);
 
 	cancel_work_sync(&lp->restart_task);
 
+	korina_free_ring(dev);
+
 	free_irq(lp->rx_irq, dev);
 	free_irq(lp->tx_irq, dev);
 	free_irq(lp->ovr_irq, dev);
-- 
2.9.3

^ permalink raw reply related

* [PATCH v2] scsi: bfa: Increase requested firmware version to 3.2.5.1
From: Benjamin Poirier @ 2016-12-24  4:40 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Tim Ehlers, Rasesh Mody, Anil Gurumurthy, Sudarsana Kalluru,
	James E.J. Bottomley, linux-scsi, netdev, linux-kernel
In-Reply-To: <BLUPR0701MB15724638DA0BB8FB5CB9B3CF9F940@BLUPR0701MB1572.namprd07.prod.outlook.com>

bna & bfa firmware version 3.2.5.1 was submitted to linux-firmware on
Feb 17 19:10:20 2015 -0500 in 0ab54ff1dc ("linux-firmware: Add QLogic BR
Series Adapter Firmware").

bna was updated to use the newer firmware on Feb 19 16:02:32 2015 -0500 in
3f307c3d70 ("bna: Update the Driver and Firmware Version")

bfa was not updated. I presume this was an oversight but it broke support
for bfa+bna cards such as the following
	04:00.0 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
		1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
	04:00.1 Fibre Channel [0c04]: Brocade Communications Systems, Inc.
		1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
	04:00.2 Ethernet controller [0200]: Brocade Communications Systems,
		Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)
	04:00.3 Ethernet controller [0200]: Brocade Communications Systems,
		Inc. 1010/1020/1007/1741 10Gbps CNA [1657:0014] (rev 01)

Currently, if the bfa module is loaded first, bna fails to probe the
respective devices with
[  215.026787] bna: QLogic BR-series 10G Ethernet driver - version: 3.2.25.1
[  215.043707] bna 0000:04:00.2: bar0 mapped to ffffc90001fc0000, len 262144
[  215.060656] bna 0000:04:00.2: initialization failed err=1
[  215.073893] bna 0000:04:00.3: bar0 mapped to ffffc90002040000, len 262144
[  215.090644] bna 0000:04:00.3: initialization failed err=1

Whereas if bna is loaded first, bfa fails with
[  249.592109] QLogic BR-series BFA FC/FCOE SCSI driver - version: 3.2.25.0
[  249.610738] bfa 0000:04:00.0: Running firmware version is incompatible with the driver version
[  249.833513] bfa 0000:04:00.0: bfa init failed
[  249.833919] scsi host6: QLogic BR-series FC/FCOE Adapter, hwpath: 0000:04:00.0 driver: 3.2.25.0
[  249.841446] bfa 0000:04:00.1: Running firmware version is incompatible with the driver version
[  250.045449] bfa 0000:04:00.1: bfa init failed
[  250.045962] scsi host7: QLogic BR-series FC/FCOE Adapter, hwpath: 0000:04:00.1 driver: 3.2.25.0

Increase bfa's requested firmware version. Also increase the driver
version.
I only tested that all of the devices probe without error.

Reported-by: Tim Ehlers <tehlers@gwdg.de>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
---
 drivers/scsi/bfa/bfad.c     | 6 +++---
 drivers/scsi/bfa/bfad_drv.h | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

Changes v1-v2:
Also increase the driver version

diff --git a/drivers/scsi/bfa/bfad.c b/drivers/scsi/bfa/bfad.c
index 9d253cb..e70410b 100644
--- a/drivers/scsi/bfa/bfad.c
+++ b/drivers/scsi/bfa/bfad.c
@@ -64,9 +64,9 @@ int		max_rport_logins = BFA_FCS_MAX_RPORT_LOGINS;
 u32	bfi_image_cb_size, bfi_image_ct_size, bfi_image_ct2_size;
 u32	*bfi_image_cb, *bfi_image_ct, *bfi_image_ct2;
 
-#define BFAD_FW_FILE_CB		"cbfw-3.2.3.0.bin"
-#define BFAD_FW_FILE_CT		"ctfw-3.2.3.0.bin"
-#define BFAD_FW_FILE_CT2	"ct2fw-3.2.3.0.bin"
+#define BFAD_FW_FILE_CB		"cbfw-3.2.5.1.bin"
+#define BFAD_FW_FILE_CT		"ctfw-3.2.5.1.bin"
+#define BFAD_FW_FILE_CT2	"ct2fw-3.2.5.1.bin"
 
 static u32 *bfad_load_fwimg(struct pci_dev *pdev);
 static void bfad_free_fwimg(void);
diff --git a/drivers/scsi/bfa/bfad_drv.h b/drivers/scsi/bfa/bfad_drv.h
index f9e8620..cfcfff4 100644
--- a/drivers/scsi/bfa/bfad_drv.h
+++ b/drivers/scsi/bfa/bfad_drv.h
@@ -58,7 +58,7 @@
 #ifdef BFA_DRIVER_VERSION
 #define BFAD_DRIVER_VERSION    BFA_DRIVER_VERSION
 #else
-#define BFAD_DRIVER_VERSION    "3.2.25.0"
+#define BFAD_DRIVER_VERSION    "3.2.25.1"
 #endif
 
 #define BFAD_PROTO_NAME FCPI_NAME
-- 
2.10.2

^ permalink raw reply related

* Re: [PATCH v3 3/3] nfc: trf7970a: Prevent repeated polling from crashing the kernel
From: Mark Greer @ 2016-12-24  6:01 UTC (permalink / raw)
  To: Geoff Lansberry
  Cc: linux-wireless, lauro.venancio, aloisio.almeida, sameo, robh+dt,
	mark.rutland, netdev, devicetree, linux-kernel, justin,
	Jaret Cantu
In-Reply-To: <1482380314-16440-3-git-send-email-geoff@kuvee.com>

On Wed, Dec 21, 2016 at 11:18:34PM -0500, Geoff Lansberry wrote:
> From: Jaret Cantu <jaret.cantu@timesys.com>
> 
> Repeated polling attempts cause a NULL dereference error to occur.
> This is because the state of the trf7970a is currently reading but
> another request has been made to send a command before it has finished.
> 
> The solution is to properly kill the waiting reading (workqueue)
> before failing on the send.
> 
> Signed-off-by: Geoff Lansberry <geoff@kuvee.com>
> ---

You've still provided virtually no information on the actual problem(s)
nor justified why you think this is the best solution.  You're adding
code to a section of code that should _never_ be executed so the only
reasonable things I can infer is that there are, at least, two problems:

1) There is a bug causing execution to get into this block of code.

2) Once in this block of code, there is another bug.

You seem to be attempting to fix 2) and completely ignoring 1).
1) is the first bug that needs to be root-caused and fixed.

Also, what exactly is the "NULL dereference error" you mention?
Is this the neard crash you talked about in another thread or is
this a kernel crash?  If it is the kernel crash, please post the
relevant information.  If this is the neard crash - which seems
unlikely - then how can changing a section of kernel code that
shouldn't be executed in the first place fix that?

Mark

^ permalink raw reply

* Re: [PATCH net] net, sched: fix soft lockup in tc_classify
From: Cong Wang @ 2016-12-24  7:34 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Shahar Klein, Or Gerlitz, Roi Dayan, Jiri Pirko,
	John Fastabend, Linux Kernel Network Developers
In-Reply-To: <585C6F49.5030900@iogearbox.net>

On Thu, Dec 22, 2016 at 4:26 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 12/22/2016 08:05 PM, Cong Wang wrote:
>>
>> On Wed, Dec 21, 2016 at 1:07 PM, Daniel Borkmann <daniel@iogearbox.net>
>> wrote:
>>>
>>>
>>> Ok, you mean for net. In that case I prefer the smaller sized fix to be
>>> honest. It also covers everything from the point where we fetch the chain
>>> via cops->tcf_chain() to the end of the function, which is where most of
>>> the complexity resides, and only the two mentioned commits do the relock,
>>
>>
>> I really wish the problem is only about relocking, but look at the code,
>> the deeper reason why we have this bug is the complexity of the logic
>> inside tc_ctl_tfilter(): 1) the replay logic is hard, we have to make it
>> idempotent; 2) the request logic itself is hard, because of tc filter
>> design
>> and implementation.
>>
>> This is why I worry more than just relocking.
>
>
> But do you have a concrete 2nd issue/bug you're seeing? It rather sounds to
> me your argument is more about fear of complexity on tc framework itself.
> I agree it's complex, and tc_ctl_tfilter() is quite big in itself, where it
> would be good to reduce it's complexity into smaller pieces. But it's not
> really related to the fix itself, reducing complexity requires significantly
> more and deeper work on the code. We can rework tc_ctl_tfilter() in net-next
> to try to simplify it, sure, but I don't get why we have to discuss so much
> on this matter in this context, really.

Thanks for ignoring my point 1) above... You are dragging the discussion
further.


>
>>> so as a fix I think it's fine as-is. As mentioned, if there's need to
>>> refactor tc_ctl_tfilter() net-next would be better, imho.
>>
>>
>> Refactor is a too strong word, just moving the replay out is not a
>> refactor.
>> The end result will be still smaller than your commit d936377414fadbafb4,
>> which is already backported to stable.
>
>
> Commit d936377414fa is a whole different thing, and not related to the

Nope, you said "small", I show an example of non-small, perfectly fits
in the topic
and beats your argument w.r.t. patch size with your previous case.


> topic at all. The two-line changes with the initialization fix is quite
> straight forward and fixes a bug in the logic. It's also less complex in
> terms of lines but also in terms of complexity than to refactor or move the

Size doesn't tell everything, focus on the code:

1) With your current approach: we have to verify if 'tp_created' is really
correct in ALL cases including replay case; we also have to verify if any
other local variables are as correct as 'tp_created'.

2) With my proposed approach: replay case is much easier, compiler
does everything for us, the function itself is, and should be, good enough
for replaying, no need to track ANY local variables.

For me 2) is much better than 1). Don't look at size, look at the whole code
in a bigger picture.

> replay out. Moving it out contextually means that you also wouldn't rule
> out that things like nlmsg_parse(), or in-fact *any* of the
> __tc_ctl_tfilter()
> return paths could potentially return an error that suddenly would require
> a replay of the request. So, while moving it out fixes the bug, too, it also

The current replay loop already covers almost all of the function, so this
argument doesn't make sense.


> adds more potential points where you would go and replay the request where
> you didn't do so before and therefore also adds more complexity to the code
> that needs review/audit when fixing bugs since you don't have these
> hard/direct
> return paths anymore. Therefore I don't think it's better to go that way
> for the fix.

Please read the code again:

'goto replay' is located at line 383, tc_ctl_tfilter() ends at line 385

'replay' label is located at line 157, tc_ctl_tfilter() starts
(without local variables)
at line 153.

So, replay loop covers 227 lines of code, tc_ctl_tfilter() contains
233 lines of code,
therefore 97.4% of tc_ctl_tfilter() is the replay loop, moving it is
out is literately just
2.6%.

You call this refactor... Huh? Do your math please.

^ permalink raw reply

* RE: [PATCH] ethtool: add one ethtool option to set relax ordering mode
From: maowenan @ 2016-12-24  8:30 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jeff Kirsher, Stephen Hemminger, netdev@vger.kernel.org,
	weiyongjun (A), Dingtianhong, Wangzhou (B)
In-Reply-To: <CAKgT0Ud0=fQpADLHPnZBLrG8xLGHUB20TZt1mi+GBNYJQngDCA@mail.gmail.com>



> -----Original Message-----
> From: Alexander Duyck [mailto:alexander.duyck@gmail.com]
> Sent: Friday, December 23, 2016 11:43 PM
> To: maowenan
> Cc: Jeff Kirsher; Stephen Hemminger; netdev@vger.kernel.org; weiyongjun (A);
> Dingtianhong
> Subject: Re: [PATCH] ethtool: add one ethtool option to set relax ordering mode
> 
> On Thu, Dec 22, 2016 at 10:14 PM, maowenan <maowenan@huawei.com>
> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Jeff Kirsher [mailto:jeffrey.t.kirsher@intel.com]
> >> Sent: Friday, December 23, 2016 9:07 AM
> >> To: maowenan; Alexander Duyck
> >> Cc: Stephen Hemminger; netdev@vger.kernel.org; weiyongjun (A);
> >> Dingtianhong
> >> Subject: Re: [PATCH] ethtool: add one ethtool option to set relax
> >> ordering mode
> >>
> >> On Fri, 2016-12-23 at 00:40 +0000, maowenan wrote:
> >> > > -----Original Message-----
> >> > > From: Alexander Duyck [mailto:alexander.duyck@gmail.com]
> >> > > Sent: Thursday, December 22, 2016 11:54 PM
> >> > > To: maowenan
> >> > > Cc: Stephen Hemminger; netdev@vger.kernel.org;
> jeffrey.t.kirsher@intel.
> >> > > com;
> >> > > weiyongjun (A); Dingtianhong
> >> > > Subject: Re: [PATCH] ethtool: add one ethtool option to set relax
> >> > > ordering mode
> >> > >
> >> > > On Wed, Dec 21, 2016 at 5:39 PM, maowenan
> <maowenan@huawei.com>
> >> > > wrote:
> >> > > >
> >> > > >
> >> > > > > -----Original Message-----
> >> > > > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> >> > > > > Sent: Thursday, December 22, 2016 9:28 AM
> >> > > > > To: maowenan
> >> > > > > Cc: netdev@vger.kernel.org; jeffrey.t.kirsher@intel.com
> >> > > > > Subject: Re: [PATCH] ethtool: add one ethtool option to set
> >> > > > > relax ordering mode
> >> > > > >
> >> > > > > On Thu, 8 Dec 2016 14:51:38 +0800 Mao Wenan
> >> > > > > <maowenan@huawei.com> wrote:
> >> > > > >
> >> > > > > > This patch provides one way to set/unset IXGBE NIC TX and
> >> > > > > > RX relax ordering mode, which can be set by ethtool.
> >> > > > > > Relax ordering is one mode of 82599 NIC, to enable this
> >> > > > > > mode can enhance the performance for some cpu architecure.
> >> > > > >
> >> > > > > Then it should be done by CPU architecture specific quirks
> >> > > > > (preferably in PCI
> >> > > > > layer) so that all users get the option without having to do
> >> > > > > manual
> >> > >
> >> > > intervention.
> >> > > > >
> >> > > > > > example:
> >> > > > > > ethtool -s enp1s0f0 relaxorder off ethtool -s enp1s0f0
> >> > > > > > relaxorder on
> >> > > > >
> >> > > > > Doing it via ethtool is a developer API (for testing) not
> >> > > > > something that makes sense in production.
> >> > > >
> >> > > >
> >> > > > This feature is not mandatory for all users, acturally relax
> >> > > > ordering default configuration of 82599 is 'disable', So this
> >> > > > patch gives one way to
> >> > >
> >> > > enable relax ordering to be selected in some performance condition.
> >> > >
> >> > > That isn't quite true.  The default for Sparc systems is to have
> >> > > it enabled.
> >> > >
> >> > > Really this is something that is platform specific.  I agree with
> >> > > Stephen that it would work better if this was handled as a series
> >> > > of platform specific quirks handled at something like the PCI
> >> > > layer rather than be a switch the user can toggle on and off.
> >> > >
> >> > > With that being said there are changes being made that should
> >> > > help to improve the situation.  Specifically I am looking at
> >> > > adding support for the DMA_ATTR_WEAK_ORDERING which may also
> >> > > allow us to identify cases where you might be able to specify the
> >> > > DMA behavior via the DMA mapping instead of having to make the
> >> > > final decision in the device itself.
> >> > >
> >> > > - Alex
> >> >
> >> > Yes, Sparc is a special case. From the NIC driver point of view, It
> >> > is no need for some ARCHs to do particular operation and compiling
> >> > branch, ethtool is a flexible method for user to make decision
> >> > whether
> >> > on|off this feature.
> >> > I think Jeff as maintainer of 82599 has some comments about this.
> >>
> >> My original comment/objection was that you attempted to do this
> >> change as a module parameter to the ixgbe driver, where I directed
> >> you to use ethtool so that other drivers could benefit from the
> >> ability to enable/disable relaxed ordering.  As far as how it gets
> >> implemented in ethtool or PCI layer, makes little difference to me, I
> >> only had issues with the driver specific module parameter implementation,
> which is not acceptable.
> >
> >
> > Thank you Jeff and Alex.
> > And then I have gone through mail thread about "i40e: enable PCIe
> > relax ordering for SPARC", It only works for SPARC, any other ARCH who
> > wants to enable DMA_ATTR_WEAK_ORDERING feature, should define the
> new macro, recompile the driver module.
> >
> > Because of the above reasons, we implement in ethtool to give the
> > final user a convenient way to on|off special feature, no need define
> > new macro, easy to extend the new features, and also good benefit for other
> driver as Jeff referred.
> >
> 
> I think the point is we shouldn't base the decision on user input.
> The fact is the PCIe device control register should have a bit that indicates if
> the device is allowed to enable relaxed ordering or not.
> If we can guarantee that the bit is set in all the cases where it should be set,
> and cleared in all the cases where it should not then we could use something
> like that to determine if the device is supposed to enable relaxed ordering
> instead of trying to make the decision ourselves.
> 
> - Alex

ok. We are focusing on the register. 
And yes, to enable relax ordering for 82599 should be set by one or more bits of Rx/TX DCA Control 
Register, these bits should be set in many cpu architectures, such as arm64, sparc, and so on, and 
should be cleared in other ARCHs.
By the way, how do you enable SPARC macro, how and where to define this compiling macro when user 
one to enable relax ordering under SPARC system?  
#ifndef CONFIG_SPARC




^ permalink raw reply

* Re: [RFC PATCH 4.10 1/6] crypto/sha256: Refactor the API so it can be used without shash
From: Ard Biesheuvel @ 2016-12-24 10:33 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Daniel Borkmann, Netdev, LKML, Linux Crypto Mailing List,
	Jason A. Donenfeld, Hannes Frederic Sowa, Alexei Starovoitov,
	Eric Dumazet, Eric Biggers, Tom Herbert, David S. Miller,
	Herbert Xu
In-Reply-To: <942b91f25a63b22ec4946378a1fffe78d655cf18.1482545792.git.luto@kernel.org>

Hi Andy,

On 24 December 2016 at 02:22, Andy Lutomirski <luto@kernel.org> wrote:
> There are some pieecs of kernel code that want to compute SHA256
> directly without going through the crypto core.  Adjust the exported
> API to decouple it from the crypto core.
>

There are a bunch of things happening at the same time in this patch,
i.e., unnecessary renames of functions with static linkage, return
type changes to the base prototypes (int (*)(...) to void (*)(...))
and the change for the base functions to take a struct sha256_state
ctx rather than a shash_desc. I suppose you are mainly after the
latter, so could we please drop the other changes?

For the name clashes, could we simply use the crypto_ prefix for the
globally visible functions rather than using names that are already in
use? (and having to go around clean up the conflicts)
As for the return type changes, the base functions intentionally
return int to allow tail calls from the functions exposed by the
crypto API (whose prototypes cannot be changed). Unlikely to matter in
the grand scheme of things (especially now that the base layer
consists of static inline functions primarily), but it is equally
pointless to go around and change them to return void IMO.

So what remains is the struct shash_desc to struct sha256_state
change, which makes sense given that you are after a sha256_digest()
function that does not require the crypto API. But it seems your use
case does not rely on incremental hashing, and so there is no reason
for the state to be exposed outside of the implementation, and we
could simply expose a crypto_sha256_digest() routine from the
sha256_generic.c implementation instead.

Also, I strongly feel that crypto and other security related patches
should be tested before being posted, even if they are only RFC,
especially when they are posted by high profile kernel devs like
yourself. (Your code incorrectly calls crypto_sha2_final() in a couple
of places, resulting in the finalization being performed twice, once
with the accelerated block transform and again with the generic
transform)

> I suspect this will very slightly speed up the SHA256 shash operations
> as well by reducing the amount of indirection involved.
>

I think you have a valid point when it comes to the complexity of the
crypto API in general. But the struct sha256_state is embedded in the
shash_desc rather than referred to via a pointer, so the level of
indirection does not appear to change. And given how 99.9% of the
SHA256 execution time is spent in the block transform routine anyway,
I expect the performance delta to be in the noise tbh.

Finally, another thing to keep in mind is that the base layers of
SHA-1, SHA-256 and SHA-512 are intentionally structured in the same
way. If there is a need for a digest() entry point, I'd prefer to add
them for all flavours.

E.g, something like

"""
@@ -126,3 +128,23 @@ static inline int sha256_base_finish(struct
shash_desc *desc, u8 *out)
        *sctx = (struct sha256_state){};
        return 0;
 }
+
+static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
+{
+       unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
+       struct sha256_state *sctx = shash_desc_ctx(desc);
+
+       return __sha256_base_finish(sctx, out, digest_size);
+}
+
+static inline int sha256_base_digest(const u8 *data, unsigned int len, u8 *out,
+                                    sha256_block_fn *block_fn)
+{
+       struct sha256_state sctx;
+
+       __sha256_base_init(&sctx);
+       sha256_base_do_update(&sctx, data, len, block_fn);
+       sha256_base_do_finalize(&sctx, block_fn);
+
+       return __sha256_base_finish(&sctx, out, SHA256_DIGEST_SIZE);
+}
"""

(where __sha256_base_init() and __sha256_base_finish() are the
existing functions modified to take a struct sha256_state rather than
a shash_desc) should be sufficient to allow a generic
crypto_sha256_digest() to be composed that does not rely on the crypto
API.

Whether this still belongs under crypto or under lib/sha256.c as a
library function (allowing archs to override it) is open for debate.
If hashing BPF programs becomes a hot spot, we probably have bigger
problems.

Regards,
Ard.

P.S. I do take your point regarding the arch_sha256_block_transform()
proposed in your follow up email, but there are some details (SIMD,
availability of the instructions etc) that would make it only suitable
for the generic implementation anyway, and the base layer was already
a huge improvement compared to the open coded implementations of the
SHA boilerplate.


> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/arm/crypto/sha2-ce-glue.c      | 10 ++++---
>  arch/arm/crypto/sha256_glue.c       | 23 ++++++++++-----
>  arch/arm/crypto/sha256_neon_glue.c  | 34 +++++++++++----------
>  arch/arm64/crypto/sha2-ce-glue.c    | 13 ++++----
>  arch/arm64/crypto/sha256-glue.c     | 59 +++++++++++++++++++++----------------
>  arch/x86/crypto/sha256_ssse3_glue.c | 46 +++++++++++++++++------------
>  arch/x86/purgatory/purgatory.c      |  2 +-
>  arch/x86/purgatory/sha256.c         | 25 ++--------------
>  arch/x86/purgatory/sha256.h         | 22 --------------
>  crypto/sha256_generic.c             | 50 +++++++++++++++++++++++--------
>  include/crypto/sha.h                | 29 ++++++++++++++----
>  include/crypto/sha256_base.h        | 40 ++++++++-----------------
>  12 files changed, 184 insertions(+), 169 deletions(-)
>  delete mode 100644 arch/x86/purgatory/sha256.h
>
> diff --git a/arch/arm/crypto/sha2-ce-glue.c b/arch/arm/crypto/sha2-ce-glue.c
> index 0755b2d657f3..8832c2f85591 100644
> --- a/arch/arm/crypto/sha2-ce-glue.c
> +++ b/arch/arm/crypto/sha2-ce-glue.c
> @@ -38,7 +38,7 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
>                 return crypto_sha256_arm_update(desc, data, len);
>
>         kernel_neon_begin();
> -       sha256_base_do_update(desc, data, len,
> +       sha256_base_do_update(sctx, data, len,
>                               (sha256_block_fn *)sha2_ce_transform);
>         kernel_neon_end();
>
> @@ -48,17 +48,19 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
>  static int sha2_ce_finup(struct shash_desc *desc, const u8 *data,
>                          unsigned int len, u8 *out)
>  {
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
>         if (!may_use_simd())
>                 return crypto_sha256_arm_finup(desc, data, len, out);
>
>         kernel_neon_begin();
>         if (len)
> -               sha256_base_do_update(desc, data, len,
> +               sha256_base_do_update(sctx, data, len,
>                                       (sha256_block_fn *)sha2_ce_transform);
> -       sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
> +       sha256_base_do_finalize(sctx, (sha256_block_fn *)sha2_ce_transform);
>         kernel_neon_end();
>
> -       return sha256_base_finish(desc, out);
> +       return crypto_sha2_final(desc, out);
>  }
>
>  static int sha2_ce_final(struct shash_desc *desc, u8 *out)
> diff --git a/arch/arm/crypto/sha256_glue.c b/arch/arm/crypto/sha256_glue.c
> index a84e869ef900..405a29a9a9d3 100644
> --- a/arch/arm/crypto/sha256_glue.c
> +++ b/arch/arm/crypto/sha256_glue.c
> @@ -36,27 +36,34 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
>  int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
>                              unsigned int len)
>  {
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
>         /* make sure casting to sha256_block_fn() is safe */
>         BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
>
> -       return sha256_base_do_update(desc, data, len,
> +       sha256_base_do_update(sctx, data, len,
>                                 (sha256_block_fn *)sha256_block_data_order);
> +       return 0;
>  }
>  EXPORT_SYMBOL(crypto_sha256_arm_update);
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int sha256_arm_final(struct shash_desc *desc, u8 *out)
>  {
> -       sha256_base_do_finalize(desc,
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> +       sha256_base_do_finalize(sctx,
>                                 (sha256_block_fn *)sha256_block_data_order);
> -       return sha256_base_finish(desc, out);
> +       return crypto_sha2_final(desc, out);

This is wrong: your crypto_sha2_final also calls sha256_base_do_finalize()

>  }
>
>  int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
>                             unsigned int len, u8 *out)
>  {
> -       sha256_base_do_update(desc, data, len,
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> +       sha256_base_do_update(sctx, data, len,
>                               (sha256_block_fn *)sha256_block_data_order);
> -       return sha256_final(desc, out);
> +       return crypto_sha2_final(desc, out);
>  }
>  EXPORT_SYMBOL(crypto_sha256_arm_finup);
>
> @@ -64,7 +71,7 @@ static struct shash_alg algs[] = { {
>         .digestsize     =       SHA256_DIGEST_SIZE,
>         .init           =       sha256_base_init,
>         .update         =       crypto_sha256_arm_update,
> -       .final          =       sha256_final,
> +       .final          =       sha256_arm_final,
>         .finup          =       crypto_sha256_arm_finup,
>         .descsize       =       sizeof(struct sha256_state),
>         .base           =       {
> @@ -79,7 +86,7 @@ static struct shash_alg algs[] = { {
>         .digestsize     =       SHA224_DIGEST_SIZE,
>         .init           =       sha224_base_init,
>         .update         =       crypto_sha256_arm_update,
> -       .final          =       sha256_final,
> +       .final          =       sha256_arm_final,
>         .finup          =       crypto_sha256_arm_finup,
>         .descsize       =       sizeof(struct sha256_state),
>         .base           =       {
> diff --git a/arch/arm/crypto/sha256_neon_glue.c b/arch/arm/crypto/sha256_neon_glue.c
> index 39ccd658817e..40c85d1d4c1e 100644
> --- a/arch/arm/crypto/sha256_neon_glue.c
> +++ b/arch/arm/crypto/sha256_neon_glue.c
> @@ -29,8 +29,8 @@
>  asmlinkage void sha256_block_data_order_neon(u32 *digest, const void *data,
>                                              unsigned int num_blks);
>
> -static int sha256_update(struct shash_desc *desc, const u8 *data,
> -                        unsigned int len)
> +static int sha256_neon_update(struct shash_desc *desc, const u8 *data,
> +                             unsigned int len)
>  {
>         struct sha256_state *sctx = shash_desc_ctx(desc);
>
> @@ -39,41 +39,43 @@ static int sha256_update(struct shash_desc *desc, const u8 *data,
>                 return crypto_sha256_arm_update(desc, data, len);
>
>         kernel_neon_begin();
> -       sha256_base_do_update(desc, data, len,
> +       sha256_base_do_update(sctx, data, len,
>                         (sha256_block_fn *)sha256_block_data_order_neon);
>         kernel_neon_end();
>
>         return 0;
>  }
>
> -static int sha256_finup(struct shash_desc *desc, const u8 *data,
> -                       unsigned int len, u8 *out)
> +static int sha256_neon_finup(struct shash_desc *desc, const u8 *data,
> +                            unsigned int len, u8 *out)
>  {
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
>         if (!may_use_simd())
>                 return crypto_sha256_arm_finup(desc, data, len, out);
>
>         kernel_neon_begin();
>         if (len)
> -               sha256_base_do_update(desc, data, len,
> +               sha256_base_do_update(sctx, data, len,
>                         (sha256_block_fn *)sha256_block_data_order_neon);
> -       sha256_base_do_finalize(desc,
> +       sha256_base_do_finalize(sctx,
>                         (sha256_block_fn *)sha256_block_data_order_neon);
>         kernel_neon_end();
>
> -       return sha256_base_finish(desc, out);
> +       return crypto_sha2_final(desc, out);
>  }
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int sha256_neon_final(struct shash_desc *desc, u8 *out)
>  {
> -       return sha256_finup(desc, NULL, 0, out);
> +       return sha256_neon_finup(desc, NULL, 0, out);
>  }
>
>  struct shash_alg sha256_neon_algs[] = { {
>         .digestsize     =       SHA256_DIGEST_SIZE,
>         .init           =       sha256_base_init,
> -       .update         =       sha256_update,
> -       .final          =       sha256_final,
> -       .finup          =       sha256_finup,
> +       .update         =       sha256_neon_update,
> +       .final          =       sha256_neon_final,
> +       .finup          =       sha256_neon_finup,
>         .descsize       =       sizeof(struct sha256_state),
>         .base           =       {
>                 .cra_name       =       "sha256",
> @@ -86,9 +88,9 @@ struct shash_alg sha256_neon_algs[] = { {
>  }, {
>         .digestsize     =       SHA224_DIGEST_SIZE,
>         .init           =       sha224_base_init,
> -       .update         =       sha256_update,
> -       .final          =       sha256_final,
> -       .finup          =       sha256_finup,
> +       .update         =       sha256_neon_update,
> +       .final          =       sha256_neon_final,
> +       .finup          =       sha256_neon_finup,
>         .descsize       =       sizeof(struct sha256_state),
>         .base           =       {
>                 .cra_name       =       "sha224",
> diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
> index 7cd587564a41..e38dd301abce 100644
> --- a/arch/arm64/crypto/sha2-ce-glue.c
> +++ b/arch/arm64/crypto/sha2-ce-glue.c
> @@ -39,7 +39,7 @@ static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
>
>         sctx->finalize = 0;
>         kernel_neon_begin_partial(28);
> -       sha256_base_do_update(desc, data, len,
> +       sha256_base_do_update(&sctx->sst, data, len,
>                               (sha256_block_fn *)sha2_ce_transform);
>         kernel_neon_end();
>
> @@ -64,13 +64,13 @@ static int sha256_ce_finup(struct shash_desc *desc, const u8 *data,
>         sctx->finalize = finalize;
>
>         kernel_neon_begin_partial(28);
> -       sha256_base_do_update(desc, data, len,
> +       sha256_base_do_update(&sctx->sst, data, len,
>                               (sha256_block_fn *)sha2_ce_transform);
>         if (!finalize)
> -               sha256_base_do_finalize(desc,
> +               sha256_base_do_finalize(&sctx->sst,
>                                         (sha256_block_fn *)sha2_ce_transform);
>         kernel_neon_end();
> -       return sha256_base_finish(desc, out);
> +       return crypto_sha2_final(desc, out);
>  }
>
>  static int sha256_ce_final(struct shash_desc *desc, u8 *out)
> @@ -79,9 +79,10 @@ static int sha256_ce_final(struct shash_desc *desc, u8 *out)
>
>         sctx->finalize = 0;
>         kernel_neon_begin_partial(28);
> -       sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
> +       sha256_base_do_finalize(&sctx->sst,
> +                               (sha256_block_fn *)sha2_ce_transform);
>         kernel_neon_end();
> -       return sha256_base_finish(desc, out);
> +       return crypto_sha2_final(desc, out);
>  }
>
>  static struct shash_alg algs[] = { {
> diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
> index a2226f841960..132a1ef89a71 100644
> --- a/arch/arm64/crypto/sha256-glue.c
> +++ b/arch/arm64/crypto/sha256-glue.c
> @@ -33,36 +33,39 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
>  asmlinkage void sha256_block_neon(u32 *digest, const void *data,
>                                   unsigned int num_blks);
>
> -static int sha256_update(struct shash_desc *desc, const u8 *data,
> -                        unsigned int len)
> +static int sha256_update_arm64(struct shash_desc *desc, const u8 *data,
> +                              unsigned int len)
>  {
> -       return sha256_base_do_update(desc, data, len,
> -                               (sha256_block_fn *)sha256_block_data_order);
> +       sha256_base_do_update(shash_desc_ctx(desc), data, len,
> +                             (sha256_block_fn *)sha256_block_data_order);
> +       return 0;
>  }
>
> -static int sha256_finup(struct shash_desc *desc, const u8 *data,
> -                       unsigned int len, u8 *out)
> +static int sha256_finup_arm64(struct shash_desc *desc, const u8 *data,
> +                             unsigned int len, u8 *out)
>  {
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
>         if (len)
> -               sha256_base_do_update(desc, data, len,
> +               sha256_base_do_update(sctx, data, len,
>                                 (sha256_block_fn *)sha256_block_data_order);
> -       sha256_base_do_finalize(desc,
> +       sha256_base_do_finalize(sctx,
>                                 (sha256_block_fn *)sha256_block_data_order);
>
> -       return sha256_base_finish(desc, out);
> +       return crypto_sha2_final(desc, out);
>  }
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int sha256_final_arm64(struct shash_desc *desc, u8 *out)
>  {
> -       return sha256_finup(desc, NULL, 0, out);
> +       return sha256_finup_arm64(desc, NULL, 0, out);
>  }
>
>  static struct shash_alg algs[] = { {
>         .digestsize             = SHA256_DIGEST_SIZE,
>         .init                   = sha256_base_init,
> -       .update                 = sha256_update,
> -       .final                  = sha256_final,
> -       .finup                  = sha256_finup,
> +       .update                 = sha256_update_arm64,
> +       .final                  = sha256_final_arm64,
> +       .finup                  = sha256_finup_arm64,
>         .descsize               = sizeof(struct sha256_state),
>         .base.cra_name          = "sha256",
>         .base.cra_driver_name   = "sha256-arm64",
> @@ -73,9 +76,9 @@ static struct shash_alg algs[] = { {
>  }, {
>         .digestsize             = SHA224_DIGEST_SIZE,
>         .init                   = sha224_base_init,
> -       .update                 = sha256_update,
> -       .final                  = sha256_final,
> -       .finup                  = sha256_finup,
> +       .update                 = sha256_update_arm64,
> +       .final                  = sha256_final_arm64,
> +       .finup                  = sha256_finup_arm64,
>         .descsize               = sizeof(struct sha256_state),
>         .base.cra_name          = "sha224",
>         .base.cra_driver_name   = "sha224-arm64",
> @@ -88,18 +91,22 @@ static struct shash_alg algs[] = { {
>  static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
>                               unsigned int len)
>  {
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
>         /*
>          * Stacking and unstacking a substantial slice of the NEON register
>          * file may significantly affect performance for small updates when
>          * executing in interrupt context, so fall back to the scalar code
>          * in that case.
>          */
> -       if (!may_use_simd())
> -               return sha256_base_do_update(desc, data, len,
> +       if (!may_use_simd()) {
> +               sha256_base_do_update(sctx, data, len,
>                                 (sha256_block_fn *)sha256_block_data_order);
> +               return 0;
> +       }
>
>         kernel_neon_begin();
> -       sha256_base_do_update(desc, data, len,
> +       sha256_base_do_update(sctx, data, len,
>                                 (sha256_block_fn *)sha256_block_neon);
>         kernel_neon_end();
>
> @@ -109,22 +116,24 @@ static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
>  static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
>                              unsigned int len, u8 *out)
>  {
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
>         if (!may_use_simd()) {
>                 if (len)
> -                       sha256_base_do_update(desc, data, len,
> +                       sha256_base_do_update(sctx, data, len,
>                                 (sha256_block_fn *)sha256_block_data_order);
> -               sha256_base_do_finalize(desc,
> +               sha256_base_do_finalize(sctx,
>                                 (sha256_block_fn *)sha256_block_data_order);
>         } else {
>                 kernel_neon_begin();
>                 if (len)
> -                       sha256_base_do_update(desc, data, len,
> +                       sha256_base_do_update(sctx, data, len,
>                                 (sha256_block_fn *)sha256_block_neon);
> -               sha256_base_do_finalize(desc,
> +               sha256_base_do_finalize(sctx,
>                                 (sha256_block_fn *)sha256_block_neon);
>                 kernel_neon_end();
>         }
> -       return sha256_base_finish(desc, out);
> +       return crypto_sha2_final(desc, out);
>  }
>
>  static int sha256_final_neon(struct shash_desc *desc, u8 *out)
> diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
> index 9e79baf03a4b..e722fbaf0558 100644
> --- a/arch/x86/crypto/sha256_ssse3_glue.c
> +++ b/arch/x86/crypto/sha256_ssse3_glue.c
> @@ -44,52 +44,60 @@ asmlinkage void sha256_transform_ssse3(u32 *digest, const char *data,
>                                        u64 rounds);
>  typedef void (sha256_transform_fn)(u32 *digest, const char *data, u64 rounds);
>
> -static int sha256_update(struct shash_desc *desc, const u8 *data,
> -                        unsigned int len, sha256_transform_fn *sha256_xform)
> +static int sha256_fpu_update(struct shash_desc *desc, const u8 *data,
> +                              unsigned int len,
> +                              sha256_transform_fn *sha256_xform)
>  {
>         struct sha256_state *sctx = shash_desc_ctx(desc);
>
>         if (!irq_fpu_usable() ||
> -           (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
> -               return crypto_sha256_update(desc, data, len);
> +           (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE) {
> +               sha256_update(sctx, data, len);
> +               return 0;
> +       }
>
>         /* make sure casting to sha256_block_fn() is safe */
>         BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
>
>         kernel_fpu_begin();
> -       sha256_base_do_update(desc, data, len,
> +       sha256_base_do_update(sctx, data, len,
>                               (sha256_block_fn *)sha256_xform);
>         kernel_fpu_end();
>
>         return 0;
>  }
>
> -static int sha256_finup(struct shash_desc *desc, const u8 *data,
> +static int sha256_fpu_finup(struct shash_desc *desc, const u8 *data,
>               unsigned int len, u8 *out, sha256_transform_fn *sha256_xform)
>  {
> -       if (!irq_fpu_usable())
> -               return crypto_sha256_finup(desc, data, len, out);
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> +       if (!irq_fpu_usable()) {
> +               sha256_finup(sctx, data, len, out);
> +               return 0;
> +       }
>
>         kernel_fpu_begin();
>         if (len)
> -               sha256_base_do_update(desc, data, len,
> +               sha256_base_do_update(sctx, data, len,
>                                       (sha256_block_fn *)sha256_xform);
> -       sha256_base_do_finalize(desc, (sha256_block_fn *)sha256_xform);
> +       sha256_base_do_finalize(sctx, (sha256_block_fn *)sha256_xform);
>         kernel_fpu_end();
>
> -       return sha256_base_finish(desc, out);
> +       crypto_sha2_final(desc, out);
> +       return 0;
>  }
>
>  static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data,
>                          unsigned int len)
>  {
> -       return sha256_update(desc, data, len, sha256_transform_ssse3);
> +       return sha256_fpu_update(desc, data, len, sha256_transform_ssse3);
>  }
>
>  static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data,
>               unsigned int len, u8 *out)
>  {
> -       return sha256_finup(desc, data, len, out, sha256_transform_ssse3);
> +       return sha256_fpu_finup(desc, data, len, out, sha256_transform_ssse3);
>  }
>
>  /* Add padding and return the message digest. */
> @@ -152,13 +160,13 @@ asmlinkage void sha256_transform_avx(u32 *digest, const char *data,
>  static int sha256_avx_update(struct shash_desc *desc, const u8 *data,
>                          unsigned int len)
>  {
> -       return sha256_update(desc, data, len, sha256_transform_avx);
> +       return sha256_fpu_update(desc, data, len, sha256_transform_avx);
>  }
>
>  static int sha256_avx_finup(struct shash_desc *desc, const u8 *data,
>                       unsigned int len, u8 *out)
>  {
> -       return sha256_finup(desc, data, len, out, sha256_transform_avx);
> +       return sha256_fpu_finup(desc, data, len, out, sha256_transform_avx);
>  }
>
>  static int sha256_avx_final(struct shash_desc *desc, u8 *out)
> @@ -236,13 +244,13 @@ asmlinkage void sha256_transform_rorx(u32 *digest, const char *data,
>  static int sha256_avx2_update(struct shash_desc *desc, const u8 *data,
>                          unsigned int len)
>  {
> -       return sha256_update(desc, data, len, sha256_transform_rorx);
> +       return sha256_fpu_update(desc, data, len, sha256_transform_rorx);
>  }
>
>  static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data,
>                       unsigned int len, u8 *out)
>  {
> -       return sha256_finup(desc, data, len, out, sha256_transform_rorx);
> +       return sha256_fpu_finup(desc, data, len, out, sha256_transform_rorx);
>  }
>
>  static int sha256_avx2_final(struct shash_desc *desc, u8 *out)
> @@ -318,13 +326,13 @@ asmlinkage void sha256_ni_transform(u32 *digest, const char *data,
>  static int sha256_ni_update(struct shash_desc *desc, const u8 *data,
>                          unsigned int len)
>  {
> -       return sha256_update(desc, data, len, sha256_ni_transform);
> +       return sha256_fpu_update(desc, data, len, sha256_ni_transform);
>  }
>
>  static int sha256_ni_finup(struct shash_desc *desc, const u8 *data,
>                       unsigned int len, u8 *out)
>  {
> -       return sha256_finup(desc, data, len, out, sha256_ni_transform);
> +       return sha256_fpu_finup(desc, data, len, out, sha256_ni_transform);
>  }
>
>  static int sha256_ni_final(struct shash_desc *desc, u8 *out)
> diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
> index 25e068ba3382..ed6e80b844cf 100644
> --- a/arch/x86/purgatory/purgatory.c
> +++ b/arch/x86/purgatory/purgatory.c
> @@ -10,7 +10,7 @@
>   * Version 2.  See the file COPYING for more details.
>   */
>
> -#include "sha256.h"
> +#include <crypto/sha.h>
>  #include "../boot/string.h"
>
>  struct sha_region {
> diff --git a/arch/x86/purgatory/sha256.c b/arch/x86/purgatory/sha256.c
> index 548ca675a14a..724925d5da61 100644
> --- a/arch/x86/purgatory/sha256.c
> +++ b/arch/x86/purgatory/sha256.c
> @@ -17,7 +17,7 @@
>
>  #include <linux/bitops.h>
>  #include <asm/byteorder.h>
> -#include "sha256.h"
> +#include <crypto/sha.h>
>  #include "../boot/string.h"
>
>  static inline u32 Ch(u32 x, u32 y, u32 z)
> @@ -208,22 +208,7 @@ static void sha256_transform(u32 *state, const u8 *input)
>         memset(W, 0, 64 * sizeof(u32));
>  }
>
> -int sha256_init(struct sha256_state *sctx)
> -{
> -       sctx->state[0] = SHA256_H0;
> -       sctx->state[1] = SHA256_H1;
> -       sctx->state[2] = SHA256_H2;
> -       sctx->state[3] = SHA256_H3;
> -       sctx->state[4] = SHA256_H4;
> -       sctx->state[5] = SHA256_H5;
> -       sctx->state[6] = SHA256_H6;
> -       sctx->state[7] = SHA256_H7;
> -       sctx->count = 0;
> -
> -       return 0;
> -}
> -
> -int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
> +void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
>  {
>         unsigned int partial, done;
>         const u8 *src;
> @@ -249,11 +234,9 @@ int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
>                 partial = 0;
>         }
>         memcpy(sctx->buf + partial, src, len - done);
> -
> -       return 0;
>  }
>
> -int sha256_final(struct sha256_state *sctx, u8 *out)
> +void sha256_final(struct sha256_state *sctx, u8 *out)
>  {
>         __be32 *dst = (__be32 *)out;
>         __be64 bits;
> @@ -278,6 +261,4 @@ int sha256_final(struct sha256_state *sctx, u8 *out)
>
>         /* Zeroize sensitive information. */
>         memset(sctx, 0, sizeof(*sctx));
> -
> -       return 0;
>  }
> diff --git a/arch/x86/purgatory/sha256.h b/arch/x86/purgatory/sha256.h
> deleted file mode 100644
> index bd15a4127735..000000000000
> --- a/arch/x86/purgatory/sha256.h
> +++ /dev/null
> @@ -1,22 +0,0 @@
> -/*
> - *  Copyright (C) 2014 Red Hat Inc.
> - *
> - *  Author: Vivek Goyal <vgoyal@redhat.com>
> - *
> - * This source code is licensed under the GNU General Public License,
> - * Version 2.  See the file COPYING for more details.
> - */
> -
> -#ifndef SHA256_H
> -#define SHA256_H
> -
> -
> -#include <linux/types.h>
> -#include <crypto/sha.h>
> -
> -extern int sha256_init(struct sha256_state *sctx);
> -extern int sha256_update(struct sha256_state *sctx, const u8 *input,
> -                               unsigned int length);
> -extern int sha256_final(struct sha256_state *sctx, u8 *hash);
> -
> -#endif /* SHA256_H */
> diff --git a/crypto/sha256_generic.c b/crypto/sha256_generic.c
> index 8f9c47e1a96e..f2747893402c 100644
> --- a/crypto/sha256_generic.c
> +++ b/crypto/sha256_generic.c
> @@ -231,6 +231,13 @@ static void sha256_transform(u32 *state, const u8 *input)
>         memzero_explicit(W, 64 * sizeof(u32));
>  }
>
> +int sha256_base_init(struct shash_desc *desc)
> +{
> +       sha256_init(shash_desc_ctx(desc));
> +       return 0;
> +}
> +EXPORT_SYMBOL(sha256_base_init);
> +
>  static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
>                                     int blocks)
>  {
> @@ -240,32 +247,49 @@ static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
>         }
>  }
>
> -int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
> +void sha256_update(struct sha256_state *sctx, const u8 *data,
>                           unsigned int len)
>  {
> -       return sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
> +       sha256_base_do_update(sctx, data, len, sha256_generic_block_fn);
> +}
> +EXPORT_SYMBOL(sha256_update);
> +
> +void sha256_final(struct sha256_state *sctx, u8 *out)
> +{
> +       sha256_base_do_finalize(sctx, sha256_generic_block_fn);
> +       sha256_base_finish(sctx, out);
>  }
> -EXPORT_SYMBOL(crypto_sha256_update);
> +EXPORT_SYMBOL(sha256_final);
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
> +                               unsigned int len)
>  {
> -       sha256_base_do_finalize(desc, sha256_generic_block_fn);
> -       return sha256_base_finish(desc, out);
> +       sha256_update(shash_desc_ctx(desc), data, len);
> +       return 0;
> +}
> +
> +int crypto_sha2_final(struct shash_desc *desc, u8 *out)
> +{
> +       struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> +       sha256_base_do_finalize(sctx, sha256_generic_block_fn);
> +       sha2_base_finish(sctx, crypto_shash_digestsize(desc->tfm), out);
> +       return 0;
>  }
> +EXPORT_SYMBOL(crypto_sha2_final);
>
> -int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
> -                       unsigned int len, u8 *hash)
> +static int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
> +                              unsigned int len, u8 *hash)
>  {
> -       sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
> -       return sha256_final(desc, hash);
> +       sha256_finup(shash_desc_ctx(desc), data, len, hash);
> +       return 0;
>  }
> -EXPORT_SYMBOL(crypto_sha256_finup);
>
>  static struct shash_alg sha256_algs[2] = { {
>         .digestsize     =       SHA256_DIGEST_SIZE,
>         .init           =       sha256_base_init,
>         .update         =       crypto_sha256_update,
> -       .final          =       sha256_final,
> +       .final          =       crypto_sha2_final,
>         .finup          =       crypto_sha256_finup,
>         .descsize       =       sizeof(struct sha256_state),
>         .base           =       {
> @@ -279,7 +303,7 @@ static struct shash_alg sha256_algs[2] = { {
>         .digestsize     =       SHA224_DIGEST_SIZE,
>         .init           =       sha224_base_init,
>         .update         =       crypto_sha256_update,
> -       .final          =       sha256_final,
> +       .final          =       crypto_sha2_final,
>         .finup          =       crypto_sha256_finup,
>         .descsize       =       sizeof(struct sha256_state),
>         .base           =       {
> diff --git a/include/crypto/sha.h b/include/crypto/sha.h
> index c94d3eb1cefd..2b6978471605 100644
> --- a/include/crypto/sha.h
> +++ b/include/crypto/sha.h
> @@ -96,11 +96,30 @@ extern int crypto_sha1_update(struct shash_desc *desc, const u8 *data,
>  extern int crypto_sha1_finup(struct shash_desc *desc, const u8 *data,
>                              unsigned int len, u8 *hash);
>
> -extern int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
> -                             unsigned int len);
> -
> -extern int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
> -                              unsigned int len, u8 *hash);
> +static inline void sha256_init(struct sha256_state *sctx)
> +{
> +       sctx->state[0] = SHA256_H0;
> +       sctx->state[1] = SHA256_H1;
> +       sctx->state[2] = SHA256_H2;
> +       sctx->state[3] = SHA256_H3;
> +       sctx->state[4] = SHA256_H4;
> +       sctx->state[5] = SHA256_H5;
> +       sctx->state[6] = SHA256_H6;
> +       sctx->state[7] = SHA256_H7;
> +       sctx->count = 0;
> +}
> +
> +extern void sha256_update(struct sha256_state *sctx, const u8 *data,
> +                         unsigned int len);
> +
> +extern void sha256_final(struct sha256_state *sctx, u8 *out);
> +
> +static inline void sha256_finup(struct sha256_state *sctx, const u8 *data,
> +                               unsigned int len, u8 *hash)
> +{
> +       sha256_update(sctx, data, len);
> +       sha256_final(sctx, hash);
> +}
>
>  extern int crypto_sha512_update(struct shash_desc *desc, const u8 *data,
>                               unsigned int len);
> diff --git a/include/crypto/sha256_base.h b/include/crypto/sha256_base.h
> index d1f2195bb7de..f65d9a516b36 100644
> --- a/include/crypto/sha256_base.h
> +++ b/include/crypto/sha256_base.h
> @@ -35,29 +35,13 @@ static inline int sha224_base_init(struct shash_desc *desc)
>         return 0;
>  }
>
> -static inline int sha256_base_init(struct shash_desc *desc)
> -{
> -       struct sha256_state *sctx = shash_desc_ctx(desc);
> -
> -       sctx->state[0] = SHA256_H0;
> -       sctx->state[1] = SHA256_H1;
> -       sctx->state[2] = SHA256_H2;
> -       sctx->state[3] = SHA256_H3;
> -       sctx->state[4] = SHA256_H4;
> -       sctx->state[5] = SHA256_H5;
> -       sctx->state[6] = SHA256_H6;
> -       sctx->state[7] = SHA256_H7;
> -       sctx->count = 0;
> -
> -       return 0;
> -}
> +extern int sha256_base_init(struct shash_desc *desc);
>
> -static inline int sha256_base_do_update(struct shash_desc *desc,
> +static inline void sha256_base_do_update(struct sha256_state *sctx,
>                                         const u8 *data,
>                                         unsigned int len,
>                                         sha256_block_fn *block_fn)
>  {
> -       struct sha256_state *sctx = shash_desc_ctx(desc);
>         unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
>
>         sctx->count += len;
> @@ -86,15 +70,12 @@ static inline int sha256_base_do_update(struct shash_desc *desc,
>         }
>         if (len)
>                 memcpy(sctx->buf + partial, data, len);
> -
> -       return 0;
>  }
>
> -static inline int sha256_base_do_finalize(struct shash_desc *desc,
> +static inline void sha256_base_do_finalize(struct sha256_state *sctx,
>                                           sha256_block_fn *block_fn)
>  {
>         const int bit_offset = SHA256_BLOCK_SIZE - sizeof(__be64);
> -       struct sha256_state *sctx = shash_desc_ctx(desc);
>         __be64 *bits = (__be64 *)(sctx->buf + bit_offset);
>         unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
>
> @@ -109,14 +90,11 @@ static inline int sha256_base_do_finalize(struct shash_desc *desc,
>         memset(sctx->buf + partial, 0x0, bit_offset - partial);
>         *bits = cpu_to_be64(sctx->count << 3);
>         block_fn(sctx, sctx->buf, 1);
> -
> -       return 0;
>  }
>
> -static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
> +static inline void sha2_base_finish(struct sha256_state *sctx,
> +                                   unsigned int digest_size, u8 *out)
>  {
> -       unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
> -       struct sha256_state *sctx = shash_desc_ctx(desc);
>         __be32 *digest = (__be32 *)out;
>         int i;
>
> @@ -124,5 +102,11 @@ static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
>                 put_unaligned_be32(sctx->state[i], digest++);
>
>         *sctx = (struct sha256_state){};
> -       return 0;
>  }
> +
> +static inline void sha256_base_finish(struct sha256_state *sctx, u8 *out)
> +{
> +       sha2_base_finish(sctx, SHA256_DIGEST_SIZE, out);
> +}
> +
> +extern int crypto_sha2_final(struct shash_desc *desc, u8 *out);
> --
> 2.9.3
>

^ permalink raw reply

* Re: [PATCH] drivers: net: ethernet: 3com: fix return value
From: Thomas Preisner @ 2016-12-24 12:02 UTC (permalink / raw)
  To: dave
  Cc: netdev, linux-kernel, linux-kernel, milan.stephan+linux,
	thomas.preisner+linux
In-Reply-To: <1482541554.16278.2.camel@dillow-glaptop.roam.corp.google.com>

On Sat, 2016-12-24 at 02:06 +0100, David Dillow wrote:
>On Sat, 2016-12-24 at 00:00 +0100, Thomas Preisner wrote:
>> diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
>> index a0cacbe..9a3ab58 100644
>> --- a/drivers/net/ethernet/3com/typhoon.c
>> +++ b/drivers/net/ethernet/3com/typhoon.c
>> @@ -2404,6 +2404,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  
>>  	if(!is_valid_ether_addr(dev->dev_addr)) {
>>  		err_msg = "Could not obtain valid ethernet address, aborting";
>> +		err = -EIO;
>>  		goto error_out_reset;
>
>The change above is fine, but the other two should use the return value
>from the failing function call.
>
>
>> @@ -2413,6 +2414,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  	INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS);
>>  	if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) {
>>  		err_msg = "Could not get Sleep Image version";
>> +		err = -EIO;
>>  		goto error_out_reset;
>>  	}
>>  
>> @@ -2455,6 +2457,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  
>>  	if(register_netdev(dev) < 0) {
>>  		err_msg = "unable to register netdev";
>> +		err = -EIO;
>>  		goto error_out_reset;
>>  	}
>>  

You are of course right. After you mentioning this we've looked into it a bit
further and realized that the return values of failing function calls are not
being used in various occasions inside of typhoon_init_one().
That's why we've created a second patch to fix this misbehavior (if it is one).
In case this was intended, feel free to ignore the second patch.

Patch 1:
Makes the function typhoon_init_one() return a negative error code instead of 0.

Patch 2 [Optional]:
Makes the function typhoon_init_one() return the return value of the
corresponding failing function calls instead of a "fixed" negative error code.

With regards (and merry christmas),
Milan and Thomas

^ permalink raw reply

* [PATCH v2 1/2] drivers: net: ethernet: 3com: fix return value
From: Thomas Preisner @ 2016-12-24 12:02 UTC (permalink / raw)
  To: dave
  Cc: netdev, linux-kernel, linux-kernel, milan.stephan+linux,
	thomas.preisner+linux
In-Reply-To: <1482580958-15406-1-git-send-email-thomas.preisner+linux@fau.de>

In a few cases the err-variable is not set to a negative error code if a
function call fails and thus 0 is returned instead.
It may be better to set err to the appropriate negative error code
before returning.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188841

Reported-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Thomas Preisner <thomas.preisner+linux@fau.de>
Signed-off-by: Milan Stephan <milan.stephan+linux@fau.de>
---
 drivers/net/ethernet/3com/typhoon.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index a0cacbe..c88b88a 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -2404,6 +2404,7 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	if(!is_valid_ether_addr(dev->dev_addr)) {
 		err_msg = "Could not obtain valid ethernet address, aborting";
+		err = -EIO;
 		goto error_out_reset;
 	}
 
@@ -2411,7 +2412,8 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	 * later when we print out the version reported.
 	 */
 	INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_VERSIONS);
-	if(typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp) < 0) {
+	err = typhoon_issue_command(tp, 1, &xp_cmd, 3, xp_resp);
+	if(err < 0) {
 		err_msg = "Could not get Sleep Image version";
 		goto error_out_reset;
 	}
@@ -2453,7 +2455,8 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	dev->features = dev->hw_features |
 		NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_RXCSUM;
 
-	if(register_netdev(dev) < 0) {
+	err = register_netdev(dev);
+	if(err < 0) {
 		err_msg = "unable to register netdev";
 		goto error_out_reset;
 	}
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2 2/2] drivers: net: ethernet: 3com: fix return value
From: Thomas Preisner @ 2016-12-24 12:02 UTC (permalink / raw)
  To: dave
  Cc: netdev, linux-kernel, linux-kernel, milan.stephan+linux,
	thomas.preisner+linux
In-Reply-To: <1482580958-15406-1-git-send-email-thomas.preisner+linux@fau.de>

In some cases the return value of a failing function is not being used
and the function typhoon_init_one() returns another negative error
code instead.

Signed-off-by: Thomas Preisner <thomas.preisner+linux@fau.de>
Signed-off-by: Milan Stephan <milan.stephan+linux@fau.de>
---
 drivers/net/ethernet/3com/typhoon.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index c88b88a..8821a24 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -2370,9 +2370,9 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	 * 4) Get the hardware address.
 	 * 5) Put the card to sleep.
 	 */
-	if (typhoon_reset(ioaddr, WaitSleep) < 0) {
+	err = typhoon_reset(ioaddr, WaitSleep);
+	if (err < 0) {
 		err_msg = "could not reset 3XP";
-		err = -EIO;
 		goto error_out_dma;
 	}
 
@@ -2386,16 +2386,16 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	typhoon_init_interface(tp);
 	typhoon_init_rings(tp);
 
-	if(typhoon_boot_3XP(tp, TYPHOON_STATUS_WAITING_FOR_HOST) < 0) {
+	err = typhoon_boot_3XP(tp, TYPHOON_STATUS_WAITING_FOR_HOST);
+	if(err < 0) {
 		err_msg = "cannot boot 3XP sleep image";
-		err = -EIO;
 		goto error_out_reset;
 	}
 
 	INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_MAC_ADDRESS);
-	if(typhoon_issue_command(tp, 1, &xp_cmd, 1, xp_resp) < 0) {
+	err = typhoon_issue_command(tp, 1, &xp_cmd, 1, xp_resp);
+	if(err < 0) {
 		err_msg = "cannot read MAC address";
-		err = -EIO;
 		goto error_out_reset;
 	}
 
@@ -2430,9 +2430,9 @@ typhoon_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if(xp_resp[0].numDesc != 0)
 		tp->capabilities |= TYPHOON_WAKEUP_NEEDS_RESET;
 
-	if(typhoon_sleep(tp, PCI_D3hot, 0) < 0) {
+	err = typhoon_sleep(tp, PCI_D3hot, 0);
+	if(err < 0) {
 		err_msg = "cannot put adapter to sleep";
-		err = -EIO;
 		goto error_out_reset;
 	}
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH] ipv4: Namespaceify tcp_tw_reuse knob
From: Haishuang Yan @ 2016-12-24 12:43 UTC (permalink / raw)
  To: David S. Miller, lexey Kuznetsov, James Morris, Patrick McHardy
  Cc: netdev, linux-kernel, Haishuang Yan

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
---
 include/net/netns/ipv4.h   |  1 +
 include/net/tcp.h          |  1 -
 net/ipv4/sysctl_net_ipv4.c | 14 +++++++-------
 net/ipv4/tcp_ipv4.c        |  4 ++--
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index f0cf5a1..0378e88 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -110,6 +110,7 @@ struct netns_ipv4 {
 	int sysctl_tcp_orphan_retries;
 	int sysctl_tcp_fin_timeout;
 	unsigned int sysctl_tcp_notsent_lowat;
+	int sysctl_tcp_tw_reuse;
 
 	int sysctl_igmp_max_memberships;
 	int sysctl_igmp_max_msf;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 207147b..6061963 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -252,7 +252,6 @@
 extern int sysctl_tcp_rmem[3];
 extern int sysctl_tcp_app_win;
 extern int sysctl_tcp_adv_win_scale;
-extern int sysctl_tcp_tw_reuse;
 extern int sysctl_tcp_frto;
 extern int sysctl_tcp_low_latency;
 extern int sysctl_tcp_nometrics_save;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 80bc36b..22cbd61 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -433,13 +433,6 @@ static int proc_tcp_fastopen_key(struct ctl_table *ctl, int write,
 		.extra2		= &tcp_adv_win_scale_max,
 	},
 	{
-		.procname	= "tcp_tw_reuse",
-		.data		= &sysctl_tcp_tw_reuse,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec
-	},
-	{
 		.procname	= "tcp_frto",
 		.data		= &sysctl_tcp_frto,
 		.maxlen		= sizeof(int),
@@ -960,6 +953,13 @@ static int proc_tcp_fastopen_key(struct ctl_table *ctl, int write,
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname	= "tcp_tw_reuse",
+		.data		= &init_net.ipv4.sysctl_tcp_tw_reuse,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
 	{
 		.procname	= "fib_multipath_use_neigh",
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 30d81f5..fe9da4f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -84,7 +84,6 @@
 #include <crypto/hash.h>
 #include <linux/scatterlist.h>
 
-int sysctl_tcp_tw_reuse __read_mostly;
 int sysctl_tcp_low_latency __read_mostly;
 
 #ifdef CONFIG_TCP_MD5SIG
@@ -120,7 +119,7 @@ int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
 	   and use initial timestamp retrieved from peer table.
 	 */
 	if (tcptw->tw_ts_recent_stamp &&
-	    (!twp || (sysctl_tcp_tw_reuse &&
+	    (!twp || (sock_net(sk)->ipv4.sysctl_tcp_tw_reuse &&
 			     get_seconds() - tcptw->tw_ts_recent_stamp > 1))) {
 		tp->write_seq = tcptw->tw_snd_nxt + 65535 + 2;
 		if (tp->write_seq == 0)
@@ -2456,6 +2455,7 @@ static int __net_init tcp_sk_init(struct net *net)
 	net->ipv4.sysctl_tcp_orphan_retries = 0;
 	net->ipv4.sysctl_tcp_fin_timeout = TCP_FIN_TIMEOUT;
 	net->ipv4.sysctl_tcp_notsent_lowat = UINT_MAX;
+	net->ipv4.sysctl_tcp_tw_reuse = 0;
 
 	return 0;
 fail:
-- 
1.8.3.1

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox