public inbox for linux-kernel@vger.kernel.org
* [PATCH] scsi: fc: use get/put_unaligned64 for wwn access
@ 2016-03-16 16:39 Arnd Bergmann
  2016-03-16 17:44 ` Ewan D. Milne
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Arnd Bergmann @ 2016-03-16 16:39 UTC
  To: James E.J. Bottomley, Martin K. Petersen
  Cc: Arnd Bergmann, James Bottomley, Hannes Reinecke, James Smart,
	Ewan D. Milne, linux-scsi, linux-kernel

A bug in the gcc-6.0 prerelease version caused at least one
driver (lpfc) to have excessive stack usage when dealing with
wwn data, on the ARM architecture.

lpfc_scsi.c: In function 'lpfc_find_next_oas_lun':
lpfc_scsi.c:117:1: warning: the frame size of 1152 bytes is larger than 1024 bytes [-Wframe-larger-than=]

I have reported this as a gcc regression in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232

However, using a better implementation of wwn_to_u64() not only
helps with the particular gcc problem but also leads to better
object code for any version or architecture.

The kernel already provides get_unaligned_be64() and
put_unaligned_be64() helper functions that provide an
optimized implementation with the desired semantics.

The lpfc_find_next_oas_lun() function from the example above,
which grew from 1146 bytes to 5144 bytes when moving from gcc-5.3
to gcc-6.0, is now 804 bytes, as the optimized
get_unaligned_be64() load can be done in three instructions.
The stack usage is now down to 28 bytes, from 128 bytes with
gcc-5.3 before this patch.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/scsi/scsi_transport_fc.h | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index 784bc2c0929f..bf66ea6bed2b 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -28,6 +28,7 @@
 #define SCSI_TRANSPORT_FC_H
 
 #include <linux/sched.h>
+#include <asm/unaligned.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_netlink.h>
 
@@ -797,22 +798,12 @@ fc_remote_port_chkready(struct fc_rport *rport)
 
 static inline u64 wwn_to_u64(u8 *wwn)
 {
-	return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
-	    (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
-	    (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
-	    (u64)wwn[6] <<  8 | (u64)wwn[7];
+	return get_unaligned_be64(wwn);
 }
 
 static inline void u64_to_wwn(u64 inm, u8 *wwn)
 {
-	wwn[0] = (inm >> 56) & 0xff;
-	wwn[1] = (inm >> 48) & 0xff;
-	wwn[2] = (inm >> 40) & 0xff;
-	wwn[3] = (inm >> 32) & 0xff;
-	wwn[4] = (inm >> 24) & 0xff;
-	wwn[5] = (inm >> 16) & 0xff;
-	wwn[6] = (inm >> 8) & 0xff;
-	wwn[7] = inm & 0xff;
+	put_unaligned_be64(inm, wwn);
 }
 
 /**
-- 
2.7.0
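
For readers following along outside the kernel tree, here is a rough
userspace sketch of why the two forms are interchangeable. It is not the
kernel's implementation of get_unaligned_be64()/put_unaligned_be64(): the
demo_* names are made up for illustration, and the byte-order test and
__builtin_bswap64() assume a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time assembly, mirroring the lines the patch removes. */
static inline uint64_t demo_wwn_to_u64_shift(const uint8_t *wwn)
{
	return (uint64_t)wwn[0] << 56 | (uint64_t)wwn[1] << 48 |
	       (uint64_t)wwn[2] << 40 | (uint64_t)wwn[3] << 32 |
	       (uint64_t)wwn[4] << 24 | (uint64_t)wwn[5] << 16 |
	       (uint64_t)wwn[6] <<  8 | (uint64_t)wwn[7];
}

/* Hypothetical stand-in for get_unaligned_be64(): one alignment-safe
 * 64-bit load via memcpy(), byte-swapped on little-endian hosts. */
static inline uint64_t demo_get_unaligned_be64(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	v = __builtin_bswap64(v);
#endif
	return v;
}

/* Hypothetical stand-in for put_unaligned_be64(): the inverse store. */
static inline void demo_put_unaligned_be64(uint64_t v, void *p)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	v = __builtin_bswap64(v);
#endif
	memcpy(p, &v, sizeof(v));
}

/* Returns 1 if both readers agree and the value round-trips. */
static int demo_check(void)
{
	/* One leading pad byte so the 8-byte name is deliberately misaligned. */
	uint8_t buf[9] = { 0, 0x10, 0x00, 0x00, 0x00, 0xc9, 0x12, 0x34, 0x56 };
	const uint8_t *wwn = buf + 1;
	uint8_t out[8];
	uint64_t v = demo_get_unaligned_be64(wwn);

	demo_put_unaligned_be64(v, out);
	return v == demo_wwn_to_u64_shift(wwn) && memcmp(out, wwn, 8) == 0;
}
```

Calling demo_check() from a small test program should return 1 on either
byte order. The memcpy() form gives the compiler a single 64-bit access to
optimize, which is roughly the property the commit message relies on.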


* Re: [PATCH] scsi: fc: use get/put_unaligned64 for wwn access
  2016-03-16 16:39 [PATCH] scsi: fc: use get/put_unaligned64 for wwn access Arnd Bergmann
@ 2016-03-16 17:44 ` Ewan D. Milne
  2016-03-17 12:57 ` Hannes Reinecke
  2016-03-18 19:30 ` Martin K. Petersen
  2 siblings, 0 replies; 4+ messages in thread
From: Ewan D. Milne @ 2016-03-16 17:44 UTC
  To: Arnd Bergmann
  Cc: James E.J. Bottomley, Martin K. Petersen, James Bottomley,
	Hannes Reinecke, James Smart, linux-scsi, linux-kernel

On Wed, 2016-03-16 at 17:39 +0100, Arnd Bergmann wrote:
> A bug in the gcc-6.0 prerelease version caused at least one
> driver (lpfc) to have excessive stack usage when dealing with
> wwn data, on the ARM architecture.
> 
> lpfc_scsi.c: In function 'lpfc_find_next_oas_lun':
> lpfc_scsi.c:117:1: warning: the frame size of 1152 bytes is larger than 1024 bytes [-Wframe-larger-than=]
> 
> I have reported this as a gcc regression in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232
> 
> However, using a better implementation of wwn_to_u64() not only
> helps with the particular gcc problem but also leads to better
> object code for any version or architecture.
> 
> The kernel already provides get_unaligned_be64() and
> put_unaligned_be64() helper functions that provide an
> optimized implementation with the desired semantics.
> 
> The lpfc_find_next_oas_lun() function from the example above,
> which grew from 1146 bytes to 5144 bytes when moving from gcc-5.3
> to gcc-6.0, is now 804 bytes, as the optimized
> get_unaligned_be64() load can be done in three instructions.
> The stack usage is now down to 28 bytes, from 128 bytes with
> gcc-5.3 before this patch.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  include/scsi/scsi_transport_fc.h | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
> index 784bc2c0929f..bf66ea6bed2b 100644
> --- a/include/scsi/scsi_transport_fc.h
> +++ b/include/scsi/scsi_transport_fc.h
> @@ -28,6 +28,7 @@
>  #define SCSI_TRANSPORT_FC_H
>  
>  #include <linux/sched.h>
> +#include <asm/unaligned.h>
>  #include <scsi/scsi.h>
>  #include <scsi/scsi_netlink.h>
>  
> @@ -797,22 +798,12 @@ fc_remote_port_chkready(struct fc_rport *rport)
>  
>  static inline u64 wwn_to_u64(u8 *wwn)
>  {
> -	return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
> -	    (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
> -	    (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
> -	    (u64)wwn[6] <<  8 | (u64)wwn[7];
> +	return get_unaligned_be64(wwn);
>  }
>  
>  static inline void u64_to_wwn(u64 inm, u8 *wwn)
>  {
> -	wwn[0] = (inm >> 56) & 0xff;
> -	wwn[1] = (inm >> 48) & 0xff;
> -	wwn[2] = (inm >> 40) & 0xff;
> -	wwn[3] = (inm >> 32) & 0xff;
> -	wwn[4] = (inm >> 24) & 0xff;
> -	wwn[5] = (inm >> 16) & 0xff;
> -	wwn[6] = (inm >> 8) & 0xff;
> -	wwn[7] = inm & 0xff;
> +	put_unaligned_be64(inm, wwn);
>  }
>  
>  /**

It would be nice to get rid of these functions completely and just
change the callers to use get/put_unaligned_be64() directly, like libfc
does, but that involves changing 7 drivers and scsi_transport_fc.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>


* Re: [PATCH] scsi: fc: use get/put_unaligned64 for wwn access
  2016-03-16 16:39 [PATCH] scsi: fc: use get/put_unaligned64 for wwn access Arnd Bergmann
  2016-03-16 17:44 ` Ewan D. Milne
@ 2016-03-17 12:57 ` Hannes Reinecke
  2016-03-18 19:30 ` Martin K. Petersen
  2 siblings, 0 replies; 4+ messages in thread
From: Hannes Reinecke @ 2016-03-17 12:57 UTC
  To: Arnd Bergmann, James E.J. Bottomley, Martin K. Petersen
  Cc: James Bottomley, James Smart, Ewan D. Milne, linux-scsi,
	linux-kernel

On 03/16/2016 05:39 PM, Arnd Bergmann wrote:
> A bug in the gcc-6.0 prerelease version caused at least one
> driver (lpfc) to have excessive stack usage when dealing with
> wwn data, on the ARM architecture.
> 
> lpfc_scsi.c: In function 'lpfc_find_next_oas_lun':
> lpfc_scsi.c:117:1: warning: the frame size of 1152 bytes is larger than 1024 bytes [-Wframe-larger-than=]
> 
> I have reported this as a gcc regression in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232
> 
> However, using a better implementation of wwn_to_u64() not only
> helps with the particular gcc problem but also leads to better
> object code for any version or architecture.
> 
> The kernel already provides get_unaligned_be64() and
> put_unaligned_be64() helper functions that provide an
> optimized implementation with the desired semantics.
> 
> The lpfc_find_next_oas_lun() function from the example above,
> which grew from 1146 bytes to 5144 bytes when moving from gcc-5.3
> to gcc-6.0, is now 804 bytes, as the optimized
> get_unaligned_be64() load can be done in three instructions.
> The stack usage is now down to 28 bytes, from 128 bytes with
> gcc-5.3 before this patch.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  include/scsi/scsi_transport_fc.h | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


* Re: [PATCH] scsi: fc: use get/put_unaligned64 for wwn access
  2016-03-16 16:39 [PATCH] scsi: fc: use get/put_unaligned64 for wwn access Arnd Bergmann
  2016-03-16 17:44 ` Ewan D. Milne
  2016-03-17 12:57 ` Hannes Reinecke
@ 2016-03-18 19:30 ` Martin K. Petersen
  2 siblings, 0 replies; 4+ messages in thread
From: Martin K. Petersen @ 2016-03-18 19:30 UTC
  To: Arnd Bergmann
  Cc: James E.J. Bottomley, Martin K. Petersen, James Bottomley,
	Hannes Reinecke, James Smart, Ewan D. Milne, linux-scsi,
	linux-kernel

>>>>> "Arnd" == Arnd Bergmann <arnd@arndb.de> writes:

Arnd> A bug in the gcc-6.0 prerelease version caused at least one driver
Arnd> (lpfc) to have excessive stack usage when dealing with wwn data,
Arnd> on the ARM architecture.

Applied to 4.6/scsi-fixes.

-- 
Martin K. Petersen	Oracle Linux Engineering
