* [PATCH] fs: optimise kiocb_set_rw_flags()
@ 2020-01-17  1:16 Pavel Begunkov
  2020-01-17  1:21 ` Matthew Wilcox
  0 siblings, 1 reply; 11+ messages in thread
From: Pavel Begunkov @ 2020-01-17  1:16 UTC (permalink / raw)
  To: Alexander Viro, linux-fsdevel, linux-kernel
kiocb_set_rw_flags() generates a poor code with several memory writes
and a lot of jumps. Help compilers to optimise it.
Tested with gcc 9.2 on x64-86, and as a result, it its output now is a
plain code without jumps accumulating in a register before a memory
write.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/fs.h | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 98e0349adb52..c3db8c80aed4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3402,22 +3402,27 @@ static inline int iocb_flags(struct file *file)
 
 static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 {
+	int kiocb_flags = 0;
+
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
 
 	if (flags & RWF_NOWAIT) {
 		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
 			return -EOPNOTSUPP;
-		ki->ki_flags |= IOCB_NOWAIT;
+		kiocb_flags |= IOCB_NOWAIT;
 	}
 	if (flags & RWF_HIPRI)
-		ki->ki_flags |= IOCB_HIPRI;
+		kiocb_flags |= IOCB_HIPRI;
 	if (flags & RWF_DSYNC)
-		ki->ki_flags |= IOCB_DSYNC;
+		kiocb_flags |= IOCB_DSYNC;
 	if (flags & RWF_SYNC)
-		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
+		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
-		ki->ki_flags |= IOCB_APPEND;
+		kiocb_flags |= IOCB_APPEND;
+
+	if (kiocb_flags)
+		ki->ki_flags |= kiocb_flags;
 	return 0;
 }
 
-- 
2.24.0
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-01-17  1:16 Pavel Begunkov
@ 2020-01-17  1:21 ` Matthew Wilcox
  2020-01-17  1:23   ` Pavel Begunkov
  0 siblings, 1 reply; 11+ messages in thread
From: Matthew Wilcox @ 2020-01-17  1:21 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Alexander Viro, linux-fsdevel, linux-kernel
On Fri, Jan 17, 2020 at 04:16:41AM +0300, Pavel Begunkov wrote:
> kiocb_set_rw_flags() generates a poor code with several memory writes
> and a lot of jumps. Help compilers to optimise it.
> 
> Tested with gcc 9.2 on x64-86, and as a result, it its output now is a
> plain code without jumps accumulating in a register before a memory
> write.
Nice!
>  static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>  {
> +	int kiocb_flags = 0;
> +
>  	if (unlikely(flags & ~RWF_SUPPORTED))
>  		return -EOPNOTSUPP;
>  
>  	if (flags & RWF_NOWAIT) {
>  		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
>  			return -EOPNOTSUPP;
> -		ki->ki_flags |= IOCB_NOWAIT;
> +		kiocb_flags |= IOCB_NOWAIT;
>  	}
>  	if (flags & RWF_HIPRI)
> -		ki->ki_flags |= IOCB_HIPRI;
> +		kiocb_flags |= IOCB_HIPRI;
>  	if (flags & RWF_DSYNC)
> -		ki->ki_flags |= IOCB_DSYNC;
> +		kiocb_flags |= IOCB_DSYNC;
>  	if (flags & RWF_SYNC)
> -		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> +		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
>  	if (flags & RWF_APPEND)
> -		ki->ki_flags |= IOCB_APPEND;
> +		kiocb_flags |= IOCB_APPEND;
> +
> +	if (kiocb_flags)
> +		ki->ki_flags |= kiocb_flags;
>  	return 0;
>  }
Might it generate even better code to do ...
 	int kiocb_flags = 0;
 
+	if (!flags)
+		return 0;
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
 
...
-	if (kiocb_flags)
-		ki->ki_flags |= kiocb_flags;
+	ki->ki_flags |= kiocb_flags;
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-01-17  1:21 ` Matthew Wilcox
@ 2020-01-17  1:23   ` Pavel Begunkov
  0 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-01-17  1:23 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Alexander Viro, linux-fsdevel, linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1669 bytes --]
On 17/01/2020 04:21, Matthew Wilcox wrote:
> On Fri, Jan 17, 2020 at 04:16:41AM +0300, Pavel Begunkov wrote:
>> kiocb_set_rw_flags() generates a poor code with several memory writes
>> and a lot of jumps. Help compilers to optimise it.
>>
>> Tested with gcc 9.2 on x64-86, and as a result, it its output now is a
>> plain code without jumps accumulating in a register before a memory
>> write.
> 
> Nice!
> 
>>  static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>>  {
>> +	int kiocb_flags = 0;
>> +
>>  	if (unlikely(flags & ~RWF_SUPPORTED))
>>  		return -EOPNOTSUPP;
>>  
>>  	if (flags & RWF_NOWAIT) {
>>  		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
>>  			return -EOPNOTSUPP;
>> -		ki->ki_flags |= IOCB_NOWAIT;
>> +		kiocb_flags |= IOCB_NOWAIT;
>>  	}
>>  	if (flags & RWF_HIPRI)
>> -		ki->ki_flags |= IOCB_HIPRI;
>> +		kiocb_flags |= IOCB_HIPRI;
>>  	if (flags & RWF_DSYNC)
>> -		ki->ki_flags |= IOCB_DSYNC;
>> +		kiocb_flags |= IOCB_DSYNC;
>>  	if (flags & RWF_SYNC)
>> -		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
>> +		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
>>  	if (flags & RWF_APPEND)
>> -		ki->ki_flags |= IOCB_APPEND;
>> +		kiocb_flags |= IOCB_APPEND;
>> +
>> +	if (kiocb_flags)
>> +		ki->ki_flags |= kiocb_flags;
>>  	return 0;
>>  }
> 
> Might it generate even better code to do ...
Good idea, thanks! I'll resend
> 
>  	int kiocb_flags = 0;
>  
> +	if (!flags)
> +		return 0;
>  	if (unlikely(flags & ~RWF_SUPPORTED))
>  		return -EOPNOTSUPP;
>  
> ...
> 
> -	if (kiocb_flags)
> -		ki->ki_flags |= kiocb_flags;
> +	ki->ki_flags |= kiocb_flags;
> 
-- 
Pavel Begunkov
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply	[flat|nested] 11+ messages in thread
* [PATCH] fs: optimise kiocb_set_rw_flags()
@ 2020-08-01 10:36 Pavel Begunkov
  2020-08-01 10:42 ` Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-08-01 10:36 UTC (permalink / raw)
  To: Jens Axboe, Alexander Viro, linux-fsdevel, linux-kernel
Use a local var to collect flags in kiocb_set_rw_flags(). That spares
some memory writes and allows to replace most of the jumps with MOVEcc.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
v2: fast exit on flags == 0 (Matthew Wilcox)
 include/linux/fs.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8a00ba99284e..b7f1f1b7d691 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3282,22 +3282,28 @@ static inline int iocb_flags(struct file *file)
 
 static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 {
+	int kiocb_flags = 0;
+
+	if (!flags)
+		return 0;
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
 
 	if (flags & RWF_NOWAIT) {
 		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
 			return -EOPNOTSUPP;
-		ki->ki_flags |= IOCB_NOWAIT;
+		kiocb_flags |= IOCB_NOWAIT;
 	}
 	if (flags & RWF_HIPRI)
-		ki->ki_flags |= IOCB_HIPRI;
+		kiocb_flags |= IOCB_HIPRI;
 	if (flags & RWF_DSYNC)
-		ki->ki_flags |= IOCB_DSYNC;
+		kiocb_flags |= IOCB_DSYNC;
 	if (flags & RWF_SYNC)
-		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
+		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
-		ki->ki_flags |= IOCB_APPEND;
+		kiocb_flags |= IOCB_APPEND;
+
+	ki->ki_flags |= kiocb_flags;
 	return 0;
 }
 
-- 
2.24.0
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-08-01 10:36 [PATCH] fs: optimise kiocb_set_rw_flags() Pavel Begunkov
@ 2020-08-01 10:42 ` Pavel Begunkov
  2020-08-01 15:37 ` Matthew Wilcox
  2020-08-01 17:02 ` Jens Axboe
  2 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-08-01 10:42 UTC (permalink / raw)
  To: Jens Axboe, Alexander Viro, linux-fsdevel, linux-kernel
On 01/08/2020 13:36, Pavel Begunkov wrote:
> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
> some memory writes and allows to replace most of the jumps with MOVEcc.
This one is pretty old, but looks like nobody in fsdevel cares.
Jens, any chance you can pick this up? For instancec, I don't like the
binary it generates for io_uring.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
> 
> v2: fast exit on flags == 0 (Matthew Wilcox)
> 
>  include/linux/fs.h | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8a00ba99284e..b7f1f1b7d691 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3282,22 +3282,28 @@ static inline int iocb_flags(struct file *file)
>  
>  static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>  {
> +	int kiocb_flags = 0;
> +
> +	if (!flags)
> +		return 0;
>  	if (unlikely(flags & ~RWF_SUPPORTED))
>  		return -EOPNOTSUPP;
>  
>  	if (flags & RWF_NOWAIT) {
>  		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
>  			return -EOPNOTSUPP;
> -		ki->ki_flags |= IOCB_NOWAIT;
> +		kiocb_flags |= IOCB_NOWAIT;
>  	}
>  	if (flags & RWF_HIPRI)
> -		ki->ki_flags |= IOCB_HIPRI;
> +		kiocb_flags |= IOCB_HIPRI;
>  	if (flags & RWF_DSYNC)
> -		ki->ki_flags |= IOCB_DSYNC;
> +		kiocb_flags |= IOCB_DSYNC;
>  	if (flags & RWF_SYNC)
> -		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> +		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
>  	if (flags & RWF_APPEND)
> -		ki->ki_flags |= IOCB_APPEND;
> +		kiocb_flags |= IOCB_APPEND;
> +
> +	ki->ki_flags |= kiocb_flags;
>  	return 0;
>  }
>  
> 
-- 
Pavel Begunkov
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-08-01 10:36 [PATCH] fs: optimise kiocb_set_rw_flags() Pavel Begunkov
  2020-08-01 10:42 ` Pavel Begunkov
@ 2020-08-01 15:37 ` Matthew Wilcox
  2020-08-01 17:01   ` Jens Axboe
  2020-08-02  8:41   ` Pavel Begunkov
  2020-08-01 17:02 ` Jens Axboe
  2 siblings, 2 replies; 11+ messages in thread
From: Matthew Wilcox @ 2020-08-01 15:37 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, Alexander Viro, linux-fsdevel, linux-kernel
On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
> some memory writes and allows to replace most of the jumps with MOVEcc.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
If you want to improve the codegen here further, I would suggest that
renumbering the IOCB flags to match the RWF flags would lead to better
codegen (can't do it the other way around; RWF flags are userspace ABI,
IOCB flags are not).  iocb_flags() probably doesn't get any worse because
the IOCB_ flags don't have the same numbers as the O_ bits (which differ
by arch anyway).
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-08-01 15:37 ` Matthew Wilcox
@ 2020-08-01 17:01   ` Jens Axboe
  2020-08-02  8:33     ` Pavel Begunkov
  2020-08-02  8:41   ` Pavel Begunkov
  1 sibling, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2020-08-01 17:01 UTC (permalink / raw)
  To: Matthew Wilcox, Pavel Begunkov
  Cc: Alexander Viro, linux-fsdevel, linux-kernel
On 8/1/20 9:37 AM, Matthew Wilcox wrote:
> On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>> some memory writes and allows to replace most of the jumps with MOVEcc.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> 
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> If you want to improve the codegen here further, I would suggest that
> renumbering the IOCB flags to match the RWF flags would lead to better
> codegen (can't do it the other way around; RWF flags are userspace ABI,
> IOCB flags are not).  iocb_flags() probably doesn't get any worse because
> the IOCB_ flags don't have the same numbers as the O_ bits (which differ
> by arch anyway).
Yeah that's not a bad idea, would kill a lot of branches.
-- 
Jens Axboe
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-08-01 10:36 [PATCH] fs: optimise kiocb_set_rw_flags() Pavel Begunkov
  2020-08-01 10:42 ` Pavel Begunkov
  2020-08-01 15:37 ` Matthew Wilcox
@ 2020-08-01 17:02 ` Jens Axboe
  2020-08-02  8:21   ` Pavel Begunkov
  2 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2020-08-01 17:02 UTC (permalink / raw)
  To: Pavel Begunkov, Alexander Viro, linux-fsdevel, linux-kernel
On 8/1/20 4:36 AM, Pavel Begunkov wrote:
> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
> some memory writes and allows to replace most of the jumps with MOVEcc.
I've picked this one up.
-- 
Jens Axboe
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-08-01 17:02 ` Jens Axboe
@ 2020-08-02  8:21   ` Pavel Begunkov
  0 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-08-02  8:21 UTC (permalink / raw)
  To: Jens Axboe, Alexander Viro, linux-fsdevel, linux-kernel
On 01/08/2020 20:02, Jens Axboe wrote:
> On 8/1/20 4:36 AM, Pavel Begunkov wrote:
>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>> some memory writes and allows to replace most of the jumps with MOVEcc.
> 
> I've picked this one up.
Thanks
-- 
Pavel Begunkov
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-08-01 17:01   ` Jens Axboe
@ 2020-08-02  8:33     ` Pavel Begunkov
  0 siblings, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-08-02  8:33 UTC (permalink / raw)
  To: Jens Axboe, Matthew Wilcox; +Cc: Alexander Viro, linux-fsdevel, linux-kernel
On 01/08/2020 20:01, Jens Axboe wrote:
> On 8/1/20 9:37 AM, Matthew Wilcox wrote:
>> On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
>>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>>> some memory writes and allows to replace most of the jumps with MOVEcc.
>>>
>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>
>> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>>
>> If you want to improve the codegen here further, I would suggest that
>> renumbering the IOCB flags to match the RWF flags would lead to better
>> codegen (can't do it the other way around; RWF flags are userspace ABI,
>> IOCB flags are not).  iocb_flags() probably doesn't get any worse because
>> the IOCB_ flags don't have the same numbers as the O_ bits (which differ
>> by arch anyway).
> 
> Yeah that's not a bad idea, would kill a lot of branches.
Is that common here to do so? I've done this for io_uring flags a while
ago, but left RWF alone at the time, reluctant to check for possible
complications (e.g. bit magic).
-- 
Pavel Begunkov
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] fs: optimise kiocb_set_rw_flags()
  2020-08-01 15:37 ` Matthew Wilcox
  2020-08-01 17:01   ` Jens Axboe
@ 2020-08-02  8:41   ` Pavel Begunkov
  1 sibling, 0 replies; 11+ messages in thread
From: Pavel Begunkov @ 2020-08-02  8:41 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Jens Axboe, Alexander Viro, linux-fsdevel, linux-kernel
On 01/08/2020 18:37, Matthew Wilcox wrote:
> On Sat, Aug 01, 2020 at 01:36:33PM +0300, Pavel Begunkov wrote:
>> Use a local var to collect flags in kiocb_set_rw_flags(). That spares
>> some memory writes and allows to replace most of the jumps with MOVEcc.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> 
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Thanks for reviewing it
> 
> If you want to improve the codegen here further, I would suggest that
> renumbering the IOCB flags to match the RWF flags would lead to better
> codegen (can't do it the other way around; RWF flags are userspace ABI,
> IOCB flags are not).  iocb_flags() probably doesn't get any worse because
> the IOCB_ flags don't have the same numbers as the O_ bits (which differ
> by arch anyway).
> 
-- 
Pavel Begunkov
^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2020-08-02  8:43 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-08-01 10:36 [PATCH] fs: optimise kiocb_set_rw_flags() Pavel Begunkov
2020-08-01 10:42 ` Pavel Begunkov
2020-08-01 15:37 ` Matthew Wilcox
2020-08-01 17:01   ` Jens Axboe
2020-08-02  8:33     ` Pavel Begunkov
2020-08-02  8:41   ` Pavel Begunkov
2020-08-01 17:02 ` Jens Axboe
2020-08-02  8:21   ` Pavel Begunkov
  -- strict thread matches above, loose matches on Subject: below --
2020-01-17  1:16 Pavel Begunkov
2020-01-17  1:21 ` Matthew Wilcox
2020-01-17  1:23   ` Pavel Begunkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).