Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed

From: Ingo Molnar <mingo@kernel.org>
To: David Laight <David.Laight@aculab.com>
Cc: "'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
	"'peterz@infradead.org'" <peterz@infradead.org>,
	"'longman@redhat.com'" <longman@redhat.com>,
	"'mingo@redhat.com'" <mingo@redhat.com>,
	"'will@kernel.org'" <will@kernel.org>,
	"'boqun.feng@gmail.com'" <boqun.feng@gmail.com>,
	'Linus Torvalds' <torvalds@linux-foundation.org>,
	"'virtualization@lists.linux-foundation.org'"
	<virtualization@lists.linux-foundation.org>,
	'Zeng Heng' <zengheng4@huawei.com>
Subject: Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().
Date: Tue, 2 Jan 2024 10:54:26 +0100	[thread overview]
Message-ID: <ZZPdUv6uRvqXlyYo@gmail.com> (raw)
In-Reply-To: <7c1148fe64fb46a7a81c984776cd91df@AcuMS.aculab.com>


* David Laight <David.Laight@ACULAB.COM> wrote:

> per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
> This requires the cpu number be 64bit.
> However the value is osq_lock() comes from a 32bit xchg() and there
> isn't a way of telling gcc the high bits are zero (they are) so
> there will always be an instruction to clear the high bits.
> 
> The cpu number is also offset by one (to make the initialiser 0)
> It seems to be impossible to get gcc to convert __per_cpu_offset[cpu_p1 - 1]
> into (__per_cpu_offset - 1)[cpu_p1] (transferring the offset to the address).
> 
> Converting the cpu number to 32bit unsigned prior to the decrement means
> that gcc knows the decrement has set the high bits to zero and doesn't
> add a register-register move (or cltq) to zero/sign extend the value.
> 
> Not massive but saves two instructions.
> 
> Signed-off-by: David Laight <david.laight@aculab.com>
> ---
>  kernel/locking/osq_lock.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 35bb99e96697..37a4fa872989 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -29,11 +29,9 @@ static inline int encode_cpu(int cpu_nr)
>  	return cpu_nr + 1;
>  }
>  
> -static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
> +static inline struct optimistic_spin_node *decode_cpu(unsigned int encoded_cpu_val)
>  {
> -	int cpu_nr = encoded_cpu_val - 1;
> -
> -	return per_cpu_ptr(&osq_node, cpu_nr);
> +	return per_cpu_ptr(&osq_node, encoded_cpu_val - 1);

So why do we 'encode' the CPU number to begin with?

Why not use -1 as the special value? Checks for negative values 
generates similarly fast machine code compared to checking for 0, if 
the value is also used (which it is in most cases here). What am I 
missing? We seem to be going through a lot of unnecessary hoops, and 
some of that is in the runtime path.

Thanks,

	Ingo

next prev parent reply	other threads:[~2024-01-02  9:54 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-31 21:49 [PATCH next v2 0/5] locking/osq_lock: Optimisations to osq_lock code David Laight
2023-12-31 21:51 ` [PATCH next v2 1/5] locking/osq_lock: Defer clearing node->locked until the slow osq_lock() path David Laight
2024-01-01  4:08   ` Waiman Long
2023-12-31 21:52 ` [PATCH next v2 2/5] locking/osq_lock: Optimise the vcpu_is_preempted() check David Laight
2024-01-01  4:09   ` Waiman Long
2024-01-08  7:42   ` kernel test robot
2023-12-31 21:54 ` [PATCH next v2 3/5] locking/osq_lock: Use node->prev_cpu instead of saving node->prev David Laight
2024-01-01  4:09   ` Waiman Long
2023-12-31 21:54 ` [PATCH next v2 4/5] locking/osq_lock: Avoid writing to node->next in the osq_lock() fast path David Laight
2024-01-01  4:13   ` Waiman Long
2024-01-02  9:47   ` Ingo Molnar
2023-12-31 21:55 ` [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr() David Laight
2024-01-01  4:14   ` Waiman Long
2024-01-01  8:47     ` David Laight
2024-05-03 15:59     ` Waiman Long
2024-05-03 16:16       ` David Laight
2024-05-03 21:10       ` David Laight
2024-05-03 22:13         ` Waiman Long
2024-05-04 20:26           ` David Laight
2024-01-02  9:54   ` Ingo Molnar [this message]
2024-01-02 10:20     ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZZPdUv6uRvqXlyYo@gmail.com \
    --to=mingo@kernel.org \
    --cc=David.Laight@aculab.com \
    --cc=boqun.feng@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=will@kernel.org \
    --cc=zengheng4@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).