From: Ingo Molnar <mingo@kernel.org>
To: Guo Ren <guoren@kernel.org>
Cc: Waiman Long <longman@redhat.com>,
	peterz@infradead.org, linux-kernel@vger.kernel.org,
	Guo Ren <guoren@linux.alibaba.com>,
	Boqun Feng <boqun.feng@gmail.com>, Will Deacon <will@kernel.org>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH] locking/qspinlock: Optimize pending state waiting for unlock
Date: Wed, 4 Jan 2023 21:19:44 +0100
Message-ID: <Y7XfYPnQhLTcNZSh@gmail.com>
In-Reply-To: <CAJF2gTTkLY+mUoG0oqw0mmJH0hK5bXYvrmYcLL1-zwNbzOb9TQ@mail.gmail.com>


* Guo Ren <guoren@kernel.org> wrote:

> > >> The situation is an SMT scenario within a single core, not a case of
> > >> entering a low-power state. Of course, the granularity between cores is
> > >> "cacheline", but the granularity between SMT hw threads of the same
> > >> core could be "byte", which the internal LSU handles. For example, when a
> > >> hw-thread yields the resources of the core to other hw-threads, this
> > >> patch could help the hw-thread stay in the sleep state and prevent it
> > >> from being woken up by another hw-thread's xchg_tail.
> > >>
> > >> Finally, from the software semantic view, does the patch make it more
> > >> accurate? (We don't care about the tail here.)
> > >
> > > Thanks for the clarification.
> > >
> > > I am not arguing for the simplification part. I just want to clarify
> > > > my limited understanding of how the CPU hardware is actually dealing
> > > with these conditions.
> > >
> > > With that, I am fine with this patch. It would be nice if you can
> > > elaborate a bit more in your commit log.
> > >
> > > Acked-by: Waiman Long <longman@redhat.com>
> > >
> > BTW, have you actually observed any performance improvement with this patch?
> Not yet. I'm researching how the hardware could satisfy qspinlock
> better. Here are the three points I have concluded so far:
>  1. Atomic forward progress guarantee: Prevent unnecessary LL/SC
> retries, which may cause expensive bus transactions when crossing
> NUMA nodes.
>  2. Sub-word atomic primitive: Keep the locked, pending, and tail
> fields free from interfering with each other.
>  3. Load-cond primitive: Prevent the processor from wasting loop
> iterations polling for the condition.

As to this patch, please send a -v2 version that includes this
discussion & explanation in the changelog, as requested by Waiman.

Thanks,

	Ingo
