From: Leonardo Bras <leobras@redhat.com>
To: Waiman Long <longman@redhat.com>
Cc: Leonardo Bras <leobras@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH v1 1/4] Introducing qpw_lock() and per-cpu queue & flush work
Date: Wed, 11 Sep 2024 04:18:42 -0300
Message-ID: <ZuFEUk2jsRRWNG1I@LeoBras>
In-Reply-To: <a9fdcd85-633c-4e88-9e1f-db0b9d3b745c@redhat.com>

On Wed, Sep 04, 2024 at 08:08:12PM -0400, Waiman Long wrote:
> On 9/4/24 17:39, Waiman Long wrote:
> > On 6/21/24 23:58, Leonardo Bras wrote:
> > > > Some places in the kernel implement a parallel programming strategy
> > > > consisting of local_lock()s for most of the work, while some rare
> > > > remote operations are scheduled on the target cpu. This keeps cache
> > > > bouncing low, since cachelines tend to be mostly local, and avoids the
> > > > cost of locks in non-RT kernels, even though the very few remote
> > > > operations will be expensive due to scheduling overhead.
> > > 
> > > > On the other hand, for RT workloads this can represent a problem:
> > > > getting an important workload scheduled out to deal with some
> > > > unrelated task is sure to introduce unexpected deadline misses.
> > > 
> > > > It's interesting, though, that local_lock()s in RT kernels become
> > > > spinlock()s. We can make use of those to avoid scheduling work on a
> > > > remote cpu by directly updating another cpu's per_cpu structure, while
> > > > holding its spinlock().
> > > 
> > > > In order to do that, it's necessary to introduce a new set of
> > > > functions to make it possible to get another cpu's per-cpu "local"
> > > > lock (qpw_{un,}lock*) and also the corresponding
> > > > queue_percpu_work_on() and flush_percpu_work() helpers to run the
> > > > remote work.
> > > 
> > > > On non-RT kernels, no changes are expected, as every one of the
> > > > introduced helpers works exactly the same as the current
> > > > implementation:
> > > qpw_{un,}lock*()        ->  local_{un,}lock*() (ignores cpu parameter)
> > > queue_percpu_work_on()  ->  queue_work_on()
> > > flush_percpu_work()     ->  flush_work()
> > > 
> > > > For RT kernels, though, qpw_{un,}lock*() will use the extra cpu
> > > > parameter to select the correct per-cpu structure to work on, and
> > > > acquire the spinlock for that cpu.
> > > 
> > > > queue_percpu_work_on() will just call the requested function on the
> > > > current cpu, which will operate on another cpu's per-cpu object. Since
> > > > the local_lock()s become spinlock()s in PREEMPT_RT, we are safe doing
> > > > that.
> > > 
> > > flush_percpu_work() then becomes a no-op since no work is actually
> > > scheduled on a remote cpu.
> > > 
> > > > Some minimal code rework is needed in order to make this mechanism work:
> > > > the calls to local_{un,}lock*() in the functions that are currently
> > > > scheduled on remote cpus need to be replaced by qpw_{un,}lock_n*(), so
> > > > in RT kernels they can reference a different cpu. It's also necessary
> > > > to use a qpw_struct instead of a work_struct, but it just contains a
> > > > work struct and, in PREEMPT_RT, the target cpu.
> > > 
> > > > This should have almost no impact on non-RT kernels: a few
> > > > this_cpu_ptr() calls will become per_cpu_ptr(,smp_processor_id()).
> > > 
> > > On RT kernels, this should improve performance and reduce latency by
> > > removing scheduling noise.
> > > 
> > > Signed-off-by: Leonardo Bras <leobras@redhat.com>
> > > ---
> > >   include/linux/qpw.h | 88 +++++++++++++++++++++++++++++++++++++++++++++
> > >   1 file changed, 88 insertions(+)
> > >   create mode 100644 include/linux/qpw.h
> > > 
> > > diff --git a/include/linux/qpw.h b/include/linux/qpw.h
> > > new file mode 100644
> > > index 000000000000..ea2686a01e5e
> > > --- /dev/null
> > > +++ b/include/linux/qpw.h
> > > @@ -0,0 +1,88 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +#ifndef _LINUX_QPW_H
> > > +#define _LINUX_QPW_H
> 
> I would suggest adding a comment with a brief description of what
> qpw_lock/unlock() are for and their use cases. The "qpw" prefix itself isn't
> intuitive enough for a casual reader to understand what they are for.

Agreed. I am also open to discussing a more intuitive name for these.
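
As a rough illustration of the RT-side semantics described in the commit
message above (this is not the code from the patch), here is a small
userspace C model. Everything prefixed with model_ is hypothetical:
pthread mutexes stand in for the per-cpu spinlocks, MODEL_NR_CPUS is an
arbitrary constant, and model_drain() is just an example work function.

```c
/* Userspace model only: mimics the PREEMPT_RT behaviour described in
 * the commit message. None of these names exist in the kernel. */
#include <assert.h>
#include <pthread.h>

#define MODEL_NR_CPUS 4 /* arbitrary CPU count for the model */

/* Stand-in for a per-cpu structure protected by a "local" lock. */
struct model_pcp {
	pthread_mutex_t lock; /* models the spinlock behind local_lock on RT */
	long count;
};

static struct model_pcp model_pcp[MODEL_NR_CPUS] = {
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
};

/* Models qpw_struct: a work item plus, on RT, the target cpu. */
struct model_qpw {
	void (*func)(struct model_qpw *qpw);
	int cpu;
};

/* The extra cpu parameter selects whose per-cpu lock is taken. */
static void model_qpw_lock(int cpu)
{
	pthread_mutex_lock(&model_pcp[cpu].lock);
}

static void model_qpw_unlock(int cpu)
{
	pthread_mutex_unlock(&model_pcp[cpu].lock);
}

/* RT model: run the function in the caller instead of scheduling it on
 * the target cpu; the function then works on that cpu's data. */
static void model_queue_percpu_work_on(int cpu, struct model_qpw *qpw)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	qpw->cpu = cpu;
	qpw->func(qpw);
}

/* RT model: nothing was queued remotely, so there is nothing to wait for. */
static void model_flush_percpu_work(struct model_qpw *qpw)
{
	(void)qpw;
}

/* Example work function: resets the target cpu's counter under its lock. */
static void model_drain(struct model_qpw *qpw)
{
	model_qpw_lock(qpw->cpu);
	model_pcp[qpw->cpu].count = 0;
	model_qpw_unlock(qpw->cpu);
}
```

In this model a caller would set up a struct model_qpw with
func = model_drain, call model_queue_percpu_work_on(2, &w) and then
model_flush_percpu_work(&w); only the counter of "cpu" 2 is touched,
and the flush returns immediately. On non-RT kernels the commit message
says the helpers map to local_{un,}lock*(), queue_work_on() and
flush_work() instead, which this model does not cover.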

> 
> Cheers,
> Longman
> 

Thanks!
Leo

