linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-mm@kvack.org
Subject: Re: [PATCH 6/6 v2] mm: Drain LRUs upon resume to userspace on nohz_full CPUs
Date: Fri, 4 Apr 2025 15:14:50 +0200	[thread overview]
Message-ID: <Z-_bSlit_cqAhz7f@localhost.localdomain> (raw)
In-Reply-To: <Z6s3u6Uu8I7r2Jox@tiehlicka>

Le Tue, Feb 11, 2025 at 12:42:51PM +0100, Michal Hocko a écrit :
> On Sun 09-02-25 23:30:04, Frederic Weisbecker wrote:
> > LRUs can be drained through several ways. One of them may add disturbances
> > to isolated workloads while queuing a work at any time to any target,
> > whether running in nohz_full mode or not.
> > 
> > Prevent from that on isolated tasks with defering LRUs drains upon
> > resuming to userspace using the isolated task work framework.
> 
> I have to say this is rather cryptic description of the udnerlying
> problem. What do you think about the following:
> 
> LRU batching can be source of disturbances for isolated workloads
> running in the userspace because it requires kernel worker to handle
> that and that would preempt the said task. The primary source for such
> disruption would be __lru_add_drain_all which could be triggered from
> non-isolated CPUs.
> 
> Why would an isolated CPU have anything on the pcp cache? Many syscalls
> allocate pages that might end there. A typical and unavoidable one would
> be fork/exec leaving pages on the cache behind just waiting for somebody
> to drain.
> 
> This patch addresses the problem by noting a patch has been added to the
> cache and schedule draining to the return path to the userspace so the
> work is done while the syscall is still executing and there are no
> suprises while the task runs in the userspace where it doesn't want to
> be preempted.

Much better indeed :-)

> > @@ -376,6 +377,8 @@ static void __lru_cache_activate_folio(struct folio *folio)
> >  	}
> >  
> >  	local_unlock(&cpu_fbatches.lock);
> > +
> > +	isolated_task_work_queue();
> >  }
> 
> This placement doens't make much sense to me. I would put
> isolated_task_work_queue when we queue something up. That would be
> folio_batch_add if folio_batch_space(fbatch) > 0.

Ok I moved it there but why doesn't it make sense also when
folio_batch_space(fbatch) == 0, doesn't it mean that folio_batch_add()
still added the entry but there is just no further room left?

> 
> >  
> >  #ifdef CONFIG_LRU_GEN
> > @@ -738,7 +741,7 @@ void lru_add_drain(void)
> >   * the same cpu. It shouldn't be a problem in !SMP case since
> >   * the core is only one and the locks will disable preemption.
> >   */
> > -static void lru_add_and_bh_lrus_drain(void)
> > +void lru_add_and_bh_lrus_drain(void)
> >  {
> >  	local_lock(&cpu_fbatches.lock);
> >  	lru_add_drain_cpu(smp_processor_id());
> > @@ -769,6 +772,9 @@ static bool cpu_needs_drain(unsigned int cpu)
> >  {
> >  	struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
> >  
> > +	if (!housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE))
> > +		return false;
> > +
> 
> Would it make more sense to use cpu_is_isolated() and use it explicitly
> in __lru_add_drain_all so that it is clearly visible - with a comment
> that isolated workloads are dealing with cache on their return to
> userspace.

No because cpu_is_isolated() is also true when the CPU is on NULL domain
(isolated cpusset partition, or isolcpus=domain). And not all workloads
want the possible overhead of scattered batching after every syscalls.

Thanks.

> 
> >  	/* Check these in order of likelihood that they're not zero */
> >  	return folio_batch_count(&fbatches->lru_add) ||
> >  		folio_batch_count(&fbatches->lru_move_tail) ||
> > -- 
> > 2.46.0
> 
> -- 
> Michal Hocko
> SUSE Labs


      reply	other threads:[~2025-04-04 13:14 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-09 22:29 [PATCH 0/6 v2] mm: LRU drain flush on nohz_full Frederic Weisbecker
2025-02-09 22:29 ` [PATCH 1/6 v2] task_work: Provide means to check if a work is queued Frederic Weisbecker
2025-02-10 12:43   ` Oleg Nesterov
2025-03-25 14:25     ` Frederic Weisbecker
2025-02-27 16:25   ` Valentin Schneider
2025-02-09 22:30 ` [PATCH 2/6 v2] sched/fair: Use task_work_queued() on numa_work Frederic Weisbecker
2025-02-10 12:47   ` Oleg Nesterov
2025-02-27 16:25   ` Valentin Schneider
2025-02-09 22:30 ` [PATCH 3/6 v2] sched: Use task_work_queued() on cid_work Frederic Weisbecker
2025-02-10 12:49   ` Oleg Nesterov
2025-02-09 22:30 ` [PATCH 4/6 v2] tick/nohz: Move nohz_full related fields out of hot task struct's places Frederic Weisbecker
2025-02-09 22:30 ` [PATCH 5/6 v2] sched/isolation: Introduce isolated task work Frederic Weisbecker
2025-02-09 22:30 ` [PATCH 6/6 v2] mm: Drain LRUs upon resume to userspace on nohz_full CPUs Frederic Weisbecker
2025-02-10 10:50   ` Hillf Danton
2025-02-10 11:19     ` Michal Hocko
2025-02-10 11:46     ` Frederic Weisbecker
2025-02-11 11:31       ` Hillf Danton
2025-02-11 11:42   ` Michal Hocko
2025-04-04 13:14     ` Frederic Weisbecker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z-_bSlit_cqAhz7f@localhost.localdomain \
    --to=frederic@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).