Re: [PATCH 11/11] mm/page_alloc: Embed per_cpu_pages locking within the per-cpu structure

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Linux-MM <linux-mm@kvack.org>,
	Linux-RT-Users <linux-rt-users@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>, Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH 11/11] mm/page_alloc: Embed per_cpu_pages locking within the per-cpu structure
Date: Thu, 15 Apr 2021 16:29:58 +0100	[thread overview]
Message-ID: <20210415152958.GL3697@techsingularity.net> (raw)
In-Reply-To: <de5d1dbb-fb56-9660-fadb-6318047305d4@suse.cz>

On Thu, Apr 15, 2021 at 04:53:46PM +0200, Vlastimil Babka wrote:
> On 4/14/21 3:39 PM, Mel Gorman wrote:
> > struct per_cpu_pages is protected by the pagesets lock but it can be
> > embedded within struct per_cpu_pages at a minor cost. This is possible
> > because per-cpu lookups are based on offsets. Paraphrasing an explanation
> > from Peter Ziljstra
> > 
> >   The whole thing relies on:
> > 
> >     &per_cpu_ptr(msblk->stream, cpu)->lock == per_cpu_ptr(&msblk->stream->lock, cpu)
> > 
> >   Which is true because the lhs:
> > 
> >     (local_lock_t *)((zone->per_cpu_pages + per_cpu_offset(cpu)) + offsetof(struct per_cpu_pages, lock))
> > 
> >   and the rhs:
> > 
> >     (local_lock_t *)((zone->per_cpu_pages + offsetof(struct per_cpu_pages, lock)) + per_cpu_offset(cpu))
> > 
> >   are identical, because addition is associative.
> > 
> > More details are included in mmzone.h. This embedding is not completely
> > free for three reasons.
> > 
> > 1. As local_lock does not return a per-cpu structure, the PCP has to
> >    be looked up twice -- first to acquire the lock and again to get the
> >    PCP pointer.
> > 
> > 2. For PREEMPT_RT and CONFIG_DEBUG_LOCK_ALLOC, local_lock is potentially
> >    a spinlock or has lock-specific tracking. In both cases, it becomes
> >    necessary to release/acquire different locks when freeing a list of
> >    pages in free_unref_page_list.
> 
> Looks like this pattern could benefit from a local_lock API helper that would do
> the right thing? It probably couldn't optimize much the CONFIG_PREEMPT_RT case
> which would need to be unlock/lock in any case, but CONFIG_DEBUG_LOCK_ALLOC
> could perhaps just keep the IRQ's disabled and just note the change of what's
> acquired?
> 

A helper could potentially be used but right now, there is only one
call-site that needs this type of care so it may be overkill. A helper
was proposed that can lookup and lock a per-cpu structure which is
generally useful but does not suit the case where different locks need
to be acquired.

> > 3. For most kernel configurations, local_lock_t is empty and no storage is
> >    required. By embedding the lock, the memory consumption on PREEMPT_RT
> >    and CONFIG_DEBUG_LOCK_ALLOC is higher.
> 
> But I wonder, is there really a benefit to this increased complexity? Before the
> patch we had "pagesets" - a local_lock that protects all zones' pcplists. Now
> each zone's pcplists have own local_lock. On !PREEMPT_RT we will never take the
> locks of multiple zones from the same CPU in parallel, because we use
> local_lock_irqsave(). Can that parallelism happen on PREEMPT_RT, because that
> could perhaps justify the change?
> 

I don't think PREEMPT_RT gets additional parallelism because it's still
a per-cpu structure that is being protected. The difference is whether
we are protecting the CPU-N index for all per_cpu_pages or just one.
The patch exists because it was asked why the lock was not embedded within
the structure it's protecting. I initially thought that was unsafe and
I was wrong as explained in the changelog. But now that I find it *can*
be done but it's a bit ugly so I put it at the end of the series so it
can be dropped if necessary.

-- 
Mel Gorman
SUSE Labs

     prev parent reply	other threads:[~2021-04-15 15:30 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-14 13:39 [PATCH 0/11 v3] Use local_lock for pcp protection and reduce stat overhead Mel Gorman
2021-04-14 13:39 ` [PATCH 01/11] mm/page_alloc: Split per cpu page lists and zone stats Mel Gorman
2021-04-14 13:39 ` [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock Mel Gorman
2021-04-14 13:39 ` [PATCH 03/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters Mel Gorman
2021-04-14 13:39 ` [PATCH 04/11] mm/vmstat: Inline NUMA event counter updates Mel Gorman
2021-04-14 16:20   ` Vlastimil Babka
2021-04-14 16:26     ` Vlastimil Babka
2021-04-15  9:34       ` Mel Gorman
2021-04-14 13:39 ` [PATCH 05/11] mm/page_alloc: Batch the accounting updates in the bulk allocator Mel Gorman
2021-04-14 16:31   ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 06/11] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters Mel Gorman
2021-04-14 17:10   ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 07/11] mm/page_alloc: Remove duplicate checks if migratetype should be isolated Mel Gorman
2021-04-14 17:21   ` Vlastimil Babka
2021-04-15  9:33     ` Mel Gorman
2021-04-15 11:24       ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 08/11] mm/page_alloc: Explicitly acquire the zone lock in __free_pages_ok Mel Gorman
2021-04-15 10:24   ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 09/11] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock Mel Gorman
2021-04-15 12:25   ` Vlastimil Babka
2021-04-15 14:11     ` Mel Gorman
2021-04-14 13:39 ` [PATCH 10/11] mm/page_alloc: Update PGFREE outside the zone lock in __free_pages_ok Mel Gorman
2021-04-15 13:04   ` Vlastimil Babka
2021-04-14 13:39 ` [PATCH 11/11] mm/page_alloc: Embed per_cpu_pages locking within the per-cpu structure Mel Gorman
2021-04-15 14:53   ` Vlastimil Babka
2021-04-15 15:29     ` Mel Gorman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210415152958.GL3697@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=brouer@redhat.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.