Re: [PATCH v6] mm/vmalloc: lock contention optimization under multi-threading

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Uladzislau Rezki <urezki@gmail.com>
To: Baoquan He <bhe@redhat.com>, rulinhuang <rulin.huang@intel.com>
Cc: rulinhuang <rulin.huang@intel.com>,
	urezki@gmail.com, akpm@linux-foundation.org,
	colin.king@intel.com, hch@infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	lstoakes@gmail.com, tianyou.li@intel.com, tim.c.chen@intel.com,
	wangyang.guo@intel.com, zhiguo.zhou@intel.com
Subject: Re: [PATCH v6] mm/vmalloc: lock contention optimization under multi-threading
Date: Thu, 29 Feb 2024 11:33:24 +0100	[thread overview]
Message-ID: <ZeBddOuKQKHO0V6b@pc636> (raw)
In-Reply-To: <ZeBYcCAdHBCiDJkz@MiWiFi-R3L-srv>

On Thu, Feb 29, 2024 at 06:12:00PM +0800, Baoquan He wrote:
> Hi Rulin,
> 
> Thanks for the great work and v6, some concerns, please see inline
> comments.
> 
> On 02/29/24 at 12:26am, rulinhuang wrote:
> > When allocating a new memory area where the mapping address range is
> > known, it is observed that the vmap_node->busy.lock is acquired twice.
> > 
> > The first acquisition occurs in the alloc_vmap_area() function when
> > inserting the vm area into the vm mapping red-black tree. The second
> > acquisition occurs in the setup_vmalloc_vm() function when updating the
> > properties of the vm, such as flags and address, etc.
> > 
> > Combine these two operations together in alloc_vmap_area(), which
> > improves scalability when the vmap_node->busy.lock is contended.
> > By doing so, the need to acquire the lock twice can also be eliminated
> >  to once.
> > 
> > With the above change, tested on intel sapphire rapids
> > platform(224 vcpu), a 4% performance improvement is
> > gained on stress-ng/pthread(https://github.com/ColinIanKing/stress-ng),
> > which is the stress test of thread creations.
> > 
> > Reviewed-by: Uladzislau Rezki <urezki@gmail.com>
> > Reviewed-by: Baoquan He <bhe@redhat.com>
> > Reviewed-by: "Chen, Tim C" <tim.c.chen@intel.com>
> > Reviewed-by: "King, Colin" <colin.king@intel.com>
> 
> 
> We possibly need remove these reviewers' tags when new code change is
> taken so that people check and add Acked-by or Reviewed-by again if then
> agree, or add new comments if any concern.
> 
> > Signed-off-by: rulinhuang <rulin.huang@intel.com>
> > ---
> > V1 -> V2: Avoided the partial initialization issue of vm and
> > separated insert_vmap_area() from alloc_vmap_area()
> > V2 -> V3: Rebased on 6.8-rc5
> > V3 -> V4: Rebased on mm-unstable branch
> > V4 -> V5: cancel the split of alloc_vmap_area()
> > and keep insert_vmap_area()
> > V5 -> V6: add bug_on
> > ---
> >  mm/vmalloc.c | 132 +++++++++++++++++++++++++--------------------------
> >  1 file changed, 64 insertions(+), 68 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 25a8df497255..5ae028b0d58d 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1841,15 +1841,66 @@ node_alloc(unsigned long size, unsigned long align,
> >  	return va;
> >  }
> >  
> > +/*** Per cpu kva allocator ***/
> > +
> > +/*
> > + * vmap space is limited especially on 32 bit architectures. Ensure there is
> > + * room for at least 16 percpu vmap blocks per CPU.
> > + */
> > +/*
> > + * If we had a constant VMALLOC_START and VMALLOC_END, we'd like to be able
> > + * to #define VMALLOC_SPACE		(VMALLOC_END-VMALLOC_START). Guess
> > + * instead (we just need a rough idea)
> > + */
> > +#if BITS_PER_LONG == 32
> > +#define VMALLOC_SPACE		(128UL*1024*1024)
> > +#else
> > +#define VMALLOC_SPACE		(128UL*1024*1024*1024)
> > +#endif
> > +
> > +#define VMALLOC_PAGES		(VMALLOC_SPACE / PAGE_SIZE)
> > +#define VMAP_MAX_ALLOC		BITS_PER_LONG	/* 256K with 4K pages */
> > +#define VMAP_BBMAP_BITS_MAX	1024	/* 4MB with 4K pages */
> > +#define VMAP_BBMAP_BITS_MIN	(VMAP_MAX_ALLOC*2)
> > +#define VMAP_MIN(x, y)		((x) < (y) ? (x) : (y)) /* can't use min() */
> > +#define VMAP_MAX(x, y)		((x) > (y) ? (x) : (y)) /* can't use max() */
> > +#define VMAP_BBMAP_BITS		\
> > +		VMAP_MIN(VMAP_BBMAP_BITS_MAX,	\
> > +		VMAP_MAX(VMAP_BBMAP_BITS_MIN,	\
> > +			VMALLOC_PAGES / roundup_pow_of_two(NR_CPUS) / 16))
> > +
> > +#define VMAP_BLOCK_SIZE		(VMAP_BBMAP_BITS * PAGE_SIZE)
> > +
> > +/*
> > + * Purge threshold to prevent overeager purging of fragmented blocks for
> > + * regular operations: Purge if vb->free is less than 1/4 of the capacity.
> > + */
> > +#define VMAP_PURGE_THRESHOLD	(VMAP_BBMAP_BITS / 4)
> > +
> > +#define VMAP_RAM		0x1 /* indicates vm_map_ram area*/
> > +#define VMAP_BLOCK		0x2 /* mark out the vmap_block sub-type*/
> > +#define VMAP_FLAGS_MASK		0x3
> 
> These code moving is made because we need check VMAP_RAM in advance. We
> may need move all those data structures and basic helpers related to per
> cpu kva allocator up too to along with these macros, just as the newly
> introduced vmap_node does. If that's agreed, better be done in a
> separate patch. My personal opinion. Not sure if Uladzislau has
> different thoughts.
> 
> Other than this, the overall looks good to me.
> 
I agree, the split should be done. One is a preparation move saying that
no functional change happens and final one an actual change is.

--
Uladzislau Rezki

next prev parent reply	other threads:[~2024-02-29 10:33 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-29  8:26 [PATCH v6] mm/vmalloc: lock contention optimization under multi-threading rulinhuang
2024-02-29  8:31 ` Huang, Rulin
2024-02-29 12:07   ` Baoquan He
2024-02-29 10:12 ` Baoquan He
2024-02-29 10:33   ` Uladzislau Rezki [this message]
2024-03-01  9:14 ` King, Colin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZeBddOuKQKHO0V6b@pc636 \
    --to=urezki@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=colin.king@intel.com \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=rulin.huang@intel.com \
    --cc=tianyou.li@intel.com \
    --cc=tim.c.chen@intel.com \
    --cc=wangyang.guo@intel.com \
    --cc=zhiguo.zhou@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.