All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stephen.s.brennan@oracle.com, willy@infradead.org,
	akpm@linux-foundation.org, hch@infradead.org
Subject: Re: [PATCH v1 2/7] mm/vmalloc.c: add flags to mark vm_map_ram area
Date: Wed, 7 Dec 2022 16:03:41 +0800	[thread overview]
Message-ID: <Y5BI3Sp8QCyweXwt@MiWiFi-R3L-srv> (raw)
In-Reply-To: <Y43qfdseyq0zizJO@pc636>

On 12/05/22 at 01:56pm, Uladzislau Rezki wrote:
> > Through vmalloc API, a virtual kernel area is reserved for physical
> > address mapping. And vmap_area is used to track them, while vm_struct
> > is allocated to associate with the vmap_area to store more information
> > and passed out.
> > 
> > However, area reserved via vm_map_ram() is an exception. It doesn't have
> > vm_struct to associate with vmap_area. And we can't recognize the
> > vmap_area with '->vm == NULL' as a vm_map_ram() area because the normal
> > freeing path will set va->vm = NULL before unmapping, please see
> > function remove_vm_area().
> > 
> > Meanwhile, there are two types of vm_map_ram area. One is the whole
> > vmap_area being reserved and mapped at one time; the other is the
> > whole vmap_area with VMAP_BLOCK_SIZE size being reserved, while mapped
> > into split regions with smaller size several times via vb_alloc().
> > 
> > To mark the area reserved through vm_map_ram(), add flags field into
> > struct vmap_area. Bit 0 indicates whether it's a vm_map_ram area,
> > while bit 1 indicates whether it's a vmap_block type of vm_map_ram
> > area.
> > 
> > This is a preparatoin for later use.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  include/linux/vmalloc.h |  1 +
> >  mm/vmalloc.c            | 18 +++++++++++++++++-
> >  2 files changed, 18 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> > index 096d48aa3437..69250efa03d1 100644
> > --- a/include/linux/vmalloc.h
> > +++ b/include/linux/vmalloc.h
> > @@ -76,6 +76,7 @@ struct vmap_area {
> >  		unsigned long subtree_max_size; /* in "free" tree */
> >  		struct vm_struct *vm;           /* in "busy" tree */
> >  	};
> > +	unsigned long flags; /* mark type of vm_map_ram area */
> >  };
> >  
> >  /* archs that select HAVE_ARCH_HUGE_VMAP should override one or more of these */
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 5d3fd3e6fe09..d6f376060d83 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1815,6 +1815,7 @@ static void free_vmap_area_noflush(struct vmap_area *va)
> >  
> >  	spin_lock(&vmap_area_lock);
> >  	unlink_va(va, &vmap_area_root);
> > +	va->flags = 0;
> >  	spin_unlock(&vmap_area_lock);
> >  
> This is not a good place to set flags to zero. It looks to me like
> corner and kind of specific.

Thanks for reviewing.

Here, I thought to clear VMAP_RAM|VMAP_BLOCK on vmap->flags when free
the vmap_block. I didn't find a good place to do the clearing. When we
call free_vmap_block(), we either come from purge_fragmented_blocks(),
or from vb_free(). In vb_free(), it will call free_vmap_block() when
the whole vmap_block is dirty. In purge_fragmented_blocks(), it will
try to purge all vmap_block which only has dirty or free regions.
For both of above functions, they will call free_vmap_block() when
there's no being used region in the vmap_block.

  purge_fragmented_blocks()
  vb_free()
    -->free_vmap_block()

So seems we don't need to clear the VMAP_RAM|VMAP_BLOCK on vmap->flags
because there's no mapping existed in the vmap_block. The consequent
free_vmap_block() will remove the relevant vmap_area from vmap_area_list
and vmap_area_root tree.

So I plan to remove code change in this place.
> 
> 
> >  	nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
> > @@ -1887,6 +1888,10 @@ struct vmap_area *find_vmap_area(unsigned long addr)
> >  
> >  #define VMAP_BLOCK_SIZE		(VMAP_BBMAP_BITS * PAGE_SIZE)
> >  
> > +#define VMAP_RAM		0x1
> > +#define VMAP_BLOCK		0x2
> > +#define VMAP_FLAGS_MASK		0x3
> > +
> >  struct vmap_block_queue {
> >  	spinlock_t lock;
> >  	struct list_head free;
> > @@ -1967,6 +1972,9 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
> >  		kfree(vb);
> >  		return ERR_CAST(va);
> >  	}
> > +	spin_lock(&vmap_area_lock);
> > +	va->flags = VMAP_RAM|VMAP_BLOCK;
> > +	spin_unlock(&vmap_area_lock);
> >
> The per-cpu code was created as a fast per-cpu allocator because of high
> vmalloc lock contention. If possible we should avoid of locking of the
> vmap_area_lock. Because it has a high contention.

Fair enough. I made below draft patch to address the concern. By
adding argument va_flags to alloc_vmap_area(), we can pass the 
vm_map_ram flags into alloc_vmap_area and filled into vmap_area->flags.
With this, we don't need add extra action to acquire vmap_area_root lock
and do the flags setting. Is it OK to you?

From 115f6080b339d0cf9dd20c5f6c0d3121f6b22274 Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Wed, 7 Dec 2022 11:08:14 +0800
Subject: [PATCH] mm/vmalloc: change alloc_vmap_area() to pass in va_flags

With this change, we can pass and set vmap_area->flags for vm_map_ram area
in alloc_vmap_area(). Then no extra action need be added to acquire
vmap_area_lock when doing the vmap_area->flags setting.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ccaa461998f3..d74eddec352f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1586,7 +1586,9 @@ preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
 static struct vmap_area *alloc_vmap_area(unsigned long size,
 				unsigned long align,
 				unsigned long vstart, unsigned long vend,
-				int node, gfp_t gfp_mask)
+				int node, gfp_t gfp_mask,
+				unsigned long va_flags)
+)
 {
 	struct vmap_area *va;
 	unsigned long freed;
@@ -1630,6 +1632,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	va->va_start = addr;
 	va->va_end = addr + size;
 	va->vm = NULL;
+	va->flags = va_flags;
 
 	spin_lock(&vmap_area_lock);
 	insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
@@ -1961,7 +1964,8 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 
 	va = alloc_vmap_area(VMAP_BLOCK_SIZE, VMAP_BLOCK_SIZE,
 					VMALLOC_START, VMALLOC_END,
-					node, gfp_mask);
+					node, gfp_mask,
+					VMAP_RAM|VMAP_BLOCK);
 	if (IS_ERR(va)) {
 		kfree(vb);
 		return ERR_CAST(va);
@@ -2258,7 +2262,8 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node)
 	} else {
 		struct vmap_area *va;
 		va = alloc_vmap_area(size, PAGE_SIZE,
-				VMALLOC_START, VMALLOC_END, node, GFP_KERNEL);
+				VMALLOC_START, VMALLOC_END,
+				node, GFP_KERNEL, VMAP_RAM|VMAP_BLOCK);
 		if (IS_ERR(va))
 			return NULL;
 
@@ -2498,7 +2503,7 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
 	if (!(flags & VM_NO_GUARD))
 		size += PAGE_SIZE;
 
-	va = alloc_vmap_area(size, align, start, end, node, gfp_mask);
+	va = alloc_vmap_area(size, align, start, end, node, gfp_mask, 0);
 	if (IS_ERR(va)) {
 		kfree(area);
 		return NULL;
-- 
2.34.1



  reply	other threads:[~2022-12-07  8:03 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-04  1:30 [PATCH v1 0/7] mm/vmalloc.c: allow vread() to read out vm_map_ram areas Baoquan He
2022-12-04  1:30 ` [PATCH v1 1/7] mm/vmalloc.c: add used_map into vmap_block to track space of vmap_block Baoquan He
2022-12-04  1:30 ` [PATCH v1 2/7] mm/vmalloc.c: add flags to mark vm_map_ram area Baoquan He
2022-12-05 12:56   ` Uladzislau Rezki
2022-12-07  8:03     ` Baoquan He [this message]
2022-12-08 19:52       ` Uladzislau Rezki
2022-12-09  8:27         ` Baoquan He
2022-12-04  1:30 ` [PATCH v1 3/7] mm/vmalloc.c: allow vread() to read out vm_map_ram areas Baoquan He
2022-12-04  3:47   ` kernel test robot
2022-12-17  1:14     ` Baoquan He
2022-12-04  1:30 ` [PATCH v1 4/7] mm/vmalloc: explicitly identify vm_map_ram area when shown in /proc/vmcoreinfo Baoquan He
2022-12-04  1:30 ` [PATCH v1 5/7] mm/vmalloc: skip the uninitilized vmalloc areas Baoquan He
2022-12-04  1:30 ` [PATCH v1 6/7] powerpc: mm: add VM_IOREMAP flag to the vmalloc area Baoquan He
2022-12-04  1:30   ` Baoquan He
2022-12-04  1:30 ` [PATCH v1 7/7] sh: mm: set " Baoquan He

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y5BI3Sp8QCyweXwt@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=stephen.s.brennan@oracle.com \
    --cc=urezki@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.