From: Baoquan He <bhe@redhat.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stephen.s.brennan@oracle.com, willy@infradead.org,
akpm@linux-foundation.org, hch@infradead.org
Subject: Re: [PATCH v2 2/7] mm/vmalloc.c: add flags to mark vm_map_ram area
Date: Fri, 23 Dec 2022 12:14:52 +0800 [thread overview]
Message-ID: <Y6UrPGMVYUMttKD3@MiWiFi-R3L-srv> (raw)
In-Reply-To: <Y6HpGayyQZH7U7Fd@pc636>
On 12/20/22 at 05:55pm, Uladzislau Rezki wrote:
> > Through vmalloc API, a virtual kernel area is reserved for physical
> > address mapping. And vmap_area is used to track them, while vm_struct
> > is allocated to associate with the vmap_area to store more information
> > and passed out.
> >
> > However, area reserved via vm_map_ram() is an exception. It doesn't have
> > vm_struct to associate with vmap_area. And we can't recognize the
> > vmap_area with '->vm == NULL' as a vm_map_ram() area because the normal
> > freeing path will set va->vm = NULL before unmapping, please see
> > function remove_vm_area().
> >
> A normal "free" path sets it to NULL in order to prevent a double-free
> of same VA. We can avoid of touching the va->vm if needed and do an unlink
> on entry in the remove_vm_area() when a lock is taken to find an area.
>
> Will it help you?
Sorry, this mail sneaked out of my sight until I notice it now. My mutt
client makes it look like in the thread I talked with Lorenzo.
Yes, as I replied to your v2 patch, that is very helpful, thanks.
>
> > Meanwhile, there are two types of vm_map_ram area. One is the whole
> > vmap_area being reserved and mapped at one time; the other is the
> > whole vmap_area with VMAP_BLOCK_SIZE size being reserved, while mapped
> > into split regions with smaller size several times via vb_alloc().
> >
> > To mark the area reserved through vm_map_ram(), add flags field into
> > struct vmap_area. Bit 0 indicates whether it's a vm_map_ram area,
> > while bit 1 indicates whether it's a vmap_block type of vm_map_ram
> > area.
> >
> > This is a preparatoin for later use.
> >
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> > include/linux/vmalloc.h | 1 +
> > mm/vmalloc.c | 22 +++++++++++++++++-----
> > 2 files changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> > index 096d48aa3437..69250efa03d1 100644
> > --- a/include/linux/vmalloc.h
> > +++ b/include/linux/vmalloc.h
> > @@ -76,6 +76,7 @@ struct vmap_area {
> > unsigned long subtree_max_size; /* in "free" tree */
> > struct vm_struct *vm; /* in "busy" tree */
> > };
> > + unsigned long flags; /* mark type of vm_map_ram area */
> > };
> >
> > /* archs that select HAVE_ARCH_HUGE_VMAP should override one or more of these */
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 5d3fd3e6fe09..190f29bbaaa7 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1586,7 +1586,8 @@ preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
> > static struct vmap_area *alloc_vmap_area(unsigned long size,
> > unsigned long align,
> > unsigned long vstart, unsigned long vend,
> > - int node, gfp_t gfp_mask)
> > + int node, gfp_t gfp_mask,
> > + unsigned long va_flags)
> > {
> > struct vmap_area *va;
> > unsigned long freed;
> > @@ -1630,6 +1631,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> > va->va_start = addr;
> > va->va_end = addr + size;
> > va->vm = NULL;
> > + va->flags = va_flags;
> >
> > spin_lock(&vmap_area_lock);
> > insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
> > @@ -1887,6 +1889,10 @@ struct vmap_area *find_vmap_area(unsigned long addr)
> >
> > #define VMAP_BLOCK_SIZE (VMAP_BBMAP_BITS * PAGE_SIZE)
> >
> > +#define VMAP_RAM 0x1
> > +#define VMAP_BLOCK 0x2
> > +#define VMAP_FLAGS_MASK 0x3
> >
> Maybe to rename a VMAP_BLOCK to something like VMAP_BLOCK_RESERVED or
> VMAP_PER_CPU_BLOCK?
Both VMAP_BLOCK or VMAP_PER_CPU_BLOCK look good to me, please see my
explanation at below.
>
> > struct vmap_block_queue {
> > spinlock_t lock;
> > struct list_head free;
> > @@ -1962,7 +1968,8 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
> >
> > va = alloc_vmap_area(VMAP_BLOCK_SIZE, VMAP_BLOCK_SIZE,
> > VMALLOC_START, VMALLOC_END,
> > - node, gfp_mask);
> > + node, gfp_mask,
> > + VMAP_RAM|VMAP_BLOCK);
> >
> A new_vmap_block() is for a per-cpu path. As far as i see the VMAP_BLOCK
> flag is used to mark a VA that corresponds to a reserved per-cpu free area.
>
> Whereas a VMAP_RAM is for VA that was obtained over per-cpu path but
> over alloc_vmap_area() thus a VA should be read out over "busy" tree
> directly.
>
> Why do you need to set here both VMAP_RAM and VMAP_BLOCK?
My understanding is that the vm_map_ram area has two types, one is
the vb percpu area via vb_alloc(), the other is allocated via
alloc_vmap_area(). While both of them is got from vm_map_ram()
interface, this is the main point that distinguishes the vm_map_ram area
than the normal vmalloc area, and this makes vm_map_ram area not owning
va->vm pointer. So here, I use flag VMAP_RAM to mark the vm_map_ram
area, including the two types; meanwhile, I add VMAP_BLOCK to mark out
the vb percpu area.
I understand people could have different view about them, e.g as you
said, use VMAP_RAM to mark the type of vm_map_ram area allocated through
alloc_vmap_area(), while use VMAP_PER_CPU_BLOCK to mark vb percpu area
from vb_alloc. In this way, we may need to rename VMAP_RAM to reflect
the area allocated from alloc_vmap_area() only. Both is fine to me.
>
> > if (IS_ERR(va)) {
> > kfree(vb);
> > return ERR_CAST(va);
> > @@ -2229,8 +2236,12 @@ void vm_unmap_ram(const void *mem, unsigned int count)
> > return;
> > }
> >
> > - va = find_vmap_area(addr);
> > + spin_lock(&vmap_area_lock);
> > + va = __find_vmap_area((unsigned long)addr, &vmap_area_root);
> > BUG_ON(!va);
> > + if (va)
> > + va->flags &= ~VMAP_RAM;
> > + spin_unlock(&vmap_area_lock);
> > debug_check_no_locks_freed((void *)va->va_start,
> >
> Agree with Lorenzo. BUG_ON() should be out of spinlock(). Furthermore
> i think it makes sense to go with WARN_ON_ONCE() and do not kill a system.
> Instead emit a warning and bailout.
>
> What do you think? Maybe separate patch for it?
Agree, your patch looks great to me. Thanks.
>
> > (va->va_end - va->va_start));
> > free_unmap_vmap_area(va);
> > @@ -2265,7 +2276,8 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node)
> > } else {
> > struct vmap_area *va;
> > va = alloc_vmap_area(size, PAGE_SIZE,
> > - VMALLOC_START, VMALLOC_END, node, GFP_KERNEL);
> > + VMALLOC_START, VMALLOC_END,
> > + node, GFP_KERNEL, VMAP_RAM);
> > if (IS_ERR(va))
> > return NULL;
> >
> > @@ -2505,7 +2517,7 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
> > if (!(flags & VM_NO_GUARD))
> > size += PAGE_SIZE;
> >
> > - va = alloc_vmap_area(size, align, start, end, node, gfp_mask);
> > + va = alloc_vmap_area(size, align, start, end, node, gfp_mask, 0);
> > if (IS_ERR(va)) {
> > kfree(area);
> > return NULL;
> >
> I know we have already discussed the new parameter. But what if we just
> use atomic_set operation to mark VA as either vmap-ram or vmap-block?
>
> As for alloc_vmap_area() we set it just as zero.
Sorry, I may not get your point clearly, could you be more specific?
next prev parent reply other threads:[~2022-12-23 4:15 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-17 1:54 [PATCH v2 0/7] mm/vmalloc.c: allow vread() to read out vm_map_ram areas Baoquan He
2022-12-17 1:54 ` [PATCH v2 1/7] mm/vmalloc.c: add used_map into vmap_block to track space of vmap_block Baoquan He
2022-12-17 1:54 ` [PATCH v2 2/7] mm/vmalloc.c: add flags to mark vm_map_ram area Baoquan He
2022-12-17 11:44 ` Lorenzo Stoakes
2022-12-19 8:01 ` Baoquan He
2022-12-19 9:09 ` Lorenzo Stoakes
2022-12-19 12:24 ` Baoquan He
2022-12-19 13:01 ` Lorenzo Stoakes
2022-12-20 12:14 ` Baoquan He
2022-12-20 12:42 ` Lorenzo Stoakes
2022-12-20 16:55 ` Uladzislau Rezki
2022-12-23 4:14 ` Baoquan He [this message]
2023-01-13 3:55 ` Baoquan He
2023-01-16 17:54 ` Uladzislau Rezki
2023-01-18 3:09 ` Baoquan He
2023-01-18 12:20 ` Uladzislau Rezki
2022-12-17 1:54 ` [PATCH v2 3/7] mm/vmalloc.c: allow vread() to read out vm_map_ram areas Baoquan He
2022-12-17 4:10 ` kernel test robot
2022-12-17 6:41 ` kernel test robot
2022-12-17 9:46 ` Baoquan He
2022-12-17 12:06 ` Lorenzo Stoakes
2023-01-04 8:01 ` Baoquan He
2023-01-04 20:20 ` Lorenzo Stoakes
2023-01-09 4:35 ` Baoquan He
2023-01-09 7:12 ` Lorenzo Stoakes
2023-01-09 12:49 ` Baoquan He
2022-12-17 1:54 ` [PATCH v2 4/7] mm/vmalloc: explicitly identify vm_map_ram area when shown in /proc/vmcoreinfo Baoquan He
2022-12-17 1:54 ` [PATCH v2 5/7] mm/vmalloc: skip the uninitilized vmalloc areas Baoquan He
2022-12-17 12:07 ` Lorenzo Stoakes
2022-12-19 7:16 ` Baoquan He
2022-12-17 1:54 ` [PATCH v2 6/7] powerpc: mm: add VM_IOREMAP flag to the vmalloc area Baoquan He
2022-12-17 1:54 ` Baoquan He
2022-12-17 1:54 ` [PATCH v2 7/7] sh: mm: set " Baoquan He
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y6UrPGMVYUMttKD3@MiWiFi-R3L-srv \
--to=bhe@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=stephen.s.brennan@oracle.com \
--cc=urezki@gmail.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.