From: Baoquan He <bhe@redhat.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 6/8] mm/vmalloc: Defer freeing partly initialized vm_struct
Date: Tue, 19 Aug 2025 16:56:25 +0800
Message-ID: <aKQ8OY04a0ACqZ2O@MiWiFi-R3L-srv>
In-Reply-To: <aKMkgbZqOqyGVF1C@pc636>

On 08/18/25 at 03:02pm, Uladzislau Rezki wrote:
> On Mon, Aug 18, 2025 at 12:21:15PM +0800, Baoquan He wrote:
> > On 08/07/25 at 09:58am, Uladzislau Rezki (Sony) wrote:
> > > __vmalloc_area_node() may call free_vmap_area() or vfree() on
> > > error paths, both of which can sleep. This becomes problematic
> > > if the function is invoked from an atomic context, such as when
> > > GFP_ATOMIC or GFP_NOWAIT is passed via gfp_mask.
> > > 
> > > To fix this, unify error paths and defer the cleanup of partly
> > > initialized vm_struct objects to a workqueue. This ensures that
> > > freeing happens in a process context and avoids invalid sleeps
> > > in atomic regions.
> > > 
> > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > ---
> > >  include/linux/vmalloc.h |  6 +++++-
> > >  mm/vmalloc.c            | 34 +++++++++++++++++++++++++++++++---
> > >  2 files changed, 36 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> > > index fdc9aeb74a44..b1425fae8cbf 100644
> > > --- a/include/linux/vmalloc.h
> > > +++ b/include/linux/vmalloc.h
> > > @@ -50,7 +50,11 @@ struct iov_iter;		/* in uio.h */
> > >  #endif
> > >  
> > >  struct vm_struct {
> > > -	struct vm_struct	*next;
> > > +	union {
> > > +		struct vm_struct *next;	  /* Early registration of vm_areas. */
> > > +		struct llist_node llnode; /* Asynchronous freeing on error paths. */
> > > +	};
> > > +
> > >  	void			*addr;
> > >  	unsigned long		size;
> > >  	unsigned long		flags;
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 7f48a54ec108..2424f80d524a 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -3680,6 +3680,35 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> > >  	return nr_allocated;
> > >  }
> > >  
> > > +static LLIST_HEAD(pending_vm_area_cleanup);
> > > +static void cleanup_vm_area_work(struct work_struct *work)
> > > +{
> > > +	struct vm_struct *area, *tmp;
> > > +	struct llist_node *head;
> > > +
> > > +	head = llist_del_all(&pending_vm_area_cleanup);
> > > +	if (!head)
> > > +		return;
> > > +
> > > +	llist_for_each_entry_safe(area, tmp, head, llnode) {
> > > +		if (!area->pages)
> > > +			free_vm_area(area);
> > > +		else
> > > +			vfree(area->addr);
> > > +	}
> > > +}
> > > +
> > > +/*
> > > + * Helper for __vmalloc_area_node() to defer cleanup
> > > + * of partially initialized vm_struct in error paths.
> > > + */
> > > +static DECLARE_WORK(cleanup_vm_area, cleanup_vm_area_work);
> > > +static void defer_vm_area_cleanup(struct vm_struct *area)
> > > +{
> > > +	if (llist_add(&area->llnode, &pending_vm_area_cleanup))
> > > +		schedule_work(&cleanup_vm_area);
> > > +}
> > 
> > Wondering why we need to call schedule_work() here when
> > pending_vm_area_cleanup was empty before adding the new entry. Shouldn't
> > it be as below to schedule the job? Not sure if I'm missing anything.
> > 
> > 	if (!llist_add(&area->llnode, &pending_vm_area_cleanup))
> > 		schedule_work(&cleanup_vm_area);
> > 
> > =====
> > /**
> >  * llist_add - add a new entry
> >  * @new:        new entry to be added
> >  * @head:       the head for your lock-less list
> >  *
> >  * Returns true if the list was empty prior to adding this entry.
> >  */
> > static inline bool llist_add(struct llist_node *new, struct llist_head *head)
> > {
> >         return llist_add_batch(new, new, head);
> > }
> > =====
> > 
> But then you will not schedule. If the list is empty and we add one element,
> llist_add() returns 1, but your condition expects 0.
> 
> How it works:
> 
> If someone keeps adding to the llist and it is not empty, we should not
> trigger a new work, because the current work is in flight (it will cover
> newcomers), i.e. it has been scheduled but has not yet completed
> llist_del_all() on the head.
> 
> Once it is done, a newcomer will trigger the work again only if it sees NULL,
> i.e. when the list is empty.

Fair enough. I thought it was about deferring the work; in fact it aims to
run the error handling in a workqueue rather than in the current atomic
context. Thanks for the explanation.
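
For the record, the "schedule only when the list was empty" pattern
discussed above can be modelled in userspace. This is only a rough sketch
under stated assumptions: llist_add() and llist_del_all() here are
simplified re-implementations on C11 atomics, not the kernel code, and
nr_scheduled, defer_cleanup(), run_cleanup() and demo() are hypothetical
stand-ins for schedule_work(), defer_vm_area_cleanup() and
cleanup_vm_area_work().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel llist types (assumption). */
struct llist_node { struct llist_node *next; };
struct llist_head { _Atomic(struct llist_node *) first; };

/* Push one node; return true if the list was empty before the push. */
static bool llist_add(struct llist_node *new, struct llist_head *head)
{
	struct llist_node *first = atomic_load(&head->first);

	do {
		new->next = first;
	} while (!atomic_compare_exchange_weak(&head->first, &first, new));

	return first == NULL;
}

/* Atomically detach the whole list, leaving the head empty. */
static struct llist_node *llist_del_all(struct llist_head *head)
{
	return atomic_exchange(&head->first, (struct llist_node *)NULL);
}

static struct llist_head pending;
static int nr_scheduled;	/* hypothetical: counts "schedule_work()" calls */

/* Only the adder that finds the list empty kicks the worker. */
static void defer_cleanup(struct llist_node *node)
{
	if (llist_add(node, &pending))
		nr_scheduled++;
}

/* Stand-in for the work handler: drain everything, return entries seen. */
static int run_cleanup(void)
{
	struct llist_node *pos;
	int n = 0;

	for (pos = llist_del_all(&pending); pos; pos = pos->next)
		n++;
	return n;
}

/* Three adds before the worker runs, then one more after it has drained. */
static int demo(void)
{
	static struct llist_node a, b, c, d;
	int drained;

	nr_scheduled = 0;
	defer_cleanup(&a);	/* list was empty: schedules */
	defer_cleanup(&b);	/* work already pending: no schedule */
	defer_cleanup(&c);	/* same */
	assert(nr_scheduled == 1);

	drained = run_cleanup();	/* one run covers all three */
	assert(drained == 3);

	defer_cleanup(&d);	/* list empty again: schedules again */
	drained = run_cleanup();
	assert(drained == 1);
	(void)drained;

	return nr_scheduled;
}
```

In this model demo() returns 2: one schedule per "empty to non-empty"
transition, regardless of how many entries pile up in between. With the
inverted test, !llist_add(), a lone entry added to an empty list would not
schedule anything and would sit there until a later add arrives.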


