From: Baoquan He <bhe@redhat.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: "zhaoyang.huang" <zhaoyang.huang@unisoc.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	hailong liu <hailong.liu@oppo.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Zhaoyang Huang <huangzhaoyang@gmail.com>,
	steve.kang@unisoc.com
Subject: Re: [PATCHv3] mm: fix incorrect vbq reference in purge_fragmented_block
Date: Fri, 31 May 2024 18:03:00 +0800
Message-ID: <ZlmgVAZ6KABfpn8K@MiWiFi-R3L-srv>
In-Reply-To: <ZlmEp9nxKiG9gWFj@pc636>

On 05/31/24 at 10:04am, Uladzislau Rezki wrote:
> On Fri, May 31, 2024 at 11:05:20AM +0800, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> > 
> > The vmalloc area runs out on our ARM64 system during an erofs test as
> > vm_map_ram fails[1]. Following the debug log, we find that
> > vm_map_ram()->vb_alloc() allocates a new vb->va, which corresponds
> > to a 4MB vmalloc area, because list_for_each_entry_rcu returns immediately
> > when vbq->free->next points to vbq->free. That is to say, 65536 page
> > faults after the list is broken will exhaust the whole
> > vmalloc area. This should be introduced by one vbq->free->next pointing to
> > vbq->free, which leaves list_for_each_entry_rcu unable to iterate the list
> > and find the BUG.
> > 
> > [1]
> > PID: 1        TASK: ffffff80802b4e00  CPU: 6    COMMAND: "init"
> >  #0 [ffffffc08006afe0] __switch_to at ffffffc08111d5cc
> >  #1 [ffffffc08006b040] __schedule at ffffffc08111dde0
> >  #2 [ffffffc08006b0a0] schedule at ffffffc08111e294
> >  #3 [ffffffc08006b0d0] schedule_preempt_disabled at ffffffc08111e3f0
> >  #4 [ffffffc08006b140] __mutex_lock at ffffffc08112068c
> >  #5 [ffffffc08006b180] __mutex_lock_slowpath at ffffffc08111f8f8
> >  #6 [ffffffc08006b1a0] mutex_lock at ffffffc08111f834
> >  #7 [ffffffc08006b1d0] reclaim_and_purge_vmap_areas at ffffffc0803ebc3c
> >  #8 [ffffffc08006b290] alloc_vmap_area at ffffffc0803e83fc
> >  #9 [ffffffc08006b300] vm_map_ram at ffffffc0803e78c0
> > 
> > Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")
> > 
> > Suggested-by: Hailong.Liu <hailong.liu@oppo.com>
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >
> Is the problem related to running out of vmalloc space _only_, or is it a
> problem with a broken list? From the commit message it is hard to follow the reason.

The broken list caused the vmalloc space to run out. I think we should fix
the broken list.

I wonder whether the issue can always be reproduced or is only rarely seen.
We should try to make a patch that fixes the list breakage, unless that is
not feasible. I will have a look at this.
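
To make the breakage described in the quoted report concrete, here is a
minimal userspace sketch, not kernel code. It assumes a simplified
stand-in for struct list_head and a hypothetical helper,
simulate_broken_head(), neither of which comes from mm/vmalloc.c. It only
shows why a forward walk sees nothing once head->next points back at the
head, and why entries added at the tail afterwards stay unreachable too.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Tail add: the new node is linked between head->prev and head. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	struct list_head *prev = head->prev;

	node->next = head;
	node->prev = prev;
	prev->next = node;
	head->prev = node;
}

/* Forward walk starting at head->next, stopping when it reaches head again. */
static int visible_entries(const struct list_head *head)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/*
 * Hypothetical helper: build a two-entry list, break head->next the way the
 * report describes, add one more entry, and return how many entries a
 * forward walk can still see.
 */
static int simulate_broken_head(void)
{
	struct list_head free_list, a, b, c;

	init_list_head(&free_list);
	list_add_tail(&a, &free_list);
	list_add_tail(&b, &free_list);
	assert(visible_entries(&free_list) == 2);

	free_list.next = &free_list;	/* simulated corruption */
	list_add_tail(&c, &free_list);	/* linked via head->prev, still unreachable */

	return visible_entries(&free_list);
}
```

In this sketch simulate_broken_head() returns 0: the walk terminates at once,
so a caller looking for a free block would conclude there is none and
allocate a fresh one each time, which matches the exhaustion pattern in the
report. How head->next ends up pointing at the head in the real code is
exactly what still needs to be tracked down.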

> 
> Could you please post a full trace or panic?
> 
> > ---
> > v2: introduce cpu in vmap_block to record the right CPU number
> > v3: use get_cpu/put_cpu to prevent schedule between core
> > ---
> > ---
> >  mm/vmalloc.c | 12 ++++++++----
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 22aa63f4ef63..ecdb75d10949 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2458,6 +2458,7 @@ struct vmap_block {
> >  	struct list_head free_list;
> >  	struct rcu_head rcu_head;
> >  	struct list_head purge;
> > +	unsigned int cpu;
> >  };
> >  
> >  /* Queue of free and dirty vmap blocks, for allocation and flushing purposes */
> > @@ -2586,10 +2587,12 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
> >  		return ERR_PTR(err);
> >  	}
> >  
> > +	vb->cpu = get_cpu();
> >  	vbq = raw_cpu_ptr(&vmap_block_queue);
> >  	spin_lock(&vbq->lock);
> >  	list_add_tail_rcu(&vb->free_list, &vbq->free);
> >  	spin_unlock(&vbq->lock);
> > +	put_cpu();
> >  
> Why do you need get_cpu() here? Can you go with raw_smp_processor_id()
> and then access the per-cpu "vmap_block_queue"? get_cpu() disables
> preemption and then a spin-lock is taken within this critical section.
> At first glance, PREEMPT_RT is broken in this case.
> 
> I am on vacation, so responses may be delayed.
> 
> --
> Uladzislau Rezki
> 


