All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: Linux-MM <linux-mm@kvack.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	David Hildenbrand <david@redhat.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Mina Almasry <almasrymina@google.com>,
	Peter Xu <peterx@redhat.com>, Rik van Riel <riel@surriel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Wei Chen <harperchen1110@gmail.com>,
	"# 5 . 10+" <stable@vger.kernel.org>
Subject: Re: [PATCH v6] hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
Date: Mon, 31 Oct 2022 18:48:37 -0700	[thread overview]
Message-ID: <Y2B69ViNMYHvPXYk@monkey> (raw)
In-Reply-To: <8D2D2F0F-9A53-42B5-8A9D-936E06E4A4E9@gmail.com>

On 10/31/22 17:54, Nadav Amit wrote:
> On Oct 31, 2022, at 3:34 PM, Mike Kravetz <mike.kravetz@oracle.com> wrote:
> 
> > madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear the page
> > tables associated with the address range.  For hugetlb vmas,
> > zap_page_range will call __unmap_hugepage_range_final.  However,
> > __unmap_hugepage_range_final assumes the passed vma is about to be removed
> > and deletes the vma_lock to prevent pmd sharing as the vma is on the way
> > out.  In the case of madvise(MADV_DONTNEED) the vma remains, but the
> > missing vma_lock prevents pmd sharing and could potentially lead to issues
> > with truncation/fault races.
> > 
> 
> [snip]
> 
> > index 978c17df053e..517c8cc8ccb9 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -3464,4 +3464,7 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> >  */
> > #define  ZAP_FLAG_DROP_MARKER        ((__force zap_flags_t) BIT(0))
> > 
> > +/* Set in unmap_vmas() to indicate an unmap call.  Only used by hugetlb */
> > +#define  ZAP_FLAG_UNMAP              ((__force zap_flags_t) BIT(1))
> 
> PeterZ wants to add ZAP_FLAG_FORCE_FLUSH that would be set on
> zap_pte_range(). Not sure you would want to combine them both together, but
> at least be aware of potential conflict.
> 
> https://lore.kernel.org/all/Y1f7YvKuwOl1XEwU@hirez.programming.kicks-ass.net/
> 

Ok, noted

> [snip]
> 
> > +#ifdef CONFIG_ADVISE_SYSCALLS
> > +/*
> > + * Similar setup as in zap_page_range().  madvise(MADV_DONTNEED) can not call
> > + * zap_page_range for hugetlb vmas as __unmap_hugepage_range_final will delete
> > + * the associated vma_lock.
> > + */
> > +void clear_hugetlb_page_range(struct vm_area_struct *vma, unsigned long start,
> > +				unsigned long end)
> > +{
> > +	struct mmu_notifier_range range;
> > +	struct mmu_gather tlb;
> > +
> > +	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
> > +				start, end);
> > +	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
> > +	tlb_gather_mmu(&tlb, vma->vm_mm);
> > +	update_hiwater_rss(vma->vm_mm);
> > +	mmu_notifier_invalidate_range_start(&range);
> > +
> > +	__unmap_hugepage_range_locking(&tlb, vma, start, end, NULL, 0);
> > +
> > +	mmu_notifier_invalidate_range_end(&range);
> > +	tlb_finish_mmu(&tlb);
> > }
> > +#endif
> 
> I hate ifdef’s. And the second definition of clear_hugetlb_page_range() is
> confusing since it does not have an ifdef at all. . How about moving the
> ifdef’s into the function like being done in io_madvise_prep()? I think it
> is less confusing.

Dang!!!  I sent the wrong version of the patch.  clear_hugetlb_page_range
is not even used here.  I used Peter's idea to eliminate the need for
this separate routine.  And, this was a development version.

Really sorry about that.

Andrew, if you can please drop the version of the patch.
-- 
Mike Kravetz

      reply	other threads:[~2022-11-01  1:51 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-31 22:34 [PATCH v6] hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing Mike Kravetz
2022-11-01  0:54 ` Nadav Amit
2022-11-01  1:48   ` Mike Kravetz [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2B69ViNMYHvPXYk@monkey \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=axelrasmussen@google.com \
    --cc=david@redhat.com \
    --cc=harperchen1110@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nadav.amit@gmail.com \
    --cc=naoya.horiguchi@linux.dev \
    --cc=peterx@redhat.com \
    --cc=riel@surriel.com \
    --cc=stable@vger.kernel.org \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.