Re: [PATCH v3 3/8] mm: drop per-VMA lock in handle_mm_fault if retrying or when finished

From: Peter Xu <peterx@redhat.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
	hannes@cmpxchg.org, mhocko@suse.com, josef@toxicpanda.com,
	jack@suse.cz, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
	michel@lespinasse.org, liam.howlett@oracle.com,
	jglisse@google.com, vbabka@suse.cz, minchan@google.com,
	dave@stgolabs.net, punit.agrawal@bytedance.com,
	lstoakes@gmail.com, hdanton@sina.com, apopple@nvidia.com,
	ying.huang@intel.com, david@redhat.com, yuzhao@google.com,
	dhowells@redhat.com, hughd@google.com, viro@zeniv.linux.org.uk,
	brauner@kernel.org, pasha.tatashin@soleen.com,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v3 3/8] mm: drop per-VMA lock in handle_mm_fault if retrying or when finished
Date: Tue, 27 Jun 2023 11:27:55 -0400
Message-ID: <ZJr/+83t9ndwHCd6@x1n>
In-Reply-To: <20230627042321.1763765-4-surenb@google.com>

On Mon, Jun 26, 2023 at 09:23:16PM -0700, Suren Baghdasaryan wrote:
> handle_mm_fault returning VM_FAULT_RETRY or VM_FAULT_COMPLETED means
> mmap_lock has been released. However with per-VMA locks behavior is
> different and the caller should still release it. To make the
> rules consistent for the caller, drop the per-VMA lock before returning
> from handle_mm_fault when page fault should be retried or is completed.
> 
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  arch/arm64/mm/fault.c   |  3 ++-
>  arch/powerpc/mm/fault.c |  3 ++-
>  arch/s390/mm/fault.c    |  3 ++-
>  arch/x86/mm/fault.c     |  3 ++-
>  mm/memory.c             | 12 +++++++++++-
>  5 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 6045a5117ac1..89f84e9ea1ff 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -601,7 +601,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  		goto lock_mmap;
>  	}
>  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
> -	vma_end_read(vma);
> +	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> +		vma_end_read(vma);
>  
>  	if (!(fault & VM_FAULT_RETRY)) {
>  		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 531177a4ee08..4697c5dca31c 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -494,7 +494,8 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>  	}
>  
>  	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> -	vma_end_read(vma);
> +	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> +		vma_end_read(vma);
>  
>  	if (!(fault & VM_FAULT_RETRY)) {
>  		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index b65144c392b0..cccefe41038b 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -418,7 +418,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>  		goto lock_mmap;
>  	}
>  	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> -	vma_end_read(vma);
> +	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> +		vma_end_read(vma);
>  	if (!(fault & VM_FAULT_RETRY)) {
>  		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>  		goto out;
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index e4399983c50c..d69c85c1c04e 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1347,7 +1347,8 @@ void do_user_addr_fault(struct pt_regs *regs,
>  		goto lock_mmap;
>  	}
>  	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> -	vma_end_read(vma);
> +	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> +		vma_end_read(vma);
>  
>  	if (!(fault & VM_FAULT_RETRY)) {
>  		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> diff --git a/mm/memory.c b/mm/memory.c
> index f69fbc251198..9011ad63c41b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5086,7 +5086,17 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>  		}
>  	}
>  
> -	return handle_pte_fault(&vmf);
> +	ret = handle_pte_fault(&vmf);
> +	if (ret & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)) {
> +		/*
> +		 * In case of VM_FAULT_RETRY or VM_FAULT_COMPLETED we might
> +		 * be still holding per-VMA lock to keep the vma stable as long
> +		 * as possible. Drop it before returning.
> +		 */
> +		if (vmf.flags & FAULT_FLAG_VMA_LOCK)
> +			vma_end_read(vma);
> +	}

This looks a bit hackish.  I think it would be cleaner to release the lock
right at the place where we return RETRY, and AFAIU swap is the only place
where the vma lock path returns RETRY in the current code base?

do_swap_page():
        if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
+               vma_end_read(vma);
                ret = VM_FAULT_RETRY;
                goto out;
        }

I.e., I don't think VM_FAULT_COMPLETED can even be returned from the vma
lock paths yet, since per-VMA locking doesn't support VM_SHARED yet.

-- 
Peter Xu
