From: Jerome Glisse <jglisse@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
David Hildenbrand <david@redhat.com>,
Hugh Dickins <hughd@google.com>, Maya Gokhale <gokhale2@llnl.gov>,
Pavel Emelyanov <xemul@virtuozzo.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Martin Cracauer <cracauer@cons.org>, Shaohua Li <shli@fb.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Denis Plotnikov <dplotnikov@virtuozzo.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
Marty McFadden <mcfadden8@llnl.gov>, Mel Gorman <mgorman@suse.de>,
"Kirill A . Shutemov" <kirill@shutemov.name>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH v3 04/28] mm: allow VM_FAULT_RETRY for multiple times
Date: Thu, 18 Apr 2019 16:11:08 -0400 [thread overview]
Message-ID: <20190418201108.GJ3288@redhat.com> (raw)
In-Reply-To: <20190320020642.4000-5-peterx@redhat.com>
On Wed, Mar 20, 2019 at 10:06:18AM +0800, Peter Xu wrote:
> The idea comes from a discussion between Linus and Andrea [1].
>
> Before this patch we only allow a page fault to retry once. We
> achieved this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
> handle_mm_fault() the second time. This was majorly used to avoid
> unexpected starvation of the system by looping over forever to handle
> the page fault on a single page. However that should hardly happen,
> and after all for each code path to return a VM_FAULT_RETRY we'll
> first wait for a condition (during which time we should possibly yield
> the cpu) to happen before VM_FAULT_RETRY is really returned.
>
> This patch removes the restriction by keeping the
> FAULT_FLAG_ALLOW_RETRY flag when we receive VM_FAULT_RETRY. It means
> that the page fault handler now can retry the page fault for multiple
> times if necessary without the need to generate another page fault
> event. Meanwhile we still keep the FAULT_FLAG_TRIED flag so page
> fault handler can still identify whether a page fault is the first
> attempt or not.
>
> Then we'll have these combinations of fault flags (only considering
> ALLOW_RETRY flag and TRIED flag):
>
> - ALLOW_RETRY and !TRIED: this means the page fault allows to
> retry, and this is the first try
>
> - ALLOW_RETRY and TRIED: this means the page fault allows to
> retry, and this is not the first try
>
> - !ALLOW_RETRY and !TRIED: this means the page fault does not allow
> to retry at all
>
> - !ALLOW_RETRY and TRIED: this is forbidden and should never be used
>
> In existing code we have multiple places that has taken special care
> of the first condition above by checking against (fault_flags &
> FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to
> detect the first retry of a page fault by checking against
> both (fault_flags & FAULT_FLAG_ALLOW_RETRY) and !(fault_flag &
> FAULT_FLAG_TRIED) because now even the 2nd try will have the
> ALLOW_RETRY set, then use that helper in all existing special paths.
> One example is in __lock_page_or_retry(), now we'll drop the mmap_sem
> only in the first attempt of page fault and we'll keep it in follow up
> retries, so old locking behavior will be retained.
>
> This will be a nice enhancement for current code [2] at the same time
> a supporting material for the future userfaultfd-writeprotect work,
> since in that work there will always be an explicit userfault
> writeprotect retry for protected pages, and if that cannot resolve the
> page fault (e.g., when userfaultfd-writeprotect is used in conjunction
> with swapped pages) then we'll possibly need a 3rd retry of the page
> fault. It might also benefit other potential users who will have
> similar requirement like userfault write-protection.
>
> GUP code is not touched yet and will be covered in follow up patch.
>
> Please read the thread below for more information.
>
> [1] https://lkml.org/lkml/2017/11/2/833
> [2] https://lkml.org/lkml/2018/12/30/64
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
A minor comment suggestion below but it can be fix in a followup patch.
[...]
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 80bb6408fe73..f73dbc4a1957 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -336,16 +336,52 @@ extern unsigned int kobjsize(const void *objp);
> */
> extern pgprot_t protection_map[16];
>
> +/*
> + * About FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED: we can specify whether we
> + * would allow page faults to retry by specifying these two fault flags
> + * correctly. Currently there can be three legal combinations:
> + *
> + * (a) ALLOW_RETRY and !TRIED: this means the page fault allows retry, and
> + * this is the first try
> + *
> + * (b) ALLOW_RETRY and TRIED: this means the page fault allows retry, and
> + * we've already tried at least once
> + *
> + * (c) !ALLOW_RETRY and !TRIED: this means the page fault does not allow retry
> + *
> + * The unlisted combination (!ALLOW_RETRY && TRIED) is illegal and should never
> + * be used. Note that page faults can be allowed to retry for multiple times,
> + * in which case we'll have an initial fault with flags (a) then later on
> + * continuous faults with flags (b). We should always try to detect pending
> + * signals before a retry to make sure the continuous page faults can still be
> + * interrupted if necessary.
> + */
> +
> #define FAULT_FLAG_WRITE 0x01 /* Fault was a write access */
> #define FAULT_FLAG_MKWRITE 0x02 /* Fault was mkwrite of existing pte */
> #define FAULT_FLAG_ALLOW_RETRY 0x04 /* Retry fault if blocking */
> #define FAULT_FLAG_RETRY_NOWAIT 0x08 /* Don't drop mmap_sem and wait when retrying */
> #define FAULT_FLAG_KILLABLE 0x10 /* The fault task is in SIGKILL killable region */
> -#define FAULT_FLAG_TRIED 0x20 /* Second try */
> +#define FAULT_FLAG_TRIED 0x20 /* We've tried once */
> #define FAULT_FLAG_USER 0x40 /* The fault originated in userspace */
> #define FAULT_FLAG_REMOTE 0x80 /* faulting for non current tsk/mm */
> #define FAULT_FLAG_INSTRUCTION 0x100 /* The fault was during an instruction fetch */
>
> +/*
> + * Returns true if the page fault allows retry and this is the first
> + * attempt of the fault handling; false otherwise. This is mostly
> + * used for places where we want to try to avoid taking the mmap_sem
> + * for too long a time when waiting for another condition to change,
> + * in which case we can try to be polite to release the mmap_sem in
> + * the first round to avoid potential starvation of other processes
> + * that would also want the mmap_sem.
> + */
You should be using kernel function documentation style above.
> +static inline bool fault_flag_allow_retry_first(unsigned int flags)
> +{
> + return (flags & FAULT_FLAG_ALLOW_RETRY) &&
> + (!(flags & FAULT_FLAG_TRIED));
> +}
> +
next prev parent reply other threads:[~2019-04-18 20:11 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-20 2:06 [PATCH v3 00/28] userfaultfd: write protection support Peter Xu
2019-03-20 2:06 ` [PATCH v3 01/28] mm: gup: rename "nonblocking" to "locked" where proper Peter Xu
2019-03-20 2:06 ` [PATCH v3 02/28] mm: userfault: return VM_FAULT_RETRY on signals Peter Xu
2019-03-20 2:06 ` [PATCH v3 03/28] userfaultfd: don't retake mmap_sem to emulate NOPAGE Peter Xu
2019-03-20 2:06 ` [PATCH v3 04/28] mm: allow VM_FAULT_RETRY for multiple times Peter Xu
2019-04-18 20:11 ` Jerome Glisse [this message]
2019-04-19 6:00 ` Peter Xu
2019-03-20 2:06 ` [PATCH v3 05/28] mm: gup: " Peter Xu
2019-03-20 2:06 ` [PATCH v3 06/28] userfaultfd: wp: add helper for writeprotect check Peter Xu
2019-03-20 2:06 ` [PATCH v3 07/28] userfaultfd: wp: hook userfault handler to write protection fault Peter Xu
2019-04-18 20:03 ` Jerome Glisse
2019-03-20 2:06 ` [PATCH v3 08/28] userfaultfd: wp: add WP pagetable tracking to x86 Peter Xu
2019-03-20 2:06 ` [PATCH v3 09/28] userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers Peter Xu
2019-03-20 2:06 ` [PATCH v3 10/28] userfaultfd: wp: add UFFDIO_COPY_MODE_WP Peter Xu
2019-03-20 2:06 ` [PATCH v3 11/28] mm: merge parameters for change_protection() Peter Xu
2019-03-20 2:06 ` [PATCH v3 12/28] userfaultfd: wp: apply _PAGE_UFFD_WP bit Peter Xu
2019-03-20 2:06 ` [PATCH v3 13/28] mm: export wp_page_copy() Peter Xu
2019-03-20 2:06 ` [PATCH v3 14/28] userfaultfd: wp: handle COW properly for uffd-wp Peter Xu
2019-04-18 20:51 ` Jerome Glisse
2019-04-19 6:26 ` Peter Xu
2019-04-19 15:02 ` Jerome Glisse
2019-04-22 12:20 ` Peter Xu
2019-04-22 14:54 ` Jerome Glisse
2019-04-23 3:00 ` Peter Xu
2019-04-23 15:34 ` Jerome Glisse
2019-04-24 8:38 ` Peter Xu
2019-03-20 2:06 ` [PATCH v3 15/28] userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork Peter Xu
2019-03-20 2:06 ` [PATCH v3 16/28] userfaultfd: wp: add pmd_swp_*uffd_wp() helpers Peter Xu
2019-03-20 2:06 ` [PATCH v3 17/28] userfaultfd: wp: support swap and page migration Peter Xu
2019-04-18 20:59 ` Jerome Glisse
2019-04-19 7:42 ` Peter Xu
2019-04-19 15:08 ` Jerome Glisse
2019-04-22 12:23 ` Peter Xu
2019-03-20 2:06 ` [PATCH v3 18/28] khugepaged: skip collapse if uffd-wp detected Peter Xu
2019-03-20 2:06 ` [PATCH v3 19/28] userfaultfd: introduce helper vma_find_uffd Peter Xu
2019-03-20 2:06 ` [PATCH v3 20/28] userfaultfd: wp: support write protection for userfault vma range Peter Xu
2019-03-20 2:06 ` [PATCH v3 21/28] userfaultfd: wp: add the writeprotect API to userfaultfd ioctl Peter Xu
2019-03-20 2:06 ` [PATCH v3 22/28] userfaultfd: wp: enabled write protection in userfaultfd API Peter Xu
2019-03-22 21:37 ` Mike Rapoport
2019-03-20 2:06 ` [PATCH v3 23/28] userfaultfd: wp: don't wake up when doing write protect Peter Xu
2019-03-20 2:06 ` [PATCH v3 24/28] userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update Peter Xu
2019-03-22 21:46 ` Mike Rapoport
2019-03-20 2:06 ` [PATCH v3 25/28] userfaultfd: wp: fixup swap entries in change_pte_range Peter Xu
2019-04-18 21:01 ` Jerome Glisse
2019-03-20 2:06 ` [PATCH v3 26/28] userfaultfd: wp: declare _UFFDIO_WRITEPROTECT conditionally Peter Xu
2019-03-22 21:43 ` Mike Rapoport
2019-03-20 2:06 ` [PATCH v3 27/28] userfaultfd: selftests: refactor statistics Peter Xu
2019-03-20 2:06 ` [PATCH v3 28/28] userfaultfd: selftests: add write-protect test Peter Xu
2019-04-09 6:08 ` [PATCH v3 00/28] userfaultfd: write protection support Peter Xu
2019-04-18 21:07 ` Jerome Glisse
2019-04-19 7:53 ` Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190418201108.GJ3288@redhat.com \
--to=jglisse@redhat.com \
--cc=aarcange@redhat.com \
--cc=cracauer@cons.org \
--cc=david@redhat.com \
--cc=dgilbert@redhat.com \
--cc=dplotnikov@virtuozzo.com \
--cc=gokhale2@llnl.gov \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mcfadden8@llnl.gov \
--cc=mgorman@suse.de \
--cc=mike.kravetz@oracle.com \
--cc=peterx@redhat.com \
--cc=rppt@linux.vnet.ibm.com \
--cc=shli@fb.com \
--cc=xemul@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).