linux-api.vger.kernel.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-api@vger.kernel.org, hughd@google.com,
	joel@joelfernandes.org, jglisse@redhat.com,
	yang.shi@linux.alibaba.com, mgorman@techsingularity.net
Subject: Re: [RFC PATCH] mm,mremap: Bail out earlier in mremap_to under map pressure
Date: Mon, 25 Feb 2019 15:16:01 +0300
Message-ID: <20190225121601.k4g7cabebeemthae@kshutemo-mobl1>
In-Reply-To: <cfc53e5a-a403-a732-69d2-1f96b8416f6d@suse.cz>

On Mon, Feb 25, 2019 at 12:46:46PM +0100, Vlastimil Babka wrote:
> On 2/22/19 2:01 PM, Kirill A. Shutemov wrote:
> > On Thu, Feb 21, 2019 at 09:54:06AM +0100, Oscar Salvador wrote:
> > >> When the mremap() syscall is called with the MREMAP_FIXED flag,
> > >> it calls mremap_to(), which does the following:
> >>
> >> 1) unmaps the destination region where we are going to move the map
> >> 2) If the new region is going to be smaller, we unmap the last part
> >>    of the old region
> >>
> >> Then, we will eventually call move_vma() to do the actual move.
> >>
> >> move_vma() checks whether we are at least 4 maps below max_map_count
> >> before going further, otherwise it bails out with -ENOMEM.
> > >> The problem is that we might have already unmapped the vma's in steps
> > >> 1) and 2), so userspace cannot figure out the state of the vma's
> > >> after it gets -ENOMEM, and it becomes tricky for userspace to clean
> > >> up properly on the error path.
> >>
> > >> While it is true that we can return -ENOMEM for more reasons
> > >> (e.g: see may_expand_vm() or move_page_tables()), I think that we can
> > >> avoid this particular scenario if we check early in mremap_to() whether
> > >> the operation is likely to succeed map-wise.
> >>
> > >> Should that not be the case, we can bail out before we even try to unmap
> > >> anything, so we make sure the vma's are left untouched in case we are likely
> > >> to be short of maps.
> >>
> > >> The rule of thumb now is to assume the worst-case scenario.
> > >> That is when both vma's (old region and new region) are going to be split
> > >> in 3, so we end up with two more maps than we already hold (one for each
> > >> vma). If current map count + 2 maps still leaves us at least 4 maps below
> > >> the threshold, we are going to pass the check in move_vma().
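
The worst-case check described above can be sketched as a small
userspace model. This is only an illustration of the arithmetic, not
the code from the patch: the function names move_vma_check() and
mremap_to_early_check() and the plain int parameters are assumptions
made for this sketch, standing in for the mm's map count and the
max_map_count limit.

```c
#include <errno.h>

/*
 * Hypothetical model of the existing check in move_vma(): the move is
 * refused unless the map count is at least 4 below the limit.
 */
static int move_vma_check(int map_count, int max_map_count)
{
	if (map_count > max_map_count - 4)
		return -ENOMEM;
	return 0;
}

/*
 * Hypothetical model of the proposed early check in mremap_to():
 * assume the worst case, where the old and the new region each add
 * one map, and apply the move_vma() rule to map_count + 2 before
 * anything has been unmapped.
 */
static int mremap_to_early_check(int map_count, int max_map_count)
{
	return move_vma_check(map_count + 2, max_map_count);
}
```

With an illustrative limit of 100, a map count of 95 passes the
move_vma() model but is rejected by the early check. That is the kind
of false positive mentioned in the description: the early check cannot
know whether the unmaps would have released enough vma's.
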
> >>
> > >> Of course, this is not free, as it can generate false positives: we may
> > >> indeed be tight map-wise, yet the unmap operation could release several
> > >> vma's and leave us in a good state.
> >>
> > >> Because of that I am sending this as an RFC.
> >> Another approach was also investigated [1], but it may be too much hassle
> >> for what it brings.
> > 
> > I believe we don't need the check in move_vma() with this patch. Or do we?
> 
> move_vma() can also be called directly from SYSCALL_DEFINE5(mremap) for
> the non-MREMAP_FIXED case. So unless there's further refactoring, the
> check is still needed.

Okay, makes sense.

> >>
> >> [1] https://lore.kernel.org/lkml/20190219155320.tkfkwvqk53tfdojt@d104.suse.de/
> >>
> >> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

Thread overview: 4+ messages
2019-02-21  8:54 [RFC PATCH] mm,mremap: Bail out earlier in mremap_to under map pressure Oscar Salvador
2019-02-22 13:01 ` Kirill A. Shutemov
2019-02-25 11:46   ` Vlastimil Babka
2019-02-25 12:16     ` Kirill A. Shutemov [this message]
