linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Arnd Bergmann <arnd@arndb.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	X86 ML <x86@kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <ak@linux.intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	linux-arch <linux-arch@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>
Subject: Re: [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR
Date: Tue, 3 Jan 2017 10:29:33 -0800	[thread overview]
Message-ID: <CALCETrUCdu3kTBU09gXaSppO7VCm+872zkGnovaZKTXBbY2wTg@mail.gmail.com> (raw)
In-Reply-To: <3492795.xaneWtGxgW@wuerfel>

On Tue, Jan 3, 2017 at 5:18 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Monday, January 2, 2017 10:08:28 PM CET Andy Lutomirski wrote:
>>
>> > This seems to nicely address the same problem on arm64, which has
>> > run into the same issue due to the various page table formats
>> > that can currently be chosen at compile time.
>>
>> On further reflection, I think this has very little to do with paging
>> formats except insofar as paging formats make us notice the problem.
>> The issue is that user code wants to be able to assume an upper limit
>> on an address, and it gets an upper limit right now that depends on
>> architecture due to paging formats.  But someone really might want to
>> write a *portable* 64-bit program that allocates memory with the high
>> 16 bits clear.  So let's add such a mechanism directly.
>>
>> As a thought experiment, what if x86_64 simply never allocated "high"
>> (above 2^47-1) addresses unless a new mmap-with-explicit-limit syscall
>> were used?  Old glibc would continue working.  Old VMs would work.
>> New programs that want to use ginormous mappings would have to use the
>> new syscall.  This would be totally stateless and would have no issues
>> with CRIU.
>
> I can see this working well for the 47-bit addressing default, but
> what about applications that actually rely on 39-bit addressing
> (I'd have to double-check, but I think this was the limit that
> people were most interested in for arm64)?
>
> 39 bits seems a little small to make that the default for everyone
> who doesn't pass the extra flag. Having to pass another flag to
> limit the addresses introduces other problems (e.g. mmap from
> library call that doesn't pass that flag).

That's a fair point.  Maybe my straw man isn't so good.

>
>> If necessary, we could also have a prctl that changes a
>> "personality-like" limit that is in effect when the old mmap was used.
>> I say "personality-like" because it would reset under exactly the same
>> conditions that personality resets itself.
>
> For "personality-like", it would still have to interact
> with the existing PER_LINUX32 and PER_LINUX32_3GB flags that
> do the exact same thing, so actually using personality might
> be better.
>
> We still have a few bits in the personality arguments, and
> we could combine them with the existing ADDR_LIMIT_3GB
> and ADDR_LIMIT_32BIT flags that are mutually exclusive by
> definition, such as
>
>         ADDR_LIMIT_32BIT =      0x0800000, /* existing */
>         ADDR_LIMIT_3GB   =      0x8000000, /* existing */
>         ADDR_LIMIT_39BIT =      0x0010000, /* next free bit */
>         ADDR_LIMIT_42BIT =      0x8010000,
>         ADDR_LIMIT_47BIT =      0x0810000,
>         ADDR_LIMIT_48BIT =      0x8810000,
>
> This would probably take only one or two personality bits for the
> limits that are interesting in practice.

Hmm.  What if we approached this a bit differently?  We could add a
single new personality bit ADDR_LIMIT_EXPLICIT.  Setting this bit
cause PER_LINUX32_3GB etc to be automatically cleared.  When
ADDR_LIMIT_EXPLICIT is in effect, prctl can set a 64-bit numeric
limit.  If ADDR_LIMIT_EXPLICIT is cleared, the prctl value stops being
settable and reading it via prctl returns whatever is implied by the
other personality bits.

--Andy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-01-03 18:29 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20161227015413.187403-1-kirill.shutemov@linux.intel.com>
2016-12-27  1:54 ` [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR Kirill A. Shutemov
2016-12-27  2:06   ` Andy Lutomirski
2016-12-27  2:24     ` Kirill A. Shutemov
2016-12-27  3:22       ` Andy Lutomirski
2017-01-02  9:09         ` Kirill A. Shutemov
2016-12-29  2:53       ` Carlos O'Donell
2016-12-31  2:08         ` Andy Lutomirski
2017-01-02  8:35           ` Kirill A. Shutemov
2017-01-13 20:11             ` H.J. Lu
2017-01-02  8:44   ` Arnd Bergmann
2017-01-03  6:08     ` Andy Lutomirski
2017-01-03 13:18       ` Arnd Bergmann
2017-01-03 18:29         ` Andy Lutomirski [this message]
2017-01-03 22:07           ` Arnd Bergmann
2017-01-03 22:09             ` Andy Lutomirski
2017-01-04 13:55               ` Arnd Bergmann
2017-01-03 16:04       ` Kirill A. Shutemov
2017-01-03 18:27         ` Andy Lutomirski
2017-01-04 14:19           ` Kirill A. Shutemov
2017-01-05 17:53             ` Andy Lutomirski
2017-01-05 19:13   ` Dave Hansen
2017-01-05 19:29     ` Kirill A. Shutemov
2017-01-05 19:39       ` Dave Hansen
2017-01-05 20:11         ` Kirill A. Shutemov
2017-01-05 20:14         ` Andy Lutomirski
2017-01-05 20:49           ` Dave Hansen
     [not found]             ` <978d5f1a-ec4d-f747-93fd-27ecfe10cb88-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-01-05 21:27               ` Andy Lutomirski
2017-01-05 23:17                 ` Dave Hansen
2017-01-11 14:29             ` Kirill A. Shutemov
2017-01-11 18:09               ` Andy Lutomirski
2017-01-11 18:37                 ` Kirill A. Shutemov
2017-01-11 18:49                   ` Dave Hansen
2017-01-11 19:20                     ` Andy Lutomirski
2017-01-11 19:31                       ` Linus Torvalds
2017-01-11 21:46                         ` Andi Kleen
2017-01-11 19:32                       ` Kirill A. Shutemov
2017-01-11 19:39                         ` Linus Torvalds
     [not found]               ` <20170111142904.GD4895-sVvlyX1904swdBt8bTSxpkEMvNT87kid@public.gmane.org>
2017-01-11 18:26                 ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrUCdu3kTBU09gXaSppO7VCm+872zkGnovaZKTXBbY2wTg@mail.gmail.com \
    --to=luto@amacapital.net \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@intel.com \
    --cc=hpa@zytor.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).