From: Charlie Jenkins <charlie@rivosinc.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Richard Henderson <richard.henderson@linaro.org>,
Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
Matt Turner <mattst88@gmail.com>,
Vineet Gupta <vgupta@kernel.org>,
Russell King <linux@armlinux.org.uk>, guoren <guoren@kernel.org>,
Huacai Chen <chenhuacai@kernel.org>,
WANG Xuerui <kernel@xen0n.name>,
Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
"James E . J . Bottomley" <James.Bottomley@hansenpartnership.com>,
Helge Deller <deller@gmx.de>,
Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Naveen N Rao <naveen@kernel.org>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
Yoshinori Sato <ysato@users.sourceforge.jp>,
Rich Felker <dalias@libc.org>,
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
"David S . Miller" <davem@davemloft.net>,
Andreas Larsson <andreas@gaisler.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, shuah <shuah@kernel.org>,
Christoph Hellwig <hch@infradead.org>,
Michal Hocko <mhocko@suse.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Chris Torek <chris.torek@gmail.com>,
Linux-Arch <linux-arch@vger.kernel.org>,
linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
linux-snps-arc@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
"linux-csky@vger.kernel.org" <linux-csky@vger.kernel.org>,
loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
sparclinux@vger.kernel.org, linux-mm@kvack.org,
linux-kselftest@vger.kernel.org,
linux-abi-devel@lists.sourceforge.net
Subject: Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
Date: Tue, 10 Sep 2024 16:29:10 -0700
Message-ID: <ZuDWRq+9b1o864vY@ghost>
In-Reply-To: <89d21669-8daa-4225-b6d2-33d439ebd746@app.fastmail.com>
On Tue, Sep 10, 2024 at 09:13:33AM +0000, Arnd Bergmann wrote:
> On Mon, Sep 9, 2024, at 23:22, Charlie Jenkins wrote:
> > On Fri, Sep 06, 2024 at 10:52:34AM +0100, Lorenzo Stoakes wrote:
> >> On Fri, Sep 06, 2024 at 09:14:08AM GMT, Arnd Bergmann wrote:
> >> The intent is to optionally be able to run a process that keeps higher bits
> >> free for tagging and to be sure no memory mapping in the process will
> >> clobber these (correct me if I'm wrong Charlie! :)
> >>
> >> So you really wouldn't want this if you are using tagged pointers, you'd
> >> want to be sure literally nothing touches the higher bits.
>
> My understanding was that the purpose of the existing design
> is to allow applications to ask for a high address without having
> to resort to the complexity of MAP_FIXED.
>
> In particular, I'm sure there is precedent for applications that
> want both tagged pointers (for most mappings) and untagged pointers
> (for large mappings). With a per-mm_struct or per-task_struct
> setting you can't do that.
>
> > Various architectures handle the hint address differently, but it
> > appears that the only case across any architecture where an address
> > above 47 bits will be returned is if the application had a hint address
> > with a value greater than 47 bits and was using the MAP_FIXED flag.
> > MAP_FIXED bypasses all other checks so I was assuming that it would be
> > logical for MAP_FIXED to bypass this as well. If MAP_FIXED is not set,
> > then the intent is for no hint address to cause a value greater than 47
> > bits to be returned.
>
> I don't think the MAP_FIXED case is that interesting here because
> it has to work in both fixed and non-fixed mappings.
>
> >> This would be more consistent vs. other arches.
> >
> > Yes riscv is an outlier here. The reason I am pushing for something like
> > a flag to restrict the address space rather than setting it to be the
> > default is it seems like if applications are relying on upper bits to be
> > free, then they should be explicitly asking the kernel to keep them free
> > rather than assuming them to be free.
>
> Let's see what the other architectures do and then come up with
> a way that fixes the pointer tagging case first on those that are
> broken. We can see if there needs to be an extra flag after that.
> Here is what I found:
>
> - x86_64 uses DEFAULT_MAP_WINDOW of BIT(47), uses a 57 bit
> address space when an addr hint is passed.
> - arm64 uses DEFAULT_MAP_WINDOW of BIT(47) or BIT(48), returns
> higher 52-bit addresses when either a hint is passed or
> CONFIG_EXPERT and CONFIG_ARM64_FORCE_52BIT is set (this
> is a debugging option)
> - ppc64 uses a DEFAULT_MAP_WINDOW of BIT(47) or BIT(48),
> returns 52 bit address when an addr hint is passed
> - riscv uses a DEFAULT_MAP_WINDOW of BIT(47) but only uses
> it for allocating the stack below, ignoring it for normal
> mappings
> - s390 has no DEFAULT_MAP_WINDOW but tried to allocate in
> the current number of pgtable levels and only upgrades to
> the next level (31, 42, 53, 64 bits) if a hint is passed or
> the current level is exhausted.
> - loongarch64 has no DEFAULT_MAP_WINDOW, and a default VA
> space of 47 bits (16K pages, 3 levels), but can support
> a 55 bit space (64K pages, 3 levels).
> - sparc has no DEFAULT_MAP_WINDOW and up to 52 bit VA space.
> It may allocate both positive and negative addresses in
> there. (?)
> - mips64, parisc64 and alpha have no DEFAULT_MAP_WINDOW and
> at most 48, 41 or 39 address bits, respectively.
>
> I would suggest these changes:
>
> - make riscv enforce DEFAULT_MAP_WINDOW like x86_64, arm64
> and ppc64, leave it at 47
>
> - add DEFAULT_MAP_WINDOW on loongarch64 (47/48 bits
> based on page size), sparc (48 bits) and s390 (unsure if
> 42, 53, 47 or 48 bits)
>
> - leave the rest unchanged.
>
> Arnd
Changing all architectures to have a standardized DEFAULT_MAP_WINDOW
mostly solves the problem. However, I am concerned that it is fragile
for applications to rely on a default like this. The personality flag
is meant to provide an intuitive ABI that guarantees users will not
accidentally request memory outside of the boundary that they
specified.
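
To illustrate why a guaranteed boundary matters for pointer tagging,
here is a minimal sketch of a tagging scheme that is only safe when
every address fits in 47 bits. It is not code from the patch: the names
ADDR_BITS, addr_fits(), tag_addr(), untag_addr() and addr_tag() are
hypothetical, and 64-bit pointers with a 16-bit tag in bits 48..63 are
assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: user addresses fit in 47 bits. */
#define ADDR_BITS 47u
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* True if no bit at or above ADDR_BITS is set in the address. */
static bool addr_fits(uint64_t addr)
{
	return (addr & ~ADDR_MASK) == 0;
}

/* Store a 16-bit tag in bits 48..63 of an address that fits. */
static uint64_t tag_addr(uint64_t addr, uint16_t tag)
{
	/* A mapping above the boundary would be clobbered by the tag. */
	assert(addr_fits(addr));
	return addr | ((uint64_t)tag << 48);
}

/* Recover the original address by dropping everything above bit 46. */
static uint64_t untag_addr(uint64_t tagged)
{
	return tagged & ADDR_MASK;
}

/* Recover the tag from bits 48..63. */
static uint16_t addr_tag(uint64_t tagged)
{
	return (uint16_t)(tagged >> 48);
}
```

If mmap() ever returned an address with a bit set at or above bit 47,
the assert in tag_addr() would fire; that is the failure an explicit
limit is meant to rule out.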

Also, you point out that DEFAULT_MAP_WINDOW could not be standardized
across architectures, so the default behavior would still differ
between architectures, which is the problem I am trying to solve.
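
As a rough sketch of that portability concern, the snippet below takes
the larger default window quoted above for four of the architectures
and computes how many upper pointer bits an application could rely on
across all of them. The table and the helper portable_free_bits() are
hypothetical, only cover the architectures listed, and assume 64-bit
pointers.

```c
#include <assert.h>
#include <stddef.h>

struct arch_window {
	const char *arch;
	unsigned int bits;	/* largest default window quoted above */
};

/* Values taken from the list quoted in this message. */
static const struct arch_window windows[] = {
	{ "x86_64", 47 },
	{ "arm64",  48 },
	{ "ppc64",  48 },
	{ "riscv",  47 },
};

/* Upper pointer bits left free on every listed architecture. */
static unsigned int portable_free_bits(const struct arch_window *w, size_t n)
{
	unsigned int max_bits = 0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		if (w[i].bits > max_bits)
			max_bits = w[i].bits;

	return 64 - max_bits;
}
```

With these values the helper returns 16, one bit fewer than a 47-bit
limit would leave free, so code written against one architecture's
default cannot assume the same number of tag bits on another.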
- Charlie