public inbox for linux-riscv@lists.infradead.org
From: "Arnd Bergmann" <arnd@arndb.de>
To: "Alexandre Ghiti" <alex@ghiti.fr>,
	"Samuel Holland" <samuel.holland@sifive.com>,
	"Alexandre Ghiti" <alexghiti@rivosinc.com>
Cc: "Palmer Dabbelt" <palmer@dabbelt.com>,
	linux-riscv@lists.infradead.org,
	"Albert Ou" <aou@eecs.berkeley.edu>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Charlie Jenkins" <charlie@rivosinc.com>,
	guoren <guoren@kernel.org>, "Jisheng Zhang" <jszhang@kernel.org>,
	"Kemeng Shi" <shikemeng@huaweicloud.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Paul Walmsley" <paul.walmsley@sifive.com>,
	"Xiao W Wang" <xiao.w.wang@intel.com>,
	"Yangyu Chen" <cyy@cyyself.name>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] riscv: Define TASK_SIZE_MAX for __access_ok()
Date: Mon, 25 Mar 2024 12:15:10 +0100
Message-ID: <eaa73bca-1e81-4a6f-a89f-e63329d359f9@app.fastmail.com>
In-Reply-To: <1883cf6b-7756-405b-a843-d8ae31e01d61@ghiti.fr>

On Mon, Mar 25, 2024, at 08:25, Alexandre Ghiti wrote:
> On 24/03/2024 23:05, Arnd Bergmann wrote:
>> On Tue, Mar 19, 2024, at 17:51, Alexandre Ghiti wrote:
>>>
>>> The use of alternatives allows to return right away if the buffer is
>>> beyond the usable user address space, and it's not just "slightly
>>> faster" for some cases (a very large buffer with only a few bytes being
>>> beyond the limit or someone could fault-in all the user pages and fail
>>> very late...etc). access_ok() is here to guarantee that such situations
>>> don't happen, so actually it makes more sense to use an alternative to
>>> avoid that.
>> The access_ok() function really wants a compile-time constant
>> value for TASK_SIZE_MAX so it can do constant folding for
>> repeated calls inside of one function, so for configurations
>> with a boot-time selected TASK_SIZE_64 it's already not ideal,
>> with or without alternatives.
>>
>> If I read the current code correctly, riscv doesn't even
>> have a way to build with a compile-time selected
>> VA_BITS/PGDIR_SIZE, which is probably a better place to
>> start optimizing, since this rarely needs to be selected
>> dynamically.
>
>
> Indeed, we do not support compile-time fixed VA_BITS! We could, but that 
> would only be used for custom kernels. I don't think distro kernels will 
> ever (?) propose 3 different kernels for sv39, sv48 and sv57 because the 
> cost of dynamically choosing the address space width is not big enough 
> to me (and the burden of maintaining 3 different kernels is).
>
> Let me know if I'm wrong, I'd be happy to work on that.

My feeling is that in most cases, users are better off with a
compile-time default, given that the addressable memory differs
by a factor of 512 between each step. With sv39, I think you are
limited to having all RAM in the first 128GB of physical
address space, and each process is limited to 256GB of virtual
addressing, but this already covers pretty much anything
you want to do on small systems that run a custom kernel.
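
As a rough sketch of the arithmetic behind the factor of 512 and the
256GB figure: the snippet below assumes 4 KiB pages, 9 index bits per
page-table level, and that user space gets the lower half of the
virtual address range. The function and table names are made up for
illustration and are not kernel identifiers.

```python
# Sketch: virtual address space per RISC-V paging mode.
# Assumptions (not taken from the kernel source): 4 KiB base pages,
# 9 index bits per page-table level, user space = lower half of the VA range.

PAGE_SHIFT = 12      # 4 KiB base pages
BITS_PER_LEVEL = 9   # 512 entries per page table

MODES = {"sv39": 3, "sv48": 4, "sv57": 5}  # mode name -> page-table levels


def va_bits(levels):
    """Width of a virtual address for the given number of levels."""
    return PAGE_SHIFT + BITS_PER_LEVEL * levels


def user_va_bytes(levels):
    """Bytes of user virtual address space (lower half of the VA range)."""
    return 1 << (va_bits(levels) - 1)


def human(nbytes):
    """Format a power-of-two byte count with a binary unit."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    idx = 0
    while nbytes >= 1024 and idx < len(units) - 1:
        nbytes //= 1024
        idx += 1
    return f"{nbytes} {units[idx]}"


if __name__ == "__main__":
    for name, levels in MODES.items():
        print(f"{name}: {va_bits(levels)} VA bits, "
              f"{human(user_va_bytes(levels))} per process")
```

Each extra page-table level adds 9 address bits, which is where the
factor of 512 (2^9) between sv39, sv48 and sv57 comes from.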

On most desktop/server/cloud distros, hardwiring sv48 is
probably sufficient if all general purpose machines support
this, and it should be enough even for commercial databases
that micro-optimize 100TB datasets through a permanent mmap(),
as well as most NUMA systems with discontiguous memory.
This adds a little cost over hardcoded sv39, but is still
faster than a boot-time sv39/sv48 config that most users
will not be aware of.

Once enterprise distros certify systems beyond a few dozen
TB of RAM, they probably need to enable boot-time
detection and can start shipping a kernel with boot-time
detected page table sizes. Until then, I think the few users
with gigantic systems will probably be fine running a custom
sv57 kernel.

     Arnd

