From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Steve Wahl <steve.wahl@hpe.com>
Cc: Russ Anderson <rja@hpe.com>, Ingo Molnar <mingo@kernel.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
linux-kernel@vger.kernel.org,
Linux regressions mailing list <regressions@lists.linux.dev>,
Pavin Joseph <me@pavinjoseph.com>,
stable@vger.kernel.org, Eric Hagberg <ehagberg@gmail.com>,
Simon Horman <horms@verge.net.au>,
Dave Young <dyoung@redhat.com>, Sarah Brofeldt <srhb@dbc.dk>,
Dimitri Sivanich <sivanich@hpe.com>
Subject: Re: [PATCH] x86/mm/ident_map: Use full gbpages in identity maps except on UV platform.
Date: Sat, 30 Mar 2024 22:46:21 -0500
Message-ID: <87msqf12sy.fsf@email.froward.int.ebiederm.org>
In-Reply-To: <ZgWO5I_p8zHyp3en@swahl-home.5wahls.com> (Steve Wahl's message of "Thu, 28 Mar 2024 10:38:12 -0500")

Steve Wahl <steve.wahl@hpe.com> writes:

> On Thu, Mar 28, 2024 at 12:05:02AM -0500, Eric W. Biederman wrote:
>>
>> From my perspective the entire reason for wanting to be fine grained and
>> precise in the kernel memory map is because the UV systems don't have
>> enough MTRRs. So you have to depend upon the cache-ability attributes
>> for specific addresses of memory coming from the page tables instead of
>> from the MTRRs.
>
> It would be more accurate to say we depend upon the addresses not
> being listed in the page tables at all. We'd be OK with mapped but
> not accessed, if it weren't for processor speculation. There's no "no
> access" setting within the existing MTRR definitions, though there may
> be a setting that would rein in processor speculation enough to make
> do.

The uncached and write-combining settings that are used for I/O are
required to disable speculation for any regions so marked. Any read
or write to a memory-mapped I/O region can result in the hardware
processing it as a command, which as I understand it is exactly the
problem on UV systems.

Frankly, not mapping an I/O region (in an identity-mapped page table),
instead of properly mapping it as it would need to be mapped for
performing I/O, seems like a bit of a bug.

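For illustration only, here is a minimal sketch of what mapping such a
region "as it would need to be mapped for performing I/O" could look like
at the page-table-entry level. The function name ident_pmd_entry() and
the macro names are hypothetical and are not the kernel's ident_map code.
The bit positions are the architectural x86-64 ones, and the sketch
assumes a PAT layout in which PCD|PWT selects the uncacheable (UC) memory
type, as the power-on default does.

```c
#include <stdint.h>

/* Architectural x86-64 page-table entry bits (macro names are illustrative). */
#define PTE_PRESENT (1ULL << 0)   /* entry is valid */
#define PTE_RW      (1ULL << 1)   /* writable */
#define PTE_PWT     (1ULL << 3)   /* page-level write-through */
#define PTE_PCD     (1ULL << 4)   /* page-level cache disable */
#define PTE_PSE     (1ULL << 7)   /* large page (2 MiB at the PMD level) */

#define SZ_2M       (1ULL << 21)

/*
 * Hypothetical helper: build a 2 MiB identity-map PMD entry for the
 * physical address 'phys'. Normal memory keeps the default cacheable
 * type; a region believed to be memory-mapped I/O is marked PCD|PWT,
 * which is assumed here to select the UC memory type.
 */
uint64_t ident_pmd_entry(uint64_t phys, int is_mmio)
{
    /* Align down to the 2 MiB page that contains 'phys'. */
    uint64_t entry = (phys & ~(SZ_2M - 1)) | PTE_PRESENT | PTE_RW | PTE_PSE;

    if (is_mmio)
        entry |= PTE_PCD | PTE_PWT;

    return entry;
}
```

The point of the sketch is only that the region stays present in the
identity map while its cacheability is controlled by the entry itself
rather than by leaving a hole.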
>> If you had enough MTRRs, defining the page tables to be precisely
>> what is necessary would simply be an exercise in reducing kernel
>> performance, because it is more efficient in both page table size, and
>> in TLB usage to use 1GB pages instead of whatever smaller pages you have
>> to use for oddball regions.
>>
>> For systems without enough MTRRs the small performance hit in paging
>> performance is the necessary trade off.
>>
>> At least that is my perspective. Does that make sense?
>
> I think I'm beginning to get your perspective. From your point of
> view, is kexec failing with "nogbpages" set a bug? My point of view
> is it likely is. I think your view would say it isn't?

I would say it is a bug.

Part of the bug is someone yet again taking something simple that
kexec is doing and reworking it to use generic code, then changing
the generic code to do something different from what kexec needs,
and then being surprised that kexec stops working.

The interface kexec wants to provide to whatever is being loaded is not
having to think about page tables until that software is up far enough
to enable its own page tables.

People being clever and enabling just enough pages in the page tables
to work, based upon the results of some buggy boot-up firmware (it is
always buggy; some is just less so than others), is where I get
concerned.

Said another way, the point is to build an identity-mapped page table.
Skipping some parts of the physical<->virtual identity because we seem
to think no one will use them is likely a bug.

I really don't see any point in putting holes in such a page table for
any address below the highest address that is good for something. Given
that on some systems the MTRRs are insufficient to do their job, it
definitely makes sense to not enable caching on areas that we don't
think are memory.

Eric
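
As a rough illustration of the page-table-size side of the 1GB-page
argument quoted above, the sketch below counts how many 4 KiB page-table
pages a hole-free identity map of [0, span) needs for each leaf page
size. It assumes 4-level x86-64 paging with 512 entries per table; the
function and macro names are hypothetical and do not come from the
kernel, and TLB effects are not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K   (1ULL << 12)
#define SZ_2M   (1ULL << 21)
#define SZ_1G   (1ULL << 30)
#define SZ_512G (1ULL << 39)

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d;
}

/*
 * Hypothetical helper: number of 4 KiB page-table pages needed to
 * identity map physical addresses [0, span) with the given leaf size,
 * assuming 4-level paging and no holes in the map.
 */
uint64_t ident_map_table_pages(uint64_t span, uint64_t leaf)
{
    assert(leaf == SZ_1G || leaf == SZ_2M || leaf == SZ_4K);
    assert(span > 0 && span <= 512 * SZ_512G);

    uint64_t pages = 1;                          /* one top-level (PGD) table */

    pages += div_round_up(span, SZ_512G);        /* PUD tables: 512 GiB each */
    if (leaf <= SZ_2M)
        pages += div_round_up(span, SZ_1G);      /* PMD tables: 1 GiB each */
    if (leaf == SZ_4K)
        pages += div_round_up(span, SZ_2M);      /* PTE tables: 2 MiB each */

    return pages;
}
```

Under these assumptions, identity mapping 64 GiB takes 2 table pages
with 1 GiB leaves, 66 with 2 MiB leaves, and 32834 with 4 KiB leaves.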