linux-modules.vger.kernel.org archive mirror
From: Mike Rapoport <rppt@kernel.org>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Anton Ivanov <anton.ivanov@cambridgegreys.com>,
	Borislav Petkov <bp@alien8.de>,
	Brendan Higgins <brendan.higgins@linux.dev>,
	Daniel Gomez <da.gomez@samsung.com>,
	Daniel Thompson <danielt@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Gow <davidgow@google.com>,
	Douglas Anderson <dianders@chromium.org>,
	Ingo Molnar <mingo@redhat.com>,
	Jason Wessel <jason.wessel@windriver.com>,
	Jiri Kosina <jikos@kernel.org>,
	Joe Lawrence <joe.lawrence@redhat.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Miroslav Benes <mbenes@suse.cz>, "H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Petr Mladek <pmladek@suse.com>, Petr Pavlu <petr.pavlu@suse.com>,
	Rae Moar <rmoar@google.com>, Richard Weinberger <richard@nod.at>,
	Sami Tolvanen <samitolvanen@google.com>,
	Shuah Khan <shuah@kernel.org>, Song Liu <song@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	kgdb-bugreport@lists.sourceforge.net, kunit-dev@googlegroups.com,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-mm@kvack.org, linux-modules@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org, linux-um@lists.infradead.org,
	live-patching@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH 3/8] x86/mm/pat: Restore large pages after fragmentation
Date: Tue, 14 Jan 2025 13:01:09 +0200
Message-ID: <Z4ZD9exBt-JQiuS6@kernel.org>
In-Reply-To: <sivhweds7p5sst2jpxanrj6qc7wlonqkod64nsr5cgttma7ntp@bhqespo3jdqz>

On Mon, Jan 13, 2025 at 10:01:13AM +0200, Kirill A. Shutemov wrote:
> On Sun, Jan 12, 2025 at 10:54:46AM +0200, Mike Rapoport wrote:
> > Hi Kirill,
> > 
> > On Fri, Jan 10, 2025 at 12:36:59PM +0200, Kirill A. Shutemov wrote:
> > > On Fri, Dec 27, 2024 at 09:28:20AM +0200, Mike Rapoport wrote:
> > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > > 
> > > > Changing page attributes may fragment the direct mapping over time
> > > > and degrade performance as a result.
> > > > 
> > > > With the current code it's a one-way road: the kernel tries to avoid
> > > > splitting large pages, but it doesn't restore them even if the page
> > > > attributes become compatible again.
> > > > 
> > > > Any change to the mapping may potentially allow a large page to be
> > > > restored.
> > > > 
> > > > Hook into the cpa_flush() path to check whether there are any pages to
> > > > be recovered in the PUD_SIZE range around the pages we've just touched.
> > > > 
> > > > CPUs don't like[1] to have TLB entries of different sizes for the
> > > > same memory, but it looks like it's okay as long as these entries have
> > > > matching attributes[2]. Therefore it's critical to flush the TLB before
> > > > any following changes to the mapping.
> > > > 
> > > > Note that we already allow for multiple TLB entries of different sizes
> > > > for the same memory now in split_large_page() path. It's not a new
> > > > situation.
> > > > 
> > > > set_memory_4k() provides a way to use 4k pages on purpose. Kernel must
> > > > not remap such pages as large. Re-use one of software PTE bits to
> > > > indicate such pages.
> > > > 
> > > > [1] See Erratum 383 of AMD Family 10h Processors
> > > > [2] https://lore.kernel.org/linux-mm/1da1b025-cabc-6f04-bde5-e50830d1ecf0@amd.com/
> > > > 
> > > > [rppt@kernel.org:
> > > >  * s/restore/collapse/
> > > >  * update formatting per peterz
> > > >  * use 'struct ptdesc' instead of 'struct page' for list of page tables to
> > > >    be freed
> > > >  * try to collapse PMD first and if it succeeds move on to PUD as peterz
> > > >    suggested
> > > >  * flush TLB twice: for changes done in the original CPA call and after
> > > >    collapsing of large pages
> > > > ]
> > > > 
> > > > Link: https://lore.kernel.org/all/20200416213229.19174-1-kirill.shutemov@linux.intel.com
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > > > Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > > 
> > > When I originally attempted this, the patch was dropped because of
> > > performance regressions. Was it addressed somehow?
> > 
> > I didn't realize the patch was dropped because of performance regressions,
> > so I didn't address it.
> > 
> > Do you remember where the regressions showed up?
> 
> https://github.com/zen-kernel/zen-kernel/issues/169
> 
> My understanding is if userspace somewhat frequently triggers set_memory_*
> codepath we will get a performance hit.

This version of the patch will cause a smaller performance hit because it
does not scan an entire PUD every time collapse_large_pages() is called.

Still, when I tweaked cpa-test to take some time measurements, I saw about
a 60% increase in the time it takes to perform set_memory operations.

Since we only really care about restoring large pages for ROX mappings, I'm
going to update the patch so that we'll try to collapse large pages only
from set_memory_rox().
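
To make the collapse condition concrete, here is a minimal userspace sketch.
It is not the code from the patch. It models one PMD's worth of 4k entries
and checks whether they could be covered by a single 2M mapping: every entry
present, frames physically contiguous starting at a 2M-aligned frame,
identical attributes, and no entry marked as intentionally 4k (standing in
for the software PTE bit used for set_memory_4k()). All names here
(struct model_pte, enum model_op, model_fill(), model_can_collapse(),
model_should_try_collapse()) are hypothetical and exist only for
illustration; the ROX-only gate mirrors the plan described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 4k entries covered by one 2M mapping on x86-64. */
#define MODEL_PTRS_PER_PMD 512

/* Hypothetical model of a single 4k mapping, not the kernel's pte_t. */
struct model_pte {
	uint64_t pfn;      /* physical frame number */
	uint64_t attrs;    /* protection/cache attribute bits */
	bool present;
	bool force_4k;     /* assumption: models the "keep as 4k" software bit */
};

/* Hypothetical set of attribute-changing operations. */
enum model_op { MODEL_OP_SET_RO, MODEL_OP_SET_RW, MODEL_OP_SET_ROX };

/* Fill a table with a contiguous run of frames sharing one attribute set. */
static void model_fill(struct model_pte *pte, uint64_t base_pfn, uint64_t attrs)
{
	for (size_t i = 0; i < MODEL_PTRS_PER_PMD; i++) {
		pte[i].pfn = base_pfn + i;
		pte[i].attrs = attrs;
		pte[i].present = true;
		pte[i].force_4k = false;
	}
}

/* Return true if the whole table could be replaced by one 2M mapping. */
static bool model_can_collapse(const struct model_pte *pte)
{
	/* The first frame must sit on a 2M boundary. */
	if (pte[0].pfn % MODEL_PTRS_PER_PMD != 0)
		return false;

	for (size_t i = 0; i < MODEL_PTRS_PER_PMD; i++) {
		if (!pte[i].present || pte[i].force_4k)
			return false;
		if (pte[i].pfn != pte[0].pfn + i)      /* not contiguous */
			return false;
		if (pte[i].attrs != pte[0].attrs)      /* attributes differ */
			return false;
	}
	return true;
}

/* Only pay for the scan on the operation where large pages matter. */
static bool model_should_try_collapse(enum model_op op)
{
	return op == MODEL_OP_SET_ROX;
}
```

In this model a caller would first ask model_should_try_collapse() and only
then run the 512-entry scan, so operations other than ROX skip the extra
work entirely.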
 
> -- 
>   Kiryl Shutsemau / Kirill A. Shutemov

-- 
Sincerely yours,
Mike.


Thread overview: 23+ messages
2024-12-27  7:28 [PATCH 0/8] x86/module: rework ROX cache to avoid writable copy Mike Rapoport
2024-12-27  7:28 ` [PATCH 1/8] x86/mm/pat: cpa-test: fix length for CPA_ARRAY test Mike Rapoport
2025-01-03 11:19   ` Peter Zijlstra
2024-12-27  7:28 ` [PATCH 2/8] x86/mm/pat: drop duplicate variable in cpa_flush() Mike Rapoport
2024-12-27  7:28 ` [PATCH 3/8] x86/mm/pat: Restore large pages after fragmentation Mike Rapoport
2025-01-10 10:36   ` Kirill A. Shutemov
2025-01-10 19:18     ` Luis Chamberlain
2025-01-12  8:54     ` Mike Rapoport
2025-01-13  8:01       ` Kirill A. Shutemov
2025-01-14 11:01         ` Mike Rapoport [this message]
2024-12-27  7:28 ` [PATCH 4/8] execmem: add API for temporal remapping as RW and restoring ROX afterwards Mike Rapoport
2024-12-27  7:28 ` [PATCH 5/8] module: introduce MODULE_STATE_GONE Mike Rapoport
2025-01-08 15:43   ` Daniel Thompson
2024-12-27  7:28 ` [PATCH 6/8] modules: switch to execmem API for remapping as RW and restoring ROX Mike Rapoport
2025-01-02 21:30   ` Lorenzo Stoakes
2025-01-03  2:06     ` Andrew Cooper
2025-01-03  5:57       ` Andrew Morton
2025-01-03 10:57         ` Marek Marczykowski-Górecki
2025-01-03  6:58       ` Jürgen Groß
2025-01-04  2:07         ` Luis Chamberlain
2025-01-05 12:52           ` Marek Maślanka
2024-12-27  7:28 ` [PATCH 7/8] Revert "x86/module: prepare module loading for ROX allocations of text" Mike Rapoport
2024-12-27  7:28 ` [PATCH 8/8] module: drop unused module_writable_address() Mike Rapoport
