linuxppc-dev.lists.ozlabs.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Hugh Dickins <hughd@google.com>
Cc: Michel Lespinasse <michel@lespinasse.org>,
	linux-ia64@vger.kernel.org, David Hildenbrand <david@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	linux-kernel@vger.kernel.org, Max Filippov <jcmvbkbc@gmail.com>,
	sparclinux@vger.kernel.org, linux-riscv@lists.infradead.org,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	Will Deacon <will@kernel.org>, Greg Ungerer <gerg@linux-m68k.org>,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	Helge Deller <deller@gmx.de>,
	x86@kernel.org, Russell King <linux@armlinux.org.uk>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Alexandre Ghiti <alexghiti@rivosinc.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	linux-m68k@lists.linux-m68k.org,
	John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
	John David Anglin <dave.anglin@bell.net>,
	Suren Baghdasaryan <surenb@google.com>,
	linux-arm-kernel@lists.infradead.org,
	Chris Zankel <chris@zankel.net>, Michal Simek <monstr@monstr.eu>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	linux-parisc@vger.kernel.org, linux-mm@kvack.org,
	linux-mips@vger.kernel.org, Palmer Dabbelt <palmer@dabbelt.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org,
	"David S. Miller" <davem@davemloft.net>,
	Mike Rapoport <rppt@kernel.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail
Date: Thu, 11 May 2023 15:02:55 +0100
Message-ID: <ZFz1j1slZHCQmwMJ@casper.infradead.org>
In-Reply-To: <d7f3c7b2-25b8-ef66-98a8-43d68f4499f@google.com>

On Wed, May 10, 2023 at 09:35:44PM -0700, Hugh Dickins wrote:
> On Wed, 10 May 2023, Matthew Wilcox wrote:
> > On Tue, May 09, 2023 at 09:39:13PM -0700, Hugh Dickins wrote:
> > > Two: pte_offset_map() will need to do an rcu_read_lock(), with the
> > > corresponding rcu_read_unlock() in pte_unmap().  But most architectures
> > > never supported CONFIG_HIGHPTE, so some don't always call pte_unmap()
> > > after pte_offset_map(), or have used userspace pte_offset_map() where
> > > pte_offset_kernel() is more correct.  No problem in the current tree,
> > > but a problem once an rcu_read_unlock() will be needed to keep balance.
> > 
> > Hi Hugh,
> > 
> > I shall have to spend some time looking at these patches, but at LSFMM
> > just a few hours ago, I proposed and nobody objected to removing
> > CONFIG_HIGHPTE.  I don't intend to take action on that consensus
> > immediately, so I can certainly wait until your patches are applied, but
> > if this information simplifies what you're doing, feel free to act on it.
> 
> Thanks a lot, Matthew: very considerate, as usual.
> 
> Yes, I did see your "Whither Highmem?" (wither highmem!) proposal on the

I'm glad somebody noticed the pun ;-)

> list, and it did make me think, better get these patches and preview out
> soon, before you get to vanish pte_unmap() altogether.  HIGHMEM or not,
> HIGHPTE or not, I think pte_offset_map() and pte_unmap() still have an
> important role to play.
> 
> I don't really understand why you're going down a remove-CONFIG_HIGHPTE
> route: I thought you were motivated by the awkwardness of kmap on large
> folios; but I don't see how removing HIGHPTE helps with that at all
> (unless you have a "large page tables" effort in mind, but I doubt it).

Quite right, my primary concern is filesystem metadata; primarily
directories as I don't think anybody has ever supported symlinks or
superblocks larger than 4kB.

I was thinking that removing CONFIG_HIGHPTE might simplify the page
fault handling path a little, but now I've looked at it some more, and
I'm not sure there's any simplification to be had.  It should probably
use kmap_local instead of kmap_atomic(), though.

> But I've no investment in CONFIG_HIGHPTE if people think now is the
> time to remove it: I disagree, but wouldn't miss it myself - so long
> as you leave pte_offset_map() and pte_unmap() (under whatever names).
> 
> I don't think removing CONFIG_HIGHPTE will simplify what I'm doing.
> For a moment it looked like it would: the PAE case is nasty (and our
> data centres have not been on PAE for a long time, so it wasn't a
> problem I had to face before); and knowing pmd_high must be 0 for a
> page table looked like it would help, but now I'm not so sure of that
> (hmm, I'm changing my mind again as I write).
> 
> Peter's pmdp_get_lockless() does rely for complete correctness on
> interrupts being disabled, and I suspect that I may be forced in the
> PAE case to do so briefly; but detest that notion.  For now I'm just
> deferring it, hoping for a better idea before third series finalized.
> 
> I mention this (and Cc Peter) in passing: don't want this arch thread
> to go down into that rabbit hole: we can start a fresh thread on it if
> you wish, but right now my priority is commit messages for the second
> series, rather than solving (or even detailing) the PAE problem.

I infer that what you need is a pte_access_start() and a
pte_access_end() which look like they can be plausibly rcu_read_lock()
and rcu_read_unlock(), but might need to be local_irq_save() and
local_irq_restore() in some configurations?
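
A minimal userspace sketch of that idea, not kernel code and not taken
from the patch series. The kernel primitives are replaced by counters so
the pairing can be compiled and checked. Only pte_access_start() and
pte_access_end() are names from this message; struct pte_access,
pte_access_needs_irqs_off, mock_pte_offset_map(), mock_pte_unmap() and
read_pte() are hypothetical, and dropping the protection inside a failed
map is an assumption about how a failing pte_offset_map() might behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long pte_t;    /* stand-in for the kernel's pte_t */

/* Counters standing in for rcu_read_lock() and local_irq_save() nesting. */
static int rcu_depth;
static int irq_off_depth;

/* Hypothetical switch: true models a configuration (perhaps PAE) that
 * would need interrupts disabled rather than an RCU read-side section. */
static bool pte_access_needs_irqs_off;

/* Remembers which protection was taken so the end call undoes the same
 * one. A real version would also carry the saved IRQ flags. */
struct pte_access {
	bool irqs_off;
};

static void pte_access_start(struct pte_access *a)
{
	a->irqs_off = pte_access_needs_irqs_off;
	if (a->irqs_off)
		irq_off_depth++;    /* models local_irq_save() */
	else
		rcu_depth++;        /* models rcu_read_lock() */
}

static void pte_access_end(struct pte_access *a)
{
	if (a->irqs_off)
		irq_off_depth--;    /* models local_irq_restore() */
	else
		rcu_depth--;        /* models rcu_read_unlock() */
	assert(irq_off_depth >= 0 && rcu_depth >= 0);
}

/* Mock of a pte_offset_map() that can fail: a NULL table models a page
 * table that has gone away. On failure the protection is dropped here
 * (an assumption), so the caller must not call the unmap helper. */
static pte_t *mock_pte_offset_map(struct pte_access *a, pte_t *table,
				  size_t idx)
{
	pte_access_start(a);
	if (!table) {
		pte_access_end(a);
		return NULL;
	}
	return &table[idx];
}

static void mock_pte_unmap(struct pte_access *a, pte_t *pte)
{
	(void)pte;
	pte_access_end(a);
}

/* Caller pattern: check for failure, and unmap only after a successful
 * map. Returns 0 on success, -1 if the map failed. */
static int read_pte(pte_t *table, size_t idx, pte_t *out)
{
	struct pte_access a;
	pte_t *pte = mock_pte_offset_map(&a, table, idx);

	if (!pte)
		return -1;
	*out = *pte;
	mock_pte_unmap(&a, pte);
	return 0;
}
```

The point of the sketch is only that every successful map is balanced by
exactly one unmap, whichever primitive sits underneath, which is why a
missing pte_unmap() stops being harmless once it has to drop a lock.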

We also talked about moving x86 to always RCU-free page tables in
order to make accessing /proc/$pid/smaps lockless.  I believe Michel
is going to take a swing at this project.

Thread overview: 50+ messages
2023-05-10  4:39 [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail Hugh Dickins
2023-05-10  4:42 ` [PATCH 01/23] arm: " Hugh Dickins
2023-05-10 14:28   ` Matthew Wilcox
2023-05-11  3:40     ` Hugh Dickins
2023-05-10  4:43 ` [PATCH 02/23] arm64: allow pte_offset_map() " Hugh Dickins
2023-05-25 16:37   ` Catalin Marinas
2023-05-10  4:45 ` [PATCH 03/23] arm64/hugetlb: pte_alloc_huge() pte_offset_huge() Hugh Dickins
2023-05-25 16:37   ` Catalin Marinas
2023-05-10  4:47 ` [PATCH 04/23] ia64/hugetlb: " Hugh Dickins
2023-05-10  4:48 ` [PATCH 05/23] m68k: allow pte_offset_map[_lock]() to fail Hugh Dickins
2023-05-10  7:13   ` Geert Uytterhoeven
2023-05-11  2:57     ` Hugh Dickins
2023-05-11  6:53       ` Geert Uytterhoeven
2023-05-10  4:49 ` [PATCH 06/23] microblaze: allow pte_offset_map() " Hugh Dickins
2023-05-10  4:51 ` [PATCH 07/23] mips: update_mmu_cache() can replace __update_tlb() Hugh Dickins
2023-05-10  4:52 ` [PATCH 08/23] parisc: add pte_unmap() to balance get_ptep() Hugh Dickins
2023-05-13 21:35   ` Helge Deller
2023-05-14 18:20     ` Hugh Dickins
2023-05-10  4:54 ` [PATCH 09/23] parisc: unmap_uncached_pte() use pte_offset_kernel() Hugh Dickins
2023-05-10  4:55 ` [PATCH 10/23] parisc/hugetlb: pte_alloc_huge() pte_offset_huge() Hugh Dickins
2023-05-10  4:56 ` [PATCH 11/23] powerpc: kvmppc_unmap_free_pmd() pte_offset_kernel() Hugh Dickins
2023-05-10  4:57 ` [PATCH 12/23] powerpc: allow pte_offset_map[_lock]() to fail Hugh Dickins
2023-05-10  4:58 ` [PATCH 13/23] powerpc/hugetlb: pte_alloc_huge() Hugh Dickins
2023-05-10  4:59 ` [PATCH 14/23] riscv/hugetlb: pte_alloc_huge() pte_offset_huge() Hugh Dickins
2023-05-10  8:01   ` Alexandre Ghiti
2023-05-10 14:01   ` Palmer Dabbelt
2023-05-10  5:01 ` [PATCH 15/23] s390: allow pte_offset_map_lock() to fail Hugh Dickins
2023-05-17 10:35   ` Claudio Imbrenda
2023-05-17 21:50     ` Hugh Dickins
2023-05-23 12:00       ` Claudio Imbrenda
2023-05-24  1:49         ` Hugh Dickins
2023-05-25  7:23           ` Claudio Imbrenda
2023-05-10  5:02 ` [PATCH 16/23] s390: gmap use pte_unmap_unlock() not spin_unlock() Hugh Dickins
2023-05-17 11:28   ` Alexander Gordeev
2023-05-10  5:03 ` [PATCH 17/23] sh/hugetlb: pte_alloc_huge() pte_offset_huge() Hugh Dickins
2023-05-10  5:04 ` [PATCH 18/23] sparc/hugetlb: " Hugh Dickins
2023-05-10  5:05 ` [PATCH 19/23] sparc: allow pte_offset_map() to fail Hugh Dickins
2023-05-10  5:07 ` [PATCH 20/23] sparc: iounit and iommu use pte_offset_kernel() Hugh Dickins
2023-05-10  5:08 ` [PATCH 21/23] x86: Allow get_locked_pte() to fail Hugh Dickins
2023-05-10  8:18   ` Peter Zijlstra
2023-05-11  3:16     ` Hugh Dickins
2023-05-11  7:29       ` Peter Zijlstra
2023-05-10  5:09 ` [PATCH 22/23] x86: sme_populate_pgd() use pte_offset_kernel() Hugh Dickins
2023-05-10  5:11 ` [PATCH 23/23] xtensa: add pte_unmap() to balance pte_offset_map() Hugh Dickins
2023-05-10  6:07 ` [PATCH 00/23] arch: allow pte_offset_map[_lock]() to fail Matthew Wilcox
2023-05-11  4:35   ` Hugh Dickins
2023-05-11 14:02     ` Matthew Wilcox [this message]
2023-05-11 22:37       ` Hugh Dickins
2023-05-12  3:38       ` Mike Rapoport
2023-05-16 10:41       ` Peter Zijlstra
