From: Jason Gunthorpe <jgg@ziepe.ca>
To: Hugh Dickins <hughd@google.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
linux-s390@vger.kernel.org
Subject: Re: [PATCH 07/12] s390: add pte_free_defer(), with use of mmdrop_async()
Date: Fri, 16 Jun 2023 09:35:41 -0300
Message-ID: <ZIxXHUcb5LumkxEH@ziepe.ca>
In-Reply-To: <422e8778-444c-d291-988c-26fc041a481@google.com>

On Thu, Jun 15, 2023 at 02:09:30PM -0700, Hugh Dickins wrote:
> On Thu, 15 Jun 2023, Jason Gunthorpe wrote:
> > On Wed, Jun 14, 2023 at 02:59:33PM -0700, Hugh Dickins wrote:
> >
> > > I guess the best thing would be to modify kernel/fork.c to allow the
> > > architecture to override free_mm(), and arch/s390 call_rcu to free mm.
> > > But as a quick and dirty s390-end workaround, how about:
> >
> > RCU callbacks are not ordered so that doesn't seem like it helps..
>
> Thanks, that's an interesting and important point, which I need to knock
> into my head better.
>
> But can you show me where that's handled in the existing mm/mmu_gather.c
> include/asm-generic/tlb.h framework? I don't see any rcu_barrier()s
> there, yet don't the pmd_huge_pte pointers point into pud page tables
> freed shortly afterwards also by RCU?
I don't know anything about the pmd_huge_pte stuff. I was expecting it
to be cleaned up explicitly before things reach call_rcu(). Where is it
touched from a call_rcu() callback?
> > Making the page frag pool global (per-cpu global I guess) would also
> > remove the need to reach back to the freeable mm_struct and reduce the
> > need for struct page memory. This views it as a special kind of
> > kmemcache.
>
> I haven't thought in that direction at all. Hmm. Or did I think of
> it once, but discarded for accounting reasons - IIRC (haven't rechecked)
> page table pages are charged to memcg, and counted for meminfo and other(?)
> purposes: if the fragments are all lumped into a global pool, we
> lose that.
You'd have to search the free list for fragments that match the
current memcg to avoid creating mismatches :\, or rework how memcg
accounting works for page tables - e.g. move the memcg from the struct
page to the mm_struct so that each frag can be accounted differently.
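
As a rough illustration of the first option only, here is a userspace
sketch of a global fragment free list that is searched by memcg. It is
not kernel code: struct pt_frag, struct pt_frag_pool, the integer
memcg_id and both helpers are hypothetical names made up for this
example, and locking and per-cpu handling are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one free page table fragment sitting in a global pool. */
struct pt_frag {
	struct pt_frag *next;
	int memcg_id;	/* stand-in for the memcg the backing page was charged to */
};

/* Hypothetical global pool: a singly linked list of free fragments. */
struct pt_frag_pool {
	struct pt_frag *free_list;
};

/* Put a fragment back on the head of the free list. */
static void pool_free_frag(struct pt_frag_pool *pool, struct pt_frag *frag)
{
	assert(frag != NULL);
	frag->next = pool->free_list;
	pool->free_list = frag;
}

/*
 * Return the first free fragment charged to memcg_id and unlink it,
 * or NULL if there is none. On NULL the caller would have to allocate
 * and charge a fresh page instead of reusing a mismatched fragment.
 */
static struct pt_frag *pool_alloc_frag(struct pt_frag_pool *pool, int memcg_id)
{
	struct pt_frag **pp;

	for (pp = &pool->free_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->memcg_id == memcg_id) {
			struct pt_frag *frag = *pp;

			*pp = frag->next;
			frag->next = NULL;
			return frag;
		}
	}
	return NULL;
}
```

The linear walk is the cost being complained about above: every
allocation may have to skip over fragments that belong to other memcgs.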
> > Can arches opt in to RCU freeing page table support and still keep
> > your series sane?
>
> Yes, or perhaps we mean different things: I thought most architectures
> are already freeing page tables by RCU. s390 included.
> "git grep MMU_GATHER_RCU_TABLE_FREE" shows plenty of selects.
MMU_GATHER_RCU_TABLE_FREE is a very confusing option. What it really
says is that the architecture doesn't do an IPI so we sometimes use
RCU as a replacement for the IPI, but not always.
Specifically this means it doesn't allow rcu reading of the page
tables. You still have to take the IPI blocking interrupt-disable lock
to read page tables, even if MMU_GATHER_RCU_TABLE_FREE is set.
IMHO I would be a lot happier with what you were trying to do here if
it came along with full RCU enabling of page tables so that we could
say that the rcu_read_lock() is sufficient locking to read page tables
*always*.

I haven't really worked out how this series works such that we can
introduce rcu_read_lock() in only one specific place.
My query was simpler - if we could find enough space to put an rcu_head
in the ptdesc for many architectures, and thus *always* RCU free on
many architectures, could you do what you want but disable it on s390
and POWER, which would still have to rely on an RCU head allocation and
a backup IPI?
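
To make the "rcu_head in the ptdesc" idea concrete, a minimal userspace
sketch follows. It is not kernel code and not taken from the series:
struct rcu_head and mock_call_rcu() are simplified stand-ins for the
kernel's rcu_head and call_rcu(), mock_grace_period() just pretends a
grace period has elapsed, and struct ptdesc_sketch,
pte_free_defer_sketch() and the 512-entry table size are hypothetical.
The point is only that with the rcu_head embedded in the descriptor,
deferring the free needs no allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

static struct rcu_head *pending;	/* callbacks waiting for a "grace period" */

/* Stand-in for call_rcu(): queue the callback, do not run it yet. */
static void mock_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Pretend a grace period elapsed: run every queued callback. */
static int mock_grace_period(void)
{
	int ran = 0;

	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		head->func(head);
		ran++;
	}
	return ran;
}

/* Hypothetical descriptor with the rcu_head embedded in it. */
struct ptdesc_sketch {
	unsigned long *table;
	struct rcu_head rcu;
};

static int tables_freed;

/* RCU callback: recover the descriptor from its rcu_head and free it. */
static void ptdesc_free_rcu(struct rcu_head *head)
{
	struct ptdesc_sketch *pt = (struct ptdesc_sketch *)
		((char *)head - offsetof(struct ptdesc_sketch, rcu));

	free(pt->table);
	free(pt);
	tables_freed++;
}

static struct ptdesc_sketch *ptdesc_alloc(void)
{
	struct ptdesc_sketch *pt = malloc(sizeof(*pt));

	if (!pt)
		return NULL;
	pt->table = calloc(512, sizeof(*pt->table));
	if (!pt->table) {
		free(pt);
		return NULL;
	}
	return pt;
}

/* Defer the free; cannot fail because no memory is allocated here. */
static void pte_free_defer_sketch(struct ptdesc_sketch *pt)
{
	assert(pt != NULL);
	mock_call_rcu(&pt->rcu, ptdesc_free_rcu);
}
```

An architecture without room for the embedded rcu_head would instead
have to allocate one at free time and fall back to something else when
that allocation fails, which is the s390/POWER case described above.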
Jason