linux-mips.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Andy Lutomirski" <luto@kernel.org>
To: "Nadav Amit" <nadav.amit@gmail.com>, "Song Liu" <song@kernel.org>
Cc: "Mike Rapoport" <rppt@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Christophe Leroy" <christophe.leroy@csgroup.eu>,
	"David S. Miller" <davem@davemloft.net>,
	"Dinh Nguyen" <dinguyen@kernel.org>,
	"Heiko Carstens" <hca@linux.ibm.com>,
	"Helge Deller" <deller@gmx.de>,
	"Huacai Chen" <chenhuacai@kernel.org>,
	"Kent Overstreet" <kent.overstreet@linux.dev>,
	"Luis Chamberlain" <mcgrof@kernel.org>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Naveen N. Rao" <naveen.n.rao@linux.ibm.com>,
	"Palmer Dabbelt" <palmer@dabbelt.com>,
	"Puranjay Mohan" <puranjay12@gmail.com>,
	"Rick P Edgecombe" <rick.p.edgecombe@intel.com>,
	"Russell King (Oracle)" <linux@armlinux.org.uk>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Thomas Bogendoerfer" <tsbogend@alpha.franken.de>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Will Deacon" <will@kernel.org>, bpf <bpf@vger.kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-mm <linux-mm@kvack.org>,
	linux-modules@vger.kernel.org, linux-parisc@vger.kernel.org,
	linux-riscv@lists.infradead.org,
	linux-s390 <linux-s390@vger.kernel.org>,
	linux-trace-kernel@vger.kernel.org,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	loongarch@lists.linux.dev, netdev@vger.kernel.org,
	sparclinux@vger.kernel.org,
	"the arch/x86 maintainers" <x86@kernel.org>
Subject: Re: [PATCH v2 02/12] mm: introduce execmem_text_alloc() and jit_text_alloc()
Date: Tue, 20 Jun 2023 10:24:29 -0700	[thread overview]
Message-ID: <6145cabf-d016-4dba-b5d2-0fb793352058@app.fastmail.com> (raw)
In-Reply-To: <7F566E60-C371-449B-992B-0C435AD6016B@gmail.com>



On Mon, Jun 19, 2023, at 1:18 PM, Nadav Amit wrote:
>> On Jun 19, 2023, at 10:09 AM, Andy Lutomirski <luto@kernel.org> wrote:
>> 
>> But jit_text_alloc() can't do this, because the order of operations doesn't match.  With jit_text_alloc(), the executable mapping shows up before the text is populated, so there is no atomic change from not-there to populated-and-executable.  Which means that there is an opportunity for CPUs, speculatively or otherwise, to start filling various caches with intermediate states of the text, which means that various architectures (even x86!) may need serialization.
>> 
>> For eBPF- and module- like use cases, where JITting/code gen is quite coarse-grained, perhaps something vaguely like:
>> 
>> jit_text_alloc() -> returns a handle and an executable virtual address, but does *not* map it there
>> jit_text_write() -> write to that handle
>> jit_text_map() -> map it and synchronize if needed (no sync needed on x86, I think)
>
> Andy, would you mind explaining why you think a sync is not needed? I 
> mean I have a “feeling” that perhaps TSO can guarantee something based 
> on the order of write and page-table update. Is that the argument?

Sorry, when I say "no sync" I mean no cross-CPU synchronization.  I'm assuming the underlying sequence of events is:

allocate physical pages (jit_text_alloc)

write to them (with MOV, memcpy, whatever), via the direct map or via a temporary mm

do an appropriate *local* barrier (which, on x86, is probably implied by TSO, as the subsequent pagetable change is at least a release; also, any any previous temporary mm stuff would have done MOV CR3 afterwards, which is a full "serializing" barrier)

optionally zap the direct map via IPI, assuming the pages are direct mapped (but this could be avoided with a smart enough allocator and temporary_mm above)

install the final RX PTE (jit_text_map), which does a MOV or maybe a LOCK CMPXCHG16B.  Note that the virtual address in question was not readable or executable before this, and all CPUs have serialized since the last time it was executable.

either jump to the new text locally, or:

1. Do a store-release to tell other CPUs that the text is mapped
2. Other CPU does a load-acquire to detect that the text is mapped and jumps to the text

This is all approximately the same thing that plain old mmap(..., PROT_EXEC, ...) does.

>
> On this regard, one thing that I clearly do not understand is why 
> *today* it is ok for users of bpf_arch_text_copy() not to call 
> text_poke_sync(). Am I missing something?

I cannot explain this, because I suspect the current code is wrong.  But it's only wrong across CPUs, because bpf_arch_text_copy goes through text_poke_copy, which calls unuse_temporary_mm(), which is serializing.  And it's plausible that most eBPF use cases don't actually cause the loaded program to get used on a different CPU without first serializing on the CPU that ends up using it.  (Context switches and interrupts are serializing.)

FRED could make interrupts non-serializing. I sincerely hope that FRED doesn't cause this all to fall apart.

--Andy

  reply	other threads:[~2023-06-20 17:25 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-16  8:50 [PATCH v2 00/12] mm: jit/text allocator Mike Rapoport
2023-06-16  8:50 ` [PATCH v2 01/12] nios2: define virtual address space for modules Mike Rapoport
2023-06-16 16:00   ` Edgecombe, Rick P
2023-06-17  5:52     ` Mike Rapoport
2023-06-16 18:14   ` Song Liu
2023-06-16  8:50 ` [PATCH v2 02/12] mm: introduce execmem_text_alloc() and jit_text_alloc() Mike Rapoport
2023-06-16 16:48   ` Kent Overstreet
2023-06-16 18:18     ` Song Liu
2023-06-17  5:57     ` Mike Rapoport
2023-06-17 20:38   ` Andy Lutomirski
2023-06-18  8:00     ` Mike Rapoport
2023-06-19 17:09       ` Andy Lutomirski
2023-06-19 20:18         ` Nadav Amit
2023-06-20 17:24           ` Andy Lutomirski [this message]
2023-06-25 16:14         ` Mike Rapoport
2023-06-25 16:59           ` Andy Lutomirski
2023-06-25 17:42             ` Mike Rapoport
2023-06-25 18:07               ` Kent Overstreet
2023-06-26  6:13                 ` Song Liu
2023-06-26  9:54                   ` Puranjay Mohan
2023-06-26 12:23                     ` Mark Rutland
2023-06-26 12:31           ` Mark Rutland
2023-06-26 17:48             ` Song Liu
2023-07-17 17:23               ` Andy Lutomirski
2023-06-26 13:01         ` Mark Rutland
2023-06-19 11:34     ` Kent Overstreet
2023-06-16  8:50 ` [PATCH v2 03/12] mm/execmem, arch: convert simple overrides of module_alloc to execmem Mike Rapoport
2023-06-16 18:36   ` Song Liu
2023-06-16  8:50 ` [PATCH v2 04/12] mm/execmem, arch: convert remaining " Mike Rapoport
2023-06-16 16:16   ` Edgecombe, Rick P
2023-06-17  6:10     ` Mike Rapoport
2023-06-16 18:53   ` Song Liu
2023-06-17  6:14     ` Mike Rapoport
2023-06-16  8:50 ` [PATCH v2 05/12] modules, execmem: drop module_alloc Mike Rapoport
2023-06-16 18:56   ` Song Liu
2023-06-16  8:50 ` [PATCH v2 06/12] mm/execmem: introduce execmem_data_alloc() Mike Rapoport
2023-06-16 16:55   ` Edgecombe, Rick P
2023-06-17  6:44     ` Mike Rapoport
2023-06-16 20:01   ` Song Liu
2023-06-17  6:51     ` Mike Rapoport
2023-06-18 22:32   ` Thomas Gleixner
2023-06-18 23:14     ` Kent Overstreet
2023-06-19  0:43       ` Thomas Gleixner
2023-06-19  2:12         ` Kent Overstreet
2023-06-20 14:51         ` Steven Rostedt
2023-06-20 15:32           ` Alexei Starovoitov
2023-06-19 15:23     ` Mike Rapoport
2023-06-16  8:50 ` [PATCH v2 07/12] arm64, execmem: extend execmem_params for generated code definitions Mike Rapoport
2023-06-16 20:05   ` Song Liu
2023-06-17  6:57     ` Mike Rapoport
2023-06-17 15:36       ` Kent Overstreet
2023-06-17 16:38         ` Song Liu
2023-06-17 20:37           ` Kent Overstreet
2023-06-16  8:50 ` [PATCH v2 08/12] riscv: extend execmem_params for kprobes allocations Mike Rapoport
2023-06-16 20:09   ` Song Liu
2023-06-16  8:50 ` [PATCH v2 09/12] powerpc: " Mike Rapoport
2023-06-16 20:09   ` Song Liu
2023-06-16  8:50 ` [PATCH v2 10/12] arch: make execmem setup available regardless of CONFIG_MODULES Mike Rapoport
2023-06-16 20:17   ` Song Liu
2023-06-16  8:50 ` [PATCH v2 11/12] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES Mike Rapoport
2023-06-16 20:18   ` Song Liu
2023-06-16  8:50 ` [PATCH v2 12/12] kprobes: remove dependcy on CONFIG_MODULES Mike Rapoport
2023-06-16 11:44   ` Björn Töpel
2023-06-17  6:52     ` Mike Rapoport
2023-06-16 17:02 ` [PATCH v2 00/12] mm: jit/text allocator Edgecombe, Rick P

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6145cabf-d016-4dba-b5d2-0fb793352058@app.fastmail.com \
    --to=luto@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bpf@vger.kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=chenhuacai@kernel.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=davem@davemloft.net \
    --cc=deller@gmx.de \
    --cc=dinguyen@kernel.org \
    --cc=hca@linux.ibm.com \
    --cc=keescook@chromium.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=loongarch@lists.linux.dev \
    --cc=mark.rutland@arm.com \
    --cc=mcgrof@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=nadav.amit@gmail.com \
    --cc=naveen.n.rao@linux.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=palmer@dabbelt.com \
    --cc=puranjay12@gmail.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rostedt@goodmis.org \
    --cc=rppt@kernel.org \
    --cc=song@kernel.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tsbogend@alpha.franken.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).