[PATCH 0/2] ring-buffer: Allow persistent memory to be user space mmapped

From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vincent Donnefort <vdonnefort@google.com>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: [PATCH 0/2] ring-buffer: Allow persistent memory to be user space mmapped
Date: Fri, 28 Mar 2025 18:08:36 -0400
Message-ID: <20250328220836.812222422@goodmis.org>

Linus,

This is an update to the code that we discussed for making the persistent
ring buffer work with user space memory mapping. I based it on top of
the second version of the pull request I just sent out.

Note, I'm not suggesting this is to go into this merge window. I'm
happy to wait until the next window.

The first patch moves the memory mapping of the physical memory returned
by reserve_mem from the tracing code to the ring buffer code. This makes
sense as it gives the ring buffer more control in knowing exactly
how the pages were created. It keeps track of where the physical memory
was mapped and also handles the freeing of this memory (relieving the
tracing code of having to do this). It also handles
knowing if the buffer may be memory mapped or not. The check is removed
from the tracing code, but if the tracing code tries to memory map the
persistent ring buffer, the call to the ring buffer code will fail with
the same error as before.

The second patch implements the user space memory mapping of the persistent
ring buffer. It does so by adding several helper functions to annotate
what the code is doing. By doing this, I also discovered that the "hack"
you did not like was not needed for the meta page. There are two meta pages
here. One is mapped between the kernel and user space and is used to inform
user space of updates to the ring buffer. The other is inside the persistent
memory and is used to pass information across boots. The persistent memory
meta data is never exposed to user space. The meta data for user space
mapping is always allocated via the normal memory allocator.

The helper functions are:

 rb_struct_page() - This is the rb_get_page() from our discussions, but
                    I renamed it because "get" implies "put".
                    This function returns the struct page for a given
                    buffer page. If the page was allocated via the normal
                    memory allocator, it uses virt_to_page(). Otherwise it
                    uses the saved physical and virtual address of the
                    mapped location to calculate the physical address from
                    the virtual address of the page, and pfn_to_page() is
                    then used on the resulting page frame.

  rb_flush_buffer_page() - This calls the above rb_struct_page() and then
                    calls flush_dcache_folio() to make sure the kernel
                    and user space are coherent.

  rb_flush_meta() - This just uses virt_to_page() and calls flush_dcache_folio()
                    as it is always allocated by the normal memory allocator.
                    I created it just to be consistent.

  rb_page_id() - The mappings require knowing where they are mapped.
                 As the normally allocated pages may exist anywhere from
                 the kernel's point of view, they need to be labelled to
                 know where they are mapped in user space. The bpage->id
                 is used for this. But for the persistent memory, that
                 bpage->id is already used for knowing the order of the
                 pages that are still active in the write part of the
                 buffer, which means the ids are not consecutive. For
                 the user space mapping, the index of where the pages exist
                 in the physical memory is used for the placement in user
                 space. This helper function handles the difference
                 between how the ids are used.
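
To make the address arithmetic described for rb_struct_page() and
rb_page_id() concrete, here is a minimal user space model. It is only a
sketch and not the code from the patches: struct model_range, the model_*
function names and the 4 KiB page size are all assumptions made for
illustration. It shows how a saved virtual and physical base can turn a
virtual address into a page frame number (what the kernel would hand to
pfn_to_page()) and into an index usable for placement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12                     /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the saved location of the mapped range. */
struct model_range {
	uintptr_t virt_base;    /* where the physical range was mapped */
	uint64_t  phys_base;    /* start of the reserved physical memory */
	size_t    nr_pages;     /* number of pages in the range */
};

/* Physical address = physical base + offset of vaddr within the mapping. */
static uint64_t model_virt_to_phys(const struct model_range *r, uintptr_t vaddr)
{
	assert(vaddr >= r->virt_base);
	assert(vaddr - r->virt_base < r->nr_pages * MODEL_PAGE_SIZE);
	return r->phys_base + (vaddr - r->virt_base);
}

/* Page frame number of the page that contains vaddr. */
static uint64_t model_virt_to_pfn(const struct model_range *r, uintptr_t vaddr)
{
	return model_virt_to_phys(r, vaddr) >> MODEL_PAGE_SHIFT;
}

/* Index of the page within the range, usable as a placement id. */
static size_t model_page_index(const struct model_range *r, uintptr_t vaddr)
{
	assert(vaddr >= r->virt_base);
	return (vaddr - r->virt_base) >> MODEL_PAGE_SHIFT;
}
```

With a range mapped at virtual 0x40000000 and backed by physical
0x100000000, an address 0x3010 bytes into the mapping lands in the fourth
page of the range (index 3). The real helpers may apply further
adjustments that this sketch does not model.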

I personally feel this version of the code is much cleaner and, with the
helper functions, much easier to follow. Doing this exercise also showed
that the test against virt_addr_valid() wasn't needed in every location
(it is no longer used here).

Steven Rostedt (2):
      tracing: ring-buffer: Have the ring buffer code do the vmap of physical memory
      ring-buffer: Allow persistent ring buffers to be mmapped

----
 include/linux/ring_buffer.h |  19 ++---
 kernel/trace/ring_buffer.c  | 180 +++++++++++++++++++++++++++++++++++++++-----
 kernel/trace/trace.c        |  65 ++++------------
 3 files changed, 186 insertions(+), 78 deletions(-)
