linux-trace-kernel.vger.kernel.org archive mirror
From: Kees Cook <kees@kernel.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vincent Donnefort <vdonnefort@google.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Tony Luck <tony.luck@intel.com>,
	"Guilherme G. Piccoli" <gpiccoli@igalia.com>,
	linux-hardening@vger.kernel.org
Subject: Re: [PATCH v2 1/2] tracing: ring-buffer: Have the ring buffer code do the vmap of physical memory
Date: Thu, 3 Apr 2025 09:45:48 -0700
Message-ID: <202504030941.E0AA2E023@keescook>
In-Reply-To: <20250331133906.48e115f5@gandalf.local.home>

On Mon, Mar 31, 2025 at 01:39:06PM -0400, Steven Rostedt wrote:
> On Mon, 31 Mar 2025 09:55:28 -0700
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > Anyway, that takes care of the horrific interface. However, there's
> > another issue:
> > 
> > > +       pages = kmalloc_array(page_count, sizeof(struct page *), GFP_KERNEL);  
> > 
> > you create this pointless array of pages. Why? It's a physically
> > contiguous area.
> > 
> > You do that just because you want to use vmap() to map that contiguous
> > area one page at a time.
> > 
> > But this is NOT a new thing. It's exactly what every single PCI device
> > with a random physical memory region BAR needs to do. And no, they
> > don't create arrays of 'struct page *', because they use memory that
> > doesn't even have page backing.
> > 
> > So we actually have interfaces to do linear virtual mappings of
> > physical pages that *predate* vmap(), and do the right thing without
> > any of these games.
> 
> [ Added the pstore folks ]
> 
> OK, so I did copy this from fs/pstore/ram_core.c as this does basically the
> same thing as pstore. And it looks like pstore should be updated too.

I think we're talking about persistent_ram_vmap()? That code predates my
maintainership, but I'm happy to update it to use better APIs.

> > Yes, the legacy versions of interfaces are all for IO memory, but we
> > do have things like vmap_page_range() which should JustWork(tm).
> > 
> > Yeah, you'll need to do something like
> > 
> >         unsigned long vmap_start, vmap_end;
> > 
> >         area = get_vm_area(size, VM_IOREMAP);
> >         if (!area)
> >                 return NULL;
> > 
> >         vmap_start = (unsigned long) area->addr;
> >         vmap_end = vmap_start + size;
> > 
> >         ret = vmap_page_range(vmap_start, vmap_end,
> >                 *start, prot_nx(PAGE_KERNEL));
> > 
> >         if (ret < 0) {
> >                 free_vm_area(area);
> >                 return NULL;
> >         }
> > 
> > and the above is *entirely* untested and maybe there's something wrong
> > there, but the concept should work, and when you don't do it a page at
> > a time, you not only don't need the kmalloc_array(), it should even do
> > things like be able to use large page mappings if the alignment and
> > size work out.
> > 
> > That said, the old code is *really* broken to begin with. I don't
> > understand why you want to vmap() a contiguous physical range. Either
> > it's real pages to begin with, and you can just use "page_address()"
> > to get a virtual address, or it's *not* real pages, and doing
> > "pfn_to_page()" is actively wrong, because it creates a fake 'struct
> > page *' pointer that isn't valid.
> > 
> > Is this all just for some disgusting HIGHMEM use (in which case you
> > need the virtual mapping because of HIGHMEM)? Is there any reason to
> > support HIGHMEM in this area at all?
> > 
> > So I'm not sure why this code does all this horror in the first place.
> > Either it's all just confused code that just didn't know what it was
> > doing and just happened to work (very possible..) or there is
> > something odd going on.

pstore tries to work with either real RAM or with iomem things. What
is there now Currently Works Fine, but should this be using
vmap_page_range()?
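
To put rough numbers on the page-array point quoted above, here is a
small userspace C sketch. It is not kernel code and does not touch any
real mapping path. All three helper names are hypothetical. The 8-byte
pointer size and the 2 MiB large-mapping size in the usage note are
assumptions, and the alignment check is a simplification of whatever
the kernel actually requires:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages of size page_size that cover the
 * byte range [start, start + size). page_size must be a power of two.
 */
uint64_t span_page_count(uint64_t start, uint64_t size, uint64_t page_size)
{
	uint64_t first, last;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	if (size == 0)
		return 0;

	/* Round the first and last byte down to their page boundaries. */
	first = start & ~(page_size - 1);
	last = (start + size - 1) & ~(page_size - 1);
	return (last - first) / page_size + 1;
}

/*
 * Hypothetical helper: bytes needed for a temporary array holding one
 * pointer per page, as the page-at-a-time approach requires.
 */
uint64_t page_array_bytes(uint64_t page_count, uint64_t ptr_size)
{
	return page_count * ptr_size;
}

/*
 * Hypothetical, simplified check: a range can only be a candidate for
 * large mappings if both its start and its size are multiples of the
 * large mapping size. Real kernels apply further conditions.
 */
int range_fits_large_mapping(uint64_t start, uint64_t size, uint64_t large_size)
{
	assert(large_size != 0);
	return size != 0 && start % large_size == 0 && size % large_size == 0;
}
```

For example, a 16 MiB region with 4 KiB pages spans 4096 pages, so the
per-page approach allocates a 32 KiB pointer array only to describe
memory that a start address and a size already describe. A range that
starts on a 2 MiB boundary with a size that is a multiple of 2 MiB
passes the simplified alignment check, which is the case where mapping
the range in one call could use large mappings.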

-- 
Kees Cook
