From: Vincent Donnefort <vdonnefort@google.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: mhiramat@kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v2 1/2] ring-buffer: Introducing ring-buffer mapping functions
Date: Thu, 30 Mar 2023 11:30:51 +0100
Message-ID: <ZCVk26InuXhy+Lmg@google.com>
In-Reply-To: <20230329113234.3285209c@gandalf.local.home>

On Wed, Mar 29, 2023 at 11:32:34AM -0400, Steven Rostedt wrote:
> On Wed, 29 Mar 2023 14:55:41 +0100
> Vincent Donnefort <vdonnefort@google.com> wrote:
>
> > > Yes, in fact it shouldn't need to call the ioctl until after it read it.
> > >
> > > Maybe, we should have the ioctl take a parameter of how much was read?
> > > To prevent races?
> >
> > Races would only be with other consuming readers. In that case we'd probably
> > have many other problems anyway as I suppose nothing would prevent another one
> > of swapping the page while our userspace reader is still processing it?
>
> I'm not worried about user space readers. I'm worried about writers, as
> the ioctl will update the reader_page->read = reader_page->commit. The time
> that the reader last read and stopped and then called the ioctl, a writer
> could fill the page, then the ioctl may even swap the page. By passing in
> the read amount, the ioctl will know if it needs to keep the same page or
> not.

How about?

userspace:

	prev_read = meta->read;
	ioctl(TRACE_MMAP_IOCTL_GET_READER_PAGE)

kernel:

	ring_buffer_get_reader_page()
		rb_get_reader_page(cpu_buffer);
		cpu_buffer->reader_page->read = rb_page_size(reader);
		meta->read = cpu_buffer->reader_page->read;

userspace:

	/* if new page prev_read = 0 */
	/* read between prev_read and meta->read */

If the writer does anything in-between, wouldn't rb_get_reader_page() handle it
nicely by returning the same reader as more would be there to read?

It is similar to rb_advance_reader() except we'd be moving several events at
once?
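
To make the flow above concrete, here is a minimal single-threaded C model of
it. It is only a sketch under stated assumptions: every sketch_* name is
hypothetical and none of this is the kernel's code or the proposed ABI.
sketch_get_reader_page() stands in for the TRACE_MMAP_IOCTL_GET_READER_PAGE
ioctl, and noticing a new page by comparing a reader_id field in the meta-page
is an assumption, since the flow above does not say how userspace detects the
swap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 32

/* Hypothetical model of one buffer page (not the kernel's struct). */
struct sketch_page {
	size_t commit;	/* bytes committed by the writer */
	size_t read;	/* bytes already handed to the reader */
	char data[SKETCH_PAGE_SIZE];
};

/* Hypothetical model of the meta-page fields used here. */
struct sketch_meta {
	int reader_id;	/* assumption: which page is the reader page */
	size_t read;	/* mirrors reader_page->read */
};

struct sketch_buffer {
	struct sketch_page pages[2];
	struct sketch_meta meta;
};

/* Writer: append a string to the given page. */
static void sketch_write(struct sketch_buffer *b, int page, const char *s)
{
	struct sketch_page *p = &b->pages[page];
	size_t len = strlen(s);

	assert(p->commit + len <= SKETCH_PAGE_SIZE);
	memcpy(p->data + p->commit, s, len);
	p->commit += len;
}

/* Stand-in for the ioctl: keep the page if it has unread data, else swap. */
static void sketch_get_reader_page(struct sketch_buffer *b)
{
	struct sketch_page *reader = &b->pages[b->meta.reader_id];

	if (reader->read >= reader->commit) {
		struct sketch_page *next = &b->pages[!b->meta.reader_id];

		if (next->commit > 0) {
			/* Recycle the old reader page and take the other one. */
			reader->commit = reader->read = 0;
			b->meta.reader_id = !b->meta.reader_id;
			reader = next;
		}
	}
	/* Everything committed so far is now considered read. */
	reader->read = reader->commit;
	b->meta.read = reader->read;
}

/* Userspace: copy the bytes between prev_read and meta->read into out. */
static size_t sketch_consume(struct sketch_buffer *b, char *out, size_t out_len)
{
	int prev_id = b->meta.reader_id;
	size_t prev_read = b->meta.read;
	size_t len;

	sketch_get_reader_page(b);

	if (b->meta.reader_id != prev_id)
		prev_read = 0;	/* new page: start from its beginning */

	len = b->meta.read - prev_read;
	assert(len <= out_len);
	memcpy(out, b->pages[b->meta.reader_id].data + prev_read, len);
	return len;
}
```

In this model, a write that lands on the reader page between two calls to
sketch_consume() is simply picked up by the next call, which stays on the same
page and returns only the new bytes.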
>
> >
> > I don't know if this is worth splitting the ABI between the meta-page and the
> > ioctl parameters for this?
> >
> > Or maybe we should say the meta-page contains things modified by the writer and
> > parameters modified by the reader are passed by the get_reader_page ioctl i.e.
> > the reader page ID and cpu_buffer->reader_page->read? (for the hyp tracing, we
> > have up to 4 registers for the HVC which would replace in our case the ioctl)
>
> I don't think we need the reader_page id, as that should never move without
> reader involvement. If there's more than one reader, that's up to the
> readers to keep track of each other, not the kernel.
>
> Which BTW, the more I look at doing this without ioctls, I think we may
> need to update things slightly different.
>
> I would keep the current approach, but for clarification of terminology, we
> have:
>
> meta_data - the data that holds information that is shared between user and
>             kernel space.
>
> data_pages - this is a separate mapping that holds the mapped ring buffer
>              pages. In user space, this is one contiguous array and also holds
>              the reader page.
>
> data_index - This is an array of what the writer sees. It maps the index
>              into data_pages[] of where to find the mapped pages. It does not
>              contain the reader page. We currently map this with the meta_data,
>              but that's not a requirement (although we may continue to do so).
>
> I'm thinking that we make the data_index[] elements into a structure:
>
> struct trace_map_data_index {
> 	int idx;	/* index into data_pages[] */
> 	int cnt;	/* counter updated by writer */
> };
>
> The cnt is initialized to zero when initially mapped.
>
> Instead of having the bpage->id = index into data_pages[], have it equal
> the index into data_index[].
>
> The cpu_buffer->reader_page->id = -1;
>
> meta_data->reader_page = index into data_pages[] of reader page
>
> The swapping of the header page would look something like this:
>
> static inline void
> rb_meta_page_head_swap(struct ring_buffer_per_cpu *cpu_buffer)
> {
> 	struct ring_buffer_meta_page *meta = cpu_buffer->meta_page;
> 	int head_page;
>
> 	if (!READ_ONCE(cpu_buffer->mapped))
> 		return;
>
> 	head_page = meta->data_pages[meta->hdr.data_page_head];
> 	meta->data_pages[meta->hdr.data_page_head] = meta->hdr.reader_page;
> 	meta->hdr.reader_page = head_page;
> 	meta->data_pages[head_page]->id = -1;
> }
>
> As hdr.data_page_head would be an index into data_index[] and not
> data_pages[].
>
> The fact that bpage->id points to the data_index[] and not the data_pages[]
> means that the writer can easily get to that index, and modify the count.
> That way, in rb_tail_page_update() (between cmpxchgs) we can do something
> like:
>
> 	if (cpu_buffer->mapped) {
> 		meta = cpu_buffer->meta_page;
> 		meta->data_index[next_page->id].cnt++;
> 	}
>
> And this will allow the reader to know if the current page it is on just
> got overwritten by the writer, by doing:
>
> 	prev_id = meta->data_index[this_page].cnt;
> 	smp_rmb();
> 	read event (copy it, whatever)
> 	smp_rmb();
> 	if (prev_id != meta->data_index[this_page].cnt)
> 		/* read data may be corrupted, abort it */

Couldn't the reader just check the page commit field? rb_iter_head_event()
does something like this to check if the writer is on its page.
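
For comparison, the counter check quoted above could look roughly like this on
the userspace side. This is a sketch only, written with C11 atomics: the
sketch_* names are hypothetical, the acquire fences are an assumed userspace
approximation of smp_rmb(), and the struct just mirrors the idx/cnt pair
proposed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace view of one data_index[] entry. */
struct sketch_data_index {
	int idx;	/* index into data_pages[] */
	atomic_int cnt;	/* bumped by the writer when it moves onto the page */
};

/* Sample the counter before reading an event from the page. */
static int sketch_read_begin(struct sketch_data_index *e)
{
	int cnt = atomic_load_explicit(&e->cnt, memory_order_relaxed);

	/* Assumed stand-in for the first smp_rmb(). */
	atomic_thread_fence(memory_order_acquire);
	return cnt;
}

/* Return true if the writer moved onto the page since sketch_read_begin(). */
static bool sketch_read_retry(struct sketch_data_index *e, int start)
{
	/* Assumed stand-in for the second smp_rmb(). */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&e->cnt, memory_order_relaxed) != start;
}

/* Writer side of the model: what the cnt++ in rb_tail_page_update() would do. */
static void sketch_writer_enters_page(struct sketch_data_index *e)
{
	atomic_fetch_add_explicit(&e->cnt, 1, memory_order_release);
}
```

A reader would call sketch_read_begin(), copy the event, then drop the copy if
sketch_read_retry() returns true. A check on the page commit field would have
the same begin/retry shape, with the commit value sampled instead of cnt.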
>
>
> Does this make sense?
>
> -- Steve