From: Peter Zijlstra <peterz@infradead.org>
To: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Mike Galbraith <efault@gmx.de>, Paul Mackerras <paulus@samba.org>,
	Stephane Eranian <eranian@google.com>,
	Andi Kleen <ak@linux.intel.com>
Subject: Re: [RFC 2/2] perf: add AUX area to ring buffer for raw data streams
Date: Mon, 19 May 2014 10:58:54 +0200
Message-ID: <20140519085854.GW30445@twins.programming.kicks-ass.net>
In-Reply-To: <1400166510-9234-3-git-send-email-alexander.shishkin@linux.intel.com>


On Thu, May 15, 2014 at 06:08:30PM +0300, Alexander Shishkin wrote:
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -252,6 +252,19 @@ struct pmu {
>  	 * flush branch stack on context-switches (needed in cpu-wide mode)
>  	 */
>  	void (*flush_branch_stack)	(void);
> +
> +	/*
> +	 * Allocate AUX space buffer: return an array of @nr_pages pages to be
> +	 * mapped to userspace that will also be passed to ->free_aux.
> +	 */
> +	void *(*alloc_aux)		(int cpu, int nr_pages, bool overwrite,
> +					 struct perf_event_mmap_page *user_page);
> +					/* optional */
> +
> +	/*
> +	 * Free AUX buffer
> +	 */
> +	void (*free_aux)		(void *aux); /* optional */
>  };

I'm not entirely thrilled to expose it to the PMU like this. I realize
you want this in order to get physically contiguous pages.

Are you aware of allocation constraints for other architectures?

>  #define PERF_RECORD_MISC_CPUMODE_MASK		(7 << 0)
> @@ -710,6 +726,18 @@ enum perf_event_type {
>  	 */
>  	PERF_RECORD_MMAP2			= 10,
>  
> +	/*
> +	 * Records that new data landed in the AUX buffer part.
> +	 *
> +	 * struct {
> +	 * 	struct perf_event_header	header;
> +	 *
> +	 * 	u64				aux_offset;
> +	 * 	u64				aux_size;
> +	 * };
> +	 */
> +	PERF_RECORD_AUX				= 11,
> +
>  	PERF_RECORD_MAX,			/* non-ABI */
>  };

Ideally the patch introducing this would also introduce code to generate
these records.

> @@ -4076,7 +4090,63 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
>  		return -EINVAL;
>  
>  	vma_size = vma->vm_end - vma->vm_start;
> -	nr_pages = (vma_size / PAGE_SIZE) - 1;
> +
> +	if (vma->vm_pgoff == 0) {
> +		nr_pages = (vma_size / PAGE_SIZE) - 1;
> +	} else {
> +		/*
> +		 * AUX area mapping: if rb->aux_nr_pages != 0, it's already
> +		 * mapped, all subsequent mappings should have the same size
> +		 * and offset. Must be above the normal perf buffer.
> +		 */
> +		u64 aux_offset, aux_size;
> +
> +		if (!event->rb)
> +			return -EINVAL;
> +
> +		nr_pages = vma_size / PAGE_SIZE;
> +
> +		mutex_lock(&event->mmap_mutex);
> +		ret = -EINVAL;
> +
> +		rb = event->rb;
> +		if (!rb)
> +			goto aux_unlock;
> +
> +		aux_offset = ACCESS_ONCE(rb->user_page->aux_offset);
> +		aux_size = ACCESS_ONCE(rb->user_page->aux_size);
> +
> +		if (aux_offset < perf_data_size(rb) + PAGE_SIZE)
> +			goto aux_unlock;
> +
> +		if (aux_offset != vma->vm_pgoff << PAGE_SHIFT)
> +			goto aux_unlock;
> +
> +		/* already mapped with a different offset */
> +		if (rb_has_aux(rb) && rb->aux_pgoff != vma->vm_pgoff)
> +			goto aux_unlock;
> +
> +		if (aux_size != vma_size || aux_size != nr_pages * PAGE_SIZE)
> +			goto aux_unlock;
> +
> +		/* already mapped with a different size */
> +		if (rb_has_aux(rb) && rb->aux_nr_pages != nr_pages)
> +			goto aux_unlock;
> +
> +		if (!atomic_inc_not_zero(&rb->mmap_count))
> +			goto aux_unlock;
> +
> +		if (rb_has_aux(rb)) {
> +			atomic_inc(&rb->aux_mmap_count);
> +			ret = 0;
> +			goto unlock;
> +		}
> +
> +		atomic_set(&rb->aux_mmap_count, 1);
> +		user_extra = nr_pages;
> +
> +		goto accounting;
> +	}

That appears to be missing an is_power_of_2(aux_size) check.

The problem with not having that is that
perf_event_mmap_page::aux_{head,tail} live in Z mod 2^64 while your
actual {head,tail} live in Z mod aux_size, so aux_size needs to be an
exact divisor of 2^64; otherwise you get wrapping issues when the 64-bit
counters overflow.

Having them all be 2^n makes the division trivial (a simple mask).
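
To make the wrap problem concrete, here is a minimal userspace sketch
(not kernel code, and not part of the patch). The helper names
size_is_power_of_2(), aux_buf_offset() and wraps_cleanly() are
hypothetical, chosen only for illustration; the first one mirrors what
the kernel's is_power_of_2() computes. It checks whether the buffer
offset stays continuous when a free-running 64-bit position steps from
2^64 - 1 to 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's is_power_of_2() helper. */
static bool size_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: map a free-running 64-bit position onto an
 * offset inside a buffer of aux_size bytes. For a power-of-2 size the
 * modulo reduces to a mask.
 */
static uint64_t aux_buf_offset(uint64_t pos, uint64_t aux_size)
{
	if (size_is_power_of_2(aux_size))
		return pos & (aux_size - 1);
	return pos % aux_size;
}

/*
 * True if the offset advances by exactly one slot when the 64-bit
 * position overflows from UINT64_MAX to 0. This holds only when
 * aux_size divides 2^64 exactly, i.e. when it is a power of 2.
 */
static bool wraps_cleanly(uint64_t aux_size)
{
	uint64_t before = aux_buf_offset(UINT64_MAX, aux_size);
	uint64_t after = aux_buf_offset(0, aux_size); /* UINT64_MAX + 1 */

	return after == (before + 1) % aux_size;
}
```

With aux_size = 4096 the offset goes 4095 -> 0 across the overflow, as
expected. With aux_size = 3 * 4096 the offset before the overflow is
4095, so the next one should be 4096, but the wrapped position yields 0:
the head jumps backwards inside the buffer.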

