All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Brezillon <boris.brezillon@collabora.com>
To: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: Rob Herring <robh+dt@kernel.org>,
	kernel@collabora.com, dri-devel@lists.freedesktop.org,
	Neil Armstrong <narmstrong@baylibre.com>
Subject: Re: [PATCH 2/3] drm/panfrost: Expose HW counters to userspace
Date: Thu, 4 Apr 2019 20:17:17 +0200	[thread overview]
Message-ID: <20190404201717.78d6cf47@collabora.com> (raw)
In-Reply-To: <20190404154129.GA2999@rosenzweig.io>

On Thu, 4 Apr 2019 08:41:29 -0700
Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:


> > +/*
> > + * Returns true if the 2 jobs have exactly the same perfcnt context, false
> > + * otherwise.
> > + */
> > +static bool panfrost_perfcnt_job_ctx_cmp(struct panfrost_perfcnt_job_ctx *a,
> > +					 struct panfrost_perfcnt_job_ctx *b)
> > +{
> > +	unsigned int i, j;
> > +
> > +	if (a->perfmon_count != b->perfmon_count)
> > +		return false;
> > +
> > +	for (i = 0; i < a->perfmon_count; i++) {
> > +		for (j = 0; j < b->perfmon_count; j++) {
> > +			if (a->perfmons[i] == b->perfmons[j])
> > +				break;
> > +		}
> > +
> > +		if (j == b->perfmon_count)
> > +			return false;
> > +	}
> > +  
> 
> Would using memcmp() be cleaner here?

memcmp() does not account for the case where 2 jobs contain exactly the
same perfmons but in a different order. This being said, it's rather
unlikely to happen, so maybe we can accept the perf penalty for that
case.

> 
> > +	if (panfrost_model_cmp(pfdev, 0x1000) >= 0)  
> 
> What does 0x1000 refer to here? I'm assuming maybe Bifrost, but it's not
> obvious... probably better to have a #define somewhere and use that (or
> an enum equivalently).

Yes, all numbers above 0xfff are bifrost GPUs. I'll add a macro.

> 
> > +	/*
> > +	 * Due to PRLAM-8186 we need to disable the Tiler before we enable HW
> > +	 * counters.
> > +	 */  
> 
> What on earth is PRLAM-8186? :)
> 
> Actually, wait, I can answer that -- old kbase versions had an errata
> list:
> 
>         /* Write of PRFCNT_CONFIG_MODE_MANUAL to PRFCNT_CONFIG causes a instrumentation dump if
>            PRFCNT_TILER_EN is enabled */
>         BASE_HW_ISSUE_8186,
> 
> So that's why. If people want, I'm considering moving these errata
> descriptions back into the kernel where possible, since otherwise code
> like this is opaque.

Will copy the errata.

> 
> > +		unsigned int nl2c, ncores;
> > +
> > +		/*
> > +		 * TODO: define a macro to extract the number of l2 caches from
> > +		 * mem_features.
> > +		 */
> > +		nl2c = ((pfdev->features.mem_features >> 8) & GENMASK(3, 0)) + 1;
> > +
> > +		/*
> > +		 * The ARM driver is grouping cores per core group and then
> > +		 * only using the number of cores in group 0 to calculate the
> > +		 * size. Not sure why this is done like that, but I guess
> > +		 * shader_present will only show cores in the first group
> > +		 * anyway.
> > +		 */
> > +		ncores = hweight64(pfdev->features.shader_present);
> > +  
> 
> Deja vu. Was this copypaste dmaybe?

Actually, that one is from me, hence the 'not sure why' part :).

> 
> > +		  (panfrost_model_cmp(pfdev, 0x1000) >= 0 ?  
> 
> THere's that pesky 0x1000 again.
> 
> > @@ -55,6 +63,15 @@ struct drm_panfrost_submit {
> >  
> >  	/** A combination of PANFROST_JD_REQ_* */
> >  	__u32 requirements;
> > +
> > +	/** Pointer to a u32 array of perfmons that should be attached to the job. */
> > +	__u64 perfmon_handles;
> > +
> > +	/** Number of perfmon handles passed in (size is that times 4). */
> > +	__u32 perfmon_handle_count;
> > +
> > +	/** Unused field, should be set to 0. */
> > +	__u32 padding;  
> 
> Bleep blorp. If we're modifying _submit, we'll need to be swift about
> merging this ahead of the main code to make sure we don't break the
> UABI. Although I guess if we're just adding fields at the end, that's a
> nonissue.

Others are using the same "if data passed is smaller than expected
size, unassigned fields are zeroed". That allows us to extend a struct
without breaking the ABI as long as zero is a valid value and does not
change the behavior compared to when the field was not present.

This is the case here: perfmon_handle_count = 0 means no perfmon
attached to the job, so the driver is acting like it previously was.

No need to get that part merged in the initial patch series IMO.

> 
> > +struct drm_panfrost_block_perfcounters {
> > +	/*
> > +	 * For DRM_IOCTL_PANFROST_GET_PERFCNT_LAYOUT, encodes the available
> > +	 * instances for a specific given block type.
> > +	 * For DRM_IOCTL_PANFROST_CREATE_PERFMON, encodes the instances the
> > +	 * user wants to monitor.
> > +	 * Note: the bitmap might be sparse.
> > +	 */
> > +	__u64 instances;
> > +
> > +	/*
> > +	 * For DRM_IOCTL_PANFROST_GET_PERFCNT_LAYOUT, encodes the available
> > +	 * counters attached to a specific block type.
> > +	 * For DRM_IOCTL_PANFROST_CREATE_PERFMON, encodes the counters the user
> > +	 * wants to monitor.
> > +	 * Note: the bitmap might be sparse.
> > +	 */
> > +	__u64 counters;
> > +};  
> 
> I don't understand this. Aren't there more than 64 counters?
> 
> > +struct drm_panfrost_get_perfcnt_layout {
> > +	struct drm_panfrost_block_perfcounters counters[PANFROST_NUM_BLOCKS];
> > +};  
> 
> --Oh. It's per-block. Got it.
> 
> > + * Used to create a performance monitor. Each perfmonance monitor is assigned an  
> 
> Typo.

Will fix.

> 
> ---
> 
> Overall, this looks really great! Thank you! :)

Thanks a lot for your reviews. That was pretty damn fast!
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2019-04-04 18:17 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-04 15:20 [PATCH 0/3] drm/panfrost: Expose HW counters to userspace Boris Brezillon
2019-04-04 15:20 ` [PATCH 1/3] drm/panfrost: Move gpu_{write, read}() macros to panfrost_regs.h Boris Brezillon
2019-04-04 15:20 ` [PATCH 2/3] drm/panfrost: Expose HW counters to userspace Boris Brezillon
2019-04-04 15:41   ` Alyssa Rosenzweig
2019-04-04 18:17     ` Boris Brezillon [this message]
2019-04-04 22:40       ` Alyssa Rosenzweig
2019-04-05 15:36     ` Eric Anholt
2019-04-05 16:17       ` Alyssa Rosenzweig
2019-04-04 15:20 ` [PATCH 3/3] panfrost/drm: Define T860 perf counters Boris Brezillon
2019-04-05 15:20 ` [PATCH 0/3] drm/panfrost: Expose HW counters to userspace Steven Price
2019-04-05 16:33   ` Alyssa Rosenzweig
2019-04-05 17:40     ` Boris Brezillon
2019-04-05 17:43       ` Alyssa Rosenzweig
2019-04-30 12:42   ` Boris Brezillon
2019-04-30 13:10     ` Rob Clark
2019-04-30 15:49       ` Jordan Crouse
2019-05-12 13:40         ` Boris Brezillon
2019-05-13 15:00           ` Jordan Crouse
2019-05-01 17:12     ` Eric Anholt
2019-05-12 13:17       ` Boris Brezillon
2019-05-11 22:32     ` Alyssa Rosenzweig
2019-05-12 13:38       ` Boris Brezillon
2019-05-13 12:48         ` Steven Price
2019-05-13 13:39           ` Boris Brezillon
2019-05-13 14:13             ` Steven Price
2019-05-13 14:56             ` Alyssa Rosenzweig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190404201717.78d6cf47@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=alyssa@rosenzweig.io \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kernel@collabora.com \
    --cc=narmstrong@baylibre.com \
    --cc=robh+dt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.