From: Boris Brezillon <boris.brezillon@collabora.com>
To: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: Rob Herring <robh+dt@kernel.org>,
kernel@collabora.com, dri-devel@lists.freedesktop.org,
Neil Armstrong <narmstrong@baylibre.com>
Subject: Re: [PATCH 2/3] drm/panfrost: Expose HW counters to userspace
Date: Thu, 4 Apr 2019 20:17:17 +0200 [thread overview]
Message-ID: <20190404201717.78d6cf47@collabora.com> (raw)
In-Reply-To: <20190404154129.GA2999@rosenzweig.io>
On Thu, 4 Apr 2019 08:41:29 -0700
Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:
> > +/*
> > + * Returns true if the 2 jobs have exactly the same perfcnt context, false
> > + * otherwise.
> > + */
> > +static bool panfrost_perfcnt_job_ctx_cmp(struct panfrost_perfcnt_job_ctx *a,
> > + struct panfrost_perfcnt_job_ctx *b)
> > +{
> > + unsigned int i, j;
> > +
> > + if (a->perfmon_count != b->perfmon_count)
> > + return false;
> > +
> > + for (i = 0; i < a->perfmon_count; i++) {
> > + for (j = 0; j < b->perfmon_count; j++) {
> > + if (a->perfmons[i] == b->perfmons[j])
> > + break;
> > + }
> > +
> > + if (j == b->perfmon_count)
> > + return false;
> > + }
> > +
>
> Would using memcmp() be cleaner here?
memcmp() does not account for the case where 2 jobs contain exactly the
same perfmons but in a different order. This being said, it's rather
unlikely to happen, so maybe we can accept the perf penalty for that
case.
>
> > + if (panfrost_model_cmp(pfdev, 0x1000) >= 0)
>
> What does 0x1000 refer to here? I'm assuming maybe Bifrost, but it's not
> obvious... probably better to have a #define somewhere and use that (or
> an enum equivalently).
Yes, all numbers above 0xfff are bifrost GPUs. I'll add a macro.
>
> > + /*
> > + * Due to PRLAM-8186 we need to disable the Tiler before we enable HW
> > + * counters.
> > + */
>
> What on earth is PRLAM-8186? :)
>
> Actually, wait, I can answer that -- old kbase versions had an errata
> list:
>
> /* Write of PRFCNT_CONFIG_MODE_MANUAL to PRFCNT_CONFIG causes a instrumentation dump if
> PRFCNT_TILER_EN is enabled */
> BASE_HW_ISSUE_8186,
>
> So that's why. If people want, I'm considering moving these errata
> descriptions back into the kernel where possible, since otherwise code
> like this is opaque.
Will copy the errata.
>
> > + unsigned int nl2c, ncores;
> > +
> > + /*
> > + * TODO: define a macro to extract the number of l2 caches from
> > + * mem_features.
> > + */
> > + nl2c = ((pfdev->features.mem_features >> 8) & GENMASK(3, 0)) + 1;
> > +
> > + /*
> > + * The ARM driver is grouping cores per core group and then
> > + * only using the number of cores in group 0 to calculate the
> > + * size. Not sure why this is done like that, but I guess
> > + * shader_present will only show cores in the first group
> > + * anyway.
> > + */
> > + ncores = hweight64(pfdev->features.shader_present);
> > +
>
> Deja vu. Was this copypaste dmaybe?
Actually, that one is from me, hence the 'not sure why' part :).
>
> > + (panfrost_model_cmp(pfdev, 0x1000) >= 0 ?
>
> THere's that pesky 0x1000 again.
>
> > @@ -55,6 +63,15 @@ struct drm_panfrost_submit {
> >
> > /** A combination of PANFROST_JD_REQ_* */
> > __u32 requirements;
> > +
> > + /** Pointer to a u32 array of perfmons that should be attached to the job. */
> > + __u64 perfmon_handles;
> > +
> > + /** Number of perfmon handles passed in (size is that times 4). */
> > + __u32 perfmon_handle_count;
> > +
> > + /** Unused field, should be set to 0. */
> > + __u32 padding;
>
> Bleep blorp. If we're modifying _submit, we'll need to be swift about
> merging this ahead of the main code to make sure we don't break the
> UABI. Although I guess if we're just adding fields at the end, that's a
> nonissue.
Others are using the same "if data passed is smaller than expected
size, unassigned fields are zeroed". That allows us to extend a struct
without breaking the ABI as long as zero is a valid value and does not
change the behavior compared to when the field was not present.
This is the case here: perfmon_handle_count = 0 means no perfmon
attached to the job, so the driver is acting like it previously was.
No need to get that part merged in the initial patch series IMO.
>
> > +struct drm_panfrost_block_perfcounters {
> > + /*
> > + * For DRM_IOCTL_PANFROST_GET_PERFCNT_LAYOUT, encodes the available
> > + * instances for a specific given block type.
> > + * For DRM_IOCTL_PANFROST_CREATE_PERFMON, encodes the instances the
> > + * user wants to monitor.
> > + * Note: the bitmap might be sparse.
> > + */
> > + __u64 instances;
> > +
> > + /*
> > + * For DRM_IOCTL_PANFROST_GET_PERFCNT_LAYOUT, encodes the available
> > + * counters attached to a specific block type.
> > + * For DRM_IOCTL_PANFROST_CREATE_PERFMON, encodes the counters the user
> > + * wants to monitor.
> > + * Note: the bitmap might be sparse.
> > + */
> > + __u64 counters;
> > +};
>
> I don't understand this. Aren't there more than 64 counters?
>
> > +struct drm_panfrost_get_perfcnt_layout {
> > + struct drm_panfrost_block_perfcounters counters[PANFROST_NUM_BLOCKS];
> > +};
>
> --Oh. It's per-block. Got it.
>
> > + * Used to create a performance monitor. Each perfmonance monitor is assigned an
>
> Typo.
Will fix.
>
> ---
>
> Overall, this looks really great! Thank you! :)
Thanks a lot for your reviews. That was pretty damn fast!
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
next prev parent reply other threads:[~2019-04-04 18:17 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-04 15:20 [PATCH 0/3] drm/panfrost: Expose HW counters to userspace Boris Brezillon
2019-04-04 15:20 ` [PATCH 1/3] drm/panfrost: Move gpu_{write, read}() macros to panfrost_regs.h Boris Brezillon
2019-04-04 15:20 ` [PATCH 2/3] drm/panfrost: Expose HW counters to userspace Boris Brezillon
2019-04-04 15:41 ` Alyssa Rosenzweig
2019-04-04 18:17 ` Boris Brezillon [this message]
2019-04-04 22:40 ` Alyssa Rosenzweig
2019-04-05 15:36 ` Eric Anholt
2019-04-05 16:17 ` Alyssa Rosenzweig
2019-04-04 15:20 ` [PATCH 3/3] panfrost/drm: Define T860 perf counters Boris Brezillon
2019-04-05 15:20 ` [PATCH 0/3] drm/panfrost: Expose HW counters to userspace Steven Price
2019-04-05 16:33 ` Alyssa Rosenzweig
2019-04-05 17:40 ` Boris Brezillon
2019-04-05 17:43 ` Alyssa Rosenzweig
2019-04-30 12:42 ` Boris Brezillon
2019-04-30 13:10 ` Rob Clark
2019-04-30 15:49 ` Jordan Crouse
2019-05-12 13:40 ` Boris Brezillon
2019-05-13 15:00 ` Jordan Crouse
2019-05-01 17:12 ` Eric Anholt
2019-05-12 13:17 ` Boris Brezillon
2019-05-11 22:32 ` Alyssa Rosenzweig
2019-05-12 13:38 ` Boris Brezillon
2019-05-13 12:48 ` Steven Price
2019-05-13 13:39 ` Boris Brezillon
2019-05-13 14:13 ` Steven Price
2019-05-13 14:56 ` Alyssa Rosenzweig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190404201717.78d6cf47@collabora.com \
--to=boris.brezillon@collabora.com \
--cc=alyssa@rosenzweig.io \
--cc=dri-devel@lists.freedesktop.org \
--cc=kernel@collabora.com \
--cc=narmstrong@baylibre.com \
--cc=robh+dt@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).