From: "Souza, Jose" <jose.souza@intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Nerlige Ramappa, Umesh" <umesh.nerlige.ramappa@intel.com>,
"Landwerlin, Lionel G" <lionel.g.landwerlin@intel.com>
Subject: Re: [PATCH 00/17] Add OA functionality to Xe
Date: Wed, 22 May 2024 16:08:19 +0000 [thread overview]
Message-ID: <10fe79e987915dd334bd947aa923c71b1d38d441.camel@intel.com> (raw)
In-Reply-To: <87v836wpfo.wl-ashutosh.dixit@intel.com>
On Tue, 2024-05-21 at 19:28 -0700, Dixit, Ashutosh wrote:
> On Tue, 21 May 2024 14:00:21 -0700, Souza, Jose wrote:
> >
> > On Tue, 2024-05-21 at 13:24 -0700, Dixit, Ashutosh wrote:
> > > On Tue, 21 May 2024 11:48:01 -0700, Souza, Jose wrote:
> > > >
> > >
> > > Hi Jose,
> > >
> > > > On Tue, 2024-05-21 at 11:02 -0700, Dixit, Ashutosh wrote:
> > > > > On Tue, 21 May 2024 10:39:17 -0700, Souza, Jose wrote:
> > > > > >
> > > > > > On Tue, 2024-05-21 at 09:43 -0700, Dixit, Ashutosh wrote:
> > > > > > > On Tue, 21 May 2024 09:29:51 -0700, Souza, Jose wrote:
> > > > > > > >
> > > > > > > > On Tue, 2024-05-21 at 09:10 -0700, Dixit, Ashutosh wrote:
> > > > > > > > > On Tue, 21 May 2024 07:47:58 -0700, Souza, Jose wrote:
> > > > > > > > >
> > > > > > > > > Hi Jose,
> > > > > > > > >
> > > > > > > > > > > Other ask, can you remove this 'Failed to remove unknown OA config'
> > > > > > > > > > > debug message from xe_oa_remove_config_ioctl()?
> > > > > > > > > >
> > > > > > > > > > Missed 'Insufficient privileges to remove xe OA config', that need to be
> > > > > > > > > > removed too from xe_oa_remove_config_ioctl().
> > > > > > > > > >
> > > > > > > > > > > Mesa will be using DRM_XE_PERF_OP_REMOVE_CONFIG with config id set to
> > > > > > > > > > > UINT64_MAX to detect if Xe KMD supports OA counters and if application
> > > > > > > > > > > has enough permissions to use it. So it causes dmesg to be flooded
> > > > > > > > > > > with 'xe 0000:00:02.0: [drm:xe_oa_remove_config_ioctl [xe]] Failed to
> > > > > > > > > > > remove unknown OA config' messages when running tests suites.
> > > > > > > > > > >
> > > > > > > > > > > Or do you have other suggestion of uAPI that I can use.
> > > > > > > > >
> > > > > > > > > OK, so you are relying on ENODEV and EACCES errno's from
> > > > > > > > > DRM_XE_PERF_OP_REMOVE_CONFIG to find out (a) if OA is present and (b) if
> > > > > > > > > you need to be root (actually CAP_PERFMON or CAP_SYS_ADMIN).
> > > > > > > >
> > > > > > > > yep
> > > > > > > >
> > > > > > > > >
> > > > > > > > > This logic in Xe should be close to what we have in i915? What does Mesa do
> > > > > > > > > for i915, or what doesn't work in Xe?
> > > > > > > > >
> > > > > > > > > Here are some pointers:
> > > > > > > > >
> > > > > > > > > * You can execute DRM_XE_DEVICE_QUERY_OA_UNITS to see if OA is present
> > > > > > > > >
> > > > > > > > > * Add/remove OA configs and using the global OAG buffer (time based
> > > > > > > > > sampling or DRM_XE_OA_PROPERTY_SAMPLE_OA set) are priviliged operations
> > > > > > > > > (need root). Operations which only need OAR/OAC (OA queries, without
> > > > > > > > > DRM_XE_OA_PROPERTY_SAMPLE_OA) can be executed by non-root.
> > > > > > > > >
> > > > > > > > > * If "/proc/sys/dev/xe/perf_stream_paranoid" is 0, all operations can be
> > > > > > > > > executed by non-root users. Otherwise, as I described in the previous
> > > > > > > > > point.
> > > > > > > >
> > > > > > > > It is possible that process not started by root has CAP_PERFMON:
> > > > > > >
> > > > > > > Yes, correct.
> > > > > > >
> > > > > > > Above I am using "root" loosely as "CAP_PERFMON or CAP_SYS_ADMIN".
> > > > > > >
> > > > > > > So if root sets 'perf_stream_paranoid' to 0, normal users can do OA
> > > > > > > priviliged operations (as described above).
> > > > > > >
> > > > > > > If root sets 'perf_stream_paranoid' to 1, only CAP_PERFMON or CAP_SYS_ADMIN
> > > > > > > users can do OA priviliged operations.
> > > > > >
> > > > > > oh okay so perf_stream_paranoid has a good usage but it do not covers
> > > > > > CAP_PERFMON, see below.
> > > > > >
> > > > > > >
> > > > > > > > "Unprivileged processes with enabled CAP_PERFMON capability are treated
> > > > > > > > as privileged processes with respect to perf_events performance
> > > > > > > > monitoring and observability operations,..."
> > > > > > > >
> > > > > > > > And from what I understood only root can write to perf_stream_paranoid,
> > > > > > > > so I don't see a point in having this file...
> > > > > > >
> > > > > > > So I don't know how this statement follows?
> > > > > > >
> > > > > > > root or superuser is the one which gives the permission to non-CAP_PERFMON
> > > > > > > and non-CAP_SYS_ADMIN users to be able to do priviliged OA operations.
> > > > > >
> > > > > > so if I'm running a process with CAP_PERFMON and read
> > > > > > perf_stream_paranoid and it returns 0
> > > > >
> > > > > 0 if fine, everyone can use all OA calls. The issue is with 1.
> > > > >
> > > > > > there is no way for UMD to know
> > > > > > that process is allowed to use Xe KMD OA feature without do a uAPI call
> > > > > > that checks for CAP_PERFMON.
> > > > > >
> > > > > > That is why I think is better just do a single
> > > > > > DRM_XE_PERF_OP_REMOVE_CONFIG to detect if feature is present and if
> > > > > > process is allowed to use it. But it generates a bunch of debug messages.
> > > > >
> > > > > A process should be able to find out on its own, without help from Xe
> > > > > driver, what it's capabilities (CAP_PERFMON or CAP_SYS_ADMIN or neither)
> > > > > are:
> > > > >
> > > > > https://www.google.com/search?q=linux+get+process+capabilities+in+c
> > > > > https://man7.org/linux/man-pages/man3/libcap.3.html
> > > > >
> > > >
> > > > I don't think this is much portable, I don't think BSD has this sysfs
> > > > with PID capabilities of process.
> > >
> > > Not sure why we are bringing BSD here since we have a Linux KMD, not a BSD
> > > KMD. In any case, every OS has system calls for a process to query its
> > > properties.
> >
> > BSD wraps Linux drm drivers into a translation layer.
> >
> > https://www.freshports.org/graphics/drm-kmod/
> >
> > >
> > > So if these are not portable, you don't have to use these library calls,
> > > you should be able to use direct calls into the OS to find stuff out.
> >
> > That is the problem, Linux exposes process capabilities in sysfs while
> > BSD don't even has concept of capabilities, it has privileges...
>
> I guess then that the Linux to BSD translation layer will convert
> capabilities to priviliges.
It will, it will just not expose capabilities or privileges in sysfs like Linux does.
>
> As I see it, Mesa will need to have Linux/BSD specific code. Similar to the
> Linux link above here is the BSD specific link:
>
> https://www.google.com/search?q=freebsd+get+process+privileges+in+c
Again that is what I want to avoid.
We can avoid it by just removing 2 debug messages, don't look that hard.
>
> >
> > >
> > > > Also if later Xe KMD adds or removes process capabilities that has access
> > > > to OA it will break UMDs.
> > >
> > > That would be breaking the uapi, so once it is merged it cannot be changed.
> > >
> > > > https://wiki.freebsd.org/Graphics/Intel-GPU-Matrix
> > > >
> > > > I think would be better to use uAPIs to figure out permissions.
> > >
> > > As I said, the correct way to do this is to use OS calls. In any case, I
> > > don't think we can remove the debug messages because they are useful for
> > > debugging (that's why they are there).
> >
> > Most of Xe uAPI don't have this level debug message. EACCES is pretty
> > clear what is the issue and it don't need a debug message in my opinion.
>
> The problem is we have all sorts of IGT tests which test these error code
> paths. When these fail in CI, all we have is dmesg to figure out what's
> going on.
IGT will also include the errno code, will not be much harder to figure the failure reason.
>
> And anyway according to me, the right way to do this is to query the
> OS. Maybe you need to add dependencies on some of these other libraries?
>
> And I am not sure if the approach you are suggesting will be considered
> acceptable in your Mesa code review.
>
> Also, I don't know why you have oa_metrics_available() at all, since the
> next step in intel_perf_init_metrics() is to add the OA configs and if the
> process doesn't have sufficient priviliges, the kernel will reject adding
> these configs. So just the return code from the kernel will suffice.
if oa_metrics_available() returns false it will not try to add OA configs.
>
> So you may be able to skip oa_metrics_available() for Xe with a little bit
> of code refactoring, no?
>
> >
> > >
> > > Also, looking at intel_perf_init_metrics()->oa_metrics_available(),
> > > currently it just calls geteuid() and doesn't worry about CAP_PERFMON. Of
> > > course you can add that, but then you will need to figure out the correct
> > > way to do it. The current implementation of oa_metrics_available() seems
> > > mostly ok to me and doesn't call remove_config() etc. so IMO we should
> > > continue that approach for Xe too.
> >
> > i915 is missing CAP_PERFMON handling because it was added (
> > https://gitlab.freedesktop.org/drm/xe/kernel/-/commit/980737282232b7 )
> > after i915 OA uAPI. For i915 I don't care at this point but for Xe KMD
> > it is important to support CAP_PERFMON and CAP_SYS_ADMIN.
>
> You might add this for i915 too while you're at it :)
i915 will also span dmesg.
>
> >
> > >
> > > Thanks.
> > > --
> > > Ashutosh
> > >
> > >
> > >
> > > >
> > > > > And therefore, along with perf_stream_paranoid, determine which OA calls it
> > > > > can use.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > > So basically I think you just need to check for the perf_stream_paranoid
> > > > > > > > > file above. It will tell you both (a) if OA is present (because we are
> > > > > > > > > going to merge the code which creates this file together with OA) and (b)
> > > > > > > > > if you need to be root for particular operations.
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > > > --
> > > > > > > > > Ashutosh
> > > > > > > >
> > > > > >
> > > >
> >
next prev parent reply other threads:[~2024-05-22 16:08 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-15 1:35 [PATCH 00/17] Add OA functionality to Xe Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 01/17] drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 02/17] drm/xe/perf/uapi: Add perf_stream_paranoid sysctl Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 03/17] drm/xe/oa/uapi: Add OA data formats Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 04/17] drm/xe/oa/uapi: Initialize OA units Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 05/17] drm/xe/oa/uapi: Add/remove OA config perf ops Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 06/17] drm/xe/oa/uapi: Define and parse OA stream properties Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 07/17] drm/xe/oa: OA stream initialization (OAG) Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 08/17] drm/xe/oa/uapi: Expose OA stream fd Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 09/17] drm/xe/oa/uapi: Read file_operation Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 10/17] drm/xe/oa: Add OAR support Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 11/17] drm/xe/oa: Add OAC support Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 12/17] drm/xe/oa/uapi: Query OA unit properties Ashutosh Dixit
2024-04-24 23:26 ` Dixit, Ashutosh
2024-04-25 13:10 ` Lucas De Marchi
2024-03-15 1:35 ` [PATCH 13/17] drm/xe/oa/uapi: OA buffer mmap Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 14/17] drm/xe/oa: Add MMIO trigger support Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 15/17] drm/xe/oa: Override GuC RC with OA on PVC Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 16/17] drm/xe/oa: Changes to OA_TAKEN Ashutosh Dixit
2024-03-15 1:35 ` [PATCH 17/17] drm/xe/oa: Enable Xe2+ overrun mode Ashutosh Dixit
2024-03-15 1:53 ` ✓ CI.Patch_applied: success for Add OA functionality to Xe (rev13) Patchwork
2024-03-15 1:53 ` ✗ CI.checkpatch: warning " Patchwork
2024-03-15 1:54 ` ✓ CI.KUnit: success " Patchwork
2024-03-15 2:05 ` ✓ CI.Build: " Patchwork
2024-03-15 2:07 ` ✗ CI.Hooks: failure " Patchwork
2024-03-15 2:08 ` ✓ CI.checksparse: success " Patchwork
2024-03-15 2:34 ` ✓ CI.BAT: " Patchwork
2024-05-17 18:42 ` [PATCH 00/17] Add OA functionality to Xe Souza, Jose
2024-05-18 1:42 ` Dixit, Ashutosh
2024-05-21 14:43 ` Souza, Jose
2024-05-21 14:47 ` Souza, Jose
2024-05-21 16:10 ` Dixit, Ashutosh
2024-05-21 16:29 ` Souza, Jose
2024-05-21 16:43 ` Dixit, Ashutosh
2024-05-21 17:39 ` Souza, Jose
2024-05-21 18:02 ` Dixit, Ashutosh
2024-05-21 18:11 ` Dixit, Ashutosh
2024-05-21 19:01 ` Souza, Jose
2024-05-21 18:48 ` Souza, Jose
2024-05-21 20:24 ` Dixit, Ashutosh
2024-05-21 21:00 ` Souza, Jose
2024-05-22 2:28 ` Dixit, Ashutosh
2024-05-22 16:08 ` Souza, Jose [this message]
2024-05-22 4:42 ` Dixit, Ashutosh
2024-05-22 16:13 ` Souza, Jose
2024-05-22 18:50 ` Dixit, Ashutosh
2024-05-22 19:30 ` Souza, Jose
2024-05-25 1:16 ` Dixit, Ashutosh
2024-05-27 17:02 ` Souza, Jose
2024-05-21 19:58 ` Souza, Jose
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=10fe79e987915dd334bd947aa923c71b1d38d441.camel@intel.com \
--to=jose.souza@intel.com \
--cc=ashutosh.dixit@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=lionel.g.landwerlin@intel.com \
--cc=umesh.nerlige.ramappa@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox